fi_poll(3) Libfabric v1.8.0 fi_poll(3)

fi_poll - Polling and wait set operations

fi_poll_open / fi_close
Open/close a polling set

fi_poll_add / fi_poll_del
Add/remove a completion queue or counter to/from a poll set.

fi_poll
Poll for progress and events across multiple completion queues
and counters.

fi_wait_open / fi_close
Open/close a wait set

fi_wait
Waits for one or more wait objects in a set to be signaled.

fi_trywait
Indicate when it is safe to block on wait objects using native
OS calls.

fi_control
Control wait set operation or attributes.

#include <rdma/fi_domain.h>

int fi_poll_open(struct fid_domain *domain, struct fi_poll_attr *attr,
    struct fid_poll **pollset);

int fi_close(struct fid *pollset);

int fi_poll_add(struct fid_poll *pollset, struct fid *event_fid,
    uint64_t flags);

int fi_poll_del(struct fid_poll *pollset, struct fid *event_fid,
    uint64_t flags);

int fi_poll(struct fid_poll *pollset, void **context, int count);

int fi_wait_open(struct fid_fabric *fabric, struct fi_wait_attr *attr,
    struct fid_wait **waitset);

int fi_close(struct fid *waitset);

int fi_wait(struct fid_wait *waitset, int timeout);

int fi_trywait(struct fid_fabric *fabric, struct fid **fids, size_t count);

int fi_control(struct fid *waitset, int command, void *arg);

fabric Fabric provider

domain Resource domain

pollset
Event poll set

waitset
Wait object set

attr Poll or wait set attributes

context
On success, an array of user context values associated with
completion queues or counters.

fids An array of fabric descriptors, each one associated with a
native wait object.

count Number of entries in context or fids array.

timeout
Time to wait for a signal, in milliseconds.

command
Command of control operation to perform on the wait set.

arg Optional control argument.

fi_poll_open
fi_poll_open creates a new polling set. A poll set enables an
optimized method for progressing asynchronous operations across
multiple completion queues and counters and checking for their
completions.

A poll set is defined with the following attributes.

struct fi_poll_attr {
    uint64_t flags; /* operation flags */
};

flags Flags that set the default operation of the poll set. The use
of this field is reserved and must be set to 0 by the caller.

fi_close
The fi_close call releases all resources associated with a poll set.
The poll set must not be associated with any other resources when it
is closed; otherwise, the call returns -FI_EBUSY.

fi_poll_add
Associates a completion queue or counter with a poll set.

fi_poll_del
Removes a completion queue or counter from a poll set.

fi_poll
Progresses all completion queues and counters associated with a poll
set and checks for events. If events might have occurred, contexts
associated with the completion queues and/or counters are returned.
Completion queues will return their context if they are not empty. The
context associated with a counter will be returned if the counter's
success value or error value has changed since the last time fi_poll,
fi_cntr_set, or fi_cntr_add was called. The number of contexts is
limited to the size of the context array, indicated by the count
parameter.

Note that fi_poll only indicates that events might be available. In
some cases, providers may consume such events internally, for example
to drive progress. This can result in fi_poll returning false
positives. Applications should drive their progress based on the
results of reading events from a completion queue or reading counter
values. The fi_poll function will always return all completion queues
and counters that do have new events.
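
The reporting rules described for fi_poll can be illustrated with a small, self-contained model. This is only a sketch of the documented behavior, not libfabric code and not how any provider implements it: the names toy_kind, toy_member, and toy_poll are hypothetical and exist only for this example. A queue reports its context while it is non-empty, a counter reports its context when its success or error value differs from what the previous poll observed, and the output is capped at count entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; these are not libfabric types. */
enum toy_kind { TOY_CQ, TOY_CNTR };

struct toy_member {
    enum toy_kind kind;
    void *context;       /* user context reported by the poll */
    size_t cq_entries;   /* TOY_CQ: completions currently queued */
    uint64_t cntr_val;   /* TOY_CNTR: current success value */
    uint64_t cntr_err;   /* TOY_CNTR: current error value */
    uint64_t seen_val;   /* TOY_CNTR: values observed by the last poll */
    uint64_t seen_err;
};

/*
 * Writes the context of every member that may have events into
 * context[] and returns how many were written (at most count).
 * Members not reached because the array filled up keep their old
 * snapshot, so they are still reported by a later call.
 */
int toy_poll(struct toy_member *m, size_t n, void **context, int count)
{
    int found = 0;

    assert(context != NULL && count >= 0);

    for (size_t i = 0; i < n && found < count; i++) {
        int ready;

        if (m[i].kind == TOY_CQ) {
            /* A queue is reported for as long as it is non-empty. */
            ready = m[i].cq_entries > 0;
        } else {
            /* A counter is reported only when a value has changed. */
            ready = m[i].cntr_val != m[i].seen_val ||
                    m[i].cntr_err != m[i].seen_err;
            m[i].seen_val = m[i].cntr_val;
            m[i].seen_err = m[i].cntr_err;
        }

        if (ready)
            context[found++] = m[i].context;
    }
    return found;
}
```

In this model a counter that has not changed drops out of the result on the next call, while a queue keeps appearing until the application drains it, which is why the caller is expected to read the queue rather than rely on the poll result alone.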

fi_wait_open
fi_wait_open allocates a new wait set. A wait set enables an optimized
method of waiting for events across multiple completion queues and
counters. Where possible, a wait set uses a single underlying wait
object that is signaled when a specified condition occurs on an
associated completion queue or counter.

The properties and behavior of a wait set are defined by struct
fi_wait_attr.

struct fi_wait_attr {
    enum fi_wait_obj wait_obj; /* requested wait object */
    uint64_t flags;            /* operation flags */
};

wait_obj
Wait sets are associated with specific wait object(s). Wait objects
allow applications to block until the wait object is signaled,
indicating that an event is available to be read. The following values
may be used to specify the type of wait object associated with a wait
set: FI_WAIT_UNSPEC, FI_WAIT_FD, FI_WAIT_MUTEX_COND, and
FI_WAIT_CRITSEC_COND.

- FI_WAIT_UNSPEC
Specifies that the user will only wait on the wait set using fabric
interface calls, such as fi_wait. In this case, the underlying
provider may select the most appropriate or highest performing wait
object available, including custom wait mechanisms. Applications that
select FI_WAIT_UNSPEC are not guaranteed to retrieve the underlying
wait object.

- FI_WAIT_FD
Indicates that the wait set should use file descriptor(s) as its wait
mechanism. It may not always be possible for a wait set to be
implemented using a single underlying file descriptor, but all wait
objects will be file descriptors. File descriptor wait objects must be
usable in the POSIX select(2), poll(2), and epoll(7) routines (if
available). However, a provider may signal an FD wait object by
marking it as readable or with an error.

- FI_WAIT_MUTEX_COND
Specifies that the wait set should use a pthread mutex and condition
variable as a wait object.

- FI_WAIT_CRITSEC_COND
Windows specific. Specifies that the wait set should use a critical
section and condition variable as a wait object.

flags Flags that set the default operation of the wait set. The use
of this field is reserved and must be set to 0 by the caller.
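
Because an FI_WAIT_FD wait object may be signaled either as readable or with an error, a native wait on it should treat both conditions as a wakeup. The following is a minimal sketch of that check using poll(2). It contains no libfabric calls: wait_fd_signaled and demo_with_pipe are hypothetical helpers written for this example, and a pipe stands in for a provider-owned descriptor. With a real wait object the application would first obtain FI_SUCCESS from fi_trywait, as described under fi_trywait.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Waits up to timeout_ms for fd to be signaled.
 * Returns 1 if the fd is readable or flagged with an error or hangup,
 * 0 on timeout, and a negative errno value on failure.
 * A retry after EINTR restarts the full timeout, which is acceptable
 * for a sketch but not precise.
 */
int wait_fd_signaled(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ret;

    assert(fd >= 0);

    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0)
        return -errno;
    if (ret == 0)
        return 0;

    /* POLLERR and POLLHUP are reported even though only POLLIN was requested. */
    return (pfd.revents & (POLLIN | POLLERR | POLLHUP)) ? 1 : 0;
}

/*
 * Exercises the helper with a pipe standing in for a wait object.
 * Returns 0 if the helper behaves as expected.
 */
int demo_with_pipe(void)
{
    int p[2], ok;
    char c = 1;

    if (pipe(p) != 0)
        return -errno;

    ok = wait_fd_signaled(p[0], 10) == 0;        /* nothing written: timeout */
    ok = ok && write(p[1], &c, 1) == 1;          /* make the read end readable */
    ok = ok && wait_fd_signaled(p[0], 10) == 1;  /* now reported as signaled */

    close(p[1]);
    close(p[0]);
    return ok ? 0 : -1;
}
```

The helper only reports that the descriptor was signaled. The application still has to read the associated completion queue or counter to learn what, if anything, happened.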

fi_close
The fi_close call releases all resources associated with a wait set.
The wait set must not be bound to any other opened resources when it
is closed; otherwise, the call returns -FI_EBUSY.

fi_wait
Waits on a wait set until one or more of its underlying wait objects is
signaled.

fi_trywait
The fi_trywait call was introduced in libfabric version 1.3. The
behavior of using native wait objects without the use of fi_trywait is
provider specific and should be considered non-deterministic.

The fi_trywait() call is used in conjunction with native operating
system calls to block on wait objects, such as file descriptors. The
application must call fi_trywait and obtain a return value of
FI_SUCCESS prior to blocking on a native wait object. Failure to do so
may result in the wait object not being signaled, and the application
not observing the desired events. The following pseudo-code
demonstrates the use of fi_trywait in conjunction with the OS select(2)
call.
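
    /* fi_trywait takes the fabric and an array of fabric descriptors */
    struct fid *fids[1] = { &cq->fid };

    fi_control(&cq->fid, FI_GETWAIT, (void *) &fd);

    while (1) {
        if (fi_trywait(fabric, fids, 1) == FI_SUCCESS) {
            /* select() modifies its sets, so rebuild them each pass */
            FD_ZERO(&rfds);
            FD_SET(fd, &rfds);
            efds = rfds;
            select(fd + 1, &rfds, NULL, &efds, &timeout);
        }

        /* read all queued completions before trying to block again */
        do {
            ret = fi_cq_read(cq, &comp, 1);
        } while (ret > 0);
    }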

fi_trywait() will return FI_SUCCESS if it is safe to block on the wait
object(s) corresponding to the fabric descriptor(s), or -FI_EAGAIN if
there are events queued on the fabric descriptor or if blocking could
hang the application.

The call takes an array of fabric descriptors. For each wait object
that will be passed to the native wait routine, the corresponding
fabric descriptor should first be passed to fi_trywait. All fabric
descriptors passed into a single fi_trywait call must make use of the
same underlying wait object type.

The following types of fabric descriptors may be passed into
fi_trywait: event queues, completion queues, counters, and wait sets.
Applications that wish to use native wait calls should select specific
wait objects when allocating such resources, for example by setting
the item's creation attribute wait_obj value to FI_WAIT_FD.

If the wait object to check belongs to a wait set, only the wait set
itself needs to be passed into fi_trywait. The fabric resources
associated with the wait set do not.

On receiving a return value of -FI_EAGAIN from fi_trywait, an
application should read all queued completions and events, and call
fi_trywait again before attempting to block. Applications can make use
of a fabric poll set to identify completion queues and counters that
may require processing.
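
The check-then-block rule can be tried out without a fabric by modeling it with a pipe. The sketch below is an analogy only and makes no libfabric calls: toy_queue, toy_post, toy_trywait, toy_read, and toy_wait_and_drain are hypothetical names, the pipe's read end plays the role of an FI_WAIT_FD wait object, and toy_trywait mimics the documented contract by returning 0 when it is safe to block and -EAGAIN when events are already queued.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for a queue that exposes an fd wait object. */
struct toy_queue {
    int rfd, wfd;   /* pipe ends; rfd plays the role of the wait fd */
    int pending;    /* events queued and not yet read */
};

int toy_queue_init(struct toy_queue *q)
{
    int p[2];

    assert(q != NULL);
    if (pipe(p) != 0)
        return -errno;
    q->rfd = p[0];
    q->wfd = p[1];
    q->pending = 0;
    return 0;
}

/* Producer side: queue one event and signal the fd. */
int toy_post(struct toy_queue *q)
{
    char c = 1;

    q->pending++;
    return write(q->wfd, &c, 1) == 1 ? 0 : -errno;
}

/* Analogue of fi_trywait: 0 if safe to block, -EAGAIN if events are queued. */
int toy_trywait(const struct toy_queue *q)
{
    return q->pending > 0 ? -EAGAIN : 0;
}

/* Analogue of a queue read: returns 1 per event, -EAGAIN when empty. */
int toy_read(struct toy_queue *q)
{
    char c;

    if (q->pending == 0)
        return -EAGAIN;
    if (read(q->rfd, &c, 1) != 1)
        return -errno;
    q->pending--;
    return 1;
}

/* Blocks only when the try-wait check allows it, then drains the queue. */
int toy_wait_and_drain(struct toy_queue *q, int timeout_ms)
{
    int drained = 0;

    if (toy_trywait(q) == 0) {
        fd_set rfds;
        struct timeval tv = {
            .tv_sec = timeout_ms / 1000,
            .tv_usec = (timeout_ms % 1000) * 1000,
        };

        FD_ZERO(&rfds);
        FD_SET(q->rfd, &rfds);
        (void) select(q->rfd + 1, &rfds, NULL, NULL, &tv);
    }

    while (toy_read(q) > 0)
        drained++;
    return drained;
}
```

In this single-threaded model the pending count and the pipe are always in step, so skipping the check would not hang. With a real provider the wait object may not be signaled for events that are already queued, which is the case the check guards against.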

fi_control
The fi_control call is used to access provider or implementation
specific details of the wait set. Access to the wait set should be
serialized across all calls when fi_control is invoked, as it may
redirect the implementation of wait set operations. The following
control commands are usable with a wait set.

FI_GETWAIT (void **)
This command allows the user to retrieve the low-level wait object
associated with the wait set. The format of the wait object is
specified during wait set creation, through the wait set attributes.
The fi_control arg parameter should be an address where a pointer to
the returned wait object will be written. This should be an 'int *'
for FI_WAIT_FD, or 'struct fi_mutex_cond' for FI_WAIT_MUTEX_COND.
Support for FI_GETWAIT is provider specific and may fail if not
supported or if the wait set is implemented using more than one wait
object.

Returns FI_SUCCESS on success. On error, a negative value
corresponding to fabric errno is returned.

Fabric errno values are defined in rdma/fi_errno.h.

fi_poll
On success, if events are available, returns the number of entries
written to the context array.

fi_getinfo(3), fi_domain(3), fi_cntr(3), fi_eq(3)

OpenFabrics.

Libfabric Programmer's Manual 2018-10-05 fi_poll(3)