fi_cntr(3)                     Libfabric v1.17.0                     fi_cntr(3)

NAME
       fi_cntr - Completion and event counter operations

       fi_cntr_open / fi_close
              Allocate/free a counter

       fi_cntr_read
              Read the current value of a counter

       fi_cntr_readerr
              Read the number of operations that have completed in error

       fi_cntr_add
              Increment a counter by a specified value

       fi_cntr_set
              Set a counter to a specified value

       fi_cntr_wait
              Wait for a counter to be greater than or equal to a threshold
              value

SYNOPSIS
       #include <rdma/fi_domain.h>

       int fi_cntr_open(struct fid_domain *domain, struct fi_cntr_attr *attr,
           struct fid_cntr **cntr, void *context);

       int fi_close(struct fid *cntr);

       uint64_t fi_cntr_read(struct fid_cntr *cntr);

       uint64_t fi_cntr_readerr(struct fid_cntr *cntr);

       int fi_cntr_add(struct fid_cntr *cntr, uint64_t value);

       int fi_cntr_adderr(struct fid_cntr *cntr, uint64_t value);

       int fi_cntr_set(struct fid_cntr *cntr, uint64_t value);

       int fi_cntr_seterr(struct fid_cntr *cntr, uint64_t value);

       int fi_cntr_wait(struct fid_cntr *cntr, uint64_t threshold,
           int timeout);

ARGUMENTS
       domain Fabric domain

       cntr   Fabric counter

       attr   Counter attributes

       context
              User-specified context associated with the counter

       value  Value by which to increment the counter, or to which to set it

       threshold
              Value to compare the counter against

       timeout
              Time in milliseconds to wait. A negative value indicates an
              infinite timeout.

DESCRIPTION
       Counters record the number of requested operations that have
       completed. Counters can provide a lightweight completion mechanism by
       allowing the suppression of CQ completion entries. They are useful for
       applications that only need to know the number of requests that have
       completed, and not details about each request. For example, counters
       may be useful for implementing credit-based flow control or tracking
       the number of remote processes that have responded to a request.
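
       The credit-based flow control mentioned above can be sketched as
       follows. This is a minimal illustration, not part of the libfabric
       API: the function name credits_available and its parameters are
       hypothetical. It assumes the application tracks how many sends it has
       posted, and obtains completed and errors from fi_cntr_read and
       fi_cntr_readerr on a counter bound to the transmitting endpoint.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many more sends may be posted without
 * exceeding `window` outstanding operations.
 *
 *   posted    - sends the application has posted so far
 *   completed - value returned by fi_cntr_read()
 *   errors    - value returned by fi_cntr_readerr()
 *
 * Operations that fail update the error value rather than the counter,
 * so both values are subtracted from `posted`.
 */
uint64_t credits_available(uint64_t posted, uint64_t completed,
                           uint64_t errors, uint64_t window)
{
    assert(completed + errors <= posted);

    uint64_t in_flight = posted - completed - errors;

    return in_flight >= window ? 0 : window - in_flight;
}
```

       Because counter reads may lag behind the true value (see NOTES), a
       stale read can only under-report completions, so this calculation
       errs on the side of granting fewer credits.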

       Counters typically count only successful completions. However, if an
       operation completes in error, it may increment an associated error
       value. That is, a counter actually stores two distinct values, with
       error completions updating an error-specific value.

       Counters are updated following the completion event semantics defined
       in fi_cq(3). The timing of the update is based on the type of transfer
       and any specified operation flags.

   fi_cntr_open
       fi_cntr_open allocates a new fabric counter. The properties and
       behavior of the counter are defined by struct fi_cntr_attr.

              struct fi_cntr_attr {
                  enum fi_cntr_events events;    /* type of events to count */
                  enum fi_wait_obj    wait_obj;  /* requested wait object */
                  struct fid_wait     *wait_set; /* optional wait set */
                  uint64_t            flags;     /* operation flags */
              };

       events A counter captures different types of events. The type of
              event to count is one of the following:

       - FI_CNTR_EVENTS_COMP
              The counter increments for every successful completion that
              occurs on an associated bound endpoint. The type of completions
              that are counted (sends, receives, or both) may be restricted
              using control flags when binding the counter and the endpoint.
              Counters increment on all successful completions, regardless of
              whether the operation generates an entry in an event queue.

       wait_obj
              Counters may be associated with a specific wait object. Wait
              objects allow applications to block until the wait object is
              signaled, indicating that a counter has reached a specific
              threshold. Users may use fi_control to retrieve the underlying
              wait object associated with a counter, in order to use it in
              other system calls. The following values may be used to specify
              the type of wait object associated with a counter:
              FI_WAIT_NONE, FI_WAIT_UNSPEC, FI_WAIT_SET, FI_WAIT_FD,
              FI_WAIT_MUTEX_COND, and FI_WAIT_YIELD. The default is
              FI_WAIT_NONE.

       - FI_WAIT_NONE
              Used to indicate that the user will not block (wait) for events
              on the counter.

       - FI_WAIT_UNSPEC
              Specifies that the user will only wait on the counter using
              fabric interface calls, such as fi_cntr_wait. In this case, the
              underlying provider may select the most appropriate or highest
              performing wait object available, including custom wait
              mechanisms. Applications that select FI_WAIT_UNSPEC are not
              guaranteed to retrieve the underlying wait object.

       - FI_WAIT_SET
              Indicates that the event counter should use a wait set object
              to wait for events. If specified, the wait_set field must
              reference an existing wait set object.

       - FI_WAIT_FD
              Indicates that the counter should use a file descriptor as its
              wait mechanism. A file descriptor wait object must be usable in
              select, poll, and epoll routines. However, a provider may
              signal an FD wait object by marking it as readable, writable,
              or with an error.

       - FI_WAIT_MUTEX_COND
              Specifies that the counter should use a pthread mutex and
              condition variable as a wait object.

       - FI_WAIT_YIELD
              Indicates that the counter will wait without a wait object but
              instead yield on every wait. Allows usage of fi_cntr_wait
              through a spin.

       wait_set
              If wait_obj is FI_WAIT_SET, this field references a wait object
              to which the event counter should attach. When an event is
              added to the event counter, the corresponding wait set will be
              signaled if all necessary conditions are met. The use of a
              wait_set enables an optimized method of waiting for events
              across multiple event counters. This field is ignored if
              wait_obj is not FI_WAIT_SET.

       flags  Flags are reserved for future use, and must be set to 0.

   fi_close
       The fi_close call releases all resources associated with a counter.
       When closing the counter, there must be no opened endpoints, transmit
       contexts, receive contexts or memory regions associated with the
       counter. If resources are still associated with the counter when
       attempting to close, the call will return -FI_EBUSY.

   fi_cntr_control
       The fi_cntr_control call is used to access provider or implementation
       specific details of the counter. Access to the counter should be
       serialized across all calls when fi_cntr_control is invoked, as it may
       redirect the implementation of counter operations. The following
       control commands are usable with a counter:

       FI_GETOPSFLAG (uint64_t *)
              Returns the current default operational flags associated with
              the counter.

       FI_SETOPSFLAG (uint64_t *)
              Modifies the current default operational flags associated with
              the counter.

       FI_GETWAIT (void **)
              This command allows the user to retrieve the low-level wait
              object associated with the counter. The format of the wait
              object is specified during counter creation, through the
              counter attributes. See fi_eq(3) for additional details on
              using control with FI_GETWAIT.
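
       For a counter opened with FI_WAIT_FD, the descriptor retrieved with
       FI_GETWAIT can be passed to poll(2). The sketch below shows only the
       polling step and is an illustration under stated assumptions: the
       helper name wait_fd_signaled is hypothetical, and fd is assumed to
       have been obtained from the counter beforehand. Since a provider may
       signal the descriptor as readable, writable, or in error, the helper
       treats any of the three as a wake-up. The timeout follows the same
       convention as fi_cntr_wait: milliseconds, with a negative value
       meaning wait indefinitely.

```c
#include <poll.h>

/*
 * Hypothetical helper: block until `fd` is signaled or `timeout_ms`
 * elapses. `fd` is assumed to be the wait object of a counter opened
 * with FI_WAIT_FD, retrieved through the FI_GETWAIT control command.
 *
 * Returns 1 if the descriptor was signaled, 0 on timeout, and -1 if
 * poll() itself failed (errno is set, e.g. to EINTR).
 */
int wait_fd_signaled(int fd, int timeout_ms)
{
    /* POLLERR and POLLHUP are always reported and need not be requested. */
    struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLOUT };

    int ret = poll(&pfd, 1, timeout_ms);
    if (ret < 0)
        return -1;

    return ret > 0;
}
```

       A wake-up only indicates that the counter may have changed. The
       application would still call fi_cntr_read and fi_cntr_readerr to see
       whether the value it is interested in has been reached. fi_poll(3)
       describes fi_trywait, which is intended to be called before blocking
       on a native wait object.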

   fi_cntr_read
       The fi_cntr_read call returns the current value of the counter.

   fi_cntr_readerr
       The read error call returns the number of operations that completed in
       error and were unable to update the counter.

   fi_cntr_add
       This adds the user-specified value to the counter.

   fi_cntr_adderr
       This adds the user-specified value to the error value of the counter.

   fi_cntr_set
       This sets the counter to the specified value.

   fi_cntr_seterr
       This sets the error value of the counter to the specified value.

   fi_cntr_wait
       This call may be used to wait until the counter reaches the specified
       threshold, or until an error or timeout occurs. Upon successful return
       from this call, the counter will be greater than or equal to the input
       threshold value.

       If an operation associated with the counter encounters an error, it
       will increment the error value associated with the counter. Any change
       in a counter's error value will unblock any thread inside
       fi_cntr_wait.

       If the call returns due to timeout, -FI_ETIMEDOUT will be returned.
       The error value associated with the counter remains unchanged.

       It is invalid for applications to call this function if the counter
       has been configured with a wait object of FI_WAIT_NONE or FI_WAIT_SET.

RETURN VALUES
       Returns 0 on success. On error, a negative value corresponding to
       fabric errno is returned.

       fi_cntr_read / fi_cntr_readerr
              Returns the current value of the counter (fi_cntr_read) or of
              its error value (fi_cntr_readerr).

       Fabric errno values are defined in rdma/fi_errno.h.

NOTES
       In order to support a variety of counter implementations, updates made
       to counter values (e.g. fi_cntr_set or fi_cntr_add) may not be
       immediately visible to counter read operations (i.e. fi_cntr_read or
       fi_cntr_readerr). A small, but undefined, delay may occur between the
       counter changing and the reported value being updated. However, a
       final updated value will eventually be reflected in the read counter
       value.

       Additionally, applications should ensure that the value of a counter
       is stable and not subject to change prior to calling fi_cntr_set or
       fi_cntr_seterr. Otherwise, the resulting value of the counter after
       fi_cntr_set / fi_cntr_seterr is undefined, as updates to the counter
       may be lost. A counter value is considered stable if all previous
       updates using fi_cntr_set / fi_cntr_seterr and results of related
       operations are reflected in the observed value of the counter.
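
       One way to avoid resetting a counter that may still be receiving
       updates is to leave it running and measure progress against a sampled
       baseline. This is offered as a sketch of one possible approach, not as
       a libfabric recommendation; the helper name completions_since is
       hypothetical, and both arguments are assumed to be values previously
       returned by fi_cntr_read.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of completions observed between two
 * reads of the same counter. Unsigned subtraction is defined modulo
 * 2^64, so the result remains correct even if the counter wrapped
 * between the two reads.
 */
uint64_t completions_since(uint64_t baseline, uint64_t current)
{
    return current - baseline;
}
```

       With this approach the application never calls fi_cntr_set on an
       active counter, so no in-flight update can be lost to a reset.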

SEE ALSO
       fi_getinfo(3), fi_endpoint(3), fi_domain(3), fi_eq(3), fi_poll(3)

AUTHORS
       OpenFabrics.

Libfabric Programmer's Manual         2022-12-11                     fi_cntr(3)