fi_atomic(3)                 Libfabric v1.17.0                 fi_atomic(3)

NAME
fi_atomic - Remote atomic functions

fi_atomic / fi_atomicv / fi_atomicmsg / fi_inject_atomic
    Initiates an atomic operation to remote memory

fi_fetch_atomic / fi_fetch_atomicv / fi_fetch_atomicmsg
    Initiates an atomic operation to remote memory, retrieving the
    initial value.

fi_compare_atomic / fi_compare_atomicv / fi_compare_atomicmsg
    Initiates an atomic compare-operation to remote memory, retrieving
    the initial value.

fi_atomicvalid / fi_fetch_atomicvalid / fi_compare_atomicvalid /
fi_query_atomic
    Indicates if a provider supports a specific atomic operation

SYNOPSIS
    #include <rdma/fi_atomic.h>

    ssize_t fi_atomic(struct fid_ep *ep, const void *buf,
        size_t count, void *desc, fi_addr_t dest_addr,
        uint64_t addr, uint64_t key,
        enum fi_datatype datatype, enum fi_op op, void *context);

    ssize_t fi_atomicv(struct fid_ep *ep, const struct fi_ioc *iov,
        void **desc, size_t count, fi_addr_t dest_addr,
        uint64_t addr, uint64_t key,
        enum fi_datatype datatype, enum fi_op op, void *context);

    ssize_t fi_atomicmsg(struct fid_ep *ep, const struct fi_msg_atomic *msg,
        uint64_t flags);

    ssize_t fi_inject_atomic(struct fid_ep *ep, const void *buf,
        size_t count, fi_addr_t dest_addr,
        uint64_t addr, uint64_t key,
        enum fi_datatype datatype, enum fi_op op);

    ssize_t fi_fetch_atomic(struct fid_ep *ep, const void *buf,
        size_t count, void *desc, void *result, void *result_desc,
        fi_addr_t dest_addr, uint64_t addr, uint64_t key,
        enum fi_datatype datatype, enum fi_op op, void *context);

    ssize_t fi_fetch_atomicv(struct fid_ep *ep, const struct fi_ioc *iov,
        void **desc, size_t count, struct fi_ioc *resultv,
        void **result_desc, size_t result_count, fi_addr_t dest_addr,
        uint64_t addr, uint64_t key, enum fi_datatype datatype,
        enum fi_op op, void *context);

    ssize_t fi_fetch_atomicmsg(struct fid_ep *ep,
        const struct fi_msg_atomic *msg, struct fi_ioc *resultv,
        void **result_desc, size_t result_count, uint64_t flags);

    ssize_t fi_compare_atomic(struct fid_ep *ep, const void *buf,
        size_t count, void *desc, const void *compare,
        void *compare_desc, void *result, void *result_desc,
        fi_addr_t dest_addr, uint64_t addr, uint64_t key,
        enum fi_datatype datatype, enum fi_op op, void *context);

    ssize_t fi_compare_atomicv(struct fid_ep *ep, const struct fi_ioc *iov,
        void **desc, size_t count, const struct fi_ioc *comparev,
        void **compare_desc, size_t compare_count, struct fi_ioc *resultv,
        void **result_desc, size_t result_count, fi_addr_t dest_addr,
        uint64_t addr, uint64_t key, enum fi_datatype datatype,
        enum fi_op op, void *context);

    ssize_t fi_compare_atomicmsg(struct fid_ep *ep,
        const struct fi_msg_atomic *msg, const struct fi_ioc *comparev,
        void **compare_desc, size_t compare_count,
        struct fi_ioc *resultv, void **result_desc, size_t result_count,
        uint64_t flags);

    int fi_atomicvalid(struct fid_ep *ep, enum fi_datatype datatype,
        enum fi_op op, size_t *count);

    int fi_fetch_atomicvalid(struct fid_ep *ep, enum fi_datatype datatype,
        enum fi_op op, size_t *count);

    int fi_compare_atomicvalid(struct fid_ep *ep, enum fi_datatype datatype,
        enum fi_op op, size_t *count);

    int fi_query_atomic(struct fid_domain *domain,
        enum fi_datatype datatype, enum fi_op op,
        struct fi_atomic_attr *attr, uint64_t flags);

ARGUMENTS
ep
    Fabric endpoint on which to initiate atomic operation.

buf
    Local data buffer that specifies first operand of atomic operation

iov / comparev / resultv
    Vectored data buffer(s).

count / compare_count / result_count
    Count of vectored data entries. The number of elements referenced,
    where each element is the indicated datatype.

addr
    Address of remote memory to access.

key
    Protection key associated with the remote memory.

datatype
    Datatype associated with atomic operands

op
    Atomic operation to perform

compare
    Local compare buffer, containing comparison data.

result
    Local data buffer to store initial value of remote buffer

desc / compare_desc / result_desc
    Data descriptor associated with the local data buffer, local
    compare buffer, and local result buffer, respectively. See
    fi_mr(3).

dest_addr
    Destination address for connectionless atomic operations. Ignored
    for connected endpoints.

msg
    Message descriptor for atomic operations

flags
    Additional flags to apply for the atomic operation

context
    User specified pointer to associate with the operation. This
    parameter is ignored if the operation will not generate a
    successful completion, unless an op flag specifies the context
    parameter be used for required input.

DESCRIPTION
Atomic transfers are used to read and update data located in remote
memory regions in an atomic fashion. Conceptually, they are similar to
local atomic operations of a similar nature (e.g. atomic increment,
compare and swap, etc.). Updates to remote data involve one of several
operations on the data, and act on specific types of data, as listed
below. As such, atomic transfers have knowledge of the format of the
data being accessed. A single atomic function may operate across an
array of data applying an atomic operation to each entry, but the
atomicity of an operation is limited to a single datatype or entry.

Atomic Data Types
Atomic functions may operate on one of the following identified data
types. A given atomic function may support any datatype, subject to
provider implementation constraints.

FI_INT8
    Signed 8-bit integer.

FI_UINT8
    Unsigned 8-bit integer.

FI_INT16
    Signed 16-bit integer.

FI_UINT16
    Unsigned 16-bit integer.

FI_INT32
    Signed 32-bit integer.

FI_UINT32
    Unsigned 32-bit integer.

FI_INT64
    Signed 64-bit integer.

FI_UINT64
    Unsigned 64-bit integer.

FI_INT128
    Signed 128-bit integer.

FI_UINT128
    Unsigned 128-bit integer.

FI_FLOAT
    A single-precision floating point value (IEEE 754).

FI_DOUBLE
    A double-precision floating point value (IEEE 754).

FI_FLOAT_COMPLEX
    An ordered pair of single-precision floating point values (IEEE
    754), with the first value representing the real portion of a
    complex number and the second representing the imaginary portion.

FI_DOUBLE_COMPLEX
    An ordered pair of double-precision floating point values (IEEE
    754), with the first value representing the real portion of a
    complex number and the second representing the imaginary portion.

FI_LONG_DOUBLE
    A double-extended precision floating point value (IEEE 754).
    Note that the size of a long double and number of bits used for
    precision is compiler, platform, and/or provider specific.
    Developers that use long double should ensure that libfabric is
    built using a long double format that is compatible with their
    application, and that format is supported by the provider. The
    mechanism used for this validation is currently beyond the scope
    of the libfabric API.

FI_LONG_DOUBLE_COMPLEX
    An ordered pair of double-extended precision floating point
    values (IEEE 754), with the first value representing the real
    portion of a complex number and the second representing the
    imaginary portion.

Atomic Operations
The following atomic operations are defined. An atomic operation often
acts against a target value in the remote memory buffer and a source
value provided with the atomic function. It may also carry source data
to replace the target value in compare and swap operations. A
conceptual description of each operation is provided.

FI_MIN
    Minimum

        if (buf[i] < addr[i])
            addr[i] = buf[i]

FI_MAX
    Maximum

        if (buf[i] > addr[i])
            addr[i] = buf[i]

FI_SUM
    Sum

        addr[i] = addr[i] + buf[i]

FI_PROD
    Product

        addr[i] = addr[i] * buf[i]

FI_LOR
    Logical OR

        addr[i] = (addr[i] || buf[i])

FI_LAND
    Logical AND

        addr[i] = (addr[i] && buf[i])

FI_BOR
    Bitwise OR

        addr[i] = addr[i] | buf[i]

FI_BAND
    Bitwise AND

        addr[i] = addr[i] & buf[i]

FI_LXOR
    Logical exclusive-OR (XOR)

        addr[i] = ((addr[i] && !buf[i]) || (!addr[i] && buf[i]))

FI_BXOR
    Bitwise exclusive-OR (XOR)

        addr[i] = addr[i] ^ buf[i]

FI_ATOMIC_READ
    Read data atomically

        result[i] = addr[i]

FI_ATOMIC_WRITE
    Write data atomically

        addr[i] = buf[i]

FI_CSWAP
    Compare values and if equal swap with data

        if (compare[i] == addr[i])
            addr[i] = buf[i]

FI_CSWAP_NE
    Compare values and if not equal swap with data

        if (compare[i] != addr[i])
            addr[i] = buf[i]

FI_CSWAP_LE
    Compare values and if less than or equal swap with data

        if (compare[i] <= addr[i])
            addr[i] = buf[i]

FI_CSWAP_LT
    Compare values and if less than swap with data

        if (compare[i] < addr[i])
            addr[i] = buf[i]

FI_CSWAP_GE
    Compare values and if greater than or equal swap with data

        if (compare[i] >= addr[i])
            addr[i] = buf[i]

FI_CSWAP_GT
    Compare values and if greater than swap with data

        if (compare[i] > addr[i])
            addr[i] = buf[i]

FI_MSWAP
    Swap masked bits with data

        addr[i] = (buf[i] & compare[i]) | (addr[i] & ~compare[i])
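
The element-wise behavior of the non-compare operations can be modeled locally. The following is a minimal sketch in plain C, not libfabric code: the names model_op and apply_model_op are hypothetical, the target array merely stands in for the remote memory region, and the loop is single-threaded rather than atomic. It only illustrates which value ends up in each target element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local stand-ins for a few fi_op values (not the libfabric enum). */
enum model_op {
    MODEL_MIN, MODEL_MAX, MODEL_SUM, MODEL_LXOR, MODEL_BXOR, MODEL_WRITE
};

/*
 * Apply 'op' element by element. target[i] plays the role of addr[i]
 * (remote memory) and src[i] the role of buf[i]. A provider would make
 * each per-element update atomic; this model only shows the result.
 * Returns 0 on success, -1 for an op the model does not handle.
 */
static int apply_model_op(enum model_op op, uint64_t *target,
                          const uint64_t *src, size_t count)
{
    assert(target != NULL && src != NULL);

    for (size_t i = 0; i < count; i++) {
        switch (op) {
        case MODEL_MIN:
            if (src[i] < target[i])
                target[i] = src[i];
            break;
        case MODEL_MAX:
            if (src[i] > target[i])
                target[i] = src[i];
            break;
        case MODEL_SUM:
            target[i] = target[i] + src[i];
            break;
        case MODEL_LXOR:
            /* Logical XOR yields 0 or 1, not a bit pattern. */
            target[i] = (target[i] && !src[i]) || (!target[i] && src[i]);
            break;
        case MODEL_BXOR:
            target[i] = target[i] ^ src[i];
            break;
        case MODEL_WRITE:
            target[i] = src[i];
            break;
        default:
            return -1;
        }
    }
    return 0;
}
```

Note that the logical operations (FI_LOR, FI_LAND, FI_LXOR) treat each operand as a truth value, so the model stores 0 or 1, whereas the bitwise forms combine the operands bit by bit.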

Base Atomic Functions
The base atomic functions (fi_atomic, fi_atomicv, fi_atomicmsg) are
used to transmit data to a remote node, where the specified atomic
operation is performed against the target data. The result of a base
atomic function is stored at the remote memory region. The main
difference between the atomic functions is the number and type of
parameters that they accept as input. Otherwise, they perform the same
general function.

The call fi_atomic transfers the data contained in the user-specified
data buffer to a remote node. For connectionless endpoints, the
destination endpoint is specified through the dest_addr parameter.
Unless the endpoint has been configured differently, the data buffer
passed into fi_atomic must not be touched by the application until the
fi_atomic call completes asynchronously. The target buffer of a base
atomic operation must allow for remote read and/or write access, as
appropriate.

The fi_atomicv call adds support for a scatter-gather list to
fi_atomic. The fi_atomicv call transfers the set of data buffers
referenced by the iov parameter to the remote node for processing.

The fi_inject_atomic call is an optimized version of fi_atomic. The
fi_inject_atomic function behaves as if the FI_INJECT transfer flag
were set, and FI_COMPLETION were not. That is, the data buffer is
available for reuse immediately on returning from fi_inject_atomic,
and no completion event will be generated for this atomic. The
completion event will be suppressed even if the endpoint has not been
configured with FI_SELECTIVE_COMPLETION. See the flags discussion
below for more details. The requested message size that can be used
with fi_inject_atomic is limited by inject_size.

The fi_atomicmsg call supports atomic functions over both connected and
connectionless endpoints, with the ability to control the atomic
operation per call through the use of flags. The fi_atomicmsg function
takes a struct fi_msg_atomic as input.

    struct fi_msg_atomic {
        const struct fi_ioc     *msg_iov;      /* local scatter-gather array */
        void                    **desc;        /* local access descriptors */
        size_t                  iov_count;     /* # elements in msg_iov */
        fi_addr_t               addr;          /* optional endpoint address */
        const struct fi_rma_ioc *rma_iov;      /* remote SGL */
        size_t                  rma_iov_count; /* # elements in remote SGL */
        enum fi_datatype        datatype;      /* operand datatype */
        enum fi_op              op;            /* atomic operation */
        void                    *context;      /* user-defined context */
        uint64_t                data;          /* optional data */
    };

    struct fi_ioc {
        void   *addr;  /* local address */
        size_t count;  /* # target operands */
    };

    struct fi_rma_ioc {
        uint64_t addr;   /* target address */
        size_t   count;  /* # target operands */
        uint64_t key;    /* access key */
    };

The following atomic operations are usable with base atomic
operations: FI_MIN, FI_MAX, FI_SUM, FI_PROD, FI_LOR, FI_LAND, FI_BOR,
FI_BAND, FI_LXOR, FI_BXOR, and FI_ATOMIC_WRITE.
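
As an illustration of how the base calls relate, the sketch below posts the same remote 64-bit sum twice: once through fi_atomic and once through fi_atomicmsg. It is a sketch under stated assumptions, not a reference implementation. The helper names post_remote_sum and post_remote_sum_msg are hypothetical, the endpoint, memory descriptor, destination address, remote address, and key are assumed to have been set up elsewhere, the addr field of struct fi_msg_atomic is assumed to be of type fi_addr_t as in the installed header, and the descriptor structures built on the stack are assumed to be consumed by the call itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <rdma/fabric.h>
#include <rdma/fi_endpoint.h>
#include <rdma/fi_atomic.h>

/* Hypothetical helper: add 'count' 64-bit operands to remote memory. */
static ssize_t post_remote_sum(struct fid_ep *ep, const uint64_t *operands,
                               size_t count, void *local_desc,
                               fi_addr_t dest_addr, uint64_t remote_addr,
                               uint64_t remote_key, void *context)
{
    return fi_atomic(ep, operands, count, local_desc, dest_addr,
                     remote_addr, remote_key, FI_UINT64, FI_SUM, context);
}

/* Hypothetical helper: the same request expressed as a message descriptor,
 * which additionally allows per-call flags such as FI_COMPLETION. */
static ssize_t post_remote_sum_msg(struct fid_ep *ep, const uint64_t *operands,
                                   size_t count, void *local_desc,
                                   fi_addr_t dest_addr, uint64_t remote_addr,
                                   uint64_t remote_key, void *context,
                                   uint64_t flags)
{
    /* One local segment and one remote segment, each 'count' operands. */
    struct fi_ioc ioc = { .addr = (void *) operands, .count = count };
    struct fi_rma_ioc rma_ioc = {
        .addr = remote_addr, .count = count, .key = remote_key
    };
    struct fi_msg_atomic msg = {
        .msg_iov = &ioc,
        .desc = &local_desc,
        .iov_count = 1,
        .addr = dest_addr,
        .rma_iov = &rma_ioc,
        .rma_iov_count = 1,
        .datatype = FI_UINT64,
        .op = FI_SUM,
        .context = context,
        .data = 0,
    };

    return fi_atomicmsg(ep, &msg, flags);
}
```

In both cases the operand buffer must remain untouched until the operation completes, unless the request is injected. A negative return value such as -FI_EAGAIN should be handled by the caller.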

Fetch-Atomic Functions
The fetch atomic functions (fi_fetch_atomic, fi_fetch_atomicv, and
fi_fetch_atomicmsg) behave similarly to the equivalent base atomic
function. The difference between the fetch and base atomic calls is
that the fetch atomic routines return the initial value that was stored
at the target to the user. The initial value is read into the
user-provided result buffer. The target buffer of fetch-atomic
operations must be enabled for remote read access.

The following atomic operations are usable with fetch atomic
operations: FI_MIN, FI_MAX, FI_SUM, FI_PROD, FI_LOR, FI_LAND, FI_BOR,
FI_BAND, FI_LXOR, FI_BXOR, FI_ATOMIC_READ, and FI_ATOMIC_WRITE.

For FI_ATOMIC_READ operations, the source buffer operand (e.g. the
fi_fetch_atomic buf parameter) is ignored and may be NULL. The results
are written into the result buffer.

Compare-Atomic Functions
The compare atomic functions (fi_compare_atomic, fi_compare_atomicv,
and fi_compare_atomicmsg) are used for operations that require
comparing the target data against a value before performing a swap
operation. The compare atomic functions support: FI_CSWAP,
FI_CSWAP_NE, FI_CSWAP_LE, FI_CSWAP_LT, FI_CSWAP_GE, FI_CSWAP_GT, and
FI_MSWAP.
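
The operand order of the compare operations is easy to misread: in the conceptual descriptions, the compare value is on the left of the comparison and the target value on the right. The following plain C model is a sketch of that behavior only. The names model_cmp_op and apply_model_compare are hypothetical, the target array stands in for remote memory, and nothing here is atomic or part of libfabric. Following the fetch and compare semantics described above, the model stores the initial target value in the result buffer whether or not a swap takes place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local stand-ins for the compare ops (not the libfabric enum). */
enum model_cmp_op {
    MODEL_CSWAP, MODEL_CSWAP_NE, MODEL_CSWAP_LE, MODEL_CSWAP_LT,
    MODEL_CSWAP_GE, MODEL_CSWAP_GT, MODEL_MSWAP
};

/*
 * For each element: save the initial target value in result[i], then
 * conditionally replace the target with src[i]. Comparisons are written
 * as "compare[i] OP target[i]", matching the conceptual descriptions.
 * Returns 0 on success, -1 for an op the model does not handle.
 */
static int apply_model_compare(enum model_cmp_op op, uint64_t *target,
                               const uint64_t *src, const uint64_t *compare,
                               uint64_t *result, size_t count)
{
    assert(target != NULL && src != NULL && compare != NULL && result != NULL);

    for (size_t i = 0; i < count; i++) {
        uint64_t old = target[i];
        int swap;

        result[i] = old;
        switch (op) {
        case MODEL_CSWAP:    swap = (compare[i] == old); break;
        case MODEL_CSWAP_NE: swap = (compare[i] != old); break;
        case MODEL_CSWAP_LE: swap = (compare[i] <= old); break;
        case MODEL_CSWAP_LT: swap = (compare[i] <  old); break;
        case MODEL_CSWAP_GE: swap = (compare[i] >= old); break;
        case MODEL_CSWAP_GT: swap = (compare[i] >  old); break;
        case MODEL_MSWAP:
            /* compare[i] acts as a bit mask selecting bits taken from src. */
            target[i] = (src[i] & compare[i]) | (old & ~compare[i]);
            continue;
        default:
            return -1;
        }
        if (swap)
            target[i] = src[i];
    }
    return 0;
}
```

For example, with a target of 9 and a compare value of 10, MODEL_CSWAP_GT swaps (10 > 9) while MODEL_CSWAP_LT does not (10 < 9 is false).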

Atomic Valid Functions
The atomic valid functions (fi_atomicvalid, fi_fetch_atomicvalid, and
fi_compare_atomicvalid) indicate which operations the local provider
supports. Needed operations not supported by the provider must be
emulated by the application. Each valid call corresponds to a set of
atomic functions. fi_atomicvalid checks whether a provider supports a
specific base atomic operation for a given datatype and operation.
fi_fetch_atomicvalid indicates if a provider supports a specific
fetch-atomic operation for a given datatype and operation.
fi_compare_atomicvalid checks if a provider supports a specified
compare-atomic operation for a given datatype and operation.

If an operation is supported, an atomic valid call will return 0, along
with a count of atomic data units that a single function call will
operate on.

Query Atomic Attributes
The fi_query_atomic call acts as an enhanced atomic valid operation
(see the atomic valid function definitions above). It is provided, in
part, for future extensibility. The query operation reports which
atomic operations are supported by the domain, for suitably configured
endpoints.

The behavior of fi_query_atomic is adjusted based on the flags
parameter. If flags is 0, then the operation reports the supported
atomic attributes for base atomic operations, similar to fi_atomicvalid
for endpoints. If flags has the FI_FETCH_ATOMIC bit set, the operation
behaves similarly to fi_fetch_atomicvalid. Similarly, the flag bit
FI_COMPARE_ATOMIC results in the query acting as
fi_compare_atomicvalid. The FI_FETCH_ATOMIC and FI_COMPARE_ATOMIC bits
may not both be set.

If the FI_TAGGED bit is set, the provider will indicate if it supports
atomic operations to tagged receive buffers. The FI_TAGGED bit may be
used by itself, or in conjunction with the FI_FETCH_ATOMIC and
FI_COMPARE_ATOMIC flags.

The output of fi_query_atomic is struct fi_atomic_attr:

    struct fi_atomic_attr {
        size_t count;
        size_t size;
    };

The count attribute field is as defined for the atomic valid calls.
The size field indicates the size in bytes of the atomic datatype. The
size field is useful for datatypes that may differ in sizes based on
the platform or compiler, such as FI_LONG_DOUBLE.
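
A support check along the lines described above might look like the following sketch. The helper names max_fetch_sum_u64 and query_cswap_u64 are hypothetical, and the endpoint and domain are assumed to have been opened elsewhere. The sketch treats any nonzero return from the valid call as "not supported", which is an assumption about how an application may choose to react rather than a statement of which error codes a provider returns.

```c
#include <stddef.h>
#include <stdint.h>
#include <rdma/fabric.h>
#include <rdma/fi_domain.h>
#include <rdma/fi_endpoint.h>
#include <rdma/fi_atomic.h>

/*
 * Hypothetical helper: return the number of FI_UINT64 elements a single
 * fetch-atomic FI_SUM call may operate on, or 0 if the provider does not
 * report support for that combination on this endpoint.
 */
static size_t max_fetch_sum_u64(struct fid_ep *ep)
{
    size_t count = 0;

    if (fi_fetch_atomicvalid(ep, FI_UINT64, FI_SUM, &count) != 0)
        return 0;
    return count;
}

/*
 * Hypothetical helper: domain-level query for compare-and-swap on
 * FI_UINT64. On success (return value 0), attr->count holds the element
 * limit and attr->size the datatype size in bytes.
 */
static int query_cswap_u64(struct fid_domain *domain,
                           struct fi_atomic_attr *attr)
{
    return fi_query_atomic(domain, FI_UINT64, FI_CSWAP, attr,
                           FI_COMPARE_ATOMIC);
}
```

An application would typically run such checks once at startup and fall back to its own emulation when an operation it needs is not reported as supported.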

Completions
Completed atomic operations are reported to the initiator of the
request through an associated completion queue or counter. Any
user-provided context specified with the request will be returned as
part of any completion event written to a CQ. See fi_cq(3) for
completion event details.

Any results returned to the initiator as part of an atomic operation
will be available prior to a completion event being generated. This
will be true even if the requested completion semantic provides a
weaker guarantee. That is, atomic fetch operations have
FI_DELIVERY_COMPLETE semantics. Completions generated for other types
of atomic operations indicate that it is safe to re-use the source data
buffers.

Any updates to data at the target of an atomic operation will be
visible to agents (CPU processes, NICs, and other devices) on the
target node prior to one of the following occurring. If the atomic
operation generates a completion event or updates a completion counter
at the target endpoint, the results will be available prior to the
completion notification. After processing a completion for the atomic,
if the initiator submits a transfer between the same endpoints that
generates a completion at the target, the results will be available
prior to the subsequent transfer's event. Or, if a fenced data
transfer from the initiator follows the atomic request, the results
will be available prior to a completion at the target for the fenced
transfer.

The correctness of atomic operations on a target memory region is
guaranteed only when performed by a single actor for a given window of
time. An actor is defined as a single libfabric domain (identified by
the domain name, and not an open instance of that domain), a coherent
CPU complex, or other device (e.g. GPU) capable of performing atomic
operations on the target memory. The results of atomic operations
performed by multiple actors simultaneously are undefined. For
example, issuing CPU based atomic operations to a target region
concurrently being updated by NIC based atomics may leave the region's
data in an unknown state. The results of a first actor's atomic
operations must be visible to a second actor prior to the second actor
issuing its own atomics.

FLAGS
The fi_atomicmsg, fi_fetch_atomicmsg, and fi_compare_atomicmsg calls
allow the user to specify flags which can change the default data
transfer operation. Flags specified with atomic message operations
override most flags previously configured with the endpoint, except
where noted (see fi_control). The following flags are usable with
atomic message calls.

FI_COMPLETION
    Indicates that a completion entry should be generated for the
    specified operation. The endpoint must be bound to a completion
    queue with FI_SELECTIVE_COMPLETION that corresponds to the
    specified operation, or this flag is ignored.

FI_MORE
    Indicates that the user has additional requests that will
    immediately be posted after the current call returns. Use of this
    flag may improve performance by enabling the provider to optimize
    its access to the fabric hardware.

FI_INJECT
    Indicates that the control of constant data buffers should be
    returned to the user immediately after the call returns, even if
    the operation is handled asynchronously. This may require that
    the underlying provider implementation copy the data into a local
    buffer and transfer out of that buffer. Constant data buffers
    refers to any data buffer or iovec used by the atomic APIs that
    is marked as 'const'. Non-constant or output buffers are
    unaffected by this flag and may be accessed by the provider at
    any time until the operation has completed. This flag can only
    be used with messages smaller than inject_size.

FI_FENCE
    Applies to transmits. Indicates that the requested operation,
    also known as the fenced operation, and any operation posted
    after the fenced operation will be deferred until all previous
    operations targeting the same peer endpoint have completed.
    Operations posted after the fencing will see and/or replace the
    results of any operations initiated prior to the fenced operation.

    The ordering of operations starting at the posting of the fenced
    operation (inclusive) to the posting of a subsequent fenced
    operation (exclusive) is controlled by the endpoint's ordering
    semantics.

FI_TAGGED
    Specifies that the target of the atomic operation is a tagged
    receive buffer instead of an RMA buffer. When a tagged buffer
    is the target memory region, the addr parameter is used as a
    0-based byte offset into the tagged buffer, with the key
    parameter specifying the tag.

RETURN VALUE
Returns 0 on success. On error, a negative value corresponding to
fabric errno is returned. Fabric errno values are defined in
rdma/fi_errno.h.

ERRORS
-FI_EAGAIN
    See fi_msg(3) for a detailed description of handling FI_EAGAIN.

-FI_EOPNOTSUPP
    The requested atomic operation is not supported on this endpoint.

-FI_EMSGSIZE
    The number of atomic operations in a single request exceeds that
    supported by the underlying provider.

NOTES
Atomic operations operate on an array of values of a specific data
type. Atomicity is only guaranteed for each data type operation, not
across the entire array. The following pseudo-code demonstrates this
operation for 64-bit unsigned atomic write. ATOMIC_WRITE_U64 is a
platform dependent macro that atomically writes 8 bytes to an aligned
memory location.

    fi_atomic(ep, buf, count, NULL, dest_addr, addr, key,
              FI_UINT64, FI_ATOMIC_WRITE, context)
    {
        /* Each element is written atomically; the array as a whole is not. */
        for (i = 0; i < count; i++)
            ATOMIC_WRITE_U64(((uint64_t *) addr)[i],
                             ((uint64_t *) buf)[i]);
    }

The number of array elements to operate on is specified through a count
parameter. This must be between 1 and the maximum returned through the
relevant valid operation, inclusive. The requested operation and data
type must also be valid for the given provider.
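
When an application needs to update more elements than the provider accepts in one call, it can split the work into several requests. The sketch below shows only the chunking arithmetic, in plain C with no libfabric dependency. The names post_fn, post_in_chunks, chunk_log, and record_chunk are hypothetical. In a real application the callback would be expected to issue one atomic call per chunk, offsetting both the local buffer and the remote address by the chunk's element offset multiplied by the datatype size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: post one request covering 'count' elements that
 * start at element index 'offset'. Returns 0 on success. */
typedef int (*post_fn)(size_t offset, size_t count, void *arg);

/*
 * Split 'total' elements into requests of at most 'max_count' elements,
 * where max_count is the limit reported by the relevant valid call.
 * Stops at the first failing request and returns its error code.
 */
static int post_in_chunks(size_t total, size_t max_count,
                          post_fn post, void *arg)
{
    size_t offset = 0;

    if (max_count == 0 || post == NULL)
        return -1;

    while (offset < total) {
        size_t n = total - offset;
        int ret;

        if (n > max_count)
            n = max_count;
        ret = post(offset, n, arg);
        if (ret != 0)
            return ret;
        offset += n;
    }
    return 0;
}

/* Example callback that only records what it was asked to post. */
struct chunk_log {
    size_t calls;       /* number of requests issued */
    size_t elements;    /* total elements covered */
    size_t last_count;  /* size of the most recent request */
};

static int record_chunk(size_t offset, size_t count, void *arg)
{
    struct chunk_log *log = arg;

    (void) offset;
    log->calls++;
    log->elements += count;
    log->last_count = count;
    return 0;
}
```

With 10 elements and a limit of 4, the callback is invoked three times with counts 4, 4, and 2. Because atomicity applies per element, splitting a request this way does not weaken any guarantee that a single larger request would have provided.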

The ordering of atomic operations carried as part of different request
messages is subject to the message and data ordering definitions
assigned to the transmitting and receiving endpoints. Both message and
data ordering are required if the results of two atomic operations to
the same memory buffers are to reflect the second operation acting on
the results of the first. See fi_endpoint(3) for further details and
message size restrictions.

SEE ALSO
fi_getinfo(3), fi_endpoint(3), fi_domain(3), fi_cq(3), fi_rma(3)

AUTHORS
OpenFabrics.

Libfabric Programmer's Manual        2022-12-11        fi_atomic(3)