fi_endpoint(3)                 Libfabric v1.18.1                 fi_endpoint(3)
2
3
4
NAME
       fi_endpoint - Fabric endpoint operations
7
8 fi_endpoint / fi_endpoint2 / fi_scalable_ep / fi_passive_ep / fi_close
9 Allocate or close an endpoint.
10
11 fi_ep_bind
12 Associate an endpoint with hardware resources, such as event
13 queues, completion queues, counters, address vectors, or shared
14 transmit/receive contexts.
15
16 fi_scalable_ep_bind
17 Associate a scalable endpoint with an address vector
18
19 fi_pep_bind
20 Associate a passive endpoint with an event queue
21
22 fi_enable
23 Transitions an active endpoint into an enabled state.
24
25 fi_cancel
26 Cancel a pending asynchronous data transfer
27
28 fi_ep_alias
29 Create an alias to the endpoint
30
31 fi_control
32 Control endpoint operation.
33
34 fi_getopt / fi_setopt
35 Get or set endpoint options.
36
37 fi_rx_context / fi_tx_context / fi_srx_context / fi_stx_context
38 Open a transmit or receive context.
39
40 fi_tc_dscp_set / fi_tc_dscp_get
41 Convert between a DSCP value and a network traffic class
42
43 fi_rx_size_left / fi_tx_size_left (DEPRECATED)
44 Query the lower bound on how many RX/TX operations may be posted
              without an operation returning -FI_EAGAIN. These functions have
46 been deprecated and will be removed in a future version of the
47 library.
48
SYNOPSIS
       #include <rdma/fabric.h>
51
52 #include <rdma/fi_endpoint.h>
53
54 int fi_endpoint(struct fid_domain *domain, struct fi_info *info,
55 struct fid_ep **ep, void *context);
56
57 int fi_endpoint2(struct fid_domain *domain, struct fi_info *info,
58 struct fid_ep **ep, uint64_t flags, void *context);
59
60 int fi_scalable_ep(struct fid_domain *domain, struct fi_info *info,
61 struct fid_ep **sep, void *context);
62
       int fi_passive_ep(struct fid_fabric *fabric, struct fi_info *info,
64 struct fid_pep **pep, void *context);
65
66 int fi_tx_context(struct fid_ep *sep, int index,
67 struct fi_tx_attr *attr, struct fid_ep **tx_ep,
68 void *context);
69
70 int fi_rx_context(struct fid_ep *sep, int index,
71 struct fi_rx_attr *attr, struct fid_ep **rx_ep,
72 void *context);
73
74 int fi_stx_context(struct fid_domain *domain,
75 struct fi_tx_attr *attr, struct fid_stx **stx,
76 void *context);
77
78 int fi_srx_context(struct fid_domain *domain,
79 struct fi_rx_attr *attr, struct fid_ep **rx_ep,
80 void *context);
81
82 int fi_close(struct fid *ep);
83
84 int fi_ep_bind(struct fid_ep *ep, struct fid *fid, uint64_t flags);
85
86 int fi_scalable_ep_bind(struct fid_ep *sep, struct fid *fid, uint64_t flags);
87
88 int fi_pep_bind(struct fid_pep *pep, struct fid *fid, uint64_t flags);
89
90 int fi_enable(struct fid_ep *ep);
91
92 int fi_cancel(struct fid_ep *ep, void *context);
93
94 int fi_ep_alias(struct fid_ep *ep, struct fid_ep **alias_ep, uint64_t flags);
95
96 int fi_control(struct fid *ep, int command, void *arg);
97
98 int fi_getopt(struct fid *ep, int level, int optname,
99 void *optval, size_t *optlen);
100
101 int fi_setopt(struct fid *ep, int level, int optname,
102 const void *optval, size_t optlen);
103
104 uint32_t fi_tc_dscp_set(uint8_t dscp);
105
106 uint8_t fi_tc_dscp_get(uint32_t tclass);
107
108 DEPRECATED ssize_t fi_rx_size_left(struct fid_ep *ep);
109
110 DEPRECATED ssize_t fi_tx_size_left(struct fid_ep *ep);
111
ARGUMENTS
       fid    On creation, specifies a fabric or access domain. On bind,
114 identifies the event queue, completion queue, counter, or ad‐
115 dress vector to bind to the endpoint. In other cases, it’s a
116 fabric identifier of an associated resource.
117
118 info Details about the fabric interface endpoint to be opened, ob‐
119 tained from fi_getinfo.
120
121 ep A fabric endpoint.
122
123 sep A scalable fabric endpoint.
124
125 pep A passive fabric endpoint.
126
127 context
128 Context associated with the endpoint or asynchronous operation.
129
130 index Index to retrieve a specific transmit/receive context.
131
132 attr Transmit or receive context attributes.
133
134 flags Additional flags to apply to the operation.
135
136 command
137 Command of control operation to perform on endpoint.
138
139 arg Optional control argument.
140
141 level Protocol level at which the desired option resides.
142
143 optname
144 The protocol option to read or set.
145
146 optval The option value that was read or to set.
147
148 optlen The size of the optval buffer.
149
DESCRIPTION
       Endpoints are transport level communication portals. There are two
152 types of endpoints: active and passive. Passive endpoints belong to a
153 fabric domain and are most often used to listen for incoming connection
154 requests. However, a passive endpoint may be used to reserve a fabric
155 address that can be granted to an active endpoint. Active endpoints
156 belong to access domains and can perform data transfers.
157
158 Active endpoints may be connection-oriented or connectionless, and may
159 provide data reliability. The data transfer interfaces – messages
160 (fi_msg), tagged messages (fi_tagged), RMA (fi_rma), and atomics
161 (fi_atomic) – are associated with active endpoints. In basic configu‐
162 rations, an active endpoint has transmit and receive queues. In gener‐
163 al, operations that generate traffic on the fabric are posted to the
164 transmit queue. This includes all RMA and atomic operations, along
165 with sent messages and sent tagged messages. Operations that post buf‐
166 fers for receiving incoming data are submitted to the receive queue.
167
168 Active endpoints are created in the disabled state. They must transi‐
169 tion into an enabled state before accepting data transfer operations,
170 including posting of receive buffers. The fi_enable call is used to
171 transition an active endpoint into an enabled state. The fi_connect
172 and fi_accept calls will also transition an endpoint into the enabled
173 state, if it is not already active.
174
175 In order to transition an endpoint into an enabled state, it must be
176 bound to one or more fabric resources. An endpoint that will generate
177 asynchronous completions, either through data transfer operations or
178 communication establishment events, must be bound to the appropriate
179 completion queues or event queues, respectively, before being enabled.
180 Additionally, endpoints that use manual progress must be associated
181 with relevant completion queues or event queues in order to drive
182 progress. For endpoints that are only used as the target of RMA or
183 atomic operations, this means binding the endpoint to a completion
184 queue associated with receive processing. Connectionless endpoints
185 must be bound to an address vector.
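
       As a brief illustration of the sequence above, the following sketch
       allocates, binds, and enables a connectionless endpoint. It is not
       part of the libfabric distribution; the helper name is hypothetical,
       the domain, completion queue, and address vector are assumed to have
       been opened earlier from the same fi_getinfo() results, and cleanup
       on failure is omitted.

       #include <rdma/fabric.h>
       #include <rdma/fi_eq.h>
       #include <rdma/fi_domain.h>
       #include <rdma/fi_endpoint.h>

       static int open_enabled_ep(struct fid_domain *domain,
                                  struct fi_info *info, struct fid_cq *cq,
                                  struct fid_av *av, struct fid_ep **ep)
       {
           int ret;

           ret = fi_endpoint(domain, info, ep, NULL);
           if (ret)
               return ret;

           /* Direct both transmit and receive completions to one CQ. */
           ret = fi_ep_bind(*ep, &cq->fid, FI_TRANSMIT | FI_RECV);
           if (ret)
               return ret;

           /* Connectionless endpoints must be bound to an address vector. */
           ret = fi_ep_bind(*ep, &av->fid, 0);
           if (ret)
               return ret;

           return fi_enable(*ep);
       }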
186
187 Once an endpoint has been activated, it may be associated with an ad‐
188 dress vector. Receive buffers may be posted to it and calls may be
189 made to connection establishment routines. Connectionless endpoints
190 may also perform data transfers.
191
192 The behavior of an endpoint may be adjusted by setting its control data
193 and protocol options. This allows the underlying provider to redirect
194 function calls to implementations optimized to meet the desired appli‐
195 cation behavior.
196
197 If an endpoint experiences a critical error, it will transition back
198 into a disabled state. Critical errors are reported through the event
199 queue associated with the EP. In certain cases, a disabled endpoint
200 may be re-enabled. The ability to transition back into an enabled
201 state is provider specific and depends on the type of error that the
202 endpoint experienced. When an endpoint is disabled as a result of a
203 critical error, all pending operations are discarded.
204
205 fi_endpoint / fi_passive_ep / fi_scalable_ep
206 fi_endpoint allocates a new active endpoint. fi_passive_ep allocates a
207 new passive endpoint. fi_scalable_ep allocates a scalable endpoint.
208 The properties and behavior of the endpoint are defined based on the
209 provided struct fi_info. See fi_getinfo for additional details on
210 fi_info. fi_info flags that control the operation of an endpoint are
211 defined below. See section SCALABLE ENDPOINTS.
212
213 If an active endpoint is allocated in order to accept a connection re‐
214 quest, the fi_info parameter must be the same as the fi_info structure
215 provided with the connection request (FI_CONNREQ) event.
216
217 An active endpoint may acquire the properties of a passive endpoint by
218 setting the fi_info handle field to the passive endpoint fabric de‐
219 scriptor. This is useful for applications that need to reserve the
220 fabric address of an endpoint prior to knowing if the endpoint will be
221 used on the active or passive side of a connection. For example, this
222 feature is useful for simulating socket semantics. Once an active end‐
223 point acquires the properties of a passive endpoint, the passive end‐
224 point is no longer bound to any fabric resources and must no longer be
225 used. The user is expected to close the passive endpoint after opening
226 the active endpoint in order to free up any lingering resources that
227 had been used.
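
       A minimal sketch of this pattern follows, assuming the fabric, domain,
       and fi_info objects already exist; the function name is illustrative
       and error handling is abbreviated.

       static int promote_to_active(struct fid_fabric *fabric,
                                    struct fid_domain *domain,
                                    struct fi_info *info, struct fid_ep **ep)
       {
           struct fid_pep *pep;
           int ret;

           ret = fi_passive_ep(fabric, info, &pep, NULL);
           if (ret)
               return ret;

           info->handle = &pep->fid;   /* hand the reserved address over */
           ret = fi_endpoint(domain, info, ep, NULL);

           /* The passive endpoint no longer owns fabric resources; close it. */
           fi_close(&pep->fid);
           return ret;
       }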
228
229 fi_endpoint2
       Similar to fi_endpoint, but accepts an extra flags parameter. It is
       mainly used for opening endpoints that use the peer transfer feature.
       See fi_peer(3).
233
234 fi_close
       Closes an endpoint and releases all resources associated with it.
236
       When closing a scalable endpoint, there must be no open transmit or
       receive contexts associated with the scalable endpoint. If
239 resources are still associated with the scalable endpoint when attempt‐
240 ing to close, the call will return -FI_EBUSY.
241
242 Outstanding operations posted to the endpoint when fi_close is called
243 will be discarded. Discarded operations will silently be dropped, with
244 no completions reported. Additionally, a provider may discard previ‐
245 ously completed operations from the associated completion queue(s).
246 The behavior to discard completed operations is provider specific.
247
248 fi_ep_bind
249 fi_ep_bind is used to associate an endpoint with other allocated re‐
250 sources, such as completion queues, counters, address vectors, event
251 queues, shared contexts, and memory regions. The type of objects that
252 must be bound with an endpoint depend on the endpoint type and its con‐
253 figuration.
254
255 Passive endpoints must be bound with an EQ that supports connection
256 management events. Connectionless endpoints must be bound to a single
257 address vector. If an endpoint is using a shared transmit and/or re‐
258 ceive context, the shared contexts must be bound to the endpoint. CQs,
259 counters, AV, and shared contexts must be bound to endpoints before
260 they are enabled either explicitly or implicitly.
261
262 An endpoint must be bound with CQs capable of reporting completions for
263 any asynchronous operation initiated on the endpoint. For example, if
264 the endpoint supports any outbound transfers (sends, RMA, atomics,
265 etc.), then it must be bound to a completion queue that can report
266 transmit completions. This is true even if the endpoint is configured
267 to suppress successful completions, in order that operations that com‐
268 plete in error may be reported to the user.
269
270 An active endpoint may direct asynchronous completions to different
271 CQs, based on the type of operation. This is specified using
272 fi_ep_bind flags. The following flags may be OR’ed together when bind‐
273 ing an endpoint to a completion domain CQ.
274
275 FI_RECV
276 Directs the notification of inbound data transfers to the speci‐
277 fied completion queue. This includes received messages. This
278 binding automatically includes FI_REMOTE_WRITE, if applicable to
279 the endpoint.
280
281 FI_SELECTIVE_COMPLETION
282 By default, data transfer operations write CQ completion entries
283 into the associated completion queue after they have successful‐
284 ly completed. Applications can use this bind flag to selective‐
285 ly enable when completions are generated. If FI_SELECTIVE_COM‐
286 PLETION is specified, data transfer operations will not generate
287 CQ entries for successful completions unless FI_COMPLETION is
288 set as an operational flag for the given operation. Operations
289 that fail asynchronously will still generate completions, even
290 if a completion is not requested. FI_SELECTIVE_COMPLETION must
291 be OR’ed with FI_TRANSMIT and/or FI_RECV flags.
292
293 When FI_SELECTIVE_COMPLETION is set, the user must determine when a re‐
294 quest that does NOT have FI_COMPLETION set has completed indirectly,
295 usually based on the completion of a subsequent operation or by using
296 completion counters. Use of this flag may improve performance by al‐
297 lowing the provider to avoid writing a CQ completion entry for every
298 operation.
299
300 See Notes section below for additional information on how this flag in‐
301 teracts with the FI_CONTEXT and FI_CONTEXT2 mode bits.
302
303 FI_TRANSMIT
304 Directs the completion of outbound data transfer requests to the
305 specified completion queue. This includes send message, RMA,
306 and atomic operations.
307
308 An endpoint may optionally be bound to a completion counter. Associat‐
309 ing an endpoint with a counter is in addition to binding the EP with a
310 CQ. When binding an endpoint to a counter, the following flags may be
311 specified.
312
313 FI_READ
314 Increments the specified counter whenever an RMA read, atomic
315 fetch, or atomic compare operation initiated from the endpoint
316 has completed successfully or in error.
317
318 FI_RECV
319 Increments the specified counter whenever a message is received
320 over the endpoint. Received messages include both tagged and
321 normal message operations.
322
323 FI_REMOTE_READ
324 Increments the specified counter whenever an RMA read, atomic
325 fetch, or atomic compare operation is initiated from a remote
326 endpoint that targets the given endpoint. Use of this flag re‐
327 quires that the endpoint be created using FI_RMA_EVENT.
328
329 FI_REMOTE_WRITE
330 Increments the specified counter whenever an RMA write or base
331 atomic operation is initiated from a remote endpoint that tar‐
332 gets the given endpoint. Use of this flag requires that the
333 endpoint be created using FI_RMA_EVENT.
334
335 FI_SEND
336 Increments the specified counter whenever a message transfer
337 initiated over the endpoint has completed successfully or in er‐
338 ror. Sent messages include both tagged and normal message oper‐
339 ations.
340
341 FI_WRITE
342 Increments the specified counter whenever an RMA write or base
343 atomic operation initiated from the endpoint has completed suc‐
344 cessfully or in error.
345
346 An endpoint may only be bound to a single CQ or counter for a given
       type of operation. For example, an EP may not bind to two counters both
348 using FI_WRITE. Furthermore, providers may limit CQ and counter bind‐
349 ings to endpoints of the same endpoint type (DGRAM, MSG, RDM, etc.).
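
       The following sketch shows one way these bind flags might be combined;
       the helper name is hypothetical, and the CQ and counter are assumed to
       have been opened with fi_cq_open() and fi_cntr_open() (declared in
       <rdma/fi_domain.h>).

       static int bind_ep_resources(struct fid_ep *ep, struct fid_cq *cq,
                                    struct fid_cntr *cntr)
       {
           int ret;

           /* One CQ for transmit and receive completions, with successful
            * completions suppressed unless FI_COMPLETION is set per
            * operation. */
           ret = fi_ep_bind(ep, &cq->fid,
                            FI_TRANSMIT | FI_RECV | FI_SELECTIVE_COMPLETION);
           if (ret)
               return ret;

           /* Count completed sends and RMA/atomic writes. */
           return fi_ep_bind(ep, &cntr->fid, FI_SEND | FI_WRITE);
       }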
350
351 fi_scalable_ep_bind
352 fi_scalable_ep_bind is used to associate a scalable endpoint with an
353 address vector. See section on SCALABLE ENDPOINTS. A scalable end‐
354 point has a single transport level address and can support multiple
355 transmit and receive contexts. The transmit and receive contexts share
356 the transport-level address. Address vectors that are bound to scal‐
357 able endpoints are implicitly bound to any transmit or receive contexts
358 created using the scalable endpoint.
359
360 fi_enable
361 This call transitions the endpoint into an enabled state. An endpoint
362 must be enabled before it may be used to perform data transfers. En‐
363 abling an endpoint typically results in hardware resources being as‐
364 signed to it. Endpoints making use of completion queues, counters,
365 event queues, and/or address vectors must be bound to them before being
366 enabled.
367
368 Calling connect or accept on an endpoint will implicitly enable an end‐
369 point if it has not already been enabled.
370
371 fi_enable may also be used to re-enable an endpoint that has been dis‐
372 abled as a result of experiencing a critical error. Applications
373 should check the return value from fi_enable to see if a disabled end‐
       point has successfully been re-enabled.
375
376 fi_cancel
377 fi_cancel attempts to cancel an outstanding asynchronous operation.
378 Canceling an operation causes the fabric provider to search for the op‐
379 eration and, if it is still pending, complete it as having been can‐
380 celed. An error queue entry will be available in the associated error
381 queue with error code FI_ECANCELED. On the other hand, if the opera‐
382 tion completed before the call to fi_cancel, then the completion status
383 of that operation will be available in the associated completion queue.
384 No specific entry related to fi_cancel itself will be posted.
385
386 Cancel uses the context parameter associated with an operation to iden‐
387 tify the request to cancel. Operations posted without a valid context
388 parameter – either no context parameter is specified or the context
389 value was ignored by the provider – cannot be canceled. If multiple
390 outstanding operations match the context parameter, only one will be
391 canceled. In this case, the operation which is canceled is provider
392 specific. The cancel operation is asynchronous, but will complete
393 within a bounded period of time.
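
       A short sketch of this usage follows; the helper name is hypothetical,
       and depending on the libfabric version fi_cancel may be declared to
       take either the endpoint or its fid, so the fid member is passed here.

       static void post_then_cancel(struct fid_ep *ep, void *buf, size_t len,
                                    struct fi_context *ctx)
       {
           /* Post a receive using ctx as the operation context. */
           if (fi_recv(ep, buf, len, NULL, FI_ADDR_UNSPEC, ctx))
               return;

           /* If still pending, the receive completes with FI_ECANCELED on
            * the endpoint's error queue; otherwise its normal completion
            * stands. */
           (void) fi_cancel(&ep->fid, ctx);
       }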
394
395 fi_ep_alias
396 This call creates an alias to the specified endpoint. Conceptually, an
397 endpoint alias provides an alternate software path from the application
398 to the underlying provider hardware. An alias EP differs from its par‐
399 ent endpoint only by its default data transfer flags. For example, an
400 alias EP may be configured to use a different completion mode. By de‐
401 fault, an alias EP inherits the same data transfer flags as the parent
402 endpoint. An application can use fi_control to modify the alias EP op‐
403 erational flags.
404
405 When allocating an alias, an application may configure either the
406 transmit or receive operational flags. This avoids needing a separate
407 call to fi_control to set those flags. The flags passed to fi_ep_alias
408 must include FI_TRANSMIT or FI_RECV (not both) with other operational
409 flags OR’ed in. This will override the transmit or receive flags, re‐
410 spectively, for operations posted through the alias endpoint. All al‐
411 located aliases must be closed for the underlying endpoint to be re‐
412 leased.
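
       For example, the sketch below creates an alias whose transmit
       operations always generate completions, leaving the parent endpoint's
       flags untouched; the function name is illustrative only.

       static int alias_with_completions(struct fid_ep *ep,
                                         struct fid_ep **alias)
       {
           return fi_ep_alias(ep, alias, FI_TRANSMIT | FI_COMPLETION);
       }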
413
414 fi_control
415 The control operation is used to adjust the default behavior of an end‐
416 point. It allows the underlying provider to redirect function calls to
417 implementations optimized to meet the desired application behavior. As
418 a result, calls to fi_ep_control must be serialized against all other
419 calls to an endpoint.
420
421 The base operation of an endpoint is selected during creation using
422 struct fi_info. The following control commands and arguments may be
423 assigned to an endpoint.
424
       FI_BACKLOG - int *value
426 This option only applies to passive endpoints. It is used to
427 set the connection request backlog for listening endpoints.
428
       FI_GETOPSFLAG - uint64_t *flags
430 Used to retrieve the current value of flags associated with the
431 data transfer operations initiated on the endpoint. The control
432 argument must include FI_TRANSMIT or FI_RECV (not both) flags to
433 indicate the type of data transfer flags to be returned. See
434 below for a list of control flags.
435
436 FI_GETWAIT – void **
437 This command allows the user to retrieve the file descriptor as‐
438 sociated with a socket endpoint. The fi_control arg parameter
439 should be an address where a pointer to the returned file de‐
              scriptor will be written. See fi_eq.3 for additional details on
              using fi_control with FI_GETWAIT. The file descriptor may be used
442 for notification that the endpoint is ready to send or receive
443 data.
444
       FI_SETOPSFLAG - uint64_t *flags
446 Used to change the data transfer operation flags associated with
447 an endpoint. The control argument must include FI_TRANSMIT or
448 FI_RECV (not both) to indicate the type of data transfer that
449 the flags should apply to, with other flags OR’ed in. The given
450 flags will override the previous transmit and receive attributes
451 that were set when the endpoint was created. Valid control
452 flags are defined below.
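
       A brief sketch of FI_GETOPSFLAG and FI_SETOPSFLAG follows; the helper
       name is hypothetical and error handling is abbreviated.

       static int enable_tx_completions(struct fid_ep *ep)
       {
           uint64_t flags = FI_TRANSMIT;   /* select the transmit flags */
           int ret;

           ret = fi_control(&ep->fid, FI_GETOPSFLAG, &flags);
           if (ret)
               return ret;

           /* Add FI_COMPLETION as a default transmit operation flag. */
           flags |= FI_TRANSMIT | FI_COMPLETION;
           return fi_control(&ep->fid, FI_SETOPSFLAG, &flags);
       }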
453
454 fi_getopt / fi_setopt
       Endpoint protocol options may be retrieved using fi_getopt or set
       using fi_setopt. Applications specify the level at which a desired
       option exists, identify the option, and provide input/output buffers
       to get or set the option. fi_setopt provides an application a way to
       adjust low-level protocol and implementation specific details of an
       endpoint.
460
461 The following option levels and option names and parameters are de‐
462 fined.
463
       FI_OPT_ENDPOINT
465
466 FI_OPT_BUFFERED_LIMIT - size_t
467 Defines the maximum size of a buffered message that will be re‐
468 ported to users as part of a receive completion when the
469 FI_BUFFERED_RECV mode is enabled on an endpoint.
470
471 fi_getopt() will return the currently configured threshold, or the
       provider’s default threshold if one has not been set by the application.
473 fi_setopt() allows an application to configure the threshold. If the
474 provider cannot support the requested threshold, it will fail the
475 fi_setopt() call with FI_EMSGSIZE. Calling fi_setopt() with the
476 threshold set to SIZE_MAX will set the threshold to the maximum sup‐
477 ported by the provider. fi_getopt() can then be used to retrieve the
478 set size.
479
480 In most cases, the sending and receiving endpoints must be configured
481 to use the same threshold value, and the threshold must be set prior to
482 enabling the endpoint.
484
485 FI_OPT_BUFFERED_MIN - size_t
486 Defines the minimum size of a buffered message that will be re‐
487 ported. Applications would set this to a size that’s big enough
488 to decide whether to discard or claim a buffered receive or when
489 to claim a buffered receive on getting a buffered receive com‐
490 pletion. The value is typically used by a provider when sending
491 a rendezvous protocol request where it would send at least
492 FI_OPT_BUFFERED_MIN bytes of application data along with it. A
493 smaller sized rendezvous protocol message usually results in
494 better latency for the overall transfer of a large message.
496
497 FI_OPT_CM_DATA_SIZE - size_t
498 Defines the size of available space in CM messages for user-de‐
499 fined data. This value limits the amount of data that applica‐
500 tions can exchange between peer endpoints using the fi_connect,
501 fi_accept, and fi_reject operations. The size returned is de‐
502 pendent upon the properties of the endpoint, except in the case
503 of passive endpoints, in which the size reflects the maximum
504 size of the data that may be present as part of a connection re‐
505 quest event. This option is read only.
507
508 FI_OPT_MIN_MULTI_RECV - size_t
509 Defines the minimum receive buffer space available when the re‐
510 ceive buffer is released by the provider (see FI_MULTI_RECV).
511 Modifying this value is only guaranteed to set the minimum buf‐
512 fer space needed on receives posted after the value has been
513 changed. It is recommended that applications that want to over‐
514 ride the default MIN_MULTI_RECV value set this option before en‐
515 abling the corresponding endpoint.
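
       A minimal sketch of setting this option is shown below; the helper
       name is hypothetical and min_space is an application-chosen value.

       static int set_min_multi_recv(struct fid_ep *ep, size_t min_space)
       {
           /* Set the FI_MULTI_RECV minimum free-space threshold before the
            * endpoint is enabled. */
           return fi_setopt(&ep->fid, FI_OPT_ENDPOINT, FI_OPT_MIN_MULTI_RECV,
                            &min_space, sizeof(min_space));
       }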
517
518 FI_OPT_FI_HMEM_P2P - int
519 Defines how the provider should handle peer to peer FI_HMEM
520 transfers for this endpoint. By default, the provider will
              choose whether to use peer to peer support based on the type of
522 transfer (FI_HMEM_P2P_ENABLED). Valid values defined in fi_end‐
523 point.h are:
524
525 • FI_HMEM_P2P_ENABLED: Peer to peer support may be used by the
526 provider to handle FI_HMEM transfers, and which transfers are
527 initiated using peer to peer is subject to the provider imple‐
528 mentation.
529
       • FI_HMEM_P2P_REQUIRED: Peer to peer support must be used for
         transfers; transfers that cannot be performed using p2p will
         be reported as failing.
533
534 • FI_HMEM_P2P_PREFERRED: Peer to peer support should be used by
535 the provider for all transfers if available, but the provider
536 may choose to copy the data to initiate the transfer if peer
537 to peer support is unavailable.
538
       • FI_HMEM_P2P_DISABLED: Peer to peer support should not be used.

       fi_setopt() will return -FI_EOPNOTSUPP if the mode requested cannot be
       supported by the provider. The FI_HMEM_DISABLE_P2P environment vari‐
       able discussed in fi_mr(3) takes precedence over this setopt option.
544
545 FI_OPT_XPU_TRIGGER - struct fi_trigger_xpu *
546 This option only applies to the fi_getopt() call. It is used to
547 query the maximum number of variables required to support XPU
548 triggered operations, along with the size of each variable.
549
550 The user provides a filled out struct fi_trigger_xpu on input. The
551 iface and device fields should reference an HMEM domain. If the
552 provider does not support XPU triggered operations from the given de‐
553 vice, fi_getopt() will return -FI_EOPNOTSUPP. On input, var should
554 reference an array of struct fi_trigger_var data structures, with count
555 set to the size of the referenced array. If count is 0, the var field
556 will be ignored, and the provider will return the number of fi_trig‐
557 ger_var structures needed. If count is > 0, the provider will set
558 count to the needed value, and for each fi_trigger_var available, set
559 the datatype and count of the variable used for the trigger.
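
       The sketch below queries the required variable count for a CUDA HMEM
       interface; the helper name is hypothetical, device selection is
       omitted, and struct fi_trigger_xpu is assumed to be declared in
       <rdma/fi_trigger.h>.

       #include <rdma/fi_trigger.h>

       static int query_xpu_trigger(struct fid_ep *ep)
       {
           struct fi_trigger_xpu xpu = {
               .iface = FI_HMEM_CUDA,
               .var = NULL,
               .count = 0,     /* 0: only report the required count */
           };
           size_t len = sizeof(xpu);
           int ret;

           ret = fi_getopt(&ep->fid, FI_OPT_ENDPOINT, FI_OPT_XPU_TRIGGER,
                           &xpu, &len);
           /* On success, xpu.count holds the number of fi_trigger_var
            * structures needed to trigger an operation. */
           return ret;
       }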
561
       FI_OPT_CUDA_API_PERMITTED - bool *
              This option only applies to the fi_setopt call. It is used to
              control an endpoint’s behavior in making calls to the CUDA API.
              By default, an endpoint is permitted to call the CUDA API. To
              prohibit an endpoint from making such calls, set this option to
              false. If an endpoint’s support of CUDA memory relies on making
              calls to the CUDA API, it will return -FI_EOPNOTSUPP for the
              call to fi_setopt. If either the CUDA library or a CUDA device
              is not available, the endpoint will return -FI_EINVAL. All
              providers that support the FI_HMEM capability implement this
              option.
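
       A minimal sketch of disabling CUDA API calls follows; the helper name
       is hypothetical.

       #include <stdbool.h>

       static int forbid_cuda_calls(struct fid_ep *ep)
       {
           bool permitted = false;

           /* Fails with -FI_EOPNOTSUPP if the provider cannot support CUDA
            * memory without calling the CUDA API. */
           return fi_setopt(&ep->fid, FI_OPT_ENDPOINT,
                            FI_OPT_CUDA_API_PERMITTED,
                            &permitted, sizeof(permitted));
       }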
573
574 fi_tc_dscp_set
       This call converts a defined DSCP value into a libfabric traffic class
       value. It should be used when assigning a DSCP value to the tclass
       field in either the domain or endpoint attributes.
578
579 fi_tc_dscp_get
580 This call returns the DSCP value associated with the tclass field for
581 the domain or endpoint attributes.
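
       For example, the sketch below requests DSCP 46 (the common expedited
       forwarding code point) for transmit traffic in the hints passed to
       fi_getinfo(); the DSCP value and function name are illustrative only.

       static void request_dscp(struct fi_info *hints)
       {
           hints->tx_attr->tclass = fi_tc_dscp_set(46);
       }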
582
583 fi_rx_size_left (DEPRECATED)
584 This function has been deprecated and will be removed in a future ver‐
585 sion of the library. It may not be supported by all providers.
586
587 The fi_rx_size_left call returns a lower bound on the number of receive
588 operations that may be posted to the given endpoint without that opera‐
589 tion returning -FI_EAGAIN. Depending on the specific details of the
590 subsequently posted receive operations (e.g., number of iov entries,
591 which receive function is called, etc.), it may be possible to post
592 more receive operations than originally indicated by fi_rx_size_left.
593
594 fi_tx_size_left (DEPRECATED)
595 This function has been deprecated and will be removed in a future ver‐
596 sion of the library. It may not be supported by all providers.
597
598 The fi_tx_size_left call returns a lower bound on the number of trans‐
599 mit operations that may be posted to the given endpoint without that
600 operation returning -FI_EAGAIN. Depending on the specific details of
601 the subsequently posted transmit operations (e.g., number of iov en‐
602 tries, which transmit function is called, etc.), it may be possible to
603 post more transmit operations than originally indicated by
604 fi_tx_size_left.
605
ENDPOINT ATTRIBUTES
       The fi_ep_attr structure defines the set of attributes associated with
608 an endpoint. Endpoint attributes may be further refined using the
609 transmit and receive context attributes as shown below.
610
611 struct fi_ep_attr {
612 enum fi_ep_type type;
613 uint32_t protocol;
614 uint32_t protocol_version;
615 size_t max_msg_size;
616 size_t msg_prefix_size;
617 size_t max_order_raw_size;
618 size_t max_order_war_size;
619 size_t max_order_waw_size;
620 uint64_t mem_tag_format;
621 size_t tx_ctx_cnt;
622 size_t rx_ctx_cnt;
623 size_t auth_key_size;
624 uint8_t *auth_key;
625 };
626
627 type - Endpoint Type
628 If specified, indicates the type of fabric interface communication de‐
629 sired. Supported types are:
630
631 FI_EP_DGRAM
632 Supports a connectionless, unreliable datagram communication.
633 Message boundaries are maintained, but the maximum message size
634 may be limited to the fabric MTU. Flow control is not guaran‐
635 teed.
636
637 FI_EP_MSG
638 Provides a reliable, connection-oriented data transfer service
639 with flow control that maintains message boundaries.
640
641 FI_EP_RDM
642 Reliable datagram message. Provides a reliable, connectionless
643 data transfer service with flow control that maintains message
644 boundaries.
645
646 FI_EP_SOCK_DGRAM
647 A connectionless, unreliable datagram endpoint with UDP socket-
648 like semantics. FI_EP_SOCK_DGRAM is most useful for applica‐
649 tions designed around using UDP sockets. See the SOCKET END‐
650 POINT section for additional details and restrictions that apply
651 to datagram socket endpoints.
652
653 FI_EP_SOCK_STREAM
654 Data streaming endpoint with TCP socket-like semantics. Pro‐
655 vides a reliable, connection-oriented data transfer service that
656 does not maintain message boundaries. FI_EP_SOCK_STREAM is most
657 useful for applications designed around using TCP sockets. See
658 the SOCKET ENDPOINT section for additional details and restric‐
659 tions that apply to stream endpoints.
660
661 FI_EP_UNSPEC
662 The type of endpoint is not specified. This is usually provided
663 as input, with other attributes of the endpoint or the provider
664 selecting the type.
665
666 Protocol
667 Specifies the low-level end to end protocol employed by the provider.
668 A matching protocol must be used by communicating endpoints to ensure
669 interoperability. The following protocol values are defined. Provider
670 specific protocols are also allowed. Provider specific protocols will
671 be indicated by having the upper bit of the protocol value set to one.
672
673 FI_PROTO_EFA
674 Proprietary protocol on Elastic Fabric Adapter fabric. It sup‐
675 ports both DGRAM and RDM endpoints.
676
677 FI_PROTO_GNI
678 Protocol runs over Cray GNI low-level interface.
679
680 FI_PROTO_IB_RDM
681 Reliable-datagram protocol implemented over InfiniBand reliable-
682 connected queue pairs.
683
684 FI_PROTO_IB_UD
685 The protocol runs over Infiniband unreliable datagram queue
686 pairs.
687
688 FI_PROTO_IWARP
689 The protocol runs over the Internet wide area RDMA protocol
690 transport.
691
692 FI_PROTO_IWARP_RDM
693 Reliable-datagram protocol implemented over iWarp reliable-con‐
694 nected queue pairs.
695
696 FI_PROTO_NETWORKDIRECT
697 Protocol runs over Microsoft NetworkDirect service provider in‐
698 terface. This adds reliable-datagram semantics over the Net‐
              workDirect connection-oriented endpoint semantics.
700
701 FI_PROTO_PSMX
702 The protocol is based on an Intel proprietary protocol known as
703 PSM, performance scaled messaging. PSMX is an extended version
704 of the PSM protocol to support the libfabric interfaces.
705
706 FI_PROTO_PSMX2
707 The protocol is based on an Intel proprietary protocol known as
708 PSM2, performance scaled messaging version 2. PSMX2 is an ex‐
709 tended version of the PSM2 protocol to support the libfabric in‐
710 terfaces.
711
712 FI_PROTO_PSMX3
713 The protocol is Intel’s protocol known as PSM3, performance
714 scaled messaging version 3. PSMX3 is implemented over RoCEv2
715 and verbs.
716
717 FI_PROTO_RDMA_CM_IB_RC
718 The protocol runs over Infiniband reliable-connected queue
719 pairs, using the RDMA CM protocol for connection establishment.
720
721 FI_PROTO_RXD
722 Reliable-datagram protocol implemented over datagram endpoints.
723 RXD is a libfabric utility component that adds RDM endpoint se‐
724 mantics over DGRAM endpoint semantics.
725
726 FI_PROTO_RXM
727 Reliable-datagram protocol implemented over message endpoints.
728 RXM is a libfabric utility component that adds RDM endpoint se‐
729 mantics over MSG endpoint semantics.
730
731 FI_PROTO_SOCK_TCP
732 The protocol is layered over TCP packets.
733
734 FI_PROTO_UDP
735 The protocol sends and receives UDP datagrams. For example, an
736 endpoint using FI_PROTO_UDP will be able to communicate with a
737 remote peer that is using Berkeley SOCK_DGRAM sockets using IP‐
738 PROTO_UDP.
739
740 FI_PROTO_UNSPEC
741 The protocol is not specified. This is usually provided as in‐
742 put, with other attributes of the socket or the provider select‐
743 ing the actual protocol.
744
745 protocol_version - Protocol Version
746 Identifies which version of the protocol is employed by the provider.
       The protocol version allows providers to extend an existing protocol,
       for example by adding support for additional features or functionality,
       in a backward compatible manner. Providers that support different ver‐
750 sions of the same protocol should inter-operate, but only when using
751 the capabilities defined for the lesser version.
752
753 max_msg_size - Max Message Size
754 Defines the maximum size for an application data transfer as a single
755 operation.
756
757 msg_prefix_size - Message Prefix Size
758 Specifies the size of any required message prefix buffer space. This
       field will be 0 unless the FI_MSG_PREFIX mode is enabled. If msg_pre‐
       fix_size is > 0, the specified value will be a multiple of 8 bytes.
761
762 Max RMA Ordered Size
763 The maximum ordered size specifies the delivery order of transport data
764 into target memory for RMA and atomic operations. Data ordering is
765 separate, but dependent on message ordering (defined below). Data or‐
766 dering is unspecified where message order is not defined.
767
768 Data ordering refers to the access of the same target memory by subse‐
769 quent operations. When back to back RMA read or write operations ac‐
770 cess the same registered memory location, data ordering indicates
771 whether the second operation reads or writes the target memory after
772 the first operation has completed. For example, will an RMA read that
773 follows an RMA write read back the data that was written? Similarly,
774 will an RMA write that follows an RMA read update the target buffer af‐
775 ter the read has transferred the original data? Data ordering answers
776 these questions, even in the presence of errors, such as the need to
777 resend data because of lost or corrupted network traffic.
778
779 RMA ordering applies between two operations, and not within a single
780 data transfer. Therefore, ordering is defined per byte-addressable
781 memory location. I.e. ordering specifies whether location X is ac‐
782 cessed by the second operation after the first operation. Nothing is
783 implied about the completion of the first operation before the second
784 operation is initiated. For example, if the first operation updates
785 locations X and Y, but the second operation only accesses location X,
786 there are no guarantees defined relative to location Y and the second
787 operation.
788
789 In order to support large data transfers being broken into multiple
790 packets and sent using multiple paths through the fabric, data ordering
791 may be limited to transfers of a specific size or less. Providers
792 specify when data ordering is maintained through the following values.
793 Note that even if data ordering is not maintained, message ordering may
794 be.
795
796 max_order_raw_size
797 Read after write size. If set, an RMA or atomic read operation
798 issued after an RMA or atomic write operation, both of which are
799 smaller than the size, will be ordered. Where the target memory
800 locations overlap, the RMA or atomic read operation will see the
801 results of the previous RMA or atomic write.
802
803 max_order_war_size
804 Write after read size. If set, an RMA or atomic write operation
805 issued after an RMA or atomic read operation, both of which are
806 smaller than the size, will be ordered. The RMA or atomic read
807 operation will see the initial value of the target memory loca‐
808 tion before a subsequent RMA or atomic write updates the value.
809
810 max_order_waw_size
811 Write after write size. If set, an RMA or atomic write opera‐
812 tion issued after an RMA or atomic write operation, both of
813 which are smaller than the size, will be ordered. The target
814 memory location will reflect the results of the second RMA or
815 atomic write.
816
817 An order size value of 0 indicates that ordering is not guaranteed. A
818 value of -1 guarantees ordering for any data size.
819
820 mem_tag_format - Memory Tag Format
821 The memory tag format is a bit array used to convey the number of
822 tagged bits supported by a provider. Additionally, it may be used to
823 divide the bit array into separate fields. The mem_tag_format option‐
824 ally begins with a series of bits set to 0, to signify bits which are
825 ignored by the provider. Following the initial prefix of ignored bits,
826 the array will consist of alternating groups of bits set to all 1’s or
827 all 0’s. Each group of bits corresponds to a tagged field. The impli‐
828 cation of defining a tagged field is that when a mask is applied to the
829 tagged bit array, all bits belonging to a single field will either be
830 set to 1 or 0, collectively.
831
832 For example, a mem_tag_format of 0x30FF indicates support for 14 tagged
833 bits, separated into 3 fields. The first field consists of 2-bits, the
834 second field 4-bits, and the final field 8-bits. Valid masks for such
835 a tagged field would be a bitwise OR’ing of zero or more of the follow‐
836 ing values: 0x3000, 0x0F00, and 0x00FF. The provider may not validate
837 the mask provided by the application for performance reasons.
838
839 By identifying fields within a tag, a provider may be able to optimize
840 their search routines. An application which requests tag fields must
841 provide tag masks that either set all mask bits corresponding to a
842 field to all 0 or all 1. When negotiating tag fields, an application
843 can request a specific number of fields of a given size. A provider
844 must return a tag format that supports the requested number of fields,
845 with each field being at least the size requested, or fail the request.
846 A provider may increase the size of the fields. When reporting comple‐
847 tions (see FI_CQ_FORMAT_TAGGED), it is not guaranteed that the provider
848 would clear out any unsupported tag bits in the tag field of the com‐
849 pletion entry.
850
851 It is recommended that field sizes be ordered from smallest to largest.
852 A generic, unstructured tag and mask can be achieved by requesting a
853 bit array consisting of alternating 1’s and 0’s.
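
       As a sketch of the 0x30FF example above, an application might pack
       three field values into a tag as follows; the field names are purely
       hypothetical, and the shift amounts follow directly from that layout.

       static uint64_t make_tag(uint64_t job, uint64_t rank, uint64_t msg)
       {
           return ((job  & 0x3)  << 12) |   /* field mask 0x3000 */
                  ((rank & 0xF)  << 8)  |   /* field mask 0x0F00 */
                   (msg  & 0xFF);           /* field mask 0x00FF */
       }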
854
855 tx_ctx_cnt - Transmit Context Count
856 Number of transmit contexts to associate with the endpoint. If not
857 specified (0), 1 context will be assigned if the endpoint supports out‐
858 bound transfers. Transmit contexts are independent transmit queues
859 that may be separately configured. Each transmit context may be bound
860 to a separate CQ, and no ordering is defined between contexts. Addi‐
861 tionally, no synchronization is needed when accessing contexts in par‐
862 allel.
863
864 If the count is set to the value FI_SHARED_CONTEXT, the endpoint will
865 be configured to use a shared transmit context, if supported by the
866 provider. Providers that do not support shared transmit contexts will
867 fail the request.
868
869 See the scalable endpoint and shared contexts sections for additional
870 details.
871
872 rx_ctx_cnt - Receive Context Count
873 Number of receive contexts to associate with the endpoint. If not
874 specified, 1 context will be assigned if the endpoint supports inbound
875 transfers. Receive contexts are independent processing queues that may
876 be separately configured. Each receive context may be bound to a sepa‐
877 rate CQ, and no ordering is defined between contexts. Additionally, no
878 synchronization is needed when accessing contexts in parallel.
879
880 If the count is set to the value FI_SHARED_CONTEXT, the endpoint will
881 be configured to use a shared receive context, if supported by the
882 provider. Providers that do not support shared receive contexts will
883 fail the request.
884
885 See the scalable endpoint and shared contexts sections for additional
886 details.
887
888 auth_key_size - Authorization Key Length
889 The length of the authorization key in bytes. This field will be 0 if
890 authorization keys are not available or used. This field is ignored
891 unless the fabric is opened with API version 1.5 or greater.
892
893 auth_key - Authorization Key
894 If supported by the fabric, an authorization key (a.k.a. job key) to
895 associate with the endpoint. An authorization key is used to limit
896 communication between endpoints. Only peer endpoints that are pro‐
897 grammed to use the same authorization key may communicate. Authoriza‐
898 tion keys are often used to implement job keys, to ensure that process‐
899 es running in different jobs do not accidentally cross traffic. The
900 domain authorization key will be used if auth_key_size is set to 0.
901 This field is ignored unless the fabric is opened with API version 1.5
902 or greater.
903
TRANSMIT CONTEXT ATTRIBUTES
       Attributes specific to the transmit capabilities of an endpoint are
906 specified using struct fi_tx_attr.
907
908 struct fi_tx_attr {
909 uint64_t caps;
910 uint64_t mode;
911 uint64_t op_flags;
912 uint64_t msg_order;
913 uint64_t comp_order;
914 size_t inject_size;
915 size_t size;
916 size_t iov_limit;
917 size_t rma_iov_limit;
918 uint32_t tclass;
919 };
920
921 caps - Capabilities
922 The requested capabilities of the context. The capabilities must be a
923 subset of those requested of the associated endpoint. See the CAPABIL‐
924 ITIES section of fi_getinfo(3) for capability details. If the caps
925 field is 0 on input to fi_getinfo(3), the applicable capability bits
926 from the fi_info structure will be used.
927
928 The following capabilities apply to the transmit attributes: FI_MSG,
929 FI_RMA, FI_TAGGED, FI_ATOMIC, FI_READ, FI_WRITE, FI_SEND, FI_HMEM,
930 FI_TRIGGER, FI_FENCE, FI_MULTICAST, FI_RMA_PMEM, FI_NAMED_RX_CTX,
931 FI_COLLECTIVE, and FI_XPU.
932
933 Many applications will be able to ignore this field and rely solely on
934 the fi_info::caps field. Use of this field provides fine grained con‐
935 trol over the transmit capabilities associated with an endpoint. It is
936 useful when handling scalable endpoints, with multiple transmit con‐
937 texts, for example, and allows configuring a specific transmit context
938 with fewer capabilities than that supported by the endpoint or other
939 transmit contexts.
940
941 mode
942 The operational mode bits of the context. The mode bits will be a sub‐
943 set of those associated with the endpoint. See the MODE section of
944 fi_getinfo(3) for details. A mode value of 0 will be ignored on input
945 to fi_getinfo(3), with the mode value of the fi_info structure used in‐
946 stead. On return from fi_getinfo(3), the mode will be set only to
947 those constraints specific to transmit operations.
948
949 op_flags - Default transmit operation flags
       Flags that control the behavior of operations submitted against the
951 context. Applicable flags are listed in the Operation Flags section.
952
953 msg_order - Message Ordering
954 Message ordering refers to the order in which transport layer headers
955 (as viewed by the application) are identified and processed. Relaxed
956 message order enables data transfers to be sent and received out of or‐
957 der, which may improve performance by utilizing multiple paths through
958 the fabric from the initiating endpoint to a target endpoint. Message
959 order applies only between a single source and destination endpoint
960 pair. Ordering between different target endpoints is not defined.
961
962 Message order is determined using a set of ordering bits. Each set bit
963 indicates that ordering is maintained between data transfers of the
964 specified type. Message order is defined for [read | write | send] op‐
965 erations submitted by an application after [read | write | send] opera‐
966 tions.
967
968 Message ordering only applies to the end to end transmission of trans‐
       port headers. Message ordering is necessary for, but does not by itself
       guarantee, the order in which message data is sent or received by the
       transport layer. Message ordering requires matching ordering semantics
       on the receiving side of a data transfer operation in order to
       guarantee that ordering is met.
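
       Message ordering is normally requested through the hints passed to
       fi_getinfo(), as in the sketch below; the flag combination and helper
       name are illustrative only, and hints is assumed to have been
       allocated with fi_allocinfo().

       static void request_ordering(struct fi_info *hints)
       {
           /* Keep sends ordered relative to other sends, and RMA reads
            * ordered after RMA writes. */
           hints->tx_attr->msg_order = FI_ORDER_SAS | FI_ORDER_RMA_RAW;
           hints->rx_attr->msg_order = FI_ORDER_SAS | FI_ORDER_RMA_RAW;
       }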
974
975 FI_ORDER_ATOMIC_RAR
976 Atomic read after read. If set, atomic fetch operations are
977 transmitted in the order submitted relative to other atomic
978 fetch operations. If not set, atomic fetches may be transmitted
979 out of order from their submission.
980
981 FI_ORDER_ATOMIC_RAW
982 Atomic read after write. If set, atomic fetch operations are
983 transmitted in the order submitted relative to atomic update op‐
984 erations. If not set, atomic fetches may be transmitted ahead
985 of atomic updates.
986
987 FI_ORDER_ATOMIC_WAR
              Atomic write after read. If set, atomic update operations are
989 transmitted in the order submitted relative to atomic fetch op‐
990 erations. If not set, atomic updates may be transmitted ahead
991 of atomic fetches.
992
993 FI_ORDER_ATOMIC_WAW
              Atomic write after write. If set, atomic update operations are
              transmitted in the order submitted relative to other atomic up‐
              date operations. If not set, atomic updates may be transmitted
              out of order from their submission.
998
999 FI_ORDER_NONE
1000 No ordering is specified. This value may be used as input in
1001 order to obtain the default message order supported by the
1002 provider. FI_ORDER_NONE is an alias for the value 0.
1003
1004 FI_ORDER_RAR
1005 Read after read. If set, RMA and atomic read operations are
1006 transmitted in the order submitted relative to other RMA and
1007 atomic read operations. If not set, RMA and atomic reads may be
1008 transmitted out of order from their submission.
1009
1010 FI_ORDER_RAS
1011 Read after send. If set, RMA and atomic read operations are
1012 transmitted in the order submitted relative to message send op‐
1013 erations, including tagged sends. If not set, RMA and atomic
1014 reads may be transmitted ahead of sends.
1015
1016 FI_ORDER_RAW
1017 Read after write. If set, RMA and atomic read operations are
1018 transmitted in the order submitted relative to RMA and atomic
1019 write operations. If not set, RMA and atomic reads may be
1020 transmitted ahead of RMA and atomic writes.
1021
1022 FI_ORDER_RMA_RAR
1023 RMA read after read. If set, RMA read operations are transmit‐
1024 ted in the order submitted relative to other RMA read opera‐
1025 tions. If not set, RMA reads may be transmitted out of order
1026 from their submission.
1027
1028 FI_ORDER_RMA_RAW
1029 RMA read after write. If set, RMA read operations are transmit‐
1030 ted in the order submitted relative to RMA write operations. If
1031 not set, RMA reads may be transmitted ahead of RMA writes.
1032
1033 FI_ORDER_RMA_WAR
1034 RMA write after read. If set, RMA write operations are trans‐
1035 mitted in the order submitted relative to RMA read operations.
1036 If not set, RMA writes may be transmitted ahead of RMA reads.
1037
1038 FI_ORDER_RMA_WAW
1039 RMA write after write. If set, RMA write operations are trans‐
1040 mitted in the order submitted relative to other RMA write opera‐
1041 tions. If not set, RMA writes may be transmitted out of order
1042 from their submission.
1043
1044 FI_ORDER_SAR
1045 Send after read. If set, message send operations, including
              tagged sends, are transmitted in the order submitted relative to RMA
1047 and atomic read operations. If not set, message sends may be
1048 transmitted ahead of RMA and atomic reads.
1049
1050 FI_ORDER_SAS
1051 Send after send. If set, message send operations, including
1052 tagged sends, are transmitted in the order submitted relative to
              other message sends. If not set, message sends may be transmit‐
1054 ted out of order from their submission.
1055
1056 FI_ORDER_SAW
1057 Send after write. If set, message send operations, including
              tagged sends, are transmitted in the order submitted relative to RMA
1059 and atomic write operations. If not set, message sends may be
1060 transmitted ahead of RMA and atomic writes.
1061
1062 FI_ORDER_WAR
1063 Write after read. If set, RMA and atomic write operations are
1064 transmitted in the order submitted relative to RMA and atomic
1065 read operations. If not set, RMA and atomic writes may be
1066 transmitted ahead of RMA and atomic reads.
1067
1068 FI_ORDER_WAS
1069 Write after send. If set, RMA and atomic write operations are
1070 transmitted in the order submitted relative to message send op‐
1071 erations, including tagged sends. If not set, RMA and atomic
1072 writes may be transmitted ahead of sends.
1073
1074 FI_ORDER_WAW
1075 Write after write. If set, RMA and atomic write operations are
1076 transmitted in the order submitted relative to other RMA and
1077 atomic write operations. If not set, RMA and atomic writes may
1078 be transmitted out of order from their submission.
1079
1080 comp_order - Completion Ordering
1081 Completion ordering refers to the order in which completed requests are
1082 written into the completion queue. Completion ordering is similar to
1083 message order. Relaxed completion order may enable faster reporting of
1084 completed transfers, allow acknowledgments to be sent over different
1085 fabric paths, and support more sophisticated retry mechanisms. This
1086 can result in lower-latency completions, particularly when using con‐
1087 nectionless endpoints. Strict completion ordering may require that
1088 providers queue completed operations or limit available optimizations.
1089
1090 For transmit requests, completion ordering depends on the endpoint com‐
1091 munication type. For unreliable communication, completion ordering ap‐
1092 plies to all data transfer requests submitted to an endpoint. For re‐
1093 liable communication, completion ordering only applies to requests that
1094 target a single destination endpoint. Completion ordering of requests
1095 that target different endpoints over a reliable transport is not de‐
1096 fined.
1097
1098 Applications should specify the completion ordering that they support
1099 or require. Providers should return the completion order that they ac‐
       tually provide, with the constraint that the returned ordering is no
       less strict than that specified by the application. Supported completion
1102 order values are:
1103
1104 FI_ORDER_NONE
1105 No ordering is defined for completed operations. Requests sub‐
1106 mitted to the transmit context may complete in any order.
1107
1108 FI_ORDER_STRICT
1109 Requests complete in the order in which they are submitted to
1110 the transmit context.
1111
1112 inject_size
1113 The requested inject operation size (see the FI_INJECT flag) that the
1114 context will support. This is the maximum size data transfer that can
1115 be associated with an inject operation (such as fi_inject) or may be
1116 used with the FI_INJECT data transfer flag.
1117
1118 size
1119 The size of the transmit context. The mapping of the size value to re‐
1120 sources is provider specific, but it is directly related to the number
1121 of command entries allocated for the endpoint. A smaller size value
1122 consumes fewer hardware and software resources, while a larger size al‐
1123 lows queuing more transmit requests.
1124
       While the size attribute guides the size of the underlying endpoint
       transmit queue, there is not necessarily a one-to-one mapping between a
       transmit operation and a queue entry. A single transmit operation may
       consume multiple queue entries; for example, one per scatter-gather en‐
       try. Additionally, the size field is intended to guide the allocation
       of the endpoint’s transmit context. Specifically, for connectionless
       endpoints, there may be lower-level queues used to track communication
       on a per peer basis. The sizes of any lower-level queues may be
       significantly smaller than the endpoint’s transmit size, in order to
       reduce resource utilization.
1135
1136 iov_limit
1137 This is the maximum number of IO vectors (scatter-gather elements) that
1138 a single posted operation may reference.
1139
1140 rma_iov_limit
1141 This is the maximum number of RMA IO vectors (scatter-gather elements)
1142 that an RMA or atomic operation may reference. The rma_iov_limit cor‐
1143 responds to the rma_iov_count values in RMA and atomic operations. See
1144 struct fi_msg_rma and struct fi_msg_atomic in fi_rma.3 and fi_atomic.3,
1145 for additional details. This limit applies to both the number of RMA
1146 IO vectors that may be specified when initiating an operation from the
1147 local endpoint, as well as the maximum number of IO vectors that may be
1148 carried in a single request from a remote endpoint.
1149
1150 Traffic Class (tclass)
1151 Traffic classes can be a differentiated services code point (DSCP) val‐
1152 ue, one of the following defined labels, or a provider-specific defini‐
1153 tion. If tclass is unset or set to FI_TC_UNSPEC, the endpoint will use
1154 the default traffic class associated with the domain.
1155
1156 FI_TC_BEST_EFFORT
1157 This is the default in the absence of any other local or fabric
1158 configuration. This class carries the traffic for a number of
1159 applications executing concurrently over the same network infra‐
1160 structure. Even though it is shared, network capacity and re‐
1161 source allocation are distributed fairly across the applica‐
1162 tions.
1163
1164 FI_TC_BULK_DATA
1165 This class is intended for large data transfers associated with
1166 I/O and is present to separate sustained I/O transfers from oth‐
1167 er application inter-process communications.
1168
1169 FI_TC_DEDICATED_ACCESS
1170 This class operates at the highest priority, except the manage‐
1171 ment class. It carries a high bandwidth allocation, minimum la‐
1172 tency targets, and the highest scheduling and arbitration prior‐
1173 ity.
1174
1175 FI_TC_LOW_LATENCY
1176 This class supports low latency, low jitter data patterns typi‐
1177 cally caused by transactional data exchanges, barrier synchro‐
1178 nizations, and collective operations that are typical of HPC ap‐
1179 plications. This class often requires maximum tolerable laten‐
              cies that data transfers must achieve for correct or performant
              operation. Fulfillment of such requests in this class will
1182 typically require accompanying bandwidth and message size limi‐
1183 tations so as not to consume excessive bandwidth at high priori‐
1184 ty.
1185
1186 FI_TC_NETWORK_CTRL
1187 This class is intended for traffic directly related to fabric
1188 (network) management, which is critical to the correct operation
1189 of the network. Its use is typically restricted to privileged
1190 network management applications.
1191
1192 FI_TC_SCAVENGER
1193 This class is used for data that is desired but does not have
1194 strict delivery requirements, such as in-band network or appli‐
1195 cation level monitoring data. Use of this class indicates that
1196 the traffic is considered lower priority and should not inter‐
1197 fere with higher priority workflows.
1198
1199 fi_tc_dscp_set / fi_tc_dscp_get
1200 DSCP values are supported via the DSCP get and set functions.
1201 The definitions for DSCP values are outside the scope of libfab‐
1202 ric. See the fi_tc_dscp_set and fi_tc_dscp_get function defini‐
1203 tions for details on their use.
1204
RECEIVE CONTEXT ATTRIBUTES
       Attributes specific to the receive capabilities of an endpoint are
1207 specified using struct fi_rx_attr.
1208
1209 struct fi_rx_attr {
1210 uint64_t caps;
1211 uint64_t mode;
1212 uint64_t op_flags;
1213 uint64_t msg_order;
1214 uint64_t comp_order;
1215 size_t total_buffered_recv;
1216 size_t size;
1217 size_t iov_limit;
1218 };
1219
1220 caps - Capabilities
1221 The requested capabilities of the context. The capabilities must be a
1222 subset of those requested of the associated endpoint. See the CAPABIL‐
1223 ITIES section of fi_getinfo(3) for capability details. If the caps
1224 field is 0 on input to fi_getinfo(3), the applicable capability bits
1225 from the fi_info structure will be used.
1226
1227 The following capabilities apply to the receive attributes: FI_MSG,
1228 FI_RMA, FI_TAGGED, FI_ATOMIC, FI_REMOTE_READ, FI_REMOTE_WRITE, FI_RECV,
1229 FI_HMEM, FI_TRIGGER, FI_RMA_PMEM, FI_DIRECTED_RECV, FI_VARIABLE_MSG,
1230 FI_MULTI_RECV, FI_SOURCE, FI_RMA_EVENT, FI_SOURCE_ERR, FI_COLLECTIVE,
1231 and FI_XPU.
1232
1233 Many applications will be able to ignore this field and rely solely on
1234 the fi_info::caps field. Use of this field provides fine grained con‐
1235 trol over the receive capabilities associated with an endpoint. It is
1236 useful, for example, when handling scalable endpoints with multiple
1237 receive contexts, as it allows configuring a specific receive context
1238 with fewer capabilities than those supported by the endpoint or other
1239 receive contexts.
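
By way of illustration, the hedged sketch below narrows the receive capabilities requested through the fi_getinfo(3) hints; the specific capability bits chosen are example values, not requirements.

      #include <rdma/fabric.h>
      #include <rdma/fi_errno.h>
      #include <rdma/fi_endpoint.h>

      /* Hedged sketch: request fewer receive capabilities than the
       * endpoint as a whole.  The capability bits are example values. */
      static int narrow_rx_caps(struct fi_info **info)
      {
          struct fi_info *hints = fi_allocinfo();
          int ret;

          if (!hints)
              return -FI_ENOMEM;

          /* Endpoint-level capabilities requested by the application. */
          hints->caps = FI_MSG | FI_TAGGED | FI_RMA;

          /* This receive context only needs untagged message receives. */
          hints->rx_attr->caps = FI_MSG | FI_RECV;

          ret = fi_getinfo(FI_VERSION(1, 18), NULL, NULL, 0, hints, info);
          fi_freeinfo(hints);
          return ret;
      }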
1240
1241 mode
1242 The operational mode bits of the context. The mode bits will be a sub‐
1243 set of those associated with the endpoint. See the MODE section of
1244 fi_getinfo(3) for details. A mode value of 0 will be ignored on input
1245 to fi_getinfo(3), with the mode value of the fi_info structure used in‐
1246 stead. On return from fi_getinfo(3), the mode will be set only to
1247 those constraints specific to receive operations.
1248
1249 op_flags - Default receive operation flags
1250 Flags that apply by default to operations submitted against the
1251 context. Applicable flags are listed in the Operation Flags section.
1252
1253 msg_order - Message Ordering
1254 For a description of message ordering, see the msg_order field in the
1255 Transmit Context Attribute section. Receive context message ordering
1256 defines the order in which received transport message headers are pro‐
1257 cessed when received by an endpoint. When ordering is set, it indi‐
1258 cates that message headers will be processed in order, based on how the
1259 transmit side has identified the messages. Typically, this means that
1260 messages will be handled in order based on a message level sequence
1261 number.
1262
1263 The following ordering flags, as defined for transmit ordering, also
1264 apply to the processing of received operations: FI_ORDER_NONE, FI_OR‐
1265 DER_RAR, FI_ORDER_RAW, FI_ORDER_RAS, FI_ORDER_WAR, FI_ORDER_WAW, FI_OR‐
1266 DER_WAS, FI_ORDER_SAR, FI_ORDER_SAW, FI_ORDER_SAS, FI_ORDER_RMA_RAR,
1267 FI_ORDER_RMA_RAW, FI_ORDER_RMA_WAR, FI_ORDER_RMA_WAW, FI_ORDER_ATOM‐
1268 IC_RAR, FI_ORDER_ATOMIC_RAW, FI_ORDER_ATOMIC_WAR, and FI_ORDER_ATOM‐
1269 IC_WAW.
1270
1271 comp_order - Completion Ordering
1272 For a description of completion ordering, see the comp_order field in
1273 the Transmit Context Attribute section.
1274
1275 FI_ORDER_DATA
1276 When set, this bit indicates that received data is written into
1277 memory in order. Data ordering applies to memory accessed as
1278 part of a single operation and between operations if message or‐
1279 dering is guaranteed.
1280
1281 FI_ORDER_NONE
1282 No ordering is defined for completed operations. Receive opera‐
1283 tions may complete in any order, regardless of their submission
1284 order.
1285
1286 FI_ORDER_STRICT
1287 Receive operations complete in the order in which they are pro‐
1288 cessed by the receive context, based on the receive side msg_or‐
1289 der attribute.
1290
1291 total_buffered_recv
1292 This field is supported for backwards compatibility purposes. It is a
1293 hint to the provider of the total available space that may be needed to
1294 buffer messages that are received for which there is no matching re‐
1295 ceive operation. The provider may adjust or ignore this value. The
1296 allocation of internal network buffering among received messages is
1297 provider specific. For instance, a provider may limit the size of mes‐
1298 sages which can be buffered or the amount of buffering allocated to a
1299 single message.
1300
1301 If receive side buffering is disabled (total_buffered_recv = 0) and a
1302 message is received by an endpoint, then the behavior is dependent on
1303 whether resource management has been enabled (FI_RM_ENABLED has been
1304 set or not). See the Resource Management section of fi_domain(3) for
1305 further clarification. It is recommended that applications enable re‐
1306 source management if they anticipate receiving unexpected messages,
1307 rather than modifying this value.
1308
1309 size
1310 The size of the receive context. The mapping of the size value to re‐
1311 sources is provider specific, but it is directly related to the number
1312 of command entries allocated for the endpoint. A smaller size value
1313 consumes fewer hardware and software resources, while a larger size
1314 allows queuing more receive requests.
1315
1316 While the size attribute guides the size of the underlying endpoint receive
1317 queue, there is not necessarily a one-to-one mapping between a receive
1318 operation and a queue entry. A single receive operation may consume
1319 multiple queue entries; for example, one per scatter-gather entry. Ad‐
1320 ditionally, the size field is intended to guide the allocation of the
1321 endpoint’s receive context. Specifically, for connectionless end‐
1322 points, there may be lower-level queues used to track communication on
1323 a per-peer basis. The sizes of any lower-level queues may be signifi‐
1324 cantly smaller than the endpoint’s receive size, in order to reduce
1325 resource utilization.
1326
1327 iov_limit
1328 This is the maximum number of IO vectors (scatter-gather elements) that
1329 a single posted operation may reference.
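
As a hedged illustration, the sketch below posts a two-element scatter-gather receive with fi_recvv, assuming the provider reported an iov_limit of at least two and that local memory registration (FI_MR_LOCAL) is not required; the buffers and sizes are supplied by the caller.

      #include <sys/uio.h>
      #include <rdma/fabric.h>
      #include <rdma/fi_endpoint.h>

      /* Hedged sketch: post a two-element scatter-gather receive, assuming
       * the provider reported rx_attr->iov_limit >= 2.  Buffers and sizes
       * are supplied by the caller and are illustrative. */
      static ssize_t post_sg_recv(struct fid_ep *ep, void *hdr, size_t hdr_len,
                                  void *payload, size_t payload_len, void *ctx)
      {
          struct iovec iov[2] = {
              { .iov_base = hdr,     .iov_len = hdr_len     },
              { .iov_base = payload, .iov_len = payload_len },
          };

          /* desc may be NULL when FI_MR_LOCAL is not required. */
          return fi_recvv(ep, iov, NULL, 2, FI_ADDR_UNSPEC, ctx);
      }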
1330
1331 SCALABLE ENDPOINTS
1332 A scalable endpoint is a communication portal that supports multiple
1333 transmit and receive contexts. Scalable endpoints are loosely modeled
1334 after the networking concept of transmit/receive side scaling, also
1335 known as multi-queue. Support for scalable endpoints is domain specif‐
1336 ic. Scalable endpoints may improve the performance of multi-threaded
1337 and parallel applications, by allowing threads to access independent
1338 transmit and receive queues. A scalable endpoint has a single trans‐
1339 port level address, which can reduce the memory requirements needed to
1340 store remote addressing data, versus using standard endpoints. Scal‐
1341 able endpoints cannot be used directly for communication operations,
1342 and require the application to explicitly create transmit and receive
1343 contexts as described below.
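
The following sketch is illustrative only: it requests four transmit and four receive contexts before opening a scalable endpoint. The counts are assumptions; they are normally requested through the hints passed to fi_getinfo(3) and are limited by the domain attributes max_ep_tx_ctx and max_ep_rx_ctx.

      #include <rdma/fabric.h>
      #include <rdma/fi_domain.h>
      #include <rdma/fi_endpoint.h>

      /* Hedged sketch: open a scalable endpoint with 4 transmit and 4
       * receive contexts.  The counts are example values; they are normally
       * requested through the fi_getinfo(3) hints and are capped by the
       * domain's max_ep_tx_ctx / max_ep_rx_ctx attributes. */
      static int open_scalable_ep(struct fid_domain *domain,
                                  struct fi_info *info, struct fid_ep **sep)
      {
          info->ep_attr->tx_ctx_cnt = 4;
          info->ep_attr->rx_ctx_cnt = 4;

          return fi_scalable_ep(domain, info, sep, NULL);
      }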
1344
1345 fi_tx_context
1346 Transmit contexts are independent transmit queues. Ordering and syn‐
1347 chronization between contexts are not defined. Conceptually a transmit
1348 context behaves like a send-only endpoint. A transmit context
1349 may be configured with fewer capabilities than the base endpoint and
1350 with different attributes (such as ordering requirements and inject
1351 size) than other contexts associated with the same scalable endpoint.
1352 Each transmit context has its own completion queue. The number of
1353 transmit contexts associated with an endpoint is specified during end‐
1354 point creation.
1355
1356 The fi_tx_context call is used to retrieve a specific context, identi‐
1357 fied by an index (see above for details on transmit context at‐
1358 tributes). Providers may dynamically allocate contexts when fi_tx_con‐
1359 text is called, or may statically create all contexts when fi_endpoint
1360 is invoked. By default, a transmit context inherits the properties of
1361 its associated endpoint. However, applications may request context
1362 specific attributes through the attr parameter. Support for per trans‐
1363 mit context attributes is provider specific and not guaranteed.
1364 Providers will return the actual attributes assigned to the context
1365 through the attr parameter, if provided.
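
For illustration, the hedged sketch below retrieves transmit context 0 with narrower, context-specific attributes and binds it to an existing completion queue; the attribute values, and the assumption that per-context attributes are honored by the provider, are examples only.

      #include <rdma/fabric.h>
      #include <rdma/fi_eq.h>
      #include <rdma/fi_endpoint.h>

      /* Hedged sketch: retrieve transmit context 0 of a scalable endpoint
       * with narrower, context-specific attributes, bind it to an existing
       * completion queue, and enable it.  Error unwinding is omitted. */
      static int open_tx_ctx0(struct fid_ep *sep, struct fid_cq *cq,
                              struct fid_ep **tx_ctx)
      {
          struct fi_tx_attr tx_attr = {
              .op_flags = FI_INJECT_COMPLETE,
              .size     = 256,   /* example value, smaller than the default */
          };
          int ret;

          ret = fi_tx_context(sep, 0, &tx_attr, tx_ctx, NULL);
          if (ret)
              return ret;

          ret = fi_ep_bind(*tx_ctx, &cq->fid, FI_TRANSMIT);
          if (ret)
              return ret;

          return fi_enable(*tx_ctx);
      }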
1366
1367 fi_rx_context
1368 Receive contexts are independent receive queues for receiving incoming
1369 data. Ordering and synchronization between contexts are not guaran‐
1370 teed. Conceptually a receive context behaves like a receive-only
1371 endpoint. A receive context may be configured with fewer capabilities
1372 than the base endpoint and with different attributes (such as ordering
1373 requirements and inject size) than other contexts associated with the
1374 same scalable endpoint. Each receive context has its own completion
1375 queue. The number of receive contexts associated with an endpoint is
1376 specified during endpoint creation.
1377
1378 Receive contexts are often associated with steering flows that select
1379 which receive context processes an incoming packet targeting a scal‐
1380 able endpoint. However, receive contexts may be targeted directly by
1381 the initiator, if supported by the underlying protocol. Such contexts
1382 are referred to as `named'. Support for named contexts must be re‐
1383 quested by setting the FI_NAMED_RX_CTX capability when the correspond‐
1384 ing endpoint is created. Support for named receive contexts is coor‐
1385 dinated with address vectors. See fi_av(3) and fi_rx_addr(3).
1386
1387 The fi_rx_context call is used to retrieve a specific context, identi‐
1388 fied by an index (see above for details on receive context attributes).
1389 Providers may dynamically allocate contexts when fi_rx_context is
1390 called, or may statically create all contexts when fi_endpoint is in‐
1391 voked. By default, a receive context inherits the properties of its
1392 associated endpoint. However, applications may request context specif‐
1393 ic attributes through the attr parameter. Support for per receive con‐
1394 text attributes is provider specific and not guaranteed. Providers
1395 will return the actual attributes assigned to the context through the
1396 attr parameter, if provided.
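
As a hedged example of named receive contexts, the sketch below constructs a peer address that targets a specific receive context with fi_rx_addr; it assumes FI_NAMED_RX_CTX was requested and that rx_ctx_bits matches the value supplied in the fi_av_attr used to open the address vector.

      #include <rdma/fabric.h>
      #include <rdma/fi_domain.h>
      #include <rdma/fi_endpoint.h>

      /* Hedged sketch: with FI_NAMED_RX_CTX, an initiator can address a
       * specific receive context of a peer's scalable endpoint.  The
       * rx_ctx_bits value must match the one given in the fi_av_attr used
       * to open the address vector. */
      static ssize_t send_to_rx_ctx(struct fid_ep *tx_ctx, const void *buf,
                                    size_t len, fi_addr_t peer, int rx_index,
                                    int rx_ctx_bits, void *context)
      {
          fi_addr_t dest = fi_rx_addr(peer, rx_index, rx_ctx_bits);

          /* desc may be NULL when FI_MR_LOCAL is not required. */
          return fi_send(tx_ctx, buf, len, NULL, dest, context);
      }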
1397
1398 SHARED CONTEXTS
1399 Shared contexts are transmit and receive contexts explicitly shared
1400 among one or more endpoints. A shareable context allows an application
1401 to use a single dedicated provider resource among multiple transport
1402 addressable endpoints. This can greatly reduce the resources needed to
1403 manage communication over multiple endpoints by multiplexing transmit
1404 and/or receive processing, with the potential cost of serializing ac‐
1405 cess across multiple endpoints. Support for shareable contexts is do‐
1406 main specific.
1407
1408 Conceptually, shareable transmit contexts are transmit queues that may
1409 be accessed by many endpoints. The use of a shared transmit context is
1410 mostly opaque to an application. Applications must allocate and bind
1411 shared transmit contexts to endpoints, but operations are posted di‐
1412 rectly to the endpoint. Shared transmit contexts are not associated
1413 with completion queues or counters. Completed operations are posted to
1414 the CQs bound to the endpoint. An endpoint may only be associated with
1415 a single shared transmit context.
1416
1417 Unlike shared transmit contexts, applications interact directly with
1418 shared receive contexts. Users post receive buffers directly to a
1419 shared receive context, with the buffers usable by any endpoint bound
1420 to the shared receive context. Shared receive contexts are not associ‐
1421 ated with completion queues or counters. Completed receive operations
1422 are posted to the CQs bound to the endpoint. An endpoint may only be
1423 associated with a single shared receive context, and all connection‐
1424 less endpoints associated with a shared receive context must also
1425 share the same address vector.
1426
1427 Endpoints associated with a shared transmit context may use dedicated
1428 receive contexts, and vice-versa. Alternatively, an endpoint may use
1429 shared transmit and receive contexts. There is no requirement that the
1430 same group of endpoints sharing a context of one type also share the
1431 context of an alternate type. Furthermore, an endpoint may use a shared
1432 context of one type, but a scalable set of contexts of the alternate type.
1433
1434 fi_stx_context
1435 This call is used to open a shareable transmit context (see above for
1436 details on the transmit context attributes). Endpoints associated with
1437 a shared transmit context must use a subset of the transmit context’s
1438 attributes. Note that this is the reverse of the requirement for
1439 transmit contexts for scalable endpoints.
1440
1441 fi_srx_context
1442 This allocates a shareable receive context (see above for details on
1443 the receive context attributes). Endpoints associated with a shared
1444 receive context must use a subset of the receive context’s attributes.
1445 Note that this is the reverse of the requirement for receive contexts
1446 for scalable endpoints.
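
The following hedged sketch allocates one shared transmit and one shared receive context and binds both to an endpoint. Passing NULL attributes to request default attributes is an assumption of the example, and error unwinding is omitted.

      #include <rdma/fabric.h>
      #include <rdma/fi_domain.h>
      #include <rdma/fi_endpoint.h>

      /* Hedged sketch: allocate shared transmit/receive contexts and bind
       * both to an endpoint.  NULL attributes are assumed to select the
       * defaults; error unwinding is omitted. */
      static int bind_shared_contexts(struct fid_domain *domain,
                                      struct fid_ep *ep,
                                      struct fid_stx **stx,
                                      struct fid_ep **srx)
      {
          int ret;

          ret = fi_stx_context(domain, NULL, stx, NULL);
          if (ret)
              return ret;

          ret = fi_srx_context(domain, NULL, srx, NULL);
          if (ret)
              return ret;

          /* Transmit operations are still posted to the endpoint, while
           * receive buffers are posted directly to the shared RX context. */
          ret = fi_ep_bind(ep, &(*stx)->fid, 0);
          if (ret)
              return ret;

          return fi_ep_bind(ep, &(*srx)->fid, 0);
      }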
1447
1448 SOCKET ENDPOINTS
1449 The following feature and description should be considered experimen‐
1450 tal. Until the experimental tag is removed, the interfaces, semantics,
1451 and data structures associated with socket endpoints may change between
1452 library versions.
1453
1454 This section applies to endpoints of type FI_EP_SOCK_STREAM and
1455 FI_EP_SOCK_DGRAM, commonly referred to as socket endpoints.
1456
1457 Socket endpoints are defined with semantics that allow them to more
1458 easily be adopted by developers familiar with the UNIX socket API, or
1459 by middleware that exposes the socket API, while still taking advantage
1460 of high-performance hardware features.
1461
1462 The key difference between socket endpoints and other active endpoints
1463 is that socket endpoints use synchronous data transfers. Buffers passed
1464 into send and receive operations revert to the control of the applica‐
1465 tion upon returning from the function call. As a result, no data
1466 transfer completions are reported to the application, and socket end‐
1467 points are not associated with completion queues or counters.
1468
1469 Socket endpoints support a subset of message operations: fi_send,
1470 fi_sendv, fi_sendmsg, fi_recv, fi_recvv, fi_recvmsg, and fi_inject.
1471 Because data transfers are synchronous, the return value from send and
1472 receive operations indicates the number of bytes transferred on success,
1473 or a negative value on error, including -FI_EAGAIN if the endpoint can‐
1474 not send or receive any data because of full or empty queues, respec‐
1475 tively.
1476
1477 Socket endpoints are associated with event queues and address vectors,
1478 and process connection management events asynchronously, similar to
1479 other endpoints. Unlike UNIX sockets, socket endpoints must still be
1480 declared as either active or passive.
1481
1482 Socket endpoints behave like non-blocking sockets. In order to support
1483 select and poll semantics, active socket endpoints are associated with
1484 a file descriptor that is signaled whenever the endpoint is ready to
1485 send and/or receive data. The file descriptor may be retrieved using
1486 fi_control.
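
Because socket endpoints are experimental, the sketch below is only a rough illustration of the synchronous semantics: it retries a send while the transmit queue is full, waiting on the endpoint's file descriptor. The use of the FI_GETWAIT control command to obtain that descriptor is an assumption of this example.

      #include <poll.h>
      #include <rdma/fabric.h>
      #include <rdma/fi_errno.h>
      #include <rdma/fi_endpoint.h>

      /* Hedged sketch: synchronous send on a connected socket endpoint.
       * A non-negative return is the byte count; -FI_EAGAIN means the
       * transmit queue is full.  Using FI_GETWAIT to obtain the readiness
       * file descriptor is an assumption of this example. */
      static ssize_t sock_ep_send(struct fid_ep *ep, const void *buf, size_t len)
      {
          struct pollfd pfd = { .events = POLLOUT };
          ssize_t ret;

          for (;;) {
              ret = fi_send(ep, buf, len, NULL, FI_ADDR_UNSPEC, NULL);
              if (ret != -FI_EAGAIN)
                  return ret;        /* bytes sent, or another error */

              if (fi_control(&ep->fid, FI_GETWAIT, &pfd.fd))
                  return -FI_EOTHER;

              poll(&pfd, 1, -1);     /* wait until the endpoint is ready */
          }
      }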
1487
1488 OPERATION FLAGS
1489 Operation flags are formed by OR-ing the following flags together and
1490 define the default flags applied to an endpoint’s data transfer opera‐
1491 tions when no flags parameter is available. Data transfer operations
1492 that take flags as input override the op_flags value of the endpoint’s
1493 transmit or receive context attributes. A usage sketch follows this list.
1494
1495 FI_COMMIT_COMPLETE
1496 Indicates that a completion should not be generated (locally or
1497 at the peer) until the result of an operation has been made
1498 persistent. See fi_cq(3) for additional details on completion
1499 semantics.
1500
1501 FI_COMPLETION
1502 Indicates that a completion queue entry should be written for
1503 data transfer operations. This flag only applies to operations
1504 issued on an endpoint that was bound to a completion queue with
1505 the FI_SELECTIVE_COMPLETION flag set; otherwise, it is ignored.
1506 See the fi_ep_bind section above for more detail.
1507
1508 FI_DELIVERY_COMPLETE
1509 Indicates that a completion should be generated when the opera‐
1510 tion has been processed by the destination endpoint(s). See
1511 fi_cq(3) for additional details on completion semantics.
1512
1513 FI_INJECT
1514 Indicates that all outbound data buffers should be returned to
1515 the user’s control immediately after a data transfer call re‐
1516 turns, even if the operation is handled asynchronously. This
1517 may require that the provider copy the data into a local buffer
1518 and transfer out of that buffer. A provider can limit the total
1519 amount of send data that may be buffered and/or the size of a
1520 single send that can use this flag. This limit is indicated us‐
1521 ing inject_size (see inject_size above).
1522
1523 FI_INJECT_COMPLETE
1524 Indicates that a completion should be generated when the source
1525 buffer(s) may be reused. See fi_cq(3) for additional details on
1526 completion semantics.
1527
1528 FI_MULTICAST
1529 Indicates that data transfers will target multicast addresses by
1530 default. Any fi_addr_t passed into a data transfer operation
1531 will be treated as a multicast address.
1532
1533 FI_MULTI_RECV
1534 Applies to posted receive operations. This flag allows the user
1535 to post a single buffer that will receive multiple incoming mes‐
1536 sages. Received messages will be packed into the receive buffer
1537 until the buffer has been consumed. Use of this flag may cause
1538 a single posted receive operation to generate multiple comple‐
1539 tions as messages are placed into the buffer. The placement of
1540 received data into the buffer may be subject to provider-spe‐
1541 cific alignment restrictions. The buffer will be released by
1542 the provider when the available buffer space falls below the
1543 specified minimum (see FI_OPT_MIN_MULTI_RECV).
1544
1545 FI_TRANSMIT_COMPLETE
1546 Indicates that a completion should be generated when the trans‐
1547 mit operation has completed relative to the local provider. See
1548 fi_cq(3) for additional details on completion semantics.
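
As noted above, a hedged sketch of setting default operation flags through the fi_getinfo(3) hints follows; the specific flag combinations are illustrative assumptions, not recommendations.

      #include <rdma/fabric.h>
      #include <rdma/fi_endpoint.h>

      /* Hedged sketch: request default operation flags through the
       * fi_getinfo(3) hints.  The flag combinations are example choices. */
      static void set_default_op_flags(struct fi_info *hints)
      {
          /* Transmits complete only after delivery and always report a
           * completion, even under an FI_SELECTIVE_COMPLETION binding. */
          hints->tx_attr->op_flags = FI_DELIVERY_COMPLETE | FI_COMPLETION;

          /* Posted receive buffers act as multi-receive buffers. */
          hints->rx_attr->op_flags = FI_MULTI_RECV | FI_COMPLETION;

          /* The multi-receive release threshold could later be tuned with
           * fi_setopt(&ep->fid, FI_OPT_ENDPOINT, FI_OPT_MIN_MULTI_RECV,
           *           &min, sizeof min);  (illustrative) */
      }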
1549
1550 NOTES
1551 Users should call fi_close to release all resources allocated to the
1552 fabric endpoint.
1553
1554 Endpoints allocated with the FI_CONTEXT or FI_CONTEXT2 mode bits set
1555 must typically provide struct fi_context or struct fi_context2, re‐
1556 spectively, as their per operation context parameter. (See fi_getinfo(3)
1557 for details.) However, when FI_SELECTIVE_COMPLETION is enabled to sup‐
1558 press CQ completion entries, and an operation is initiated without the
1559 FI_COMPLETION flag set, then the context parameter is ignored. An ap‐
1560 plication does not need to pass a valid context into such data transfers.
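
A minimal sketch, assuming the provider set the FI_CONTEXT mode bit (with FI_CONTEXT2, struct fi_context2 would be used instead); the request structure layout and buffer size are illustrative only.

      #include <rdma/fabric.h>
      #include <rdma/fi_endpoint.h>

      /* Hedged sketch: with the FI_CONTEXT mode bit, each outstanding
       * operation supplies its own struct fi_context, which must remain
       * valid until the matching completion is read.  The request layout
       * and buffer size are illustrative. */
      struct my_request {
          struct fi_context ctx;    /* passed as the operation context */
          char buf[4096];           /* application payload buffer      */
      };

      static ssize_t post_recv(struct fid_ep *ep, struct my_request *req)
      {
          return fi_recv(ep, req->buf, sizeof req->buf, NULL,
                         FI_ADDR_UNSPEC, &req->ctx);
      }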
1561
1562 Operations that complete in error and are not associated with a valid
1563 operational context will use the endpoint context in any error report‐
1564 ing structures.
1565
1566 Although applications typically associate individual completions with
1567 either completion queues or counters, an endpoint can be attached to
1568 both a counter and completion queue. When combined with using selec‐
1569 tive completions, this allows an application to use counters to track
1570 successful completions, with a CQ used to report errors. Operations
1571 that complete with an error increment the error counter and generate a
1572 CQ completion event.
1573
1574 As mentioned in fi_getinfo(3), the ep_attr structure can be used to
1575 query providers that support various endpoint attributes. fi_getinfo
1576 can return provider info structures that can support the minimal set of
1577 requirements (such that the application maintains correctness). Howev‐
1578 er, it can also return provider info structures that exceed application
1579 requirements. As an example, consider an application requesting
1580 msg_order as FI_ORDER_NONE. The resulting output from fi_getinfo may
1581 have all the ordering bits set. The application can reset the ordering
1582 bits it does not require before creating the endpoint. The provider is
1583 free to implement a stricter ordering than is required by the applica‐
1584 tion.
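
For example, a hedged sketch of clearing unneeded ordering bits on the fi_info returned by fi_getinfo(3) before creating the endpoint; retaining only FI_ORDER_SAS is an arbitrary choice for illustration.

      #include <rdma/fabric.h>
      #include <rdma/fi_endpoint.h>

      /* Hedged sketch: clear ordering bits the application does not rely
       * on before opening the endpoint.  Keeping only send-after-send
       * ordering is an arbitrary example. */
      static void trim_msg_order(struct fi_info *info)
      {
          info->tx_attr->msg_order &= FI_ORDER_SAS;
          info->rx_attr->msg_order &= FI_ORDER_SAS;
      }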
1585
1586 RETURN VALUES
1587 Returns 0 on success. On error, a negative value corresponding to fab‐
1588 ric errno is returned. For fi_cancel, a return value of 0 indicates
1589 that the cancel request was submitted for processing. For fi_se‐
1590 topt/fi_getopt, a return value of -FI_ENOPROTOOPT indicates the
1591 provider does not support the requested option.
1592
1593 Fabric errno values are defined in rdma/fi_errno.h.
1594
1595 ERRORS
1596 -FI_EDOMAIN
1597 A resource domain was not bound to the endpoint or an attempt
1598 was made to bind multiple domains.
1599
1600 -FI_ENOCQ
1601 The endpoint has not been configured with the necessary completion queue.
1602
1603 -FI_EOPBADSTATE
1604 The endpoint’s state does not permit the requested operation.
1605
1606 SEE ALSO
1607 fi_getinfo(3), fi_domain(3), fi_cq(3), fi_msg(3), fi_tagged(3),
1608 fi_rma(3), fi_peer(3)
1609
1610 AUTHORS
1611 OpenFabrics.
1612
1613
1614
1615Libfabric Programmer’s Manual 2023-03-15 fi_endpoint(3)