fi_av(3)                       Libfabric v1.17.0                      fi_av(3)

NAME

       fi_av - Address vector operations

       fi_av_open / fi_close
              Open or close an address vector

       fi_av_bind
              Associate an address vector with an event queue.

       fi_av_insert / fi_av_insertsvc / fi_av_remove
              Insert/remove an address into/from the address vector.

       fi_av_lookup
              Retrieve an address stored in the address vector.

       fi_av_straddr
              Convert an address into a printable string.

SYNOPSIS

              #include <rdma/fi_domain.h>

              int fi_av_open(struct fid_domain *domain, struct fi_av_attr *attr,
                  struct fid_av **av, void *context);

              int fi_close(struct fid *av);

              int fi_av_bind(struct fid_av *av, struct fid *eq, uint64_t flags);

              int fi_av_insert(struct fid_av *av, void *addr, size_t count,
                  fi_addr_t *fi_addr, uint64_t flags, void *context);

              int fi_av_insertsvc(struct fid_av *av, const char *node,
                  const char *service, fi_addr_t *fi_addr, uint64_t flags,
                  void *context);

              int fi_av_insertsym(struct fid_av *av, const char *node,
                  size_t nodecnt, const char *service, size_t svccnt,
                  fi_addr_t *fi_addr, uint64_t flags, void *context);

              int fi_av_remove(struct fid_av *av, fi_addr_t *fi_addr, size_t count,
                  uint64_t flags);

              int fi_av_lookup(struct fid_av *av, fi_addr_t fi_addr,
                  void *addr, size_t *addrlen);

              fi_addr_t fi_rx_addr(fi_addr_t fi_addr, int rx_index,
                  int rx_ctx_bits);

              const char * fi_av_straddr(struct fid_av *av, const void *addr,
                  char *buf, size_t *len);

ARGUMENTS

       domain Resource domain

       av     Address vector

       eq     Event queue

       attr   Address vector attributes

       context
              User specified context associated with the address vector or
              insert operation.

       addr   Buffer containing one or more addresses to insert into the
              address vector.

       addrlen
              On input, specifies the size of the addr buffer.  On output,
              stores the size of the buffer needed to hold the address (see
              fi_av_lookup).

       fi_addr
              For insert, a reference to an array where returned fabric
              addresses will be written.  For remove, one or more fabric
              addresses to remove.  If FI_AV_USER_ID is requested, also used
              as input into insert calls to associate a user ID with each
              added address.

       count  Number of addresses to insert into or remove from an AV.

       flags  Additional flags to apply to the operation.

DESCRIPTION

       Address vectors are used to map higher-level addresses, which may be
       more natural for an application to use, into fabric specific addresses.
       For example, an endpoint may be associated with a struct sockaddr_in
       address, indicating the endpoint is reachable using a TCP port number
       over an IPv4 address.  This may hold even if the endpoint communicates
       using a proprietary network protocol.  The purpose of the AV is to
       associate a higher-level address with a simpler, more efficient value
       that can be used by the libfabric API in a fabric agnostic way.  The
       mapped address is of type fi_addr_t and is returned through an AV
       insertion call.  The fi_addr_t is designed such that it may be a simple
       index into an array, a pointer to a structure, or a compact network
       address that may be placed directly into protocol headers.

       The process of mapping an address is fabric and provider specific, but
       may involve lengthy address resolution and fabric management protocols.
       AV operations are synchronous by default, but may be set to operate
       asynchronously by specifying the FI_EVENT flag to fi_av_open.  When
       requesting asynchronous operation, the application must first bind an
       event queue to the AV before inserting addresses.  See the NOTES
       section for AV restrictions on duplicate addresses.

   fi_av_open
       fi_av_open allocates or opens an address vector.  The properties and
       behavior of the address vector are defined by struct fi_av_attr.

              struct fi_av_attr {
                  enum fi_av_type  type;        /* type of AV */
                  int              rx_ctx_bits; /* address bits to identify rx ctx */
                  size_t           count;       /* # entries for AV */
                  size_t           ep_per_node; /* # endpoints per fabric address */
                  const char       *name;       /* system name of AV */
                  void             *map_addr;   /* base mmap address */
                  uint64_t         flags;       /* operation flags */
              };

       type   An AV type corresponds to a conceptual implementation of an
              address vector.  The type specifies how an application views
              data stored in the AV, including how it may be accessed.  Valid
              values are:

       - FI_AV_MAP
              Addresses which are inserted into an AV are mapped to a native
              fabric address for use by the application.  The use of FI_AV_MAP
              requires that an application store the returned fi_addr_t value
              that is associated with each inserted address.  The advantage of
              using FI_AV_MAP is that the returned fi_addr_t value may contain
              encoded address data, which is immediately available when
              processing data transfer requests.  This can eliminate or reduce
              the number of memory lookups needed when initiating a transfer.
              The disadvantage of FI_AV_MAP is the increase in memory usage
              needed to store the returned addresses.  Addresses are stored in
              the AV using a provider specific mechanism, including, but not
              limited to, a tree, a hash table, or entries maintained on the
              heap.

       - FI_AV_TABLE
              Addresses which are inserted into an AV of type FI_AV_TABLE are
              accessible using a simple index.  Conceptually, the AV may be
              treated as an array of addresses, though the provider may
              implement the AV using a variety of mechanisms.  When
              FI_AV_TABLE is used, the returned fi_addr_t is an index, with
              the index for an inserted address the same as its insertion
              order into the table.  The index of the first address inserted
              into an FI_AV_TABLE will be 0, and successive insertions will be
              given sequential indices.  Sequential indices will be assigned
              across insertion calls on the same AV.

       - FI_AV_UNSPEC
              Provider will choose its preferred AV type.  The AV type used
              will be returned through the type field in fi_av_attr.

       Receive Context Bits (rx_ctx_bits)
              The receive context bits field is only for use with scalable
              endpoints.  It indicates the number of bits reserved in a
              returned fi_addr_t, which will be used to identify a specific
              target receive context.  See fi_rx_addr() and fi_endpoint(3) for
              additional details on receive contexts.  The requested number of
              bits should be selected such that 2 ^ rx_ctx_bits >= rx_ctx_cnt
              for the endpoint.

       count  Indicates the expected number of addresses that will be inserted
              into the AV.  The provider uses this to optimize resource
              allocations.

       ep_per_node
              This field indicates the number of endpoints that will be
              associated with a specific fabric, or network, address.  If the
              number of endpoints per node is unknown, this value should be
              set to 0.  The provider uses this value to optimize resource
              allocations.  For example, distributed, parallel applications
              may set this to the number of processes allocated per node,
              times the number of endpoints each process will open.

       name   An optional system name associated with the address vector to
              create or open.  Address vectors may be shared across multiple
              processes which access the same named domain on the same node.
              The name field allows the underlying provider to identify a
              shared AV.

       If the name field is non-NULL and the AV is not opened for read-only
       access, a named AV will be created, if it does not already exist.

       map_addr
              The map_addr determines the base fi_addr_t address that a
              provider should use when sharing an AV of type FI_AV_MAP between
              processes.  Processes that provide the same value for map_addr
              to a shared AV may use the same fi_addr_t values returned from
              an fi_av_insert call.

       The map_addr may be used by the provider to mmap memory allocated for a
       shared AV between processes; however, the provider is not required to
       use the map_addr in this fashion.  The only requirement is that an
       fi_addr_t returned as part of an fi_av_insert call on one process is
       usable on another process which opens an AV of the same name at the
       same map_addr value.  The relationship between the map_addr and any
       returned fi_addr_t is not defined.

       If name is non-NULL and map_addr is 0, then the map_addr used by the
       provider will be returned through the attribute structure.  The
       map_addr field is ignored if name is NULL.

       flags  The following flags may be used when opening an AV.

       - FI_EVENT
              When the flag FI_EVENT is specified, all insert operations on
              this AV will occur asynchronously.  There will be one EQ error
              entry generated for each failed address insertion, followed by
              one non-error event indicating that the insertion operation has
              completed.  There will always be one non-error completion event
              for each insert operation, even if all addresses fail.  The
              context field in all completions will be the context specified
              to the insert call, and the data field in the final completion
              entry will report the number of addresses successfully inserted.
              If an error occurs during the asynchronous insertion, an error
              completion entry is returned (see fi_eq(3) for a discussion of
              the fi_eq_err_entry error completion struct).  The context field
              of the error completion will be the context that was specified
              in the insert call; the data field will contain the index of the
              failed address.  There will be one error completion returned for
              each address that fails to insert into the AV.

       If an AV is opened with FI_EVENT, any insertions attempted before an EQ
       is bound to the AV will fail with -FI_ENOEQ.

       Error completions for failed insertions will contain the index of the
       failed address in the data field of the error completion entry.

       Note that the order of delivery of insert completions may not match the
       order in which the calls to fi_av_insert were made.  The only guarantee
       is that all error completions for a given call to fi_av_insert will
       precede the single associated non-error completion.

       - FI_READ
              Opens an AV for read-only access.  An AV opened for read-only
              access must be named (name attribute specified), and the AV must
              exist.

       - FI_SYMMETRIC
              Indicates that each node will be associated with the same number
              of endpoints, the same transport addresses will be allocated on
              each node, and the transport addresses will be sequential.  This
              feature targets distributed applications on large fabrics and
              allows for highly-optimized storage of remote endpoint
              addressing.

   fi_close
       The fi_close call is used to release all resources associated with an
       address vector.  Note that any events queued on an event queue
       referencing the AV are left untouched.  It is recommended that callers
       retrieve all events associated with the AV before closing it.

       When closing the address vector, there must be no opened endpoints
       associated with the AV.  If resources are still associated with the AV
       when attempting to close, the call will return -FI_EBUSY.

   fi_av_bind
       Associates an event queue with the AV.  If an AV has been opened with
       FI_EVENT, then an event queue must be bound to the AV before any
       insertion calls are attempted.  Any calls to insert addresses before an
       event queue has been bound will fail with -FI_ENOEQ.  Flags are
       reserved for future use and must be 0.

   fi_av_insert
       The fi_av_insert call inserts zero or more addresses into an AV.  The
       number of addresses is specified through the count parameter.  The addr
       parameter references an array of addresses to insert into the AV.
       Addresses inserted into an address vector must be in the same format as
       specified in the addr_format field of the fi_info struct provided when
       opening the corresponding domain.  When using the FI_ADDR_STR format,
       the addr parameter should reference an array of strings (char **).

       For AVs of type FI_AV_MAP, once inserted addresses have been mapped,
       the mapped values are written into the buffer referenced by fi_addr.
       The fi_addr buffer must remain valid until the AV insertion has
       completed and an event has been generated to an associated event queue.
       The value of the returned fi_addr should be considered opaque by the
       application for AVs of type FI_AV_MAP.  The returned value may point to
       an internal structure or a provider specific encoding of low-level
       addressing data, for example.  In the latter case, use of FI_AV_MAP may
       be able to avoid memory references during data transfer operations.

       For AVs of type FI_AV_TABLE, addresses are placed into the table in
       order.  An address is inserted at the lowest index that corresponds to
       an unused table location, with indices starting at 0.  That is, the
       first address inserted may be referenced at index 0, the second at
       index 1, and so forth.  When addresses are inserted into an AV table,
       the assigned fi_addr values will be simple indices corresponding to the
       entry in the table where the address was inserted.  Index values
       accumulate across successive insert calls in the order the calls are
       made, not necessarily in the order the insertions complete.

       Because insertions occur at a pre-determined index, the fi_addr
       parameter may be NULL.  If fi_addr is non-NULL, it must reference an
       array of fi_addr_t, and the buffer must remain valid until the
       insertion operation completes.  Note that if fi_addr is NULL and
       synchronous operation is requested without using the FI_SYNC_ERR flag,
       individual insertion failures cannot be reported and the application
       must use other calls, such as fi_av_lookup, to learn which specific
       addresses failed to insert.  Since fi_av_remove is provider-specific,
       it is recommended that calls to fi_av_insert following a call to
       fi_av_remove always reference a valid buffer in the fi_addr parameter.
       Otherwise it may be difficult to determine what the next assigned index
       will be.
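
       The FI_AV_TABLE index assignment described above can be illustrated
       with a small stand-alone model.  This is only a sketch: struct
       model_table and model_table_insert are hypothetical names, not part of
       libfabric, and the model assumes that no addresses have been removed
       and that every insertion succeeds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an FI_AV_TABLE; not a libfabric type. */
struct model_table {
    uint64_t next_index;    /* lowest unused table index */
};

/*
 * Model of a table insert: assigns sequential indices to 'count'
 * addresses.  As with fi_av_insert on an FI_AV_TABLE, 'out' may be NULL
 * because the caller can predict the assigned indices.  Returns the
 * number of entries "inserted" (always 'count' in this model).
 */
static size_t model_table_insert(struct model_table *t, size_t count,
                                 uint64_t *out)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (out)
            out[i] = t->next_index;
        t->next_index++;
    }
    return count;
}
```

       In this model, a first call inserting three addresses yields indices 0
       through 2, and a second call inserting two more yields 3 and 4, which
       matches the statement that indices accumulate across insert calls.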

       flags  The following flags may be passed to AV insertion calls:
              fi_av_insert, fi_av_insertsvc, or fi_av_insertsym.

       - FI_MORE
              In order to allow optimized address insertion, the application
              may specify the FI_MORE flag to the insert call to give a hint
              to the provider that more insertion requests will follow,
              allowing the provider to aggregate insertion requests if
              desired.  An application may make any number of insertion calls
              with FI_MORE set, provided that they are followed by an
              insertion call without FI_MORE.  This signifies to the provider
              that the insertion list is complete.  Providers are free to
              ignore FI_MORE.

       - FI_SYNC_ERR
              This flag applies to synchronous insertions only, and is used to
              retrieve error details of failed insertions.  If set, the
              context parameter of insertion calls references an array of
              integers, with context set to the address of the first element
              of the array.  The resulting status of attempting to insert each
              address will be written to the corresponding array location.
              Successful insertions will be updated to 0.  Failures will
              contain a fabric errno code.

       - FI_AV_USER_ID
              This flag associates a user-assigned identifier with each AV
              entry that is returned with any completion entry in place of the
              AV’s address.  See the user ID section below.
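
       As a sketch of how an application might consume the status array that
       FI_SYNC_ERR fills in, the helper below scans an array of integers of
       the kind passed through the context parameter.  The helper name
       count_failed_inserts is hypothetical, and the code relies only on the
       rule stated above that 0 means success and any other value is an
       error code.

```c
#include <stddef.h>

/*
 * Scan a per-address status array of the kind FI_SYNC_ERR fills in:
 * 0 means the address was inserted, any other value is an error code.
 * Stores the index of the first failure in *first_failed (if non-NULL)
 * and returns the number of failed insertions.
 */
static size_t count_failed_inserts(const int *status, size_t count,
                                   size_t *first_failed)
{
    size_t i, failed = 0;

    for (i = 0; i < count; i++) {
        if (status[i] == 0)
            continue;
        if (failed == 0 && first_failed)
            *first_failed = i;
        failed++;
    }
    return failed;
}
```

       The array must have one element per address passed to the insert call,
       since each status is written to the location matching its address.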

   fi_av_insertsvc
       The fi_av_insertsvc call behaves similarly to fi_av_insert, but allows
       the application to specify the node and service names, similar to the
       fi_getinfo inputs, rather than an encoded address.  The node and
       service parameters are defined the same as fi_getinfo(3).  Node should
       be a string that corresponds to a hostname or network address.  The
       service string corresponds to a textual representation of a transport
       address.  Applications may also pass in an FI_ADDR_STR formatted
       address as the node parameter.  In such cases, the service parameter
       must be NULL.  See fi_getinfo(3) for details on using FI_ADDR_STR.
       Supported flags are the same as for fi_av_insert.

   fi_av_insertsym
       fi_av_insertsym performs a symmetric insert that inserts a sequential
       range of nodes and/or service addresses into an AV.  The svccnt
       parameter indicates the number of transport (endpoint) addresses to
       insert into the AV for each node address, with the service parameter
       specifying the starting transport address.  Inserted transport
       addresses will be of the range {service, service + svccnt - 1},
       inclusive.  All service addresses for a node will be inserted before
       the next node is inserted.

       The nodecnt parameter indicates the number of node (network) addresses
       to insert into the AV, with the node parameter specifying the starting
       node address.  Inserted node addresses will be of the range {node, node
       + nodecnt - 1}, inclusive.  If node is a non-numeric string, such as a
       hostname, it must contain a numeric suffix if nodecnt > 1.

       As an example, if node = “10.1.1.1”, nodecnt = 2, service = “5000”, and
       svccnt = 2, the following addresses will be inserted into the AV in the
       order shown: 10.1.1.1:5000, 10.1.1.1:5001, 10.1.1.2:5000,
       10.1.1.2:5001.  If node were replaced by the hostname “host10”, the
       addresses would be: host10:5000, host10:5001, host11:5000, host11:5001.

       The total number of inserted addresses will be nodecnt x svccnt.

       Supported flags are the same as for fi_av_insert.
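
       The insertion order described above can be reproduced with a short
       stand-alone routine.  This is a sketch under stated assumptions: it
       handles only dotted-quad IPv4 nodes and numeric services, it assumes a
       node address is advanced by treating it as a 32-bit integer, and the
       names model_insertsym_order and MODEL_ADDR_LEN are hypothetical.  It
       does not call libfabric.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

#define MODEL_ADDR_LEN 32

/*
 * Enumerate "node:service" pairs in the order fi_av_insertsym is
 * documented to insert them: all services for one node, then the next
 * node.  Writes at most 'max' strings to 'out' and returns the number
 * written, or 0 if an input cannot be parsed.
 */
static size_t model_insertsym_order(const char *node, size_t nodecnt,
                                    const char *service, size_t svccnt,
                                    char out[][MODEL_ADDR_LEN], size_t max)
{
    unsigned int a, b, c, d;
    unsigned long base, ip, port;
    char *end;
    size_t n, s, written = 0;

    if (sscanf(node, "%u.%u.%u.%u", &a, &b, &c, &d) != 4 ||
        a > 255 || b > 255 || c > 255 || d > 255)
        return 0;
    port = strtoul(service, &end, 10);
    if (end == service || *end != '\0')
        return 0;

    base = ((unsigned long) a << 24) | (b << 16) | (c << 8) | d;
    for (n = 0; n < nodecnt; n++) {
        /* ASSUMPTION: the next node is the next 32-bit address value. */
        ip = (base + (unsigned long) n) & 0xffffffffUL;
        for (s = 0; s < svccnt; s++) {
            if (written == max)
                return written;
            snprintf(out[written], MODEL_ADDR_LEN, "%lu.%lu.%lu.%lu:%lu",
                     (ip >> 24) & 0xff, (ip >> 16) & 0xff,
                     (ip >> 8) & 0xff, ip & 0xff,
                     port + (unsigned long) s);
            written++;
        }
    }
    return written;
}
```

       Called with node "10.1.1.1", nodecnt 2, service "5000", and svccnt 2,
       the routine produces the four addresses from the example above in the
       same order, for a total of nodecnt x svccnt entries.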

   fi_av_remove
       fi_av_remove removes a set of addresses from an address vector.  All
       resources associated with the indicated addresses are released.  The
       removed address - either the mapped address (in the case of FI_AV_MAP)
       or index (FI_AV_TABLE) - is invalid until it is returned again by a new
       fi_av_insert.

       The behavior of operations in progress that reference the removed
       addresses is undefined.

       The use of fi_av_remove is an optimization that applications may use to
       free memory allocated with addresses that will no longer be accessed.
       Inserted addresses are not required to be removed.  fi_close will
       automatically clean up any resources associated with addresses
       remaining in the AV when it is invoked.

       Flags are reserved for future use and must be 0.

   fi_av_lookup
       This call returns the address stored in the address vector that
       corresponds to the given fi_addr.  The returned address is the same
       format as those stored by the AV.  On input, the addrlen parameter
       should indicate the size of the addr buffer.  If the actual address is
       larger than what can fit into the buffer, it will be truncated.  On
       output, addrlen is set to the size of the buffer needed to store the
       address, which may be larger than the input value.
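
       The addr/addrlen convention above can be modeled without libfabric.
       The following sketch copies as much of a stored address as fits and
       then reports the full size required; model_lookup_copy is a
       hypothetical helper that only mimics the documented truncation
       behavior and is not how any provider is known to implement
       fi_av_lookup.

```c
#include <stddef.h>
#include <string.h>

/*
 * Model of the addr/addrlen convention used by fi_av_lookup: copy as
 * much of the stored address as fits into 'addr', then set *addrlen to
 * the size needed to hold the whole address.  Returns 1 if the copy was
 * truncated, 0 otherwise.
 */
static int model_lookup_copy(const void *stored, size_t stored_len,
                             void *addr, size_t *addrlen)
{
    size_t n = *addrlen < stored_len ? *addrlen : stored_len;

    memcpy(addr, stored, n);
    *addrlen = stored_len;      /* size required, may exceed the input */
    return n < stored_len;
}
```

       A caller following this pattern would compare the output addrlen with
       the size it passed in, and retry with a larger buffer when the output
       value is greater.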

   fi_rx_addr
       This function is used to convert an endpoint address, returned by
       fi_av_insert, into an address that specifies a target receive context.
       The specified fi_addr parameter must either be a value returned from
       fi_av_insert, in the case of FI_AV_MAP, or an index, in the case of
       FI_AV_TABLE.  The value for rx_ctx_bits must match that specified in
       the AV attributes for the given address.

       Connected endpoints that support multiple receive contexts, but are not
       associated with address vectors should specify FI_ADDR_NOTAVAIL for the
       fi_addr parameter.
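
       To make the role of rx_ctx_bits concrete, the sketch below computes
       the smallest bit count satisfying 2 ^ rx_ctx_bits >= rx_ctx_cnt and
       then combines a base address with a receive context index.  The
       function names are hypothetical, and the placement of the index in
       the uppermost rx_ctx_bits bits of the 64-bit value is an assumption
       made for illustration; applications should call fi_rx_addr rather
       than encode addresses themselves.

```c
#include <stdint.h>

/* Smallest number of bits such that 2 ^ bits >= rx_ctx_cnt. */
static int model_rx_ctx_bits(uint64_t rx_ctx_cnt)
{
    int bits = 0;

    while (bits < 63 && (UINT64_C(1) << bits) < rx_ctx_cnt)
        bits++;
    return bits;
}

/*
 * Combine a base address with a receive context index.
 * ASSUMPTION: the index occupies the top rx_ctx_bits bits of the value.
 * With no reserved bits the base address is returned unchanged.
 */
static uint64_t model_rx_addr(uint64_t fi_addr, int rx_index,
                              int rx_ctx_bits)
{
    if (rx_ctx_bits <= 0 || rx_ctx_bits > 64)
        return fi_addr;
    return ((uint64_t) rx_index << (64 - rx_ctx_bits)) | fi_addr;
}
```

       For an endpoint with 5 receive contexts, the first helper yields 3
       bits, since 2 ^ 3 = 8 is the smallest power of two that is at least 5.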

   fi_av_straddr
       The fi_av_straddr function converts the provided address into a
       printable string.  The specified address must be of the same format as
       those stored by the AV, though the address itself is not required to
       have been inserted.  On input, the len parameter should specify the
       size of the buffer referenced by buf.  On output, len is set to the
       size of the buffer needed to store the address.  This size may be
       larger than the input len.  If the provided buffer is too small, the
       results will be truncated.  fi_av_straddr returns a pointer to buf.

NOTES

       An AV should only store a single instance of an address.  Attempting to
       insert a duplicate copy of the same address into an AV may result in
       undefined behavior, depending on the provider implementation.
       Providers are not required to check for duplicates, as doing so could
       incur significant overhead to the insertion process.  For portability,
       applications may need to track which peer addresses have been inserted
       into a given AV in order to avoid duplicate entries.  However,
       providers are required to support the removal, followed by the
       re-insertion of an address.  Only duplicate insertions are restricted.

       Providers may implement AVs using a variety of mechanisms.
       Specifically, a provider may begin resolving inserted addresses as soon
       as they have been added to an AV, even if asynchronous operation has
       been specified.  Similarly, a provider may lazily release resources
       from removed entries.

USER IDENTIFIERS FOR ADDRESSES

       As described above, endpoint addresses that are inserted into an AV are
       mapped to an fi_addr_t value.  The fi_addr_t is used in data transfer
       APIs to specify the destination of an outbound transfer, in receive
       APIs to indicate the source for an inbound transfer, and also in
       completion events to report the source address of inbound transfers.
       The FI_AV_USER_ID capability bit and flag provide a mechanism by which
       the fi_addr_t value reported by a completion event is replaced with a
       user-specified value instead.  This is useful for applications that
       need to map the source address to their own data structure.

       Support for FI_AV_USER_ID is provider specific, as it may not be
       feasible for a provider to implement this support without significant
       overhead.  For example, some providers may need to add a reverse lookup
       mechanism.  This feature may be unavailable if shared AVs are
       requested, or negatively impact the per process memory footprint if
       implemented.  For providers that do not support FI_AV_USER_ID, users
       may be able to trade off lookup processing with protocol overhead, by
       carrying source identification within a message header.

       User-specified fi_addr_t values are provided as part of address
       insertion (e.g. fi_av_insert) through the fi_addr parameter.  The
       fi_addr parameter acts as input/output in this case.  When the
       FI_AV_USER_ID flag is passed to any of the insert calls, the caller
       must specify an fi_addr_t identifier value to associate with each
       address.  The provider will record that identifier and use it where
       required as part of any completion event.  Note that the output from
       the AV insertion call is unchanged.  The provider will return an
       fi_addr_t value that maps to each address, and that value must be used
       for all data transfer operations.
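
       The input/output role of fi_addr under FI_AV_USER_ID can be sketched
       with a toy model.  Everything here is hypothetical (struct model_av,
       model_insert_user_id, model_completion_source), and the model assumes
       a table-style AV whose returned values are simple indices; it only
       illustrates that the caller's IDs go in, transfer addresses come out,
       and completions report the IDs.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_AV_MAX 16

/* Hypothetical AV that remembers one user ID per entry. */
struct model_av {
    uint64_t user_id[MODEL_AV_MAX];
    size_t   count;
};

/*
 * Model of an insert with FI_AV_USER_ID: on input fi_addr[i] holds the
 * caller's ID for address i; on output it holds the value to use for
 * data transfers (here simply the table index).  Returns the number of
 * entries inserted, which is less than 'count' if the model is full.
 */
static size_t model_insert_user_id(struct model_av *av, uint64_t *fi_addr,
                                   size_t count)
{
    size_t i;

    for (i = 0; i < count && av->count < MODEL_AV_MAX; i++) {
        av->user_id[av->count] = fi_addr[i];
        fi_addr[i] = av->count++;
    }
    return i;
}

/* Value a completion would report as the source for address 'addr'. */
static uint64_t model_completion_source(const struct model_av *av,
                                        uint64_t addr)
{
    return av->user_id[addr];
}
```

       After inserting two peers with IDs 100 and 200, the caller would use
       the returned values 0 and 1 for data transfers, while a completion
       for the second peer would report 200 as its source.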

RETURN VALUES

       Insertion calls for an AV opened for synchronous operation will return
       the number of addresses that were successfully inserted.  In the case
       of failure, the return value will be less than the number of addresses
       that was specified.

       Insertion calls for an AV opened for asynchronous operation (with
       FI_EVENT flag specified) will return 0 if the operation was
       successfully initiated.  In the case of failure, a negative fabric
       errno will be returned.  Providers are allowed to abort insertion
       operations in the case of an error.  Addresses that are not inserted
       because they were aborted will fail with an error code of FI_ECANCELED.

       In both the synchronous and asynchronous modes of operation, the
       fi_addr buffer associated with a failed or aborted insertion will be
       set to FI_ADDR_NOTAVAIL.
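
       One way to locate failed entries after an insert is to scan the
       fi_addr array for the unavailable marker.  The sketch below is
       stand-alone: MODEL_ADDR_NOTAVAIL is a local stand-in assumed to have
       all bits set, and model_failed_indices is a hypothetical helper.  Real
       code would compare against FI_ADDR_NOTAVAIL from the libfabric
       headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for FI_ADDR_NOTAVAIL; ASSUMPTION: all bits set. */
#define MODEL_ADDR_NOTAVAIL UINT64_MAX

/*
 * Given the fi_addr array filled in by an insert of 'count' addresses,
 * record the indices of entries marked unavailable in 'failed'
 * (capacity 'max') and return how many such entries there were.
 */
static size_t model_failed_indices(const uint64_t *fi_addr, size_t count,
                                   size_t *failed, size_t max)
{
    size_t i, n = 0;

    for (i = 0; i < count; i++) {
        if (fi_addr[i] != MODEL_ADDR_NOTAVAIL)
            continue;
        if (n < max)
            failed[n] = i;
        n++;
    }
    return n;
}
```

       For a synchronous insert, the count returned by this scan plus the
       return value of the insert call would be expected to equal the number
       of addresses that was specified.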

       All other calls return 0 on success, or a negative value corresponding
       to fabric errno on error.  Fabric errno values are defined in
       rdma/fi_errno.h.

SEE ALSO

       fi_getinfo(3), fi_endpoint(3), fi_domain(3), fi_eq(3)

AUTHORS

       OpenFabrics.

Libfabric Programmer’s Manual     2022-12-11                          fi_av(3)