fi_av(3) Libfabric v1.6.1 fi_av(3)

NAME

fi_av - Address vector operations

fi_av_open / fi_close : Open or close an address vector

fi_av_bind : Associate an address vector with an event queue.

fi_av_insert / fi_av_insertsvc / fi_av_remove : Insert/remove an
address into/from the address vector.

fi_av_lookup : Retrieve an address stored in the address vector.

fi_av_straddr : Convert an address into a printable string.

SYNOPSIS

#include <rdma/fi_domain.h>

int fi_av_open(struct fid_domain *domain, struct fi_av_attr *attr,
    struct fid_av **av, void *context);

int fi_close(struct fid *av);

int fi_av_bind(struct fid_av *av, struct fid *eq, uint64_t flags);

int fi_av_insert(struct fid_av *av, void *addr, size_t count,
    fi_addr_t *fi_addr, uint64_t flags, void *context);

int fi_av_insertsvc(struct fid_av *av, const char *node,
    const char *service, fi_addr_t *fi_addr, uint64_t flags,
    void *context);

int fi_av_insertsym(struct fid_av *av, const char *node,
    size_t nodecnt, const char *service, size_t svccnt,
    fi_addr_t *fi_addr, uint64_t flags, void *context);

int fi_av_remove(struct fid_av *av, fi_addr_t *fi_addr, size_t count,
    uint64_t flags);

int fi_av_lookup(struct fid_av *av, fi_addr_t fi_addr,
    void *addr, size_t *addrlen);

fi_addr_t fi_rx_addr(fi_addr_t fi_addr, int rx_index,
    int rx_ctx_bits);

const char * fi_av_straddr(struct fid_av *av, const void *addr,
    char *buf, size_t *len);

ARGUMENTS

domain : Resource domain

av : Address vector

eq : Event queue

attr : Address vector attributes

context : User specified context associated with the address vector or
insert operation.

addr : Buffer containing one or more addresses to insert into address
vector.

addrlen : On input, specifies the size of the addr buffer. On output,
stores the size of the buffer needed to hold the address, which may be
larger than the input value (see fi_av_lookup).

fi_addr : For insert, a reference to an array where returned fabric
addresses will be written. For remove, one or more fabric addresses to
remove.

count : Number of addresses to insert/remove from an AV.

flags : Additional flags to apply to the operation.

DESCRIPTION

Address vectors are used to map higher level addresses, which may be
more natural for an application to use, into fabric specific addresses.
The mapping of addresses is fabric and provider specific, but may
involve lengthy address resolution and fabric management protocols. AV
operations are synchronous by default, but may be set to operate
asynchronously by specifying the FI_EVENT flag to fi_av_open. When
requesting asynchronous operation, the application must first bind an
event queue to the AV before inserting addresses.

fi_av_open
fi_av_open allocates or opens an address vector. The properties and
behavior of the address vector are defined by struct fi_av_attr.

struct fi_av_attr {
    enum fi_av_type  type;        /* type of AV */
    int              rx_ctx_bits; /* address bits to identify rx ctx */
    size_t           count;       /* # entries for AV */
    size_t           ep_per_node; /* # endpoints per fabric address */
    const char       *name;       /* system name of AV */
    void             *map_addr;   /* base mmap address */
    uint64_t         flags;       /* operation flags */
};

type : An AV type corresponds to a conceptual implementation of an
address vector. The type specifies how an application views data
stored in the AV, including how it may be accessed. Valid values are:

· FI_AV_MAP : Addresses which are inserted into an AV are mapped to a
  native fabric address for use by the application. The use of
  FI_AV_MAP requires that an application store the returned fi_addr_t
  value that is associated with each inserted address. The advantage
  of using FI_AV_MAP is that the returned fi_addr_t value may contain
  encoded address data, which is immediately available when processing
  data transfer requests. This can eliminate or reduce the number of
  memory lookups needed when initiating a transfer. The disadvantage
  of FI_AV_MAP is the increase in memory usage needed to store the
  returned addresses. Addresses are stored in the AV using a
  provider-specific mechanism, including, but not limited to, a tree,
  a hash table, or storage maintained on the heap.

· FI_AV_TABLE : Addresses which are inserted into an AV of type
  FI_AV_TABLE are accessible using a simple index. Conceptually, the
  AV may be treated as an array of addresses, though the provider may
  implement the AV using a variety of mechanisms. When FI_AV_TABLE is
  used, the returned fi_addr_t is an index, with the index for an
  inserted address the same as its insertion order into the table. The
  index of the first address inserted into an FI_AV_TABLE will be 0,
  and successive insertions will be given sequential indices.
  Sequential indices will be assigned across insertion calls on the
  same AV.

· FI_AV_UNSPEC : Provider will choose its preferred AV type. The AV
  type used will be returned through the type field in fi_av_attr.

Receive Context Bits (rx_ctx_bits) : The receive context bits field is
only for use with scalable endpoints. It indicates the number of bits
reserved in a returned fi_addr_t, which will be used to identify a
specific target receive context. See fi_rx_addr() and fi_endpoint(3)
for additional details on receive contexts. The requested number of
bits should be selected such that 2 ^ rx_ctx_bits >= rx_ctx_cnt for the
endpoint.
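
The sizing rule for rx_ctx_bits can be sketched in a few lines of C.
This is a minimal, stand-alone sketch: min_rx_ctx_bits is a
hypothetical helper, not part of libfabric, and it only computes the
smallest bit count b for which 2 ^ b >= rx_ctx_cnt.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a libfabric call): returns the smallest
 * number of bits b such that 2^b >= rx_ctx_cnt.
 */
static int min_rx_ctx_bits(uint64_t rx_ctx_cnt)
{
    int bits = 0;

    /* This sketch supports counts from 1 up to 2^63. */
    assert(rx_ctx_cnt >= 1 && rx_ctx_cnt <= (UINT64_C(1) << 63));

    /* Grow b until 2^b covers every receive context index. */
    while ((UINT64_C(1) << bits) < rx_ctx_cnt)
        bits++;
    return bits;
}
```

An application could store the result in the rx_ctx_bits field of
struct fi_av_attr before calling fi_av_open. For example, an endpoint
with 5 receive contexts needs 3 bits, because 2 ^ 2 = 4 is too small.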

count : Indicates the expected number of addresses that will be
inserted into the AV. The provider uses this to optimize resource
allocations.

ep_per_node : This field indicates the number of endpoints that will be
associated with a specific fabric, or network, address. If the number
of endpoints per node is unknown, this value should be set to 0. The
provider uses this value to optimize resource allocations. For
example, distributed, parallel applications may set this to the number
of processes allocated per node, times the number of endpoints each
process will open.

name : An optional system name associated with the address vector to
create or open. Address vectors may be shared across multiple
processes which access the same named domain on the same node. The
name field allows the underlying provider to identify a shared AV.

If the name field is non-NULL and the AV is not opened for read-only
access, a named AV will be created, if it does not already exist.

map_addr : The map_addr determines the base fi_addr_t address that a
provider should use when sharing an AV of type FI_AV_MAP between
processes. Processes that provide the same value for map_addr to a
shared AV may use the same fi_addr_t values returned from an
fi_av_insert call.

The map_addr may be used by the provider to mmap memory allocated for a
shared AV between processes; however, the provider is not required to
use the map_addr in this fashion. The only requirement is that an
fi_addr_t returned as part of an fi_av_insert call on one process is
usable on another process which opens an AV of the same name at the
same map_addr value. The relationship between the map_addr and any
returned fi_addr_t is not defined.

If name is non-NULL and map_addr is 0, then the map_addr used by the
provider will be returned through the attribute structure. The
map_addr field is ignored if name is NULL.

flags : The following flags may be used when opening an AV.

· FI_EVENT : When FI_EVENT is specified, all insert operations on this
  AV occur asynchronously. Each insert operation generates one EQ
  error entry for every address that fails to insert, followed by
  exactly one non-error completion event, which is generated even if
  all addresses fail. The context field in every completion is the
  context specified to the insert call. In the final, non-error
  completion, the data field reports the number of addresses
  successfully inserted. In each error completion (see fi_eq(3) for a
  discussion of the fi_eq_err_entry error completion struct), the data
  field contains the index of the failed address.

  If an AV is opened with FI_EVENT, any insertions attempted before an
  EQ is bound to the AV will fail with -FI_ENOEQ.

  Note that the order of delivery of insert completions may not match
  the order in which the calls to fi_av_insert were made. The only
  guarantee is that all error completions for a given call to
  fi_av_insert will precede the single associated non-error
  completion.

· FI_READ : Opens an AV for read-only access. An AV opened for
  read-only access must be named (name attribute specified), and the AV
  must exist.

· FI_SYMMETRIC : Indicates that each node will be associated with the
  same number of endpoints, the same transport addresses will be
  allocated on each node, and the transport addresses will be
  sequential. This feature targets distributed applications on large
  fabrics and allows for highly-optimized storage of remote endpoint
  addressing.

fi_close
The fi_close call is used to release all resources associated with an
address vector. Note that any events queued on an event queue
referencing the AV are left untouched. It is recommended that callers
retrieve all events associated with the AV before closing it.

When closing the address vector, there must be no opened endpoints
associated with the AV. If resources are still associated with the AV
when attempting to close, the call will return -FI_EBUSY.

fi_av_bind
Associates an event queue with the AV. If an AV has been opened with
FI_EVENT, then an event queue must be bound to the AV before any
insertion calls are attempted. Any calls to insert addresses before an
event queue has been bound will fail with -FI_ENOEQ. Flags are
reserved for future use and must be 0.

fi_av_insert
The fi_av_insert call inserts zero or more addresses into an AV. The
number of addresses is specified through the count parameter. The addr
parameter references an array of addresses to insert into the AV.
Addresses inserted into an address vector must be in the same format as
specified in the addr_format field of the fi_info struct provided when
opening the corresponding domain. When using the FI_ADDR_STR format,
the addr parameter should reference an array of strings (char **).

For AVs of type FI_AV_MAP, once inserted addresses have been mapped,
the mapped values are written into the buffer referenced by fi_addr.
The fi_addr buffer must remain valid until the AV insertion has
completed and an event has been generated to an associated event queue.
The value of the returned fi_addr should be considered opaque by the
application for AVs of type FI_AV_MAP. The returned value may point to
an internal structure or a provider specific encoding of low-level
addressing data, for example. In the latter case, use of FI_AV_MAP may
be able to avoid memory references during data transfer operations.

For AVs of type FI_AV_TABLE, addresses are placed into the table in
order. An address is inserted at the lowest index that corresponds to
an unused table location, with indices starting at 0. That is, the
first address inserted may be referenced at index 0, the second at
index 1, and so forth. When addresses are inserted into an AV table,
the assigned fi_addr values will be simple indices corresponding to the
entry into the table where the address was inserted. Index values
accumulate across successive insert calls in the order the calls are
made, not necessarily in the order the insertions complete.

Because insertions occur at a pre-determined index, the fi_addr
parameter may be NULL. If fi_addr is non-NULL, it must reference an
array of fi_addr_t, and the buffer must remain valid until the
insertion operation completes. Note that if fi_addr is NULL and
synchronous operation is requested without the FI_SYNC_ERR flag,
individual insertion failures cannot be reported and the application
must use other calls, such as fi_av_lookup, to learn which specific
addresses failed to insert. Since the behavior of fi_av_remove is
provider-specific, it is recommended that calls to fi_av_insert
following a call to fi_av_remove always reference a valid buffer in the
fi_addr parameter. Otherwise it may be difficult to determine what the
next assigned index will be.

flags : The following flags may be passed to AV insertion calls:
fi_av_insert, fi_av_insertsvc, or fi_av_insertsym.

· FI_MORE : In order to allow optimized address insertion, the
  application may specify the FI_MORE flag to the insert call to give a
  hint to the provider that more insertion requests will follow,
  allowing the provider to aggregate insertion requests if desired. An
  application may make any number of insertion calls with FI_MORE set,
  provided that they are followed by an insertion call without FI_MORE.
  This signifies to the provider that the insertion list is complete.
  Providers are free to ignore FI_MORE.

· FI_SYNC_ERR : This flag applies to synchronous insertions only, and
  is used to retrieve error details of failed insertions. If set, the
  context parameter of insertion calls references an array of integers,
  with context set to the address of the first element of the array.
  The resulting status of attempting to insert each address will be
  written to the corresponding array location. Successful insertions
  will be updated to 0. Failures will contain a fabric errno code.

fi_av_insertsvc
The fi_av_insertsvc call behaves similarly to fi_av_insert, but allows
the application to specify the node and service names, similar to the
fi_getinfo inputs, rather than an encoded address. The node and
service parameters are defined the same as fi_getinfo(3). Node should
be a string that corresponds to a hostname or network address. The
service string corresponds to a textual representation of a transport
address. Applications may also pass in an FI_ADDR_STR formatted
address as the node parameter. In such cases, the service parameter
must be NULL. See fi_getinfo(3) for details on using FI_ADDR_STR.
Supported flags are the same as for fi_av_insert.

fi_av_insertsym
fi_av_insertsym performs a symmetric insert that inserts a sequential
range of nodes and/or service addresses into an AV. The svccnt
parameter indicates the number of transport (endpoint) addresses to
insert into the AV for each node address, with the service parameter
specifying the starting transport address. Inserted transport
addresses will be of the range {service, service + svccnt - 1},
inclusive. All service addresses for a node will be inserted before
the next node is inserted.

The nodecnt parameter indicates the number of node (network) addresses
to insert into the AV, with the node parameter specifying the starting
node address. Inserted node addresses will be of the range {node, node
+ nodecnt - 1}, inclusive. If node is a non-numeric string, such as a
hostname, it must contain a numeric suffix if nodecnt > 1.

As an example, if node = "10.1.1.1", nodecnt = 2, service = "5000", and
svccnt = 2, the following addresses will be inserted into the AV in the
order shown: 10.1.1.1:5000, 10.1.1.1:5001, 10.1.1.2:5000,
10.1.1.2:5001. If node were replaced by the hostname "host10", the
addresses would be: host10:5000, host10:5001, host11:5000, host11:5001.

The total number of inserted addresses will be nodecnt x svccnt.
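
The insertion order described above can be reproduced with a small
stand-alone sketch. symmetric_addr is a hypothetical helper written for
illustration; it does not call libfabric and makes no claim about how a
provider expands the range internally. It assumes a decimal service
string and increments the trailing decimal digits of node, without
handling leading zeros or octet overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a libfabric call): writes the address at
 * position idx of a symmetric insert as "node:service" into out.
 * Returns 0 on success, -1 on bad input or a too-small buffer.
 */
static int symmetric_addr(const char *node, size_t nodecnt,
                          const char *service, size_t svccnt,
                          size_t idx, char *out, size_t outlen)
{
    size_t len = strlen(node), split = len;
    unsigned long node_off, svc_off, svc_base;
    char *end;
    int n;

    if (svccnt == 0 || idx / svccnt >= nodecnt)
        return -1;
    /* All services of one node come before the next node. */
    node_off = (unsigned long)(idx / svccnt);
    svc_off = (unsigned long)(idx % svccnt);

    svc_base = strtoul(service, &end, 10);
    if (end == service || *end != '\0')
        return -1;

    /* Find where the numeric suffix of node starts. */
    while (split > 0 && isdigit((unsigned char)node[split - 1]))
        split--;

    if (split == len) {
        /* No numeric suffix: only a single node can be named. */
        if (nodecnt > 1)
            return -1;
        n = snprintf(out, outlen, "%s:%lu", node, svc_base + svc_off);
    } else {
        n = snprintf(out, outlen, "%.*s%lu:%lu", (int)split, node,
                     strtoul(node + split, NULL, 10) + node_off,
                     svc_base + svc_off);
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Calling the helper with idx running from 0 to nodecnt x svccnt - 1
yields the addresses in the order listed in the example.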

Supported flags are the same as for fi_av_insert.

fi_av_remove
fi_av_remove removes a set of addresses from an address vector. All
resources associated with the indicated addresses are released. The
removed address - either the mapped address (in the case of FI_AV_MAP)
or index (FI_AV_TABLE) - is invalid until it is returned again by a new
fi_av_insert.

The behavior of operations in progress that reference the removed
addresses is undefined.

The use of fi_av_remove is an optimization that applications may use to
free memory allocated for addresses that will no longer be accessed.
Inserted addresses are not required to be removed. fi_close will
automatically clean up any resources associated with addresses
remaining in the AV when it is invoked.

Flags are reserved for future use and must be 0.

fi_av_lookup
This call returns the address stored in the address vector that
corresponds to the given fi_addr. The returned address is the same
format as those stored by the AV. On input, the addrlen parameter
should indicate the size of the addr buffer. If the actual address is
larger than what can fit into the buffer, it will be truncated. On
output, addrlen is set to the size of the buffer needed to store the
address, which may be larger than the input value.
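
The in/out length contract can be illustrated with a stand-alone
sketch. lookup_model below is a stand-in that copies from a plain byte
buffer using the same contract as described for addrlen; it is not
libfabric code, and lookup_fits is a hypothetical caller-side check
showing how truncation can be detected by comparing the reported size
with the buffer size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for a lookup call (not libfabric code): copies at most
 * *addrlen bytes, then reports the full size that was needed.
 */
static void lookup_model(const void *stored, size_t stored_len,
                         void *addr, size_t *addrlen)
{
    size_t n = stored_len < *addrlen ? stored_len : *addrlen;

    if (n > 0)
        memcpy(addr, stored, n);  /* truncated if the buffer is small */
    *addrlen = stored_len;        /* size actually required */
}

/* Caller-side check: returns 1 if the whole address fit, 0 if truncated. */
static int lookup_fits(const void *stored, size_t stored_len,
                       void *buf, size_t buflen, size_t *needed)
{
    *needed = buflen;             /* input: capacity of buf */
    lookup_model(stored, stored_len, buf, needed);
    return *needed <= buflen;     /* output: required size */
}
```

When the check reports truncation, the caller can allocate a buffer of
the reported size and repeat the lookup.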

fi_rx_addr
This function is used to convert an endpoint address, returned by
fi_av_insert, into an address that specifies a target receive context.
The specified fi_addr parameter must either be a value returned from
fi_av_insert, in the case of FI_AV_MAP, or an index, in the case of
FI_AV_TABLE. The value for rx_ctx_bits must match that specified in
the AV attributes for the given address.

Connected endpoints that support multiple receive contexts, but are not
associated with address vectors should specify FI_ADDR_NOTAVAIL for the
fi_addr parameter.
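
One way to picture the conversion is as bit packing. The sketch below
is stand-alone and does not include any libfabric header: it assumes a
64-bit address type and assumes that the receive context index
occupies the most significant rx_ctx_bits bits. Both the layout and the
names sketch_addr_t and rx_addr_sketch are assumptions made for
illustration; the definition in rdma/fi_domain.h is authoritative.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sketch_addr_t;   /* stand-in for fi_addr_t (assumed 64-bit) */

/*
 * Illustrative model only: places rx_index in the top rx_ctx_bits
 * bits of the address and keeps the remaining bits unchanged.
 */
static sketch_addr_t rx_addr_sketch(sketch_addr_t fi_addr, int rx_index,
                                    int rx_ctx_bits)
{
    assert(rx_ctx_bits >= 0 && rx_ctx_bits < 64);
    if (rx_ctx_bits == 0)
        return fi_addr;           /* no bits reserved, nothing to encode */

    /* The index must fit in the reserved bits. */
    assert(rx_index >= 0 &&
           (uint64_t)rx_index < (UINT64_C(1) << rx_ctx_bits));
    return ((uint64_t)rx_index << (64 - rx_ctx_bits)) | fi_addr;
}
```

Under these assumptions, table index 5 combined with receive context 3
and 8 reserved bits gives 0x0300000000000005.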

fi_av_straddr
The fi_av_straddr function converts the provided address into a
printable string. The specified address must be of the same format as
those stored by the AV, though the address itself is not required to
have been inserted. On input, the len parameter should specify the
size of the buffer referenced by buf. On output, len is set to the
size of the buffer needed to store the address. This size may be
larger than the input len. If the provided buffer is too small, the
results will be truncated. fi_av_straddr returns a pointer to buf.

NOTES

Providers may implement AVs using a variety of mechanisms.
Specifically, a provider may begin resolving inserted addresses as soon
as they have been added to an AV, even if asynchronous operation has
been specified. Similarly, a provider may lazily release resources
from removed entries.

RETURN VALUES

Insertion calls for an AV opened for synchronous operation will return
the number of addresses that were successfully inserted. In the case
of failure, the return value will be less than the number of addresses
that was specified.

Insertion calls for an AV opened for asynchronous operation (with
FI_EVENT flag specified) will return 0 if the operation was
successfully initiated. In the case of failure, a negative fabric
errno will be returned. Providers are allowed to abort insertion
operations in the case of an error. Addresses that are not inserted
because they were aborted will fail with an error code of FI_ECANCELED.

In both the synchronous and asynchronous modes of operation, the
fi_addr buffer associated with a failed or aborted insertion will be
set to FI_ADDR_NOTAVAIL.
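
A caller can use this marker to find which entries failed after an
insertion returns fewer addresses than requested. The sketch below is
stand-alone: av_addr_t and AV_ADDR_NOTAVAIL are stand-ins for fi_addr_t
and FI_ADDR_NOTAVAIL, the all-ones value is an assumption made for
illustration, and collect_failed is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t av_addr_t;                    /* stand-in for fi_addr_t */
#define AV_ADDR_NOTAVAIL ((av_addr_t)-1)       /* stand-in; value assumed */

/*
 * Hypothetical helper: scans the fi_addr output array of an insert
 * call and records the positions marked as not available. Returns
 * the total number of failed entries, even if it exceeds max_failed.
 */
static size_t collect_failed(const av_addr_t *fi_addr, size_t count,
                             size_t *failed, size_t max_failed)
{
    size_t i, nfailed = 0;

    for (i = 0; i < count; i++) {
        if (fi_addr[i] != AV_ADDR_NOTAVAIL)
            continue;             /* this address was inserted */
        if (nfailed < max_failed)
            failed[nfailed] = i;  /* remember the failing position */
        nfailed++;
    }
    return nfailed;
}
```

Real code would compare against FI_ADDR_NOTAVAIL from the libfabric
headers rather than the stand-in constant.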

All other calls return 0 on success, or a negative value corresponding
to fabric errno on error. Fabric errno values are defined in
rdma/fi_errno.h.

SEE ALSO

fi_getinfo(3), fi_endpoint(3), fi_domain(3), fi_eq(3)

AUTHORS

OpenFabrics.

Libfabric Programmer's Manual 2017-06-21 fi_av(3)