fi_domain(3)                 Libfabric v1.6.1                 fi_domain(3)

NAME
fi_domain - Open a fabric access domain
7
SYNOPSIS
#include <rdma/fabric.h>

#include <rdma/fi_domain.h>

int fi_domain(struct fid_fabric *fabric, struct fi_info *info,
    struct fid_domain **domain, void *context);

int fi_close(struct fid *domain);

int fi_domain_bind(struct fid_domain *domain, struct fid *eq,
    uint64_t flags);

int fi_open_ops(struct fid *domain, const char *name, uint64_t flags,
    void **ops, void *context);
23
ARGUMENTS
fabric : The opened fabric on which the access domain is created.

info : Fabric information, including domain capabilities and
attributes.

domain : An opened access domain.

context : User specified context associated with the domain. This
context is returned as part of any asynchronous event associated with
the domain.

eq : Event queue for asynchronous operations initiated on the domain.

name : Name associated with an interface.

ops : Fabric interface operations.
41
DESCRIPTION
An access domain typically refers to a physical or virtual NIC or
hardware port; however, a domain may span across multiple hardware
components for fail-over or data striping purposes. A domain defines
the boundary for associating different resources together. Fabric
resources belonging to the same domain may share resources.
48
49 fi_domain
50 Opens a fabric access domain, also referred to as a resource domain.
51 Fabric domains are identified by a name. The properties of the opened
52 domain are specified using the info parameter.
53
54 fi_open_ops
55 fi_open_ops is used to open provider specific interfaces. Provider
56 interfaces may be used to access low-level resources and operations
57 that are specific to the opened resource domain. The details of domain
58 interfaces are outside the scope of this documentation.
59
60 fi_domain_bind
61 Associates an event queue with the domain. An event queue bound to a
62 domain will be the default EQ associated with asynchronous control
63 events that occur on the domain or active endpoints allocated on a
64 domain. This includes CM events. Endpoints may direct their control
65 events to alternate EQs by binding directly with the EQ.
66
67 Binding an event queue to a domain with the FI_REG_MR flag indicates
68 that the provider should perform all memory registration operations
69 asynchronously, with the completion reported through the event queue.
70 If an event queue is not bound to the domain with the FI_REG_MR flag,
71 then memory registration requests complete synchronously.
72
73 See fi_av_bind(3), fi_ep_bind(3), fi_mr_bind(3), fi_pep_bind(3), and
74 fi_scalable_ep_bind(3) for more information.
75
76 fi_close
77 The fi_close call is used to release all resources associated with a
78 domain or interface. All objects associated with the opened domain
79 must be released prior to calling fi_close, otherwise the call will
80 return -FI_EBUSY.
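
The release ordering described above can be illustrated with a small
self-contained model. This is only a sketch: struct demo_domain and
the demo_* functions are hypothetical stand-ins, not libfabric calls,
and DEMO_EBUSY is a placeholder for FI_EBUSY, whose real definition
comes from rdma/fi_errno.h.

```c
#include <stddef.h>

/* Hypothetical stand-in for FI_EBUSY (illustration only). */
#define DEMO_EBUSY 16

/* Hypothetical model of a domain that counts the objects opened on it. */
struct demo_domain {
    size_t open_children;  /* endpoints, CQs, MRs, ... still open */
    int    closed;         /* set once the domain has been released */
};

static void demo_child_open(struct demo_domain *dom)
{
    dom->open_children++;
}

static void demo_child_close(struct demo_domain *dom)
{
    dom->open_children--;
}

/* Mirrors the documented rule: closing fails while child objects remain. */
static int demo_domain_close(struct demo_domain *dom)
{
    if (dom->open_children > 0)
        return -DEMO_EBUSY;
    dom->closed = 1;
    return 0;
}
```

In a real application the same ordering applies to the fi_close calls:
endpoints, queues, and memory regions first, then the domain.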
81
DOMAIN ATTRIBUTES
The fi_domain_attr structure defines the set of attributes associated
with a domain.

struct fi_domain_attr {
    struct fid_domain     *domain;
    char                  *name;
    enum fi_threading     threading;
    enum fi_progress      control_progress;
    enum fi_progress      data_progress;
    enum fi_resource_mgmt resource_mgmt;
    enum fi_av_type       av_type;
    int                   mr_mode;
    size_t                mr_key_size;
    size_t                cq_data_size;
    size_t                cq_cnt;
    size_t                ep_cnt;
    size_t                tx_ctx_cnt;
    size_t                rx_ctx_cnt;
    size_t                max_ep_tx_ctx;
    size_t                max_ep_rx_ctx;
    size_t                max_ep_stx_ctx;
    size_t                max_ep_srx_ctx;
    size_t                cntr_cnt;
    size_t                mr_iov_limit;
    uint64_t              caps;
    uint64_t              mode;
    uint8_t               *auth_key;
    size_t                auth_key_size;
    size_t                max_err_data;
    size_t                mr_cnt;
};
114
115 domain
116 On input to fi_getinfo, a user may set this to an opened domain
117 instance to restrict output to the given domain. On output from
118 fi_getinfo, if no domain was specified, but the user has an opened
119 instance of the named domain, this will reference the first opened
120 instance. If no instance has been opened, this field will be NULL.
121
122 Name
123 The name of the access domain.
124
125 Multi-threading Support (threading)
The threading model specifies the level of serialization required of an
application when using the libfabric data transfer interfaces. Control
interfaces are always considered thread safe, and may be accessed by
multiple threads. Applications that can guarantee serialization in
their access of provider allocated resources and interfaces enable a
provider to eliminate lower-level locks.
132
133 FI_THREAD_UNSPEC : This value indicates that no threading model has
134 been defined. It may be used on input hints to the fi_getinfo call.
135 When specified, providers will return a threading model that allows for
136 the greatest level of parallelism.
137
138 FI_THREAD_SAFE : A thread safe serialization model allows a
139 multi-threaded application to access any allocated resources through
140 any interface without restriction. All providers are required to sup‐
141 port FI_THREAD_SAFE.
142
143 FI_THREAD_FID : A fabric descriptor (FID) serialization model requires
144 applications to serialize access to individual fabric resources associ‐
145 ated with data transfer operations and completions. Multiple threads
146 must be serialized when accessing the same endpoint, transmit context,
147 receive context, completion queue, counter, wait set, or poll set.
148 Serialization is required only by threads accessing the same object.
149
150 For example, one thread may be initiating a data transfer on an end‐
151 point, while another thread reads from a completion queue associated
152 with the endpoint.
153
154 Serialization to endpoint access is only required when accessing the
155 same endpoint data flow. Multiple threads may initiate transfers on
156 different transmit contexts of the same endpoint without serializing,
157 and no serialization is required between the submission of data trans‐
158 mit requests and data receive operations.
159
160 In general, FI_THREAD_FID allows the provider to be implemented without
161 needing internal locking when handling data transfers. Conceptually,
162 FI_THREAD_FID maps well to providers that implement fabric services in
163 hardware and provide separate command queues to different data flows.
164
165 FI_THREAD_ENDPOINT : The endpoint threading model is similar to
166 FI_THREAD_FID, but with the added restriction that serialization is
167 required when accessing the same endpoint, even if multiple transmit
168 and receive contexts are used. Conceptually, FI_THREAD_ENDPOINT maps
169 well to providers that implement fabric services in hardware but use a
170 single command queue to access different data flows.
171
FI_THREAD_COMPLETION : The completion threading model is intended for
providers that make use of manual progress. Applications must
serialize access to all objects that are associated through a shared
completion structure. This includes endpoint, transmit context,
receive context, completion queue, counter, wait set, and poll set
objects.
178
179 For example, threads must serialize access to an endpoint and its bound
180 completion queue(s) and/or counters. Access to endpoints that share
181 the same completion queue must also be serialized.
182
183 The use of FI_THREAD_COMPLETION can increase parallelism over
184 FI_THREAD_SAFE, but requires the use of isolated resources.
185
186 FI_THREAD_DOMAIN : A domain serialization model requires applications
187 to serialize access to all objects belonging to a domain.
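
The serialization rules of the threading models can be summarized as a
predicate over pairs of objects. The sketch below is an illustration
only: the enum, struct demo_obj, and demo_must_serialize are
hypothetical names that do not exist in libfabric, and the model
assumes each object is associated with a single completion structure,
which real endpoints need not be.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the enum fi_threading values described above. */
enum demo_threading {
    DEMO_THREAD_SAFE,
    DEMO_THREAD_FID,
    DEMO_THREAD_ENDPOINT,
    DEMO_THREAD_COMPLETION,
    DEMO_THREAD_DOMAIN
};

/* Hypothetical description of one fabric object for this sketch.
 * ep_id is -1 for objects that are not part of an endpoint (e.g. a CQ).
 * cq_id names the one completion structure the object is associated
 * with; this simplification is an assumption of the model. */
struct demo_obj {
    int obj_id;
    int ep_id;
    int cq_id;
    int domain_id;
};

/* Returns true if two threads touching a and b must serialize their calls. */
static bool demo_must_serialize(enum demo_threading model,
                                const struct demo_obj *a,
                                const struct demo_obj *b)
{
    bool same_obj = a->obj_id == b->obj_id;
    bool same_ep  = a->ep_id >= 0 && a->ep_id == b->ep_id;

    switch (model) {
    case DEMO_THREAD_SAFE:       return false;
    case DEMO_THREAD_FID:        return same_obj;
    case DEMO_THREAD_ENDPOINT:   return same_obj || same_ep;
    case DEMO_THREAD_COMPLETION: return same_obj || a->cq_id == b->cq_id;
    case DEMO_THREAD_DOMAIN:     return a->domain_id == b->domain_id;
    }
    return true; /* unknown model: be conservative */
}
```

For example, two transmit contexts of the same endpoint need no
serialization under the FID model but do under the endpoint model.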
188
189 Progress Models (control_progress / data_progress)
190 Progress is the ability of the underlying implementation to complete
191 processing of an asynchronous request. In many cases, the processing
192 of an asynchronous request requires the use of the host processor. For
193 example, a received message may need to be matched with the correct
194 buffer, or a timed out request may need to be retransmitted. For per‐
195 formance reasons, it may be undesirable for the provider to allocate a
196 thread for this purpose, which will compete with the application
197 threads.
198
199 Control progress indicates the method that the provider uses to make
200 progress on asynchronous control operations. Control operations are
201 functions which do not directly involve the transfer of application
202 data between endpoints. They include address vector, memory registra‐
203 tion, and connection management routines.
204
205 Data progress indicates the method that the provider uses to make
206 progress on data transfer operations. This includes message queue,
207 RMA, tagged messaging, and atomic operations, along with their comple‐
208 tion processing.
209
210 Progress frequently requires action being taken at both the transmit‐
211 ting and receiving sides of an operation. This is often a requirement
212 for reliable transfers, as a result of retry and acknowledgement pro‐
213 cessing.
214
215 To balance between performance and ease of use, two progress models are
216 defined.
217
218 FI_PROGRESS_UNSPEC : This value indicates that no progress model has
219 been defined. It may be used on input hints to the fi_getinfo call.
220
221 FI_PROGRESS_AUTO : This progress model indicates that the provider will
222 make forward progress on an asynchronous operation without further
223 intervention by the application. When FI_PROGRESS_AUTO is provided as
224 output to fi_getinfo in the absence of any progress hints, it often
225 indicates that the desired functionality is implemented by the provider
226 hardware or is a standard service of the operating system.
227
228 All providers are required to support FI_PROGRESS_AUTO. However, if a
229 provider does not natively support automatic progress, forcing the use
230 of FI_PROGRESS_AUTO may result in threads being allocated below the
231 fabric interfaces.
232
233 FI_PROGRESS_MANUAL : This progress model indicates that the provider
234 requires the use of an application thread to complete an asynchronous
235 request. When manual progress is set, the provider will attempt to
236 advance an asynchronous operation forward when the application attempts
237 to wait on or read an event queue, completion queue, or counter where
238 the completed operation will be reported. Progress also occurs when
239 the application processes a poll or wait set that has been associated
240 with the event or completion queue.
241
242 Only wait operations defined by the fabric interface will result in an
243 operation progressing. Operating system or external wait functions,
244 such as select, poll, or pthread routines, cannot.
245
246 Manual progress requirements not only apply to endpoints that initiate
247 transmit operations, but also to endpoints that may be the target of
248 such operations. This holds true even if the target endpoint will not
249 generate completion events for the operations. For example, an end‐
250 point that acts purely as the target of RMA or atomic operations that
251 uses manual progress may still need application assistance to process
252 received operations.
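
The effect of manual progress can be sketched with a toy model in
which work advances only from inside the read call. Everything here is
hypothetical: struct demo_cq, demo_cq_read, and demo_reap are stand-ins
for illustration and are not the libfabric completion queue interface.

```c
#include <stddef.h>

/* Hypothetical model of a provider using manual progress: work is only
 * advanced from inside the read call, never in the background. */
struct demo_cq {
    size_t pending;    /* operations posted but not yet processed */
    size_t completed;  /* completions ready to be returned */
};

/* Stand-in for a completion queue read. Each call advances one pending
 * operation, then returns 1 if a completion was reaped, or 0 if none. */
static size_t demo_cq_read(struct demo_cq *cq)
{
    if (cq->pending > 0) {  /* progress happens in the caller's thread */
        cq->pending--;
        cq->completed++;
    }
    if (cq->completed == 0)
        return 0;
    cq->completed--;
    return 1;
}

/* Polls until n completions have been reaped or max_polls is exhausted. */
static size_t demo_reap(struct demo_cq *cq, size_t n, size_t max_polls)
{
    size_t got = 0;

    while (got < n && max_polls-- > 0)
        got += demo_cq_read(cq);
    return got;
}
```

In this model, a thread that sleeps in an operating system wait call
instead of calling demo_cq_read leaves pending unchanged, which is the
behavior the text warns about.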
253
254 Resource Management (resource_mgmt)
255 Resource management (RM) is provider and protocol support to protect
256 against overrunning local and remote resources. This includes local
257 and remote transmit contexts, receive contexts, completion queues, and
258 source and target data buffers.
259
260 When enabled, applications are given some level of protection against
261 overrunning provider queues and local and remote data buffers. Such
262 support may be built directly into the hardware and/or network proto‐
263 col, but may also require that checks be enabled in the provider soft‐
264 ware. By disabling resource management, an application assumes all
265 responsibility for preventing queue and buffer overruns, but doing so
266 may allow a provider to eliminate internal synchronization calls, such
267 as atomic variables or locks.
268
269 It should be noted that even if resource management is disabled, the
270 provider implementation and protocol may still provide some level of
271 protection against overruns. However, such protection is not guaran‐
272 teed. The following values for resource management are defined.
273
274 FI_RM_UNSPEC : This value indicates that no resource management model
275 has been defined. It may be used on input hints to the fi_getinfo
276 call.
277
278 FI_RM_DISABLED : The provider is free to select an implementation and
279 protocol that does not protect against resource overruns. The applica‐
280 tion is responsible for resource protection.
281
282 FI_RM_ENABLED : Resource management is enabled for this provider
283 domain.
284
285 The behavior of the various resource management options depends on
286 whether the endpoint is reliable or unreliable, as well as provider and
287 protocol specific implementation details, as shown in the following ta‐
288 ble. The table assumes that all peers enable or disable RM the same.
289
Resource        DGRAM EP-no RM   DGRAM EP-with RM RDM/MSG EP-no RM  RDM/MSG EP-with RM
──────────────────────────────────────────────────────────────────────────────────────
Tx Ctx          undefined error  EAGAIN           undefined error   EAGAIN
Rx Ctx          undefined error  EAGAIN           undefined error   EAGAIN
Tx CQ           undefined error  EAGAIN           undefined error   EAGAIN
Rx CQ           undefined error  EAGAIN           undefined error   EAGAIN
Target EP       dropped          dropped          transmit error    retried
No Rx Buffer    dropped          dropped          transmit error    retried
Rx Buf Overrun  truncate or drop truncate or drop truncate or error truncate or error
Unmatched RMA   not applicable   not applicable   transmit error    transmit error
RMA Overrun     not applicable   not applicable   transmit error    transmit error
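
The table above can also be read as a lookup keyed by resource,
endpoint reliability, and whether RM is enabled. The following sketch
encodes it that way; the enum names and demo_rm_outcome are
hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical names for the rows of the table above. */
enum demo_resource {
    DEMO_TX_CTX, DEMO_RX_CTX, DEMO_TX_CQ, DEMO_RX_CQ,
    DEMO_TARGET_EP, DEMO_NO_RX_BUFFER, DEMO_RX_BUF_OVERRUN,
    DEMO_UNMATCHED_RMA, DEMO_RMA_OVERRUN
};

/* Hypothetical names for the table's cell values. */
enum demo_outcome {
    DEMO_UNDEFINED_ERROR, DEMO_EAGAIN, DEMO_DROPPED, DEMO_TRANSMIT_ERROR,
    DEMO_RETRIED, DEMO_TRUNCATE_OR_DROP, DEMO_TRUNCATE_OR_ERROR,
    DEMO_NOT_APPLICABLE
};

/* reliable is false for DGRAM endpoints and true for RDM/MSG endpoints. */
static enum demo_outcome demo_rm_outcome(enum demo_resource res,
                                         bool reliable, bool rm_enabled)
{
    switch (res) {
    case DEMO_TX_CTX: case DEMO_RX_CTX: case DEMO_TX_CQ: case DEMO_RX_CQ:
        return rm_enabled ? DEMO_EAGAIN : DEMO_UNDEFINED_ERROR;
    case DEMO_TARGET_EP: case DEMO_NO_RX_BUFFER:
        if (!reliable)
            return DEMO_DROPPED;
        return rm_enabled ? DEMO_RETRIED : DEMO_TRANSMIT_ERROR;
    case DEMO_RX_BUF_OVERRUN:
        return reliable ? DEMO_TRUNCATE_OR_ERROR : DEMO_TRUNCATE_OR_DROP;
    case DEMO_UNMATCHED_RMA: case DEMO_RMA_OVERRUN:
        return reliable ? DEMO_TRANSMIT_ERROR : DEMO_NOT_APPLICABLE;
    }
    return DEMO_UNDEFINED_ERROR; /* unknown resource */
}
```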
311
312 The resource column indicates the resource being accessed by a data
313 transfer operation.
314
315 Tx Ctx / Rx Ctx : Refers to the transmit/receive contexts when a data
316 transfer operation is submitted. When RM is enabled, attempting to
317 submit a request will fail if the context is full. If RM is disabled,
318 an undefined error (provider specific) will occur. Such errors should
319 be considered fatal to the context, and applications must take steps to
320 avoid queue overruns.
321
322 Tx CQ / Rx CQ : Refers to the completion queue associated with the Tx
323 or Rx context when a local operation completes. When RM is disabled,
324 applications must take care to ensure that completion queues do not get
325 overrun. When an overrun occurs, an undefined, but fatal, error will
326 occur affecting all endpoints associated with the CQ. Overruns can be
327 avoided by sizing the CQs appropriately or by deferring the posting of
328 a data transfer operation unless CQ space is available to store its
329 completion. When RM is enabled, providers may use different mechanisms
330 to prevent CQ overruns. This includes failing (returning -FI_EAGAIN)
331 the posting of operations that could result in CQ overruns, or inter‐
332 nally retrying requests (which will be hidden from the application).
333 See notes at the end of this section regarding CQ resource management
334 restrictions.
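
One way an application might defer posting until CQ space is available
is to keep a credit count per completion queue. The sketch below shows
the idea under stated assumptions: struct demo_cq_credits and the
demo_* functions are hypothetical application-side helpers, not
libfabric calls, and DEMO_EAGAIN is a placeholder for FI_EAGAIN.

```c
#include <stddef.h>

/* Hypothetical stand-in for FI_EAGAIN (illustration only). */
#define DEMO_EAGAIN 11

/* Application-side bookkeeping for one completion queue. */
struct demo_cq_credits {
    size_t cq_size;      /* entries the CQ was created with */
    size_t outstanding;  /* posted operations not yet reaped from the CQ */
};

/* Reserve a CQ slot before posting; refuse if the CQ could overflow. */
static int demo_reserve(struct demo_cq_credits *c)
{
    if (c->outstanding >= c->cq_size)
        return -DEMO_EAGAIN;
    c->outstanding++;
    return 0;
}

/* Release slots after reading n completions from the CQ. */
static void demo_release(struct demo_cq_credits *c, size_t n)
{
    c->outstanding -= (n > c->outstanding) ? c->outstanding : n;
}
```

The caller would invoke demo_reserve before each data transfer call
and demo_release after each successful completion queue read.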
335
336 Target EP / No Rx Buffer : Target EP refers to resources associated
337 with the endpoint that is the target of a transmit operation. This
338 includes the target endpoint's receive queue, posted receive buffers
339 (no Rx buffers), the receive side completion queue, and other related
340 packet processing queues. The defined behavior is that seen by the
341 initiator of a request. For FI_EP_DGRAM endpoints, if the target EP
342 queues are unable to accept incoming messages, received messages will
343 be dropped. For reliable endpoints, if RM is disabled, the transmit
344 operation will complete in error. If RM is enabled, the provider will
345 internally retry the operation.
346
Rx Buffer Overrun : This refers to buffers posted to receive incoming
tagged or untagged messages, with the behavior defined from the
viewpoint of the sender. The behavior for handling received messages
that are larger than the buffers provided by the application is
provider specific. Providers may either truncate the message and
report a successful completion, or fail the operation. For datagram
endpoints, failed sends will result in the message being dropped. For
reliable endpoints, send operations may complete successfully, yet be
truncated at the receive side. This can occur when the target side
buffers received data until an application buffer is made available.
The completion status may also be dependent upon the completion model
selected by the application (e.g. FI_DELIVERY_COMPLETE versus
FI_TRANSMIT_COMPLETE).
360
361 Unmatched RMA / RMA Overrun : Unmatched RMA and RMA overruns deal with
362 the processing of RMA and atomic operations. Unlike send operations,
363 RMA operations that attempt to access a memory address that is either
364 not registered for such operations, or attempt to access outside of the
365 target memory region will fail, resulting in a transmit error.
366
367 When a resource management error occurs on an endpoint, the endpoint is
368 transitioned into a disabled state. Any operations which have not
369 already completed will fail and be discarded. For unconnected end‐
370 points, the endpoint must be re-enabled before it will accept new data
371 transfer operations. For connected endpoints, the connection is torn
372 down and must be re-established.
373
There is one notable restriction on the protections offered by resource
management. This occurs when resource management is enabled on an
endpoint that has been bound to completion queue(s) using the
FI_SELECTIVE_COMPLETION flag. Operations posted to such an endpoint may
specify that a successful completion should not generate an entry on
the corresponding completion queue (i.e. the operation leaves the
FI_COMPLETION flag unset). In such situations, the provider is not
required to reserve an entry in the completion queue to handle the case
where the operation fails and does generate a CQ entry, which would
effectively require tracking the operation to completion. Applications
concerned with avoiding CQ overruns in the occurrence of errors must
ensure that there is sufficient space in the CQ to report failed
operations. This can typically be achieved by sizing the CQ to at least
the same size as the endpoint queue(s) that are bound to it.
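
As a worked example of the sizing rule, the helper below adds up the
sizes of every queue bound to a CQ. Summing the queue sizes is an
assumption of this sketch (the text only requires "at least the same
size"), and demo_min_cq_size is a hypothetical helper, not part of
libfabric.

```c
#include <stddef.h>

/* Smallest CQ size that can hold one error entry for every operation
 * queued on the endpoint queues bound to the CQ. */
static size_t demo_min_cq_size(const size_t *queue_sizes, size_t count)
{
    size_t total = 0;

    for (size_t i = 0; i < count; i++)
        total += queue_sizes[i];
    return total;
}
```

With a transmit queue of 128 entries and a receive queue of 256 entries
bound to the same CQ, this yields a CQ of at least 384 entries.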
388
389 AV Type (av_type)
390 Specifies the type of address vectors that are usable with this domain.
391 For additional details on AV type, see fi_av(3). The following values
392 may be specified.
393
394 FI_AV_UNSPEC : Any address vector format is requested and supported.
395
396 FI_AV_MAP : Only address vectors of type AV map are requested or sup‐
397 ported.
398
399 FI_AV_TABLE : Only address vectors of type AV index are requested or
400 supported.
401
402 Address vectors are only used by connectionless endpoints. Applica‐
403 tions that require the use of a specific type of address vector should
404 set the domain attribute av_type to the necessary value when calling
405 fi_getinfo. The value FI_AV_UNSPEC may be used to indicate that the
406 provider can support either address vector format. In this case, a
407 provider may return FI_AV_UNSPEC to indicate that either format is sup‐
408 portable, or may return another AV type to indicate the optimal AV type
409 supported by this domain.
410
411 Memory Registration Mode (mr_mode)
412 Defines memory registration specific mode bits used with this domain.
413 Full details on MR mode options are available in fi_mr(3). The follow‐
414 ing values may be specified.
415
416 FI_MR_LOCAL : The provider is optimized around having applications reg‐
417 ister memory for locally accessed data buffers. Data buffers used in
418 send and receive operations and as the source buffer for RMA and atomic
419 operations must be registered by the application for access domains
420 opened with this capability.
421
FI_MR_RAW : The provider requires additional setup as part of its
memory registration process. This mode is required by providers that
use a memory key that is larger than 64 bits.
425
426 FI_MR_VIRT_ADDR : Registered memory regions are referenced by peers
427 using the virtual address of the registered memory region, rather than
428 a 0-based offset.
429
430 FI_MR_ALLOCATED : Indicates that memory registration occurs on allo‐
431 cated data buffers, and physical pages must back all virtual addresses
432 being registered.
433
434 FI_MR_PROV_KEY : Memory registration keys are selected and returned by
435 the provider.
436
437 FI_MR_MMU_NOTIFY : Indicates that the application is responsible for
438 notifying the provider when the page tables referencing a registered
439 memory region may have been updated.
440
441 FI_MR_RMA_EVENT : Indicates that the memory regions associated with
442 completion counters must be explicitly enabled after being bound to any
443 counter.
444
445 FI_MR_ENDPOINT : Memory registration occurs at the endpoint level,
446 rather than domain.
447
448 FI_MR_UNSPEC : Defined for compatibility -- library versions 1.4 and
449 earlier. Setting mr_mode to 0 indicates that FI_MR_BASIC or
450 FI_MR_SCALABLE are requested and supported.
451
452 FI_MR_BASIC : Defined for compatibility -- library versions 1.4 and
453 earlier. Only basic memory registration operations are requested or
454 supported. This mode is equivalent to the FI_MR_VIRT_ADDR, FI_MR_ALLO‐
455 CATED, and FI_MR_PROV_KEY flags being set in later library versions.
456 This flag may not be used in conjunction with other mr_mode bits.
457
458 FI_MR_SCALABLE : Defined for compatibility -- library versions 1.4 and
459 earlier. Only scalable memory registration operations are requested or
460 supported. Scalable registration uses offset based addressing, with
461 application selectable memory keys. For library versions 1.5 and
462 later, this is the default if no mr_mode bits are set. This flag may
463 not be used in conjunction with other mr_mode bits.
464
465 Buffers used in data transfer operations may require notifying the
466 provider of their use before a data transfer can occur. The mr_mode
467 field indicates the type of memory registration that is required, and
468 when registration is necessary. Applications that require the use of a
469 specific registration mode should set the domain attribute mr_mode to
470 the necessary value when calling fi_getinfo. The value FI_MR_UNSPEC
471 may be used to indicate support for any registration mode.
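
The rules for the compatibility modes can be expressed as two small
checks. This is a sketch only: the DEMO_MR_* macros are stand-in bit
values chosen for illustration, and real programs must use the FI_MR_*
definitions from the libfabric headers. Both function names are
hypothetical. Treating FI_MR_SCALABLE as "no bits set" follows the
statement above that scalable registration is the default when no
mr_mode bits are set.

```c
#include <stdbool.h>

/* Stand-in bit values for this sketch only. */
#define DEMO_MR_BASIC      (1 << 0)
#define DEMO_MR_SCALABLE   (1 << 1)
#define DEMO_MR_VIRT_ADDR  (1 << 4)
#define DEMO_MR_ALLOCATED  (1 << 5)
#define DEMO_MR_PROV_KEY   (1 << 6)

/* The compatibility modes may not be combined with any other bit. */
static bool demo_mr_mode_valid(int mode)
{
    if (mode & DEMO_MR_BASIC)
        return mode == DEMO_MR_BASIC;
    if (mode & DEMO_MR_SCALABLE)
        return mode == DEMO_MR_SCALABLE;
    return true;
}

/* Rewrite a compatibility mode as the equivalent set of individual bits. */
static int demo_mr_mode_expand(int mode)
{
    if (mode == DEMO_MR_BASIC)
        return DEMO_MR_VIRT_ADDR | DEMO_MR_ALLOCATED | DEMO_MR_PROV_KEY;
    if (mode == DEMO_MR_SCALABLE)
        return 0;  /* offset addressing, application-selected keys */
    return mode;
}
```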
472
473 MR Key Size (mr_key_size)
Size of the memory region remote access key, in bytes. Applications
that request their own MR key must select a value within the range
specified by this value. Key sizes larger than 8 bytes require using
the FI_MR_RAW mr_mode bit.
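
A minimal sketch of the range check follows, assuming the key is
carried in a 64-bit integer and that "within the range" means the key
is representable in mr_key_size bytes. demo_key_fits is a hypothetical
helper, not a libfabric function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if an application-chosen 64-bit key fits in key_size bytes.
 * A key_size of 0 is treated as "no application keys accepted". */
static bool demo_key_fits(uint64_t key, size_t key_size)
{
    if (key_size == 0)
        return false;
    if (key_size >= sizeof(uint64_t))
        return true;
    return key < ((uint64_t)1 << (8 * key_size));
}
```

For a reported key size of 4, keys up to 0xFFFFFFFF are accepted and
0x100000000 is rejected.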
478
479 CQ Data Size (cq_data_size)
Applications may include a small message with a data transfer that is
placed directly into a remote completion queue as part of a completion
event. This is referred to as remote CQ data (sometimes called
immediate data). This field indicates the number of bytes that the
provider supports for remote CQ data. If supported (a non-zero value is
returned), the size of remote CQ data must be at least 4 bytes.
486
487 Completion Queue Count (cq_cnt)
488 The optimal number of completion queues supported by the domain, rela‐
489 tive to any specified or default CQ attributes. The cq_cnt value may
490 be a fixed value of the maximum number of CQs supported by the underly‐
491 ing hardware, or may be a dynamic value, based on the default
492 attributes of an allocated CQ, such as the CQ size and data format.
493
494 Endpoint Count (ep_cnt)
495 The total number of endpoints supported by the domain, relative to any
496 specified or default endpoint attributes. The ep_cnt value may be a
497 fixed value of the maximum number of endpoints supported by the under‐
498 lying hardware, or may be a dynamic value, based on the default
499 attributes of an allocated endpoint, such as the endpoint capabilities
500 and size. The endpoint count is the number of addressable endpoints
501 supported by the provider.
502
503 Transmit Context Count (tx_ctx_cnt)
504 The number of outbound command queues optimally supported by the
505 provider. For a low-level provider, this represents the number of com‐
506 mand queues to the hardware and/or the number of parallel transmit
507 engines effectively supported by the hardware and caches. Applications
508 which allocate more transmit contexts than this value will end up shar‐
509 ing underlying resources. By default, there is a single transmit con‐
510 text associated with each endpoint, but in an advanced usage model, an
511 endpoint may be configured with multiple transmit contexts.
512
513 Receive Context Count (rx_ctx_cnt)
The number of inbound processing queues optimally supported by the
provider. For a low-level provider, this represents the number of
hardware queues that can be effectively utilized for processing
incoming packets. Applications which allocate more receive contexts
than this value will end up sharing underlying resources. By default, a
single receive context is associated with each endpoint, but in an
advanced usage model, an endpoint may be configured with multiple
receive contexts.
522
523 Maximum Endpoint Transmit Context (max_ep_tx_ctx)
524 The maximum number of transmit contexts that may be associated with an
525 endpoint.
526
527 Maximum Endpoint Receive Context (max_ep_rx_ctx)
528 The maximum number of receive contexts that may be associated with an
529 endpoint.
530
531 Maximum Sharing of Transmit Context (max_ep_stx_ctx)
532 The maximum number of endpoints that may be associated with a shared
533 transmit context.
534
535 Maximum Sharing of Receive Context (max_ep_srx_ctx)
536 The maximum number of endpoints that may be associated with a shared
537 receive context.
538
539 Counter Count (cntr_cnt)
The optimal number of completion counters supported by the domain. The
cntr_cnt value may be a fixed value of the maximum number of counters
supported by the underlying hardware, or may be a dynamic value, based
on the default attributes of the domain.
544
545 MR IOV Limit (mr_iov_limit)
546 This is the maximum number of IO vectors (scatter-gather elements) that
547 a single memory registration operation may reference.
548
549 Capabilities (caps)
550 Domain level capabilities. Domain capabilities indicate domain level
551 features that are supported by the provider.
552
553 FI_LOCAL_COMM : At a conceptual level, this field indicates that the
554 underlying device supports loopback communication. More specifically,
555 this field indicates that an endpoint may communicate with other end‐
556 points that are allocated from the same underlying named domain. If
557 this field is not set, an application may need to use an alternate
558 domain or mechanism (e.g. shared memory) to communicate with peers
559 that execute on the same node.
560
561 FI_REMOTE_COMM : This field indicates that the underlying provider sup‐
562 ports communication with nodes that are reachable over the network. If
563 this field is not set, then the provider only supports communication
564 between processes that execute on the same node -- a shared memory
565 provider, for example.
566
567 FI_SHARED_AV : Indicates that the domain supports the ability to share
568 address vectors among multiple processes using the named address vector
569 feature.
570
571 See fi_getinfo(3) for a discussion on primary versus secondary capabil‐
572 ities. All domain capabilities are considered secondary capabilities.
573
574 mode
The operational mode bits related to using the domain.
576
577 FI_RESTRICTED_COMP : This bit indicates that the domain limits comple‐
578 tion queues and counters to only be used with endpoints, transmit con‐
579 texts, and receive contexts that have the same set of capability flags.
580
581 Default authorization key (auth_key)
582 The default authorization key to associate with endpoint and memory
583 registrations created within the domain. This field is ignored unless
584 the fabric is opened with API version 1.5 or greater.
585
586 Default authorization key length (auth_key_size)
587 The length in bytes of the default authorization key for the domain.
588 If set to 0, then no authorization key will be associated with end‐
589 points and memory registrations created within the domain unless speci‐
590 fied in the endpoint or memory registration attributes. This field is
591 ignored unless the fabric is opened with API version 1.5 or greater.
592
593 Max Error Data Size (max_err_data)
The maximum amount of error data, in bytes, that may be returned as
part of a completion or event queue error. This value corresponds to
the err_data_size field in struct fi_cq_err_entry and struct
fi_eq_err_entry.
598
599 Memory Regions Count (mr_cnt)
600 The optimal number of memory regions supported by the domain, or end‐
601 point if the mr_mode FI_MR_ENDPOINT bit has been set. The mr_cnt value
602 may be a fixed value of the maximum number of MRs supported by the
603 underlying hardware, or may be a dynamic value, based on the default
604 attributes of the domain, such as the supported memory registration
605 modes. Applications can set the mr_cnt on input to fi_getinfo, in
606 order to indicate their memory registration requirements. Doing so may
607 allow the provider to optimize any memory registration cache or lookup
608 tables.
609
RETURN VALUE
Returns 0 on success. On error, a negative value corresponding to
fabric errno is returned. Fabric errno values are defined in
rdma/fi_errno.h.
614
NOTES
Users should call fi_close to release all resources allocated to the
fabric domain.
618
619 The following fabric resources are associated with domains: active end‐
620 points, memory regions, completion event queues, and address vectors.
621
Domain attributes reflect the limitations and capabilities of the
underlying hardware and/or software provider. They do not reflect
system limitations, such as the number of physical pages that an
application may pin or number of file descriptors that the application
may open. As a result, the reported maximums may not be achievable,
even on a lightly loaded system, without an administrator configuring
system resources appropriately for the installed provider(s).
629
SEE ALSO
fi_getinfo(3), fi_endpoint(3), fi_av(3), fi_ep(3), fi_eq(3), fi_mr(3)

AUTHORS
OpenFabrics.

Libfabric Programmer's Manual           2018-02-13           fi_domain(3)