fi_trigger(3)                 Libfabric v1.7.0                 fi_trigger(3)
 
NAME
fi_trigger - Triggered operations

SYNOPSIS
    #include <rdma/fi_trigger.h>
 
DESCRIPTION
Triggered operations allow an application to queue a data transfer
request that is deferred until a specified condition is met. A typical
use is to send a message only after receiving all input data.
 
A triggered operation may be requested by specifying the FI_TRIGGER
flag as part of the operation. Alternatively, an endpoint alias may be
created and configured with the FI_TRIGGER flag. Such an endpoint is
referred to as a trigger-able endpoint. All data transfer operations
on a trigger-able endpoint are deferred.
 
Any data transfer operation is potentially trigger-able, subject to
provider constraints. Trigger-able endpoints are initialized so that
only the trigger-able interfaces supported by the provider are
available.
 
Triggered operations require that applications use struct
fi_triggered_context as their per-operation context parameter or, if
the provider requires the FI_CONTEXT2 mode, struct
fi_triggered_context2. The use of struct fi_triggered_context[2]
replaces struct fi_context[2], if required by the provider. Although
struct fi_triggered_context[2] is not opaque to the application, the
contents of the structure may be modified by the provider once it has
been submitted as part of an operation. This structure has
requirements similar to those of struct fi_context[2]: it must be
allocated by the application and remain valid until the corresponding
operation completes or is successfully canceled.
 
Struct fi_triggered_context[2] is used to specify the condition that
must be met before the triggered data transfer is initiated. If the
condition is met when the request is made, then the data transfer may
be initiated immediately. The format of struct fi_triggered_context[2]
is described below.
 
    struct fi_triggered_context {
        enum fi_trigger_event event_type;   /* trigger type */
        union {
            struct fi_trigger_threshold threshold;
            void *internal[3];              /* reserved */
        } trigger;
    };

    struct fi_triggered_context2 {
        enum fi_trigger_event event_type;   /* trigger type */
        union {
            struct fi_trigger_threshold threshold;
            void *internal[7];              /* reserved */
        } trigger;
    };
 
The triggered context indicates the type of event assigned to the
trigger, along with a union of trigger details that is based on the
event type.
 
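As a minimal sketch of how such a context might be filled in, the
helper below arms a context for a counter threshold. The declarations
are stand-ins copied from this page so that the fragment compiles on
its own; a real program would include <rdma/fi_trigger.h> instead. The
helper name set_threshold_trigger is hypothetical and is not part of
libfabric.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declarations mirroring the definitions shown in this page.
 * A real program would include <rdma/fi_trigger.h> instead. */
struct fid_cntr;                    /* opaque here */

enum fi_trigger_event {
    FI_TRIGGER_THRESHOLD
};

struct fi_trigger_threshold {
    struct fid_cntr *cntr;          /* event counter to check */
    size_t threshold;               /* threshold value */
};

struct fi_triggered_context {
    enum fi_trigger_event event_type;
    union {
        struct fi_trigger_threshold threshold;
        void *internal[3];          /* reserved */
    } trigger;
};

/* Hypothetical helper: set up ctx so that the operation it is attached
 * to is deferred until cntr reaches threshold. */
static void set_threshold_trigger(struct fi_triggered_context *ctx,
                                  struct fid_cntr *cntr, size_t threshold)
{
    assert(ctx != NULL && cntr != NULL);
    ctx->event_type = FI_TRIGGER_THRESHOLD;
    ctx->trigger.threshold.cntr = cntr;
    ctx->trigger.threshold.threshold = threshold;
}
```

The armed context would then be supplied as the context of a data
transfer call that carries the FI_TRIGGER flag, and it has to stay
allocated until that operation completes or is canceled.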
TRIGGER EVENTS
The following trigger events are defined.

FI_TRIGGER_THRESHOLD
    This indicates that the data transfer operation will be deferred
    until an event counter crosses an application-specified threshold
    value. The threshold is specified using struct
    fi_trigger_threshold:

        struct fi_trigger_threshold {
            struct fid_cntr *cntr;  /* event counter to check */
            size_t threshold;       /* threshold value */
        };
 
Threshold operations are triggered in the order of the threshold
values. This is true even if the counter increments by a value greater
than 1. If two triggered operations have the same threshold, they will
be triggered in the order in which they were submitted to the endpoint.
 
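The ordering rule can be illustrated with a small stand-alone model.
This is only a sketch of the rule as stated above, not of how any
provider implements it; struct pending_op, cmp_pending, and fire_ready
are hypothetical names introduced for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a pending triggered operation. */
struct pending_op {
    size_t threshold;   /* counter value that releases the operation */
    size_t seq;         /* submission order, assigned by the caller */
    int    id;          /* label used to observe the firing order */
};

/* Order by threshold first, then by submission order. */
static int cmp_pending(const void *a, const void *b)
{
    const struct pending_op *x = a, *y = b;

    if (x->threshold != y->threshold)
        return x->threshold < y->threshold ? -1 : 1;
    if (x->seq != y->seq)
        return x->seq < y->seq ? -1 : 1;
    return 0;
}

/* Stores the ids of all operations released when the counter reads
 * cntr_value into fired (which must hold n entries), in firing order,
 * and returns how many were released. */
static size_t fire_ready(struct pending_op *ops, size_t n,
                         size_t cntr_value, int *fired)
{
    size_t count = 0;

    qsort(ops, n, sizeof(*ops), cmp_pending);
    for (size_t i = 0; i < n && ops[i].threshold <= cntr_value; i++)
        fired[count++] = ops[i].id;
    return count;
}
```

In this model, a counter that jumps straight from 0 to 3 still releases
the threshold-1 operation before the threshold-2 operation, and two
operations that share threshold 3 are released in submission order.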
EXPERIMENTAL DEFERRED WORK QUEUE
The following feature and description are enhancements to triggered
operation support, but should be considered experimental. Until the
experimental tag is removed, the interfaces, semantics, and data
structures defined below may change between library versions.
 
The deferred work queue interface is designed as a set of primitive
constructs that can be used to implement application-level collective
operations. These constructs are a more advanced form of triggered
operation. They allow an application to queue operations to a deferred
work queue that is associated with the domain. Note that the deferred
work queue is a conceptual construct, rather than an implementation
requirement. Deferred work requests consist of three main components:
an event or condition that must first be met, an operation to perform,
and a completion notification.
 
Because deferred work requests are posted directly to the domain, they
can support a broader set of conditions and operations. Deferred work
requests are submitted using struct fi_deferred_work. That structure,
along with the corresponding operation structures (referenced through
the op union) used to describe the work, must remain valid until the
operation completes or is canceled. The format of the deferred work
request is as follows:
 
    struct fi_deferred_work {
        struct fi_context2 context;

        uint64_t threshold;
        struct fid_cntr *triggering_cntr;
        struct fid_cntr *completion_cntr;

        enum fi_trigger_op op_type;

        union {
            struct fi_op_msg *msg;
            struct fi_op_tagged *tagged;
            struct fi_op_rma *rma;
            struct fi_op_atomic *atomic;
            struct fi_op_fetch_atomic *fetch_atomic;
            struct fi_op_compare_atomic *compare_atomic;
            struct fi_op_cntr *cntr;
        } op;
    };
 
Once a work request has been posted to the deferred work queue, it will
remain on the queue until the triggering counter (success plus error
counter values) has reached the indicated threshold. If the triggering
condition has already been met at the time the work request is queued,
the operation will be initiated immediately.
 
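The trigger test for deferred work can be written as a one-line
predicate. The sketch below only restates the rule in code; the
function name work_is_ready is hypothetical, and the two counts are
passed in as plain integers rather than read from a real counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper modeling the trigger test for deferred work:
 * the success and error counts of the triggering counter are summed
 * before being compared against the threshold. */
static bool work_is_ready(uint64_t success, uint64_t errors,
                          uint64_t threshold)
{
    return success + errors >= threshold;
}
```

One consequence is that failed operations also move a request toward
its threshold: with a threshold of 4, three successes and one error are
enough to release the work.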
On the completion of a deferred data transfer, the specified completion
counter will be incremented by one. Note that deferred counter
operations do not update the completion counter; only the counter
specified through the fi_op_cntr is modified. The completion_cntr field
must be NULL for counter operations.
 
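An application could check the completion_cntr rule before submitting a
request. The sketch below is one way to do that. The enum values are
written from recollection of <rdma/fi_trigger.h> and should be treated
as assumptions, since this page does not list them; is_cntr_op and
completion_cntr_ok are hypothetical helpers.

```c
#include <stdbool.h>
#include <stddef.h>

struct fid_cntr;    /* opaque stand-in for the libfabric type */

/* Operation types. The names are assumed to match those declared in
 * <rdma/fi_trigger.h>; verify against the installed header. */
enum fi_trigger_op {
    FI_OP_RECV, FI_OP_SEND, FI_OP_TRECV, FI_OP_TSEND,
    FI_OP_READ, FI_OP_WRITE, FI_OP_ATOMIC, FI_OP_FETCH_ATOMIC,
    FI_OP_COMPARE_ATOMIC, FI_OP_CNTR_SET, FI_OP_CNTR_ADD
};

/* Hypothetical helper: true for deferred counter operations. */
static bool is_cntr_op(enum fi_trigger_op op)
{
    return op == FI_OP_CNTR_SET || op == FI_OP_CNTR_ADD;
}

/* Hypothetical pre-submission check: counter operations must leave
 * completion_cntr NULL. Data transfers are not constrained here. */
static bool completion_cntr_ok(enum fi_trigger_op op,
                               const struct fid_cntr *completion_cntr)
{
    if (is_cntr_op(op))
        return completion_cntr == NULL;
    return true;
}
```

The check deliberately says nothing about whether a data transfer may
leave completion_cntr unset, because this page does not state that.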
Because deferred work targets support of collective communication
operations, posted work requests do not generate any completions at the
endpoint by default. For example, completed operations are not written
to the EP's completion queue and do not update the EP counter (unless
the EP counter is explicitly referenced as the completion_cntr). An
application may request EP completions by specifying the FI_COMPLETION
flag as part of the operation.
 
It is the responsibility of the application to detect and handle
situations that could prevent a deferred work request's condition from
being met. For example, if a work request depends on the successful
completion of a data transfer operation and that operation fails, then
the application must cancel the work request.
 
To submit a deferred work request, applications should use the domain's
fi_control function with command FI_QUEUE_WORK and struct
fi_deferred_work as the fi_control arg parameter. To cancel a deferred
work request, use fi_control with command FI_CANCEL_WORK and the
corresponding struct fi_deferred_work to cancel. The fi_control command
FI_FLUSH_WORK will cancel all queued work requests. FI_FLUSH_WORK may
be used to flush all work queued to the domain, or may be used to
cancel all requests waiting on a specific triggering_cntr.
 
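The three commands can be pictured with a toy model of the conceptual
queue. Everything named model_* below is hypothetical and exists only
for this sketch; none of it is libfabric API. The model also assumes
that a NULL counter means "flush everything", which is a convention of
the example and not something this page specifies.

```c
#include <stddef.h>

#define MODEL_MAX_WORK 16   /* arbitrary capacity for this sketch */

/* Hypothetical stand-in for a queued deferred work request. */
struct model_work {
    const void *triggering_cntr;    /* counter the request waits on */
    int queued;                     /* nonzero while on the queue */
};

struct model_queue {
    struct model_work *slots[MODEL_MAX_WORK];
    size_t count;
};

/* Models FI_QUEUE_WORK: returns 0, or -1 if the queue is full. */
static int model_queue_work(struct model_queue *q, struct model_work *w)
{
    if (q->count == MODEL_MAX_WORK)
        return -1;
    w->queued = 1;
    q->slots[q->count++] = w;
    return 0;
}

/* Models FI_CANCEL_WORK: removes one request, keeping queue order.
 * Returns 0 if the request was found, -1 otherwise. */
static int model_cancel_work(struct model_queue *q, struct model_work *w)
{
    for (size_t i = 0; i < q->count; i++) {
        if (q->slots[i] != w)
            continue;
        for (size_t j = i + 1; j < q->count; j++)
            q->slots[j - 1] = q->slots[j];
        q->count--;
        w->queued = 0;
        return 0;
    }
    return -1;
}

/* Models FI_FLUSH_WORK: cancels every request when cntr is NULL,
 * otherwise only those waiting on cntr. Returns the number cancelled. */
static size_t model_flush_work(struct model_queue *q, const void *cntr)
{
    size_t kept = 0, cancelled = 0;

    for (size_t i = 0; i < q->count; i++) {
        struct model_work *w = q->slots[i];
        if (cntr == NULL || w->triggering_cntr == cntr) {
            w->queued = 0;
            cancelled++;
        } else {
            q->slots[kept++] = w;
        }
    }
    q->count = kept;
    return cancelled;
}
```

With libfabric itself, each of these steps would instead be a
fi_control call on the domain, passing FI_QUEUE_WORK, FI_CANCEL_WORK,
or FI_FLUSH_WORK as the command.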
Deferred work requests are not acted upon by the provider until the
associated event has occurred, although certain validation checks may
still occur when a request is submitted. Referenced data buffers are
not read or otherwise accessed, but the provider may validate fabric
objects, such as endpoints and counters, and check that input
parameters fall within supported ranges. If a specific request is not
supported by the provider, it will fail the operation with -FI_ENOSYS.
 
SEE ALSO
fi_getinfo(3), fi_endpoint(3), fi_alias(3), fi_cntr(3)
 
AUTHORS
OpenFabrics.

Libfabric Programmer's Manual        2018-10-05        fi_trigger(3)