fi_trigger(3)                  Libfabric v1.14.0                 fi_trigger(3)

NAME

       fi_trigger - Triggered operations

SYNOPSIS

              #include <rdma/fi_trigger.h>

DESCRIPTION

       Triggered operations allow an application to queue a data transfer
       request that is deferred until a specified condition is met.  A typical
       use is to send a message only after receiving all input data.

       A triggered operation may be requested by specifying the FI_TRIGGER
       flag as part of the operation.  Alternatively, an endpoint alias may be
       created and configured with the FI_TRIGGER flag.  Such an endpoint is
       referred to as a trigger-able endpoint.  All data transfer operations
       on a trigger-able endpoint are deferred.

       Any data transfer operation is potentially trigger-able, subject to
       provider constraints.  Trigger-able endpoints are initialized so that
       only the trigger-able interfaces supported by the provider are
       available.

       Triggered operations require that applications use struct
       fi_triggered_context as their per-operation context parameter or, if
       the provider requires the FI_CONTEXT2 mode, struct
       fi_triggered_context2.  The use of struct fi_triggered_context[2]
       replaces struct fi_context[2], if required by the provider.  Although
       struct fi_triggered_context[2] is not opaque to the application, the
       contents of the structure may be modified by the provider once it has
       been submitted as an operation.  This structure has requirements
       similar to those of struct fi_context[2]: it must be allocated by the
       application and remain valid until the corresponding operation
       completes or is successfully canceled.

       Struct fi_triggered_context[2] is used to specify the condition that
       must be met before the triggered data transfer is initiated.  If the
       condition is already met when the request is made, then the data
       transfer may be initiated immediately.  The format of struct
       fi_triggered_context[2] is described below.

              struct fi_triggered_context {
                  enum fi_trigger_event         event_type;   /* trigger type */
                  union {
                      struct fi_trigger_threshold threshold;
                      void                        *internal[3]; /* reserved */
                  } trigger;
              };

              struct fi_triggered_context2 {
                  enum fi_trigger_event         event_type;   /* trigger type */
                  union {
                      struct fi_trigger_threshold threshold;
                      void                        *internal[7]; /* reserved */
                  } trigger;
              };

       The triggered context indicates the type of event assigned to the
       trigger, along with a union of trigger details that is based on the
       event type.

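       As an illustration of how the event type selects the union member, the
       following minimal sketch fills in a triggered context for a threshold
       trigger.  The type definitions are local stand-ins that mirror the
       structures in this page so the sketch builds without libfabric; real
       code would include <rdma/fi_trigger.h> instead.  The helper name
       arm_threshold is hypothetical and not part of libfabric.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins mirroring the definitions in this page (assumption: they are
 * only here so the sketch compiles without the libfabric headers). */
struct fid_cntr;                         /* opaque event counter */

enum fi_trigger_event { FI_TRIGGER_THRESHOLD };

struct fi_trigger_threshold {
    struct fid_cntr *cntr;               /* event counter to check */
    size_t threshold;                    /* threshold value */
};

struct fi_triggered_context {
    enum fi_trigger_event event_type;    /* trigger type */
    union {
        struct fi_trigger_threshold threshold;
        void *internal[3];               /* reserved */
    } trigger;
};

/* Hypothetical helper: describe a trigger tied to cntr and threshold n.
 * The caller owns ctx and must keep it valid until the operation
 * completes or is successfully canceled. */
void arm_threshold(struct fi_triggered_context *ctx,
                   struct fid_cntr *cntr, size_t n)
{
    assert(ctx != NULL && cntr != NULL);
    ctx->event_type = FI_TRIGGER_THRESHOLD;
    ctx->trigger.threshold.cntr = cntr;
    ctx->trigger.threshold.threshold = n;
}
```

       In an application, a context prepared this way would be passed as the
       context of a data transfer call that also specifies the FI_TRIGGER
       flag, or of any call made on a trigger-able endpoint.
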
   TRIGGER EVENTS
       The following trigger events are defined.

       FI_TRIGGER_THRESHOLD
              This indicates that the data transfer operation will be deferred
              until an event counter crosses an application-specified
              threshold value.  The threshold is specified using struct
              fi_trigger_threshold:

              struct fi_trigger_threshold {
                  struct fid_cntr *cntr; /* event counter to check */
                  size_t threshold;      /* threshold value */
              };

       Threshold operations are triggered in the order of the threshold
       values.  This is true even if the counter increments by a value greater
       than 1.  If two triggered operations have the same threshold, they will
       be triggered in the order in which they were submitted to the endpoint.
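       The ordering rule can be illustrated with a small stand-alone model.
       This is a sketch, not provider code: struct pending_op and
       release_ready are hypothetical names, the array order is assumed to be
       the submission order, and an operation is assumed to become eligible
       once the counter value is greater than or equal to its threshold.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one queued triggered operation. */
struct pending_op {
    int id;            /* label used to report the release order */
    size_t threshold;  /* counter value at which the op may fire */
    int fired;         /* nonzero once the op has been released */
};

/*
 * Release every not-yet-fired op whose threshold is <= counter.
 * Ops are released in ascending threshold order; ties go to the op that
 * appears earlier in ops[], which is assumed to be submission order.
 * The released ids are written to out[]; the return value is their count.
 */
size_t release_ready(struct pending_op *ops, size_t n,
                     size_t counter, int *out)
{
    size_t released = 0;

    assert(ops != NULL && out != NULL);
    for (;;) {
        struct pending_op *next = NULL;

        for (size_t i = 0; i < n; i++) {
            if (ops[i].fired || ops[i].threshold > counter)
                continue;
            /* strict '<' keeps the earlier-submitted op on ties */
            if (next == NULL || ops[i].threshold < next->threshold)
                next = &ops[i];
        }
        if (next == NULL)
            break;
        next->fired = 1;
        out[released++] = next->id;
    }
    return released;
}
```

       With thresholds 3, 1, 3 and 5 submitted in that order, a single counter
       jump from 0 to 4 releases the second operation first, then the first
       and third in submission order; the fourth stays queued until the
       counter reaches 5.
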

DEFERRED WORK QUEUES

       The following feature and description are enhancements to triggered
       operation support.

       The deferred work queue interface provides primitive constructs that
       can be used to implement application-level collective operations.
       Deferred work requests are a more advanced form of triggered
       operation.  They allow an application to queue operations to a
       deferred work queue that is associated with the domain.  Note that the
       deferred work queue is a conceptual construct, rather than an
       implementation requirement.  Deferred work requests consist of three
       main components: an event or condition that must first be met, an
       operation to perform, and a completion notification.

       Because deferred work requests are posted directly to the domain, they
       can support a broader set of conditions and operations.  Deferred work
       requests are submitted using struct fi_deferred_work.  That structure,
       along with the corresponding operation structures (referenced through
       the op union) used to describe the work, must remain valid until the
       operation completes or is canceled.  The format of the deferred work
       request is as follows:

              struct fi_deferred_work {
                  struct fi_context2    context;

                  uint64_t              threshold;
                  struct fid_cntr       *triggering_cntr;
                  struct fid_cntr       *completion_cntr;

                  enum fi_trigger_op    op_type;

                  union {
                      struct fi_op_msg            *msg;
                      struct fi_op_tagged         *tagged;
                      struct fi_op_rma            *rma;
                      struct fi_op_atomic         *atomic;
                      struct fi_op_fetch_atomic   *fetch_atomic;
                      struct fi_op_compare_atomic *compare_atomic;
                      struct fi_op_cntr           *cntr;
                  } op;
              };

       Once a work request has been posted to the deferred work queue, it will
       remain on the queue until the triggering counter (success plus error
       counter values) has reached the indicated threshold.  If the triggering
       condition has already been met at the time the work request is queued,
       the operation will be initiated immediately.

       On the completion of a deferred data transfer, the specified completion
       counter will be incremented by one.  Note that deferred counter
       operations do not update the completion counter; only the counter
       specified through the fi_op_cntr is modified.  The completion_cntr
       field must be NULL for counter operations.

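       The trigger and completion counter rules above can be modeled in a few
       lines.  This is a simplified sketch under stated assumptions: all model_
       names are hypothetical stand-ins rather than libfabric types, a data
       transfer is assumed to complete successfully as soon as it is started,
       and the counter operation is assumed to be an add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a counter with success and error values. */
struct model_cntr {
    uint64_t success;
    uint64_t errors;
};

enum model_op { MODEL_OP_DATA_TRANSFER, MODEL_OP_CNTR_ADD };

/* Hypothetical, simplified analogue of struct fi_deferred_work. */
struct model_work {
    uint64_t threshold;
    struct model_cntr *triggering_cntr;
    struct model_cntr *completion_cntr;  /* must be NULL for counter ops */
    enum model_op op_type;
    struct model_cntr *target_cntr;      /* counter a counter op modifies */
    uint64_t add_value;
};

/* The trigger compares success plus error counts against the threshold. */
int work_is_ready(const struct model_work *w)
{
    const struct model_cntr *c = w->triggering_cntr;
    return c->success + c->errors >= w->threshold;
}

/*
 * Run the work if its trigger is met.  Returns 1 if it ran, 0 if it has
 * to stay queued, -1 if a counter op was given a completion counter.
 */
int work_try_run(struct model_work *w)
{
    assert(w != NULL && w->triggering_cntr != NULL);
    if (w->op_type == MODEL_OP_CNTR_ADD && w->completion_cntr != NULL)
        return -1;
    if (!work_is_ready(w))
        return 0;
    if (w->op_type == MODEL_OP_CNTR_ADD) {
        /* counter ops modify only the counter named by the operation */
        w->target_cntr->success += w->add_value;
    } else if (w->completion_cntr != NULL) {
        /* a completed data transfer bumps the completion counter by one */
        w->completion_cntr->success += 1;
    }
    return 1;
}
```

       Note that errors on the triggering counter count toward the threshold
       in this model, so a request can fire even when some of the operations
       it waits on have failed.
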
       Because deferred work targets support of collective communication
       operations, posted work requests do not generate any completions at the
       endpoint by default.  For example, completed operations are not written
       to the EP’s completion queue and do not update the EP counter (unless
       the EP counter is explicitly referenced as the completion_cntr).  An
       application may request EP completions by specifying the FI_COMPLETION
       flag as part of the operation.

       It is the responsibility of the application to detect and handle
       situations that could prevent a deferred work request’s condition from
       being met.  For example, if a work request depends on the successful
       completion of a data transfer operation and that operation fails, then
       the application must cancel the work request.

       To submit a deferred work request, applications should use the domain’s
       fi_control function with command FI_QUEUE_WORK and struct
       fi_deferred_work as the fi_control arg parameter.  To cancel a deferred
       work request, use fi_control with command FI_CANCEL_WORK and the
       corresponding struct fi_deferred_work to cancel.  The fi_control
       command FI_FLUSH_WORK will cancel all queued work requests.
       FI_FLUSH_WORK may be used to flush all work queued to the domain, or may
       be used to cancel all requests waiting on a specific triggering_cntr.

       Deferred work requests are not acted upon by the provider until the
       associated event has occurred, although certain validation checks may
       still occur when a request is submitted.  Referenced data buffers are
       not read or otherwise accessed, but the provider may validate fabric
       objects, such as endpoints and counters, and check that input
       parameters fall within supported ranges.  If a specific request is not
       supported by the provider, it will fail the operation with -FI_ENOSYS.

SEE ALSO

       fi_getinfo(3), fi_endpoint(3), fi_alias(3), fi_cntr(3)

AUTHORS

       OpenFabrics.

Libfabric Programmer’s Manual     2021-03-22                     fi_trigger(3)