MLX5DV_WR(3) mlx5 Programmer’s Manual MLX5DV_WR(3)

NAME
mlx5dv_wr_set_dc_addr - Attach DC info to the last work request

mlx5dv_wr_raw_wqe - Build a raw work request

mlx5dv_wr_memcpy - Build a DMA memcpy work request

SYNOPSIS
#include <infiniband/mlx5dv.h>

static inline void mlx5dv_wr_set_dc_addr(struct mlx5dv_qp_ex *mqp,
                                         struct ibv_ah *ah,
                                         uint32_t remote_dctn,
                                         uint64_t remote_dc_key);

static inline void mlx5dv_wr_set_dc_addr_stream(struct mlx5dv_qp_ex *mqp,
                                                struct ibv_ah *ah,
                                                uint32_t remote_dctn,
                                                uint64_t remote_dc_key,
                                                uint16_t stream_id);

struct mlx5dv_mr_interleaved {
        uint64_t addr;
        uint32_t bytes_count;
        uint32_t bytes_skip;
        uint32_t lkey;
};

static inline void mlx5dv_wr_mr_interleaved(struct mlx5dv_qp_ex *mqp,
                                            struct mlx5dv_mkey *mkey,
                                            uint32_t access_flags, /* use enum ibv_access_flags */
                                            uint32_t repeat_count,
                                            uint16_t num_interleaved,
                                            struct mlx5dv_mr_interleaved *data);

static inline void mlx5dv_wr_mr_list(struct mlx5dv_qp_ex *mqp,
                                     struct mlx5dv_mkey *mkey,
                                     uint32_t access_flags, /* use enum ibv_access_flags */
                                     uint16_t num_sges,
                                     struct ibv_sge *sge);

static inline int mlx5dv_wr_raw_wqe(struct mlx5dv_qp_ex *mqp, const void *wqe);

static inline void mlx5dv_wr_memcpy(struct mlx5dv_qp_ex *mqp_ex,
                                    uint32_t dest_lkey, uint64_t dest_addr,
                                    uint32_t src_lkey, uint64_t src_addr,
                                    size_t length);

DESCRIPTION
The MLX5DV work request APIs (mlx5dv_wr_*) are an extension of the IBV work
request API (ibv_wr_*) that adds mlx5-specific features for send work
requests. They may be used together with or without ibv_wr_* calls.

USAGE
To use these APIs, a QP must be created using mlx5dv_create_qp() with
send_ops_flags of struct ibv_qp_init_attr_ex set.

If the QP does not support all the requested work request types, QP
creation fails.

The mlx5dv_qp_ex is extracted from the ibv_qp by ibv_qp_to_qp_ex() and
mlx5dv_qp_ex_from_ibv_qp_ex(). It should be used to apply the mlx5-specific
features to the posted WR.

Creating a work request requires using the ibv_qp_ex as described in
the man page for ibv_wr_post, and the mlx5dv_qp_ex with its available
builders and setters.

QP Specific builders
RC QPs
mlx5dv_wr_mr_interleaved()

Registers an interleaved memory layout by using an indirect mkey
and some interleaved data. After registration, the layout of the
memory pointed to by the mkey is the data representation of the
num_interleaved entries. This single layout representation is
repeated repeat_count times.

The data described by each struct mlx5dv_mr_interleaved holds
bytes_count bytes of real data followed by bytes_skip bytes of
padding. After a successful registration, RDMA operations can
use this mkey. The hardware will scatter the data according to
the pattern. The mkey should be used in zero-based mode: the
addr field in its ibv_sge is an offset into the total data. To
create this mkey, mlx5dv_create_mkey() should be used.

The current implementation requires the IBV_SEND_INLINE option to
be on in the ibv_qp_ex->wr_flags field. To use more than 3
num_interleaved entries, the QP should be created with a WQE size
large enough to fit them. This is done using the max_inline_data
attribute of struct ibv_qp_cap at QP creation.

As one entry is consumed by the strided header, the mkey
should be created with one more entry than the required
num_interleaved.

If ibv_qp_ex->wr_flags has IBV_SEND_SIGNALED set, the reported
WC opcode will be MLX5DV_WC_UMR. To unregister the mkey so that
another pattern can be registered, use ibv_post_send with the
IBV_WR_LOCAL_INV opcode.
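
The zero-based addressing described above can be illustrated with a small
host-side model. This is only a sketch of one reading of the layout, not
driver code: struct interleaved_entry is a local mirror of struct
mlx5dv_mr_interleaved so that the block builds without rdma-core, and
interleaved_resolve() is a hypothetical helper. It assumes that each
repetition takes bytes_count bytes from every entry in order, and that
each entry then advances by bytes_count + bytes_skip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct mlx5dv_mr_interleaved (assumption: same fields). */
struct interleaved_entry {
	uint64_t addr;
	uint32_t bytes_count;
	uint32_t bytes_skip;
	uint32_t lkey;
};

/*
 * Hypothetical helper: translate a zero-based offset into the registered
 * layout to the underlying address and lkey.
 * Returns 0 on success, -1 if the offset is outside the layout.
 */
static int interleaved_resolve(const struct interleaved_entry *data,
			       uint16_t num_interleaved,
			       uint32_t repeat_count,
			       uint64_t offset,
			       uint64_t *addr, uint32_t *lkey)
{
	uint64_t round_bytes = 0;
	uint64_t round, within;
	uint16_t i;

	assert(data != NULL && addr != NULL && lkey != NULL);

	/* Real data contributed by one repetition of the pattern. */
	for (i = 0; i < num_interleaved; i++)
		round_bytes += data[i].bytes_count;
	if (round_bytes == 0 || offset / round_bytes >= repeat_count)
		return -1;

	round = offset / round_bytes;   /* which repetition */
	within = offset % round_bytes;  /* position inside that repetition */

	for (i = 0; i < num_interleaved; i++) {
		if (within < data[i].bytes_count) {
			uint64_t stride = (uint64_t)data[i].bytes_count +
					  data[i].bytes_skip;

			*addr = data[i].addr + round * stride + within;
			*lkey = data[i].lkey;
			return 0;
		}
		within -= data[i].bytes_count;
	}
	return -1;
}
```

With two entries of 4 and 8 data bytes, one repetition covers 12 bytes, so
offset 12 falls on the first byte of the second repetition of the first
entry.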
mlx5dv_wr_mr_list()

Registers a memory layout based on a list of ibv_sge. After
registration, the layout of the memory pointed to by the mkey is
based on the list of sge entries counted by num_sges. After a
successful registration, RDMA operations can use this mkey. The
hardware will scatter the data according to the pattern. The mkey
should be used in zero-based mode: the addr field in its ibv_sge
is an offset into the total data.

The current implementation requires the IBV_SEND_INLINE option to
be on in the ibv_qp_ex->wr_flags field. To use more than 4
num_sges entries, the QP should be created with a WQE size large
enough to fit them. This is done using the max_inline_data
attribute of struct ibv_qp_cap at QP creation.

If ibv_qp_ex->wr_flags has IBV_SEND_SIGNALED set, the reported
WC opcode will be MLX5DV_WC_UMR. To unregister the mkey so that
another pattern can be registered, use ibv_post_send with the
IBV_WR_LOCAL_INV opcode.

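For the list-based layout, the same zero-based offset can be modeled as a
walk over the sge entries in order. The sketch below is an illustration
only: struct sge_entry is a local stand-in for struct ibv_sge (addr,
length, lkey) so that the block builds without libibverbs, and
sge_list_resolve() is a hypothetical helper, not a library call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for struct ibv_sge (assumption: same three fields). */
struct sge_entry {
	uint64_t addr;
	uint32_t length;
	uint32_t lkey;
};

/*
 * Hypothetical helper: translate a zero-based offset into the registered
 * list layout to the underlying address and lkey.
 * Returns 0 on success, -1 if the offset is past the end of the list.
 */
static int sge_list_resolve(const struct sge_entry *sge, uint16_t num_sges,
			    uint64_t offset, uint64_t *addr, uint32_t *lkey)
{
	uint16_t i;

	assert(sge != NULL && addr != NULL && lkey != NULL);

	for (i = 0; i < num_sges; i++) {
		if (offset < sge[i].length) {
			*addr = sge[i].addr + offset;
			*lkey = sge[i].lkey;
			return 0;
		}
		/* Skip this entry and keep the remainder of the offset. */
		offset -= sge[i].length;
	}
	return -1;
}
```

With a 16-byte entry followed by a 32-byte entry, offset 16 resolves to the
first byte of the second entry, and offset 48 is out of range.
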
RC or DCI QPs
mlx5dv_wr_memcpy()

Builds a DMA memcpy work request to copy length bytes of data
from src_addr to dest_addr. The copy operation is done using the
DMA MMO functionality of the device to copy data over the PCI
bus.

The MLX5DV_QP_EX_WITH_MEMCPY flag in
mlx5dv_qp_init_attr.send_ops_flags needs to be set during QP
creation. If the device or QP doesn’t support it, QP creation
fails. The maximum memcpy length supported by the device is
reported in mlx5dv_context->max_wr_memcpy_length. A zero value in
mlx5dv_context->max_wr_memcpy_length means the device doesn’t
support memcpy operations.

The IBV_SEND_FENCE indicator should be used on any subsequent
send request that depends on dest_addr of the memcpy operation.

If ibv_qp_ex->wr_flags has IBV_SEND_SIGNALED set, the reported
WC opcode will be MLX5DV_WC_MEMCPY.

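Because the device reports a maximum memcpy length, an application that
needs to copy more than that in one logical operation might split the copy
into several work requests. The sketch below shows only the chunking
arithmetic. memcpy_next_chunk() is a hypothetical helper, not part of
mlx5dv, and whether splitting is appropriate for a given workload is an
assumption left to the application.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of mlx5dv): length of the chunk that starts
 * at 'offset' when a copy of 'total' bytes is split into pieces of at most
 * 'max_len' bytes. Returns 0 when nothing is left to copy, or when max_len
 * is 0, which the device uses to report that memcpy is unsupported.
 */
static size_t memcpy_next_chunk(size_t total, size_t max_len, size_t offset)
{
	size_t left;

	if (max_len == 0 || offset >= total)
		return 0;
	left = total - offset;
	return left < max_len ? left : max_len;
}

/*
 * Intended use, with names taken from the synopsis (untested sketch):
 *
 *   for (off = 0; (n = memcpy_next_chunk(length, max_len, off)) != 0; off += n)
 *           mlx5dv_wr_memcpy(mqp_ex, dest_lkey, dest_addr + off,
 *                            src_lkey, src_addr + off, n);
 *
 * where max_len is the value of mlx5dv_context->max_wr_memcpy_length.
 */
```

For a 10-byte copy with a 4-byte limit, the helper yields chunks of 4, 4
and 2 bytes.
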
Raw WQE builders
mlx5dv_wr_raw_wqe()

Builds a custom work request (WQE) and posts it on a normal QP.
The caller needs to set all details of the WQE, except the
“ctrl.wqe_index” and “ctrl.signature” fields, which the driver
is responsible for setting. The MLX5DV_QP_EX_WITH_RAW_WQE flag
in mlx5dv_qp_init_attr.send_ops_flags needs to be set.

The wr_flags are ignored, as it is the caller’s responsibility
to set flags in the WQE.

Regardless of the send opcode, the work completion opcode
for a raw WQE is IBV_WC_DRIVER2.

QP Specific setters
DCI QPs
mlx5dv_wr_set_dc_addr() must be called to set the DCI WR
properties. The destination address of the work is specified by
ah, the remote DCT number is specified by remote_dctn, and the DC
key is specified by remote_dc_key. This setter is available when
the QP transport is DCI and send_ops_flags in struct
ibv_qp_init_attr_ex is set. The available builders and setters
for a DCI QP are the same as for an RC QP. A DCI QP created with
MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS can call
mlx5dv_wr_set_dc_addr_stream() to define the stream_id of the
operation, which allows the HW to choose one of the multiple
concurrent DCI resources. Calls to mlx5dv_wr_set_dc_addr() are
equivalent to using stream_id=0.

EXAMPLE
/* create DC QP type and specify the required send opcodes */
attr_ex.qp_type = IBV_QPT_DRIVER;
attr_ex.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS;
attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_WRITE;

attr_dv.comp_mask |= MLX5DV_QP_INIT_ATTR_MASK_DC;
attr_dv.dc_init_attr.dc_type = MLX5DV_DCTYPE_DCI;

/* mlx5dv_create_qp() takes pointers to both attribute structs */
struct ibv_qp *qp = mlx5dv_create_qp(ctx, &attr_ex, &attr_dv);
struct ibv_qp_ex *qpx = ibv_qp_to_qp_ex(qp);
struct mlx5dv_qp_ex *mqpx = mlx5dv_qp_ex_from_ibv_qp_ex(qpx);

ibv_wr_start(qpx);

/* Use ibv_qp_ex object to set WR generic attributes */
qpx->wr_id = my_wr_id_1;
qpx->wr_flags = IBV_SEND_SIGNALED;
ibv_wr_rdma_write(qpx, rkey, remote_addr_1);
ibv_wr_set_sge(qpx, lkey, local_addr_1, length_1);

/* Use mlx5 DC setter using mlx5dv_qp_ex object */
mlx5dv_wr_set_dc_addr(mqpx, ah, remote_dctn, remote_dc_key);

ret = ibv_wr_complete(qpx);

SEE ALSO
ibv_post_send(3), ibv_create_qp_ex(3), ibv_wr_post(3),
mlx5dv_create_mkey(3).

AUTHORS
Guy Levi <guyle@mellanox.com>

Mark Zhang <markzhang@nvidia.com>

mlx5 2019-02-24 MLX5DV_WR(3)