fabtests(7) @VERSION@ fabtests(7)

Fabtests

Fabtests is a set of examples for fabric providers that demonstrates
various features of libfabric, a high-performance fabric software
library.

Libfabric defines sets of interfaces that fabric providers can support.
The purpose of the Fabtests examples is to demonstrate some of the major
features. The goal is to familiarize users with the different
functionality libfabric offers and how to use it. Although most tests
report performance numbers, they are designed to test functionality and
not performance. The exceptions are the benchmarks and ubertest.

The tests are divided into the following categories. Except for the
unit tests, all of them are client-server tests. Not all providers will
support each test.

The test names try to indicate the type of functionality each test is
verifying. Although some tests work with any endpoint type, many are
restricted to verifying a single endpoint type. These tests typically
include the endpoint type as part of the test name, such as dgram, msg,
or rdm.

These tests are a mix of very basic functionality tests that show major
features of libfabric.

fi_av_xfer
    Tests communication for unconnected endpoints, as addresses are
    inserted into and removed from the local address vector.

fi_cm_data
    Verifies exchanging CM data as part of connecting endpoints.

fi_cq_data
    Transfers messages with CQ data.

fi_dgram
    A basic datagram endpoint example.

fi_dgram_waitset
    Transfers datagrams using waitsets for completion notification.

fi_inj_complete
    Sends messages using the FI_INJECT_COMPLETE operation flag.

fi_mcast
    A simple multicast test.

fi_msg
    A basic message endpoint example.

fi_msg_epoll
    Transfers messages with completion queues configured to use file
    descriptors as wait objects. The file descriptors are retrieved by
    the program and used directly with the Linux epoll API.

fi_msg_sockets
    Verifies that the address assigned to a passive endpoint can be
    transitioned to an active endpoint. This is required by
    applications that need socket API semantics over RDMA
    implementations (e.g. rsockets).

fi_multi_ep
    Performs data transfers over multiple endpoints in parallel.

fi_multi_mr
    Issues RMA write operations to multiple memory regions, using
    completion counters of inbound writes as the notification
    mechanism.

fi_poll
    Exchanges data over RDM endpoints using poll sets to drive
    completion notifications.

fi_rdm
    A basic RDM endpoint example.

fi_rdm_atomic
    Tests and verifies atomic operations over an RDM endpoint.

fi_rdm_deferred_wq
    Tests triggered operations and deferred work queue support.

fi_rdm_multi_domain
    Performs data transfers over multiple endpoints, with each endpoint
    belonging to a different opened domain.

fi_rdm_multi_recv
    Transfers multiple messages over an RDM endpoint that are received
    into a single buffer, posted using the FI_MULTI_RECV flag.

fi_rdm_rma_simple
    A simple RMA write example over an RDM endpoint.

fi_rdm_rma_trigger
    A basic example of queuing an RMA write operation that is initiated
    upon the firing of a triggering completion. Works with RDM
    endpoints.

fi_rdm_shared_av
    Spawns child processes to verify basic functionality of using a
    shared address vector with RDM endpoints.

fi_rdm_tagged_peek
    Basic test of using the FI_PEEK operation flag with tagged
    messages. Works with RDM endpoints.

fi_recv_cancel
    Tests canceling posted receives for tagged messages.

fi_resmgmt_test
    Tests the resource management enabled feature. This verifies that
    the provider prevents applications from overrunning local and
    remote command queues and completion queues. This corresponds to
    setting the domain attribute resource_mgmt to FI_RM_ENABLED.

fi_scalable_ep
    Performs data transfers over scalable endpoints, that is, endpoints
    associated with multiple transmit and receive contexts.

fi_shared_ctx
    Performs data transfers between multiple endpoints, where the
    endpoints share transmit and/or receive contexts.

fi_unexpected_msg
    Tests the send and receive handling of unexpected tagged messages.

The client and the server either exchange messages in a ping-pong
manner (tests with pingpong in the name) or transfer messages one way
(tests with bw in the name). These tests can transfer various message
sizes, with controls over which features are used by the test, and
report performance numbers. The tests are structured based on the
benchmarks provided by OSU MPI. They are not guaranteed to provide the
best latency or bandwidth performance numbers a given provider or
system may achieve.

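The two traffic patterns described above can be illustrated with a small Python sketch. It is not fabtests code and does not use libfabric: a local socket pair stands in for the fabric endpoints, all function names are hypothetical, and reporting latency as half the round-trip time is an assumed convention, not something this page specifies.

```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes; stream sockets may return short reads."""
    remaining = nbytes
    chunks = []
    while remaining:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_peer(sock, size, iterations):
    """Ping-pong peer: send every received message back."""
    for _ in range(iterations):
        sock.sendall(recv_exact(sock, size))


def sink_peer(sock, size, iterations):
    """One-way peer: absorb all messages, then send a 1-byte ack."""
    for _ in range(iterations):
        recv_exact(sock, size)
    sock.sendall(b"k")


def pingpong(size, iterations):
    """Average one-way latency in seconds (assumed: round trip / 2)."""
    client, server = socket.socketpair()
    peer = threading.Thread(target=echo_peer, args=(server, size, iterations))
    peer.start()
    payload = bytes(size)
    start = time.perf_counter()
    for _ in range(iterations):
        client.sendall(payload)      # ping
        recv_exact(client, size)     # pong
    elapsed = time.perf_counter() - start
    peer.join()
    client.close()
    server.close()
    return elapsed / iterations / 2


def bandwidth(size, iterations):
    """One-way throughput in bytes per second."""
    client, server = socket.socketpair()
    peer = threading.Thread(target=sink_peer, args=(server, size, iterations))
    peer.start()
    payload = bytes(size)
    start = time.perf_counter()
    for _ in range(iterations):
        client.sendall(payload)      # no reply expected per message
    recv_exact(client, 1)            # wait for the final ack
    elapsed = time.perf_counter() - start
    peer.join()
    client.close()
    server.close()
    return size * iterations / elapsed


if __name__ == "__main__":
    print("latency (s):", pingpong(64, 1000))
    print("bandwidth (B/s):", bandwidth(4096, 1000))
```

The ping-pong loop waits for a reply before sending the next message, so it measures per-message delay. The one-way loop keeps the sender busy and only synchronizes once at the end, so it measures sustained throughput.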
fi_dgram_pingpong
    Latency test for datagram endpoints.

fi_msg_bw
    Message transfer bandwidth test for connected (MSG) endpoints.

fi_msg_pingpong
    Message transfer latency test for connected (MSG) endpoints.

fi_rdm_cntr_pingpong
    Message transfer latency test for reliable-datagram (RDM) endpoints
    that uses counters as the completion mechanism.

fi_rdm_pingpong
    Message transfer latency test for reliable-datagram (RDM)
    endpoints.

fi_rdm_tagged_bw
    Tagged message bandwidth test for reliable-datagram (RDM)
    endpoints.

fi_rdm_tagged_pingpong
    Tagged message latency test for reliable-datagram (RDM) endpoints.

fi_rma_bw
    An RMA read and write bandwidth test for reliable (MSG and RDM)
    endpoints.

These are simple one-sided unit tests that validate basic behavior of
the API. Because these are single system tests that do not perform data
transfers, their testing scope is limited.

fi_av_test
    Verifies address vector interfaces.

fi_cntr_test
    Tests counter creation and destruction.

fi_cq_test
    Tests completion queue creation and destruction.

fi_dom_test
    Tests domain creation and destruction.

fi_eq_test
    Tests event queue creation, destruction, and capabilities.

fi_getinfo_test
    Tests provider response to fi_getinfo calls with varying hints.

fi_mr_test
    Tests memory registration.

fi_resource_freeing
    Allocates and closes fabric resources to check for proper cleanup.

This is a comprehensive latency, bandwidth, and functionality test that
can handle a variety of test configurations. The test is able to run a
large number of tests by iterating over a large number of test
variables. As a result, a full ubertest run can take a significant
amount of time. Because ubertest iterates over input variables, it
relies on a test configuration file for control, rather than the
extensive command line options that are used by other fabtests. A
configuration file must be constructed for each provider. Example test
configurations are at /test_configs.

fi_ubertest
    This test takes a configuration file as input. The file contains a
    list of variables and their values to iterate over. The test will
    run a set of latency, bandwidth, and functionality tests over a
    given provider. It will perform one execution for every possible
    combination of all variables. For example, if there are 8 test
    variables, with 6 having 2 possible values and 2 having 3 possible
    values, ubertest will execute 2^6 x 3^2 = 576 total iterations of
    each test.

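The iteration count in the example above can be checked with a short Python sketch. The variable names are hypothetical placeholders and do not correspond to real ubertest configuration keys; only the counts (6 variables with 2 values, 2 variables with 3 values) come from the text.

```python
import itertools
import math

# Hypothetical variable names; only the number of values per variable matters.
variables = {f"two_valued_{i}": [0, 1] for i in range(6)}
variables.update({f"three_valued_{i}": [0, 1, 2] for i in range(2)})

# The number of combinations is the product of the value counts.
expected = math.prod(len(values) for values in variables.values())

# Enumerating the Cartesian product yields one tuple per test execution.
combos = list(itertools.product(*variables.values()))

print(expected, len(combos))
```

Because the count is a product, each additional variable multiplies the run time, which is why a full run can take a long time.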
(1) Fabtests requires that libfabric be installed on the system, and
    that at least one provider be usable.

(2) Install fabtests on the system. By default, all the test
    executables are installed in the /usr/bin directory unless
    specified otherwise.

(3) All the client-server tests have the following usage model:

    fi_<name> [OPTIONS]        start server
    fi_<name> <server_addr>    connect to server

Tests share command line options where appropriate. The following
command line options are available for one or more tests. To see which
options apply to a given test, use the '-h' help option to list the
options available for that test.

-h  Displays help output for the test.

-f  Restricts the test to the specified fabric name.

-d  Restricts the test to the specified domain name.

-p  Restricts the test to the specified provider name.

-e  Use the specified endpoint type for the test. Valid options are
    msg, dgram, and rdm. The default endpoint type is rdm.

-a  The name of a shared address vector. This option only applies to
    tests that support shared address vectors.

-B  Specifies the port number of the local endpoint, overriding the
    default.

-P  Specifies the port number of the peer endpoint, overriding the
    default.

-s  Specifies the address of the local endpoint.

-b[=oob_port]
    Enables out-of-band (via sockets) address exchange and test
    synchronization. A port for the out-of-band connection may be
    specified as part of this option to override the default.

-I  Number of data transfer iterations.

-w  Number of warm-up data transfer iterations.

-S  Data transfer size, or 'all' for a full range of sizes. By default
    a select number of sizes will be tested.

-l  If specified, the starting address of transmit and receive buffers
    will be aligned along a page boundary.

-m  Use machine readable output. This is useful for post-processing the
    test output with scripts.

-t  Specifies the type of completion mechanism to use. Valid values are
    queue and counter. The default is to use completion queues.

-c  Indicates the type of processing to use when checking for completed
    operations. Valid values are spin, sread, and fd. The default is to
    busy wait (spin) until the desired operation has completed. The
    sread option indicates that the application will invoke a blocking
    read call in libfabric, such as fi_cq_sread. The fd option
    indicates that the application will retrieve the native operating
    system wait object (file descriptor) and use either poll() or
    select() to block until the fd has been signaled, prior to checking
    for completions.

-o  For RMA based tests, specifies the type of RMA operation to
    perform. Valid values are read, write, and writedata. Write
    operations are the default.

-M  For multicast tests, specifies the address of the multicast group
    to join.

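The fd completion method described under -c follows a general pattern: block on a file descriptor with poll(), then check for completions once it is signaled. The Python sketch below shows that pattern only. It does not call libfabric; a pipe stands in for the provider's wait object, the `completed` list stands in for a completion queue, and every name in it is hypothetical. It assumes a POSIX system where `select.poll` is available.

```python
import os
import select


def wait_then_check(wait_fd, check_completions, timeout_ms=1000):
    """Block on wait_fd with poll(), then run check_completions()."""
    poller = select.poll()
    poller.register(wait_fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    if not events:
        return None          # timed out: the fd was never signaled
    return check_completions()


# Stand-ins (assumptions): a pipe plays the wait object, a list plays
# the completion queue.
read_fd, write_fd = os.pipe()
completed = []


def signal_completion(entry):
    """Pretend the provider finished an operation and signaled the fd."""
    completed.append(entry)
    os.write(write_fd, b"x")


def drain():
    """Clear the signal and return all pending completion entries."""
    os.read(read_fd, 1)
    done = list(completed)
    completed.clear()
    return done


signal_completion("send #1")
result = wait_then_check(read_fd, drain)
print(result)
```

Compared with spinning, this approach lets the process sleep while no work is pending, at the cost of a system call per wakeup.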
A simple example
    run server: <test_name> -p <provider_name> -s <source_addr>
        e.g.    fi_msg_rma -p sockets -s 192.168.0.123
    run client: <test_name> <server_addr> -p <provider_name>
        e.g.    fi_msg_rma 192.168.0.123 -p sockets

An example with various options
    run server: fi_rdm_atomic -p psm -s 192.168.0.123 -I 1000 -S 1024
    run client: fi_rdm_atomic 192.168.0.123 -p psm -I 1000 -S 1024

This will run "fi_rdm_atomic" for all atomic operations with

- PSM provider
- 1000 iterations
- 1024 byte message size
- server node as 192.168.0.123

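When scripting such runs, the matching server and client command lines can be generated from one set of parameters so the two sides never disagree. The following Python sketch is one way to do that; `build_commands` is a hypothetical helper, not part of fabtests, and it uses only the -p, -s, -I, and -S options documented above.

```python
import shlex


def build_commands(test, provider, server_addr, iterations=None, size=None):
    """Return (server_argv, client_argv) for a client-server fabtest."""
    shared = []
    if iterations is not None:
        shared += ["-I", str(iterations)]
    if size is not None:
        shared += ["-S", str(size)]
    # The server binds to its own address with -s; the client passes the
    # server address as a positional argument.
    server = [test, "-p", provider, "-s", server_addr] + shared
    client = [test, server_addr, "-p", provider] + shared
    return server, client


server, client = build_commands(
    "fi_rdm_atomic", "psm", "192.168.0.123", iterations=1000, size=1024
)
print("run server:", shlex.join(server))
print("run client:", shlex.join(client))
```

The argument lists can be passed directly to `subprocess.run` on the respective hosts; `shlex.join` is used here only to print them in shell form.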
Run fi_ubertest
    run server: fi_ubertest
    run client: fi_ubertest -u /usr/share/fabtests/test_configs/sockets/quick.test 192.168.0.123

This will run "fi_ubertest" with

- sockets provider
- configurations defined in /usr/share/fabtests/test_configs/sockets/quick.test
- server node as 192.168.0.123

The config files are provided in /test_configs for the sockets, verbs,
udp, and usnic providers and are distributed with the fabtests
installation.

For more usage options: fi_ubertest -h

Run the whole fabtests suite
A run script, scripts/runfabtests.sh, is provided that runs all the
tests in fabtests and reports the number of pass/fail/notrun.

    Usage: runfabtests.sh [OPTIONS] [provider] [host] [client]

By default, if none of the options are provided, it runs all the tests
using

- sockets provider
- 127.0.0.1 as both server and client address
- a small number of options and iterations

Various options can be used to choose the provider, the subset of tests
to run, the level of verbosity, etc.

    runfabtests.sh -vvv -t all psm 192.168.0.123 192.168.0.124

This will run all fabtests using

- psm provider
- different options and a larger number of iterations
- server node as 192.168.0.123 and client node as 192.168.0.124
- test output printed for all the tests

For detailed usage options: runfabtests.sh -h

OpenFabrics.

Libfabric Programmer's Manual 2018-10-06 fabtests(7)