opahostadmin(8)      Master map: IFSFFCLIRG (Man Page)      opahostadmin(8)

       opahostadmin

       (Host) Performs a number of multi-step host initialization and
       verification operations, including upgrading software or firmware,
       rebooting hosts, and other operations. In general, operations
       performed by opahostadmin involve a login to one or more host systems.

       opahostadmin [-c] [-i ipoib_suffix] [-f hostfile] [-h 'hosts']
              [-r release] [-I install_options] [-U upgrade_options] [-d dir]
              [-T product] [-P packages] [-m netmask] [-S] operation...

       --help Produces full help text.

       -c     Overwrites the result files from any previous run before
              starting this run.

       -i ipoib_suffix
              Specifies the suffix to apply to host names to create IPoIB
              host names. Default is -opa.

       -f hostfile
              Specifies the file with the names of hosts in a cluster.
              Default is /etc/opa/hosts.

       -h hosts
              Specifies the list of hosts to execute the operation against.

       -r release
              Specifies the software version to load or upgrade to. Default
              is the version of Intel(R) Omni-Path Software presently
              running on the server.

       -d dir Specifies the directory from which to retrieve
              product.release.tgz for load or upgrade.

       -I install_options
              Specifies the software install options.

       -U upgrade_options
              Specifies the software upgrade options.

       -T product
              Specifies the product type to install. Default is
              IntelOPA-Basic.<distro> or IntelOPA-IFS.<distro>, where
              <distro> is the distribution and CPU.

       -P packages
              Specifies the packages to install. Default = oftools ipoib
              psm_mpi

       -m netmask
              Specifies the IPoIB netmask to use for the configipoib
              operation.

       -S     Securely prompts for user password on remote system.

       operation
              Performs the specified operation, which can be one or more of
              the following:

              load   Starts initial installation of all hosts.

              upgrade
                     Upgrades installation of all hosts.

              configipoib
                     Creates ifcfg-ib1 using host IP address from
                     /etc/hosts file.

              reboot Reboots hosts, ensures they go down and come back.

              sacache
                     Confirms sacache has all hosts in it.

              ipoibping
                     Verifies this host can ping each host through IPoIB.

              mpiperf
                     Verifies latency and bandwidth for each host.

              mpiperfdeviation
                     Verifies latency and bandwidth for each host against a
                     defined threshold (or relative to average host
                     performance).

       opahostadmin -c reboot
       opahostadmin upgrade
       opahostadmin -h 'elrond arwen' reboot
       HOSTS='elrond arwen' opahostadmin reboot

       opahostadmin provides detailed logging of its results. During each run,
       the following files are produced:

       · test.res : Appended with summary results of run.

       · test.log : Appended with detailed results of run.

       · save_tmp/ : Contains a directory per failed test with detailed
         logs.

       · test_tmp*/ : Intermediate result files while test is running.

       The -c option removes all log files.

       Results from opahostadmin are grouped into test suites, test cases, and
       test items. A given run of opahostadmin represents a single test suite.
       Within a test suite, multiple test cases occur, typically one test case
       per host being operated on. Some of the more complex operations may
       have multiple test items per test case. Each test item represents a
       major step in the overall test case.

       Each opahostadmin run appends to test.res and test.log, and creates
       temporary files in test_tmp$PID in the current directory. test.res
       provides an overall summary of operations performed and their results.
       The same information is also displayed while opahostadmin is executing.
       test.log contains detailed information about what was performed,
       including the specific commands executed and the resulting output. The
       test_tmp directories contain temporary files that reflect tests in
       progress (or killed). Logs for any failures are saved in the save_tmp
       directory, with one directory per failed test case. If the same test
       case fails more than once, save_tmp retains the information from the
       first failure. Subsequent runs of opahostadmin are appended to
       test.log. Intel recommends reviewing failures and using the -c option
       to remove old logs before subsequent runs of opahostadmin.
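
When reviewing failures before clearing logs with -c, it can help to list which test cases left a directory behind. The following is a minimal sketch, not part of opahostadmin: the helper name failed_test_cases is hypothetical, and it assumes only what is stated above, namely that save_tmp/ in the run directory holds one directory per failed test.

```python
from pathlib import Path


def failed_test_cases(run_dir="."):
    """Return the names of the per-failure directories under save_tmp, sorted.

    Hypothetical helper: it only inspects the directory layout described
    in this page and does not parse test.res or test.log.
    """
    save_dir = Path(run_dir) / "save_tmp"
    if not save_dir.is_dir():
        # No failures were recorded, or the logs were removed with -c.
        return []
    # Only directories count; stray files are ignored.
    return sorted(p.name for p in save_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    for name in failed_test_cases("."):
        print(name)
```

Run it from the directory where opahostadmin was invoked, since the result files are created in the current directory.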

       opahostadmin performs its operations in parallel by default. As with
       the other tools, FF_MAX_PARALLEL can be exported to change the degree
       of parallelism. The default is twenty (20) parallel operations.
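
The effect of FF_MAX_PARALLEL can be pictured as a cap on a worker pool. The sketch below is a model of that idea only, not the tool's implementation: run_on_hosts and parallel_limit are hypothetical names, the per-host operation is a stand-in callable, and clamping values below 1 up to 1 is an assumption.

```python
import os
from concurrent.futures import ThreadPoolExecutor

DEFAULT_PARALLEL = 20  # default stated in this page


def parallel_limit(env=None):
    """Read FF_MAX_PARALLEL from the environment, falling back to 20."""
    env = os.environ if env is None else env
    # Assumption: values below 1 are treated as 1 in this sketch.
    return max(1, int(env.get("FF_MAX_PARALLEL", DEFAULT_PARALLEL)))


def run_on_hosts(hosts, operation, env=None):
    """Apply operation to every host, at most parallel_limit() at a time."""
    with ThreadPoolExecutor(max_workers=parallel_limit(env)) as pool:
        # pool.map preserves input order, so zip pairs each host with its result.
        return dict(zip(hosts, pool.map(operation, hosts)))


if __name__ == "__main__":
    print(run_on_hosts(["elrond", "arwen"], lambda h: h + ": ok"))
```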

       The following environment variables are also used by this command:

       HOSTS  List of hosts, used if -h option not supplied.

       HOSTS_FILE
              File containing list of hosts, used in absence of -f and -h.

       FF_MAX_PARALLEL
              Maximum number of concurrent operations to perform.

       FF_SERIALIZE_OUTPUT
              Serialize output of parallel operations (yes or no).

       FF_TIMEOUT_MULT
              Multiplier for all timeouts associated with this command.
              Used if the systems are slow for some reason.
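
The way the host list is chosen from -h, -f, HOSTS, and HOSTS_FILE can be sketched as a simple precedence chain. This is an illustration under stated assumptions: select_hosts is a hypothetical function, and the ordering of -f relative to HOSTS (command-line options before environment variables) is assumed, since this page does not spell it out.

```python
import os

DEFAULT_HOSTS_FILE = "/etc/opa/hosts"  # default documented for -f


def select_hosts(h_opt=None, f_opt=None, env=None):
    """Return ("list", [hosts]) or ("file", path) for a given invocation."""
    env = os.environ if env is None else env
    if h_opt:
        # -h always wins: an explicit, space-separated host list.
        return ("list", h_opt.split())
    if f_opt:
        # Assumption: an explicit -f takes priority over HOSTS.
        return ("file", f_opt)
    if env.get("HOSTS"):
        # HOSTS is used if -h is not supplied.
        return ("list", env["HOSTS"].split())
    # HOSTS_FILE is used in the absence of -f and -h, else the default file.
    return ("file", env.get("HOSTS_FILE") or DEFAULT_HOSTS_FILE)


if __name__ == "__main__":
    print(select_hosts(h_opt="elrond arwen"))
    print(select_hosts(env={"HOSTS_FILE": "/tmp/myhosts"}))
```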

       (Host) Intel recommends that you set up password-less SSH or SCP for
       use during this operation. Alternatively, the -S option can be used to
       securely prompt for a password, in which case the same password is used
       for all hosts. The password may also be put in the environment or the
       opafastfabric.conf file using FF_PASSWORD and FF_ROOTPASS.

       load   Performs an initial installation of Intel(R) Omni-Path Software
              on a group of hosts. Any existing installation is uninstalled
              and existing configuration files are removed. Subsequently,
              the hosts are installed with a default Intel(R) Omni-Path
              Software configuration. The -I option can be used to select
              different install packages. Default = oftools ipoib mpi. The
              -r option can be used to specify a release to install other
              than the one that this host is presently running. The
              FF_PRODUCT.FF_PRODUCT_VERSION.tgz file (for example,
              IntelOPA-Basic.version.tgz) is expected to exist in the
              directory specified by -d. Default is the current working
              directory. The specified software is copied to all the
              selected hosts and installed.

       upgrade
              Upgrades all selected hosts without modifying existing
              configurations. This operation is comparable to the -U option
              when running ./INSTALL manually. The -r option can be used to
              upgrade to a release different from this host. The default is
              to upgrade to the same release as this host. The
              FF_PRODUCT.FF_PRODUCT_VERSION.tgz file (for example,
              IntelOPA-Basic.version.tgz) is expected to exist in the
              directory specified by -d. (The default is the current working
              directory.) The specified software is copied to all the end
              nodes and installed.

       NOTE: Only components that are currently installed are upgraded. This
       operation fails for hosts that do not have Intel(R) Omni-Path Software
       installed.

       configipoib
              Creates an ifcfg-ib1 configuration file for each node using
              the IP address found using the resolver on the node. The
              standard Linux* resolver is used through the host command.
              (If running OFA Delta, this option configures ifcfg-ib0.)

              If the host is not found, /etc/hosts on the node is checked.
              The -i option specifies an IPoIB suffix to apply to the host
              name to create the IPoIB host name for the node. The default
              suffix is -opa. The -m option specifies a netmask other than
              the default for the given class of IP address, such as when
              dividing a class A or B address into smaller IP subnets.
              IPoIB is configured for a static IP address and is
              autostarted at boot. For the Intel(R) OP Software Stack, the
              default /etc/ipoib.cfg file is used, which provides a
              redundant IPoIB configuration using both ports of the first
              HFI in the system.
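
To make the naming and netmask rules concrete, here is a small sketch. It is illustrative only: ipoib_hostname and classful_netmask are hypothetical helpers, the suffix is passed explicitly rather than defaulted, and the netmask values are the conventional class A, B, and C masks that "the default for the given class of IP address" is assumed to mean.

```python
import ipaddress


def ipoib_hostname(host, suffix):
    """Build the IPoIB host name by appending the -i suffix to the host name."""
    return host + suffix


def classful_netmask(ip):
    """Return the conventional default netmask for the address class of ip."""
    first_octet = int(ipaddress.IPv4Address(ip)) >> 24
    if first_octet < 128:
        return "255.0.0.0"      # class A
    if first_octet < 192:
        return "255.255.0.0"    # class B
    if first_octet < 224:
        return "255.255.255.0"  # class C
    # Class D and E addresses have no host netmask.
    raise ValueError("no classful netmask for " + ip)


if __name__ == "__main__":
    print(ipoib_hostname("elrond", "-opa"), classful_netmask("172.16.4.9"))
```

Passing -m overrides the classful value, for example 255.255.255.0 when a class B range such as 172.16.0.0 is split into smaller subnets.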

       NOTE: opahostadmin configipoib also supports DHCP for configuring the
       IPoIB interface, selected with the static, dhcp, or auto option. You
       must specify the option in /etc/opa/opafastfabric.conf using the
       FF_IPOIB_CONFIG variable. If no option is found, the static IP
       configuration is used by default. If auto is specified, either a
       static or a DHCP configuration is chosen: static is used if the IP
       address can be obtained from /etc/hosts or the resolver; otherwise
       DHCP is used.
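
The selection rule in this note can be written as a short decision function. This is a sketch of the described behavior, not the tool's code: ipoib_mode is a hypothetical name, and the exact spellings accepted by FF_IPOIB_CONFIG (static, dhcp, auto) are assumed from the wording of the note.

```python
def ipoib_mode(ff_ipoib_config, resolved_ip):
    """Return "static" or "dhcp" for one host.

    ff_ipoib_config: value of FF_IPOIB_CONFIG, or None/"" if unset.
    resolved_ip: address found via /etc/hosts or the resolver, or None.
    """
    # With no option set, static configuration is the default.
    mode = (ff_ipoib_config or "static").strip().lower()
    if mode == "auto":
        # auto prefers static when an address can be resolved.
        return "static" if resolved_ip else "dhcp"
    if mode in ("static", "dhcp"):
        return mode
    raise ValueError("unrecognized FF_IPOIB_CONFIG value: " + mode)


if __name__ == "__main__":
    print(ipoib_mode("auto", "10.0.0.7"))  # address known
    print(ipoib_mode("auto", None))        # address unknown
```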

       reboot Reboots the given hosts and ensures they go down and come
              back up by pinging them during the reboot process. The ping
              rate is slow (5 seconds), so if the servers boot faster than
              this, false failures may be seen.

       sacache
              Verifies that the given hosts can properly communicate with
              the SA and that any cached SA data is up to date. To run this
              command, Intel(R) Omni-Path Fabric software must be installed
              and running on the given hosts. The subnet manager and
              switches must be up. If this test fails, opacmdall
              'opasaquery -o desc' can be run against any problem hosts.

       NOTE: This operation requires that the hosts being queried are
       specified by a resolvable TCP/IP host name. This operation fails if
       the selected hosts are specified by IP address.

       ipoibping
              Verifies IPoIB basic operation by ensuring that the host can
              ping all other nodes through IPoIB. To run this command,
              Intel(R) Omni-Path Fabric software must be installed, IPoIB
              must be configured and running on the host, and the given
              hosts, the SM, and switches must be up. The -i option can
              specify an alternate IPoIB hostname suffix.

       mpiperf
              Verifies that MPI is operational and checks MPI end-to-end
              latency and bandwidth between pairs of nodes (for example,
              1-2, 3-4, 5-6). Use this to verify switch latency/hops, PCI
              bandwidth, and overall MPI performance. The test.res file
              contains the results of each pair of nodes tested.

       NOTE: This option is available for the Intel(R) Omni-Path Fabric Host
       Software OFA Delta packaging, but is not presently available for other
       packagings of OFED.

       To obtain accurate results, this test should be run at a time
       when no other stressful applications (for example, MPI jobs or
       high stress file system operations) are running on the given
       hosts.

       Bandwidth issues typically indicate server configuration issues
       (for example, incorrect slot used, incorrect BIOS settings, or
       incorrect HFI model), or fabric issues (for example, symbol
       errors, incorrect link width, or speed). Assuming opareport has
       previously been used to check for link errors and link speed
       issues, the server configuration should be verified.

       Note that BIOS settings and differences between server models
       can account for 10-20% differences in bandwidth. For more
       details about BIOS settings, consult the documentation from the
       server supplier and/or the server PCI chipset manufacturer.
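
The pairing pattern that mpiperf describes (1-2, 3-4, 5-6) amounts to walking the host list two at a time. The sketch below shows that pattern only; adjacent_pairs is a hypothetical helper, and leaving a trailing unpaired host out when the count is odd is an assumption, since this page does not say how that case is handled.

```python
def adjacent_pairs(hosts):
    """Pair hosts in list order: (1st, 2nd), (3rd, 4th), and so on."""
    # Stop at len - 1 so an odd trailing host is skipped, not paired with nothing.
    return [(hosts[i], hosts[i + 1]) for i in range(0, len(hosts) - 1, 2)]


if __name__ == "__main__":
    for a, b in adjacent_pairs(["n1", "n2", "n3", "n4", "n5", "n6"]):
        print(a + "-" + b)
```

Because pairs follow list order, the order of names in the host file determines which nodes are tested against each other.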

       mpiperfdeviation
              An enhanced version of mpiperf that verifies MPI performance.
              Can be used to verify switch latency/hops, PCI bandwidth, and
              overall MPI performance. It performs assorted pair-wise
              bandwidth and latency tests, and reports pairs outside an
              acceptable tolerance range. The tool identifies specific
              nodes that have problems and provides a concise summary of
              results. The test.res file contains the results of each pair
              of nodes tested.

              By default, concurrent mode is used to quickly analyze the
              fabric and host performance. Pairs that have 20% less
              bandwidth or 50% more latency than the average pair are
              reported as failures.

              The tool can be run in a sequential or a concurrent mode.
              Sequential mode runs each host against a reference host. By
              default, the reference host is selected based on the best
              performance from a quick test of the first 40 hosts. In
              concurrent mode, hosts are paired up and all pairs are run
              concurrently. Since there may be fabric contention during
              such a run, any poorly performing pairs are then rerun
              sequentially against the reference host.

              Concurrent mode runs the tests in the shortest amount of
              time; however, the results could be slightly less accurate
              due to switch contention. In heavily oversubscribed fabric
              designs, if concurrent mode is producing unexpectedly low
              performance, try sequential mode.

       NOTE: This option is available for the Intel(R) Omni-Path Fabric Host
       Software OFA Delta packaging, but is not presently available for other
       packagings of OFED.

       To obtain accurate results, this test should be run at a time
       when no other stressful applications (for example, MPI jobs or
       high stress file system operations) are running on the given
       hosts.

       Bandwidth issues typically indicate server configuration issues
       (for example, incorrect slot used, incorrect BIOS settings, or
       incorrect HFI model), or fabric issues (for example, symbol
       errors, incorrect link width, or speed). Assuming opareport has
       previously been used to check for link errors and link speed
       issues, the server configuration should be verified.

       Note that BIOS settings and differences between server models
       can account for 10-20% differences in bandwidth. A result 5-10%
       below the average is typically not cause for serious alarm, but
       may reflect limitations in the server design or the chosen BIOS
       settings.

       For more details about BIOS settings, consult the documentation
       from the server supplier and/or the server PCI chipset
       manufacturer.

       The deviation application supports a number of parameters that
       allow for more precise control over the mode, benchmark, and
       pass/fail criteria. The parameters to use can be selected using
       the FF_DEVIATION_ARGS configuration parameter in
       opafastfabric.conf.

       Available parameters for deviation application:

       [-bwtol bwtol] [-bwbidir] [-bwunidir] [-bwdelta MBs] [-bwthres MBs]
       [-bwloop count] [-bwsize size] [-lattol lattol]
       [-latdelta usec] [-latthres usec] [-latloop count]
       [-latsize size] [-c] [-b] [-v] [-vv]
       [-h reference_host]

       -bwtol Specifies the percent of bandwidth degradation allowed
              below average value.

       -bwbidir
              Performs a bidirectional bandwidth test.

       -bwunidir
              Performs a unidirectional bandwidth test (default).

       -bwdelta
              Specifies the limit in MB/s of bandwidth degradation
              allowed below average value.

       -bwthres
              Specifies the lower limit in MB/s of bandwidth allowed.

       -bwloop
              Specifies the number of loops to execute each bandwidth
              test.

       -bwsize
              Specifies the size of message to use for bandwidth test.

       -lattol
              Specifies the percent of latency degradation allowed
              above average value.

       -latdelta
              Specifies the limit in µsec of latency degradation
              allowed above average value.

       -latthres
              Specifies the upper limit in µsec of latency allowed.

       -latloop
              Specifies the number of loops to execute each latency
              test.

       -latsize
              Specifies the size of message to use for latency test.

       -c     Runs test pairs concurrently instead of the default of
              sequential.

       -b     When comparing results against tolerance and delta,
              uses best instead of average.

       -v     Produces verbose output.

       -vv    Produces very verbose output.

       -h     Specifies the reference host to use for sequential
              pairing.

       Both bwtol and bwdelta must be exceeded to fail bandwidth test.

       When bwthres is supplied, bwtol and bwdelta are ignored.

       Both lattol and latdelta must be exceeded to fail latency test.

       When latthres is supplied, lattol and latdelta are ignored.

       For consistency with OSU benchmarks, MB/s is defined as 1000000
       bytes/s.
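
The four rules above combine into a compact pass/fail check. The following is a minimal sketch of that logic as read from this page, not the deviation application's source: the function names are hypothetical, the comparison value may be the average or (with -b) the best result, and latthres is treated as the largest latency allowed, which is an assumption about the intended meaning.

```python
def bandwidth_fails(bw, ref_bw, bwtol, bwdelta, bwthres=None):
    """bw and ref_bw in MB/s (1 MB/s = 1000000 bytes/s); bwtol in percent."""
    if bwthres is not None:
        # An absolute threshold replaces the tolerance and delta checks.
        return bw < bwthres
    shortfall = ref_bw - bw
    # Both the percentage tolerance and the MB/s delta must be exceeded.
    return shortfall > ref_bw * bwtol / 100.0 and shortfall > bwdelta


def latency_fails(lat, ref_lat, lattol, latdelta, latthres=None):
    """lat and ref_lat in microseconds; lattol in percent."""
    if latthres is not None:
        # Assumption: latthres is the largest latency allowed.
        return lat > latthres
    excess = lat - ref_lat
    # Both the percentage tolerance and the microsecond delta must be exceeded.
    return excess > ref_lat * lattol / 100.0 and excess > latdelta


if __name__ == "__main__":
    # Example values only: 20% and 50% match the default percentages
    # mentioned for mpiperfdeviation; the deltas are made up.
    print(bandwidth_fails(9000, 12000, bwtol=20, bwdelta=150))
    print(latency_fails(1.2, 1.0, lattol=50, latdelta=0.1))
```

Requiring both conditions means a pair that is far off in percentage terms but only slightly off in absolute terms (or the reverse) is not reported, which reduces noise when the reference value is very small or very large.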

Copyright(C) 2015-2018 Intel Corporation                    opahostadmin(8)