pdsh(1)                    General Commands Manual                    pdsh(1)

NAME
       pdsh - issue commands to groups of hosts in parallel

SYNOPSIS
       pdsh [options]... command

DESCRIPTION
       pdsh is a variant of the rsh(1) command. Unlike rsh(1), which runs
       commands on a single remote host, pdsh can run multiple remote commands
       in parallel. pdsh uses a "sliding window" (or fanout) of threads to
       conserve resources on the initiating host while allowing some
       connections to time out.

       When pdsh receives SIGINT (ctrl-C), it lists the status of current
       threads. A second SIGINT within one second terminates the program.
       Pending threads may be canceled by issuing ctrl-Z within one second of
       ctrl-C. Pending threads are those that have not yet been initiated, or
       are still in the process of connecting to the remote host.

       If a remote command is not specified on the command line, pdsh runs
       interactively, prompting for commands and executing them when
       terminated with a carriage return. In interactive mode, target nodes
       that time out on the first command are not contacted for subsequent
       commands, and commands prefixed with an exclamation point will be
       executed on the local system.

       The core functionality of pdsh may be supplemented by dynamically
       loadable modules. The modules may provide a new connection protocol
       (replacing the standard rcmd(3) protocol used by rsh(1)), filtering
       options (e.g. removing hosts that are "down" from the target list),
       and/or host selection options (e.g. -a selects all hosts from a
       configuration file). By default, pdsh must have at least one "rcmd"
       module loaded. See the RCMD MODULES section for more information.

RCMD MODULES
       The method by which pdsh runs commands on remote hosts may be selected
       at runtime using the -R option (see OPTIONS below). This functionality
       is ultimately implemented via dynamically loadable modules, and so the
       list of available options may differ from installation to
       installation. A list of currently available rcmd modules is printed
       when using any of the -h, -V, or -L options. The default rcmd module
       is also displayed with the -h and -V options.

       A list of rcmd modules currently distributed with pdsh follows.

       rsh    Uses an internal, thread-safe implementation of BSD rcmd(3) to
              run commands using the standard rsh(1) protocol.

       exec   Executes an arbitrary command for each target host. The first
              of the pdsh remote arguments is the local command to execute,
              followed by any further arguments. Some simple parameters are
              substituted on the command line, including %h for the target
              hostname, %u for the remote username, and %n for the remote
              rank [0-n] (to get a literal % use %%). For example, the
              following would duplicate using the ssh module to run
              hostname(1) across the hosts foo[0-10]:

                 pdsh -R exec -w foo[0-10] ssh -x -l %u %h hostname

              and this command line would run grep(1) in parallel across the
              files console.foo[0-10]:

                 pdsh -R exec -w foo[0-10] grep BUG console.%h

       ssh    Uses a variant of popen(3) to run multiple copies of the ssh(1)
              command.

       mrsh   This module uses the mrsh(1) protocol to execute jobs on remote
              hosts. The mrsh protocol uses credential-based authentication,
              forgoing the need to allocate reserved ports. In other
              respects, it acts just like rsh. Remote nodes must be running
              mrshd(8) in order for the mrsh module to work.

       krb4   The krb4 module allows users to execute remote commands after
              authenticating with Kerberos. The remote rshd daemons must be
              kerberized.

       xcpu   The xcpu module uses the xcpu service to execute remote
              commands.
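
       The parameter substitution described for the exec module can be
       sketched in plain shell. The function below is not pdsh code; its
       name (expand_exec_arg) and argument order are made up for
       illustration, and it only mirrors the documented %h, %u, %n and %%
       rules. How pdsh itself treats any other % sequence is not stated
       here, so the sketch simply copies such text through unchanged.

```shell
# Illustrative only: mimic the exec module's documented substitutions.
# Usage: expand_exec_arg ARG HOST USER RANK   (hypothetical helper)
expand_exec_arg() (
    arg=$1 host=$2 user=$3 rank=$4 out=
    while [ -n "$arg" ]; do
        case $arg in
            %h*) out="${out}${host}"; arg=${arg#"%h"} ;;   # target hostname
            %u*) out="${out}${user}"; arg=${arg#"%u"} ;;   # remote username
            %n*) out="${out}${rank}"; arg=${arg#"%n"} ;;   # remote rank
            %%*) out="${out}%";       arg=${arg#"%%"} ;;   # literal percent
            *)   rest=${arg#?}                             # copy one character
                 out="${out}${arg%"$rest"}"
                 arg=$rest ;;
        esac
    done
    printf '%s\n' "$out"
)

# The grep example from the exec entry, as seen by host foo3 (rank 3):
expand_exec_arg 'console.%h' foo3 alice 3
```

       With the assumed host foo3, the last line prints console.foo3, which
       is the file name grep(1) would receive for that target.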

OPTIONS
       The list of available options is determined at runtime by
       supplementing the list of standard pdsh options with any options
       provided by loaded rcmd and misc modules. In some cases, options
       provided by modules may conflict with each other. In these cases, the
       modules are incompatible and the first module loaded wins.

Standard target nodelist options
       -w TARGETS,...
              Target and/or filter the specified list of hosts. Do not use
              with any other node selection options (e.g. -a, -g, if they are
              available). No spaces are allowed in the comma-separated list.
              Arguments in the TARGETS list may include normal host names, a
              range of hosts in hostlist format (see HOSTLIST EXPRESSIONS), or
              a single `-' character to read the list of hosts on stdin.

              If a host or hostlist is preceded by a `-' character, those
              hosts are explicitly excluded. If the argument is preceded by a
              single `^' character, it is taken to be the path to a file
              containing a list of hosts, one per line. If the item begins
              with a `/' character, it is taken as a regular expression on
              which to filter the list of hosts (a regex argument may also be
              optionally trailed by another `/', e.g. /node.*/). A regex or
              file name argument may also be preceded by a minus `-' to
              exclude instead of include those hosts.

              A list of hosts may also be preceded by "user@" to specify a
              remote username other than the default, or "rcmd_type:" to
              specify an alternate rcmd connection type for these hosts. When
              used together, the rcmd type must be specified first, e.g.
              "ssh:user1@host0" would use ssh to connect to host0 as user
              "user1".

       -x host,host,...
              Exclude the specified hosts. May be specified in conjunction
              with other target node list options such as -a and -g (when
              available). Hostlists may also be specified to the -x option
              (see the HOSTLIST EXPRESSIONS section below). Arguments to -x
              may also be preceded by the filename (`^') and regex (`/')
              characters as described above, in which case the resulting
              hosts are excluded as if they had been given to -w and preceded
              with the minus `-' character.
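
       As a small illustration of the -w syntax described above, the
       following sketch reads targets from a file with the `^' prefix and
       removes one host with the `-' prefix. The host names and the temporary
       file are invented for the example. It uses -q so that pdsh only lists
       the resulting node list, and it falls back to printing the command
       when pdsh is not installed.

```shell
# Hypothetical host names; the host file is a temporary file for this example.
hostfile=$(mktemp)
printf '%s\n' node1 node2 node3 > "$hostfile"

# "^path" reads targets from a file; a leading "-" excludes a host.
targets="^${hostfile},-node2"

if command -v pdsh >/dev/null 2>&1; then
    # -q lists option values and the target nodelist, then exits without action.
    pdsh -q -w "$targets" || echo "pdsh exited with status $?"
else
    printf 'pdsh not found; would run: pdsh -q -w %s\n' "$targets"
fi
```

       The target string is kept in double quotes so that the shell passes it
       to pdsh as a single argument.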

Standard pdsh options
       -S     Return the largest of the remote command return values.

       -h     Output usage message and quit. A list of available rcmd modules
              will also be printed at the end of the usage message.

       -s     Only on AIX, separate remote command stderr and stdout into two
              sockets.

       -q     List option values and the target nodelist and exit without
              action.

       -b     Disable the ctrl-C status feature so that a single ctrl-C kills
              the parallel job (batch mode).

       -l user
              This option may be used to run remote commands as another user,
              subject to authorization. For BSD rcmd, this means the invoking
              user and system must be listed in the user's .rhosts file (even
              for root).

       -t seconds
              Set the connect timeout. Default is 10 seconds. This option may
              also be set via the PDSH_CONNECT_TIMEOUT environment variable.

       -u seconds
              Set a limit on the amount of time a remote command is allowed
              to execute. Default is no limit. See the note in LIMITATIONS if
              using -u with ssh. This option may also be set via the
              PDSH_COMMAND_TIMEOUT environment variable.

       -f number
              Set the maximum number of simultaneous remote commands to
              number. The default is 32.

       -R name
              Set rcmd module to name. This option may also be set via the
              PDSH_RCMD_TYPE environment variable. A list of available rcmd
              modules may be obtained via the -h, -V, or -L options. The
              default will be listed with -h or -V.

       -M name,...
              When multiple misc modules provide the same options to pdsh,
              the first module initialized "wins" and subsequent modules are
              not loaded. The -M option allows a list of modules to be
              specified that will be force-initialized before all others, in
              effect ensuring that they load without conflict (unless they
              conflict with each other). This option may also be set via the
              PDSH_MISC_MODULES environment variable.

       -L     List info on all loaded pdsh modules and quit.

       -N     Disable hostname: prefix on lines of output.

       -d     Include more complete thread status when SIGINT is received,
              and display connect and command time statistics on stderr when
              done.

       -V     Output pdsh version information, along with a list of currently
              loaded modules, and exit.
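
       The fanout and timeout options are often combined. The sketch below
       assembles such a command line; every value and the host range are
       illustrative assumptions, not recommendations. As before, -q keeps
       pdsh from contacting any host.

```shell
# All values below are illustrative.
fanout=16            # -f: at most 16 simultaneous remote commands (default 32)
connect_timeout=5    # -t: connect timeout in seconds (default 10)
command_timeout=60   # -u: limit on remote command run time (default: no limit)

# Keep the hostlist quoted so the shell does not treat [] as a glob pattern.
set -- -f "$fanout" -t "$connect_timeout" -u "$command_timeout" -w 'foo[0-15]'

if command -v pdsh >/dev/null 2>&1; then
    pdsh -q "$@" || echo "pdsh exited with status $?"
else
    printf 'pdsh not found; would run: pdsh -q %s\n' "$*"
fi
```

       The two timeouts could instead be exported as PDSH_CONNECT_TIMEOUT and
       PDSH_COMMAND_TIMEOUT, as noted in the -t and -u entries.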

machines module options
       -a     Target all nodes from machines file.

genders module options
       In addition to the genders options presented below, the genders
       attribute pdsh_rcmd_type may also be used in the genders database to
       specify an rcmd connect type other than the pdsh default for hosts
       with this attribute. For example, the following line in the genders
       file

              host0 pdsh_rcmd_type=ssh

       would cause pdsh to use ssh to connect to host0, even if rsh were the
       default. This can be overridden on the command line with the
       "rcmd_type:host0" syntax.

       -A     Target all nodes in genders database. The -A option will target
              every host listed in genders -- if you want to omit some hosts
              by default, see the -a option below.

       -a     Target all nodes in genders database except those with the
              "pdsh_all_skip" attribute. This is shorthand for running "pdsh
              -A -X pdsh_all_skip ..."

       -g attr[=val][,attr[=val],...]
              Target nodes that match any of the specified genders attributes
              (with optional values). Conflicts with the -a option. If used
              in combination with other node selection options like -w, the
              -g option will select from the supplied node list, instead of
              from the genders file as a whole. This option targets the
              alternate hostnames in the genders database by default. The -i
              option provided by the genders module may be used to translate
              these to the canonical genders hostnames. If the installed
              version of genders supports it, attributes supplied to -g may
              also take the form of genders queries. Genders queries will
              query the genders database for the union, intersection,
              difference, or complement of genders attributes and values. The
              set operation union is represented by two pipe symbols (`||'),
              intersection by two ampersand symbols (`&&'), difference by two
              minus symbols (`--'), and complement by a tilde (`~').
              Parentheses may be used to change the order of operations. See
              the nodeattr(1) manpage for examples of genders queries.

       -X attr[=val][,attr[=val],...]
              Exclude nodes that match any of the specified genders
              attributes (optionally with values). This option may be used in
              combination with any other of the node selection options (e.g.
              -w, -g, -a). Arguments to -X may also take the form of genders
              queries. Please see the documentation for the genders -g option
              for more information about genders queries.

       -i     Request translation between canonical and alternate hostnames.

       -F filename
              Read genders information from filename instead of the system
              default genders file. If filename doesn't specify an absolute
              path then it is taken to be relative to the directory specified
              by the PDSH_GENDERS_DIR environment variable (/etc by default).
              An alternate genders file may also be specified via the
              PDSH_GENDERS_FILE environment variable.
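
       Genders queries contain characters that the shell would otherwise
       interpret (`|', `&', `~', parentheses), so they need quoting. The
       sketch below assumes the genders module is loaded, that the installed
       genders supports queries, and that attributes named "compute",
       "login" and "down" exist; all three names are hypothetical.

```shell
# Attribute names are hypothetical; single quotes keep the shell from
# interpreting ||, && and parentheses.
either='compute||login'          # union: nodes with either attribute
healthy='(compute||login)--down' # difference: drop nodes marked "down"

if command -v pdsh >/dev/null 2>&1; then
    # -q only lists the selected nodes; this fails if the genders module
    # or a genders database is not available.
    pdsh -q -g "$healthy" || echo "pdsh exited with status $?"
else
    printf 'pdsh not found; would run: pdsh -q -g %s\n' "$healthy"
fi
```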

nodeupdown module options
       -v     Eliminate target nodes that are considered "down" by
              libnodeupdown.

slurm module options
       The slurm module allows pdsh to target nodes based on currently
       running SLURM jobs. The slurm module is typically called after all
       other node selection options have been processed, and if no nodes have
       been selected, the module will attempt to read a running jobid from
       the SLURM_JOBID environment variable (which is set when running under
       a SLURM allocation). If SLURM_JOBID references an invalid job, it will
       be silently ignored.

       -j jobid[,jobid,...]
              Target list of nodes allocated to the SLURM job jobid. This
              option may be used multiple times to target multiple SLURM
              jobs. The special argument "all" can be used to target all
              nodes running SLURM jobs, e.g. -j all.

       -P partition[,partition,...]
              Target list of nodes contained in the SLURM partition
              partition. This option may be used multiple times to target
              multiple SLURM partitions and/or partitions may be given in a
              comma-delimited list.

torque module options
       The torque module allows pdsh to target nodes based on currently
       running Torque/PBS jobs. Similar to the slurm module, the torque
       module is typically called after all other node selection options have
       been processed, and if no nodes have been selected, the module will
       attempt to read a running jobid from the PBS_JOBID environment
       variable (which is set when running under a Torque allocation).

       -j jobid[,jobid,...]
              Target list of nodes allocated to the Torque job jobid. This
              option may be used multiple times to target multiple Torque
              jobs.

dshgroup module options
       The dshgroup module allows pdsh to use dsh (or Dancer's shell) style
       group files from /etc/dsh/group/ or ~/.dsh/group/. The default search
       path may be overridden with the DSHGROUP_PATH environment variable, a
       colon-separated list of directories to search. The default value for
       DSHGROUP_PATH is /etc/dsh/group.

       -g groupname,...
              Target nodes in dsh group file "groupname" found in either
              ~/.dsh/group/groupname or /etc/dsh/group/groupname.

       -X groupname,...
              Exclude nodes in dsh group file "groupname".

       As an enhancement in pdsh, dshgroup files may optionally include other
       dshgroup files via a special #include STRING syntax. The argument to
       #include may be either a file path, or a group name, in which case the
       path used to search for the group file is the same as if the group had
       been specified to -g.
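
       The #include syntax can be tried out with a throwaway group directory.
       In the sketch below the directory, group names and host names are all
       invented, and DSHGROUP_PATH is pointed at the temporary directory so
       that no system files are touched. It assumes the dshgroup module is
       loaded.

```shell
# Directory, group and host names are hypothetical.
groupdir=$(mktemp -d)
printf '%s\n' web1 web2 > "$groupdir/web"
printf '%s\n' db1       > "$groupdir/db"

# "all" pulls in both groups by name through pdsh's #include extension.
printf '%s\n' '#include web' '#include db' > "$groupdir/all"

# Search only the temporary directory for group files.
export DSHGROUP_PATH="$groupdir"

if command -v pdsh >/dev/null 2>&1; then
    pdsh -q -g all || echo "pdsh exited with status $?"
else
    printf 'pdsh not found; would run: pdsh -q -g all\n'
fi
```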

netgroup module options
       The netgroup module allows pdsh to use standard netgroup entries
       (from /etc/netgroup or NIS) to build lists of target hosts.

       -g groupname,...
              Target nodes in netgroup "groupname".

       -X groupname,...
              Exclude nodes in netgroup "groupname".

ENVIRONMENT VARIABLES
       PDSH_RCMD_TYPE
              Equivalent to the -R option, the value of this environment
              variable will be used to set the default rcmd module for pdsh
              to use (e.g. ssh, rsh).

       PDSH_SSH_ARGS
              Override the standard arguments that pdsh passes to the ssh(1)
              command ("-2 -a -x -l%u %h"). The use of the parameters %u, %h,
              and %n (as documented in the exec entry of RCMD MODULES above)
              is optional. If these parameters are missing, pdsh will append
              them to the ssh command line because it is assumed they are
              mandatory.

       PDSH_SSH_ARGS_APPEND
              Append additional options to the ssh(1) command invoked by
              pdsh. For example, PDSH_SSH_ARGS_APPEND="-q" would run ssh in
              quiet mode, or "-v" would increase the verbosity of ssh. (Note:
              these arguments are actually prepended to the ssh command line
              to ensure they appear before any target hostname argument to
              ssh.)

       WCOLL  If no other node selection option is used, the WCOLL
              environment variable may be set to a filename from which a list
              of target hosts will be read. The file should contain a list of
              hosts, one per line, though each line may contain a hostlist
              expression (see the HOSTLIST EXPRESSIONS section below).

       DSHPATH
              If set, the path in DSHPATH will be used as the PATH for the
              remote processes.

       FANOUT Set the pdsh fanout (see description of -f above).
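
       These variables are typically exported from a shell profile or a
       wrapper script. The following sketch sets a few of them for a single
       session; the host names and values are assumptions chosen for the
       example, and the ssh module must be available for PDSH_RCMD_TYPE=ssh
       to take effect.

```shell
# Host names and values are illustrative.
hosts=$(mktemp)
printf '%s\n' 'foo[0-3]' bar7 > "$hosts"   # one host or hostlist per line

export WCOLL="$hosts"              # default targets when no -w, -a, ... is given
export PDSH_RCMD_TYPE=ssh          # same effect as -R ssh
export PDSH_SSH_ARGS_APPEND="-q"   # run ssh in quiet mode
export FANOUT=8                    # same effect as -f 8

if command -v pdsh >/dev/null 2>&1; then
    pdsh -q || echo "pdsh exited with status $?"
else
    printf 'pdsh not found; would run: pdsh -q\n'
fi
```

       Because no node selection option is given, pdsh would take its targets
       from the file named by WCOLL.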

HOSTLIST EXPRESSIONS
       As noted in sections above, pdsh accepts lists of hosts in the general
       form prefix[n-m,l-k,...], where n < m and l < k, etc., as an
       alternative to explicit lists of hosts. This form should not be
       confused with regular expression character classes (also denoted by
       `[]'). For example, foo[19] does not represent an expression matching
       foo1 or foo9, but rather represents the degenerate hostlist foo19.

       The hostlist syntax is meant only as a convenience on clusters with a
       "prefixNNN" naming convention, and specification of ranges should not
       be considered necessary -- the list foo1,foo9 could be specified as
       such, or by the hostlist foo[1,9].

       Some examples of usage follow:

       Run command on foo01,foo02,...,foo05
              pdsh -w foo[01-05] command

       Run command on foo7,foo9,foo10
              pdsh -w foo[7,9-10] command

       Run command on foo0,foo4,foo5
              pdsh -w foo[0-5] -x foo[1-3] command

       A suffix on the hostname is also supported:

       Run command on foo0-eth0,foo1-eth0,foo2-eth0,foo3-eth0
              pdsh -w foo[0-3]-eth0 command

       As a reminder to the reader, some shells will interpret brackets (`['
       and `]') for pattern matching. Depending on your shell, it may be
       necessary to enclose ranged lists within quotes. For example, in tcsh,
       the first example above should be executed as:

              pdsh -w "foo[01-05]" command
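
       To make the expansion rules concrete, here is a small shell function
       that expands one bracketed expression the way the examples above
       describe. It is an illustrative sketch, not pdsh's parser: the name
       expand_hostlist is made up, it handles a single bracket group only,
       and it assumes zero padding follows the width of the lower bound.

```shell
# Illustrative only: expand prefix[n-m,l-k,...]suffix, one host per line.
expand_hostlist() (
    spec=$1
    case $spec in
        *\[*\]*) ;;                                  # has a bracketed range list
        *) printf '%s\n' "$spec"; exit 0 ;;          # plain host name
    esac
    prefix=${spec%%\[*}
    rest=${spec#*\[}
    ranges=${rest%%\]*}
    suffix=${rest#*\]}
    IFS=,
    for item in $ranges; do
        lo=${item%%-*}
        hi=${item##*-}                               # equals lo when there is no "-"
        width=${#lo}                                 # keep zero padding, e.g. 01
        lo=${lo#"${lo%%[!0]*}"}; lo=${lo:-0}         # strip leading zeros
        hi=${hi#"${hi%%[!0]*}"}; hi=${hi:-0}
        i=$lo
        while [ "$i" -le "$hi" ]; do
            printf "%s%0${width}d%s\n" "$prefix" "$i" "$suffix"
            i=$((i + 1))
        done
    done
)

expand_hostlist 'foo[01-03]'
expand_hostlist 'foo[19]'
```

       The first call prints foo01, foo02 and foo03 on separate lines; the
       second prints only foo19, matching the degenerate case described
       above.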

ORIGIN
       Originally a rewrite of IBM dsh(1) by Jim Garlick <garlick@llnl.gov>
       on LLNL's ASCI Blue-Pacific IBM SP system. It is now used on Linux
       clusters at LLNL.

LIMITATIONS
       When using ssh for remote execution, expect the stderr of ssh to be
       folded in with that of the remote command. When invoked by pdsh, it is
       not possible for ssh to prompt for passwords (for example, if RSA/DSA
       keys are not configured properly). For ssh implementations that
       support a connect timeout option, pdsh attempts to use that option to
       enforce the timeout (e.g. -oConnectTimeout=T for OpenSSH); otherwise
       connect timeouts are not supported when using ssh. Finally, there is
       no reliable way for pdsh to ensure that remote commands are actually
       terminated when using a command timeout. Thus, if -u is used with ssh,
       commands may be left running on remote hosts even after the timeout
       has killed the local ssh processes.

       The number of nodes that pdsh can simultaneously execute remote jobs
       on is limited by the maximum number of threads that can be created
       concurrently, as well as the availability of reserved ports in the rsh
       module. On systems that implement POSIX threads, the limit is
       typically defined by the constant PTHREAD_THREADS_MAX.

SEE ALSO
       rsh(1), ssh(1), dshbak(1), pdcp(1)

linux-gnu                                                            pdsh(1)