pdsh(1)                     General Commands Manual                    pdsh(1)

NAME

       pdsh - issue commands to groups of hosts in parallel

SYNOPSIS

       pdsh [options]... command

DESCRIPTION

       pdsh is a variant of the rsh(1) command. Unlike rsh(1), which runs com‐
       mands on a single remote host, pdsh can run multiple remote commands in
       parallel.  pdsh  uses a "sliding window" (or fanout) of threads to con‐
       serve resources on the initiating host while allowing some  connections
       to time out.

       When  pdsh  receives  SIGINT  (ctrl-C),  it lists the status of current
       threads. A second SIGINT within  one  second  terminates  the  program.
       Pending  threads may be canceled by issuing ctrl-Z within one second of
       ctrl-C.  Pending threads are those that have not yet been initiated, or
       are still in the process of connecting to the remote host.

       If  a  remote  command  is not specified on the command line, pdsh runs
       interactively, prompting for commands and executing  them  when  termi‐
       nated  with  a  carriage return. In interactive mode, target nodes that
       time out on the first command are not  contacted  for  subsequent  com‐
       mands, and commands prefixed with an exclamation point will be executed
       on the local system.

       The core functionality of pdsh may be supplemented by dynamically load‐
       able  modules.  The  modules  may  provide  a  new  connection protocol
       (replacing the standard rcmd(3) protocol  used  by  rsh(1)),  filtering
       options  (e.g.  removing  hosts  that are "down" from the target list),
       and/or host selection options (e.g., -a selects all hosts from  a  con‐
       figuration  file).  By default, pdsh must have at least one "rcmd" mod‐
       ule loaded. See the RCMD MODULES section for more information.

RCMD MODULES

       The method by which pdsh runs commands on remote hosts may be  selected
       at runtime using the -R option (see OPTIONS below).  This functionality
       is ultimately implemented via dynamically loadable modules, and so  the
       list of available options may be different from installation to instal‐
       lation. A list of currently available  rcmd  modules  is  printed  when
       using  any  of  the -h, -V, or -L options. The default rcmd module will
       also be displayed with the -h and -V options.

       A list of rcmd modules currently distributed with pdsh follows.

       rsh     Uses an internal, thread-safe implementation of BSD rcmd(3)  to
               run commands using the standard rsh(1) protocol.

       exec    Executes  an  arbitrary command for each target host. The first
               of the pdsh remote arguments is the local command  to  execute,
               followed  by  any further arguments. Some simple parameters are
               substituted on the command line, including %h  for  the  target
               hostname,  %u  for  the  remote username, and %n for the remote
               rank [0-n] (to get a literal % use %%).  For example, the  fol‐
               lowing  would duplicate using the ssh module to run hostname(1)
               across the hosts foo[0-10]:

                  pdsh -R exec -w foo[0-10] ssh -x -l %u %h hostname

               and this command line would run grep(1) in parallel across  the
               files console.foo[0-10]:

                  pdsh -R exec -w foo[0-10] grep BUG console.%h

       ssh     Uses a variant of popen(3) to run multiple copies of the ssh(1)
               command.

       mrsh    This module uses the mrsh(1) protocol to execute jobs on remote
               hosts.   The  mrsh protocol uses a credential based authentica‐
               tion, forgoing the need to allocate reserved  ports.  In  other
               aspects,  it  acts  just like rsh. Remote nodes must be running
               mrshd(8) in order for the mrsh module to work.

       qsh     Allows pdsh to execute MPI jobs over QsNet.  Qshell  propagates
               the current working directory, pdsh environment, and Elan capa‐
               bilities to the remote process. The following environment vari‐
               ables  are   also   appended   to  the  environment:  RMS_RANK,
               RMS_NODEID, RMS_PROCID, RMS_NNODES, and RMS_NPROCS. Since  pdsh
               needs  to  run  setuid root for qshell support, qshell does not
               directly support propagation of LD_LIBRARY_PATH and LD_PREOPEN.
               Instead,      the       QSHELL_REMOTE_LD_LIBRARY_PATH       and
               QSHELL_REMOTE_LD_PREOPEN environment variables may be used;  if
               set,  they are remapped to LD_LIBRARY_PATH and LD_PREOPEN by the
               qshell daemon.

       mqsh    Similar to qshell, but uses the mrsh protocol  instead  of  the
               rsh protocol.

       krb4    The  krb4  module allows users to execute remote commands after
               authenticating with Kerberos. Of course, the remote  rshd  dae‐
               mons must be kerberized.

       xcpu    The  xcpu  module  uses the xcpu service to execute remote com‐
               mands.
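
       As an illustration of the parameter substitution described for the exec
       module above, the following sketch imitates how %h, %u, %n and %% could
       be expanded for each target. The helper name exec_subst and the  sample
       hosts  are hypothetical; this is not pdsh's own code, only a model of
       the documented behaviour.

```shell
#!/bin/sh
# Illustration only: mimics the documented %h, %u, %n and %% substitution.
# exec_subst is a hypothetical helper, not part of pdsh.
# Usage: exec_subst HOST USER RANK TEMPLATE
exec_subst() {
    host=$1; user=$2; rank=$3; rest=$4; out=
    while [ -n "$rest" ]; do
        case $rest in
            %%*) out="$out%";     rest=${rest#"%%"} ;;
            %h*) out="$out$host"; rest=${rest#"%h"} ;;
            %u*) out="$out$user"; rest=${rest#"%u"} ;;
            %n*) out="$out$rank"; rest=${rest#"%n"} ;;
            *)   first=${rest%"${rest#?}"}   # first character of $rest
                 out="$out$first"; rest=${rest#?} ;;
        esac
    done
    printf '%s\n' "$out"
}

# Print the local command that would be run for each of three sample hosts.
i=0
for h in foo0 foo1 foo2; do
    exec_subst "$h" "$(id -un)" "$i" 'grep BUG console.%h'
    i=$((i + 1))
done
```

       Each output line is the local command line for one target, which is why
       the exec module can wrap arbitrary local tools as well as ssh.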

OPTIONS

       The list of available options is determined at runtime by supplementing
       the  list  of standard pdsh options with any options provided by loaded
       rcmd and misc modules.  In some cases, options provided by modules  may
       conflict  with each other. In these cases, the modules are incompatible
       and the first module loaded wins.

Standard target nodelist options

       -w TARGETS,...
              Target and/or filter the specified list of  hosts.  Do  not  use
              with  any other node selection options (e.g. -a, -g, if they are
              available). No spaces are allowed in the  comma-separated  list.
              Arguments  in  the TARGETS list may include normal host names, a
              range of hosts in hostlist format (see HOSTLIST EXPRESSIONS), or
              a single `-' character to read the list of hosts on stdin.

              If  a  host  or  hostlist  is  preceded by a `-' character, this
              causes those hosts to be explicitly excluded. If the argument is
              preceded  by  a single `^' character, it is taken to be the path
              to a file containing a list of hosts, one per line. If the  item
              begins  with  a `/' character, it is taken  as a regular expres‐
              sion on which to filter the list of hosts (a regex argument  may
              also  be  optionally  trailed by another `/', e.g.  /node.*/). A
              regex or file name argument may also be preceded by a minus  `-'
              to exclude instead of include those hosts.

              A  list  of  hosts  may also be preceded by "user@" to specify a
              remote username other than the default, or "rcmd_type:" to spec‐
              ify an alternate rcmd connection type for these hosts. When used
              together,  the  rcmd  type  must  be   specified   first,   e.g.
              "ssh:user1@host0"  would  use  ssh  to  connect to host0 as user
              "user1."

       -x host,host,...
              Exclude the specified hosts. May  be  specified  in  conjunction
              with  other  target  node  list  options such as -a and -g (when
              available). Hostlists may also be specified  to  the  -x  option
              (see  the  HOSTLIST  EXPRESSIONS section below). Arguments to -x
              may also be preceded by the  filename  (`^')  and  regex   (`/')
              characters as described above, in which case the resulting hosts
              are excluded as if they had been given to -w and  preceded  with
              the minus `-' character.
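
       The prefix rules for -w arguments described above can be summarised  by
       the  following sketch. It only classifies a single argument by its lead‐
       ing characters; the function name classify_target is hypothetical,  and
       the "user@" and "rcmd_type:" prefixes are deliberately not handled.

```shell
#!/bin/sh
# Illustration only: classify one -w argument by the documented prefixes.
# classify_target is a hypothetical helper, not part of pdsh.
classify_target() {
    arg=$1
    action=include
    if [ "$arg" = "-" ]; then
        echo "read host list from stdin"
        return
    fi
    case $arg in
        -*) action=exclude; arg=${arg#-} ;;   # leading minus: exclude
    esac
    case $arg in
        ^*) echo "$action hosts listed in file ${arg#"^"}" ;;
        /*) re=${arg#/}; re=${re%/}           # optional trailing slash
            echo "$action hosts matching regex $re" ;;
        *)  echo "$action host or hostlist $arg" ;;
    esac
}

classify_target 'foo[0-5]'
classify_target '-foo[1-3]'
classify_target '^/tmp/hosts'
classify_target '-/node.*/'
```

       The path /tmp/hosts is a made-up example. Quoting the arguments as shown
       keeps the shell from interpreting brackets and asterisks itself.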

Standard pdsh options

       -S     Return the largest of the remote command return values.

       -h     Output  usage  menu  and  quit. A list of available rcmd modules
              will also be printed at the end of the usage message.

       -s     Only on AIX, separate remote command stderr and stdout into  two
              sockets.

       -q     List  option  values  and  the  target nodelist and exit without
              action.

       -b     Disable ctrl-C status feature so that a single ctrl-C kills par‐
              allel job. (Batch Mode)

       -l user
              This  option may be used to run remote commands as another user,
              subject to authorization. For BSD rcmd, this means the  invoking
              user  and system must be listed in the user's .rhosts file (even
              for root).

       -t seconds
              Set the connect timeout. Default is 10 seconds.

       -u seconds
              Set a limit on the amount of time a remote command is allowed to
              execute.   Default is no limit. See note in LIMITATIONS if using
              -u with ssh.

       -f number
              Set the maximum number of simultaneous remote commands  to  num‐
              ber.  The default is 32.

       -R name
              Set  rcmd  module  to  name. This option may also be set via the
              PDSH_RCMD_TYPE environment variable. A list  of  available  rcmd
              modules  may  be  obtained  via  the -h, -V, or -L options.  The
              default will be listed with -h or -V.

       -M name,...
              When multiple misc modules provide the same options to pdsh, the
              first  module  initialized "wins" and subsequent modules are not
              loaded.  The -M option allows a list of modules to be  specified
              that  will  be  force-initialized  before all others, in effect
              ensuring that they load without conflict (unless  they  conflict
              with  each  other).  This  option  may  also  be  set  via   the
              PDSH_MISC_MODULES environment variable.

       -L     List info on all loaded pdsh modules and quit.

       -N     Disable hostname: prefix on lines of output.

       -d     Include more complete thread status when SIGINT is received, and
              display connect and command time statistics on stderr when done.

       -V     Output  pdsh  version  information, along with list of currently
              loaded modules, and exit.
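
       To make the meaning of -S concrete, the sketch below computes the  value
       that  option  is  documented  to return: the largest of a set of remote
       return values. The function largest_rc and the sample return codes  are
       hypothetical and stand in for what pdsh collects internally.

```shell
#!/bin/sh
# Illustration only: the value -S is documented to return is the largest
# of the remote return values. largest_rc is a hypothetical helper.
largest_rc() {
    max=0
    for rc in "$@"; do
        if [ "$rc" -gt "$max" ]; then
            max=$rc
        fi
    done
    echo "$max"
}

# Three hosts succeeded, one returned 2.
largest_rc 0 0 2 0
```

       In a script this allows a check such as "pdsh -S -w hosts command ||
       echo failed" to react when any single remote command fails.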

qsh/mqsh module options

       -n tasks_per_node
              Set the number of tasks spawned per node. Default is 1.

       -m block | cyclic
              Set block  versus  cyclic  allocation  of  processes  to  nodes.
              Default is block.

       -r railmask
              Set  the  rail  bitmask  for  a  job  on a multirail system. The
              default railmask is 1, which corresponds to rail  0  only.  Each
              bit  set in the argument to -r corresponds to a rail on the sys‐
              tem, so a value of 2 would correspond to  rail  1  only,  and  3
              would indicate to use both rail 1 and rail 0.
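
       The railmask arithmetic described for -r can be checked with a few lines
       of  shell.  The  sketch below decodes a mask into the rail numbers whose
       bits are set; the function name rails_for_mask is hypothetical.

```shell
#!/bin/sh
# Illustration only: list the rails selected by a -r railmask value.
# Bit 0 of the mask is rail 0, bit 1 is rail 1, and so on.
rails_for_mask() {
    mask=$1
    rail=0
    out=
    while [ "$mask" -gt 0 ]; do
        if [ $((mask % 2)) -eq 1 ]; then
            out="${out:+$out }$rail"
        fi
        mask=$((mask / 2))
        rail=$((rail + 1))
    done
    echo "$out"
}

for m in 1 2 3; do
    echo "railmask $m -> rails: $(rails_for_mask "$m")"
done
```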

machines module options

       -a     Target all nodes from machines file.

genders module options

       In  addition  to  the  genders  options  presented  below,  the genders
       attribute pdsh_rcmd_type may also be used in the  genders  database  to
       specify an rcmd connect type other than the pdsh default for hosts with
       this attribute. For example, the following line in the genders file

         host0 pdsh_rcmd_type=ssh

       would  cause  pdsh to use ssh to connect to host0, even if rsh were the
       default.  This  can  be  overridden  on  the  command  line  with   the
       "rcmd_type:host0" syntax.

       -A     Target  all nodes in genders database. The -A option will target
              every host listed in genders -- if you want to omit  some  hosts
              by default, see the -a option below.

       -a     Target  all  nodes  in  genders  database  except those with the
              "pdsh_all_skip" attribute. This is shorthand for  running  "pdsh
              -A -X pdsh_all_skip ..."

       -g attr[=val][,attr[=val],...]
              Target  nodes that match any of the specified genders attributes
              (with optional values). Conflicts with the -a option. If used in
              combination  with  other  node selection options like -w, the -g
              option will select from the supplied node list, instead of  from
              the  genders file as a whole. Otherwise, this option targets the
              alternate hostnames in the genders database by default.  The  -i
              option  provided  by the genders module may be used to translate
              these to the canonical genders hostnames. If the installed  ver‐
              sion  of genders supports it, attributes supplied to -g may also
              take the form of genders queries. Genders queries will query the
              genders  database  for  the  union, intersection, difference, or
              complement of genders attributes and values.  The set  operation
              union is represented by two pipe symbols ('||'), intersection by
              two ampersand symbols ('&&'), difference by  two  minus  symbols
              ('--'),  and  complement  by  a tilde ('~').  Parentheses may be
              used to change the order of operations. See the nodeattr(1) man‐
              page for examples of genders queries.

       -X attr[=val][,attr[=val],...]
              Exclude nodes that match any of the specified genders attributes
              (optionally with values).  This option may be used  in  combina‐
              tion  with any other of the node selection options (e.g. -w, -g,
              -a). Arguments to -X may also take the form of genders  queries.
              Please  see  documentation  for  the genders -g option for more
              information about genders queries.

       -i     Request translation between canonical and alternate hostnames.

       -F filename
              Read genders information from filename  instead  of  the  system
              default  genders  file.  If filename doesn't specify an absolute
              path then it is taken to be relative to the directory  specified
              by  the PDSH_GENDERS_DIR environment variable (/etc by default).
              An  alternate  genders  file  may  also  be  specified  via  the
              PDSH_GENDERS_FILE environment variable.
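
       The difference between -A and -a described above  can  be  modelled  on
       a small sample file. The sketch assumes the usual genders file layout of
       a node name followed by a comma-separated attribute list; the host names
       and  the "compute" and "login" attributes are invented, and the helper
       functions are hypothetical stand-ins that do not call genders or  pdsh.

```shell
#!/bin/sh
# Illustration only: model -A versus -a on a sample genders-style file.
# Assumed line format: "nodename attr[=val],attr[=val],..."
genders_file=$(mktemp)
cat > "$genders_file" <<'EOF'
host0 compute,pdsh_rcmd_type=ssh
host1 compute
host2 login,pdsh_all_skip
EOF

# Like -A: every node listed in the file.
nodes_all() {
    awk '{ print $1 }' "$1"
}

# Like -a: every node except those carrying pdsh_all_skip.
nodes_all_default() {
    awk '$2 !~ /(^|,)pdsh_all_skip(,|=|$)/ { print $1 }' "$1"
}

echo "-A would target: $(nodes_all "$genders_file" | tr '\n' ' ')"
echo "-a would target: $(nodes_all_default "$genders_file" | tr '\n' ' ')"
```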

nodeupdown module options

       -v     Eliminate  target nodes that are considered "down" by libnodeup‐
              down.

slurm module options

       The slurm module allows pdsh to target nodes based on currently running
       SLURM  jobs.  The slurm module is typically called after all other node
       selection options have been  processed,  and  if  no  nodes  have  been
       selected,  the  module  will  attempt  to read a running jobid from the
       SLURM_JOBID environment variable (which is set  when  running  under  a
       SLURM allocation). If SLURM_JOBID references an invalid job, it will be
       silently ignored.

       -j jobid[,jobid,...]
              Target list of nodes allocated to  the  SLURM  job  jobid.  This
              option may be used multiple times to target multiple SLURM jobs.
              The special argument "all" can be used to target all nodes  run‐
              ning SLURM jobs, e.g.  -j all.

       -P partition[,partition,...]
              Target  list  of  nodes  contained in the SLURM partition parti‐
              tion.  This option may be used multiple times to target multiple
              SLURM  partitions  and/or  partitions  may  be given in a comma-
              delimited list.

torque module options

       The torque module allows pdsh to target nodes based on  currently  run‐
       ning Torque/PBS jobs. Similar to the slurm module, the torque module is
       typically called after all other node selection options have been  pro‐
       cessed,  and if no nodes have been selected, the module will attempt to
       read a running jobid from the PBS_JOBID environment variable (which  is
       set when running under a Torque allocation).

       -j jobid[,jobid,...]
              Target  list  of  nodes  allocated to the Torque job jobid. This
              option may be used multiple  times  to  target  multiple  Torque
              jobs.

rms module options

       The  rms  module  allows pdsh to target nodes based on an RMS resource.
       The rms module is typically  called  after  all  other  node  selection
       options,  and  if  no nodes have been selected, the module will examine
       the RMS_RESOURCEID environment variable and attempt to set  the  target
       list  of hosts to the nodes in the RMS resource. If an invalid resource
       is denoted, the variable is silently ignored.

SDR module options

       The SDR module supports targeting hosts via the System Data  Repository
       on IBM SPs.

       -a     Target  all  nodes  in  the  SDR. The list is generated from the
              "reliable hostname" in the SDR by default.

       -i     Translate hostnames between reliable and  initial  in  the  SDR,
              when  applicable.   If  a  target  hostname  matches  either the
              initial or reliable hostname in the SDR, the alternate name will
              be  substituted.  Thus a list composed of initial hostnames will
              instead be replaced with a  list  of  reliable  hostnames.   For
              example,  when  used with -a above, all initial hostnames in the
              SDR are targeted.

       -v     Do not target nodes that are marked as not responding in the SDR
              on the targeted interface. (If a hostname does not appear in the
              SDR, then that name will remain in the target hostlist.)

       -G     In combination with -a, include all partitions.

nodeattr module options

       The nodeattr module supports access to the  genders  database  via  the
       nodeattr(1)  command.  See the genders section above for a list of sup‐
       ported options with this module. The option usage with the nodeattr mod‐
       ule  is  the  same  as  genders,  above, with the exception that the -i
       option may only be used with -a or -g. NOTE: This module will only work
       with  very  old  releases of genders where the nodeattr(1) command sup‐
       ports the -r option, and before the libgenders API was available. Users
       running  newer  versions of genders will need to use the genders module
       instead.

dshgroup module options

       The dshgroup module allows pdsh to use dsh (or  Dancer's  shell)  style
       group  files  from /etc/dsh/group/ or ~/.dsh/group/. The default search
       path may be overridden with the DSHGROUP_PATH environment  variable,  a
       colon-separated  list  of  directories to search. The default value for
       DSHGROUP_PATH is /etc/dsh/group.

       -g groupname,...
              Target nodes in dsh  group  file  "groupname"  found  in  either
              ~/.dsh/group/groupname or /etc/dsh/group/groupname.

       -X groupname,...
              Exclude nodes in dsh group file "groupname."

       As  an enhancement in pdsh, dshgroup files may optionally include other
       dshgroup files via a special #include STRING syntax.  The  argument  to
       #include  may be either a file path, or a group name, in which case the
       path used to search for the group file is the same as if the group  had
       been specified to -g.
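
       The #include behaviour described above can be pictured with  the  fol‐
       lowing  sketch,  which expands a group file recursively. It is a model
       only: the function expand_group, the temporary directory and the  group
       names are hypothetical, a single search directory is used instead of the
       full DSHGROUP_PATH, and treating other lines that start  with  `#'  as
       comments is an assumption.

```shell
#!/bin/sh
# Illustration only: expand a dsh-style group file, following "#include".
# expand_group DIR NAME_OR_PATH prints one host per line.
include_kw='#include '
expand_group() {
    case $2 in
        */*) set -- "$1" "$2" "$2" ;;       # argument is a file path
        *)   set -- "$1" "$2" "$1/$2" ;;    # argument is a group name
    esac
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            "$include_kw"*) expand_group "$1" "${line#"$include_kw"}" ;;
            "#"*|"")        ;;              # assumed: comments and blanks
            *)              printf '%s\n' "$line" ;;
        esac
    done < "$3"
}

# Build two sample group files; "all" includes "compute".
groupdir=$(mktemp -d)
printf '%s\n' foo0 foo1 > "$groupdir/compute"
printf '%s\n' '#include compute' login0 > "$groupdir/all"

expand_group "$groupdir" all
```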

netgroup module options

       The  netgroup  module  allows  pdsh to use standard netgroup entries to
       build lists of target hosts. (/etc/netgroup or NIS)

       -g groupname,...
              Target nodes in netgroup "groupname."

       -X groupname,...
              Exclude nodes in netgroup "groupname."

ENVIRONMENT VARIABLES

       PDSH_RCMD_TYPE
              Equivalent to the -R option, the value of this environment vari‐
              able will be used to set the default rcmd module for pdsh to use
              (e.g. ssh, rsh).

       PDSH_SSH_ARGS
              Override the standard arguments that pdsh passes to  the  ssh(1)
              command  ("-2 -a -x -l%u %h"). The use of the parameters %u, %h,
              and %n  (as  documented  in  the  rcmd/exec  section  above)  is
              optional. If these parameters are missing, pdsh will append them
              to the ssh command line because they are assumed  to  be  manda‐
              tory.

       PDSH_SSH_ARGS_APPEND
              Append additional options to the ssh(1) command invoked by pdsh.
              For  example,  PDSH_SSH_ARGS_APPEND="-q"  would run ssh in quiet
              mode, or "-v" would increase the verbosity of ssh. (Note:  these
              arguments  are  actually  prepended  to the ssh command line to
              ensure they appear before any target hostname argument to ssh.)

       WCOLL  If no other node selection option is used, the WCOLL environment
              variable  may  be  set to a filename from which a list of target
              hosts will be read. The file should contain a list of hosts, one
              per  line  (though  each line may contain a hostlist expression.
              See HOSTLIST EXPRESSIONS section below).

       DSHPATH
              If set, the path in DSHPATH will be used as  the  PATH  for  the
              remote processes.

       FANOUT Set the pdsh fanout (see description of -f above).
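
       As a sketch of how the variables above combine, the  following  script
       prepares  a WCOLL file and sets a default rcmd module and fanout before
       asking pdsh to list its settings with -q. The host names and the fanout
       value  are  invented  for  illustration,  and  the  script falls back to
       a message when pdsh is not installed.

```shell
#!/bin/sh
# Illustration only: drive pdsh through its environment variables.
hosts_file=$(mktemp)
printf '%s\n' 'foo[0-3]' 'bar7' > "$hosts_file"   # one entry per line

WCOLL=$hosts_file          # target list used when no other selection is given
PDSH_RCMD_TYPE=ssh         # same effect as -R ssh
FANOUT=16                  # same effect as -f 16
export WCOLL PDSH_RCMD_TYPE FANOUT

if command -v pdsh >/dev/null 2>&1; then
    # -q lists option values and the target nodelist, then exits.
    pdsh -q
else
    echo "pdsh not installed; WCOLL lists $(grep -c . "$WCOLL") entries"
fi
```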

HOSTLIST EXPRESSIONS

       As  noted  in  sections above, pdsh accepts lists of hosts of the gen‐
       eral form: prefix[n-m,l-k,...], where n < m and l < k, etc., as an al‐
       ternative to explicit lists of hosts. This form should not be  confused
       with regular expression character classes (also denoted by ``[]''). For
       example,  foo[19]  does  not  represent  an expression matching foo1 or
       foo9, but rather represents the degenerate hostlist: foo19.

       The hostlist syntax is meant only as a convenience on clusters  with  a
       "prefixNNN" naming convention and specification of ranges should not be
       considered necessary -- thus foo1,foo9 could be specified as  such,  or
       by the hostlist foo[1,9].

       Some examples of usage follow:

       Run command on foo01,foo02,...,foo05
           pdsh -w foo[01-05] command

       Run command on foo7,foo9,foo10
           pdsh -w foo[7,9-10] command

       Run command on foo0,foo4,foo5
           pdsh -w foo[0-5] -x foo[1-3] command

       A suffix on the hostname is also supported:

       Run command on foo0-eth0,foo1-eth0,foo2-eth0,foo3-eth0
           pdsh -w foo[0-3]-eth0 command

       As  a  reminder to the reader, some shells will interpret brackets (`['
       and `]') for pattern matching.  Depending on your shell, it may be nec‐
       essary  to  enclose  ranged lists within quotes.  For example, in tcsh,
       the first example above should be executed as:

           pdsh -w "foo[01-05]" command
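
       The expansion rules above can be tried out with the  following  sketch,
       which expands a single prefix[ranges]suffix expression into  host  names.
       It  is  not pdsh's parser: the function expand_hostlist is hypothetical,
       it handles only one bracketed list per argument,  and  the  zero-padding
       rule  (pad  to the width of the lower bound) is inferred from the
       foo[01-05] example.

```shell
#!/bin/sh
# Illustration only: expand one "prefix[n-m,k,...]suffix" expression.
# expand_hostlist is a hypothetical helper, not pdsh's own parser.
expand_hostlist() {
    printf '%s\n' "$1" | awk '
    {
        lb = index($0, "[")
        rb = index($0, "]")
        if (lb == 0 || rb < lb) { print; next }   # no bracket list
        prefix = substr($0, 1, lb - 1)
        suffix = substr($0, rb + 1)
        n = split(substr($0, lb + 1, rb - lb - 1), parts, ",")
        for (i = 1; i <= n; i++) {
            lo = parts[i]; hi = parts[i]
            if (split(parts[i], ends, "-") == 2) { lo = ends[1]; hi = ends[2] }
            # Pad to the width of the lower bound, e.g. 01 -> two digits.
            fmt = "%s%0" length(lo) "d%s\n"
            for (v = lo + 0; v <= hi + 0; v++)
                printf fmt, prefix, v, suffix
        }
    }'
}

expand_hostlist 'foo[01-05]'
expand_hostlist 'foo[7,9-10]'
expand_hostlist 'foo[0-3]-eth0'
expand_hostlist 'foo[19]'
```

       The arguments are single-quoted so that the shell passes  the  brackets
       through unchanged, as recommended above.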

ORIGIN

       Originally a rewrite of IBM dsh(1) by Jim Garlick <garlick@llnl.gov> on
       LLNL's  ASCI  Blue-Pacific IBM SP system. It is now used on Linux clus‐
       ters at LLNL.

LIMITATIONS

       When using ssh for remote execution, expect the stderr  of  ssh  to  be
       folded  in with that of the remote command. When invoked by pdsh, ssh
       cannot prompt for passwords (for example, when RSA/DSA keys are not con‐
       figured  properly).  For  ssh implementations that support a connect
       timeout option, pdsh attempts to use that option to enforce the timeout
       (e.g. -oConnectTimeout=T for OpenSSH); otherwise, connect timeouts  are
       not  supported  when using ssh. Finally, there is no reliable way for
       pdsh to ensure that remote commands are actually terminated when  using
       a command timeout. Thus, if -u is used with ssh, commands may  be  left
       running  on  remote  hosts  even after the timeout has killed the local
       ssh processes.

       Output from multiple processes per node may be interspersed when  using
       qshell or mqshell rcmd modules.

       The number of nodes that pdsh can simultaneously execute remote jobs on
       is limited by the maximum number of threads that can be created concur‐
       rently,  as  well  as the availability of reserved ports in the rsh and
       qshell rcmd modules. On systems that implement POSIX threads, the limit
       is typically defined by the constant PTHREAD_THREADS_MAX.

FILES

SEE ALSO

       rsh(1), ssh(1), dshbak(1), pdcp(1)

pdsh-2.31                          linux-gnu                           pdsh(1)