pdsh(1)                     General Commands Manual                    pdsh(1)

NAME

       pdsh - issue commands to groups of hosts in parallel

SYNOPSIS

       pdsh [options]... command

DESCRIPTION

       pdsh is a variant of the rsh(1) command. Unlike rsh(1), which runs
       commands on a single remote host, pdsh can run multiple remote commands
       in parallel. pdsh uses a "sliding window" (or fanout) of threads to
       conserve resources on the initiating host while allowing some
       connections to time out.
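       The sliding window can be pictured as a bounded pool of worker
       threads. The following is a minimal Python sketch of that idea, not
       pdsh's own implementation: run_with_fanout and fake_remote_command are
       hypothetical names, and the default of 32 mirrors the documented
       default of the -f option.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_fanout(hosts, run_one, fanout=32):
    """Run run_one(host) for every host, at most `fanout` at a time.

    `hosts` must be a list. Returns a dict mapping host -> result.
    """
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        # A new host is started as soon as a worker thread frees up, so the
        # "window" of active hosts slides across the list.
        return dict(zip(hosts, pool.map(run_one, hosts)))


if __name__ == "__main__":
    def fake_remote_command(host):
        # Stand-in for a real remote connection (assumption for the demo).
        return f"{host}: ok"

    hosts = [f"foo{i}" for i in range(8)]
    for line in run_with_fanout(hosts, fake_remote_command, fanout=3).values():
        print(line)
```

       A slow or timed-out host occupies only one slot of the window, so the
       remaining slots keep advancing through the host list.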

       When pdsh receives SIGINT (ctrl-C), it lists the status of current
       threads. A second SIGINT within one second terminates the program.
       Pending threads may be canceled by issuing ctrl-Z within one second of
       ctrl-C. Pending threads are those that have not yet been initiated, or
       are still in the process of connecting to the remote host.

       If a remote command is not specified on the command line, pdsh runs
       interactively, prompting for commands and executing them when
       terminated with a carriage return. In interactive mode, target nodes
       that time out on the first command are not contacted for subsequent
       commands, and commands prefixed with an exclamation point are executed
       on the local system.

       The core functionality of pdsh may be supplemented by dynamically
       loadable modules. The modules may provide a new connection protocol
       (replacing the standard rcmd(3) protocol used by rsh(1)), filtering
       options (e.g. removing hosts that are "down" from the target list),
       and/or host selection options (e.g. -a selects all hosts from a
       configuration file). By default, pdsh must have at least one "rcmd"
       module loaded. See the RCMD MODULES section for more information.

RCMD MODULES

       The method by which pdsh runs commands on remote hosts may be selected
       at runtime using the -R option (see OPTIONS below). This functionality
       is ultimately implemented via dynamically loadable modules, so the
       list of available options may differ from installation to
       installation. A list of currently available rcmd modules is printed
       when using any of the -h, -V, or -L options. The default rcmd module
       is also displayed with the -h and -V options.

       A list of rcmd modules currently distributed with pdsh follows.

       rsh     Uses an internal, thread-safe implementation of BSD rcmd(3) to
               run commands using the standard rsh(1) protocol.

       exec    Executes an arbitrary command for each target host. The first
               of the pdsh remote arguments is the local command to execute,
               followed by any further arguments. Some simple parameters are
               substituted on the command line, including %h for the target
               hostname, %u for the remote username, and %n for the remote
               rank [0-n] (to get a literal %, use %%). For example, the
               following would duplicate using the ssh module to run
               hostname(1) across the hosts foo[0-10]:

                  pdsh -R exec -w foo[0-10] ssh -x -l %u %h hostname

               and this command line would run grep(1) in parallel across the
               files console.foo[0-10]:

                  pdsh -R exec -w foo[0-10] grep BUG console.%h

       ssh     Uses a variant of popen(3) to run multiple copies of the ssh(1)
               command.

       mrsh    Uses the mrsh(1) protocol to execute jobs on remote hosts. The
               mrsh protocol uses credential-based authentication, forgoing
               the need to allocate reserved ports. In other respects, it acts
               just like rsh. Remote nodes must be running mrshd(8) for the
               mrsh module to work.

       qsh     Allows pdsh to execute MPI jobs over QsNet. Qshell propagates
               the current working directory, pdsh environment, and Elan
               capabilities to the remote process. The following environment
               variables are also appended to the environment: RMS_RANK,
               RMS_NODEID, RMS_PROCID, RMS_NNODES, and RMS_NPROCS. Since pdsh
               needs to run setuid root for qshell support, qshell does not
               directly support propagation of LD_LIBRARY_PATH and LD_PREOPEN.
               Instead, the QSHELL_REMOTE_LD_LIBRARY_PATH and
               QSHELL_REMOTE_LD_PREOPEN environment variables may be used;
               if set, the qshell daemon remaps them to LD_LIBRARY_PATH and
               LD_PREOPEN.

       mqsh    Similar to qshell, but uses the mrsh protocol instead of the
               rsh protocol.

       krb4    Allows users to execute remote commands after authenticating
               with Kerberos. The remote rshd daemons must be kerberized.

       xcpu    Uses the xcpu service to execute remote commands.
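       The parameter substitution performed by the exec module can be
       sketched in a few lines of Python. This is an illustration of the
       documented %h, %u, %n and %% rules only: substitute and build_commands
       are hypothetical helpers, and leaving any other %-sequence untouched
       is an assumption rather than documented pdsh behavior.

```python
import re


def substitute(arg, host, user, rank):
    """Expand %h, %u, %n and %% in one argument; leave other text as is."""
    table = {"h": host, "u": user, "n": str(rank), "%": "%"}
    # A single left-to-right pass, so "%%h" yields a literal "%h".
    return re.sub(r"%([hun%])", lambda m: table[m.group(1)], arg)


def build_commands(template, hosts, user):
    """Return one argv list per host, with the rank taken from list order."""
    return [[substitute(arg, host, user, rank) for arg in template]
            for rank, host in enumerate(hosts)]


if __name__ == "__main__":
    template = ["grep", "BUG", "console.%h"]
    for argv in build_commands(template, ["foo0", "foo1"], "alice"):
        print(" ".join(argv))
```

       Each resulting argument list corresponds to one local command that
       would be run on behalf of one target host.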

OPTIONS

       The list of available options is determined at runtime by supplementing
       the list of standard pdsh options with any options provided by loaded
       rcmd and misc modules. In some cases, options provided by modules may
       conflict with each other. In these cases, the modules are incompatible
       and the first module loaded wins.

Standard target nodelist options

       -w [rcmd_type:][user@]host,host,...
              Target the specified list of hosts. Do not use with any other
              node selection options (e.g. -a, -g if they are available). No
              spaces are allowed in the comma-separated list. A list
              consisting of a single "-" character causes the target hosts to
              be read from stdin, one per line. The host list may contain
              hostlist expressions of the form "host[1-5,7]". For more
              information about the hostlist format, see the HOSTLIST
              EXPRESSIONS section below. A list of hosts may also be preceded
              by "user@" to specify a remote username other than the default,
              or "rcmd_type:" to specify an alternate rcmd connection type
              for these hosts. When used together, the rcmd type must be
              specified first, e.g. "ssh:user1@host0" would use ssh to
              connect to host0 as user "user1".

       -x host,host,...
              Exclude the specified hosts. May be specified in conjunction
              with other target node list options such as -a and -g (when
              available). Hostlists may also be specified to the -x option
              (see the HOSTLIST EXPRESSIONS section below).
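       The ordering rule for the -w argument (rcmd type first, then user,
       then hosts) can be illustrated with a short Python sketch. parse_target
       is a hypothetical helper, the "rsh" default is chosen only for the
       example, and the sketch assumes that ":" and "@" never appear inside
       the host part itself.

```python
def parse_target(spec, default_rcmd="rsh", default_user=None):
    """Split "[rcmd_type:][user@]hosts" into (rcmd_type, user, hosts)."""
    rcmd, user, hosts = default_rcmd, default_user, spec
    if ":" in hosts:
        # The rcmd type, when present, always comes first.
        rcmd, hosts = hosts.split(":", 1)
    if "@" in hosts:
        user, hosts = hosts.split("@", 1)
    # The host part is returned unsplit because commas may also occur
    # inside a bracketed hostlist expression such as host[1-5,7].
    return rcmd, user, hosts


if __name__ == "__main__":
    print(parse_target("ssh:user1@host0"))
    print(parse_target("user1@host[1-5,7]"))
```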

Standard pdsh options

       -S     Return the largest of the remote command return values.

       -h     Output usage menu and quit. A list of available rcmd modules
              will also be printed at the end of the usage message.

       -s     Only on AIX, separate remote command stderr and stdout into two
              sockets.

       -q     List option values and the target nodelist and exit without
              action.

       -b     Disable the ctrl-C status feature so that a single ctrl-C kills
              the parallel job (batch mode).

       -l user
              This option may be used to run remote commands as another user,
              subject to authorization. For BSD rcmd, this means the invoking
              user and system must be listed in the user's .rhosts file (even
              for root).

       -t seconds
              Set the connect timeout. Default is 10 seconds.

       -u seconds
              Set a limit on the amount of time a remote command is allowed to
              execute. Default is no limit. See note in LIMITATIONS if using
              -u with ssh.

       -f number
              Set the maximum number of simultaneous remote commands to
              number. The default is 32.

       -R name
              Set rcmd module to name. This option may also be set via the
              PDSH_RCMD_TYPE environment variable. A list of available rcmd
              modules may be obtained via the -h, -V, or -L options. The
              default will be listed with -h or -V.

       -M name,...
              When multiple misc modules provide the same options to pdsh, the
              first module initialized "wins" and subsequent modules are not
              loaded. The -M option allows a list of modules to be specified
              that will be force-initialized before all others, in effect
              ensuring that they load without conflict (unless they conflict
              with each other). This option may also be set via the
              PDSH_MISC_MODULES environment variable.

       -L     List info on all loaded pdsh modules and quit.

       -N     Disable hostname: prefix on lines of output.

       -d     Include more complete thread status when SIGINT is received, and
              display connect and command time statistics on stderr when done.

       -V     Output pdsh version information, along with list of currently
              loaded modules, and exit.

qsh/mqsh module options

       -n tasks_per_node
              Set the number of tasks spawned per node. Default is 1.

       -m block | cyclic
              Set block versus cyclic allocation of processes to nodes.
              Default is block.

       -r railmask
              Set the rail bitmask for a job on a multirail system. The
              default railmask is 1, which corresponds to rail 0 only. Each
              bit set in the argument to -r corresponds to a rail on the
              system, so a value of 2 corresponds to rail 1 only, and 3
              selects both rail 0 and rail 1.
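       The relationship between a railmask and the rails it selects is plain
       bit arithmetic. The Python sketch below only illustrates that mapping;
       rails_from_mask and mask_from_rails are hypothetical helper names and
       are not part of pdsh.

```python
def rails_from_mask(railmask):
    """Return the rail numbers selected by a -r style bitmask."""
    return [bit for bit in range(railmask.bit_length())
            if (railmask >> bit) & 1]


def mask_from_rails(rails):
    """Return the bitmask that selects the given rail numbers."""
    mask = 0
    for rail in rails:
        mask |= 1 << rail  # bit N of the mask stands for rail N
    return mask


if __name__ == "__main__":
    for mask in (1, 2, 3):
        print(mask, rails_from_mask(mask))
```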

machines module options

       -a     Target all nodes from machines file.

genders module options

       In addition to the genders options presented below, the genders
       attribute pdsh_rcmd_type may also be used in the genders database to
       specify an rcmd connect type other than the pdsh default for hosts
       with this attribute. For example, the following line in the genders
       file

         host0 pdsh_rcmd_type=ssh

       would cause pdsh to use ssh to connect to host0, even if rsh were the
       default. This can be overridden on the command line with the
       "rcmd_type:host0" syntax.

       -A     Target all nodes in genders database. The -A option will target
              every host listed in genders. To omit some hosts by default,
              see the -a option below.

       -a     Target all nodes in genders database except those with the
              "pdsh_all_skip" attribute. This is shorthand for running "pdsh
              -A -X pdsh_all_skip ..."

       -g attr[=val][,attr[=val],...]
              Target nodes that match any of the specified genders attributes
              (with optional values). Conflicts with -a and -w options. This
              option targets the alternate hostnames in the genders database
              by default. The -i option provided by the genders module may be
              used to translate these to the canonical genders hostnames. If
              the installed version of genders supports it, attributes
              supplied to -g may also take the form of genders queries.
              Genders queries will query the genders database for the union,
              intersection, difference, or complement of genders attributes
              and values. The set operation union is represented by two pipe
              symbols ('||'), intersection by two ampersand symbols ('&&'),
              difference by two minus symbols ('--'), and complement by a
              tilde ('~'). Parentheses may be used to change the order of
              operations. See the nodeattr(1) manpage for examples of genders
              queries.

       -X attr[=val][,attr[=val],...]
              Exclude nodes that match any of the specified genders attributes
              (optionally with values). This option may be used in
              combination with any other of the node selection options (e.g.
              -w, -g, -a). Arguments to -X may also take the form of genders
              queries. See the documentation for the genders -g option for
              more information about genders queries.

       -i     Request translation between canonical and alternate hostnames.

       -F filename
              Read genders information from filename instead of the system
              default genders file. If filename doesn't specify an absolute
              path then it is taken to be relative to the directory specified
              by the PDSH_GENDERS_DIR environment variable (/etc by default).
              An alternate genders file may also be specified via the
              PDSH_GENDERS_FILE environment variable.
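       The four set operations available in genders queries map directly
       onto ordinary set algebra. The Python sketch below shows that mapping
       on made-up data; the GENDERS table, its hostnames and attributes, and
       the nodes helper are all hypothetical, and the sketch does not parse
       query strings or call libgenders.

```python
# Hypothetical genders data: host -> set of attributes.
GENDERS = {
    "foo0": {"login", "pdsh_all_skip"},
    "foo1": {"compute"},
    "foo2": {"compute", "gpu"},
    "foo3": {"compute", "gpu"},
}

ALL = set(GENDERS)


def nodes(attr):
    """Return the set of hosts that carry the given attribute."""
    return {host for host, attrs in GENDERS.items() if attr in attrs}


union = nodes("login") | nodes("gpu")           # query: login||gpu
intersection = nodes("compute") & nodes("gpu")  # query: compute&&gpu
difference = nodes("compute") - nodes("gpu")    # query: compute--gpu
complement = ALL - nodes("compute")             # query: ~compute
dash_a = ALL - nodes("pdsh_all_skip")           # -a, i.e. -A -X pdsh_all_skip

if __name__ == "__main__":
    print(sorted(union), sorted(intersection), sorted(difference))
    print(sorted(complement), sorted(dash_a))
```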

nodeupdown module options

       -v     Eliminate target nodes that are considered "down" by
              libnodeupdown.

slurm module options

       The slurm module allows pdsh to target nodes based on currently running
       SLURM jobs. The slurm module is typically called after all other node
       selection options have been processed, and if no nodes have been
       selected, the module will attempt to read a running jobid from the
       SLURM_JOBID environment variable (which is set when running under a
       SLURM allocation). If SLURM_JOBID references an invalid job, it will be
       silently ignored.

       -j jobid[,jobid,...]
              Target list of nodes allocated to the SLURM job jobid. This
              option may be used multiple times to target multiple SLURM jobs.
              The special argument "all" can be used to target all nodes
              running SLURM jobs, e.g. -j all.

rms module options

       The rms module allows pdsh to target nodes based on an RMS resource.
       The rms module is typically called after all other node selection
       options, and if no nodes have been selected, the module will examine
       the RMS_RESOURCEID environment variable and attempt to set the target
       list of hosts to the nodes in the RMS resource. If an invalid resource
       is denoted, the variable is silently ignored.

SDR module options

       The SDR module supports targeting hosts via the System Data Repository
       on IBM SPs.

       -a     Target all nodes in the SDR. The list is generated from the
              "reliable hostname" in the SDR by default.

       -i     Translate hostnames between reliable and initial in the SDR,
              when applicable. If a target hostname matches either the
              initial or reliable hostname in the SDR, the alternate name will
              be substituted. Thus a list composed of initial hostnames will
              instead be replaced with a list of reliable hostnames. For
              example, when used with -a above, all initial hostnames in the
              SDR are targeted.

       -v     Do not target nodes that are marked as not responding in the SDR
              on the targeted interface. (If a hostname does not appear in the
              SDR, then that name will remain in the target hostlist.)

       -G     In combination with -a, include all partitions.

nodeattr module options

       The nodeattr module supports access to the genders database via the
       nodeattr(1) command. See the genders section above for a list of
       options supported by this module. The option usage with the nodeattr
       module is the same as genders, above, with the exception that the -i
       option may only be used with -a or -g. NOTE: This module will only work
       with very old releases of genders where the nodeattr(1) command
       supports the -r option, and before the libgenders API was available.
       Users running newer versions of genders will need to use the genders
       module instead.

dshgroup module options

       The dshgroup module allows pdsh to use dsh (or Dancer's shell) style
       group files from /etc/dsh/group/ or ~/.dsh/group/.

       -g groupname,...
              Target nodes in dsh group file "groupname" found in either
              ~/.dsh/group/groupname or /etc/dsh/group/groupname.

       -X groupname,...
              Exclude nodes in dsh group file "groupname".

netgroup module options

       The netgroup module allows pdsh to use standard netgroup entries
       (/etc/netgroup or NIS) to build lists of target hosts.

       -g groupname,...
              Target nodes in netgroup "groupname".

       -X groupname,...
              Exclude nodes in netgroup "groupname".

ENVIRONMENT VARIABLES

       PDSH_RCMD_TYPE
              Equivalent to the -R option, the value of this environment
              variable will be used to set the default rcmd module for pdsh to
              use (e.g. ssh, rsh).

       PDSH_SSH_ARGS
              Override the standard arguments that pdsh passes to the ssh(1)
              command ("-2 -a -x").

       PDSH_SSH_ARGS_APPEND
              Append additional options to the ssh(1) command invoked by pdsh.
              For example, PDSH_SSH_ARGS_APPEND="-q" would run ssh in quiet
              mode, or "-v" would increase the verbosity of ssh.

       WCOLL  If no other node selection option is used, the WCOLL environment
              variable may be set to a filename from which a list of target
              hosts will be read. The file should contain a list of hosts, one
              per line, though each line may contain a hostlist expression
              (see the HOSTLIST EXPRESSIONS section below).

       DSHPATH
              If set, the path in DSHPATH will be used as the PATH for the
              remote processes.

       FANOUT Set the pdsh fanout (see the description of -f above).

HOSTLIST EXPRESSIONS

       As noted in sections above, pdsh accepts lists of hosts in the general
       form prefix[n-m,l-k,...], where n < m and l < k, etc., as an
       alternative to explicit lists of hosts. This form should not be
       confused with regular expression character classes (also denoted by
       "[]"). For example, foo[19] does not represent an expression matching
       foo1 or foo9, but rather represents the degenerate hostlist foo19.

       The hostlist syntax is meant only as a convenience on clusters with a
       "prefixNNN" naming convention, and specification of ranges should not
       be considered necessary. Thus foo1,foo9 could be specified as such, or
       by the hostlist foo[1,9].

       Some examples of usage follow:

       Run command on foo01,foo02,...,foo05
            pdsh -w foo[01-05] command

       Run command on foo7,foo9,foo10
            pdsh -w foo[7,9-10] command

       Run command on foo0,foo4,foo5
            pdsh -w foo[0-5] -x foo[1-3] command

       A suffix on the hostname is also supported:

       Run command on foo0-eth0,foo1-eth0,foo2-eth0,foo3-eth0
            pdsh -w foo[0-3]-eth0 command

       As a reminder to the reader, some shells will interpret brackets ('['
       and ']') for pattern matching. Depending on your shell, it may be
       necessary to enclose ranged lists within quotes. For example, in tcsh,
       the first example above should be executed as:

            pdsh -w "foo[01-05]" command
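       The expansion rules illustrated above can be reproduced with a small
       Python sketch. It is not pdsh's hostlist parser: expand and targets are
       hypothetical helpers, only a single bracketed group per expression is
       handled, and comma-separated lists outside the brackets are not split.
       Zero padding follows the width of the lower bound, matching the
       foo[01-05] example.

```python
import re

# prefix, bracket body, and optional suffix of one hostlist expression
_RANGE = re.compile(r"^(.*?)\[([^\]]+)\](.*)$")


def expand(expr):
    """Expand one hostlist expression such as "foo[01-05]-eth0"."""
    match = _RANGE.match(expr)
    if match is None:
        return [expr]  # plain hostname, nothing to expand
    prefix, body, suffix = match.groups()
    hosts = []
    for part in body.split(","):
        lo, _, hi = part.partition("-")
        hi = hi or lo    # a single number is a one-element range
        width = len(lo)  # preserve zero padding: 01 -> foo01
        for n in range(int(lo), int(hi) + 1):
            hosts.append(f"{prefix}{str(n).zfill(width)}{suffix}")
    return hosts


def targets(include, exclude=""):
    """Expand `include`, then drop every host named by `exclude` (like -x)."""
    skip = set(expand(exclude)) if exclude else set()
    return [host for host in expand(include) if host not in skip]


if __name__ == "__main__":
    print(expand("foo[7,9-10]"))
    print(targets("foo[0-5]", "foo[1-3]"))
```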

ORIGIN

       Originally a rewrite of IBM dsh(1) by Jim Garlick <garlick@llnl.gov> on
       LLNL's ASCI Blue-Pacific IBM SP system. It is now used on Linux
       clusters at LLNL.

LIMITATIONS

       When using ssh for remote execution, expect the stderr of ssh to be
       folded in with that of the remote command. When invoked by pdsh, it is
       not possible for ssh to prompt for passwords, so RSA/DSA keys must be
       configured properly. For ssh implementations that support a connect
       timeout option, pdsh attempts to use that option to enforce the
       timeout (e.g. -oConnectTimeout=T for OpenSSH); otherwise connect
       timeouts are not supported when using ssh. Finally, there is no
       reliable way for pdsh to ensure that remote commands are actually
       terminated when using a command timeout. Thus, if -u is used with ssh,
       commands may be left running on remote hosts even after the timeout
       has killed the local ssh processes.

       Output from multiple processes per node may be interspersed when using
       qshell or mqshell rcmd modules.

       The number of nodes that pdsh can simultaneously execute remote jobs on
       is limited by the maximum number of threads that can be created
       concurrently, as well as the availability of reserved ports in the rsh
       and qshell rcmd modules. On systems that implement POSIX threads, the
       limit is typically defined by the constant PTHREAD_THREADS_MAX.

FILES

SEE ALSO

       rsh(1), ssh(1), dshbak(1), pdcp(1)

pdsh-2.22                          linux-gnu                           pdsh(1)