pdsh(1)                     General Commands Manual                    pdsh(1)

NAME

       pdsh - issue commands to groups of hosts in parallel

SYNOPSIS

       pdsh [options]... command

DESCRIPTION

       pdsh is a variant of the rsh(1) command. Unlike rsh(1), which runs
       commands on a single remote host, pdsh can run multiple remote
       commands in parallel. pdsh uses a "sliding window" (or fanout) of
       threads to conserve resources on the initiating host while allowing
       some connections to time out.

       When pdsh receives SIGINT (ctrl-C), it lists the status of current
       threads. A second SIGINT within one second terminates the program.
       Pending threads may be canceled by issuing ctrl-Z within one second
       of ctrl-C. Pending threads are those that have not yet been
       initiated, or are still in the process of connecting to the remote
       host.

       If a remote command is not specified on the command line, pdsh runs
       interactively, prompting for commands and executing them when
       terminated with a carriage return. In interactive mode, target nodes
       that time out on the first command are not contacted for subsequent
       commands, and commands prefixed with an exclamation point are
       executed on the local system.

       The core functionality of pdsh may be supplemented by dynamically
       loadable modules. The modules may provide a new connection protocol
       (replacing the standard rcmd(3) protocol used by rsh(1)), filtering
       options (e.g., removing hosts that are "down" from the target list),
       and/or host selection options (e.g., -a selects all hosts from a
       configuration file). By default, pdsh must have at least one "rcmd"
       module loaded. See the RCMD MODULES section for more information.
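
       The "sliding window" (fanout) idea described above can be pictured
       with a small Python sketch. This is only an illustration, not how
       pdsh is implemented: it assumes a thread pool is an acceptable
       stand-in for the window, and run_with_fanout and fake_remote are
       hypothetical names. The default of 32 matches the -f default.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_with_fanout(hosts, run_one, fanout=32):
    """Call run_one(host) for every host, at most `fanout` at a time."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        # pool.map() yields results in the same order as `hosts`.
        return dict(zip(hosts, pool.map(run_one, hosts)))

# Hypothetical stand-in for a remote command; it only tracks concurrency.
_lock = threading.Lock()
_active = 0
peak = 0

def fake_remote(host):
    global _active, peak
    with _lock:
        _active += 1
        peak = max(peak, _active)
    time.sleep(0.01)  # pretend the remote command takes a moment
    with _lock:
        _active -= 1
    return 0  # pretend return value

results = run_with_fanout([f"foo{i}" for i in range(10)], fake_remote, fanout=3)
```

       With ten hosts and a fanout of 3, no more than three calls are in
       flight at once; a new host is started as soon as a slot frees up.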

RCMD MODULES

       The method by which pdsh runs commands on remote hosts may be
       selected at runtime using the -R option (see OPTIONS below). This
       functionality is ultimately implemented via dynamically loadable
       modules, and so the list of available options may differ from
       installation to installation. A list of currently available rcmd
       modules is printed when using any of the -h, -V, or -L options. The
       default rcmd module is also displayed with the -h and -V options.

       A list of rcmd modules currently distributed with pdsh follows.

       rsh     Uses an internal, thread-safe implementation of BSD rcmd(3) to
               run commands using the standard rsh(1) protocol.

       ssh     Uses a variant of popen(3) to run multiple copies of the ssh(1)
               command.

       mrsh    This module uses the mrsh(1) protocol to execute jobs on remote
               hosts. The mrsh protocol uses credential-based authentication,
               forgoing the need to allocate reserved ports. In other
               respects, it acts just like rsh. Remote nodes must be running
               mrshd(8) in order for the mrsh module to work.

       qsh     Allows pdsh to execute MPI jobs over QsNet. Qshell propagates
               the current working directory, pdsh environment, and Elan
               capabilities to the remote process. The following environment
               variables are also appended to the environment: RMS_RANK,
               RMS_NODEID, RMS_PROCID, RMS_NNODES, and RMS_NPROCS. Since pdsh
               needs to run setuid root for qshell support, qshell does not
               directly support propagation of LD_LIBRARY_PATH and LD_PREOPEN.
               Instead, the QSHELL_REMOTE_LD_LIBRARY_PATH and
               QSHELL_REMOTE_LD_PREOPEN environment variables may be used; if
               set, they are remapped to LD_LIBRARY_PATH and LD_PREOPEN by the
               qshell daemon.

       mqsh    Similar to qshell, but uses the mrsh protocol instead of the
               rsh protocol.

       krb4    The krb4 module allows users to execute remote commands after
               authenticating with kerberos. The remote rshd daemons must be
               kerberized.

       xcpu    The xcpu module uses the xcpu service to execute remote
               commands.

OPTIONS

       The list of available options is determined at runtime by
       supplementing the list of standard pdsh options with any options
       provided by loaded rcmd and misc modules. In some cases, options
       provided by modules may conflict with each other. In these cases, the
       modules are incompatible and the first module loaded wins.

Standard target nodelist options

       -w [rcmd_type:][user@]host,host,...
              Target the specified list of hosts. Do not use with any other
              node selection options (e.g. -a, -g if they are available). No
              spaces are allowed in the comma-separated list. A list
              consisting of a single `-' character causes the target hosts to
              be read from stdin, one per line. The host list may contain
              hostlist expressions of the form ``host[1-5,7]''. For more
              information about the hostlist format, see the HOSTLIST
              EXPRESSIONS section below. A list of hosts may also be preceded
              by "user@" to specify a remote username other than the default,
              or "rcmd_type:" to specify an alternate rcmd connection type
              for these hosts. When used together, the rcmd type must be
              specified first, e.g. "ssh:user1@host0" would use ssh to
              connect to host0 as user "user1".

       -x host,host,...
              Exclude the specified hosts. May be specified in conjunction
              with other target node list options such as -a and -g (when
              available). Hostlists may also be specified to the -x option
              (see the HOSTLIST EXPRESSIONS section below).
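
       The "[rcmd_type:][user@]host,..." form accepted by -w can be split
       apart as in the following Python sketch. It is an illustration under
       simplifying assumptions, not pdsh's parser: parse_target is a
       hypothetical name, one prefix is assumed to apply to the whole
       argument, and the host part is returned unsplit because commas may
       also appear inside hostlist brackets.

```python
def parse_target(spec):
    """Split a -w style argument into (rcmd_type, user, hosts).

    Missing parts are returned as None. The rcmd type, when present,
    must come first, as in "ssh:user1@host0".
    """
    rcmd_type, user, hosts = None, None, spec
    head, sep, rest = hosts.partition(":")
    if sep:
        rcmd_type, hosts = head, rest
    head, sep, rest = hosts.partition("@")
    if sep:
        user, hosts = head, rest
    return rcmd_type, user, hosts

example = parse_target("ssh:user1@host0")
```

       Here example holds the rcmd type "ssh", the user "user1", and the
       host "host0", matching the example given for -w.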

Standard pdsh options

       -S     Return the largest of the remote command return values.

       -h     Output usage menu and quit. A list of available rcmd modules
              will also be printed at the end of the usage message.

       -s     Only on AIX, separate remote command stderr and stdout into two
              sockets.

       -q     List option values and the target nodelist and exit without
              action.

       -b     Disable the ctrl-C status feature so that a single ctrl-C kills
              the parallel job. (Batch Mode)

       -l user
              This option may be used to run remote commands as another user,
              subject to authorization. For BSD rcmd, this means the invoking
              user and system must be listed in the user's .rhosts file (even
              for root).

       -t seconds
              Set the connect timeout. Default is 10 seconds.

       -u seconds
              Set a limit on the amount of time a remote command is allowed to
              execute. Default is no limit. See the note in LIMITATIONS if
              using -u with ssh.

       -f number
              Set the maximum number of simultaneous remote commands to
              number. The default is 32.

       -R name
              Set rcmd module to name. This option may also be set via the
              PDSH_RCMD_TYPE environment variable. A list of available rcmd
              modules may be obtained via the -h, -V, or -L options. The
              default will be listed with -h or -V.

       -L     List info on all loaded pdsh modules and quit.

       -d     Include more complete thread status when SIGINT is received, and
              display connect and command time statistics on stderr when done.

       -V     Output pdsh version information, along with a list of currently
              loaded modules, and exit.

qsh/mqsh module options

       -n tasks_per_node
              Set the number of tasks spawned per node. Default is 1.

       -m block | cyclic
              Set block versus cyclic allocation of processes to nodes.
              Default is block.

       -r railmask
              Set the rail bitmask for a job on a multirail system. The
              default railmask is 1, which corresponds to rail 0 only. Each
              bit set in the argument to -r corresponds to a rail on the
              system, so a value of 2 corresponds to rail 1 only, and 3
              selects both rail 0 and rail 1.
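
       The bit-to-rail mapping described for -r can be checked with a short
       Python sketch. The function name rails_from_mask is hypothetical and
       the code only decodes the bitmask; it does not talk to any
       interconnect.

```python
def rails_from_mask(railmask):
    """Return the rail numbers selected by a -r railmask value."""
    # Bit N of the mask (counting from 0) selects rail N.
    return [bit for bit in range(railmask.bit_length())
            if (railmask >> bit) & 1]

default_rails = rails_from_mask(1)   # the default railmask
both_rails = rails_from_mask(3)
```

       With these definitions, default_rails is [0] and both_rails is
       [0, 1], in line with the values quoted for -r.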

machines module options

       -a     Target all nodes from machines file.

genders module options

       In addition to the genders options presented below, the genders
       attribute pdsh_rcmd_type may also be used in the genders database to
       specify an rcmd connect type other than the pdsh default for hosts
       with this attribute. For example, the following line in the genders
       file

         host0 pdsh_rcmd_type=ssh

       would cause pdsh to use ssh to connect to host0, even if rsh were the
       default. This can be overridden on the command line with the
       "rcmd_type:host0" syntax.

       -A     Target all nodes in genders database. The -A option will target
              every host listed in genders -- if you want to omit some hosts
              by default, see the -a option below.

       -a     Target all nodes in genders database except those with the
              "pdsh_all_skip" attribute. This is shorthand for running "pdsh
              -A -X pdsh_all_skip ..."

       -g attr[=val][,attr[=val],...]
              Target nodes that match any of the specified genders attributes
              (with optional values). Conflicts with the -a and -w options.
              This option targets the alternate hostnames in the genders
              database by default. The -i option provided by the genders
              module may be used to translate these to the canonical genders
              hostnames. If the installed version of genders supports it,
              attributes supplied to -g may also take the form of genders
              queries. Genders queries will query the genders database for
              the union, intersection, difference, or complement of genders
              attributes and values. The set operation union is represented
              by two pipe symbols ('||'), intersection by two ampersand
              symbols ('&&'), difference by two minus symbols ('--'), and
              complement by a tilde ('~'). Parentheses may be used to change
              the order of operations. See the nodeattr(1) manpage for
              examples of genders queries.

       -X attr[=val][,attr[=val],...]
              Exclude nodes that match any of the specified genders attributes
              (optionally with values). This option may be used in
              combination with any other of the node selection options (e.g.
              -w, -g, -a). -X may also take the form of genders queries. See
              the description of the genders -g option for more information
              about genders queries.

       -i     Request translation between canonical and alternate hostnames.

       -F filename
              Read genders information from filename instead of the system
              default genders file.
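
       The set operations behind -A, -a, and genders queries can be
       illustrated with plain Python sets. This is a sketch of the
       semantics only: the attribute table below is invented sample data,
       not a real genders database, and no genders library is used.

```python
# Hypothetical sample data: attribute -> hosts that carry it.
genders = {
    "compute": {"foo1", "foo2", "foo3"},
    "gpu": {"foo2", "foo3"},
    "login": {"foo0"},
    "pdsh_all_skip": {"foo3"},
}

# Every host known to this sample table.
all_nodes = set().union(*genders.values())

target_A = all_nodes                               # -A
target_a = all_nodes - genders["pdsh_all_skip"]    # -a, i.e. -A -X pdsh_all_skip

union = genders["compute"] | genders["login"]        # compute||login
intersection = genders["compute"] & genders["gpu"]   # compute&&gpu
difference = genders["compute"] - genders["gpu"]     # compute--gpu
complement = all_nodes - genders["compute"]          # ~compute
```

       In this sample, -a drops foo3 from the full node set, and the
       difference query leaves only foo1.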

nodeupdown module options

       -v     Eliminate target nodes that are considered "down" by
              libnodeupdown.

slurm module options

       The slurm module allows pdsh to target nodes based on currently
       running SLURM jobs. The slurm module is typically called after all
       other node selection options have been processed, and if no nodes
       have been selected, the module will attempt to read a running jobid
       from the SLURM_JOBID environment variable (which is set when running
       under a SLURM allocation). If SLURM_JOBID references an invalid job,
       it will be silently ignored.

       -j jobid[,jobid,...]
              Target list of nodes allocated to the SLURM job jobid. This
              option may be used multiple times to target multiple SLURM jobs.
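
       The SLURM_JOBID fallback described above amounts to the decision
       sketched below in Python. The function slurm_targets and the
       job_nodes table are hypothetical; a real implementation would ask
       SLURM for the nodes of a job instead of using a dictionary.

```python
def slurm_targets(selected, job_nodes, environ):
    """Pick target nodes, falling back to SLURM_JOBID when none are selected.

    selected  -- nodes chosen by other node selection options
    job_nodes -- hypothetical lookup table: jobid string -> list of nodes
    environ   -- mapping of environment variables
    """
    if selected:
        # Other options already chose nodes, so the variable is not used.
        return list(selected)
    jobid = environ.get("SLURM_JOBID")
    # An unset or unknown jobid is ignored silently and yields no nodes.
    return list(job_nodes.get(jobid, []))

jobs = {"42": ["foo2", "foo3"]}
from_env = slurm_targets([], jobs, {"SLURM_JOBID": "42"})
```

       Passing a mapping instead of reading os.environ directly keeps the
       sketch easy to try with different values.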

rms module options

       The rms module allows pdsh to target nodes based on an RMS resource.
       The rms module is typically called after all other node selection
       options, and if no nodes have been selected, the module will examine
       the RMS_RESOURCEID environment variable and attempt to set the target
       list of hosts to the nodes in the RMS resource. If an invalid
       resource is denoted, the variable is silently ignored.

SDR module options

       The SDR module supports targeting hosts via the System Data Repository
       on IBM SPs.

       -a     Target all nodes in the SDR. The list is generated from the
              "reliable hostname" in the SDR by default.

       -i     Translate hostnames between reliable and initial in the SDR,
              when applicable. If a target hostname matches either the
              initial or reliable hostname in the SDR, the alternate name will
              be substituted. Thus a list composed of initial hostnames will
              instead be replaced with a list of reliable hostnames. For
              example, when used with -a above, all initial hostnames in the
              SDR are targeted.

       -v     Do not target nodes that are marked as not responding in the SDR
              on the targeted interface. (If a hostname does not appear in the
              SDR, then that name will remain in the target hostlist.)

       -G     In combination with -a, include all partitions.

nodeattr module options

       The nodeattr module supports access to the genders database via the
       nodeattr(1) command. See the genders section above for a list of
       options supported by this module. The option usage with the nodeattr
       module is the same as for genders, above, with the exception that the
       -i option may only be used with -a or -g.

dshgroup module options

       The dshgroup module allows pdsh to use dsh (or Dancer's shell) style
       group files from /etc/dsh/group/ or ~/.dsh/group/.

       -g groupname,...
              Target nodes in dsh group file "groupname" found in either
              ~/.dsh/group/groupname or /etc/dsh/group/groupname.

       -X groupname,...
              Exclude nodes in dsh group file "groupname".
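
       The group file lookup can be sketched in Python as follows. Several
       details here are assumptions rather than statements about pdsh: the
       directories are searched in the order given, a group file holds one
       host per line, and "#" starts a comment. read_dsh_group is a
       hypothetical helper, and the demonstration uses a temporary
       directory in place of ~/.dsh/group/ and /etc/dsh/group/.

```python
import os
import tempfile

def read_dsh_group(groupname, search_dirs):
    """Return the hosts listed in the first group file found, else []."""
    for directory in search_dirs:
        path = os.path.join(os.path.expanduser(directory), groupname)
        if os.path.isfile(path):
            with open(path) as fh:
                # Drop comments and surrounding whitespace, skip blank lines.
                stripped = [line.split("#", 1)[0].strip() for line in fh]
            return [host for host in stripped if host]
    return []

# Demonstration with a throwaway directory standing in for the real paths.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "web"), "w") as fh:
        fh.write("foo1\n# a comment\nfoo2  # inline comment\n\n")
    demo_hosts = read_dsh_group("web", [os.path.join(tmp, "missing"), tmp])
    demo_none = read_dsh_group("nosuchgroup", [tmp])
```

       On a real system the search list would be something like
       ["~/.dsh/group", "/etc/dsh/group"].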

netgroup module options

       The netgroup module allows pdsh to use standard netgroup entries
       (from /etc/netgroup or NIS) to build lists of target hosts.

       -g groupname,...
              Target nodes in netgroup "groupname".

       -X groupname,...
              Exclude nodes in netgroup "groupname".

ENVIRONMENT VARIABLES

       PDSH_RCMD_TYPE
              Equivalent to the -R option. The value of this environment
              variable is used to set the default rcmd module for pdsh to use
              (e.g. ssh, rsh).

       PDSH_SSH_ARGS
              Override the standard arguments that pdsh passes to the ssh(1)
              command ("-2 -a -x").

       PDSH_SSH_ARGS_APPEND
              Append additional options to the ssh(1) command invoked by pdsh.
              For example, PDSH_SSH_ARGS_APPEND="-q" would run ssh in quiet
              mode, or "-v" would increase the verbosity of ssh.

       WCOLL  If no other node selection option is used, the WCOLL environment
              variable may be set to a filename from which a list of target
              hosts will be read. The file should contain a list of hosts, one
              per line (though each line may contain a hostlist expression;
              see the HOSTLIST EXPRESSIONS section below).

       DSHPATH
              If set, the path in DSHPATH will be used as the PATH for the
              remote processes.

       FANOUT Set the pdsh fanout (see the description of -f above).
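
       How PDSH_SSH_ARGS and PDSH_SSH_ARGS_APPEND combine can be pictured
       with the Python sketch below. It is an approximation: ssh_args is a
       hypothetical helper, shell-style word splitting via shlex is an
       assumption, and the exact position of the appended options in the
       real ssh command line is not specified here.

```python
import shlex

# Standard arguments quoted in the PDSH_SSH_ARGS description.
DEFAULT_SSH_ARGS = "-2 -a -x"

def ssh_args(environ):
    """Build the ssh option list from an environment mapping."""
    # PDSH_SSH_ARGS replaces the standard arguments when it is set.
    args = shlex.split(environ.get("PDSH_SSH_ARGS", DEFAULT_SSH_ARGS))
    # PDSH_SSH_ARGS_APPEND adds options after them.
    args += shlex.split(environ.get("PDSH_SSH_ARGS_APPEND", ""))
    return args

quiet = ssh_args({"PDSH_SSH_ARGS_APPEND": "-q"})
```

       In this sketch quiet is ["-2", "-a", "-x", "-q"]: the standard
       arguments followed by the appended option.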

HOSTLIST EXPRESSIONS

       As noted in the sections above, pdsh accepts lists of hosts in the
       general form prefix[n-m,l-k,...], where n < m and l < k, etc., as an
       alternative to explicit lists of hosts. This form should not be
       confused with regular expression character classes (also denoted by
       ``[]''). For example, foo[19] does not represent an expression
       matching foo1 or foo9, but rather represents the degenerate hostlist
       foo19.

       The hostlist syntax is meant only as a convenience on clusters with a
       "prefixNNN" naming convention, and specification of ranges should not
       be considered necessary -- thus foo1,foo9 could be specified as such,
       or by the hostlist foo[1,9].

       Some examples of usage follow:

       Run command on foo01,foo02,...,foo05
            pdsh -w foo[01-05] command

       Run command on foo7,foo9,foo10
            pdsh -w foo[7,9-10] command

       Run command on foo0,foo4,foo5
            pdsh -w foo[0-5] -x foo[1-3] command

       As a reminder to the reader, some shells will interpret brackets ('['
       and ']') for pattern matching. Depending on your shell, it may be
       necessary to enclose ranged lists within quotes. For example, in
       tcsh, the first example above should be executed as:

            pdsh -w "foo[01-05]" command
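
       The expansion rules for hostlist expressions can be made concrete
       with a short Python sketch. It is an illustration, not pdsh's
       parser: expand_hostlist is a hypothetical helper that handles a
       single prefix[...] expression with the numeric part at the end, and
       it assumes zero padding follows the width of each range's lower
       bound.

```python
import re

def expand_hostlist(expr):
    """Expand one expression such as "foo[01-05,7]" into a list of hosts."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]", expr)
    if not match:
        return [expr]  # no brackets: a plain hostname
    prefix, body = match.groups()
    hosts = []
    for part in body.split(","):
        low, dash, high = part.partition("-")
        if not dash:
            high = low  # a single number, e.g. the 7 in [1-5,7]
        width = len(low)  # "01" keeps two digits, "7" keeps one
        for number in range(int(low), int(high) + 1):
            hosts.append(f"{prefix}{number:0{width}d}")
    return hosts

padded = expand_hostlist("foo[01-05]")
mixed = expand_hostlist("foo[7,9-10]")

# -w foo[0-5] -x foo[1-3]: expand both lists, then drop the excluded hosts.
excluded = set(expand_hostlist("foo[1-3]"))
remaining = [h for h in expand_hostlist("foo[0-5]") if h not in excluded]
```

       These three results correspond to the three usage examples above,
       and expand_hostlist("foo[19]") returns the single host foo19.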

ORIGIN

       Originally a rewrite of IBM dsh(1) by Jim Garlick <garlick@llnl.gov>
       on LLNL's ASCI Blue-Pacific IBM SP system. It is now used on Linux
       clusters at LLNL.

LIMITATIONS

       When using ssh for remote execution, expect the stderr of ssh to be
       folded in with that of the remote command. When invoked by pdsh, it
       is not possible for ssh to prompt for passwords (for example, when
       RSA/DSA keys are not configured properly). Additionally, the connect
       timeout is not adjustable when ssh is used. Finally, there is no
       reliable way for pdsh to ensure that remote commands are actually
       terminated when using a command timeout. Thus, if -u is used with
       ssh, commands may be left running on remote hosts even after the
       timeout has killed the local ssh processes.

       Output from multiple processes per node may be interspersed when using
       the qshell or mqshell rcmd modules.

       Hostlist parsing assumes the numerical part of a hostname is at the
       end only, e.g., specifying foo[0-5]bar will not work.

       The number of nodes that pdsh can simultaneously execute remote jobs
       on is limited by the maximum number of threads that can be created
       concurrently, as well as the availability of reserved ports in the
       rsh and qshell rcmd modules. On systems that implement POSIX threads,
       the limit is typically defined by the constant PTHREAD_THREADS_MAX.

FILES

SEE ALSO

       rsh(1), ssh(1), dshbak(1), pdcp(1)

pdsh-2.11                          linux-gnu                           pdsh(1)