pdsh(1) General Commands Manual pdsh(1)


pdcp - copy files to groups of hosts in parallel
rpdcp - (reverse pdcp) copy files from a group of hosts in parallel


pdcp [options]... src [src2...] dest
rpdcp [options]... src [src2...] dir


pdcp is a variant of the rcp(1) command. Unlike rcp(1), which copies
files to a single remote host, pdcp can copy files to multiple remote
hosts in parallel. However, pdcp does not recognize files in the format
"rname@rhost:path"; therefore, all source files must be on the local
host machine. Destination nodes must be listed on the pdcp command line
using a suitable target nodelist option (see the OPTIONS section below).
Each destination node listed must have pdcp installed for the copy to
succeed.

When pdcp receives SIGINT (ctrl-C), it lists the status of current
threads. A second SIGINT within one second terminates the program.
Pending threads may be canceled by issuing ctrl-Z within one second of
ctrl-C. Pending threads are those that have not yet been initiated, or
are still in the process of connecting to the remote host.

Like pdsh(1), the functionality of pdcp may be supplemented by
dynamically loadable modules. In pdcp, the modules may provide a new
connect protocol (replacing the standard rsh(1) protocol), filtering
options (e.g. excluding hosts that are down), and/or host selection
options (e.g. -a selects all nodes from a local config file). By
default, pdcp requires at least one "rcmd" module to be loaded (to
provide the channel for remote copy).


rpdcp performs a reverse parallel copy. Rather than copying files to
remote hosts, files are retrieved from remote hosts and stored locally.
All directories or files retrieved will be stored with their remote
hostname appended to the filename. The destination must be a directory
when rpdcp is used.

In other respects, rpdcp is exactly like pdcp, and further statements
regarding pdcp in this manual also apply to rpdcp.
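
As a rough illustration of the naming rule above, the sketch below
prints the local file names one might expect after running
"rpdcp -w foo[1-3] /var/log/messages /tmp/logs". The hosts and paths
are hypothetical, and the "." separator between the file name and the
host name is an assumption; the text above only states that the remote
hostname is appended.

```sh
#!/bin/sh
# Hypothetical example values; not taken from the manual.
remote_file=/var/log/messages
local_dir=/tmp/logs

# Print the name each retrieved copy would be expected to receive.
expected_local_name() {
    host=$1
    # Assumption: the host name is appended after a "." separator.
    printf '%s/%s.%s\n' "$local_dir" "$(basename "$remote_file")" "$host"
}

for h in foo1 foo2 foo3; do
    expected_local_name "$h"
done
```

Because every copy carries its host name, files retrieved from
different hosts do not overwrite each other in the destination
directory.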


The method by which pdcp connects to remote hosts may be selected at
runtime using the -R option (see OPTIONS below). This functionality is
ultimately implemented via dynamically loadable modules, so the list of
available options may differ from installation to installation. A list
of currently available rcmd modules is printed when using any of the -h,
-V, or -L options. The default rcmd module is also displayed with the -h
and -V options.

A list of rcmd modules currently distributed with pdcp follows.

rsh     Uses an internal, thread-safe implementation of BSD rcmd(3) to
        run commands using the standard rsh(1) protocol.

ssh     Uses a variant of popen(3) to run multiple copies of the ssh(1)
        command.

mrsh    Uses the mrsh(1) protocol to execute jobs on remote hosts. The
        mrsh protocol uses credential-based authentication, forgoing
        the need to allocate reserved ports. In other respects, it acts
        just like rsh.

krb4    Allows users to execute remote commands after authenticating
        with Kerberos. The remote rshd daemons must be kerberized.

xcpu    Uses the xcpu service to execute remote commands.


The list of available pdcp options is determined at runtime by
supplementing the list of standard pdcp options with any options
provided by loaded rcmd and misc modules. In some cases, options
provided by modules may conflict with each other. In these cases, the
modules are incompatible and the first module loaded wins.


-w TARGETS,...
        Target and/or filter the specified list of hosts. Do not use
        with any other node selection options (e.g. -a, -g, if they are
        available). No spaces are allowed in the comma-separated list.
        Arguments in the TARGETS list may include normal host names, a
        range of hosts in hostlist format (see HOSTLIST EXPRESSIONS), or
        a single `-' character to read the list of hosts from stdin.

        If a host or hostlist is preceded by a `-' character, those
        hosts are explicitly excluded. If the argument is preceded by a
        single `^' character, it is taken to be the path to a file
        containing a list of hosts, one per line. If the item begins
        with a `/' character, it is taken as a regular expression on
        which to filter the list of hosts (a regex argument may also be
        optionally trailed by another `/', e.g. /node.*/). A regex or
        file name argument may also be preceded by a minus `-' to
        exclude instead of include those hosts.

        A list of hosts may also be preceded by "user@" to specify a
        remote username other than the default, or "rcmd_type:" to
        specify an alternate rcmd connection type for these hosts. When
        used together, the rcmd type must be specified first, e.g.
        "ssh:user1@host0" would use ssh to connect to host0 as user
        "user1".

-x host,host,...
        Exclude the specified hosts. May be specified in conjunction
        with other target node list options such as -a and -g (when
        available). Hostlists may also be specified to the -x option
        (see the HOSTLIST EXPRESSIONS section below). Arguments to -x
        may also be preceded by the filename (`^') and regex (`/')
        characters as described above, in which case the resulting hosts
        are excluded as if they had been given to -w and preceded with
        the minus `-' character.
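
The prefix rules for a single -w argument (which -x reuses for `^' and
`/') can be restated as a small shell function. This is only a sketch
for reading the rules above, not pdcp's actual parser: the function
name describe_target and its output wording are made up for this
example, and it handles one comma-separated item at a time.

```sh
#!/bin/sh
# describe_target ARG: explain how one -w item would be interpreted,
# following the rules in the text above. Illustrative only.
describe_target() {
    arg=$1
    if [ "$arg" = "-" ]; then
        echo "read host list from stdin"
        return 0
    fi

    # A leading '-' turns the item into an exclusion.
    action=include
    case $arg in
    -*) action=exclude; arg=${arg#-} ;;
    esac

    case $arg in
    "^"*)   # '^' introduces a file with one host per line
        echo "$action hosts listed in file ${arg#"^"}" ;;
    /*)     # '/' introduces a regex; a trailing '/' is optional
        re=${arg#/}
        echo "$action hosts matching regex ${re%/}" ;;
    *)      # plain hosts, optionally prefixed by rcmd_type: and user@
        via=
        user=
        case $arg in *:*) via=" via ${arg%%:*}"; arg=${arg#*:} ;; esac
        case $arg in *@*) user=" as ${arg%%@*}"; arg=${arg#*@} ;; esac
        echo "$action hosts $arg$user$via" ;;
    esac
}

describe_target 'foo[1-5]'
describe_target '-foo3'
describe_target '-/node.*/'
describe_target 'ssh:user1@host0'
```

The rcmd type is stripped before the user name, mirroring the rule that
the rcmd type must come first when both prefixes are used.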


-h      Output usage menu and quit. A list of available rcmd modules
        will be printed at the end of the usage message.

-q      List option values and the target nodelist and exit without
        action.

-b      Disable the ctrl-C status feature so that a single ctrl-C kills
        the parallel copy (batch mode).

-r      Copy directories recursively.

-p      Preserve modification time and modes.

-e PATH
        Explicitly specify the path to the remote pdcp binary instead of
        using the locally executed path. Can also be set via the
        environment variable PDSH_REMOTE_PDCP_PATH.

-l user
        This option may be used to copy files as another user, subject
        to authorization. For BSD rcmd, this means the invoking user and
        system must be listed in the user's .rhosts file (even for
        root).

-t seconds
        Set the connect timeout. Default is 10 seconds.

-f number
        Set the maximum number of simultaneous remote copies to number.
        The default is 32.

-R name
        Set rcmd module to name. This option may also be set via the
        PDSH_RCMD_TYPE environment variable. A list of available rcmd
        modules may be obtained via either the -h or -L options.

-M name,...
        When multiple misc modules provide the same options to pdsh, the
        first module initialized "wins" and subsequent modules are not
        loaded. The -M option allows a list of modules to be specified
        that will be force-initialized before all others, in effect
        ensuring that they load without conflict (unless they conflict
        with each other). This option may also be set via the
        PDSH_MISC_MODULES environment variable.

-L      List info on all loaded pdcp modules and quit.

-d      Include more complete thread status when SIGINT is received, and
        display connect and command time statistics on stderr when done.

-V      Output pdcp version information, along with a list of currently
        loaded modules, and exit.
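
The three environment variables named under -e, -R and -M above could
be set once in a shell profile instead of being repeated on every
command line. The values below are placeholders chosen for this sketch:
the remote path is hypothetical, and whether the ssh rcmd module or a
misc module named genders is available depends on the installation
(check the output of -L or -V).

```sh
#!/bin/sh
# Same effect as passing -R ssh on each invocation.
export PDSH_RCMD_TYPE=ssh

# Same effect as -e PATH. Hypothetical location of pdcp on the remote
# nodes; adjust to the real install path.
export PDSH_REMOTE_PDCP_PATH=/opt/pdsh/bin/pdcp

# Same effect as -M name,... . "genders" is only an example module
# name and is assumed to be installed.
export PDSH_MISC_MODULES=genders
```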


As noted in sections above, pdcp accepts ranges of hostnames in the
general form: prefix[n-m,l-k,...], where n < m and l < k, etc., as an
alternative to explicit lists of hosts. This form should not be confused
with regular expression character classes (also denoted by "[]"). For
example, foo[19] does not represent foo1 or foo9, but rather represents
a degenerate range: foo19.

This range syntax is meant only as a convenience on clusters with a
prefixNN naming convention, and specification of ranges should not be
considered necessary: the list foo1,foo9 could be specified as such, or
by the range foo[1,9].
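
To make the expansion rule concrete, the following POSIX shell function
expands one expression of the form prefix[n-m,l-k,...] into individual
host names. It is a minimal sketch written for this page, not the code
pdcp uses: the name expand_hostlist is made up, only a single bracket
group at the end of the name is handled, and zero padding is assumed to
follow the width of each range's lower bound.

```sh
#!/bin/sh
# expand_hostlist EXPR: print one host name per line.
# The body runs in a subshell so IFS and "set -f" do not leak out.
expand_hostlist() (
    expr=$1
    case $expr in
    *"["*"]"*)
        prefix=${expr%%"["*}      # text before the first '['
        body=${expr#*"["}         # text after the first '['
        body=${body%%"]"*}        # ... up to the first ']'
        IFS=,                     # split the range list on commas
        set -f                    # no pathname expansion while splitting
        for part in $body; do
            case $part in
            *-*)
                lo=${part%-*}
                hi=${part#*-}
                width=${#lo}      # pad to the width of the lower bound
                # Strip leading zeros so the numbers are not read as octal.
                first=${lo#"${lo%%[!0]*}"}
                last=${hi#"${hi%%[!0]*}"}
                i=${first:-0}
                while [ "$i" -le "${last:-0}" ]; do
                    printf "%s%0${width}d\n" "$prefix" "$i"
                    i=$((i + 1))
                done
                ;;
            *)
                # A single value such as foo[19] is a degenerate range.
                printf '%s%s\n' "$prefix" "$part"
                ;;
            esac
        done
        ;;
    *)
        printf '%s\n' "$expr"     # plain host name, nothing to expand
        ;;
    esac
)

expand_hostlist 'foo[01-05]'
expand_hostlist 'foo[7,9-10]'
expand_hostlist 'foo[19]'
```

Note that the arguments are quoted so the shell passes the brackets
through unchanged.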

Some examples of range usage follow:

Copy /etc/hosts to foo01,foo02,...,foo05
    pdcp -w foo[01-05] /etc/hosts /etc

Copy /etc/hosts to foo7,foo9,foo10
    pdcp -w foo[7,9-10] /etc/hosts /etc

Copy /etc/hosts to foo0,foo4,foo5
    pdcp -w foo[0-5] -x foo[1-3] /etc/hosts /etc

As a reminder to the reader, some shells will interpret brackets ('['
and ']') for pattern matching. Depending on your shell, it may be
necessary to enclose ranged lists within quotes. For example, in tcsh,
the first example above should be executed as:

    pdcp -w "foo[01-05]" /etc/hosts /etc
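
The effect of the shell's pattern matching can be seen without pdcp at
all. The sketch below, written for POSIX sh, creates a scratch
directory containing a file named foo1 and compares what a command
would receive for an unquoted and a quoted foo[19]. The variable names
and the scratch directory are specific to this example.

```sh
#!/bin/sh
demo_dir=$(mktemp -d)
touch "$demo_dir/foo1"

# Unquoted: the shell treats [19] as a character class and substitutes
# the matching file name.
unquoted=$(cd "$demo_dir" && echo foo[19])

# Quoted: the brackets are passed through literally, which is what
# pdcp needs in order to see the hostlist expression.
quoted=$(cd "$demo_dir" && echo "foo[19]")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

rm -r "$demo_dir"
```

In sh and bash an unquoted pattern that matches no file is passed
through unchanged, which is why unquoted hostlists often appear to
work; quoting removes the dependence on the current directory's
contents.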


Pdsh/pdcp was originally a rewrite of IBM dsh(1) by Jim Garlick
<garlick@llnl.gov> on LLNL's ASCI Blue-Pacific IBM SP system. It is now
also used on Linux clusters at LLNL.


When using ssh for remote execution, expect the stderr of ssh to be
folded in with that of the remote command. When invoked by pdcp, it is
not possible for ssh to prompt for confirmation if a host key changes,
prompt for passwords if RSA keys are not configured properly, etc.
Finally, the connect timeout is only adjustable with ssh when the
underlying ssh implementation supports it and pdsh has been built to use
the correct option.


pdsh(1)


pdsh-2.31 linux-gnu pdsh(1)