PARALLEL_DESIGN(7)                parallel                PARALLEL_DESIGN(7)

8 This document describes design decisions made in the development of GNU
9 parallel and the reasoning behind them. It will give an overview of why
10 some of the code looks the way it does, and will help new maintainers
11 understand the code better.
12
13 One file program
14 GNU parallel is a Perl script in a single file. It is object oriented,
15 but contrary to normal Perl scripts each class is not in its own file.
16 This is due to user experience: The goal is that in a pinch the user
17 will be able to get GNU parallel working simply by copying a single
18 file: No need to mess around with environment variables like PERL5LIB.
19
20 Choice of programming language
21 GNU parallel is designed to be able to run on old systems. That means
22 that it cannot depend on a compiler being installed - and especially
23 not a compiler for a language that is younger than 20 years old.
24
25 The goal is that you can use GNU parallel on any system, even if you
26 are not allowed to install additional software.
27
28 Of all the systems I have experienced, I have yet to see a system that
29 had GCC installed that did not have Perl. The same goes for Rust, Go,
30 Haskell, and other younger languages. I have, however, seen systems
31 with Perl without any of the mentioned compilers.
32
Most modern systems also have either Python2 or Python3 installed, but
you still cannot be certain which version, and since Python2 code cannot
run under Python3, Python is not an option.
36
37 Perl has the added benefit that implementing the {= perlexpr =}
38 replacement string was fairly easy.
39
40 Old Perl style
41 GNU parallel uses some old, deprecated constructs. This is due to a
42 goal of being able to run on old installations. Currently the target is
43 CentOS 3.9 and Perl 5.8.0.
44
45 Scalability up and down
46 The smallest system GNU parallel is tested on is a 32 MB ASUS WL500gP.
47 The largest is a 2 TB 128-core machine. It scales up to around 100
48 machines - depending on the duration of each job.
49
50 Exponentially back off
GNU parallel busy waits. A job may not have been started because of the
load average (when using --load), so it does not make sense to wait for
a job to finish; instead the load average must be checked again. Load
average is not the only reason: --timeout has a similar problem.

To avoid burning too much CPU, GNU parallel sleeps exponentially longer
and longer if nothing happens, maxing out at 1 second.
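
The back-off can be sketched in shell using whole milliseconds. The 3%
growth and the 1 ms increment are borrowed from the remote system
wrapper shown later in this document; that the main loop uses the same
constants is an assumption, and next_sleep_ms is a hypothetical name.

```shell
# Hypothetical helper: given the previous sleep in ms, print the next one.
# Grows by about 3% plus 1 ms and never exceeds 1000 ms (1 second).
next_sleep_ms() {
  s=$1
  if [ "$s" -lt 1000 ]; then
    s=$(( s * 103 / 100 + 1 ))
    if [ "$s" -gt 1000 ]; then
      s=1000
    fi
  fi
  echo "$s"
}
```

Called repeatedly starting from 0 this gives 1, 2, 3 and so on, growing
faster as the value rises, and then stays at 1000.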
59
60 Shell compatibility
61 It is a goal to have GNU parallel work equally well in any shell.
62 However, in practice GNU parallel is being developed in bash and thus
63 testing in other shells is limited to reported bugs.
64
65 When an incompatibility is found there is often not an easy fix: Fixing
66 the problem in csh often breaks it in bash. In these cases the fix is
67 often to use a small Perl script and call that.
68
69 env_parallel
70 env_parallel is a dummy shell script that will run if env_parallel is
71 not an alias or a function and tell the user how to activate the
72 alias/function for the supported shells.
73
74 The alias or function will copy the current environment and run the
75 command with GNU parallel in the copy of the environment.
76
77 The problem is that you cannot access all of the current environment
78 inside Perl. E.g. aliases, functions and unexported shell variables.
79
80 The idea is therefore to take the environment and put it in
81 $PARALLEL_ENV which GNU parallel prepends to every command.
82
The only way to have access to the environment is directly from the
shell, so the program must be written as a shell script that will be
sourced, and it therefore has to deal with the dialect of the relevant shell.
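
The prepending can be illustrated with a hand-written dump. This is only
a sketch: the real env_parallel.* scripts generate the text from the
running shell, and run_with_env, greet and name are made-up names.

```shell
# Definitions a sourced env_parallel.* script might have dumped
# (hand-written here for illustration).
PARALLEL_ENV='greet() { echo "hello $1"; }; name=world;'

# Prepending the dump makes the function and the unexported variable
# available in the child shell that runs the actual command.
run_with_env() {
  sh -c "$PARALLEL_ENV $1"
}

run_with_env 'greet "$name"'    # prints: hello world
```

Neither greet nor name is exported; they only reach the child shell
because their definitions are part of the command string.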
86
87 env_parallel.*
88
These are the files that implement the alias or function env_parallel
for a given shell. It could be argued that these should be put in some
obscure place under /usr/lib, but by putting them in your path it
becomes trivial to find the path to them and source them:
93
94 source `which env_parallel.foo`
95
96 The beauty is that they can be put anywhere in the path without the
97 user having to know the location. So if the user's path includes
98 /afs/bin/i386_fc5 or /usr/pkg/parallel/bin or
99 /usr/local/parallel/20161222/sunos5.6/bin the files can be put in the
100 dir that makes most sense for the sysadmin.
101
102 env_parallel.bash / env_parallel.sh / env_parallel.ash /
103 env_parallel.dash / env_parallel.zsh / env_parallel.ksh /
104 env_parallel.mksh
105
106 env_parallel.(bash|sh|ash|dash|ksh|mksh|zsh) defines the function
107 env_parallel. It uses alias and typeset to dump the configuration (with
108 a few exceptions) into $PARALLEL_ENV before running GNU parallel.
109
110 After GNU parallel is finished, $PARALLEL_ENV is deleted.
111
112 env_parallel.csh
113
114 env_parallel.csh has two purposes: If env_parallel is not an alias:
115 make it into an alias that sets $PARALLEL with arguments and calls
116 env_parallel.csh.
117
118 If env_parallel is an alias, then env_parallel.csh uses $PARALLEL as
119 the arguments for GNU parallel.
120
121 It exports the environment by writing a variable definition to a file
122 for each variable. The definitions of aliases are appended to this
123 file. Finally the file is put into $PARALLEL_ENV.
124
125 GNU parallel is then run and $PARALLEL_ENV is deleted.
126
127 env_parallel.fish
128
First, all function definitions are generated using a loop and
functions.
131
132 Dumping the scalar variable definitions is harder.
133
134 fish can represent non-printable characters in (at least) 2 ways. To
135 avoid problems all scalars are converted to \XX quoting.
136
137 Then commands to generate the definitions are made and separated by
138 NUL.
139
140 This is then piped into a Perl script that quotes all values. List
141 elements will be appended using two spaces.
142
143 Finally \n is converted into \1 because fish variables cannot contain
144 \n. GNU parallel will later convert all \1 from $PARALLEL_ENV into \n.
145
146 This is then all saved in $PARALLEL_ENV.
147
148 GNU parallel is called, and $PARALLEL_ENV is deleted.
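
The newline conversion can be sketched with tr, assuming \1 denotes the
byte with value 1. GNU parallel does the conversion back itself; tr and
the two function names are used here only for illustration.

```shell
# \n -> \001 before the dump is stored, \001 -> \n when it is read back
encode_newlines() { tr '\n' '\001'; }
decode_newlines() { tr '\001' '\n'; }

printf 'set a 1\nset b 2\n' | encode_newlines | decode_newlines
```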
149
150 parset (supported in sh, ash, dash, bash, zsh, ksh, mksh)
151 parset is a shell function. This is the reason why parset can set
152 variables: It runs in the shell which is calling it.
153
It is also the reason why parset does not work when data is piped into
it: ... | parset ... makes parset start in a subshell, and any changes
to the environment therefore cannot make it back to the calling shell.
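
The effect can be seen without parset. In this sketch set_result is a
hypothetical stand-in for parset, and the behaviour in the comments is
what bash and dash do.

```shell
# Hypothetical stand-in for parset: a function that sets a variable
set_result() {
  result=$1
}

set_result direct          # runs in the calling shell
echo "$result"             # prints: direct

true | set_result piped    # runs in a subshell of the pipeline
echo "$result"             # still prints: direct
```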
157
158 Job slots
159 The easiest way to explain what GNU parallel does is to assume that
160 there are a number of job slots, and when a slot becomes available a
161 job from the queue will be run in that slot. But originally GNU
162 parallel did not model job slots in the code. Job slots have been added
163 to make it possible to use {%} as a replacement string.
164
165 While the job sequence number can be computed in advance, the job slot
166 can only be computed the moment a slot becomes available. So it has
167 been implemented as a stack with lazy evaluation: Draw one from an
168 empty stack and the stack is extended by one. When a job is done, push
169 the available job slot back on the stack.
170
171 This implementation also means that if you re-run the same jobs, you
172 cannot assume jobs will get the same slots. And if you use remote
173 executions, you cannot assume that a given job slot will remain on the
same remote server. This goes double since the number of job slots can be
adjusted on the fly (by giving --jobs a file name).
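
A minimal sketch of such a lazily extended stack, kept in plain shell
variables. The names free_slots, max_slot, slot_acquire and slot_release
are made up for this example and do not come from the GNU parallel code.

```shell
free_slots=""   # stack of released slot numbers, top of stack first
max_slot=0      # highest slot number handed out so far

# Put the next free slot number in $SLOT
slot_acquire() {
  if [ -z "$free_slots" ]; then
    # Empty stack: extend it by one new slot
    max_slot=$(( max_slot + 1 ))
    SLOT=$max_slot
  else
    SLOT=${free_slots%% *}
    case $free_slots in
      *" "*) free_slots=${free_slots#* } ;;
      *)     free_slots="" ;;
    esac
  fi
}

# Push a slot back on the stack when its job is done
slot_release() {
  free_slots="$1${free_slots:+ $free_slots}"
}
```

Two calls to slot_acquire give slots 1 and 2. If slot 1 is released, the
next job gets slot 1 again, and only the job after that extends the
stack to slot 3.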
176
177 Rsync protocol version
178 rsync 3.1.x uses protocol 31 which is unsupported by version 2.5.7.
179 That means that you cannot push a file to a remote system using rsync
180 protocol 31, if the remote system uses 2.5.7. rsync does not
181 automatically downgrade to protocol 30.
182
183 GNU parallel does not require protocol 31, so if the rsync version is
184 >= 3.1.0 then --protocol 30 is added to force newer rsyncs to talk to
185 version 2.5.7.
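
The version test can be sketched like this. How GNU parallel obtains the
rsync version is not described here, so the hypothetical function
rsync_protocol_opt simply takes a version string as its argument.

```shell
# Print "--protocol 30" if the given rsync version is >= 3.1.0,
# print nothing otherwise.
rsync_protocol_opt() {
  ver=$1
  major=${ver%%.*}
  rest=${ver#*.}
  minor=${rest%%.*}
  if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 1 ]; }; then
    echo "--protocol 30"
  fi
}

rsync_protocol_opt 3.1.0    # prints: --protocol 30
rsync_protocol_opt 2.5.7    # prints nothing
```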
186
187 Compression
188 GNU parallel buffers output in temporary files. --compress compresses
189 the buffered data. This is a bit tricky because there should be no
190 files to clean up if GNU parallel is killed by a power outage.
191
192 GNU parallel first selects a compression program. If the user has not
193 selected one, the first of these that is in $PATH is used: pzstd lbzip2
194 pbzip2 zstd pixz lz4 pigz lzop plzip lzip gzip lrz pxz bzip2 lzma xz
195 clzip. They are sorted by speed on a 128 core machine.
196
197 Schematically the setup is as follows:
198
199 command started by parallel | compress > tmpfile
200 cattail tmpfile | uncompress | parallel which reads the output
201
202 The setup is duplicated for both standard output (stdout) and standard
203 error (stderr).
204
205 GNU parallel pipes output from the command run into the compression
206 program which saves to a tmpfile. GNU parallel records the pid of the
207 compress program. At the same time a small Perl script (called cattail
208 above) is started: It basically does cat followed by tail -f, but it
209 also removes the tmpfile as soon as the first byte is read, and it
210 continuously checks if the pid of the compression program is dead. If
211 the compress program is dead, cattail reads the rest of tmpfile and
212 exits.
213
214 As most compression programs write out a header when they start, the
215 tmpfile in practice is removed by cattail after around 40 ms.
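
The core trick, reading from a file that is open but already unlinked,
can be sketched in shell. This is a simplification: unlike cattail it
waits for the compression program to finish before reading, and it does
not follow a growing file. gzip is used because it is on the list above;
buffer_compressed is a made-up name.

```shell
# Run a command, buffer its compressed output in a temporary file, and
# read it back through a file descriptor after the file name is removed.
buffer_compressed() {
  tmp=$(mktemp) || return 1
  "$@" | gzip > "$tmp"      # command started by parallel | compress > tmpfile
  exec 3< "$tmp"            # open the tmpfile for reading ...
  rm -f "$tmp"              # ... and unlink it while it is still open
  gzip -dc <&3              # uncompress from the now nameless file
  exec 3<&-
}

buffer_compressed printf 'line 1\nline 2\n'
```

Once rm has run there is no file name left to clean up, yet the data
stays readable until the file descriptor is closed.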
216
217 Wrapping
218 The command given by the user can be wrapped in multiple templates.
219 Templates can be wrapped in other templates.
220
221 $COMMAND the command to run.
222
223 $INPUT the input to run.
224
225 $SHELL the shell that started GNU Parallel.
226
227 $SSHLOGIN the sshlogin.
228
229 $WORKDIR the working dir.
230
231 $FILE the file to read parts from.
232
233 $STARTPOS the first byte position to read from $FILE.
234
235 $LENGTH the number of bytes to read from $FILE.
236
237 --shellquote echo Double quoted $INPUT
238
239 --nice pri Remote: See The remote system wrapper.
240
241 Local: setpriority(0,0,$nice)
242
243 --cat
244 cat > {}; $COMMAND {};
245 perl -e '$bash = shift;
246 $csh = shift;
247 for(@ARGV) { unlink;rmdir; }
248 if($bash =~ s/h//) { exit $bash; }
249 exit $csh;' "$?h" "$status" {};
250
251 {} is set to $PARALLEL_TMP which is a tmpfile. The Perl
252 script saves the exit value, unlinks the tmpfile, and
253 returns the exit value - no matter if the shell is
254 bash/ksh/zsh (using $?) or *csh/fish (using $status).
255
256 --fifo
257 perl -e '($s,$c,$f) = @ARGV;
258 # mkfifo $PARALLEL_TMP
259 system "mkfifo", $f;
260 # spawn $shell -c $command &
261 $pid = fork || exec $s, "-c", $c;
262 open($o,">",$f) || die $!;
263 # cat > $PARALLEL_TMP
264 while(sysread(STDIN,$buf,131072)){
265 syswrite $o, $buf;
266 }
267 close $o;
268 # waitpid to get the exit code from $command
269 waitpid $pid,0;
270 # Cleanup
271 unlink $f;
272 exit $?/256;' $SHELL -c $COMMAND $PARALLEL_TMP
273
274 This is an elaborate way of: mkfifo {}; run $COMMAND in
275 the background using $SHELL; copying STDIN to {};
276 waiting for background to complete; remove {} and exit
277 with the exit code from $COMMAND.
278
279 It is made this way to be compatible with *csh/fish.
280
281 --pipepart
282 < $FILE perl -e 'while(@ARGV) {
283 sysseek(STDIN,shift,0) || die;
284 $left = shift;
285 while($read =
286 sysread(STDIN,$buf,
287 ($left > 131072 ? 131072 : $left))){
288 $left -= $read;
289 syswrite(STDOUT,$buf);
290 }
291 }' $STARTPOS $LENGTH
292
293 This will read $LENGTH bytes from $FILE starting at
294 $STARTPOS and send it to STDOUT.
295
296 --sshlogin $SSHLOGIN
297 ssh $SSHLOGIN "$COMMAND"
298
299 --transfer
300 ssh $SSHLOGIN mkdir -p ./$WORKDIR;
301 rsync --protocol 30 -rlDzR \
302 -essh ./{} $SSHLOGIN:./$WORKDIR;
303 ssh $SSHLOGIN "$COMMAND"
304
305 Read about --protocol 30 in the section Rsync protocol
306 version.
307
308 --transferfile file
309 <<todo>>
310
311 --basefile <<todo>>
312
313 --return file
314 $COMMAND; _EXIT_status=$?; mkdir -p $WORKDIR;
315 rsync --protocol 30 \
316 --rsync-path=cd\ ./$WORKDIR\;\ rsync \
317 -rlDzR -essh $SSHLOGIN:./$FILE ./$WORKDIR;
318 exit $_EXIT_status;
319
320 The --rsync-path=cd ... is needed because old versions
321 of rsync do not support --no-implied-dirs.
322
323 The $_EXIT_status trick is to postpone the exit value.
324 This makes it incompatible with *csh and should be fixed
325 in the future. Maybe a wrapping 'sh -c' is enough?
326
327 --cleanup $RETURN is the wrapper from --return
328
329 $COMMAND; _EXIT_status=$?; $RETURN;
330 ssh $SSHLOGIN \(rm\ -f\ ./$WORKDIR/{}\;\
331 rmdir\ ./$WORKDIR\ \>\&/dev/null\;\);
332 exit $_EXIT_status;
333
334 $_EXIT_status: see --return above.
335
336 --pipe
337 perl -e 'if(sysread(STDIN, $buf, 1)) {
338 open($fh, "|-", "@ARGV") || die;
339 syswrite($fh, $buf);
340 # Align up to 128k block
341 if($read = sysread(STDIN, $buf, 131071)) {
342 syswrite($fh, $buf);
343 }
344 while($read = sysread(STDIN, $buf, 131072)) {
345 syswrite($fh, $buf);
346 }
347 close $fh;
348 exit ($?&127 ? 128+($?&127) : 1+$?>>8)
349 }' $SHELL -c $COMMAND
350
351 This small wrapper makes sure that $COMMAND will never
352 be run if there is no data.
353
354 --tmux <<TODO Fixup with '-quoting>> mkfifo /tmp/tmx3cMEV &&
355 sh -c 'tmux -S /tmp/tmsaKpv1 new-session -s p334310 -d
356 "sleep .2" >/dev/null 2>&1'; tmux -S /tmp/tmsaKpv1 new-
357 window -t p334310 -n wc\ 10 \(wc\ 10\)\;\ perl\ -e\
358 \'while\(\$t++\<3\)\{\ print\ \$ARGV\[0\],\"\\n\"\ \}\'\
359 \$\?h/\$status\ \>\>\ /tmp/tmx3cMEV\&echo\ wc\\\ 10\;\
360 echo\ \Job\ finished\ at:\ \`date\`\;sleep\ 10; exec
361 perl -e '$/="/";$_=<>;$c=<>;unlink $ARGV; /(\d+)h/ and
362 exit($1);exit$c' /tmp/tmx3cMEV
363
364 mkfifo tmpfile.tmx; tmux -S <tmpfile.tms> new-session -s
365 pPID -d 'sleep .2' >&/dev/null; tmux -S <tmpfile.tms>
366 new-window -t pPID -n <<shell quoted input>> \(<<shell
367 quoted input>>\)\;\ perl\ -e\ \'while\(\$t++\<3\)\{\
368 print\ \$ARGV\[0\],\"\\n\"\ \}\'\ \$\?h/\$status\ \>\>\
369 tmpfile.tmx\&echo\ <<shell double quoted input>>\;echo\
370 \Job\ finished\ at:\ \`date\`\;sleep\ 10; exec perl -e
371 '$/="/";$_=<>;$c=<>;unlink $ARGV; /(\d+)h/ and
372 exit($1);exit$c' tmpfile.tmx
373
374 First a FIFO is made (.tmx). It is used for
375 communicating exit value. Next a new tmux session is
376 made. This may fail if there is already a session, so
377 the output is ignored. If all job slots finish at the
378 same time, then tmux will close the session. A temporary
379 socket is made (.tms) to avoid a race condition in tmux.
380 It is cleaned up when GNU parallel finishes.
381
382 The input is used as the name of the windows in tmux.
383 When the job inside tmux finishes, the exit value is
384 printed to the FIFO (.tmx). This FIFO is opened by perl
385 outside tmux, and perl then removes the FIFO. Perl
386 blocks until the first value is read from the FIFO, and
387 this value is used as exit value.
388
389 To make it compatible with csh and bash the exit value
390 is printed as: $?h/$status and this is parsed by perl.
391
392 There is a bug that makes it necessary to print the exit
393 value 3 times.
394
Another bug in tmux requires the length of the tmux
title and command not to fall within certain limits. When
inside these limits, 75 '\ ' are added to the title to
force it to be outside the limits.
399
400 You can map the bad limits using:
401
402 perl -e 'sub r { int(rand(shift)).($_[0] && "\t".r(@_)) } print map { r(@ARGV)."\n" } 1..10000' 1600 1500 90 |
403 perl -ane '$F[0]+$F[1]+$F[2] < 2037 and print ' |
404 parallel --colsep '\t' --tagstring '{1}\t{2}\t{3}' tmux -S /tmp/p{%}-'{=3 $_="O"x$_ =}' \
405 new-session -d -n '{=1 $_="O"x$_ =}' true'\ {=2 $_="O"x$_ =};echo $?;rm -f /tmp/p{%}-O*'
406
407 perl -e 'sub r { int(rand(shift)).($_[0] && "\t".r(@_)) } print map { r(@ARGV)."\n" } 1..10000' 17000 17000 90 |
408 parallel --colsep '\t' --tagstring '{1}\t{2}\t{3}' \
409 tmux -S /tmp/p{%}-'{=3 $_="O"x$_ =}' new-session -d -n '{=1 $_="O"x$_ =}' true'\ {=2 $_="O"x$_ =};echo $?;rm /tmp/p{%}-O*'
410 > value.csv 2>/dev/null
411
412 R -e 'a<-read.table("value.csv");X11();plot(a[,1],a[,2],col=a[,4]+5,cex=0.1);Sys.sleep(1000)'
413
414 For tmux 1.8 17000 can be lowered to 2100.
415
416 The interesting areas are title 0..1000 with (title +
417 whole command) in 996..1127 and 9331..9636.
418
419 The ordering of the wrapping is important:
420
421 · $PARALLEL_ENV which is set in env_parallel.* must be prepended to
422 the command first, as the command may contain exported variables
423 or functions.
424
425 · --nice/--cat/--fifo should be done on the remote machine
426
427 · --pipepart/--pipe should be done on the local machine inside
428 --tmux
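
The --cat and --tmux wrappers above pass the exit value as both "$?h"
and "$status" so that the same text works in bash-like and csh-like
shells. The selection done by the Perl snippets can be sketched in shell
as follows; pick_exit_value is a hypothetical name.

```shell
# $1 is what "$?h" expanded to, $2 is what "$status" expanded to
pick_exit_value() {
  case $1 in
    *h) echo "${1%h}" ;;   # bash/ksh/zsh: "$?h" became e.g. "3h"
    *)  echo "$2" ;;       # no "h" left: the value is in $status
  esac
}

pick_exit_value 3h ""    # prints: 3
pick_exit_value 0 7      # prints: 7
```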
429
430 Convenience options --nice --basefile --transfer --return --cleanup --tmux
431 --group --compress --cat --fifo --workdir --tag --tagstring
432 These are all convenience options that make it easier to do a task. But
433 more importantly: They are tested to work on corner cases, too. Take
434 --nice as an example:
435
436 nice parallel command ...
437
438 will work just fine. But when run remotely, you need to move the nice
439 command so it is being run on the server:
440
441 parallel -S server nice command ...
442
443 And this will again work just fine, as long as you are running a single
444 command. When you are running a composed command you need nice to apply
445 to the whole command, and it gets harder still:
446
447 parallel -S server -q nice bash -c 'command1 ...; cmd2 | cmd3'
448
449 It is not impossible, but by using --nice GNU parallel will do the
450 right thing for you. Similarly when transferring files: It starts to
451 get hard when the file names contain space, :, `, *, or other special
452 characters.
453
454 To run the commands in a tmux session you basically just need to quote
455 the command. For simple commands that is easy, but when commands
456 contain special characters, it gets much harder to get right.
457
458 --compress not only compresses standard output (stdout) but also
459 standard error (stderr); and it does so into files, that are open but
460 deleted, so a crash will not leave these files around.
461
462 --cat and --fifo are easy to do by hand, until you want to clean up the
463 tmpfile and keep the exit code of the command.
464
465 The real killer comes when you try to combine several of these: Doing
466 that correctly for all corner cases is next to impossible to do by
467 hand.
468
469 --shard
470 The simple way to implement sharding would be to:
471
472 1. start n jobs,
473
474 2. split each line into columns,
475
476 3. select the data from the relevant column
477
478 4. compute a hash value from the data
479
480 5. take the modulo n of the hash value
481
482 6. pass the full line to the jobslot that has the computed value
483
484 Unfortunately Perl is rather slow at computing the hash value (and
485 somewhat slow at splitting into columns).
486
487 One solution is to use a compiled language for the splitting and
488 hashing, but that would go against the design criteria of not depending
489 on a compiler.
490
491 Luckily those tasks can be parallelized. So GNU parallel starts n
492 sharders that do step 2-6, and passes blocks of 100k to each of those
493 in a round robin manner. To make sure these sharders compute the hash
494 the same way, $PERL_HASH_SEED is set to the same value for all
495 sharders.
496
497 Running n sharders poses a new problem: Instead of having n outputs
498 (one for each computed value) you now have n outputs for each of the n
499 values, so in total n*n outputs; and you need to merge these n*n
500 outputs together into n outputs.
501
502 This can be done by simply running 'parallel -j0 --lb cat :::
503 outputs_for_one_value', but that is rather inefficient, as it spawns a
504 process for each file. Instead the core code from 'parcat' is run,
505 which is also a bit faster.
506
507 All the sharders and parcats communicate through named pipes that are
508 unlinked as soon as they are opened.
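
Steps 2-5 of the simple scheme can be sketched in shell. The sketch
assumes tab separated columns and uses cksum as a stand-in hash; it is
not the hash Perl computes, and shard_of is a made-up name.

```shell
# Print the shard (0..n-1) for a line, based on one of its columns.
# Usage: shard_of LINE COLUMN N
shard_of() {
  line=$1
  col=$2
  n=$3
  key=$(printf '%s\n' "$line" | cut -f "$col")          # steps 2 and 3
  hash=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)  # step 4
  echo $(( hash % n ))                                  # step 5
}

shard_of "$(printf 'x\tkey1\ty')" 2 4
shard_of "$(printf 'other\tkey1\tz')" 2 4    # same key, same shard
```

Spawning cut and cksum for every line is far too slow for real data; the
sketch only shows why lines with the same key end up in the same
jobslot.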
509
510 Shell shock
511 The shell shock bug in bash did not affect GNU parallel, but the
512 solutions did. bash first introduced functions in variables named:
513 BASH_FUNC_myfunc() and later changed that to BASH_FUNC_myfunc%%. When
514 transferring functions GNU parallel reads off the function and changes
515 that into a function definition, which is copied to the remote system
516 and executed before the actual command is executed. Therefore GNU
517 parallel needs to know how to read the function.
518
519 From version 20150122 GNU parallel tries both the ()-version and the
520 %%-version, and the function definition works on both pre- and post-
521 shell shock versions of bash.
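
Recognizing both naming schemes can be sketched as below. This is not
the code GNU parallel uses, and bash_func_name is a hypothetical helper.

```shell
# Print the function name encoded in an environment variable name,
# accepting both naming schemes; fail for any other name.
bash_func_name() {
  case $1 in
    BASH_FUNC_*"()") n=${1#BASH_FUNC_}; n=${n%"()"} ;;
    BASH_FUNC_*%%)   n=${1#BASH_FUNC_}; n=${n%'%%'} ;;
    *) return 1 ;;
  esac
  echo "$n"
}

bash_func_name 'BASH_FUNC_myfunc()'    # prints: myfunc
bash_func_name 'BASH_FUNC_myfunc%%'    # prints: myfunc
```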
522
523 The remote system wrapper
524 The remote system wrapper does some initialization before starting the
525 command on the remote system.
526
527 Ctrl-C and standard error (stderr)
528
529 If the user presses Ctrl-C the user expects jobs to stop. This works
530 out of the box if the jobs are run locally. Unfortunately it is not so
531 simple if the jobs are run remotely.
532
533 If remote jobs are run in a tty using ssh -tt, then Ctrl-C works, but
534 all output to standard error (stderr) is sent to standard output
535 (stdout). This is not what the user expects.
536
537 If remote jobs are run without a tty using ssh (without -tt), then
538 output to standard error (stderr) is kept on stderr, but Ctrl-C does
539 not kill remote jobs. This is not what the user expects.
540
541 So what is needed is a way to have both. It seems the reason why Ctrl-C
542 does not kill the remote jobs is because the shell does not propagate
543 the hang-up signal from sshd. But when sshd dies, the parent of the
544 login shell becomes init (process id 1). So by exec'ing a Perl wrapper
545 to monitor the parent pid and kill the child if the parent pid becomes
546 1, then Ctrl-C works and stderr is kept on stderr.
547
548 To be able to kill all (grand)*children a new process group is started.
549
550 --nice
551
552 niceing the remote process is done by setpriority(0,0,$nice). A few old
553 systems do not implement this and --nice is unsupported on those.
554
555 Setting $PARALLEL_TMP
556
$PARALLEL_TMP is used by --fifo and --cat and must point to a
non-existent file in $TMPDIR. This file name is computed on the remote
system.
560
561 The wrapper
562
563 The wrapper looks like this:
564
565 $shell = $PARALLEL_SHELL || $SHELL;
566 $tmpdir = $TMPDIR;
567 $nice = $opt::nice;
568 # Set $PARALLEL_TMP to a non-existent file name in $TMPDIR
569 do {
570 $ENV{PARALLEL_TMP} = $tmpdir."/par".
571 join"", map { (0..9,"a".."z","A".."Z")[rand(62)] } (1..5);
572 } while(-e $ENV{PARALLEL_TMP});
573 $SIG{CHLD} = sub { $done = 1; };
574 $pid = fork;
575 unless($pid) {
576 # Make own process group to be able to kill HUP it later
577 setpgrp;
578 eval { setpriority(0,0,$nice) };
579 exec $shell, "-c", ($bashfunc."@ARGV");
580 die "exec: $!\n";
581 }
582 do {
583 # Parent is not init (ppid=1), so sshd is alive
584 # Exponential sleep up to 1 sec
585 $s = $s < 1 ? 0.001 + $s * 1.03 : $s;
586 select(undef, undef, undef, $s);
587 } until ($done || getppid == 1);
588 # Kill HUP the process group if job not done
589 kill(SIGHUP, -${pid}) unless $done;
590 wait;
591 exit ($?&127 ? 128+($?&127) : 1+$?>>8)
592
593 Transferring of variables and functions
594 Transferring of variables and functions given by --env is done by
595 running a Perl script remotely that calls the actual command. The Perl
596 script sets $ENV{variable} to the correct value before exec'ing a shell
597 that runs the function definition followed by the actual command.
598
599 The function env_parallel copies the full current environment into the
600 environment variable PARALLEL_ENV. This variable is picked up by GNU
601 parallel and used to create the Perl script mentioned above.
602
603 Base64 encoded bzip2
604 csh limits words of commands to 1024 chars. This is often too little
605 when GNU parallel encodes environment variables and wraps the command
606 with different templates. All of these are combined and quoted into one
607 single word, which often is longer than 1024 chars.
608
609 When the line to run is > 1000 chars, GNU parallel therefore encodes
610 the line to run. The encoding bzip2s the line to run, converts this to
611 base64, splits the base64 into 1000 char blocks (so csh does not fail),
612 and prepends it with this Perl script that decodes, decompresses and
613 evals the line.
614
615 @GNU_Parallel=("use","IPC::Open3;","use","MIME::Base64");
616 eval "@GNU_Parallel";
617
618 $SIG{CHLD}="IGNORE";
619 # Search for bzip2. Not found => use default path
620 my $zip = (grep { -x $_ } "/usr/local/bin/bzip2")[0] || "bzip2";
621 # $in = stdin on $zip, $out = stdout from $zip
622 my($in, $out,$eval);
623 open3($in,$out,">&STDERR",$zip,"-dc");
624 if(my $perlpid = fork) {
625 close $in;
626 $eval = join "", <$out>;
627 close $out;
628 } else {
629 close $out;
630 # Pipe decoded base64 into 'bzip2 -dc'
631 print $in (decode_base64(join"",@ARGV));
632 close $in;
633 exit;
634 }
635 wait;
636 eval $eval;
637
Perl and bzip2 must be installed on the remote system, but a small test
showed that bzip2 is installed by default on all platforms that run
GNU parallel, so this is not a big problem.
641
642 The added bonus of this is that much bigger environments can now be
643 transferred as they will be below bash's limit of 131072 chars.
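
The encoding and decoding steps can be sketched in shell. This only
illustrates the idea; GNU parallel decodes with the Perl script above.
encode_cmd and decode_cmd are made-up names, and the fallback to gzip
exists only so the sketch also runs where bzip2 is missing.

```shell
# Use bzip2 as in the text; fall back to gzip for this demo only
if command -v bzip2 >/dev/null 2>&1; then zip=bzip2; else zip=gzip; fi

# Command line -> base64 words of at most 1000 chars, one per line
encode_cmd() {
  printf '%s' "$1" | "$zip" | base64 | tr -d '\n' | fold -w 1000
}

# Words given as arguments -> original command line
decode_cmd() {
  printf '%s' "$@" | base64 -d | "$zip" -dc
}

cmd='echo hello | tr a-z A-Z'
words=$(encode_cmd "$cmd")
eval "$(decode_cmd $words)"    # runs the original line, prints: HELLO
```

$words is left unquoted on purpose so that each block of at most 1000
characters becomes a separate word.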
644
645 Which shell to use
646 Different shells behave differently. A command that works in tcsh may
647 not work in bash. It is therefore important that the correct shell is
648 used when GNU parallel executes commands.
649
650 GNU parallel tries hard to use the right shell. If GNU parallel is
651 called from tcsh it will use tcsh. If it is called from bash it will
652 use bash. It does this by looking at the (grand)*parent process: If the
653 (grand)*parent process is a shell, use this shell; otherwise look at
654 the parent of this (grand)*parent. If none of the (grand)*parents are
655 shells, then $SHELL is used.
656
657 This will do the right thing if called from:
658
659 · an interactive shell
660
661 · a shell script
662
663 · a Perl script in `` or using system if called as a single string.
664
665 While these cover most cases, there are situations where it will fail:
666
667 · When run using exec.
668
669 · When run as the last command using -c from another shell (because
670 some shells use exec):
671
672 zsh% bash -c "parallel 'echo {} is not run in bash; \
673 set | grep BASH_VERSION' ::: This"
674
675 You can work around that by appending '&& true':
676
677 zsh% bash -c "parallel 'echo {} is run in bash; \
678 set | grep BASH_VERSION' ::: This && true"
679
680 · When run in a Perl script using system with parallel as the first
681 string:
682
683 #!/usr/bin/perl
684
685 system("parallel",'setenv a {}; echo $a',":::",2);
686
687 Here it depends on which shell is used to call the Perl script. If
688 the Perl script is called from tcsh it will work just fine, but if it
689 is called from bash it will fail, because the command setenv is not
690 known to bash.
691
If GNU parallel guesses wrong in these situations, set the shell using
$PARALLEL_SHELL.
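
The search through the (grand)*parents can be sketched with ps. How GNU
parallel reads the process table is not described here, so the use of
ps, the list of shell names and the name find_parent_shell are all
assumptions made for this example.

```shell
# Walk up the process tree from this process and print the first
# ancestor that looks like a shell; fall back to $SHELL.
find_parent_shell() {
  pid=$$
  while [ -n "$pid" ] && [ "$pid" -gt 1 ]; do
    pid=$(ps -o ppid= -p "$pid" 2>/dev/null | tr -d ' ')
    if [ -z "$pid" ]; then
      break
    fi
    name=$(ps -o comm= -p "$pid" 2>/dev/null) || name=""
    name=${name##*/}      # strip a leading path
    name=${name#-}        # strip the "-" of a login shell
    case $name in
      sh|ash|dash|bash|zsh|ksh|mksh|csh|tcsh|fish|rc)
        echo "$name"
        return 0
        ;;
    esac
  done
  echo "${SHELL:-/bin/sh}"
}

find_parent_shell
```

If ps is missing or no ancestor is a known shell, the function prints
$SHELL, mirroring the fallback described above.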
694
695 Always running commands in a shell
696 If the command is a simple command with no redirection and setting of
697 variables, the command could be run without spawning a shell. E.g. this
698 simple grep matching either 'ls ' or ' wc >> c':
699
700 parallel "grep -E 'ls | wc >> c' {}" ::: foo
701
702 could be run as:
703
704 system("grep","-E","ls | wc >> c","foo");
705
706 However, as soon as the command is a bit more complex a shell must be
707 spawned:
708
709 parallel "grep -E 'ls | wc >> c' {} | wc >> c" ::: foo
710 parallel "LANG=C grep -E 'ls | wc >> c' {}" ::: foo
711
712 It is impossible to tell the difference between these without parsing
713 the string (is the | a pipe in shell or an alternation in a grep
714 regexp? Is LANG=C a command in csh or setting a variable in bash? Is
715 >> redirection or part of a regexp?).
716
On top of this, wrapper scripts will often require a shell to be
spawned.
719
720 The downside is that you need to quote special shell chars twice:
721
722 parallel echo '*' ::: This will expand the asterisk
723 parallel echo "'*'" ::: This will not
724 parallel "echo '*'" ::: This will not
725 parallel echo '\*' ::: This will not
726 parallel echo \''*'\' ::: This will not
727 parallel -q echo '*' ::: This will not
728
729 -q will quote all special chars, thus redirection will not work: this
730 prints '* > out.1' and does not save '*' into the file out.1:
731
732 parallel -q echo "*" ">" out.{} ::: 1
733
GNU parallel tries to live up to the Principle Of Least Astonishment
(POLA), and the requirement of using -q is hard to understand when you
do not see the whole picture.
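
The double quoting can be reproduced without GNU parallel. In this
sketch run_in_shell is a hypothetical stand-in that joins its arguments
and hands the string to a shell, which is the part that causes the
second round of parsing.

```shell
# Hypothetical stand-in for GNU parallel joining its arguments and
# handing the resulting string to a shell.
run_in_shell() {
  sh -c "$*"
}

dir=$(mktemp -d)
touch "$dir/a" "$dir/b"
(
  cd "$dir" || exit 1
  run_in_shell echo '*'      # inner shell sees *    and prints: a b
  run_in_shell echo "'*'"    # inner shell sees '*'  and prints: *
  run_in_shell echo '\*'     # inner shell sees \*   and prints: *
)
rm -rf "$dir"
```

The outer quotes are consumed by the calling shell; only what is left
reaches the inner shell, which then parses the string once more.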
737
738 Quoting
739 Quoting depends on the shell. For most shells '-quoting is used for
740 strings containing special characters.
741
742 For tcsh/csh newline is quoted as \ followed by newline. Other special
743 characters are also \-quoted.
744
745 For rc everything is quoted using '.
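
A minimal sketch of '-quoting for Bourne-like shells: the string is
wrapped in single quotes and every embedded ' is written as '\''. This
is an illustration and not the quoting code of GNU parallel; shell_quote
is a made-up name, and csh, tcsh and rc need the other rules mentioned
above.

```shell
# Print the argument quoted so that a Bourne-like shell reads it back
# as one literal word.
shell_quote() {
  printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

shell_quote "it's a * test"    # prints: 'it'\''s a * test'
```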
746
747 --pipepart vs. --pipe
748 While --pipe and --pipepart look much the same to the user, they are
749 implemented very differently.
750
751 With --pipe GNU parallel reads the blocks from standard input (stdin),
752 which is then given to the command on standard input (stdin); so every
753 block is being processed by GNU parallel itself. This is the reason why
754 --pipe maxes out at around 500 MB/sec.
755
756 --pipepart, on the other hand, first identifies at which byte positions
757 blocks start and how long they are. It does that by seeking into the
758 file by the size of a block and then reading until it meets end of a
759 block. The seeking explains why GNU parallel does not know the line
760 number and why -L/-l and -N do not work.
761
With a reasonable block and file size this seeking is more than 1000
times faster than reading the full file. The byte positions are then
764 given to a small script that reads from position X to Y and sends
765 output to standard output (stdout). This small script is prepended to
766 the command and the full command is executed just as if GNU parallel
767 had been in its normal mode. The script looks like this:
768
769 < file perl -e 'while(@ARGV) {
770 sysseek(STDIN,shift,0) || die;
771 $left = shift;
772 while($read = sysread(STDIN,$buf,
773 ($left > 131072 ? 131072 : $left))){
774 $left -= $read; syswrite(STDOUT,$buf);
775 }
776 }' startbyte length_in_bytes
777
778 It delivers 1 GB/s per core.
779
780 Instead of the script dd was tried, but many versions of dd do not
781 support reading from one byte to another and might cause partial data.
782 See this for a surprising example:
783
784 yes | dd bs=1024k count=10 | wc
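
How the block start positions can be found by seeking is sketched below.
The sketch assumes that records end in a newline and uses tail, head and
wc instead of Perl; block_starts is a hypothetical name.

```shell
# Print the byte offset at which each block starts.
# Usage: block_starts FILE BLOCKSIZE
block_starts() {
  file=$1
  bs=$2
  size=$(( $(wc -c < "$file") ))
  pos=0
  while [ "$pos" -lt "$size" ]; do
    echo "$pos"
    jump=$(( pos + bs ))        # seek one block size ahead
    if [ "$jump" -ge "$size" ]; then
      break
    fi
    # Read from there until the current record ends
    rest=$(( $(tail -c +"$(( jump + 1 ))" "$file" | head -n 1 | wc -c) ))
    pos=$(( jump + rest ))
  done
}
```

For a file containing four 5-byte records and a block size of 7 this
prints 0 and 10: each block is extended to the next record boundary, and
only a few bytes around each boundary are read.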
785
786 --block-size adjustment
787 Every time GNU parallel detects a record bigger than --block-size it
788 increases the block size by 30%. A small --block-size gives very poor
789 performance; by exponentially increasing the block size performance
790 will not suffer.
791
792 GNU parallel will waste CPU power if --block-size does not contain a
793 full record, because it tries to find a full record and will fail to do
794 so. The recommendation is therefore to use a --block-size > 2 records,
795 so you always get at least one full record when you read one block.
796
797 If you use -N then --block-size should be big enough to contain N+1
798 records.
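
The growth can be sketched as repeated 30% increases until the record
fits. Rounding up is an assumption made to keep the integer arithmetic
moving for tiny values, the block size is assumed to be at least 1, and
grow_block_size is a made-up name.

```shell
# Print the block size after growing it by 30% until RECORD fits.
# Usage: grow_block_size BLOCKSIZE RECORD
grow_block_size() {
  bs=$1
  record=$2
  while [ "$record" -gt "$bs" ]; do
    bs=$(( (bs * 13 + 9) / 10 ))    # +30%, rounded up
  done
  echo "$bs"
}

grow_block_size 1000 1500    # 1000 -> 1300 -> 1690, prints: 1690
```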
799
800 Automatic --block-size computation
801 With --pipepart GNU parallel can compute the --block-size
802 automatically. A --block-size of -1 will use a block size so that each
803 jobslot will receive approximately 1 block. --block -2 will pass 2
804 blocks to each jobslot and -n will pass n blocks to each jobslot.
805
806 This can be done because --pipepart reads from files, and we can
807 compute the total size of the input.
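
The computation can be sketched as a simple division of the input size by the number of blocks wanted. The helper name auto_block_size is made up, and rounding up is an assumption; the text above only says that each jobslot receives approximately n blocks.

```shell
# auto_block_size FILESIZE JOBSLOTS N   (hypothetical helper name)
# Prints a block size in bytes so that each of JOBSLOTS jobslots
# gets roughly N blocks of a FILESIZE byte input.
auto_block_size() {
  parts=$(( $2 * $3 ))
  # Round up (an assumption) so the blocks cover the whole input.
  echo $(( ($1 + parts - 1) / parts ))
}

auto_block_size 1000000000 8 1   # like --block -1: prints 125000000
auto_block_size 1000000000 8 2   # like --block -2: prints 62500000
```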
808
809 --jobs and --onall
810 When running the same commands on many servers what should --jobs
811 signify? Is it the number of servers to run on in parallel? Is it the
812 number of jobs run in parallel on each server?
813
814 GNU parallel lets --jobs represent the number of servers to run on in
815 parallel. This is to make it possible to run a sequence of commands
816 (that cannot be parallelized) on each server, but run the same sequence
817 on multiple servers.
818
819 --shuf
820 When using --shuf to shuffle the jobs, all jobs are read, then they are
821 shuffled, and finally executed. When using SQL this makes the
822 --sqlmaster be the part that shuffles the jobs. The --sqlworkers simply
823 execute according to Seq number.
824
825 --csv
826 --pipepart is incompatible with --csv because you can have records
827 like:
828
829 a,b,c
830 a,"
831 a,b,c
832 a,b,c
833 a,b,c
834 ",c
835 a,b,c
836
837 Here the second record contains a multi-line field that looks like
838 records. Since --pipepart does not read the whole file when searching
839 for record endings, it may start reading in this multi-line field,
840 which would be wrong.
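
The problem can be shown with plain shell tools. The sketch below writes the example above to a file, jumps to a byte offset inside the quoted field, throws away the partial line and prints the next full line, which is roughly what a search for a record ending does. The helper names are made up for the example.

```shell
# make_sample_csv FILE   (hypothetical helper name)
# Writes the 3-record CSV example from above to FILE.
make_sample_csv() {
  printf 'a,b,c\na,"\na,b,c\na,b,c\na,b,c\n",c\na,b,c\n' > "$1"
}

# first_line_after FILE OFFSET   (hypothetical helper name)
# Seeks to 0-based byte OFFSET, skips the partial line found there
# and prints the first complete line after it.
first_line_after() {
  tail -c +"$(( $2 + 1 ))" "$1" | sed -n '2p'
}

csv=$(mktemp)
make_sample_csv "$csv"
# Offset 8 is the quote that opens the multi-line field.
first_line_after "$csv" 8    # prints: a,b,c
rm -f "$csv"
```

The line printed looks like a complete record, but it is the inside of the second record's quoted field.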
841
842 Buffering on disk
843 GNU parallel buffers output, because if output is not buffered you have
844 to be ridiculously careful on sizes to avoid mixing of outputs (see
845 excellent example on https://catern.com/posts/pipes.html).
846
847 GNU parallel buffers on disk in $TMPDIR using files, that are removed
848 as soon as they are created, but which are kept open. So even if GNU
849 parallel is killed by a power outage, there will be no files to clean
850 up afterwards. Another advantage is that the file system is aware that
851 these files will be lost in case of a crash, so it does not need to
852 sync them to disk.
853
854 It gives the odd situation that a disk can be fully used, but there are
855 no visible files on it.
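
The same trick can be tried from a shell. This is only a sketch of the idea, not GNU parallel's Perl code, and the function name buffer_demo is made up: the file is opened, removed right away, and then used through the file descriptors that are still open.

```shell
# buffer_demo   (hypothetical helper name)
# Buffers a line of output in a file that has already been removed.
buffer_demo() {
  tmp=$(mktemp)
  exec 3>"$tmp" 4<"$tmp"        # one descriptor to write, one to read
  rm "$tmp"                     # the file has no name from here on
  echo "output of a job" >&3    # writing still works
  cat <&4                       # and so does reading it back
  exec 3>&- 4<&-                # closing frees the disk space
  [ ! -e "$tmp" ]               # nothing is left to clean up
}

buffer_demo    # prints: output of a job
```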
856
857 Partly buffering in memory
858
859 When using output formats SQL and CSV then GNU Parallel has to read the
860 whole output into memory. When run normally it will only read the
861 output from a single job. But when using --linebuffer every line
862 printed will also be buffered in memory - for all jobs currently
863 running.
864
865 If memory is tight, then do not use the output format SQL/CSV with
866 --linebuffer.
867
868 Comparing to buffering in memory
869
870 gargs is a parallelizing tool that buffers in memory. It is therefore a
871 useful way of comparing the advantages and disadvantages of buffering
872 in memory to buffering on disk.
873
874 On a system with 6 GB RAM free and 6 GB free swap these were tested
875 with different sizes:
876
877 echo /dev/zero | gargs "head -c $size {}" >/dev/null
878 echo /dev/zero | parallel "head -c $size {}" >/dev/null
879
880 The results are here:
881
882 JobRuntime Command
883 0.344 parallel_test 1M
884 0.362 parallel_test 10M
885 0.640 parallel_test 100M
886 9.818 parallel_test 1000M
887 23.888 parallel_test 2000M
888 30.217 parallel_test 2500M
889 30.963 parallel_test 2750M
890 34.648 parallel_test 3000M
891 43.302 parallel_test 4000M
892 55.167 parallel_test 5000M
893 67.493 parallel_test 6000M
894 178.654 parallel_test 7000M
895 204.138 parallel_test 8000M
896 230.052 parallel_test 9000M
897 255.639 parallel_test 10000M
898 757.981 parallel_test 30000M
899 0.537 gargs_test 1M
900 0.292 gargs_test 10M
901 0.398 gargs_test 100M
902 3.456 gargs_test 1000M
903 8.577 gargs_test 2000M
904 22.705 gargs_test 2500M
905 123.076 gargs_test 2750M
906 89.866 gargs_test 3000M
907 291.798 gargs_test 4000M
908
909 GNU parallel is pretty much limited by the speed of the disk: Up to 6
910 GB data is written to disk but cached, so reading is fast. Above 6 GB
911 data are both written and read from disk. When the 30000MB job is
912 running, the disk system is slow, but usable: If you are not using the
913 disk, you almost do not feel it.
914
915 gargs has a speed advantage up until 2500M where it hits a wall. Then
916 the system starts swapping like crazy and is completely unusable. At
917 5000M it goes out of memory.
918
919 You can make GNU parallel behave similarly to gargs if you point $TMPDIR
920 to a tmpfs-filesystem: It will be faster for small outputs, but may
921 kill your system for larger outputs and cause you to lose output.
922
923 Disk full
924 GNU parallel buffers on disk. If the disk is full, data may be lost. To
925 check if the disk is full GNU parallel writes an 8193 byte file every
926 second. If this file is written successfully, it is removed
927 immediately. If it is not written successfully, the disk is full. The
928 size 8193 was chosen because 8192 gave the wrong result on some file
929 systems, whereas 8193 did the correct thing on all tested filesystems.
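
A single round of such a check could look like the sketch below. The function name disk_has_room and the name of the probe file are assumptions for the example; GNU parallel does this in Perl and repeats it every second.

```shell
# disk_has_room DIR   (hypothetical helper name)
# Tries to write an 8193 byte probe file in DIR and removes it again.
# Returns 0 if the file was written in full, 1 otherwise.
disk_has_room() {
  probe="$1/diskfull-probe.$$"
  if head -c 8193 /dev/zero > "$probe" 2>/dev/null &&
     [ "$(wc -c < "$probe" | tr -d ' ')" -eq 8193 ]; then
    rm -f "$probe"
    return 0
  fi
  rm -f "$probe"
  return 1
}

if disk_has_room "${TMPDIR:-/tmp}"; then
  echo "room left"
else
  echo "disk full (or directory not writable)"
fi
```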
930
931 Memory usage
932 Normally GNU parallel will use around 17 MB RAM constantly - no matter
933 how many jobs or how much output there is. There are a few things that
934 cause the memory usage to rise:
935
936 · Multiple input sources. GNU parallel reads an input source only
937 once. This is by design, as an input source can be a stream (e.g.
938 FIFO, pipe, standard input (stdin)) which cannot be rewound and read
939 again. When reading a single input source, the memory is freed as
940 soon as the job is done - thus keeping the memory usage constant.
941
942 But when reading multiple input sources GNU parallel keeps the
943 already read values for generating all combinations with other input
944 sources.
945
946 · Computing the number of jobs. --bar, --eta, and --halt xx% use
947 total_jobs() to compute the total number of jobs. It does this by
948 generating the data structures for all jobs. All these job data
949 structures will be stored in memory and take up around 400
950 bytes/job.
951
952 · Buffering a full line. --linebuffer will read a full line per
953 running job. A very long output line (say 1 GB without \n) will
954 increase RAM usage temporarily: From when the beginning of the line
955 is read till the line is printed.
956
957 · Buffering the full output of a single job. This happens when using
958 --results *.csv/*.tsv or --sql*. Here GNU parallel will read the
959 whole output of a single job and save it as csv/tsv or SQL.
960
961 Argument separators ::: :::: :::+ ::::+
962 The argument separator ::: was chosen because I have never seen :::
963 used in any command. The natural choice -- would be a bad idea since it
964 is not unlikely that the template command will contain --. I have seen
965 :: used in programming languages to separate classes, and I did not
966 want the user to be confused that the separator had anything to do with
967 classes.
968
969 ::: also makes a visual separation, which is good if there are multiple
970 :::.
971
972 When ::: was chosen, :::: came as a fairly natural extension.
973
974 Linking input sources meant having to decide for some way to indicate
975 linking of ::: and ::::. :::+ and ::::+ were chosen, so that they were
976 similar to ::: and ::::.
977
978 Perl replacement strings, {= =}, and --rpl
979 The shorthands for replacement strings make a command look more
980 cryptic. Different users will need different replacement strings.
981 Instead of inventing more shorthands you get more flexible replacement
982 strings if they can be programmed by the user.
983
984 The language Perl was chosen because GNU parallel is written in Perl
985 and it was easy and reasonably fast to run the code given by the user.
986
987 If a user needs the same programmed replacement string again and again,
988 the user may want to make his own shorthand for it. This is what --rpl
989 is for. It works so well, that even GNU parallel's own shorthands are
990 implemented using --rpl.
991
992 In Perl code the bigrams {= and =} rarely exist. They look like a
993 matching pair and can be entered on all keyboards. This made them good
994 candidates for enclosing the Perl expression in the replacement
995 strings. Another candidate ,, and ,, was rejected because they do not
996 look like a matching pair. --parens was made, so that the users can
997 still use ,, and ,, if they like: --parens ,,,,
998
999 Internally, however, the {= and =} are replaced by \257< and \257>.
1000 This is to make it simpler to make regular expressions. You only need
1001 to look one character ahead, and never have to look behind.
1002
1003 Test suite
1004 GNU parallel uses its own testing framework. This is mostly due to
1005 historical reasons. It deals reasonably well with tests that are
1006 dependent on how long a given test runs (e.g. more than 10 secs is a
1007 pass, but less is a fail). It parallelizes most tests, but it is easy
1008 to force a test to run as the single test (which may be important for
1009 timing issues). It deals reasonably well with tests that fail
1010 intermittently. It detects which tests failed and pushes these to the
1011 top, so when running the test suite again, the tests that failed most
1012 recently are run first.
1013
1014 If GNU parallel should adopt a real testing framework then those
1015 elements would be important.
1016
1017 Since many tests are dependent on which hardware it is running on,
1018 these tests break when run on different hardware than what the test
1019 was written for.
1020
1021 When most bugs are fixed a test is added, so the bug will not
1022 reappear. It is, however, sometimes hard to create the environment in
1023 which the bug shows up - especially if the bug only shows up sometimes.
1024 One of the harder problems was to make a machine start swapping without
1025 forcing it to its knees.
1026
1027 Median run time
1028 Using a percentage for --timeout causes GNU parallel to compute the
1029 median run time of a job. The median is a better indicator of the
1030 expected run time than average, because there will often be outliers
1031 taking way longer than the normal run time.
1032
1033 To avoid keeping all run times in memory, an implementation of remedian
1034 was made (Rousseeuw et al).
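
The idea of the remedian can be sketched in a few lines of awk. This is a simplified illustration and not GNU parallel's Perl implementation: the buffer size of 3, the function name remedian3 and the handling of buffers that are not full are all assumptions. Each level keeps at most 3 values; when a level is full, its median is pushed to the next level, so only about 3 values per level are held in memory no matter how many run times are seen.

```shell
# remedian3   (hypothetical helper name)
# Reads one number per line on stdin and prints a remedian estimate
# of the median, using buffers of 3 values per level.
remedian3() {
  awk '
    # Median of three numbers.
    function med3(a, b, c) {
      if ((a <= b && b <= c) || (c <= b && b <= a)) return b
      if ((b <= a && a <= c) || (c <= a && a <= b)) return a
      return c
    }
    # Store x at the given level. A full buffer is replaced by its
    # median, which is pushed one level up.
    function push(level, x,    n) {
      if (level > top) top = level
      n = ++count[level]
      buf[level, n] = x
      if (n == 3) {
        count[level] = 0
        push(level + 1, med3(buf[level, 1], buf[level, 2], buf[level, 3]))
      }
    }
    { push(0, $1 + 0) }
    # Simplification: report the first value of the highest level.
    END { print buf[top, 1] }
  '
}

printf '%s\n' 1 9 5 2 8 4 7 3 6 | remedian3    # prints: 5
```

For the nine values above the medians of the three groups are 5, 4 and 6, and the median of those is 5, which here is also the exact median. In general the result is an estimate.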
1035
1036 Error messages and warnings
1037 Error messages like: ERROR, Not found, and 42 are not very helpful. GNU
1038 parallel strives to inform the user:
1039
1040 · What went wrong?
1041
1042 · Why did it go wrong?
1043
1044 · What can be done about it?
1045
1046 Unfortunately it is not always possible to predict the root cause of
1047 the error.
1048
1049 Determine number of CPUs
1050 CPUs is an ambiguous term. It can mean the number of sockets filled
1051 (i.e. the number of physical chips). It can mean the number of cores
1052 (i.e. the number of physical compute cores). It can mean the number of
1053 hyperthreaded cores (i.e. the number of virtual cores - with some of
1054 them possibly being hyperthreaded).
1055
1056 On ark.intel.com Intel uses the terms cores and threads for number of
1057 physical cores and the number of hyperthreaded cores respectively.
1058
1059 GNU parallel uses CPUs as the number of compute units and the
1060 terms sockets, cores, and threads to specify how the number of compute
1061 units is calculated.
1062
1063 Computation of load
1064 Contrary to the obvious, --load does not use load average. This is due
1065 to load average rising too slowly. Instead it uses ps to list the
1066 number of threads in running or blocked state (state D, O or R). This
1067 gives an instant load.
1068
1069 As remote calculation of load can be slow, a process is spawned to run
1070 ps and put the result in a file, which is then used next time.
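
The counting step can be sketched as a filter over a list of process states, one per line. The helper name count_busy is made up, and the ps options mentioned in the comment are an assumption that depends on the ps version; the ps call GNU parallel really uses is not shown here.

```shell
# count_busy   (hypothetical helper name)
# Reads one process state per line on stdin and prints how many are
# running or blocked (state starting with D, O or R).
count_busy() {
  awk '/^[DOR]/ { n++ } END { print n + 0 }'
}

# On many systems the states can be listed with something like:
#   ps -e -o state= | count_busy
# (assumption: the ps at hand supports the "state" output keyword).
printf '%s\n' R S D S O Z | count_busy    # prints: 3
```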
1071
1072 Killing jobs
1073 GNU parallel kills jobs. It can be due to --memfree, --halt, or when
1074 GNU parallel meets a condition from which it cannot recover. Every job
1075 is started as its own process group. This way any (grand)*children will
1076 get killed, too. The process group is killed with the specification
1077 mentioned in --termseq.
1078
1079 SQL interface
1080 GNU parallel uses the DBURL from GNU sql to give database software,
1081 username, password, host, port, database, and table in a single string.
1082
1083 The DBURL must point to a table name. The table will be dropped and
1084 created. The reason for not reusing an existing table is that the user
1085 may have added more input sources which would require more columns in
1086 the table. By prepending '+' to the DBURL the table will not be
1087 dropped.
1088
1089 The table columns are similar to joblog with the addition of V1 .. Vn
1090 which are values from the input sources, and Stdout and Stderr which
1091 are the output from standard output and standard error, respectively.
1092
1093 The Signal column has been renamed to _Signal due to Signal being a
1094 reserved word in MySQL.
1095
1096 Logo
1097 The logo is inspired by the Cafe Wall illusion. The font is DejaVu
1098 Sans.
1099
1100 Citation notice
1101 Funding a free software project is hard. GNU parallel is no exception.
1102 On top of that it seems the less visible a project is, the harder it is
1103 to get funding. And the nature of GNU parallel is that it will never be
1104 seen by "the guy with the checkbook", but only by the people doing the
1105 actual work.
1106
1107 This problem has been covered by others - though no solution has been
1108 found: https://www.slideshare.net/NadiaEghbal/consider-the-maintainer
1109 https://www.numfocus.org/blog/why-is-numpy-only-now-getting-funded/
1110
1111 Before implementing the citation notice it was discussed with the
1112 users:
1113 https://lists.gnu.org/archive/html/parallel/2013-11/msg00006.html
1114
1115 Having to spend 10 seconds on running parallel --citation once is no
1116 doubt not an ideal solution, but no one has so far come up with an
1117 ideal solution - neither for funding GNU parallel nor other free
1118 software.
1119
1120 If you believe you have the perfect solution, you should try it out,
1121 and if it works, you should post it on the email list. Ideas that will
1122 cost work and which have not been tested are, however, unlikely to be
1123 prioritized.
1124
1125 Running parallel --citation one single time takes less than 10 seconds,
1126 and will silence the citation notice for future runs. This is
1127 comparable to graphical tools where you have to click a checkbox saying
1128 "Do not show this again". But if that is too much trouble for you, why
1129 not use one of the alternatives instead? See a list in: man
1130 parallel_alternatives.
1131
1132 As the request for citation is not a legal requirement this is
1133 acceptable under GPLv3 and cleared with Richard M. Stallman himself.
1134 Thus it does not fall under this:
1135 https://www.gnu.org/licenses/gpl-faq.en.html#RequireCitation
1136
1138 Multiple processes working together
1139 Open3 is slow. Printing is slow. It would be good if they did not tie
1140 up resources, but were run in separate threads.
1141
1142 --rrs on remote using a perl wrapper
1143 ... | perl -pe '$/=$recend$recstart;BEGIN{ if(substr($_) eq $recstart)
1144 substr($_)="" } eof and substr($_) eq $recend) substr($_)=""
1145
1146 It ought to be possible to write a filter that removed rec sep on the
1147 fly instead of inside GNU parallel. This could then use more cpus.
1148
1149 Will that require 2x record size memory?
1150
1151 Will that require 2x block size memory?
1152
1154 These decisions were relevant for earlier versions of GNU parallel, but
1155 not the current version. They are kept here as historical record.
1156
1157 --tollef
1158 You can read about the history of GNU parallel on
1159 https://www.gnu.org/software/parallel/history.html
1160
1161 --tollef was included to make GNU parallel switch compatible with the
1162 parallel from moreutils (which is made by Tollef Fog Heen). This was
1163 done so that users of that parallel easily could port their use to GNU
1164 parallel: Simply set PARALLEL="--tollef" and that would be it.
1165
1166 But several distributions chose to make --tollef global (by putting it
1167 into /etc/parallel/config) without making the users aware of this, and
1168 that caused much confusion when people tried out the examples from GNU
1169 parallel's man page and these did not work. The users became
1170 frustrated because the distribution did not make it clear to them that
1171 it had made --tollef global.
1172
1173 So to lessen the frustration and the resulting support, --tollef was
1174 obsoleted 20130222 and removed one year later.
1175
1176 Transferring of variables and functions
1177 Until 20150122 variables and functions were transferred by looking at
1178 $SHELL to see whether the shell was a *csh shell. If so the variables
1179 would be set using setenv. Otherwise they would be set using =. This
1180 caused the content of the variable to be repeated:
1181
1182 echo $SHELL | grep "/t\{0,1\}csh" > /dev/null && setenv VAR foo ||
1183 export VAR=foo
1184
1185
1186
118720191022 2019-11-21 PARALLEL_DESIGN(7)