PARALLEL_DESIGN(7)                 parallel                PARALLEL_DESIGN(7)


Design of GNU Parallel
6 This document describes design decisions made in the development of GNU
7 parallel and the reasoning behind them. It will give an overview of why
8 some of the code looks the way it does, and will help new maintainers
9 understand the code better.
10
11 One file program
12 GNU parallel is a Perl script in a single file. It is object oriented,
13 but contrary to normal Perl scripts each class is not in its own file.
14 This is due to user experience: The goal is that in a pinch the user
15 will be able to get GNU parallel working simply by copying a single
16 file: No need to mess around with environment variables like PERL5LIB.
17
18 Choice of programming language
19 GNU parallel is designed to be able to run on old systems. That means
20 that it cannot depend on a compiler being installed - and especially
21 not a compiler for a language that is younger than 20 years old.
22
23 The goal is that you can use GNU parallel on any system, even if you
24 are not allowed to install additional software.
25
26 Of all the systems I have experienced, I have yet to see a system that
27 had GCC installed that did not have Perl. The same goes for Rust, Go,
28 Haskell, and other younger languages. I have, however, seen systems
29 with Perl without any of the mentioned compilers.
30
Most modern systems also have either Python2 or Python3 installed, but
you still cannot be certain which version, and since code written for
Python2 cannot run under Python3, Python is not an option.
34
35 Perl has the added benefit that implementing the {= perlexpr =}
36 replacement string was fairly easy.
37
38 Old Perl style
39 GNU parallel uses some old, deprecated constructs. This is due to a
40 goal of being able to run on old installations. Currently the target is
41 CentOS 3.9 and Perl 5.8.0.
42
43 Scalability up and down
44 The smallest system GNU parallel is tested on is a 32 MB ASUS WL500gP.
45 The largest is a 2 TB 128-core machine. It scales up to around 100
46 machines - depending on the duration of each job.
47
48 Exponentially back off
GNU parallel busy waits. This is because a job may be kept from
starting due to the load average (when using --load), so it makes no
sense simply to wait for a job to finish. Instead the load average
must be rechecked regularly. Load average is not the only reason:
--timeout has a similar problem.
54
55 To not burn up too much CPU GNU parallel sleeps exponentially longer
56 and longer if nothing happens, maxing out at 1 second.
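
A minimal sketch of the idea (the names are made up; the real code is
more involved, but uses the same growth formula as the remote wrapper
shown later):

    # Sleep exponentially longer when nothing happens, capped at 1 sec
    my $sleep = 0.001;
    while(1) {
        if(something_happened()) {    # hypothetical check
            $sleep = 0.001;           # reset the back off
        } else {
            $sleep = $sleep < 1 ? 0.001 + $sleep * 1.03 : 1;
            select(undef, undef, undef, $sleep);   # sub-second sleep
        }
    }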
57
58 Shell compatibility
59 It is a goal to have GNU parallel work equally well in any shell.
60 However, in practice GNU parallel is being developed in bash and thus
61 testing in other shells is limited to reported bugs.
62
63 When an incompatibility is found there is often not an easy fix: Fixing
64 the problem in csh often breaks it in bash. In these cases the fix is
65 often to use a small Perl script and call that.
66
67 env_parallel
env_parallel is a dummy shell script that runs if env_parallel is not
an alias or a function; it tells the user how to activate the
alias/function for the supported shells.
71
72 The alias or function will copy the current environment and run the
73 command with GNU parallel in the copy of the environment.
74
75 The problem is that you cannot access all of the current environment
76 inside Perl. E.g. aliases, functions and unexported shell variables.
77
78 The idea is therefore to take the environment and put it in
79 $PARALLEL_ENV which GNU parallel prepends to every command.
80
The only way to have access to the environment is directly from the
shell, so the program must be written in a shell script that will be
sourced, and it has to deal with the dialect of the relevant shell.
84
85 env_parallel.*
86
These are the files that implement the alias or function env_parallel
88 for a given shell. It could be argued that these should be put in some
89 obscure place under /usr/lib, but by putting them in your path it
90 becomes trivial to find the path to them and source them:
91
92 source `which env_parallel.foo`
93
94 The beauty is that they can be put anywhere in the path without the
95 user having to know the location. So if the user's path includes
96 /afs/bin/i386_fc5 or /usr/pkg/parallel/bin or
97 /usr/local/parallel/20161222/sunos5.6/bin the files can be put in the
98 dir that makes most sense for the sysadmin.
99
100 env_parallel.bash / env_parallel.sh / env_parallel.ash /
101 env_parallel.dash / env_parallel.zsh / env_parallel.ksh /
102 env_parallel.mksh
103
104 env_parallel.(bash|sh|ash|dash|ksh|mksh|zsh) defines the function
105 env_parallel. It uses alias and typeset to dump the configuration (with
106 a few exceptions) into $PARALLEL_ENV before running GNU parallel.
107
108 After GNU parallel is finished, $PARALLEL_ENV is deleted.
109
110 env_parallel.csh
111
env_parallel.csh has two purposes: If env_parallel is not an alias,
it makes env_parallel an alias that sets $PARALLEL to the arguments
and calls env_parallel.csh.
115
116 If env_parallel is an alias, then env_parallel.csh uses $PARALLEL as
117 the arguments for GNU parallel.
118
119 It exports the environment by writing a variable definition to a file
120 for each variable. The definitions of aliases are appended to this
121 file. Finally the file is put into $PARALLEL_ENV.
122
123 GNU parallel is then run and $PARALLEL_ENV is deleted.
124
125 env_parallel.fish
126
First all function definitions are generated using a loop and fish's
functions builtin.
129
130 Dumping the scalar variable definitions is harder.
131
132 fish can represent non-printable characters in (at least) 2 ways. To
133 avoid problems all scalars are converted to \XX quoting.
134
135 Then commands to generate the definitions are made and separated by
136 NUL.
137
138 This is then piped into a Perl script that quotes all values. List
139 elements will be appended using two spaces.
140
141 Finally \n is converted into \1 because fish variables cannot contain
142 \n. GNU parallel will later convert all \1 from $PARALLEL_ENV into \n.
143
144 This is then all saved in $PARALLEL_ENV.
145
146 GNU parallel is called, and $PARALLEL_ENV is deleted.
147
148 parset (supported in sh, ash, dash, bash, zsh, ksh, mksh)
149 parset is a shell function. This is the reason why parset can set
150 variables: It runs in the shell which is calling it.
151
It is also the reason why parset does not work when data is piped into
it: ... | parset ... makes parset start in a subshell, and any changes
in the environment can therefore not make it back to the calling shell.
155
156 Job slots
157 The easiest way to explain what GNU parallel does is to assume that
158 there are a number of job slots, and when a slot becomes available a
159 job from the queue will be run in that slot. But originally GNU
160 parallel did not model job slots in the code. Job slots have been added
161 to make it possible to use {%} as a replacement string.
162
163 While the job sequence number can be computed in advance, the job slot
164 can only be computed the moment a slot becomes available. So it has
165 been implemented as a stack with lazy evaluation: Draw one from an
166 empty stack and the stack is extended by one. When a job is done, push
167 the available job slot back on the stack.
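
A sketch of the lazy stack (variable names are made up):

    my @slots;          # free job slots
    my $max_slot = 0;   # how many slots exist so far
    sub reserve_slot {
        # Drawing from an empty stack extends the stack by one
        @slots or push @slots, ++$max_slot;
        return pop @slots;
    }
    sub release_slot {
        # Job done: push the slot back for reuse
        push @slots, shift;
    }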
168
169 This implementation also means that if you re-run the same jobs, you
170 cannot assume jobs will get the same slots. And if you use remote
171 executions, you cannot assume that a given job slot will remain on the
same remote server. This goes double since the number of job slots
can be adjusted on the fly (by giving --jobs a file name).
174
175 Rsync protocol version
176 rsync 3.1.x uses protocol 31 which is unsupported by version 2.5.7.
177 That means that you cannot push a file to a remote system using rsync
178 protocol 31, if the remote system uses 2.5.7. rsync does not
179 automatically downgrade to protocol 30.
180
181 GNU parallel does not require protocol 31, so if the rsync version is
182 >= 3.1.0 then --protocol 30 is added to force newer rsyncs to talk to
183 version 2.5.7.
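
Schematically (a sketch, not the real code; real version strings need
more careful parsing than this):

    my ($ver) = (`rsync --version` =~ /version (\d+\.\d+)/);
    my $protocol_opt = $ver >= 3.1 ? "--protocol 30" : "";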
184
185 Compression
186 GNU parallel buffers output in temporary files. --compress compresses
187 the buffered data. This is a bit tricky because there should be no
188 files to clean up if GNU parallel is killed by a power outage.
189
190 GNU parallel first selects a compression program. If the user has not
191 selected one, the first of these that is in $PATH is used: pzstd lbzip2
192 pbzip2 zstd pixz lz4 pigz lzop plzip lzip gzip lrz pxz bzip2 lzma xz
193 clzip. They are sorted by speed on a 128 core machine.
194
195 Schematically the setup is as follows:
196
197 command started by parallel | compress > tmpfile
198 cattail tmpfile | uncompress | parallel which reads the output
199
200 The setup is duplicated for both standard output (stdout) and standard
201 error (stderr).
202
203 GNU parallel pipes output from the command run into the compression
204 program which saves to a tmpfile. GNU parallel records the pid of the
205 compress program. At the same time a small Perl script (called cattail
206 above) is started: It basically does cat followed by tail -f, but it
207 also removes the tmpfile as soon as the first byte is read, and it
continuously checks whether the compression program is still alive.
When the compression program dies, cattail reads the rest of tmpfile
and exits.
211
212 As most compression programs write out a header when they start, the
213 tmpfile in practice is removed by cattail after around 40 ms.
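
The core of cattail can be sketched like this (simplified; the real
script does more bookkeeping):

    # usage: cattail tmpfile pid_of_compressor
    my ($file, $comp_pid) = @ARGV;
    open(my $in, "<", $file) || die;
    my ($buf, $removed);
    while(1) {
        if(sysread($in, $buf, 131072)) {
            $removed++ or unlink $file;  # remove after first byte read
            syswrite(STDOUT, $buf);
        } elsif(not kill(0, $comp_pid)) {
            # Compressor is dead: read the rest and exit
            while(sysread($in, $buf, 131072)) { syswrite(STDOUT, $buf); }
            exit;
        } else {
            select(undef, undef, undef, 0.01);  # wait for more output
        }
    }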
214
215 Wrapping
216 The command given by the user can be wrapped in multiple templates.
217 Templates can be wrapped in other templates.
218
219 $COMMAND the command to run.
220
221 $INPUT the input to run.
222
223 $SHELL the shell that started GNU Parallel.
224
225 $SSHLOGIN the sshlogin.
226
227 $WORKDIR the working dir.
228
229 $FILE the file to read parts from.
230
231 $STARTPOS the first byte position to read from $FILE.
232
233 $LENGTH the number of bytes to read from $FILE.
234
235 --shellquote echo Double quoted $INPUT
236
237 --nice pri Remote: See The remote system wrapper.
238
239 Local: setpriority(0,0,$nice)
240
241 --cat
242 cat > {}; $COMMAND {};
243 perl -e '$bash = shift;
244 $csh = shift;
245 for(@ARGV) { unlink;rmdir; }
246 if($bash =~ s/h//) { exit $bash; }
247 exit $csh;' "$?h" "$status" {};
248
249 {} is set to $PARALLEL_TMP which is a tmpfile. The Perl
250 script saves the exit value, unlinks the tmpfile, and
251 returns the exit value - no matter if the shell is
252 bash/ksh/zsh (using $?) or *csh/fish (using $status).
253
254 --fifo
255 perl -e '($s,$c,$f) = @ARGV;
256 # mkfifo $PARALLEL_TMP
257 system "mkfifo", $f;
258 # spawn $shell -c $command &
259 $pid = fork || exec $s, "-c", $c;
260 open($o,">",$f) || die $!;
261 # cat > $PARALLEL_TMP
262 while(sysread(STDIN,$buf,131072)){
263 syswrite $o, $buf;
264 }
265 close $o;
266 # waitpid to get the exit code from $command
267 waitpid $pid,0;
268 # Cleanup
269 unlink $f;
270 exit $?/256;' $SHELL -c $COMMAND $PARALLEL_TMP
271
This is an elaborate way of: mkfifo {}; run $COMMAND in
the background using $SHELL; copy STDIN to {}; wait for
the background job to complete; remove {} and exit with
the exit code from $COMMAND.
276
277 It is made this way to be compatible with *csh/fish.
278
279 --pipepart
280 < $FILE perl -e 'while(@ARGV) {
281 sysseek(STDIN,shift,0) || die;
282 $left = shift;
283 while($read =
284 sysread(STDIN,$buf,
285 ($left > 131072 ? 131072 : $left))){
286 $left -= $read;
287 syswrite(STDOUT,$buf);
288 }
289 }' $STARTPOS $LENGTH
290
291 This will read $LENGTH bytes from $FILE starting at
292 $STARTPOS and send it to STDOUT.
293
294 --sshlogin $SSHLOGIN
295 ssh $SSHLOGIN "$COMMAND"
296
297 --transfer
298 ssh $SSHLOGIN mkdir -p ./$WORKDIR;
299 rsync --protocol 30 -rlDzR \
300 -essh ./{} $SSHLOGIN:./$WORKDIR;
301 ssh $SSHLOGIN "$COMMAND"
302
303 Read about --protocol 30 in the section Rsync protocol
304 version.
305
306 --transferfile file
307 <<todo>>
308
309 --basefile <<todo>>
310
311 --return file
312 $COMMAND; _EXIT_status=$?; mkdir -p $WORKDIR;
313 rsync --protocol 30 \
314 --rsync-path=cd\ ./$WORKDIR\;\ rsync \
315 -rlDzR -essh $SSHLOGIN:./$FILE ./$WORKDIR;
316 exit $_EXIT_status;
317
318 The --rsync-path=cd ... is needed because old versions
319 of rsync do not support --no-implied-dirs.
320
321 The $_EXIT_status trick is to postpone the exit value.
322 This makes it incompatible with *csh and should be fixed
323 in the future. Maybe a wrapping 'sh -c' is enough?
324
325 --cleanup $RETURN is the wrapper from --return
326
327 $COMMAND; _EXIT_status=$?; $RETURN;
328 ssh $SSHLOGIN \(rm\ -f\ ./$WORKDIR/{}\;\
329 rmdir\ ./$WORKDIR\ \>\&/dev/null\;\);
330 exit $_EXIT_status;
331
332 $_EXIT_status: see --return above.
333
334 --pipe
335 perl -e 'if(sysread(STDIN, $buf, 1)) {
336 open($fh, "|-", "@ARGV") || die;
337 syswrite($fh, $buf);
338 # Align up to 128k block
339 if($read = sysread(STDIN, $buf, 131071)) {
340 syswrite($fh, $buf);
341 }
342 while($read = sysread(STDIN, $buf, 131072)) {
343 syswrite($fh, $buf);
344 }
345 close $fh;
346 exit ($?&127 ? 128+($?&127) : 1+$?>>8)
347 }' $SHELL -c $COMMAND
348
349 This small wrapper makes sure that $COMMAND will never
350 be run if there is no data.
351
352 --tmux <<TODO Fixup with '-quoting>> mkfifo /tmp/tmx3cMEV &&
353 sh -c 'tmux -S /tmp/tmsaKpv1 new-session -s p334310 -d
354 "sleep .2" >/dev/null 2>&1'; tmux -S /tmp/tmsaKpv1 new-
355 window -t p334310 -n wc\ 10 \(wc\ 10\)\;\ perl\ -e\
356 \'while\(\$t++\<3\)\{\ print\ \$ARGV\[0\],\"\\n\"\ \}\'\
357 \$\?h/\$status\ \>\>\ /tmp/tmx3cMEV\&echo\ wc\\\ 10\;\
358 echo\ \Job\ finished\ at:\ \`date\`\;sleep\ 10; exec
359 perl -e '$/="/";$_=<>;$c=<>;unlink $ARGV; /(\d+)h/ and
360 exit($1);exit$c' /tmp/tmx3cMEV
361
362 mkfifo tmpfile.tmx; tmux -S <tmpfile.tms> new-session -s
363 pPID -d 'sleep .2' >&/dev/null; tmux -S <tmpfile.tms>
364 new-window -t pPID -n <<shell quoted input>> \(<<shell
365 quoted input>>\)\;\ perl\ -e\ \'while\(\$t++\<3\)\{\
366 print\ \$ARGV\[0\],\"\\n\"\ \}\'\ \$\?h/\$status\ \>\>\
367 tmpfile.tmx\&echo\ <<shell double quoted input>>\;echo\
368 \Job\ finished\ at:\ \`date\`\;sleep\ 10; exec perl -e
369 '$/="/";$_=<>;$c=<>;unlink $ARGV; /(\d+)h/ and
370 exit($1);exit$c' tmpfile.tmx
371
First a FIFO is made (.tmx). It is used for
communicating the exit value. Next a new tmux session is
374 made. This may fail if there is already a session, so
375 the output is ignored. If all job slots finish at the
376 same time, then tmux will close the session. A temporary
377 socket is made (.tms) to avoid a race condition in tmux.
378 It is cleaned up when GNU parallel finishes.
379
380 The input is used as the name of the windows in tmux.
381 When the job inside tmux finishes, the exit value is
382 printed to the FIFO (.tmx). This FIFO is opened by perl
383 outside tmux, and perl then removes the FIFO. Perl
384 blocks until the first value is read from the FIFO, and
385 this value is used as exit value.
386
387 To make it compatible with csh and bash the exit value
388 is printed as: $?h/$status and this is parsed by perl.
389
390 There is a bug that makes it necessary to print the exit
391 value 3 times.
392
Another bug in tmux means that the combined length of
the tmux title and command must stay outside certain
ranges. When the length falls inside these ranges, 75
'\ ' are added to the title to force it outside them.
397
398 You can map the bad limits using:
399
400 perl -e 'sub r { int(rand(shift)).($_[0] && "\t".r(@_)) } print map { r(@ARGV)."\n" } 1..10000' 1600 1500 90 |
401 perl -ane '$F[0]+$F[1]+$F[2] < 2037 and print ' |
402 parallel --colsep '\t' --tagstring '{1}\t{2}\t{3}' tmux -S /tmp/p{%}-'{=3 $_="O"x$_ =}' \
403 new-session -d -n '{=1 $_="O"x$_ =}' true'\ {=2 $_="O"x$_ =};echo $?;rm -f /tmp/p{%}-O*'
404
405 perl -e 'sub r { int(rand(shift)).($_[0] && "\t".r(@_)) } print map { r(@ARGV)."\n" } 1..10000' 17000 17000 90 |
406 parallel --colsep '\t' --tagstring '{1}\t{2}\t{3}' \
407 tmux -S /tmp/p{%}-'{=3 $_="O"x$_ =}' new-session -d -n '{=1 $_="O"x$_ =}' true'\ {=2 $_="O"x$_ =};echo $?;rm /tmp/p{%}-O*'
408 > value.csv 2>/dev/null
409
410 R -e 'a<-read.table("value.csv");X11();plot(a[,1],a[,2],col=a[,4]+5,cex=0.1);Sys.sleep(1000)'
411
412 For tmux 1.8 17000 can be lowered to 2100.
413
414 The interesting areas are title 0..1000 with (title +
415 whole command) in 996..1127 and 9331..9636.
416
417 The ordering of the wrapping is important:
418
419 • $PARALLEL_ENV which is set in env_parallel.* must be prepended to
420 the command first, as the command may contain exported variables
421 or functions.
422
423 • --nice/--cat/--fifo should be done on the remote machine
424
425 • --pipepart/--pipe should be done on the local machine inside
426 --tmux
427
428 Convenience options --nice --basefile --transfer --return --cleanup --tmux
429 --group --compress --cat --fifo --workdir --tag --tagstring
430 These are all convenience options that make it easier to do a task. But
431 more importantly: They are tested to work on corner cases, too. Take
432 --nice as an example:
433
434 nice parallel command ...
435
436 will work just fine. But when run remotely, you need to move the nice
437 command so it is being run on the server:
438
439 parallel -S server nice command ...
440
441 And this will again work just fine, as long as you are running a single
442 command. When you are running a composed command you need nice to apply
443 to the whole command, and it gets harder still:
444
445 parallel -S server -q nice bash -c 'command1 ...; cmd2 | cmd3'
446
447 It is not impossible, but by using --nice GNU parallel will do the
448 right thing for you. Similarly when transferring files: It starts to
449 get hard when the file names contain space, :, `, *, or other special
450 characters.
451
452 To run the commands in a tmux session you basically just need to quote
453 the command. For simple commands that is easy, but when commands
454 contain special characters, it gets much harder to get right.
455
456 --compress not only compresses standard output (stdout) but also
standard error (stderr); and it does so into files that are open but
deleted, so a crash will not leave these files around.
459
460 --cat and --fifo are easy to do by hand, until you want to clean up the
461 tmpfile and keep the exit code of the command.
462
The real killer comes when you try to combine several of these: Doing
that correctly for all corner cases is next to impossible by hand.
466
467 --shard
468 The simple way to implement sharding would be to:
469
470 1. start n jobs,
471
472 2. split each line into columns,
473
474 3. select the data from the relevant column
475
476 4. compute a hash value from the data
477
478 5. take the modulo n of the hash value
479
480 6. pass the full line to the jobslot that has the computed value
481
482 Unfortunately Perl is rather slow at computing the hash value (and
483 somewhat slow at splitting into columns).
484
485 One solution is to use a compiled language for the splitting and
486 hashing, but that would go against the design criteria of not depending
487 on a compiler.
488
489 Luckily those tasks can be parallelized. So GNU parallel starts n
sharders that do steps 2-6, and passes blocks of 100k to each of those
491 in a round robin manner. To make sure these sharders compute the hash
492 the same way, $PERL_HASH_SEED is set to the same value for all
493 sharders.
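
A single sharder can be sketched like this (the hash below is a simple
stand-in for the one actually used, the output files are stand-ins for
the real pipes, and all names are made up):

    # One sharder: hash column $col of each line and write the line
    # to the output belonging to jobslot (hash mod n)
    my $col = 1;     # column to shard by (assumed)
    my $n   = 4;     # number of jobslots (assumed)
    my @out = map { open(my $fh, ">", "shard.$_") || die; $fh } 0..$n-1;
    while(my $line = <STDIN>) {
        my @F = split /\t/, $line;
        my $hash = unpack("%32C*", $F[$col-1]);   # stand-in hash
        print { $out[$hash % $n] } $line;
    }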
494
495 Running n sharders poses a new problem: Instead of having n outputs
496 (one for each computed value) you now have n outputs for each of the n
497 values, so in total n*n outputs; and you need to merge these n*n
498 outputs together into n outputs.
499
500 This can be done by simply running 'parallel -j0 --lb cat :::
501 outputs_for_one_value', but that is rather inefficient, as it spawns a
502 process for each file. Instead the core code from 'parcat' is run,
503 which is also a bit faster.
504
505 All the sharders and parcats communicate through named pipes that are
506 unlinked as soon as they are opened.
507
508 Shell shock
509 The shell shock bug in bash did not affect GNU parallel, but the
510 solutions did. bash first introduced functions in variables named:
511 BASH_FUNC_myfunc() and later changed that to BASH_FUNC_myfunc%%. When
512 transferring functions GNU parallel reads off the function and changes
513 that into a function definition, which is copied to the remote system
514 and executed before the actual command is executed. Therefore GNU
515 parallel needs to know how to read the function.
516
517 From version 20150122 GNU parallel tries both the ()-version and the
518 %%-version, and the function definition works on both pre- and post-
519 shell shock versions of bash.
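
Reading off a function that works with both namings can be sketched
like this (a sketch, not the actual code):

    # Find exported bash functions under both naming schemes
    for my $var (keys %ENV) {
        if(my ($name) = $var =~ /^BASH_FUNC_(\w+)(?:\(\)|%%)$/) {
            # $ENV{$var} is "() { ... }", so this gives a definition
            my $definition = "$name $ENV{$var}";
            # ... prepend $definition to the command run remotely
        }
    }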
520
521 The remote system wrapper
522 The remote system wrapper does some initialization before starting the
523 command on the remote system.
524
525 Make quoting unnecessary by hex encoding everything
526
527 When you run ssh server foo then foo has to be quoted once:
528
529 ssh server "echo foo; echo bar"
530
531 If you run ssh server1 ssh server2 foo then foo has to be quoted twice:
532
533 ssh server1 ssh server2 \'"echo foo; echo bar"\'
534
GNU parallel avoids this by packing everything into hex values and
running a command that does not need quoting:
537
538 perl -X -e GNU_Parallel_worker,eval+pack+q/H10000000/,join+q//,@ARGV
539
540 This command reads hex from the command line and converts that to bytes
541 that are then eval'ed as a Perl expression.
542
The string GNU_Parallel_worker is not needed. It is simply there to
let the user know that this process is GNU parallel working.
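
The packing side is simply unpack "H*". Note how the unary + in the
command acts as a space, so the command itself needs no quoting at any
level either (a simplified sketch mirroring the command above):

    # Hex encode a Perl program, so no shell quoting is needed
    my $perlcode = 'print "foo\n"';    # code to run remotely
    my $hex = unpack "H*", $perlcode;  # only [0-9a-f] left
    system "ssh", "server1", "ssh", "server2",
           "perl -e eval+pack+q/H10000000/,join+q//,\@ARGV $hex";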
545
546 Ctrl-C and standard error (stderr)
547
548 If the user presses Ctrl-C the user expects jobs to stop. This works
549 out of the box if the jobs are run locally. Unfortunately it is not so
550 simple if the jobs are run remotely.
551
552 If remote jobs are run in a tty using ssh -tt, then Ctrl-C works, but
553 all output to standard error (stderr) is sent to standard output
554 (stdout). This is not what the user expects.
555
556 If remote jobs are run without a tty using ssh (without -tt), then
557 output to standard error (stderr) is kept on stderr, but Ctrl-C does
558 not kill remote jobs. This is not what the user expects.
559
So what is needed is a way to have both. It seems the reason why
Ctrl-C does not kill the remote jobs is that the shell does not
propagate the hang-up signal from sshd. But when sshd dies, the parent of the
563 login shell becomes init (process id 1). So by exec'ing a Perl wrapper
564 to monitor the parent pid and kill the child if the parent pid becomes
565 1, then Ctrl-C works and stderr is kept on stderr.
566
567 Ctrl-C does, however, kill the ssh connection, so any output from a
568 remote dying process is lost.
569
570 To be able to kill all (grand)*children a new process group is started.
571
572 --nice
573
Nicing the remote process is done by setpriority(0,0,$nice). A few old
systems do not implement this and --nice is unsupported on those.
576
577 Setting $PARALLEL_TMP
578
$PARALLEL_TMP is used by --fifo and --cat and must point to a
non-existent file in $TMPDIR. This file name is computed on the
remote system.
582
583 The wrapper
584
585 The wrapper looks like this:
586
587 $shell = $PARALLEL_SHELL || $SHELL;
588 $tmpdir = $TMPDIR || $PARALLEL_REMOTE_TMPDIR;
589 $nice = $opt::nice;
590 $termseq = $opt::termseq;
591
592 # Check that $tmpdir is writable
593 -w $tmpdir ||
594 die("$tmpdir is not writable.".
595 " Set PARALLEL_REMOTE_TMPDIR");
596 # Set $PARALLEL_TMP to a non-existent file name in $TMPDIR
597 do {
598 $ENV{PARALLEL_TMP} = $tmpdir."/par".
599 join"", map { (0..9,"a".."z","A".."Z")[rand(62)] } (1..5);
600 } while(-e $ENV{PARALLEL_TMP});
601 # Set $script to a non-existent file name in $TMPDIR
602 do {
603 $script = $tmpdir."/par".
604 join"", map { (0..9,"a".."z","A".."Z")[rand(62)] } (1..5);
605 } while(-e $script);
606 # Create a script from the hex code
607 # that removes itself and runs the commands
608 open($fh,">",$script) || die;
609 # ' needed due to rc-shell
610 print($fh("rm \'$script\'\n",$bashfunc.$cmd));
611 close $fh;
612 my $parent = getppid;
613 my $done = 0;
614 $SIG{CHLD} = sub { $done = 1; };
615 $pid = fork;
616 unless($pid) {
# Make own process group, so it can be killed (HUP'ed) later
618 eval { setpgrp };
619 # Set nice value
620 eval { setpriority(0,0,$nice) };
621 # Run the script
622 exec($shell,$script);
623 die("exec failed: $!");
624 }
625 while((not $done) and (getppid == $parent)) {
626 # Parent pid is not changed, so sshd is alive
627 # Exponential sleep up to 1 sec
628 $s = $s < 1 ? 0.001 + $s * 1.03 : $s;
629 select(undef, undef, undef, $s);
630 }
631 if(not $done) {
632 # sshd is dead: User pressed Ctrl-C
633 # Kill as per --termseq
634 my @term_seq = split/,/,$termseq;
635 if(not @term_seq) {
636 @term_seq = ("TERM",200,"TERM",100,"TERM",50,"KILL",25);
637 }
638 while(@term_seq && kill(0,-$pid)) {
639 kill(shift @term_seq, -$pid);
640 select(undef, undef, undef, (shift @term_seq)/1000);
641 }
642 }
643 wait;
644 exit ($?&127 ? 128+($?&127) : 1+$?>>8)
645
646 Transferring of variables and functions
647 Transferring of variables and functions given by --env is done by
648 running a Perl script remotely that calls the actual command. The Perl
649 script sets $ENV{variable} to the correct value before exec'ing a shell
650 that runs the function definition followed by the actual command.
651
652 The function env_parallel copies the full current environment into the
653 environment variable PARALLEL_ENV. This variable is picked up by GNU
654 parallel and used to create the Perl script mentioned above.
655
656 Base64 encoded bzip2
657 csh limits words of commands to 1024 chars. This is often too little
658 when GNU parallel encodes environment variables and wraps the command
659 with different templates. All of these are combined and quoted into one
660 single word, which often is longer than 1024 chars.
661
662 When the line to run is > 1000 chars, GNU parallel therefore encodes
663 the line to run. The encoding bzip2s the line to run, converts this to
664 base64, splits the base64 into 1000 char blocks (so csh does not fail),
665 and prepends it with this Perl script that decodes, decompresses and
666 evals the line.
667
668 @GNU_Parallel=("use","IPC::Open3;","use","MIME::Base64");
669 eval "@GNU_Parallel";
670
671 $SIG{CHLD}="IGNORE";
672 # Search for bzip2. Not found => use default path
673 my $zip = (grep { -x $_ } "/usr/local/bin/bzip2")[0] || "bzip2";
674 # $in = stdin on $zip, $out = stdout from $zip
675 my($in, $out,$eval);
676 open3($in,$out,">&STDERR",$zip,"-dc");
677 if(my $perlpid = fork) {
678 close $in;
679 $eval = join "", <$out>;
680 close $out;
681 } else {
682 close $out;
683 # Pipe decoded base64 into 'bzip2 -dc'
684 print $in (decode_base64(join"",@ARGV));
685 close $in;
686 exit;
687 }
688 wait;
689 eval $eval;
690
691 Perl and bzip2 must be installed on the remote system, but a small test
showed that bzip2 is installed by default on all platforms that run
693 GNU parallel, so this is not a big problem.
694
695 The added bonus of this is that much bigger environments can now be
696 transferred as they will be below bash's limit of 131072 chars.
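
The encoding side corresponds roughly to this sketch (for big inputs
the writer should fork, as the decoder above does, to avoid pipe
deadlock):

    use IPC::Open2;
    use MIME::Base64;
    my $line_to_run = "the long command";
    # Pipe the line through bzip2
    my $pid = open2(my $out, my $in, "bzip2", "-9");
    print $in $line_to_run;
    close $in;
    my $b64 = encode_base64(join("", <$out>), "");
    close $out;
    waitpid $pid, 0;
    # Split into 1000 char words, so csh does not fail
    my @blocks = unpack "(A1000)*", $b64;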
697
698 Which shell to use
699 Different shells behave differently. A command that works in tcsh may
700 not work in bash. It is therefore important that the correct shell is
701 used when GNU parallel executes commands.
702
703 GNU parallel tries hard to use the right shell. If GNU parallel is
704 called from tcsh it will use tcsh. If it is called from bash it will
705 use bash. It does this by looking at the (grand)*parent process: If the
706 (grand)*parent process is a shell, use this shell; otherwise look at
707 the parent of this (grand)*parent. If none of the (grand)*parents are
708 shells, then $SHELL is used.
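
On GNU/Linux the walk up the process tree can be sketched like this
(simplified; the real code knows more shells and more platforms):

    my %is_shell = map { $_ => 1 }
        qw(sh ash dash bash csh tcsh ksh mksh zsh fish);
    sub parent_shell {
        my $pid = shift;
        while($pid > 1) {
            # /proc/PID/stat = "pid (comm) state ppid ..."
            open(my $fh, "<", "/proc/$pid/stat") || last;
            my ($comm, $ppid) = <$fh> =~ /\((.*)\)\s+\S+\s+(\d+)/;
            return $comm if $is_shell{$comm};
            $pid = $ppid;
        }
        return $ENV{SHELL};   # no shell found: fall back to $SHELL
    }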
709
710 This will do the right thing if called from:
711
712 • an interactive shell
713
714 • a shell script
715
716 • a Perl script in `` or using system if called as a single string.
717
718 While these cover most cases, there are situations where it will fail:
719
720 • When run using exec.
721
722 • When run as the last command using -c from another shell (because
723 some shells use exec):
724
725 zsh% bash -c "parallel 'echo {} is not run in bash; \
726 set | grep BASH_VERSION' ::: This"
727
728 You can work around that by appending '&& true':
729
730 zsh% bash -c "parallel 'echo {} is run in bash; \
731 set | grep BASH_VERSION' ::: This && true"
732
733 • When run in a Perl script using system with parallel as the first
734 string:
735
736 #!/usr/bin/perl
737
738 system("parallel",'setenv a {}; echo $a',":::",2);
739
740 Here it depends on which shell is used to call the Perl script. If
741 the Perl script is called from tcsh it will work just fine, but if it
742 is called from bash it will fail, because the command setenv is not
743 known to bash.
744
If GNU parallel guesses wrong in these situations, set the shell
using $PARALLEL_SHELL.
747
748 Always running commands in a shell
749 If the command is a simple command with no redirection and setting of
750 variables, the command could be run without spawning a shell. E.g. this
751 simple grep matching either 'ls ' or ' wc >> c':
752
753 parallel "grep -E 'ls | wc >> c' {}" ::: foo
754
755 could be run as:
756
757 system("grep","-E","ls | wc >> c","foo");
758
759 However, as soon as the command is a bit more complex a shell must be
760 spawned:
761
762 parallel "grep -E 'ls | wc >> c' {} | wc >> c" ::: foo
763 parallel "LANG=C grep -E 'ls | wc >> c' {}" ::: foo
764
765 It is impossible to tell how | wc >> c should be interpreted without
766 parsing the string (is the | a pipe in shell or an alternation in a
767 grep regexp? Is LANG=C a command in csh or setting a variable in bash?
768 Is >> redirection or part of a regexp?).
769
770 On top of this, wrapper scripts will often require a shell to be
771 spawned.
772
773 The downside is that you need to quote special shell chars twice:
774
775 parallel echo '*' ::: This will expand the asterisk
776 parallel echo "'*'" ::: This will not
777 parallel "echo '*'" ::: This will not
778 parallel echo '\*' ::: This will not
779 parallel echo \''*'\' ::: This will not
780 parallel -q echo '*' ::: This will not
781
782 -q will quote all special chars, thus redirection will not work: this
783 prints '* > out.1' and does not save '*' into the file out.1:
784
785 parallel -q echo "*" ">" out.{} ::: 1
786
GNU parallel tries to live up to the Principle Of Least Astonishment
(POLA), and the requirement of using -q is hard to understand when
you do not see the whole picture.
790
791 Quoting
792 Quoting depends on the shell. For most shells '-quoting is used for
793 strings containing special characters.
794
795 For tcsh/csh newline is quoted as \ followed by newline. Other special
796 characters are also \-quoted.
797
798 For rc everything is quoted using '.
799
800 --pipepart vs. --pipe
801 While --pipe and --pipepart look much the same to the user, they are
802 implemented very differently.
803
804 With --pipe GNU parallel reads the blocks from standard input (stdin),
805 which is then given to the command on standard input (stdin); so every
806 block is being processed by GNU parallel itself. This is the reason why
807 --pipe maxes out at around 500 MB/sec.
808
809 --pipepart, on the other hand, first identifies at which byte positions
blocks start and how long they are. It does that by seeking into the
file by the size of a block and then reading until it meets the end
of a block. The seeking explains why GNU parallel does not know the
line number and why -L/-l and -N do not work.
814
With a reasonable block and file size this seeking is more than 1000
times faster than reading the full file. The byte positions are then
817 given to a small script that reads from position X to Y and sends
818 output to standard output (stdout). This small script is prepended to
819 the command and the full command is executed just as if GNU parallel
820 had been in its normal mode. The script looks like this:
821
822 < file perl -e 'while(@ARGV) {
823 sysseek(STDIN,shift,0) || die;
824 $left = shift;
825 while($read = sysread(STDIN,$buf,
826 ($left > 131072 ? 131072 : $left))){
827 $left -= $read; syswrite(STDOUT,$buf);
828 }
829 }' startbyte length_in_bytes
830
831 It delivers 1 GB/s per core.
832
833 Instead of the script dd was tried, but many versions of dd do not
834 support reading from one byte to another and might cause partial data.
835 See this for a surprising example:
836
837 yes | dd bs=1024k count=10 | wc
838
839 --block-size adjustment
840 Every time GNU parallel detects a record bigger than --block-size it
841 increases the block size by 30%. A small --block-size gives very poor
842 performance; by exponentially increasing the block size performance
843 will not suffer.
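
The adjustment itself is simple (a sketch):

    # A record did not fit: grow --block-size by 30%
    $blocksize = int($blocksize * 1.3 + 1);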
844
845 GNU parallel will waste CPU power if --block-size does not contain a
846 full record, because it tries to find a full record and will fail to do
847 so. The recommendation is therefore to use a --block-size > 2 records,
848 so you always get at least one full record when you read one block.
849
850 If you use -N then --block-size should be big enough to contain N+1
851 records.
852
853 Automatic --block-size computation
854 With --pipepart GNU parallel can compute the --block-size
855 automatically. A --block-size of -1 will use a block size so that each
856 jobslot will receive approximately 1 block. --block -2 will pass 2
857 blocks to each jobslot and -n will pass n blocks to each jobslot.
858
859 This can be done because --pipepart reads from files, and we can
860 compute the total size of the input.
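
A sketch of the computation (the names are made up):

    my @files    = @ARGV;   # the files given to --pipepart
    my $jobslots = 8;       # assumed
    my $m        = 2;       # --block -2: ~2 blocks per jobslot
    my $total = 0;
    $total += -s $_ for @files;   # total input size is knowable
    my $blocksize = int($total / ($m * $jobslots)) + 1;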
861
862 --jobs and --onall
863 When running the same commands on many servers what should --jobs
864 signify? Is it the number of servers to run on in parallel? Is it the
865 number of jobs run in parallel on each server?
866
867 GNU parallel lets --jobs represent the number of servers to run on in
868 parallel. This is to make it possible to run a sequence of commands
869 (that cannot be parallelized) on each server, but run the same sequence
870 on multiple servers.
871
872 --shuf
873 When using --shuf to shuffle the jobs, all jobs are read, then they are
874 shuffled, and finally executed. When using SQL this makes the
--sqlmaster be the part that shuffles the jobs. The --sqlworkers
simply execute according to Seq number.
877
878 --csv
879 --pipepart is incompatible with --csv because you can have records
880 like:
881
882 a,b,c
883 a,"
884 a,b,c
885 a,b,c
886 a,b,c
887 ",c
888 a,b,c
889
890 Here the second record contains a multi-line field that looks like
records. Since --pipepart does not read the whole file when searching
892 for record endings, it may start reading in this multi-line field,
893 which would be wrong.
894
895 Buffering on disk
896 GNU parallel buffers output, because if output is not buffered you have
897 to be ridiculously careful on sizes to avoid mixing of outputs (see
898 excellent example on https://catern.com/posts/pipes.html).
899
900 GNU parallel buffers on disk in $TMPDIR using files, that are removed
901 as soon as they are created, but which are kept open. So even if GNU
902 parallel is killed by a power outage, there will be no files to clean
903 up afterwards. Another advantage is that the file system is aware that
904 these files will be lost in case of a crash, so it does not need to
905 sync them to disk.
906
907 It gives the odd situation that a disk can be fully used, but there are
908 no visible files on it.
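
The trick can be sketched like this:

    use File::Temp qw(tempfile);
    my ($fh, $name) = tempfile(DIR => $ENV{TMPDIR} || "/tmp");
    unlink $name;          # from now on there is no visible file
    print $fh "output from a job\n";
    seek $fh, 0, 0;        # but the data can still be read back
    print <$fh>;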
909
910 Partly buffering in memory
911
When using the output formats SQL and CSV, GNU Parallel has to read
the whole output into memory. When run normally it will only read the
output from a single job.
915 printed will also be buffered in memory - for all jobs currently
916 running.
917
918 If memory is tight, then do not use the output format SQL/CSV with
919 --linebuffer.
920
921 Comparing to buffering in memory
922
923 gargs is a parallelizing tool that buffers in memory. It is therefore a
924 useful way of comparing the advantages and disadvantages of buffering
925 in memory to buffering on disk.
926
On a system with 6 GB RAM free and 6 GB free swap these were tested
with different sizes:
929
930 echo /dev/zero | gargs "head -c $size {}" >/dev/null
931 echo /dev/zero | parallel "head -c $size {}" >/dev/null
932
933 The results are here:
934
935 JobRuntime Command
936 0.344 parallel_test 1M
937 0.362 parallel_test 10M
938 0.640 parallel_test 100M
939 9.818 parallel_test 1000M
940 23.888 parallel_test 2000M
941 30.217 parallel_test 2500M
942 30.963 parallel_test 2750M
943 34.648 parallel_test 3000M
944 43.302 parallel_test 4000M
945 55.167 parallel_test 5000M
946 67.493 parallel_test 6000M
947 178.654 parallel_test 7000M
948 204.138 parallel_test 8000M
949 230.052 parallel_test 9000M
950 255.639 parallel_test 10000M
951 757.981 parallel_test 30000M
952 0.537 gargs_test 1M
953 0.292 gargs_test 10M
954 0.398 gargs_test 100M
955 3.456 gargs_test 1000M
956 8.577 gargs_test 2000M
957 22.705 gargs_test 2500M
958 123.076 gargs_test 2750M
959 89.866 gargs_test 3000M
960 291.798 gargs_test 4000M
961
962 GNU parallel is pretty much limited by the speed of the disk: Up to 6
963 GB data is written to disk but cached, so reading is fast. Above 6 GB
964 data are both written and read from disk. When the 30000MB job is
965 running, the disk system is slow, but usable: If you are not using the
966 disk, you almost do not feel it.
967
968 gargs has a speed advantage up until 2500M where it hits a wall. Then
969 the system starts swapping like crazy and is completely unusable. At
970 5000M it goes out of memory.
971
972 You can make GNU parallel behave similar to gargs if you point $TMPDIR
973 to a tmpfs-filesystem: It will be faster for small outputs, but may
974 kill your system for larger outputs and cause you to lose output.
975
976 Disk full
GNU parallel buffers on disk. If the disk is full, data may be lost.
To check if the disk is full GNU parallel writes an 8193 byte file
every second. If this file is written successfully, it is removed
immediately. If it is not written successfully, the disk is full. The
size 8193 was chosen because 8192 gave the wrong result on some file
systems, whereas 8193 did the correct thing on all tested filesystems.
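
A sketch of the check (the file name is made up):

    sub disk_full {
        my $file = ($ENV{TMPDIR} || "/tmp") . "/fulltest.$$";
        open(my $fh, ">", $file) || return 1;
        my $written = syswrite($fh, "x" x 8193);
        close $fh;
        unlink $file;
        # Full if nothing or only part of the 8193 bytes got written
        return !defined($written) || $written < 8193;
    }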
983
984 Memory usage
985 Normally GNU parallel will use around 17 MB RAM constantly - no matter
986 how many jobs or how much output there is. There are a few things that
987 cause the memory usage to rise:
988
989 • Multiple input sources. GNU parallel reads an input source only
990 once. This is by design, as an input source can be a stream (e.g.
991 FIFO, pipe, standard input (stdin)) which cannot be rewound and read
992 again. When reading a single input source, the memory is freed as
993 soon as the job is done - thus keeping the memory usage constant.
994
995 But when reading multiple input sources GNU parallel keeps the
996 already read values for generating all combinations with other input
997 sources.
998
999 • Computing the number of jobs. --bar, --eta, and --halt xx% use
1000 total_jobs() to compute the total number of jobs. It does this by
1001 generating the data structures for all jobs. All these job data
1002 structures will be stored in memory and take up around 400
1003 bytes/job.
1004
1005 • Buffering a full line. --linebuffer will read a full line per
1006 running job. A very long output line (say 1 GB without \n) will
1007 increase RAM usage temporarily: From when the beginning of the line
1008 is read till the line is printed.
1009
1010 • Buffering the full output of a single job. This happens when using
1011 --results *.csv/*.tsv or --sql*. Here GNU parallel will read the
1012 whole output of a single job and save it as csv/tsv or SQL.
1013
1014 Argument separators ::: :::: :::+ ::::+
1015 The argument separator ::: was chosen because I have never seen :::
1016 used in any command. The natural choice -- would be a bad idea since it
1017 is not unlikely that the template command will contain --. I have seen
:: used in programming languages to separate classes, and I did not
1019 want the user to be confused that the separator had anything to do with
1020 classes.
1021
1022 ::: also makes a visual separation, which is good if there are multiple
1023 :::.
1024
1025 When ::: was chosen, :::: came as a fairly natural extension.
1026
1027 Linking input sources meant having to decide for some way to indicate
1028 linking of ::: and ::::. :::+ and ::::+ were chosen, so that they were
1029 similar to ::: and ::::.
1030
1031 Perl replacement strings, {= =}, and --rpl
1032 The shorthands for replacement strings make a command look more
1033 cryptic. Different users will need different replacement strings.
1034 Instead of inventing more shorthands you get more flexible replacement
1035 strings if they can be programmed by the user.
1036
1037 The language Perl was chosen because GNU parallel is written in Perl
1038 and it was easy and reasonably fast to run the code given by the user.
1039
1040 If a user needs the same programmed replacement string again and again,
1041 the user may want to make his own shorthand for it. This is what --rpl
1042 is for. It works so well, that even GNU parallel's own shorthands are
1043 implemented using --rpl.
1044
1045 In Perl code the bigrams {= and =} rarely exist. They look like a
1046 matching pair and can be entered on all keyboards. This made them good
1047 candidates for enclosing the Perl expression in the replacement
1048 strings. Another candidate ,, and ,, was rejected because they do not
1049 look like a matching pair. --parens was made, so that the users can
1050 still use ,, and ,, if they like: --parens ,,,,
1051
1052 Internally, however, the {= and =} are replaced by \257< and \257>.
1053 This is to make it simpler to make regular expressions. You only need
1054 to look one character ahead, and never have to look behind.
1055
1056 Test suite
1057 GNU parallel uses its own testing framework. This is mostly due to
1058 historical reasons. It deals reasonably well with tests that are
1059 dependent on how long a given test runs (e.g. more than 10 secs is a
1060 pass, but less is a fail). It parallelizes most tests, but it is easy
1061 to force a test to run as the single test (which may be important for
1062 timing issues). It deals reasonably well with tests that fail
1063 intermittently. It detects which tests failed and pushes these to the
1064 top, so when running the test suite again, the tests that failed most
1065 recently are run first.
1066
1067 If GNU parallel should adopt a real testing framework then those
1068 elements would be important.
1069
Since many tests depend on which hardware they run on, these tests
break when run on different hardware than what they were written
for.
1073
1074 When most bugs are fixed a test is added, so this bug will not
1075 reappear. It is, however, sometimes hard to create the environment in
1076 which the bug shows up - especially if the bug only shows up sometimes.
1077 One of the harder problems was to make a machine start swapping without
1078 forcing it to its knees.
1079
1080 Median run time
1081 Using a percentage for --timeout causes GNU parallel to compute the
1082 median run time of a job. The median is a better indicator of the
1083 expected run time than average, because there will often be outliers
1084 taking way longer than the normal run time.
1085
1086 To avoid keeping all run times in memory, an implementation of remedian
1087 was made (Rousseeuw et al).
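
The remedian can be sketched as follows: instead of storing every run
time, keep a few fixed-size buffers; when a buffer fills up, its
median moves one level up (names and buffer size are made up):

    my $bufsize = 11;    # must be odd
    my @buf = ([]);
    sub remedian_add {
        my ($v, $i) = (shift, 0);
        while(1) {
            push @{$buf[$i]}, $v;
            last if @{$buf[$i]} < $bufsize;
            # Buffer full: promote its median one level up
            my @s = sort { $a <=> $b } @{$buf[$i]};
            $v = $s[$#s / 2];
            $buf[$i] = [];
            $buf[++$i] ||= [];
        }
    }
    sub remedian {
        # The median of the highest non-empty buffer approximates
        # the median of everything seen so far
        my @s = sort { $a <=> $b } @{(grep { @$_ } @buf)[-1]};
        return $s[$#s / 2];
    }

remedian_add() would be fed the run time of every finished job, and
remedian() would give the value a percentage --timeout is based on.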
1088
1089 Error messages and warnings
1090 Error messages like: ERROR, Not found, and 42 are not very helpful. GNU
1091 parallel strives to inform the user:
1092
1093 • What went wrong?
1094
1095 • Why did it go wrong?
1096
1097 • What can be done about it?
1098
1099 Unfortunately it is not always possible to predict the root cause of
1100 the error.
1101
1102 Determine number of CPUs
CPUs is an ambiguous term. It can mean the number of sockets filled
(i.e. the number of physical chips). It can mean the number of cores
1105 (i.e. the number of physical compute cores). It can mean the number of
1106 hyperthreaded cores (i.e. the number of virtual cores - with some of
1107 them possibly being hyperthreaded).
1108
1109 On ark.intel.com Intel uses the terms cores and threads for number of
1110 physical cores and the number of hyperthreaded cores respectively.
1111
GNU parallel uses CPUs as the number of compute units and the
1113 terms sockets, cores, and threads to specify how the number of compute
1114 units is calculated.
1115
1116 Computation of load
Contrary to the obvious, --load does not use load average. This is
due to load average rising too slowly. Instead it uses ps to list the
1119 number of threads in running or blocked state (state D, O or R). This
1120 gives an instant load.
1121
1122 As remote calculation of load can be slow, a process is spawned to run
1123 ps and put the result in a file, which is then used next time.
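
A sketch of the instant load computation:

    # Count processes in running (R), uninterruptible sleep (D) or
    # runnable-on-another-CPU (O) state
    my $load = grep { /^[DOR]/ } `ps -eo state`;
    $load--;   # do not count the ps process itself (it is running)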
1124
1125 Killing jobs
1126 GNU parallel kills jobs. It can be due to --memfree, --halt, or when
1127 GNU parallel meets a condition from which it cannot recover. Every job
1128 is started as its own process group. This way any (grand)*children will
1129 get killed, too. The process group is killed with the specification
1130 mentioned in --termseq.
1131
1132 SQL interface
1133 GNU parallel uses the DBURL from GNU sql to give database software,
1134 username, password, host, port, database, and table in a single string.
1135
1136 The DBURL must point to a table name. The table will be dropped and
1137 created. The reason for not reusing an existing table is that the user
1138 may have added more input sources which would require more columns in
1139 the table. By prepending '+' to the DBURL the table will not be
1140 dropped.
1141
1142 The table columns are similar to joblog with the addition of V1 .. Vn
1143 which are values from the input sources, and Stdout and Stderr which
1144 are the output from standard output and standard error, respectively.
1145
1146 The Signal column has been renamed to _Signal due to Signal being a
1147 reserved word in MySQL.
1148
1149 Logo
1150 The logo is inspired by the Cafe Wall illusion. The font is DejaVu
1151 Sans.
1152
1153 Citation notice
1154 Funding a free software project is hard. GNU parallel is no exception.
1155 On top of that it seems the less visible a project is, the harder it is
1156 to get funding. And the nature of GNU parallel is that it will never be
1157 seen by "the guy with the checkbook", but only by the people doing the
1158 actual work.
1159
1160 This problem has been covered by others - though no solution has been
1161 found: https://www.slideshare.net/NadiaEghbal/consider-the-maintainer
1162 https://www.numfocus.org/blog/why-is-numpy-only-now-getting-funded/
1163
1164 Before implementing the citation notice it was discussed with the
1165 users:
1166 https://lists.gnu.org/archive/html/parallel/2013-11/msg00006.html
1167
1168 Having to spend 10 seconds on running parallel --citation once is no
1169 doubt not an ideal solution, but no one has so far come up with an
1170 ideal solution - neither for funding GNU parallel nor other free
1171 software.
1172
1173 If you believe you have the perfect solution, you should try it out,
1174 and if it works, you should post it on the email list. Ideas that will
1175 cost work and which have not been tested are, however, unlikely to be
1176 prioritized.
1177
1178 Running parallel --citation one single time takes less than 10 seconds,
1179 and will silence the citation notice for future runs. This is
1180 comparable to graphical tools where you have to click a checkbox saying
1181 "Do not show this again". But if that is too much trouble for you, why
1182 not use one of the alternatives instead? See a list in: man
1183 parallel_alternatives.
1184
1185 As the request for citation is not a legal requirement this is
1186 acceptable under GPLv3 and cleared with Richard M. Stallman himself.
1187 Thus it does not fall under this:
1188 https://www.gnu.org/licenses/gpl-faq.en.html#RequireCitation
1189
Ideas for new design
1191 Multiple processes working together
1192 Open3 is slow. Printing is slow. It would be good if they did not tie
1193 up resources, but were run in separate threads.
1194
1195 --rrs on remote using a perl wrapper
... | perl -pe '$/ = $recend.$recstart;
       s/^\Q$recstart\E// if $. == 1;    # strip leading recstart
       s/\Q$recend$recstart\E$//;        # strip the separator
       eof and s/\Q$recend\E$//;'        # strip trailing recend
1198
It ought to be possible to write a filter like this that removes the
record separators on the fly instead of doing it inside GNU parallel.
This could then use more CPUs.
1201
1202 Will that require 2x record size memory?
1203
1204 Will that require 2x block size memory?
1205
Historical decisions
These decisions were relevant for earlier versions of GNU parallel,
but not the current version. They are kept here as a historical
record.
1209
1210 --tollef
1211 You can read about the history of GNU parallel on
1212 https://www.gnu.org/software/parallel/history.html
1213
1214 --tollef was included to make GNU parallel switch compatible with the
1215 parallel from moreutils (which is made by Tollef Fog Heen). This was
done so that users of that parallel could easily port their use to GNU
1217 parallel: Simply set PARALLEL="--tollef" and that would be it.
1218
1219 But several distributions chose to make --tollef global (by putting it
1220 into /etc/parallel/config) without making the users aware of this, and
1221 that caused much confusion when people tried out the examples from GNU
1222 parallel's man page and these did not work. The users became
1223 frustrated because the distribution did not make it clear to them that
it had made --tollef global.
1225
1226 So to lessen the frustration and the resulting support, --tollef was
1227 obsoleted 20130222 and removed one year later.
1228
1229
1230
20220222                          2022-03-16               PARALLEL_DESIGN(7)