PARALLEL_DESIGN(7)                 parallel                PARALLEL_DESIGN(7)


Design of GNU Parallel
6 This document describes design decisions made in the development of GNU
7 parallel and the reasoning behind them. It will give an overview of why
8 some of the code looks the way it does, and will help new maintainers
9 understand the code better.
10
11 One file program
12 GNU parallel is a Perl script in a single file. It is object oriented,
13 but contrary to normal Perl scripts each class is not in its own file.
14 This is due to user experience: The goal is that in a pinch the user
15 will be able to get GNU parallel working simply by copying a single
16 file: No need to mess around with environment variables like PERL5LIB.
17
18 Choice of programming language
19 GNU parallel is designed to be able to run on old systems. That means
20 that it cannot depend on a compiler being installed - and especially
21 not a compiler for a language that is younger than 20 years old.
22
23 The goal is that you can use GNU parallel on any system, even if you
24 are not allowed to install additional software.
25
26 Of all the systems I have experienced, I have yet to see a system that
27 had GCC installed that did not have Perl. The same goes for Rust, Go,
28 Haskell, and other younger languages. I have, however, seen systems
29 with Perl without any of the mentioned compilers.
30
Most modern systems also have either Python2 or Python3 installed, but
you still cannot be certain which version, and since code written for
Python2 cannot run under Python3, Python is not an option.
34
35 Perl has the added benefit that implementing the {= perlexpr =}
36 replacement string was fairly easy.
37
38 Old Perl style
39 GNU parallel uses some old, deprecated constructs. This is due to a
40 goal of being able to run on old installations. Currently the target is
41 CentOS 3.9 and Perl 5.8.0.
42
43 Scalability up and down
44 The smallest system GNU parallel is tested on is a 32 MB ASUS WL500gP.
45 The largest is a 2 TB 128-core machine. It scales up to around 100
46 machines - depending on the duration of each job.
47
48 Exponentially back off
GNU parallel busy waits. This is because a job may be kept from
starting due to the load average (when using --load), so it makes no
sense simply to wait for a job to finish. Instead the load average
must be rechecked regularly. Load average is not the only reason:
--timeout has a similar problem.
54
55 To not burn up too much CPU GNU parallel sleeps exponentially longer
56 and longer if nothing happens, maxing out at 1 second.
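
A minimal sketch of the idea (the names are made up; the real code is
more involved, but uses the same growth formula as the remote wrapper
shown later):

    # Sleep exponentially longer when nothing happens, capped at 1 sec
    my $sleep = 0.001;
    while(1) {
        if(something_happened()) {    # hypothetical check
            $sleep = 0.001;           # reset the back off
        } else {
            $sleep = $sleep < 1 ? 0.001 + $sleep * 1.03 : 1;
            select(undef, undef, undef, $sleep);   # sub-second sleep
        }
    }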
57
58 Shell compatibility
59 It is a goal to have GNU parallel work equally well in any shell.
60 However, in practice GNU parallel is being developed in bash and thus
61 testing in other shells is limited to reported bugs.
62
63 When an incompatibility is found there is often not an easy fix: Fixing
64 the problem in csh often breaks it in bash. In these cases the fix is
65 often to use a small Perl script and call that.
66
67 env_parallel
env_parallel is a dummy shell script that runs if env_parallel is not
an alias or a function; it tells the user how to activate the
alias/function for the supported shells.
71
72 The alias or function will copy the current environment and run the
73 command with GNU parallel in the copy of the environment.
74
75 The problem is that you cannot access all of the current environment
76 inside Perl. E.g. aliases, functions and unexported shell variables.
77
78 The idea is therefore to take the environment and put it in
79 $PARALLEL_ENV which GNU parallel prepends to every command.
80
The only way to have access to the environment is directly from the
shell, so the program must be written in a shell script that will be
sourced, and it has to deal with the dialect of the relevant shell.
84
85 env_parallel.*
86
These are the files that implement the alias or function env_parallel
88 for a given shell. It could be argued that these should be put in some
89 obscure place under /usr/lib, but by putting them in your path it
90 becomes trivial to find the path to them and source them:
91
92 source `which env_parallel.foo`
93
94 The beauty is that they can be put anywhere in the path without the
95 user having to know the location. So if the user's path includes
96 /afs/bin/i386_fc5 or /usr/pkg/parallel/bin or
97 /usr/local/parallel/20161222/sunos5.6/bin the files can be put in the
98 dir that makes most sense for the sysadmin.
99
100 env_parallel.bash / env_parallel.sh / env_parallel.ash /
101 env_parallel.dash / env_parallel.zsh / env_parallel.ksh /
102 env_parallel.mksh
103
104 env_parallel.(bash|sh|ash|dash|ksh|mksh|zsh) defines the function
105 env_parallel. It uses alias and typeset to dump the configuration (with
106 a few exceptions) into $PARALLEL_ENV before running GNU parallel.
107
108 After GNU parallel is finished, $PARALLEL_ENV is deleted.
109
110 env_parallel.csh
111
env_parallel.csh has two purposes: If env_parallel is not an alias,
it makes env_parallel an alias that sets $PARALLEL to the arguments
and calls env_parallel.csh.
115
116 If env_parallel is an alias, then env_parallel.csh uses $PARALLEL as
117 the arguments for GNU parallel.
118
119 It exports the environment by writing a variable definition to a file
120 for each variable. The definitions of aliases are appended to this
121 file. Finally the file is put into $PARALLEL_ENV.
122
123 GNU parallel is then run and $PARALLEL_ENV is deleted.
124
125 env_parallel.fish
126
First all function definitions are generated using a loop and fish's
functions builtin.
129
130 Dumping the scalar variable definitions is harder.
131
132 fish can represent non-printable characters in (at least) 2 ways. To
133 avoid problems all scalars are converted to \XX quoting.
134
135 Then commands to generate the definitions are made and separated by
136 NUL.
137
138 This is then piped into a Perl script that quotes all values. List
139 elements will be appended using two spaces.
140
141 Finally \n is converted into \1 because fish variables cannot contain
142 \n. GNU parallel will later convert all \1 from $PARALLEL_ENV into \n.
143
144 This is then all saved in $PARALLEL_ENV.
145
146 GNU parallel is called, and $PARALLEL_ENV is deleted.
147
148 parset (supported in sh, ash, dash, bash, zsh, ksh, mksh)
149 parset is a shell function. This is the reason why parset can set
150 variables: It runs in the shell which is calling it.
151
It is also the reason why parset does not work when data is piped into
it: ... | parset ... makes parset start in a subshell, and any changes
in the environment can therefore not make it back to the calling shell.
155
156 Job slots
157 The easiest way to explain what GNU parallel does is to assume that
158 there are a number of job slots, and when a slot becomes available a
159 job from the queue will be run in that slot. But originally GNU
160 parallel did not model job slots in the code. Job slots have been added
161 to make it possible to use {%} as a replacement string.
162
163 While the job sequence number can be computed in advance, the job slot
164 can only be computed the moment a slot becomes available. So it has
165 been implemented as a stack with lazy evaluation: Draw one from an
166 empty stack and the stack is extended by one. When a job is done, push
167 the available job slot back on the stack.
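
A sketch of the lazy stack (variable names are made up):

    my @slots;          # free job slots
    my $max_slot = 0;   # how many slots exist so far
    sub reserve_slot {
        # Drawing from an empty stack extends the stack by one
        @slots or push @slots, ++$max_slot;
        return pop @slots;
    }
    sub release_slot {
        # Job done: push the slot back for reuse
        push @slots, shift;
    }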
168
169 This implementation also means that if you re-run the same jobs, you
170 cannot assume jobs will get the same slots. And if you use remote
171 executions, you cannot assume that a given job slot will remain on the
same remote server. This goes double since the number of job slots
can be adjusted on the fly (by giving --jobs a file name).
174
175 Rsync protocol version
176 rsync 3.1.x uses protocol 31 which is unsupported by version 2.5.7.
177 That means that you cannot push a file to a remote system using rsync
178 protocol 31, if the remote system uses 2.5.7. rsync does not
179 automatically downgrade to protocol 30.
180
181 GNU parallel does not require protocol 31, so if the rsync version is
182 >= 3.1.0 then --protocol 30 is added to force newer rsyncs to talk to
183 version 2.5.7.
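
Schematically (a sketch, not the real code; real version strings need
more careful parsing than this):

    my ($ver) = (`rsync --version` =~ /version (\d+\.\d+)/);
    my $protocol_opt = $ver >= 3.1 ? "--protocol 30" : "";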
184
185 Compression
186 GNU parallel buffers output in temporary files. --compress compresses
187 the buffered data. This is a bit tricky because there should be no
188 files to clean up if GNU parallel is killed by a power outage.
189
190 GNU parallel first selects a compression program. If the user has not
191 selected one, the first of these that is in $PATH is used: pzstd lbzip2
192 pbzip2 zstd pixz lz4 pigz lzop plzip lzip gzip lrz pxz bzip2 lzma xz
193 clzip. They are sorted by speed on a 128 core machine.
194
195 Schematically the setup is as follows:
196
197 command started by parallel | compress > tmpfile
198 cattail tmpfile | uncompress | parallel which reads the output
199
200 The setup is duplicated for both standard output (stdout) and standard
201 error (stderr).
202
203 GNU parallel pipes output from the command run into the compression
204 program which saves to a tmpfile. GNU parallel records the pid of the
205 compress program. At the same time a small Perl script (called cattail
206 above) is started: It basically does cat followed by tail -f, but it
207 also removes the tmpfile as soon as the first byte is read, and it
continuously checks whether the compression program is still alive.
When the compression program dies, cattail reads the rest of tmpfile
and exits.
211
212 As most compression programs write out a header when they start, the
213 tmpfile in practice is removed by cattail after around 40 ms.
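
The core of cattail can be sketched like this (simplified; the real
script does more bookkeeping):

    # usage: cattail tmpfile pid_of_compressor
    my ($file, $comp_pid) = @ARGV;
    open(my $in, "<", $file) || die;
    my ($buf, $removed);
    while(1) {
        if(sysread($in, $buf, 131072)) {
            $removed++ or unlink $file;  # remove after first byte read
            syswrite(STDOUT, $buf);
        } elsif(not kill(0, $comp_pid)) {
            # Compressor is dead: read the rest and exit
            while(sysread($in, $buf, 131072)) { syswrite(STDOUT, $buf); }
            exit;
        } else {
            select(undef, undef, undef, 0.01);  # wait for more output
        }
    }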
214
215 Wrapping
216 The command given by the user can be wrapped in multiple templates.
217 Templates can be wrapped in other templates.
218
219 $COMMAND the command to run.
220
221 $INPUT the input to run.
222
223 $SHELL the shell that started GNU Parallel.
224
225 $SSHLOGIN the sshlogin.
226
227 $WORKDIR the working dir.
228
229 $FILE the file to read parts from.
230
231 $STARTPOS the first byte position to read from $FILE.
232
233 $LENGTH the number of bytes to read from $FILE.
234
235 --shellquote echo Double quoted $INPUT
236
237 --nice pri Remote: See The remote system wrapper.
238
239 Local: setpriority(0,0,$nice)
240
241 --cat
242 cat > {}; $COMMAND {};
243 perl -e '$bash = shift;
244 $csh = shift;
245 for(@ARGV) { unlink;rmdir; }
246 if($bash =~ s/h//) { exit $bash; }
247 exit $csh;' "$?h" "$status" {};
248
249 {} is set to $PARALLEL_TMP which is a tmpfile. The Perl
250 script saves the exit value, unlinks the tmpfile, and
251 returns the exit value - no matter if the shell is
252 bash/ksh/zsh (using $?) or *csh/fish (using $status).
253
254 --fifo
255 perl -e '($s,$c,$f) = @ARGV;
256 # mkfifo $PARALLEL_TMP
257 system "mkfifo", $f;
258 # spawn $shell -c $command &
259 $pid = fork || exec $s, "-c", $c;
260 open($o,">",$f) || die $!;
261 # cat > $PARALLEL_TMP
262 while(sysread(STDIN,$buf,131072)){
263 syswrite $o, $buf;
264 }
265 close $o;
266 # waitpid to get the exit code from $command
267 waitpid $pid,0;
268 # Cleanup
269 unlink $f;
270 exit $?/256;' $SHELL -c $COMMAND $PARALLEL_TMP
271
This is an elaborate way of: mkfifo {}; run $COMMAND in
the background using $SHELL; copy STDIN to {}; wait for
the background job to complete; remove {} and exit with
the exit code from $COMMAND.
276
277 It is made this way to be compatible with *csh/fish.
278
279 --pipepart
280 < $FILE perl -e 'while(@ARGV) {
281 sysseek(STDIN,shift,0) || die;
282 $left = shift;
283 while($read =
284 sysread(STDIN,$buf,
285 ($left > 131072 ? 131072 : $left))){
286 $left -= $read;
287 syswrite(STDOUT,$buf);
288 }
289 }' $STARTPOS $LENGTH
290
291 This will read $LENGTH bytes from $FILE starting at
292 $STARTPOS and send it to STDOUT.
293
294 --sshlogin $SSHLOGIN
295 ssh $SSHLOGIN "$COMMAND"
296
297 --transfer
298 ssh $SSHLOGIN mkdir -p ./$WORKDIR;
299 rsync --protocol 30 -rlDzR \
300 -essh ./{} $SSHLOGIN:./$WORKDIR;
301 ssh $SSHLOGIN "$COMMAND"
302
303 Read about --protocol 30 in the section Rsync protocol
304 version.
305
306 --transferfile file
307 <<todo>>
308
309 --basefile <<todo>>
310
311 --return file
312 $COMMAND; _EXIT_status=$?; mkdir -p $WORKDIR;
313 rsync --protocol 30 \
314 --rsync-path=cd\ ./$WORKDIR\;\ rsync \
315 -rlDzR -essh $SSHLOGIN:./$FILE ./$WORKDIR;
316 exit $_EXIT_status;
317
318 The --rsync-path=cd ... is needed because old versions
319 of rsync do not support --no-implied-dirs.
320
321 The $_EXIT_status trick is to postpone the exit value.
322 This makes it incompatible with *csh and should be fixed
323 in the future. Maybe a wrapping 'sh -c' is enough?
324
325 --cleanup $RETURN is the wrapper from --return
326
327 $COMMAND; _EXIT_status=$?; $RETURN;
328 ssh $SSHLOGIN \(rm\ -f\ ./$WORKDIR/{}\;\
329 rmdir\ ./$WORKDIR\ \>\&/dev/null\;\);
330 exit $_EXIT_status;
331
332 $_EXIT_status: see --return above.
333
334 --pipe
335 perl -e 'if(sysread(STDIN, $buf, 1)) {
336 open($fh, "|-", "@ARGV") || die;
337 syswrite($fh, $buf);
338 # Align up to 128k block
339 if($read = sysread(STDIN, $buf, 131071)) {
340 syswrite($fh, $buf);
341 }
342 while($read = sysread(STDIN, $buf, 131072)) {
343 syswrite($fh, $buf);
344 }
345 close $fh;
346 exit ($?&127 ? 128+($?&127) : 1+$?>>8)
347 }' $SHELL -c $COMMAND
348
349 This small wrapper makes sure that $COMMAND will never
350 be run if there is no data.
351
352 --tmux <<TODO Fixup with '-quoting>> mkfifo /tmp/tmx3cMEV &&
353 sh -c 'tmux -S /tmp/tmsaKpv1 new-session -s p334310 -d
354 "sleep .2" >/dev/null 2>&1'; tmux -S /tmp/tmsaKpv1 new-
355 window -t p334310 -n wc\ 10 \(wc\ 10\)\;\ perl\ -e\
356 \'while\(\$t++\<3\)\{\ print\ \$ARGV\[0\],\"\\n\"\ \}\'\
357 \$\?h/\$status\ \>\>\ /tmp/tmx3cMEV\&echo\ wc\\\ 10\;\
358 echo\ \Job\ finished\ at:\ \`date\`\;sleep\ 10; exec
359 perl -e '$/="/";$_=<>;$c=<>;unlink $ARGV; /(\d+)h/ and
360 exit($1);exit$c' /tmp/tmx3cMEV
361
362 mkfifo tmpfile.tmx; tmux -S <tmpfile.tms> new-session -s
363 pPID -d 'sleep .2' >&/dev/null; tmux -S <tmpfile.tms>
364 new-window -t pPID -n <<shell quoted input>> \(<<shell
365 quoted input>>\)\;\ perl\ -e\ \'while\(\$t++\<3\)\{\
366 print\ \$ARGV\[0\],\"\\n\"\ \}\'\ \$\?h/\$status\ \>\>\
367 tmpfile.tmx\&echo\ <<shell double quoted input>>\;echo\
368 \Job\ finished\ at:\ \`date\`\;sleep\ 10; exec perl -e
369 '$/="/";$_=<>;$c=<>;unlink $ARGV; /(\d+)h/ and
370 exit($1);exit$c' tmpfile.tmx
371
First a FIFO is made (.tmx). It is used for
communicating the exit value. Next a new tmux session is
374 made. This may fail if there is already a session, so
375 the output is ignored. If all job slots finish at the
376 same time, then tmux will close the session. A temporary
377 socket is made (.tms) to avoid a race condition in tmux.
378 It is cleaned up when GNU parallel finishes.
379
380 The input is used as the name of the windows in tmux.
381 When the job inside tmux finishes, the exit value is
382 printed to the FIFO (.tmx). This FIFO is opened by perl
383 outside tmux, and perl then removes the FIFO. Perl
384 blocks until the first value is read from the FIFO, and
385 this value is used as exit value.
386
387 To make it compatible with csh and bash the exit value
388 is printed as: $?h/$status and this is parsed by perl.
389
390 There is a bug that makes it necessary to print the exit
391 value 3 times.
392
Another bug in tmux means that the combined length of
the tmux title and command must stay outside certain
ranges. When the length falls inside these ranges, 75
'\ ' are added to the title to force it outside them.
397
398 You can map the bad limits using:
399
400 perl -e 'sub r { int(rand(shift)).($_[0] && "\t".r(@_)) } print map { r(@ARGV)."\n" } 1..10000' 1600 1500 90 |
401 perl -ane '$F[0]+$F[1]+$F[2] < 2037 and print ' |
402 parallel --colsep '\t' --tagstring '{1}\t{2}\t{3}' tmux -S /tmp/p{%}-'{=3 $_="O"x$_ =}' \
403 new-session -d -n '{=1 $_="O"x$_ =}' true'\ {=2 $_="O"x$_ =};echo $?;rm -f /tmp/p{%}-O*'
404
405 perl -e 'sub r { int(rand(shift)).($_[0] && "\t".r(@_)) } print map { r(@ARGV)."\n" } 1..10000' 17000 17000 90 |
406 parallel --colsep '\t' --tagstring '{1}\t{2}\t{3}' \
407 tmux -S /tmp/p{%}-'{=3 $_="O"x$_ =}' new-session -d -n '{=1 $_="O"x$_ =}' true'\ {=2 $_="O"x$_ =};echo $?;rm /tmp/p{%}-O*'
408 > value.csv 2>/dev/null
409
410 R -e 'a<-read.table("value.csv");X11();plot(a[,1],a[,2],col=a[,4]+5,cex=0.1);Sys.sleep(1000)'
411
412 For tmux 1.8 17000 can be lowered to 2100.
413
414 The interesting areas are title 0..1000 with (title +
415 whole command) in 996..1127 and 9331..9636.
416
417 The ordering of the wrapping is important:
418
419 • $PARALLEL_ENV which is set in env_parallel.* must be prepended to
420 the command first, as the command may contain exported variables
421 or functions.
422
423 • --nice/--cat/--fifo should be done on the remote machine
424
425 • --pipepart/--pipe should be done on the local machine inside
426 --tmux
427
428 Convenience options --nice --basefile --transfer --return --cleanup --tmux
429 --group --compress --cat --fifo --workdir --tag --tagstring
430 These are all convenience options that make it easier to do a task. But
431 more importantly: They are tested to work on corner cases, too. Take
432 --nice as an example:
433
434 nice parallel command ...
435
436 will work just fine. But when run remotely, you need to move the nice
437 command so it is being run on the server:
438
439 parallel -S server nice command ...
440
441 And this will again work just fine, as long as you are running a single
442 command. When you are running a composed command you need nice to apply
443 to the whole command, and it gets harder still:
444
445 parallel -S server -q nice bash -c 'command1 ...; cmd2 | cmd3'
446
447 It is not impossible, but by using --nice GNU parallel will do the
448 right thing for you. Similarly when transferring files: It starts to
449 get hard when the file names contain space, :, `, *, or other special
450 characters.
451
452 To run the commands in a tmux session you basically just need to quote
453 the command. For simple commands that is easy, but when commands
454 contain special characters, it gets much harder to get right.
455
456 --compress not only compresses standard output (stdout) but also
standard error (stderr); and it does so into files that are open but
deleted, so a crash will not leave these files around.
459
460 --cat and --fifo are easy to do by hand, until you want to clean up the
461 tmpfile and keep the exit code of the command.
462
The real killer comes when you try to combine several of these: Doing
that correctly for all corner cases is next to impossible by hand.
466
467 --shard
468 The simple way to implement sharding would be to:
469
470 1. start n jobs,
471
472 2. split each line into columns,
473
474 3. select the data from the relevant column
475
476 4. compute a hash value from the data
477
478 5. take the modulo n of the hash value
479
480 6. pass the full line to the jobslot that has the computed value
481
482 Unfortunately Perl is rather slow at computing the hash value (and
483 somewhat slow at splitting into columns).
484
485 One solution is to use a compiled language for the splitting and
486 hashing, but that would go against the design criteria of not depending
487 on a compiler.
488
489 Luckily those tasks can be parallelized. So GNU parallel starts n
sharders that do steps 2-6, and passes blocks of 100k to each of those
491 in a round robin manner. To make sure these sharders compute the hash
492 the same way, $PERL_HASH_SEED is set to the same value for all
493 sharders.
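
A single sharder can be sketched like this (the hash below is a simple
stand-in for the one actually used, the output files are stand-ins for
the real pipes, and all names are made up):

    # One sharder: hash column $col of each line and write the line
    # to the output belonging to jobslot (hash mod n)
    my $col = 1;     # column to shard by (assumed)
    my $n   = 4;     # number of jobslots (assumed)
    my @out = map { open(my $fh, ">", "shard.$_") || die; $fh } 0..$n-1;
    while(my $line = <STDIN>) {
        my @F = split /\t/, $line;
        my $hash = unpack("%32C*", $F[$col-1]);   # stand-in hash
        print { $out[$hash % $n] } $line;
    }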
494
495 Running n sharders poses a new problem: Instead of having n outputs
496 (one for each computed value) you now have n outputs for each of the n
497 values, so in total n*n outputs; and you need to merge these n*n
498 outputs together into n outputs.
499
500 This can be done by simply running 'parallel -j0 --lb cat :::
501 outputs_for_one_value', but that is rather inefficient, as it spawns a
502 process for each file. Instead the core code from 'parcat' is run,
503 which is also a bit faster.
504
505 All the sharders and parcats communicate through named pipes that are
506 unlinked as soon as they are opened.
507
508 Shell shock
509 The shell shock bug in bash did not affect GNU parallel, but the
510 solutions did. bash first introduced functions in variables named:
511 BASH_FUNC_myfunc() and later changed that to BASH_FUNC_myfunc%%. When
512 transferring functions GNU parallel reads off the function and changes
513 that into a function definition, which is copied to the remote system
514 and executed before the actual command is executed. Therefore GNU
515 parallel needs to know how to read the function.
516
517 From version 20150122 GNU parallel tries both the ()-version and the
518 %%-version, and the function definition works on both pre- and post-
519 shell shock versions of bash.
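
Reading off a function that works with both namings can be sketched
like this (a sketch, not the actual code):

    # Find exported bash functions under both naming schemes
    for my $var (keys %ENV) {
        if(my ($name) = $var =~ /^BASH_FUNC_(\w+)(?:\(\)|%%)$/) {
            # $ENV{$var} is "() { ... }", so this gives a definition
            my $definition = "$name $ENV{$var}";
            # ... prepend $definition to the command run remotely
        }
    }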
520
521 The remote system wrapper
522 The remote system wrapper does some initialization before starting the
523 command on the remote system.
524
525 Make quoting unnecessary by hex encoding everything
526
527 When you run ssh server foo then foo has to be quoted once:
528
529 ssh server "echo foo; echo bar"
530
531 If you run ssh server1 ssh server2 foo then foo has to be quoted twice:
532
533 ssh server1 ssh server2 \'"echo foo; echo bar"\'
534
GNU parallel avoids this by packing everything into hex values and
running a command that does not need quoting:
537
538 perl -X -e GNU_Parallel_worker,eval+pack+q/H10000000/,join+q//,@ARGV
539
540 This command reads hex from the command line and converts that to bytes
541 that are then eval'ed as a Perl expression.
542
The string GNU_Parallel_worker is not needed. It is simply there to
let the user know that this process is GNU parallel working.
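
The packing side is simply unpack "H*". Note how the unary + in the
command acts as a space, so the command itself needs no quoting at any
level either (a simplified sketch mirroring the command above):

    # Hex encode a Perl program, so no shell quoting is needed
    my $perlcode = 'print "foo\n"';    # code to run remotely
    my $hex = unpack "H*", $perlcode;  # only [0-9a-f] left
    system "ssh", "server1", "ssh", "server2",
           "perl -e eval+pack+q/H10000000/,join+q//,\@ARGV $hex";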
545
546 Ctrl-C and standard error (stderr)
547
548 If the user presses Ctrl-C the user expects jobs to stop. This works
549 out of the box if the jobs are run locally. Unfortunately it is not so
550 simple if the jobs are run remotely.
551
552 If remote jobs are run in a tty using ssh -tt, then Ctrl-C works, but
553 all output to standard error (stderr) is sent to standard output
554 (stdout). This is not what the user expects.
555
556 If remote jobs are run without a tty using ssh (without -tt), then
557 output to standard error (stderr) is kept on stderr, but Ctrl-C does
558 not kill remote jobs. This is not what the user expects.
559
So what is needed is a way to have both. It seems the reason why
Ctrl-C does not kill the remote jobs is that the shell does not
propagate the hang-up signal from sshd. But when sshd dies, the parent of the
563 login shell becomes init (process id 1). So by exec'ing a Perl wrapper
564 to monitor the parent pid and kill the child if the parent pid becomes
565 1, then Ctrl-C works and stderr is kept on stderr.
566
567 Ctrl-C does, however, kill the ssh connection, so any output from a
568 remote dying process is lost.
569
570 To be able to kill all (grand)*children a new process group is started.
571
572 --nice
573
Nicing the remote process is done by setpriority(0,0,$nice). A few old
systems do not implement this and --nice is unsupported on those.
576
577 Setting $PARALLEL_TMP
578
$PARALLEL_TMP is used by --fifo and --cat and must point to a
non-existent file in $TMPDIR. This file name is computed on the
remote system.
582
583 The wrapper
584
585 The wrapper looks like this:
586
587 $shell = $PARALLEL_SHELL || $SHELL;
588 $tmpdir = $TMPDIR || $PARALLEL_REMOTE_TMPDIR;
589 $nice = $opt::nice;
590 $termseq = $opt::termseq;
591
592 # Check that $tmpdir is writable
593 -w $tmpdir ||
594 die("$tmpdir is not writable.".
595 " Set PARALLEL_REMOTE_TMPDIR");
596 # Set $PARALLEL_TMP to a non-existent file name in $TMPDIR
597 do {
598 $ENV{PARALLEL_TMP} = $tmpdir."/par".
599 join"", map { (0..9,"a".."z","A".."Z")[rand(62)] } (1..5);
600 } while(-e $ENV{PARALLEL_TMP});
601 # Set $script to a non-existent file name in $TMPDIR
602 do {
603 $script = $tmpdir."/par".
604 join"", map { (0..9,"a".."z","A".."Z")[rand(62)] } (1..5);
605 } while(-e $script);
606 # Create a script from the hex code
607 # that removes itself and runs the commands
608 open($fh,">",$script) || die;
609 # ' needed due to rc-shell
610 print($fh("rm \'$script\'\n",$bashfunc.$cmd));
611 close $fh;
612 my $parent = getppid;
613 my $done = 0;
614 $SIG{CHLD} = sub { $done = 1; };
615 $pid = fork;
616 unless($pid) {
# Make own process group, so it can be killed (HUP'ed) later
618 eval { setpgrp };
619 # Set nice value
620 eval { setpriority(0,0,$nice) };
621 # Run the script
622 exec($shell,$script);
623 die("exec failed: $!");
624 }
625 while((not $done) and (getppid == $parent)) {
626 # Parent pid is not changed, so sshd is alive
627 # Exponential sleep up to 1 sec
628 $s = $s < 1 ? 0.001 + $s * 1.03 : $s;
629 select(undef, undef, undef, $s);
630 }
631 if(not $done) {
632 # sshd is dead: User pressed Ctrl-C
633 # Kill as per --termseq
634 my @term_seq = split/,/,$termseq;
635 if(not @term_seq) {
636 @term_seq = ("TERM",200,"TERM",100,"TERM",50,"KILL",25);
637 }
638 while(@term_seq && kill(0,-$pid)) {
639 kill(shift @term_seq, -$pid);
640 select(undef, undef, undef, (shift @term_seq)/1000);
641 }
642 }
643 wait;
644 exit ($?&127 ? 128+($?&127) : 1+$?>>8)
645
646 Transferring of variables and functions
647 Transferring of variables and functions given by --env is done by
648 running a Perl script remotely that calls the actual command. The Perl
649 script sets $ENV{variable} to the correct value before exec'ing a shell
650 that runs the function definition followed by the actual command.
651
652 The function env_parallel copies the full current environment into the
653 environment variable PARALLEL_ENV. This variable is picked up by GNU
654 parallel and used to create the Perl script mentioned above.
655
656 Base64 encoded bzip2
657 csh limits words of commands to 1024 chars. This is often too little
658 when GNU parallel encodes environment variables and wraps the command
659 with different templates. All of these are combined and quoted into one
660 single word, which often is longer than 1024 chars.
661
662 When the line to run is > 1000 chars, GNU parallel therefore encodes
663 the line to run. The encoding bzip2s the line to run, converts this to
664 base64, splits the base64 into 1000 char blocks (so csh does not fail),
665 and prepends it with this Perl script that decodes, decompresses and
666 evals the line.
667
668 @GNU_Parallel=("use","IPC::Open3;","use","MIME::Base64");
669 eval "@GNU_Parallel";
670
671 $SIG{CHLD}="IGNORE";
672 # Search for bzip2. Not found => use default path
673 my $zip = (grep { -x $_ } "/usr/local/bin/bzip2")[0] || "bzip2";
674 # $in = stdin on $zip, $out = stdout from $zip
675 my($in, $out,$eval);
676 open3($in,$out,">&STDERR",$zip,"-dc");
677 if(my $perlpid = fork) {
678 close $in;
679 $eval = join "", <$out>;
680 close $out;
681 } else {
682 close $out;
683 # Pipe decoded base64 into 'bzip2 -dc'
684 print $in (decode_base64(join"",@ARGV));
685 close $in;
686 exit;
687 }
688 wait;
689 eval $eval;
690
691 Perl and bzip2 must be installed on the remote system, but a small test
showed that bzip2 is installed by default on all platforms that run
693 GNU parallel, so this is not a big problem.
694
695 The added bonus of this is that much bigger environments can now be
696 transferred as they will be below bash's limit of 131072 chars.
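
The encoding side corresponds roughly to this sketch (for big inputs
the writer should fork, as the decoder above does, to avoid pipe
deadlock):

    use IPC::Open2;
    use MIME::Base64;
    my $line_to_run = "the long command";
    # Pipe the line through bzip2
    my $pid = open2(my $out, my $in, "bzip2", "-9");
    print $in $line_to_run;
    close $in;
    my $b64 = encode_base64(join("", <$out>), "");
    close $out;
    waitpid $pid, 0;
    # Split into 1000 char words, so csh does not fail
    my @blocks = unpack "(A1000)*", $b64;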
697
698 Which shell to use
699 Different shells behave differently. A command that works in tcsh may
700 not work in bash. It is therefore important that the correct shell is
701 used when GNU parallel executes commands.
702
703 GNU parallel tries hard to use the right shell. If GNU parallel is
704 called from tcsh it will use tcsh. If it is called from bash it will
705 use bash. It does this by looking at the (grand)*parent process: If the
706 (grand)*parent process is a shell, use this shell; otherwise look at
707 the parent of this (grand)*parent. If none of the (grand)*parents are
708 shells, then $SHELL is used.
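
On GNU/Linux the walk up the process tree can be sketched like this
(simplified; the real code knows more shells and more platforms):

    my %is_shell = map { $_ => 1 }
        qw(sh ash dash bash csh tcsh ksh mksh zsh fish);
    sub parent_shell {
        my $pid = shift;
        while($pid > 1) {
            # /proc/PID/stat = "pid (comm) state ppid ..."
            open(my $fh, "<", "/proc/$pid/stat") || last;
            my ($comm, $ppid) = <$fh> =~ /\((.*)\)\s+\S+\s+(\d+)/;
            return $comm if $is_shell{$comm};
            $pid = $ppid;
        }
        return $ENV{SHELL};   # no shell found: fall back to $SHELL
    }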
709
710 This will do the right thing if called from:
711
712 • an interactive shell
713
714 • a shell script
715
716 • a Perl script in `` or using system if called as a single string.
717
718 While these cover most cases, there are situations where it will fail:
719
720 • When run using exec.
721
722 • When run as the last command using -c from another shell (because
723 some shells use exec):
724
725 zsh% bash -c "parallel 'echo {} is not run in bash; \
726 set | grep BASH_VERSION' ::: This"
727
728 You can work around that by appending '&& true':
729
730 zsh% bash -c "parallel 'echo {} is run in bash; \
731 set | grep BASH_VERSION' ::: This && true"
732
733 • When run in a Perl script using system with parallel as the first
734 string:
735
736 #!/usr/bin/perl
737
738 system("parallel",'setenv a {}; echo $a',":::",2);
739
740 Here it depends on which shell is used to call the Perl script. If
741 the Perl script is called from tcsh it will work just fine, but if it
742 is called from bash it will fail, because the command setenv is not
743 known to bash.
744
If GNU parallel guesses wrong in these situations, set the shell
using $PARALLEL_SHELL.
747
748 Always running commands in a shell
749 If the command is a simple command with no redirection and setting of
750 variables, the command could be run without spawning a shell. E.g. this
751 simple grep matching either 'ls ' or ' wc >> c':
752
753 parallel "grep -E 'ls | wc >> c' {}" ::: foo
754
755 could be run as:
756
757 system("grep","-E","ls | wc >> c","foo");
758
759 However, as soon as the command is a bit more complex a shell must be
760 spawned:
761
762 parallel "grep -E 'ls | wc >> c' {} | wc >> c" ::: foo
763 parallel "LANG=C grep -E 'ls | wc >> c' {}" ::: foo
764
765 It is impossible to tell how | wc >> c should be interpreted without
766 parsing the string (is the | a pipe in shell or an alternation in a
767 grep regexp? Is LANG=C a command in csh or setting a variable in bash?
768 Is >> redirection or part of a regexp?).
769
770 On top of this, wrapper scripts will often require a shell to be
771 spawned.
772
773 The downside is that you need to quote special shell chars twice:
774
775 parallel echo '*' ::: This will expand the asterisk
776 parallel echo "'*'" ::: This will not
777 parallel "echo '*'" ::: This will not
778 parallel echo '\*' ::: This will not
779 parallel echo \''*'\' ::: This will not
780 parallel -q echo '*' ::: This will not
781
782 -q will quote all special chars, thus redirection will not work: this
783 prints '* > out.1' and does not save '*' into the file out.1:
784
785 parallel -q echo "*" ">" out.{} ::: 1
786
GNU parallel tries to live up to the Principle Of Least Astonishment
(POLA), and the requirement of using -q is hard to understand when
you do not see the whole picture.
790
791 Quoting
792 Quoting depends on the shell. For most shells '-quoting is used for
793 strings containing special characters.
794
795 For tcsh/csh newline is quoted as \ followed by newline. Other special
796 characters are also \-quoted.
797
798 For rc everything is quoted using '.
799
800 --pipepart vs. --pipe
801 While --pipe and --pipepart look much the same to the user, they are
802 implemented very differently.
803
804 With --pipe GNU parallel reads the blocks from standard input (stdin),
805 which is then given to the command on standard input (stdin); so every
806 block is being processed by GNU parallel itself. This is the reason why
807 --pipe maxes out at around 500 MB/sec.
808
809 --pipepart, on the other hand, first identifies at which byte positions
blocks start and how long they are. It does that by seeking into the
file by the size of a block and then reading until it meets the end
of a block. The seeking explains why GNU parallel does not know the
line number and why -L/-l and -N do not work.
814
With a reasonable block and file size this seeking is more than 1000
times faster than reading the full file. The byte positions are then
817 given to a small script that reads from position X to Y and sends
818 output to standard output (stdout). This small script is prepended to
819 the command and the full command is executed just as if GNU parallel
820 had been in its normal mode. The script looks like this:
821
822 < file perl -e 'while(@ARGV) {
823 sysseek(STDIN,shift,0) || die;
824 $left = shift;
825 while($read = sysread(STDIN,$buf,
826 ($left > 131072 ? 131072 : $left))){
827 $left -= $read; syswrite(STDOUT,$buf);
828 }
829 }' startbyte length_in_bytes
830
831 It delivers 1 GB/s per core.
832
833 Instead of the script dd was tried, but many versions of dd do not
834 support reading from one byte to another and might cause partial data.
835 See this for a surprising example:
836
837 yes | dd bs=1024k count=10 | wc
838
839 --block-size adjustment
840 Every time GNU parallel detects a record bigger than --block-size it
841 increases the block size by 30%. A small --block-size gives very poor
842 performance; by exponentially increasing the block size performance
843 will not suffer.
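
The adjustment itself is simple (a sketch):

    # A record did not fit: grow --block-size by 30%
    $blocksize = int($blocksize * 1.3 + 1);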
844
845 GNU parallel will waste CPU power if --block-size does not contain a
846 full record, because it tries to find a full record and will fail to do
847 so. The recommendation is therefore to use a --block-size > 2 records,
848 so you always get at least one full record when you read one block.
849
850 If you use -N then --block-size should be big enough to contain N+1
851 records.
852
853 Automatic --block-size computation
854 With --pipepart GNU parallel can compute the --block-size
855 automatically. A --block-size of -1 will use a block size so that each
856 jobslot will receive approximately 1 block. --block -2 will pass 2
857 blocks to each jobslot and -n will pass n blocks to each jobslot.
858
859 This can be done because --pipepart reads from files, and we can
860 compute the total size of the input.
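
A sketch of the computation (the names are made up):

    my @files    = @ARGV;   # the files given to --pipepart
    my $jobslots = 8;       # assumed
    my $m        = 2;       # --block -2: ~2 blocks per jobslot
    my $total = 0;
    $total += -s $_ for @files;   # total input size is knowable
    my $blocksize = int($total / ($m * $jobslots)) + 1;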
861
862 --jobs and --onall
863 When running the same commands on many servers what should --jobs
864 signify? Is it the number of servers to run on in parallel? Is it the
865 number of jobs run in parallel on each server?
866
867 GNU parallel lets --jobs represent the number of servers to run on in
868 parallel. This is to make it possible to run a sequence of commands
869 (that cannot be parallelized) on each server, but run the same sequence
870 on multiple servers.
871
872 --shuf
873 When using --shuf to shuffle the jobs, all jobs are read, then they are
874 shuffled, and finally executed. When using SQL this makes the
--sqlmaster be the part that shuffles the jobs. The --sqlworkers
simply execute according to Seq number.
877
878 --csv
879 --pipepart is incompatible with --csv because you can have records
880 like:
881
882 a,b,c
883 a,"
884 a,b,c
885 a,b,c
886 a,b,c
887 ",c
888 a,b,c
889
890 Here the second record contains a multi-line field that looks like
records. Since --pipepart does not read the whole file when searching
892 for record endings, it may start reading in this multi-line field,
893 which would be wrong.
894
895 Buffering on disk
896 GNU parallel buffers output, because if output is not buffered you have
897 to be ridiculously careful on sizes to avoid mixing of outputs (see
898 excellent example on https://catern.com/posts/pipes.html).
899
900 GNU parallel buffers on disk in $TMPDIR using files, that are removed
901 as soon as they are created, but which are kept open. So even if GNU
902 parallel is killed by a power outage, there will be no files to clean
903 up afterwards. Another advantage is that the file system is aware that
904 these files will be lost in case of a crash, so it does not need to
905 sync them to disk.
906
907 It gives the odd situation that a disk can be fully used, but there are
908 no visible files on it.
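
The trick can be sketched like this:

    use File::Temp qw(tempfile);
    my ($fh, $name) = tempfile(DIR => $ENV{TMPDIR} || "/tmp");
    unlink $name;          # from now on there is no visible file
    print $fh "output from a job\n";
    seek $fh, 0, 0;        # but the data can still be read back
    print <$fh>;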
909
910 Partly buffering in memory
911
When using the output formats SQL and CSV, GNU Parallel has to read
the whole output into memory. When run normally it will only read the
output from a single job.
915 printed will also be buffered in memory - for all jobs currently
916 running.
917
918 If memory is tight, then do not use the output format SQL/CSV with
919 --linebuffer.
920
921 Comparing to buffering in memory
922
923 gargs is a parallelizing tool that buffers in memory. It is therefore a
924 useful way of comparing the advantages and disadvantages of buffering
925 in memory to buffering on disk.
926
On a system with 6 GB RAM free and 6 GB free swap these were tested
with different sizes:
929
930 echo /dev/zero | gargs "head -c $size {}" >/dev/null
931 echo /dev/zero | parallel "head -c $size {}" >/dev/null
932
933 The results are here:
934
935 JobRuntime Command
936 0.344 parallel_test 1M
937 0.362 parallel_test 10M
938 0.640 parallel_test 100M
939 9.818 parallel_test 1000M
940 23.888 parallel_test 2000M
941 30.217 parallel_test 2500M
942 30.963 parallel_test 2750M
943 34.648 parallel_test 3000M
944 43.302 parallel_test 4000M
945 55.167 parallel_test 5000M
946 67.493 parallel_test 6000M
947 178.654 parallel_test 7000M
948 204.138 parallel_test 8000M
949 230.052 parallel_test 9000M
950 255.639 parallel_test 10000M
951 757.981 parallel_test 30000M
952 0.537 gargs_test 1M
953 0.292 gargs_test 10M
954 0.398 gargs_test 100M
955 3.456 gargs_test 1000M
956 8.577 gargs_test 2000M
957 22.705 gargs_test 2500M
958 123.076 gargs_test 2750M
959 89.866 gargs_test 3000M
960 291.798 gargs_test 4000M
961
962 GNU parallel is pretty much limited by the speed of the disk: Up to 6
963 GB data is written to disk but cached, so reading is fast. Above 6 GB
964 data are both written and read from disk. When the 30000MB job is
965 running, the disk system is slow, but usable: If you are not using the
966 disk, you almost do not feel it.
967
968 gargs has a speed advantage up until 2500M where it hits a wall. Then
969 the system starts swapping like crazy and is completely unusable. At
970 5000M it goes out of memory.
971
972 You can make GNU parallel behave similar to gargs if you point $TMPDIR
973 to a tmpfs-filesystem: It will be faster for small outputs, but may
974 kill your system for larger outputs and cause you to lose output.
975
976 Disk full
GNU parallel buffers on disk. If the disk is full, data may be lost.
To check if the disk is full GNU parallel writes an 8193 byte file
every second. If this file is written successfully, it is removed
immediately. If it is not written successfully, the disk is full. The
size 8193 was chosen because 8192 gave the wrong result on some file
systems, whereas 8193 did the correct thing on all tested filesystems.
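
A sketch of the check (the file name is made up):

    sub disk_full {
        my $file = ($ENV{TMPDIR} || "/tmp") . "/fulltest.$$";
        open(my $fh, ">", $file) || return 1;
        my $written = syswrite($fh, "x" x 8193);
        close $fh;
        unlink $file;
        # Full if nothing or only part of the 8193 bytes got written
        return !defined($written) || $written < 8193;
    }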
983
984 Memory usage
985 Normally GNU parallel will use around 17 MB RAM constantly - no matter
986 how many jobs or how much output there is. There are a few things that
987 cause the memory usage to rise:
988
989 • Multiple input sources. GNU parallel reads an input source only
990 once. This is by design, as an input source can be a stream (e.g.
991 FIFO, pipe, standard input (stdin)) which cannot be rewound and read
992 again. When reading a single input source, the memory is freed as
993 soon as the job is done - thus keeping the memory usage constant.
994
995 But when reading multiple input sources GNU parallel keeps the
996 already read values for generating all combinations with other input
997 sources.
998
999 • Computing the number of jobs. --bar, --eta, and --halt xx% use
1000 total_jobs() to compute the total number of jobs. It does this by
1001 generating the data structures for all jobs. All these job data
1002 structures will be stored in memory and take up around 400
1003 bytes/job.
1004
1005 • Buffering a full line. --linebuffer will read a full line per
1006 running job. A very long output line (say 1 GB without \n) will
1007 increase RAM usage temporarily: From when the beginning of the line
1008 is read till the line is printed.
1009
1010 • Buffering the full output of a single job. This happens when using
1011 --results *.csv/*.tsv or --sql*. Here GNU parallel will read the
1012 whole output of a single job and save it as csv/tsv or SQL.
1013
1014 Argument separators ::: :::: :::+ ::::+
1015 The argument separator ::: was chosen because I have never seen :::
1016 used in any command. The natural choice -- would be a bad idea since it
1017 is not unlikely that the template command will contain --. I have seen
:: used in programming languages to separate classes, and I did not
1019 want the user to be confused that the separator had anything to do with
1020 classes.
1021
1022 ::: also makes a visual separation, which is good if there are multiple
1023 :::.
1024
1025 When ::: was chosen, :::: came as a fairly natural extension.
1026
1027 Linking input sources meant having to decide for some way to indicate
1028 linking of ::: and ::::. :::+ and ::::+ were chosen, so that they were
1029 similar to ::: and ::::.
1030
1031 Perl replacement strings, {= =}, and --rpl
1032 The shorthands for replacement strings make a command look more
1033 cryptic. Different users will need different replacement strings.
1034 Instead of inventing more shorthands you get more flexible replacement
1035 strings if they can be programmed by the user.
1036
1037 The language Perl was chosen because GNU parallel is written in Perl
1038 and it was easy and reasonably fast to run the code given by the user.
1039
1040 If a user needs the same programmed replacement string again and again,
1041 the user may want to make his own shorthand for it. This is what --rpl
1042 is for. It works so well, that even GNU parallel's own shorthands are
1043 implemented using --rpl.
1044
1045 In Perl code the bigrams {= and =} rarely exist. They look like a
1046 matching pair and can be entered on all keyboards. This made them good
1047 candidates for enclosing the Perl expression in the replacement
1048 strings. Another candidate ,, and ,, was rejected because they do not
1049 look like a matching pair. --parens was made, so that the users can
1050 still use ,, and ,, if they like: --parens ,,,,
1051
1052 Internally, however, the {= and =} are replaced by \257< and \257>.
1053 This is to make it simpler to make regular expressions. You only need
1054 to look one character ahead, and never have to look behind.
1055
1056 Test suite
1057 GNU parallel uses its own testing framework. This is mostly due to
1058 historical reasons. It deals reasonably well with tests that are
1059 dependent on how long a given test runs (e.g. more than 10 secs is a
1060 pass, but less is a fail). It parallelizes most tests, but it is easy
1061 to force a test to run as the single test (which may be important for
1062 timing issues). It deals reasonably well with tests that fail
1063 intermittently. It detects which tests failed and pushes these to the
1064 top, so when running the test suite again, the tests that failed most
1065 recently are run first.
1066
1067 If GNU parallel should adopt a real testing framework then those
1068 elements would be important.
1069
Since many tests depend on which hardware they run on, these tests
break when run on different hardware than what they were written
for.
1073
1074 When most bugs are fixed a test is added, so this bug will not
1075 reappear. It is, however, sometimes hard to create the environment in
1076 which the bug shows up - especially if the bug only shows up sometimes.
1077 One of the harder problems was to make a machine start swapping without
1078 forcing it to its knees.
1079
1080 Median run time
1081 Using a percentage for --timeout causes GNU parallel to compute the
1082 median run time of a job. The median is a better indicator of the
1083 expected run time than average, because there will often be outliers
1084 taking way longer than the normal run time.
1085
1086 To avoid keeping all run times in memory, an implementation of remedian
1087 was made (Rousseeuw et al).
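
The remedian can be sketched as follows: instead of storing every run
time, keep a few fixed-size buffers; when a buffer fills up, its
median moves one level up (names and buffer size are made up):

    my $bufsize = 11;    # must be odd
    my @buf = ([]);
    sub remedian_add {
        my ($v, $i) = (shift, 0);
        while(1) {
            push @{$buf[$i]}, $v;
            last if @{$buf[$i]} < $bufsize;
            # Buffer full: promote its median one level up
            my @s = sort { $a <=> $b } @{$buf[$i]};
            $v = $s[$#s / 2];
            $buf[$i] = [];
            $buf[++$i] ||= [];
        }
    }
    sub remedian {
        # The median of the highest non-empty buffer approximates
        # the median of everything seen so far
        my @s = sort { $a <=> $b } @{(grep { @$_ } @buf)[-1]};
        return $s[$#s / 2];
    }

remedian_add() would be fed the run time of every finished job, and
remedian() would give the value a percentage --timeout is based on.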
1088
1089 Error messages and warnings
1090 Error messages like: ERROR, Not found, and 42 are not very helpful. GNU
1091 parallel strives to inform the user:
1092
1093 • What went wrong?
1094
1095 • Why did it go wrong?
1096
1097 • What can be done about it?
1098
1099 Unfortunately it is not always possible to predict the root cause of
1100 the error.
1101
1102 Determine number of CPUs
CPUs is an ambiguous term. It can mean the number of sockets filled
(i.e. the number of physical chips). It can mean the number of cores
1105 (i.e. the number of physical compute cores). It can mean the number of
1106 hyperthreaded cores (i.e. the number of virtual cores - with some of
1107 them possibly being hyperthreaded).
1108
1109 On ark.intel.com Intel uses the terms cores and threads for number of
1110 physical cores and the number of hyperthreaded cores respectively.
1111
GNU parallel uses CPUs as the number of compute units and the
1113 terms sockets, cores, and threads to specify how the number of compute
1114 units is calculated.
1115
1116 Computation of load
Contrary to the obvious, --load does not use load average. This is
due to load average rising too slowly. Instead it uses ps to list the
1119 number of threads in running or blocked state (state D, O or R). This
1120 gives an instant load.
1121
1122 As remote calculation of load can be slow, a process is spawned to run
1123 ps and put the result in a file, which is then used next time.
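
A sketch of the instant load computation:

    # Count processes in running (R), uninterruptible sleep (D) or
    # runnable-on-another-CPU (O) state
    my $load = grep { /^[DOR]/ } `ps -eo state`;
    $load--;   # do not count the ps process itself (it is running)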
1124
1125 Killing jobs
1126 GNU parallel kills jobs. It can be due to --memfree, --halt, or when
1127 GNU parallel meets a condition from which it cannot recover. Every job
1128 is started as its own process group. This way any (grand)*children will
1129 get killed, too. The process group is killed with the specification
1130 mentioned in --termseq.
1131
1132 SQL interface
1133 GNU parallel uses the DBURL from GNU sql to give database software,
1134 username, password, host, port, database, and table in a single string.
1135
1136 The DBURL must point to a table name. The table will be dropped and
1137 created. The reason for not reusing an existing table is that the user
1138 may have added more input sources which would require more columns in
1139 the table. By prepending '+' to the DBURL the table will not be
1140 dropped.
1141
1142 The table columns are similar to joblog with the addition of V1 .. Vn
1143 which are values from the input sources, and Stdout and Stderr which
1144 are the output from standard output and standard error, respectively.
1145
1146 The Signal column has been renamed to _Signal due to Signal being a
1147 reserved word in MySQL.
1148
1149 Logo
1150 The logo is inspired by the Cafe Wall illusion. The font is DejaVu
1151 Sans.
1152
1153 Citation notice
1154 Funding a free software project is hard. GNU parallel is no exception.
1155 On top of that it seems the less visible a project is, the harder it is
1156 to get funding. And the nature of GNU parallel is that it will never be
1157 seen by "the guy with the checkbook", but only by the people doing the
1158 actual work.
1159
1160 This problem has been covered by others - though no solution has been
1161 found: https://www.slideshare.net/NadiaEghbal/consider-the-maintainer
1162 https://www.numfocus.org/blog/why-is-numpy-only-now-getting-funded/
1163
1164 Before implementing the citation notice it was discussed with the
1165 users:
1166 https://lists.gnu.org/archive/html/parallel/2013-11/msg00006.html
1167
1168 Having to spend 10 seconds on running parallel --citation once is no
1169 doubt not an ideal solution, but no one has so far come up with an
1170 ideal solution - neither for funding GNU parallel nor other free
1171 software.
1172
1173 If you believe you have the perfect solution, you should try it out,
1174 and if it works, you should post it on the email list. Ideas that will
1175 cost work and which have not been tested are, however, unlikely to be
1176 prioritized.
1177
1178 Running parallel --citation one single time takes less than 10 seconds,
1179 and will silence the citation notice for future runs. This is
1180 comparable to graphical tools where you have to click a checkbox saying
1181 "Do not show this again". But if that is too much trouble for you, why
1182 not use one of the alternatives instead? See a list in: man
1183 parallel_alternatives.
1184
1185 As the request for citation is not a legal requirement this is
1186 acceptable under GPLv3 and cleared with Richard M. Stallman himself.
1187 Thus it does not fall under this:
1188 https://www.gnu.org/licenses/gpl-faq.en.html#RequireCitation
1189
Ideas for new design
1191 Multiple processes working together
1192 Open3 is slow. Printing is slow. It would be good if they did not tie
1193 up resources, but were run in separate threads.
1194
1195 --rrs on remote using a perl wrapper
... | perl -pe '$/ = $recend.$recstart;
       s/^\Q$recstart\E// if $. == 1;    # strip leading recstart
       s/\Q$recend$recstart\E$//;        # strip the separator
       eof and s/\Q$recend\E$//;'        # strip trailing recend
1198
It ought to be possible to write a filter like this that removes the
record separators on the fly instead of doing it inside GNU parallel.
This could then use more CPUs.
1201
1202 Will that require 2x record size memory?
1203
1204 Will that require 2x block size memory?
1205
Historical decisions
These decisions were relevant for earlier versions of GNU parallel,
but not the current version. They are kept here as a historical
record.
1209
1210 --tollef
1211 You can read about the history of GNU parallel on
1212 https://www.gnu.org/software/parallel/history.html
1213
1214 --tollef was included to make GNU parallel switch compatible with the
1215 parallel from moreutils (which is made by Tollef Fog Heen). This was
done so that users of that parallel could easily port their use to GNU
1217 parallel: Simply set PARALLEL="--tollef" and that would be it.
1218
1219 But several distributions chose to make --tollef global (by putting it
1220 into /etc/parallel/config) without making the users aware of this, and
1221 that caused much confusion when people tried out the examples from GNU
1222 parallel's man page and these did not work. The users became
1223 frustrated because the distribution did not make it clear to them that
it had made --tollef global.
1225
1226 So to lessen the frustration and the resulting support, --tollef was
1227 obsoleted 20130222 and removed one year later.
1228
1229
1230
20220222                          2022-03-16               PARALLEL_DESIGN(7)