parallel_design(7)

PARALLEL_DESIGN(7)                 parallel                 PARALLEL_DESIGN(7)

Design of GNU Parallel

       This document describes design decisions made in the development of GNU
       parallel and the reasoning behind them. It will give an overview of why
       some of the code looks the way it does, and will help new maintainers
       understand the code better.

   One file program
       GNU parallel is a Perl script in a single file. It is object oriented,
       but contrary to normal Perl scripts each class is not in its own file.
       This is due to user experience: The goal is that in a pinch the user
       will be able to get GNU parallel working simply by copying a single
       file: No need to mess around with environment variables like PERL5LIB.

   Choice of programming language
       GNU parallel is designed to be able to run on old systems. That means
       that it cannot depend on a compiler being installed - and especially
       not a compiler for a language that is less than 20 years old.

       The goal is that you can use GNU parallel on any system, even if you
       are not allowed to install additional software.

       Of all the systems I have experienced, I have yet to see a system that
       had GCC installed that did not have Perl. The same goes for Rust, Go,
       Haskell, and other younger languages. I have, however, seen systems
       with Perl without any of the mentioned compilers.

       Most modern systems also have either Python2 or Python3 installed, but
       you still cannot be certain which version, and since Python2 code
       cannot run under Python3, Python is not an option.

       Perl has the added benefit that implementing the {= perlexpr =}
       replacement string was fairly easy.

   Old Perl style
       GNU parallel uses some old, deprecated constructs. This is due to a
       goal of being able to run on old installations. Currently the target is
       CentOS 3.9 and Perl 5.8.0.

   Scalability up and down
       The smallest system GNU parallel is tested on is a 32 MB ASUS WL500gP.
       The largest is a 2 TB 128-core machine. It scales up to around 100
       machines - depending on the duration of each job.

   Exponentially back off
       GNU parallel busy waits. A job may be held back for reasons other than
       a busy job slot, such as the load average (when using --load), so it
       does not make sense to wait for a job to finish. Instead the load
       average must be checked again. Load average is not the only reason:
       --timeout has a similar problem.

       To not burn up too much CPU, GNU parallel sleeps exponentially longer
       and longer if nothing happens, maxing out at 1 second.
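
       The back-off can be illustrated with a minimal shell sketch. The helper
       name next_delay_ms and the growth rule (about 3% plus 1 ms per idle
       round, borrowed from the remote system wrapper shown later in this
       document) are assumptions for illustration; the real loop is written in
       Perl and may differ.

```shell
# Hypothetical helper: given the current delay in milliseconds,
# print the next delay. The delay grows by about 3% plus 1 ms per
# idle round and is capped at 1000 ms (1 second).
next_delay_ms() {
    next=$(( $1 * 103 / 100 + 1 ))
    if [ "$next" -gt 1000 ]; then
        next=1000
    fi
    echo "$next"
}

# Three idle rounds starting from a freshly reset delay of 0 ms.
delay=0
for round in 1 2 3; do
    delay=$(next_delay_ms "$delay")
done
echo "$delay"
```

       A caller would reset the delay to 0 whenever a job starts or finishes,
       so the polling is only slow while nothing is happening.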

   Shell compatibility
       It is a goal to have GNU parallel work equally well in any shell.
       However, in practice GNU parallel is being developed in bash and thus
       testing in other shells is limited to reported bugs.

       When an incompatibility is found there is often not an easy fix: Fixing
       the problem in csh often breaks it in bash. In these cases the fix is
       often to use a small Perl script and call that.

   env_parallel
       env_parallel is a dummy shell script that will run if env_parallel is
       not an alias or a function and tell the user how to activate the
       alias/function for the supported shells.

       The alias or function will copy the current environment and run the
       command with GNU parallel in the copy of the environment.

       The problem is that you cannot access all of the current environment
       inside Perl, e.g. aliases, functions and unexported shell variables.

       The idea is therefore to take the environment and put it in
       $PARALLEL_ENV which GNU parallel prepends to every command.

       The only way to have access to the environment is directly from the
       shell, so the program must be written in a shell script that will be
       sourced and therefore has to deal with the dialect of the relevant
       shell.

       env_parallel.*

       These are the files that implement the alias or function env_parallel
       for a given shell. It could be argued that these should be put in some
       obscure place under /usr/lib, but by putting them in your path it
       becomes trivial to find the path to them and source them:

         source `which env_parallel.foo`

       The beauty is that they can be put anywhere in the path without the
       user having to know the location. So if the user's path includes
       /afs/bin/i386_fc5 or /usr/pkg/parallel/bin or
       /usr/local/parallel/20161222/sunos5.6/bin the files can be put in the
       dir that makes most sense for the sysadmin.

       env_parallel.bash / env_parallel.sh / env_parallel.ash /
       env_parallel.dash / env_parallel.zsh / env_parallel.ksh /
       env_parallel.mksh

       env_parallel.(bash|sh|ash|dash|ksh|mksh|zsh) defines the function
       env_parallel. It uses alias and typeset to dump the configuration (with
       a few exceptions) into $PARALLEL_ENV before running GNU parallel.

       After GNU parallel is finished, $PARALLEL_ENV is deleted.

       env_parallel.csh

       env_parallel.csh has two purposes: If env_parallel is not an alias:
       make it into an alias that sets $PARALLEL with arguments and calls
       env_parallel.csh.

       If env_parallel is an alias, then env_parallel.csh uses $PARALLEL as
       the arguments for GNU parallel.

       It exports the environment by writing a variable definition to a file
       for each variable.  The definitions of aliases are appended to this
       file. Finally the file is put into $PARALLEL_ENV.

       GNU parallel is then run and $PARALLEL_ENV is deleted.

       env_parallel.fish

       First all function definitions are generated using a loop and
       functions.

       Dumping the scalar variable definitions is harder.

       fish can represent non-printable characters in (at least) 2 ways. To
       avoid problems all scalars are converted to \XX quoting.

       Then commands to generate the definitions are made and separated by
       NUL.

       This is then piped into a Perl script that quotes all values. List
       elements will be appended using two spaces.

       Finally \n is converted into \1 because fish variables cannot contain
       \n. GNU parallel will later convert all \1 from $PARALLEL_ENV into \n.

       This is then all saved in $PARALLEL_ENV.

       GNU parallel is called, and $PARALLEL_ENV is deleted.

   parset (supported in sh, ash, dash, bash, zsh, ksh, mksh)
       parset is a shell function. This is the reason why parset can set
       variables: It runs in the shell which is calling it.

       It is also the reason why parset does not work when data is piped into
       it: ... | parset ... makes parset start in a subshell, and any changes
       in environment can therefore not make it back to the calling shell.
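
       The subshell effect can be seen with any function that sets a variable.
       The sketch below uses a hypothetical stand-in named set_result instead
       of parset itself, and assumes a shell such as bash or dash in which
       every part of a pipeline runs in a subshell.

```shell
# Hypothetical stand-in for parset: a function that sets a variable
# in the shell that runs it.
set_result() {
    result=$1
}

result="unchanged"

# In a pipeline the function runs in a subshell (bash, dash), so the
# assignment is lost when the pipeline ends.
echo "ignored input" | set_result "from pipeline"
after_pipe=$result

# Called directly, the function runs in the calling shell and the
# assignment sticks.
set_result "from direct call"
after_call=$result

echo "$after_pipe / $after_call"
```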

   Job slots
       The easiest way to explain what GNU parallel does is to assume that
       there are a number of job slots, and when a slot becomes available a
       job from the queue will be run in that slot. But originally GNU
       parallel did not model job slots in the code. Job slots have been added
       to make it possible to use {%} as a replacement string.

       While the job sequence number can be computed in advance, the job slot
       can only be computed the moment a slot becomes available. So it has
       been implemented as a stack with lazy evaluation: Draw one from an
       empty stack and the stack is extended by one. When a job is done, push
       the available job slot back on the stack.

       This implementation also means that if you re-run the same jobs, you
       cannot assume jobs will get the same slots. And if you use remote
       executions, you cannot assume that a given job slot will remain on the
       same remote server. This goes double since the number of job slots can
       be adjusted on the fly (by giving --jobs a file name).
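
       The lazily extended stack can be sketched in a few lines of shell. The
       names slot_acquire, slot_release, free_slots and max_slot are
       hypothetical, and the stack is kept as a space-separated string only to
       keep the sketch portable; GNU parallel itself implements this in Perl.

```shell
free_slots=""   # stack of released slot numbers, most recent first
max_slot=0      # highest slot number handed out so far

# Put the next slot number in $SLOT. A variable is used instead of
# stdout so the function can update the stack in the calling shell.
slot_acquire() {
    if [ -z "$free_slots" ]; then
        # Empty stack: extend it by one brand new slot.
        max_slot=$((max_slot + 1))
        SLOT=$max_slot
    else
        # Pop the most recently released slot.
        SLOT=${free_slots%% *}
        case $free_slots in
            *" "*) free_slots=${free_slots#* } ;;
            *)     free_slots="" ;;
        esac
    fi
}

# Push a finished job's slot back on the stack.
slot_release() {
    free_slots="$1${free_slots:+ $free_slots}"
}

slot_acquire; first=$SLOT     # new slot
slot_acquire; second=$SLOT    # new slot
slot_release "$first"         # the first job finishes
slot_acquire; third=$SLOT     # reuses the released slot
echo "$first $second $third"
```

       Because slots are reused in the order they are released, the slot a job
       gets depends on which earlier job happened to finish first.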

   Rsync protocol version
       rsync 3.1.x uses protocol 31 which is unsupported by version 2.5.7.
       That means that you cannot push a file to a remote system using rsync
       protocol 31, if the remote system uses 2.5.7. rsync does not
       automatically downgrade to protocol 30.

       GNU parallel does not require protocol 31, so if the rsync version is
       >= 3.1.0 then --protocol 30 is added to force newer rsyncs to talk to
       version 2.5.7.
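
       The version check can be sketched as below. The function name
       rsync_protocol_opt is hypothetical, and the sketch assumes it is handed
       a plain dotted version string such as 3.1.0; how GNU parallel obtains
       and parses the version is not shown here.

```shell
# Hypothetical helper: print "--protocol 30" if the given rsync
# version is >= 3.1.0, and print nothing otherwise.
# Assumes a version of the form MAJOR.MINOR[.PATCH].
rsync_protocol_opt() {
    version=$1
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 1 ]; }; then
        echo "--protocol 30"
    fi
}

rsync_protocol_opt 3.1.0
rsync_protocol_opt 2.5.7
```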

   Compression
       GNU parallel buffers output in temporary files. --compress compresses
       the buffered data.  This is a bit tricky because there should be no
       files to clean up if GNU parallel is killed by a power outage.

       GNU parallel first selects a compression program. If the user has not
       selected one, the first of these that is in $PATH is used: pzstd lbzip2
       pbzip2 zstd pixz lz4 pigz lzop plzip lzip gzip lrz pxz bzip2 lzma xz
       clzip. They are sorted by speed on a 128 core machine.
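
       Picking the first available program from an ordered list can be
       sketched as follows. The function name first_in_path is hypothetical,
       and command -v is used as the lookup; note that command -v also matches
       shell builtins and functions, which a pure $PATH search would not.

```shell
# Hypothetical helper: print the first candidate that the shell can
# find, or return 1 if none of them is available.
first_in_path() {
    for prog in "$@"; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "$prog"
            return 0
        fi
    done
    return 1
}

# Candidates in the order given above; empty if none is installed.
compressor=$(first_in_path pzstd lbzip2 pbzip2 zstd pixz lz4 pigz lzop \
    plzip lzip gzip lrz pxz bzip2 lzma xz clzip) || compressor=""
echo "selected: ${compressor:-none}"
```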

       Schematically the setup is as follows:

         command started by parallel | compress > tmpfile
         cattail tmpfile | uncompress | parallel which reads the output

       The setup is duplicated for both standard output (stdout) and standard
       error (stderr).

       GNU parallel pipes output from the command run into the compression
       program which saves to a tmpfile. GNU parallel records the pid of the
       compress program.  At the same time a small Perl script (called cattail
       above) is started: It basically does cat followed by tail -f, but it
       also removes the tmpfile as soon as the first byte is read, and it
       continuously checks if the pid of the compression program is dead. If
       the compress program is dead, cattail reads the rest of tmpfile and
       exits.

       As most compression programs write out a header when they start, the
       tmpfile in practice is removed by cattail after around 40 ms.
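
       The reason nothing is left behind is that on Unix a file can be removed
       while a process still has it open: the data stays readable through the
       open file descriptor until it is closed. The sketch below shows only
       this property; it is not the cattail script, and the file descriptor
       number and variable names are arbitrary choices for the example.

```shell
# Create a temporary file with some buffered output in it.
tmpfile=$(mktemp)
printf 'buffered output\n' > "$tmpfile"

# Open the file for reading on file descriptor 3, then remove its name.
exec 3< "$tmpfile"
rm -f "$tmpfile"

# The name is gone, but the contents can still be read through fd 3.
content=$(cat <&3)
exec 3<&-

echo "$content"
```

       If the process dies at any point after the rm, the kernel frees the
       data by itself, so there is no file to clean up.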

   Wrapping
       The command given by the user can be wrapped in multiple templates.
       Templates can be wrapped in other templates.

       $COMMAND       the command to run.

       $INPUT         the input to run.

       $SHELL         the shell that started GNU Parallel.

       $SSHLOGIN      the sshlogin.

       $WORKDIR       the working dir.

       $FILE          the file to read parts from.

       $STARTPOS      the first byte position to read from $FILE.

       $LENGTH        the number of bytes to read from $FILE.

       --shellquote   echo Double quoted $INPUT

       --nice pri     Remote: See The remote system wrapper.

                      Local: setpriority(0,0,$nice)

       --cat
                        cat > {}; $COMMAND {};
                        perl -e '$bash = shift;
                          $csh = shift;
                          for(@ARGV) { unlink;rmdir; }
                          if($bash =~ s/h//) { exit $bash;  }
                          exit $csh;' "$?h" "$status" {};

                      {} is set to $PARALLEL_TMP which is a tmpfile. The Perl
                      script saves the exit value, unlinks the tmpfile, and
                      returns the exit value - no matter if the shell is
                      bash/ksh/zsh (using $?) or *csh/fish (using $status).

       --fifo
                        perl -e '($s,$c,$f) = @ARGV;
                          # mkfifo $PARALLEL_TMP
                          system "mkfifo", $f;
                          # spawn $shell -c $command &
                          $pid = fork || exec $s, "-c", $c;
                          open($o,">",$f) || die $!;
                          # cat > $PARALLEL_TMP
                          while(sysread(STDIN,$buf,131072)){
                             syswrite $o, $buf;
                          }
                          close $o;
                          # waitpid to get the exit code from $command
                          waitpid $pid,0;
                          # Cleanup
                          unlink $f;
                          exit $?/256;' $SHELL $COMMAND $PARALLEL_TMP

                      This is an elaborate way of: mkfifo {}; run $COMMAND in
                      the background using $SHELL; copying STDIN to {};
                      waiting for background to complete; remove {} and exit
                      with the exit code from $COMMAND.

                      It is made this way to be compatible with *csh/fish.

       --pipepart
                        < $FILE perl -e 'while(@ARGV) {
                            sysseek(STDIN,shift,0) || die;
                            $left = shift;
                            while($read =
                                  sysread(STDIN,$buf,
                                          ($left > 131072 ? 131072 : $left))){
                              $left -= $read;
                              syswrite(STDOUT,$buf);
                            }
                          }' $STARTPOS $LENGTH

                      This will read $LENGTH bytes from $FILE starting at
                      $STARTPOS and send it to STDOUT.

       --sshlogin $SSHLOGIN
                        ssh $SSHLOGIN "$COMMAND"

       --transfer
                        ssh $SSHLOGIN mkdir -p ./$WORKDIR;
                        rsync --protocol 30 -rlDzR \
                              -essh ./{} $SSHLOGIN:./$WORKDIR;
                        ssh $SSHLOGIN "$COMMAND"

                      Read about --protocol 30 in the section Rsync protocol
                      version.

       --transferfile file
                      <<todo>>

       --basefile     <<todo>>

       --return file
                        $COMMAND; _EXIT_status=$?; mkdir -p $WORKDIR;
                        rsync --protocol 30 \
                          --rsync-path=cd\ ./$WORKDIR\;\ rsync \
                          -rlDzR -essh $SSHLOGIN:./$FILE ./$WORKDIR;
                        exit $_EXIT_status;

                      The --rsync-path=cd ... is needed because old versions
                      of rsync do not support --no-implied-dirs.

                      The $_EXIT_status trick is to postpone the exit value.
                      This makes it incompatible with *csh and should be fixed
                      in the future. Maybe a wrapping 'sh -c' is enough?

       --cleanup      $RETURN is the wrapper from --return

                        $COMMAND; _EXIT_status=$?; $RETURN;
                        ssh $SSHLOGIN \(rm\ -f\ ./$WORKDIR/{}\;\
                                        rmdir\ ./$WORKDIR\ \>\&/dev/null\;\);
                        exit $_EXIT_status;

                      $_EXIT_status: see --return above.

       --pipe
                        perl -e 'if(sysread(STDIN, $buf, 1)) {
                              open($fh, "|-", "@ARGV") || die;
                              syswrite($fh, $buf);
                              # Align up to 128k block
                              if($read = sysread(STDIN, $buf, 131071)) {
                                  syswrite($fh, $buf);
                              }
                              while($read = sysread(STDIN, $buf, 131072)) {
                                  syswrite($fh, $buf);
                              }
                              close $fh;
                              exit ($?&127 ? 128+($?&127) : 1+$?>>8)
                          }' $SHELL -c $COMMAND

                      This small wrapper makes sure that $COMMAND will never
                      be run if there is no data.

       --tmux         <<TODO Fixup with '-quoting>> mkfifo /tmp/tmx3cMEV &&
                        sh -c 'tmux -S /tmp/tmsaKpv1 new-session -s p334310 -d
                      "sleep .2" >/dev/null 2>&1'; tmux -S /tmp/tmsaKpv1
                      new-window -t p334310 -n wc\ 10 \(wc\ 10\)\;\ perl\ -e\
                      \'while\(\$t++\<3\)\{\ print\ \$ARGV\[0\],\"\\n\"\ \}\'\
                      \$\?h/\$status\ \>\>\ /tmp/tmx3cMEV\&echo\ wc\\\ 10\;\
                      echo\ \Job\ finished\ at:\ \`date\`\;sleep\ 10; exec
                      perl -e '$/="/";$_=<>;$c=<>;unlink $ARGV; /(\d+)h/ and
                      exit($1);exit$c' /tmp/tmx3cMEV

                      mkfifo tmpfile.tmx; tmux -S <tmpfile.tms> new-session -s
                      pPID -d 'sleep .2' >&/dev/null; tmux -S <tmpfile.tms>
                      new-window -t pPID -n <<shell quoted input>> \(<<shell
                      quoted input>>\)\;\ perl\ -e\ \'while\(\$t++\<3\)\{\
                      print\ \$ARGV\[0\],\"\\n\"\ \}\'\ \$\?h/\$status\ \>\>\
                      tmpfile.tmx\&echo\ <<shell double quoted input>>\;echo\
                      \Job\ finished\ at:\ \`date\`\;sleep\ 10; exec perl -e
                      '$/="/";$_=<>;$c=<>;unlink $ARGV; /(\d+)h/ and
                      exit($1);exit$c' tmpfile.tmx

                      First a FIFO is made (.tmx). It is used for
                      communicating exit value. Next a new tmux session is
                      made. This may fail if there is already a session, so
                      the output is ignored. If all job slots finish at the
                      same time, then tmux will close the session. A temporary
                      socket is made (.tms) to avoid a race condition in tmux.
                      It is cleaned up when GNU parallel finishes.

                      The input is used as the name of the windows in tmux.
                      When the job inside tmux finishes, the exit value is
                      printed to the FIFO (.tmx).  This FIFO is opened by perl
                      outside tmux, and perl then removes the FIFO. Perl
                      blocks until the first value is read from the FIFO, and
                      this value is used as exit value.

                      To make it compatible with csh and bash the exit value
                      is printed as: $?h/$status and this is parsed by perl.

                      There is a bug that makes it necessary to print the exit
                      value 3 times.

                      Another bug in tmux requires the combined length of the
                      tmux title and command to stay outside certain ranges.
                      When the length falls inside one of these ranges, 75
                      '\ ' are added to the title to force it outside.

                      You can map the bad ranges using:

                        perl -e 'sub r { int(rand(shift)).($_[0] && "\t".r(@_)) } print map { r(@ARGV)."\n" } 1..10000' 1600 1500 90 |
                          perl -ane '$F[0]+$F[1]+$F[2] < 2037 and print ' |
                          parallel --colsep '\t' --tagstring '{1}\t{2}\t{3}' tmux -S /tmp/p{%}-'{=3 $_="O"x$_ =}' \
                            new-session -d -n '{=1 $_="O"x$_ =}' true'\ {=2 $_="O"x$_ =};echo $?;rm -f /tmp/p{%}-O*'

                        perl -e 'sub r { int(rand(shift)).($_[0] && "\t".r(@_)) } print map { r(@ARGV)."\n" } 1..10000' 17000 17000 90 |
                          parallel --colsep '\t' --tagstring '{1}\t{2}\t{3}' \
                        tmux -S /tmp/p{%}-'{=3 $_="O"x$_ =}' new-session -d -n '{=1 $_="O"x$_ =}' true'\ {=2 $_="O"x$_ =};echo $?;rm /tmp/p{%}-O*'
                        > value.csv 2>/dev/null

                        R -e 'a<-read.table("value.csv");X11();plot(a[,1],a[,2],col=a[,4]+5,cex=0.1);Sys.sleep(1000)'

                      For tmux 1.8 17000 can be lowered to 2100.

                      The interesting areas are title 0..1000 with (title +
                      whole command) in 996..1127 and 9331..9636.
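
       What the --pipepart wrapper above does, reading $LENGTH bytes of a file
       starting at byte offset $STARTPOS, can be approximated with standard
       tools. This is a sketch and not how GNU parallel does it: the function
       name read_part is hypothetical, and it assumes a head that supports -c,
       which is common but not required by POSIX.

```shell
# Hypothetical helper: print LENGTH bytes of FILE starting at the
# zero-based byte offset STARTPOS.
read_part() {
    file=$1
    startpos=$2
    length=$3
    # tail -c +N counts bytes from 1, hence the + 1.
    tail -c +"$((startpos + 1))" "$file" | head -c "$length"
}

demo=$(mktemp)
printf 'abcdefghij' > "$demo"
part=$(read_part "$demo" 3 4)
rm -f "$demo"
echo "$part"
```

       The Perl version seeks directly to the start position instead of
       reading and discarding the bytes before it, which matters for large
       files.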

       The ordering of the wrapping is important:

       •    $PARALLEL_ENV which is set in env_parallel.* must be prepended to
            the command first, as the command may contain exported variables
            or functions.

       •    --nice/--cat/--fifo should be done on the remote machine.

       •    --pipepart/--pipe should be done on the local machine inside
            --tmux.

   Convenience options --nice --basefile --transfer --return --cleanup --tmux
       --group --compress --cat --fifo --workdir --tag --tagstring
       These are all convenience options that make it easier to do a task. But
       more importantly: They are tested to work on corner cases, too. Take
       --nice as an example:

         nice parallel command ...

       will work just fine. But when run remotely, you need to move the nice
       command so it is being run on the server:

         parallel -S server nice command ...

       And this will again work just fine, as long as you are running a single
       command. When you are running a composed command you need nice to apply
       to the whole command, and it gets harder still:

         parallel -S server -q nice bash -c 'command1 ...; cmd2 | cmd3'

       It is not impossible, but by using --nice GNU parallel will do the
       right thing for you. Similarly when transferring files: It starts to
       get hard when the file names contain space, :, `, *, or other special
       characters.

       To run the commands in a tmux session you basically just need to quote
       the command. For simple commands that is easy, but when commands
       contain special characters, it gets much harder to get right.

       --compress not only compresses standard output (stdout) but also
       standard error (stderr); and it does so into files that are open but
       deleted, so a crash will not leave these files around.

       --cat and --fifo are easy to do by hand, until you want to clean up the
       tmpfile and keep the exit code of the command.
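
       Keeping the exit code across shell dialects is the subtle part. The
       --cat wrapper passes both "$?h" and "$status" to a helper: in
       bash-like shells "$?h" is the exit value followed by the letter h,
       while in csh $?h only tells whether a variable named h is set and the
       exit value is in $status. The selection logic can be sketched like
       this, with pick_exit_value as a hypothetical name:

```shell
# Hypothetical helper mirroring the selection in the --cat wrapper:
#   $1 is what "$?h" expanded to, $2 is what "$status" expanded to.
pick_exit_value() {
    case $1 in
        # bash/ksh/zsh: "$?h" looks like "3h", so strip the h.
        *h) echo "${1%h}" ;;
        # csh-like shells: there is no trailing h, so use $status.
        *)  echo "$2" ;;
    esac
}

pick_exit_value "3h" ""
pick_exit_value "0" "7"
```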

       The real killer comes when you try to combine several of these: Doing
       that correctly for all corner cases is next to impossible to do by
       hand.

   --shard
       The simple way to implement sharding would be to:

       1.   start n jobs,

       2.   split each line into columns,

       3.   select the data from the relevant column

       4.   compute a hash value from the data

       5.   take the modulo n of the hash value

       6.   pass the full line to the jobslot that has the computed value

       Unfortunately Perl is rather slow at computing the hash value (and
       somewhat slow at splitting into columns).

       One solution is to use a compiled language for the splitting and
       hashing, but that would go against the design criteria of not depending
       on a compiler.

       Luckily those tasks can be parallelized. So GNU parallel starts n
       sharders that do step 2-6, and passes blocks of 100k to each of those
       in a round robin manner. To make sure these sharders compute the hash
       the same way, $PERL_HASH_SEED is set to the same value for all
       sharders.
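
       Steps 2-5 can be sketched in shell as below. The function names
       shard_of and shard_line are hypothetical, tab is assumed as the column
       separator, and cksum is used as a stand-in hash purely for
       illustration; the sharders described above are Perl and rely on a fixed
       $PERL_HASH_SEED instead.

```shell
# Hypothetical helper: map a key to a shard number in 0..n-1.
# cksum is only a stand-in hash; what matters is that every sharder
# computes the same value for the same key.
shard_of() {
    key=$1
    n=$2
    crc=$(printf '%s' "$key" | cksum)
    crc=${crc%% *}          # keep the checksum, drop the byte count
    echo $((crc % n))
}

# Hypothetical helper: shard number for one tab-separated line.
#   $1 = line, $2 = column number (1-based), $3 = number of shards
shard_line() {
    key=$(printf '%s\n' "$1" | cut -f "$2")
    shard_of "$key" "$3"
}

line=$(printf 'x\talice\ty')
shard_line "$line" 2 4
```

       Lines with the same value in the chosen column always get the same
       shard number, no matter which sharder processes them.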

       Running n sharders poses a new problem: Instead of having n outputs
       (one for each computed value) you now have n outputs for each of the n
       values, so in total n*n outputs; and you need to merge these n*n
       outputs together into n outputs.

       This can be done by simply running 'parallel -j0 --lb cat :::
       outputs_for_one_value', but that is rather inefficient, as it spawns a
       process for each file. Instead the core code from 'parcat' is run,
       which is also a bit faster.

       All the sharders and parcats communicate through named pipes that are
       unlinked as soon as they are opened.

   Shell shock
       The shell shock bug in bash did not affect GNU parallel, but the
       solutions did. bash first introduced functions in variables named:
       BASH_FUNC_myfunc() and later changed that to BASH_FUNC_myfunc%%. When
       transferring functions GNU parallel reads off the function and changes
       that into a function definition, which is copied to the remote system
       and executed before the actual command is executed. Therefore GNU
       parallel needs to know how to read the function.

       From version 20150122 GNU parallel tries both the ()-version and the
       %%-version, and the function definition works on both pre- and
       post-shell shock versions of bash.
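
       Recognizing both naming schemes can be sketched as follows. The
       function name bash_func_name is hypothetical and the sketch only
       extracts the function name from the variable name; turning the
       variable's value into a function definition is not shown.

```shell
# Hypothetical helper: given the name of an environment variable,
# print the bash function name it carries, or return 1 if the
# variable does not follow either of the two naming schemes.
bash_func_name() {
    name=${1#BASH_FUNC_}
    case $1 in
        BASH_FUNC_*'()') name=${name%'()'} ;;   # first scheme
        BASH_FUNC_*'%%') name=${name%'%%'} ;;   # later scheme
        *) return 1 ;;
    esac
    echo "$name"
}

bash_func_name 'BASH_FUNC_myfunc()'
bash_func_name 'BASH_FUNC_myfunc%%'
```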

   The remote system wrapper
       The remote system wrapper does some initialization before starting the
       command on the remote system.

       Ctrl-C and standard error (stderr)

       If the user presses Ctrl-C the user expects jobs to stop. This works
       out of the box if the jobs are run locally. Unfortunately it is not so
       simple if the jobs are run remotely.

       If remote jobs are run in a tty using ssh -tt, then Ctrl-C works, but
       all output to standard error (stderr) is sent to standard output
       (stdout). This is not what the user expects.

       If remote jobs are run without a tty using ssh (without -tt), then
       output to standard error (stderr) is kept on stderr, but Ctrl-C does
       not kill remote jobs. This is not what the user expects.

       So what is needed is a way to have both. It seems the reason why Ctrl-C
       does not kill the remote jobs is because the shell does not propagate
       the hang-up signal from sshd. But when sshd dies, the parent of the
       login shell becomes init (process id 1). So by exec'ing a Perl wrapper
       to monitor the parent pid and kill the child if the parent pid becomes
       1, then Ctrl-C works and stderr is kept on stderr.

       Ctrl-C does, however, kill the ssh connection, so any output from a
       remote dying process is lost.

       To be able to kill all (grand)*children a new process group is started.

       --nice

       niceing the remote process is done by setpriority(0,0,$nice). A few old
       systems do not implement this and --nice is unsupported on those.

       Setting $PARALLEL_TMP

       $PARALLEL_TMP is used by --fifo and --cat and must point to a
       non-existent file in $TMPDIR. This file name is computed on the remote
       system.

       The wrapper

       The wrapper looks like this:

         $shell = $PARALLEL_SHELL || $SHELL;
         $tmpdir = $TMPDIR;
         $nice = $opt::nice;
         # Set $PARALLEL_TMP to a non-existent file name in $TMPDIR
         do {
             $ENV{PARALLEL_TMP} = $tmpdir."/par".
               join"", map { (0..9,"a".."z","A".."Z")[rand(62)] } (1..5);
         } while(-e $ENV{PARALLEL_TMP});
         $SIG{CHLD} = sub { $done = 1; };
         $pid = fork;
         unless($pid) {
             # Make own process group to be able to kill HUP it later
             setpgrp;
             eval { setpriority(0,0,$nice) };
             exec $shell, "-c", ($bashfunc."@ARGV");
             die "exec: $!\n";
         }
         do {
             # Parent is not init (ppid=1), so sshd is alive
             # Exponential sleep up to 1 sec
             $s = $s < 1 ? 0.001 + $s * 1.03 : $s;
             select(undef, undef, undef, $s);
         } until ($done || getppid == 1);
         # Kill HUP the process group if job not done
         kill(SIGHUP, -${pid}) unless $done;
         wait;
         exit ($?&127 ? 128+($?&127) : 1+$?>>8)

   Transferring of variables and functions
       Transferring of variables and functions given by --env is done by
       running a Perl script remotely that calls the actual command. The Perl
       script sets $ENV{variable} to the correct value before exec'ing a shell
       that runs the function definition followed by the actual command.

       The function env_parallel copies the full current environment into the
       environment variable PARALLEL_ENV. This variable is picked up by GNU
       parallel and used to create the Perl script mentioned above.

   Base64 encoded bzip2
       csh limits words of commands to 1024 chars. This is often too little
       when GNU parallel encodes environment variables and wraps the command
       with different templates. All of these are combined and quoted into one
       single word, which often is longer than 1024 chars.

       When the line to run is > 1000 chars, GNU parallel therefore encodes
       the line to run. The encoding bzip2s the line to run, converts this to
       base64, splits the base64 into 1000 char blocks (so csh does not fail),
       and prepends it with this Perl script that decodes, decompresses and
       evals the line.
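
       The encoding steps can be tried by hand with standard tools. The sketch
       below is an approximation, not GNU parallel's own code: the function
       names encode_line and decode_line are hypothetical, and it assumes that
       bzip2 and a base64 command accepting -d for decoding are installed.

```shell
# Hypothetical helper: compress a line with bzip2, base64 encode it
# and split the result into blocks of at most 1000 characters.
encode_line() {
    printf '%s' "$1" | bzip2 | base64 | tr -d '\n' | fold -w 1000
}

# Hypothetical helper: reverse of encode_line, reading the blocks
# from standard input and printing the original line.
decode_line() {
    tr -d '\n' | base64 -d | bzip2 -dc
}
```

       Running encode_line on a command line and piping the result through
       decode_line should give back the original text, which is what the
       decoding script does before it evals the line.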

           @GNU_Parallel=("use","IPC::Open3;","use","MIME::Base64");
           eval "@GNU_Parallel";

           $SIG{CHLD}="IGNORE";
           # Search for bzip2. Not found => use default path
           my $zip = (grep { -x $_ } "/usr/local/bin/bzip2")[0] || "bzip2";
           # $in = stdin on $zip, $out = stdout from $zip
           my($in, $out,$eval);
           open3($in,$out,">&STDERR",$zip,"-dc");
           if(my $perlpid = fork) {
               close $in;
               $eval = join "", <$out>;
               close $out;
           } else {
               close $out;
               # Pipe decoded base64 into 'bzip2 -dc'
               print $in (decode_base64(join"",@ARGV));
               close $in;
               exit;
           }
           wait;
           eval $eval;

       Perl and bzip2 must be installed on the remote system, but a small test
       showed that bzip2 is installed by default on all platforms that run
       GNU parallel, so this is not a big problem.

       The added bonus of this is that much bigger environments can now be
       transferred as they will be below bash's limit of 131072 chars.

   Which shell to use
       Different shells behave differently. A command that works in tcsh may
       not work in bash.  It is therefore important that the correct shell is
       used when GNU parallel executes commands.

       GNU parallel tries hard to use the right shell. If GNU parallel is
       called from tcsh it will use tcsh.  If it is called from bash it will
       use bash. It does this by looking at the (grand)*parent process: If the
       (grand)*parent process is a shell, use this shell; otherwise look at
       the parent of this (grand)*parent. If none of the (grand)*parents are
       shells, then $SHELL is used.
659
660       This will do the right thing if called from:
661
662       • an interactive shell
663
664       • a shell script
665
       • a Perl script using `` or system, if the command is given as a
         single string.
667
668       While these cover most cases, there are situations where it will fail:
669
670       • When run using exec.
671
672       • When run as the last command using -c from another shell (because
673         some shells use exec):
674
           zsh% bash -c "parallel 'echo {} is not run in bash; \
                set | grep BASH_VERSION' ::: This"

         You can work around that by appending '&& true':

           zsh% bash -c "parallel 'echo {} is run in bash; \
                set | grep BASH_VERSION' ::: This && true"
682
683       • When run in a Perl script using system with parallel as the first
684         string:
685
           #!/usr/bin/perl

           system("parallel",'setenv a {}; echo $a',":::",2);
689
690         Here it depends on which shell is used to call the Perl script. If
691         the Perl script is called from tcsh it will work just fine, but if it
692         is called from bash it will fail, because the command setenv is not
693         known to bash.
694
       If GNU parallel guesses wrong in these situations, set the shell using
       $PARALLEL_SHELL.
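
       The (grand)*parent lookup described in this section can be sketched
       as follows. This is an illustration in Python and not the code GNU
       parallel uses: the list KNOWN_SHELLS and the function names are
       hypothetical, and ancestor_names() assumes a Linux style /proc file
       system.

```python
import os

# Hypothetical list for the example; not GNU parallel's actual list.
KNOWN_SHELLS = {"sh", "bash", "dash", "ksh", "zsh", "csh", "tcsh", "fish", "rc"}


def pick_shell(ancestors, fallback):
    """Return the nearest ancestor that is a shell, else the fallback."""
    for name in ancestors:
        if name in KNOWN_SHELLS:
            return name
    return fallback


def ancestor_names():
    """Names of parent, grandparent, ... read from /proc (Linux only)."""
    names = []
    pid = os.getppid()
    while pid > 0:
        fields = {}
        try:
            with open("/proc/%d/status" % pid) as status:
                for line in status:
                    key, _, value = line.partition(":")
                    fields[key] = value.strip()
            names.append(fields["Name"])
            pid = int(fields["PPid"])
        except (OSError, KeyError, ValueError):
            break  # no /proc, or the process went away: stop walking
    return names


# Called from perl, which was called from make, which was called from tcsh.
from_tcsh = pick_shell(["perl", "make", "tcsh", "sshd"], "/bin/sh")
# No shell among the ancestors: fall back to the value of $SHELL.
from_cron = pick_shell(["cron"], os.environ.get("SHELL", "/bin/sh"))
```

       A caller would combine the two as pick_shell(ancestor_names(),
       os.environ.get("SHELL", "/bin/sh")).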
697
698   Always running commands in a shell
       If the command is a simple command with no redirection and no setting
       of variables, the command could be run without spawning a shell. E.g.
       this simple grep matching either 'ls ' or ' wc >> c':
702
         parallel "grep -E 'ls | wc >> c' {}" ::: foo

       could be run as:

         system("grep","-E","ls | wc >> c","foo");

       However, as soon as the command is a bit more complex a shell must be
       spawned:

         parallel "grep -E 'ls | wc >> c' {} | wc >> c" ::: foo
         parallel "LANG=C grep -E 'ls | wc >> c' {}" ::: foo
714
715       It is impossible to tell how | wc >> c should be interpreted without
716       parsing the string (is the | a pipe in shell or an alternation in a
717       grep regexp?  Is LANG=C a command in csh or setting a variable in bash?
718       Is >> redirection or part of a regexp?).
719
720       On top of this, wrapper scripts will often require a shell to be
721       spawned.
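
       The difference between running a command directly and running it
       through a shell can be seen in a small sketch. It is written in
       Python for illustration only and assumes that echo, tr and /bin/sh
       are available; the same string is a plain argument in the first call
       and a pipeline in the second.

```python
import subprocess

# Argument list: no shell is involved, so "|" is just part of an argument.
direct = subprocess.run(
    ["echo", "a | tr a b"], capture_output=True, text=True
).stdout

# Single string handed to /bin/sh: the shell parses "|" as a pipe.
via_shell = subprocess.run(
    "echo a | tr a b", shell=True, capture_output=True, text=True
).stdout
```

       Without parsing the string the way the shell does, there is no way to
       know which of the two the user meant.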
722
723       The downside is that you need to quote special shell chars twice:
724
         parallel echo '*' ::: This will expand the asterisk
         parallel echo "'*'" ::: This will not
         parallel "echo '*'" ::: This will not
         parallel echo '\*' ::: This will not
         parallel echo \''*'\' ::: This will not
         parallel -q echo '*' ::: This will not

       -q will quote all special chars, thus redirection will not work: this
       prints '* > out.1' and does not save '*' into the file out.1:

         parallel -q echo "*" ">" out.{} ::: 1
736
       GNU parallel tries to live up to the Principle Of Least Astonishment
       (POLA), and the requirement of using -q is hard to understand when you
       do not see the whole picture.
740
741   Quoting
742       Quoting depends on the shell. For most shells '-quoting is used for
743       strings containing special characters.
744
745       For tcsh/csh newline is quoted as \ followed by newline. Other special
746       characters are also \-quoted.
747
748       For rc everything is quoted using '.
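
       The three quoting styles can be sketched as below. This is a rough
       illustration in Python, not GNU parallel's quoting code: the function
       names and the set of characters treated as safe are assumptions, and
       doubling an embedded ' for rc is an assumption about rc that the text
       above does not state.

```python
import subprocess

SAFE = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_./-"


def quote_sh(s):
    # '-quoting: an embedded ' closes the quote, adds \' and reopens it.
    return "'" + s.replace("'", "'\\''") + "'"


def quote_csh(s):
    # Backslash-quote everything outside SAFE; a newline becomes
    # a backslash followed by the newline itself.
    if s == "":
        return "''"
    return "".join(ch if ch in SAFE else "\\" + ch for ch in s)


def quote_rc(s):
    # Wrap in ' and double any embedded ' (assumed rc convention).
    return "'" + s.replace("'", "''") + "'"


tricky = "it's a * and $HOME"
# Round trip through sh: the quoted string must come back unchanged.
echoed = subprocess.run(
    ["sh", "-c", "printf %s " + quote_sh(tricky)],
    capture_output=True, text=True,
).stdout
```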
749
750   --pipepart vs. --pipe
751       While --pipe and --pipepart look much the same to the user, they are
752       implemented very differently.
753
       With --pipe GNU parallel reads the blocks from standard input (stdin)
       and passes each block to the command on its standard input (stdin), so
       every block is processed by GNU parallel itself. This is the reason why
       --pipe maxes out at around 500 MB/sec.
758
759       --pipepart, on the other hand, first identifies at which byte positions
760       blocks start and how long they are. It does that by seeking into the
761       file by the size of a block and then reading until it meets end of a
762       block. The seeking explains why GNU parallel does not know the line
763       number and why -L/-l and -N do not work.
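
       The seeking can be sketched like this. It is a simplified
       illustration in Python that assumes newline terminated records; the
       function name find_blocks is made up, and the real implementation is
       in Perl and handles other record separators.

```python
import os
import tempfile


def find_blocks(path, block_size):
    """Return (start, length) pairs covering the file, split after newlines."""
    size = os.path.getsize(path)
    starts = [0]
    with open(path, "rb") as f:
        pos = block_size
        while pos < size:
            f.seek(pos)      # jump ahead without reading what lies between
            f.readline()     # read on to the end of the current record
            start = f.tell()
            if start >= size:
                break
            if start > starts[-1]:
                starts.append(start)
            pos += block_size
    ends = starts[1:] + [size]
    return [(s, e - s) for s, e in zip(starts, ends)]


# Demo on a small temporary file of 10 records of 9 bytes each.
data = b"".join(b"record %d\n" % i for i in range(10))
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
blocks = find_blocks(tmp.name, block_size=25)
os.unlink(tmp.name)
```

       Only a few bytes around each multiple of the block size are read, so
       the records in between are never counted. That is why no line number
       is available.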
764
       With a reasonable block and file size this seeking is more than 1000
       times faster than reading the full file. The byte positions are then
767       given to a small script that reads from position X to Y and sends
768       output to standard output (stdout). This small script is prepended to
769       the command and the full command is executed just as if GNU parallel
770       had been in its normal mode. The script looks like this:
771
         < file perl -e 'while(@ARGV) {
            sysseek(STDIN,shift,0) || die;
            $left = shift;
            while($read = sysread(STDIN,$buf,
                                  ($left > 131072 ? 131072 : $left))){
              $left -= $read; syswrite(STDOUT,$buf);
            }
         }' startbyte length_in_bytes
780
781       It delivers 1 GB/s per core.
782
783       Instead of the script dd was tried, but many versions of dd do not
784       support reading from one byte to another and might cause partial data.
785       See this for a surprising example:
786
         yes | dd bs=1024k count=10 | wc
788
789   --block-size adjustment
790       Every time GNU parallel detects a record bigger than --block-size it
791       increases the block size by 30%. A small --block-size gives very poor
792       performance; by exponentially increasing the block size performance
793       will not suffer.
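
       A small sketch shows how quickly repeated 30% increases catch up with
       a large record. It is written in Python for illustration; the
       function name and the rounding up to whole bytes are assumptions, and
       only the 30% step comes from the text above.

```python
def grow_until_fits(block_size, record_size):
    """Grow block_size by 30% (rounded up) until one record fits."""
    block_size = max(1, block_size)
    increases = 0
    while block_size < record_size:
        block_size += (block_size * 3 + 9) // 10   # add 30%, rounded up
        increases += 1
    return block_size, increases


# A block size of 1000 bytes meeting a record of 5000 bytes.
final_size, increases = grow_until_fits(1000, 5000)
```

       Here 7 increases are enough for a record 5 times bigger than the
       block size, because the growth is exponential.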
794
795       GNU parallel will waste CPU power if --block-size does not contain a
796       full record, because it tries to find a full record and will fail to do
797       so. The recommendation is therefore to use a --block-size > 2 records,
798       so you always get at least one full record when you read one block.
799
800       If you use -N then --block-size should be big enough to contain N+1
801       records.
802
803   Automatic --block-size computation
804       With --pipepart GNU parallel can compute the --block-size
805       automatically. A --block-size of -1 will use a block size so that each
806       jobslot will receive approximately 1 block. --block -2 will pass 2
807       blocks to each jobslot and -n will pass n blocks to each jobslot.
808
809       This can be done because --pipepart reads from files, and we can
810       compute the total size of the input.
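
       The computation can be sketched as a single division. This is an
       illustration in Python; the function name auto_block_size and the
       rounding up are assumptions, not taken from GNU parallel's source.

```python
def auto_block_size(file_size, jobslots, block_arg):
    """A negative block_arg gives each jobslot about -block_arg blocks."""
    if block_arg >= 0:
        return block_arg                  # explicit size: nothing to compute
    total_blocks = jobslots * -block_arg
    return -(-file_size // total_blocks)  # ceiling division


one_block_each = auto_block_size(1000, 4, -1)
two_blocks_each = auto_block_size(1000, 4, -2)
```

       With a 1000 byte file and 4 jobslots, -1 gives blocks of 250 bytes
       and -2 gives blocks of 125 bytes.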
811
812   --jobs and --onall
813       When running the same commands on many servers what should --jobs
814       signify? Is it the number of servers to run on in parallel?  Is it the
815       number of jobs run in parallel on each server?
816
817       GNU parallel lets --jobs represent the number of servers to run on in
818       parallel. This is to make it possible to run a sequence of commands
819       (that cannot be parallelized) on each server, but run the same sequence
820       on multiple servers.
821
822   --shuf
823       When using --shuf to shuffle the jobs, all jobs are read, then they are
824       shuffled, and finally executed. When using SQL this makes the
       --sqlmaster be the part that shuffles the jobs. The --sqlworkers simply
       execute according to Seq number.
827
828   --csv
829       --pipepart is incompatible with --csv because you can have records
830       like:
831
         a,b,c
         a,"
         a,b,c
         a,b,c
         a,b,c
         ",c
         a,b,c
839
       Here the second record contains a multi-line field that looks like
       records. Since --pipepart does not read the whole file when searching
       for record endings, it may start reading in this multi-line field,
       which would be wrong.
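
       The example input can be checked with any CSV parser. The sketch
       below uses Python's csv module purely for illustration: reading from
       the start gives 3 records, while looking only at newlines gives 7
       candidate records.

```python
import csv
import io

sample = 'a,b,c\na,"\na,b,c\na,b,c\na,b,c\n",c\na,b,c\n'

# Splitting on newlines alone sees 7 "records" ...
naive = sample.splitlines()

# ... while a CSV parser that reads from the start sees 3.
records = list(csv.reader(io.StringIO(sample, newline="")))
```

       A reader that starts at an arbitrary byte offset cannot know whether
       it is inside the quoted field, which is the problem --pipepart would
       have.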
844
845   Buffering on disk
846       GNU parallel buffers output, because if output is not buffered you have
847       to be ridiculously careful on sizes to avoid mixing of outputs (see
848       excellent example on https://catern.com/posts/pipes.html).
849
       GNU parallel buffers on disk in $TMPDIR using files that are removed
       as soon as they are created, but which are kept open. So even if GNU
852       parallel is killed by a power outage, there will be no files to clean
853       up afterwards. Another advantage is that the file system is aware that
854       these files will be lost in case of a crash, so it does not need to
855       sync them to disk.
856
857       It gives the odd situation that a disk can be fully used, but there are
858       no visible files on it.
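
       The technique of removing a file while keeping it open can be
       sketched as below. This is an illustration in Python for POSIX
       systems, not GNU parallel's code; tempfile.mkstemp picks the
       directory from $TMPDIR when it is set.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.unlink(path)                      # remove the name right away ...
visible = os.path.exists(path)       # ... so no file is visible any more

with os.fdopen(fd, "w+b") as buf:    # the open handle still works
    buf.write(b"output of a job\n")
    buf.seek(0)
    buffered = buf.read()
```

       When the handle is closed, or the process dies, the space is freed
       without any cleanup step.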
859
860       Partly buffering in memory
861
862       When using output formats SQL and CSV then GNU Parallel has to read the
863       whole output into memory. When run normally it will only read the
864       output from a single job. But when using --linebuffer every line
865       printed will also be buffered in memory - for all jobs currently
866       running.
867
868       If memory is tight, then do not use the output format SQL/CSV with
869       --linebuffer.
870
871       Comparing to buffering in memory
872
873       gargs is a parallelizing tool that buffers in memory. It is therefore a
874       useful way of comparing the advantages and disadvantages of buffering
875       in memory to buffering on disk.
876
       On a system with 6 GB RAM free and 6 GB free swap these were tested
       with different sizes:
879
         echo /dev/zero | gargs "head -c $size {}" >/dev/null
         echo /dev/zero | parallel "head -c $size {}" >/dev/null
882
883       The results are here:
884
         JobRuntime      Command
              0.344      parallel_test 1M
              0.362      parallel_test 10M
              0.640      parallel_test 100M
              9.818      parallel_test 1000M
             23.888      parallel_test 2000M
             30.217      parallel_test 2500M
             30.963      parallel_test 2750M
             34.648      parallel_test 3000M
             43.302      parallel_test 4000M
             55.167      parallel_test 5000M
             67.493      parallel_test 6000M
            178.654      parallel_test 7000M
            204.138      parallel_test 8000M
            230.052      parallel_test 9000M
            255.639      parallel_test 10000M
            757.981      parallel_test 30000M
              0.537      gargs_test 1M
              0.292      gargs_test 10M
              0.398      gargs_test 100M
              3.456      gargs_test 1000M
              8.577      gargs_test 2000M
             22.705      gargs_test 2500M
            123.076      gargs_test 2750M
             89.866      gargs_test 3000M
            291.798      gargs_test 4000M
911
912       GNU parallel is pretty much limited by the speed of the disk: Up to 6
913       GB data is written to disk but cached, so reading is fast. Above 6 GB
914       data are both written and read from disk. When the 30000MB job is
915       running, the disk system is slow, but usable: If you are not using the
916       disk, you almost do not feel it.
917
918       gargs has a speed advantage up until 2500M where it hits a wall. Then
919       the system starts swapping like crazy and is completely unusable. At
920       5000M it goes out of memory.
921
       You can make GNU parallel behave similarly to gargs if you point
       $TMPDIR to a tmpfs-filesystem: It will be faster for small outputs,
       but may kill your system for larger outputs and cause you to lose
       output.
925
926   Disk full
       GNU parallel buffers on disk. If the disk is full, data may be lost. To
       check if the disk is full GNU parallel writes an 8193 byte file every
       second. If this file is written successfully, it is removed
       immediately. If it is not written successfully, the disk is full. The
       size 8193 was chosen because 8192 gave the wrong result on some file
       systems, whereas 8193 did the correct thing on all tested filesystems.
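
       One round of such a check can be sketched as below. This is an
       illustration in Python; the function name disk_has_room is
       hypothetical, and the sketch leaves out running the check every
       second.

```python
import os
import tempfile

PROBE_SIZE = 8193  # size from the text above


def disk_has_room(tmpdir):
    """Write a small probe file; False means the write failed."""
    path = None
    try:
        fd, path = tempfile.mkstemp(dir=tmpdir)
        with os.fdopen(fd, "wb") as probe:
            probe.write(b"\0" * PROBE_SIZE)
        return True
    except OSError:
        return False
    finally:
        # Remove the probe again whether or not the write succeeded.
        if path is not None and os.path.exists(path):
            os.unlink(path)


with tempfile.TemporaryDirectory() as tmpdir:
    has_room = disk_has_room(tmpdir)
    leftovers = os.listdir(tmpdir)
```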
933
934   Memory usage
935       Normally GNU parallel will use around 17 MB RAM constantly - no matter
936       how many jobs or how much output there is. There are a few things that
937       cause the memory usage to rise:
938
939       •  Multiple input sources. GNU parallel reads an input source only
940          once. This is by design, as an input source can be a stream (e.g.
941          FIFO, pipe, standard input (stdin)) which cannot be rewound and read
942          again. When reading a single input source, the memory is freed as
943          soon as the job is done - thus keeping the memory usage constant.
944
945          But when reading multiple input sources GNU parallel keeps the
946          already read values for generating all combinations with other input
947          sources.
948
949       •  Computing the number of jobs. --bar, --eta, and --halt xx% use
950          total_jobs() to compute the total number of jobs. It does this by
951          generating the data structures for all jobs. All these job data
952          structures will be stored in memory and take up around 400
953          bytes/job.
954
955       •  Buffering a full line. --linebuffer will read a full line per
956          running job. A very long output line (say 1 GB without \n) will
957          increase RAM usage temporarily: From when the beginning of the line
958          is read till the line is printed.
959
960       •  Buffering the full output of a single job. This happens when using
961          --results *.csv/*.tsv or --sql*. Here GNU parallel will read the
962          whole output of a single job and save it as csv/tsv or SQL.
963
964   Argument separators ::: :::: :::+ ::::+
965       The argument separator ::: was chosen because I have never seen :::
966       used in any command. The natural choice -- would be a bad idea since it
967       is not unlikely that the template command will contain --. I have seen
       :: used in programming languages to separate classes, and I did not
969       want the user to be confused that the separator had anything to do with
970       classes.
971
972       ::: also makes a visual separation, which is good if there are multiple
973       :::.
974
975       When ::: was chosen, :::: came as a fairly natural extension.
976
       Linking input sources meant having to decide on some way to indicate
       linking of ::: and ::::. :::+ and ::::+ were chosen, so that they were
       similar to ::: and ::::.
980
981   Perl replacement strings, {= =}, and --rpl
982       The shorthands for replacement strings make a command look more
983       cryptic. Different users will need different replacement strings.
984       Instead of inventing more shorthands you get more flexible replacement
985       strings if they can be programmed by the user.
986
987       The language Perl was chosen because GNU parallel is written in Perl
988       and it was easy and reasonably fast to run the code given by the user.
989
990       If a user needs the same programmed replacement string again and again,
991       the user may want to make his own shorthand for it. This is what --rpl
992       is for. It works so well, that even GNU parallel's own shorthands are
993       implemented using --rpl.
994
995       In Perl code the bigrams {= and =} rarely exist. They look like a
996       matching pair and can be entered on all keyboards. This made them good
997       candidates for enclosing the Perl expression in the replacement
998       strings. Another candidate ,, and ,, was rejected because they do not
999       look like a matching pair. --parens was made, so that the users can
1000       still use ,, and ,, if they like: --parens ,,,,
1001
1002       Internally, however, the {= and =} are replaced by \257< and \257>.
1003       This is to make it simpler to make regular expressions. You only need
1004       to look one character ahead, and never have to look behind.
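
       The effect of the internal markers can be sketched as below. This is
       an illustration in Python, not GNU parallel's Perl code, and the
       function names are made up. \257 is the octal code of the character
       with code point 0xAF.

```python
import re

MARK = "\xaf"  # the character written as \257 (octal) in the text


def mark_perl_expressions(template):
    """Replace {= and =} by the marker followed by < and >."""
    return template.replace("{=", MARK + "<").replace("=}", MARK + ">")


def perl_expressions(marked):
    """Extract the expressions; the pattern needs no lookbehind."""
    return re.findall(MARK + "<([^" + MARK + "]*)" + MARK + ">", marked)


marked = mark_perl_expressions("echo {= s/\\.gz$// =} {= $_ = uc =}")
found = perl_expressions(marked)
```

       Because both delimiters start with the same rare character, a
       pattern can stop at that character and then check the single
       character after it.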
1005
1006   Test suite
1007       GNU parallel uses its own testing framework. This is mostly due to
1008       historical reasons. It deals reasonably well with tests that are
1009       dependent on how long a given test runs (e.g. more than 10 secs is a
1010       pass, but less is a fail). It parallelizes most tests, but it is easy
1011       to force a test to run as the single test (which may be important for
1012       timing issues). It deals reasonably well with tests that fail
1013       intermittently. It detects which tests failed and pushes these to the
1014       top, so when running the test suite again, the tests that failed most
1015       recently are run first.
1016
1017       If GNU parallel should adopt a real testing framework then those
1018       elements would be important.
1019
       Since many tests depend on the hardware they run on, these tests
       break when run on different hardware than the tests were written
       for.
1023
       For most bugs a test is added when the bug is fixed, so the bug will
       not reappear. It is, however, sometimes hard to create the
       environment in which the bug shows up - especially if the bug only
       shows up sometimes. One of the harder problems was to make a machine
       start swapping without forcing it to its knees.
1029
1030   Median run time
1031       Using a percentage for --timeout causes GNU parallel to compute the
1032       median run time of a job. The median is a better indicator of the
1033       expected run time than average, because there will often be outliers
1034       taking way longer than the normal run time.
1035
       To avoid keeping all run times in memory, an implementation of remedian
       was made (Rousseeuw et al.).
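
       The idea of the remedian can be sketched as below. This is a compact
       illustration in Python, not the implementation in GNU parallel; the
       class name, the buffer length of 11 and the weighted median used for
       partly filled buffers are choices made for the example.

```python
import statistics


class Remedian:
    """Approximate median using a few small buffers instead of all values."""

    def __init__(self, base=11):
        self.base = base        # buffer length; odd, so medians are real values
        self.buffers = [[]]     # buffers[k] holds medians of base**k values

    def add(self, value):
        level = 0
        while True:
            if level == len(self.buffers):
                self.buffers.append([])
            buf = self.buffers[level]
            buf.append(value)
            if len(buf) < self.base:
                return
            # Buffer full: pass its median one level up and empty it.
            value = statistics.median(buf)
            buf.clear()
            level += 1

    def estimate(self):
        # Weighted median: a value at level k stands for base**k run times.
        weighted = sorted(
            (v, self.base ** k) for k, buf in enumerate(self.buffers) for v in buf
        )
        if not weighted:
            raise ValueError("no run times recorded")
        half = sum(w for _, w in weighted) / 2
        seen = 0
        for value, weight in weighted:
            seen += weight
            if seen >= half:
                return value


runtimes = Remedian()
for seconds in range(1, 122):   # 121 run times: 1, 2, ..., 121
    runtimes.add(seconds)
median_estimate = runtimes.estimate()
stored = sum(len(buf) for buf in runtimes.buffers)
```

       After 121 run times only a single value is stored, and the estimate
       equals the true median for this input. In general the result is an
       approximation.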
1038
1039   Error messages and warnings
1040       Error messages like: ERROR, Not found, and 42 are not very helpful. GNU
1041       parallel strives to inform the user:
1042
1043       • What went wrong?
1044
1045       • Why did it go wrong?
1046
1047       • What can be done about it?
1048
1049       Unfortunately it is not always possible to predict the root cause of
1050       the error.
1051
1052   Determine number of CPUs
       CPUs is an ambiguous term. It can mean the number of sockets filled
1054       (i.e. the number of physical chips). It can mean the number of cores
1055       (i.e. the number of physical compute cores). It can mean the number of
1056       hyperthreaded cores (i.e. the number of virtual cores - with some of
1057       them possibly being hyperthreaded).
1058
1059       On ark.intel.com Intel uses the terms cores and threads for number of
1060       physical cores and the number of hyperthreaded cores respectively.
1061
       GNU parallel uses CPUs as the number of compute units and the
1063       terms sockets, cores, and threads to specify how the number of compute
1064       units is calculated.
1065
1066   Computation of load
       Contrary to what one might expect, --load does not use load average.
       This is due to load average rising too slowly. Instead it uses ps to
       list the number of threads in running or blocked state (state D, O or
       R). This gives an instant load.
1071
1072       As remote calculation of load can be slow, a process is spawned to run
1073       ps and put the result in a file, which is then used next time.
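
       The counting step can be sketched as below. This is an illustration
       in Python; the function name instant_load and the sample text are
       made up, and the ps options needed to print one state per thread
       differ between platforms, so they are left out.

```python
def instant_load(ps_states):
    """Count entries whose state starts with D, O or R."""
    return sum(1 for state in ps_states.split() if state[0] in "DOR")


# Example state column, one entry per thread, as ps might print it.
sample = "S\nR\nD\nSs\nR+\nO\nZ\n"
load = instant_load(sample)
```

       For the sample the result is 4: the entries R, D, R+ and O count,
       while the sleeping and zombie entries do not.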
1074
1075   Killing jobs
       GNU parallel kills jobs. It can be due to --memfree, --halt, or when
       GNU parallel meets a condition from which it cannot recover. Every job
       is started in its own process group. This way any (grand)*children will
       get killed, too. The process group is killed with the specification
       mentioned in --termseq.
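
       Killing a whole process group can be sketched as below. This is an
       illustration in Python for POSIX systems, not GNU parallel's code;
       the function name kill_job and the signal sequence (TERM, then KILL,
       each followed by up to 5 seconds of waiting) are assumptions for the
       example and are not the --termseq default.

```python
import os
import signal
import subprocess


def kill_job(proc, sequence=((signal.SIGTERM, 5.0), (signal.SIGKILL, 5.0))):
    """Signal the job's whole process group, escalating until it exits."""
    for sig, grace in sequence:
        try:
            os.killpg(proc.pid, sig)   # pgid equals pid for a group leader
        except ProcessLookupError:
            break                      # the group is already gone
        try:
            proc.wait(timeout=grace)
            break
        except subprocess.TimeoutExpired:
            continue                   # still alive: try the next signal
    return proc.poll()


# start_new_session=True puts the job in a new session and process group,
# so the background sleep belongs to the same group as the sh that runs it.
job = subprocess.Popen(["sh", "-c", "sleep 60 & wait"], start_new_session=True)
exit_status = kill_job(job)
```

       The negative exit status reported by subprocess is the number of the
       signal that ended the job.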
1081
1082   SQL interface
1083       GNU parallel uses the DBURL from GNU sql to give database software,
1084       username, password, host, port, database, and table in a single string.
1085
1086       The DBURL must point to a table name. The table will be dropped and
1087       created. The reason for not reusing an existing table is that the user
1088       may have added more input sources which would require more columns in
1089       the table. By prepending '+' to the DBURL the table will not be
1090       dropped.
1091
1092       The table columns are similar to joblog with the addition of V1 .. Vn
1093       which are values from the input sources, and Stdout and Stderr which
1094       are the output from standard output and standard error, respectively.
1095
1096       The Signal column has been renamed to _Signal due to Signal being a
1097       reserved word in MySQL.
1098
1099   Logo
1100       The logo is inspired by the Cafe Wall illusion. The font is DejaVu
1101       Sans.
1102
1103   Citation notice
1104       Funding a free software project is hard. GNU parallel is no exception.
1105       On top of that it seems the less visible a project is, the harder it is
1106       to get funding. And the nature of GNU parallel is that it will never be
1107       seen by "the guy with the checkbook", but only by the people doing the
1108       actual work.
1109
1110       This problem has been covered by others - though no solution has been
1111       found: https://www.slideshare.net/NadiaEghbal/consider-the-maintainer
1112       https://www.numfocus.org/blog/why-is-numpy-only-now-getting-funded/
1113
1114       Before implementing the citation notice it was discussed with the
1115       users:
1116       https://lists.gnu.org/archive/html/parallel/2013-11/msg00006.html
1117
1118       Having to spend 10 seconds on running parallel --citation once is no
1119       doubt not an ideal solution, but no one has so far come up with an
1120       ideal solution - neither for funding GNU parallel nor other free
1121       software.
1122
1123       If you believe you have the perfect solution, you should try it out,
1124       and if it works, you should post it on the email list. Ideas that will
1125       cost work and which have not been tested are, however, unlikely to be
1126       prioritized.
1127
1128       Running parallel --citation one single time takes less than 10 seconds,
1129       and will silence the citation notice for future runs. This is
1130       comparable to graphical tools where you have to click a checkbox saying
1131       "Do not show this again". But if that is too much trouble for you, why
1132       not use one of the alternatives instead?  See a list in: man
1133       parallel_alternatives.
1134
1135       As the request for citation is not a legal requirement this is
1136       acceptable under GPLv3 and cleared with Richard M. Stallman himself.
1137       Thus it does not fall under this:
1138       https://www.gnu.org/licenses/gpl-faq.en.html#RequireCitation
1139

Ideas for new design

1141   Multiple processes working together
1142       Open3 is slow. Printing is slow. It would be good if they did not tie
1143       up resources, but were run in separate threads.
1144
1145   --rrs on remote using a perl wrapper
       ... | perl -pe '$/=$recend$recstart;BEGIN{ if(substr($_) eq $recstart)
       substr($_)="" } eof and substr($_) eq $recend) substr($_)=""
1148
1149       It ought to be possible to write a filter that removed rec sep on the
1150       fly instead of inside GNU parallel. This could then use more cpus.
1151
1152       Will that require 2x record size memory?
1153
1154       Will that require 2x block size memory?
1155

Historical decisions

1157       These decisions were relevant for earlier versions of GNU parallel, but
1158       not the current version. They are kept here as historical record.
1159
1160   --tollef
1161       You can read about the history of GNU parallel on
1162       https://www.gnu.org/software/parallel/history.html
1163
1164       --tollef was included to make GNU parallel switch compatible with the
1165       parallel from moreutils (which is made by Tollef Fog Heen). This was
1166       done so that users of that parallel easily could port their use to GNU
1167       parallel: Simply set PARALLEL="--tollef" and that would be it.
1168
1169       But several distributions chose to make --tollef global (by putting it
1170       into /etc/parallel/config) without making the users aware of this, and
1171       that caused much confusion when people tried out the examples from GNU
1172       parallel's man page and these did not work.  The users became
1173       frustrated because the distribution did not make it clear to them that
       it had made --tollef global.
1175
1176       So to lessen the frustration and the resulting support, --tollef was
1177       obsoleted 20130222 and removed one year later.
1178
1179   Transferring of variables and functions
1180       Until 20150122 variables and functions were transferred by looking at
1181       $SHELL to see whether the shell was a *csh shell. If so the variables
1182       would be set using setenv. Otherwise they would be set using =. This
1183       caused the content of the variable to be repeated:
1184
       echo $SHELL | grep "/t\{0,1\}csh" > /dev/null && setenv VAR foo ||
       export VAR=foo
1187
1188
1189
20201122                          2020-12-20                PARALLEL_DESIGN(7)