PARALLEL_ALTERNATIVES(7)           parallel           PARALLEL_ALTERNATIVES(7)

NAME
   parallel_alternatives - Alternatives to GNU parallel

DESCRIPTION
   There are a lot of programs that share functionality with GNU
   parallel. Some of these are specialized tools, and while GNU
   parallel can emulate many of them, a specialized tool can be better
   at a given task. GNU parallel strives to include the best of the
   general functionality without sacrificing ease of use.

   parallel has existed since 2002-01-06 and as GNU parallel since
   2010. A lot of the alternatives have not had the vitality to survive
   that long, but have come and gone during that time.

   GNU parallel is actively maintained with a new release every month
   since 2010. Most other alternatives are fleeting interests of the
   developers with irregular releases and only maintained for a few
   years.

SUMMARY LEGEND
   The following features are in some of the comparable tools:

   Inputs

   I1. Arguments can be read from stdin
   I2. Arguments can be read from a file
   I3. Arguments can be read from multiple files
   I4. Arguments can be read from command line
   I5. Arguments can be read from a table
   I6. Arguments can be read from the same file using #! (shebang)
   I7. Line oriented input as default (quoting of special chars not
       needed)

   Manipulation of input

   M1. Composed command
   M2. Multiple arguments can fill up an execution line
   M3. Arguments can be put anywhere in the execution line
   M4. Multiple arguments can be put anywhere in the execution line
   M5. Arguments can be replaced with context
   M6. Input can be treated as the complete command line

   Outputs

   O1. Grouping output so output from different jobs do not mix
   O2. Send stderr (standard error) to stderr (standard error)
   O3. Send stdout (standard output) to stdout (standard output)
   O4. Order of output can be same as order of input
   O5. Stdout only contains stdout (standard output) from the command
   O6. Stderr only contains stderr (standard error) from the command
   O7. Buffering on disk
   O8. Cleanup of temporary files if killed
   O9. Test if disk runs full during run
   O10. Output of a line bigger than 4 GB

   Execution

   E1. Running jobs in parallel
   E2. List running jobs
   E3. Finish running jobs, but do not start new jobs
   E4. Number of running jobs can depend on number of cpus
   E5. Finish running jobs, but do not start new jobs after first
       failure
   E6. Number of running jobs can be adjusted while running
   E7. Only spawn new jobs if load is less than a limit

   Remote execution

   R1. Jobs can be run on remote computers
   R2. Basefiles can be transferred
   R3. Argument files can be transferred
   R4. Result files can be transferred
   R5. Cleanup of transferred files
   R6. No config files needed
   R7. Do not run more than SSHD's MaxStartups can handle
   R8. Configurable SSH command
   R9. Retry if connection breaks occasionally

   Semaphore

   S1. Possibility to work as a mutex
   S2. Possibility to work as a counting semaphore

   Legend

     -  = no
     x  = not applicable
     ID = yes

   As not every new version of the programs is tested, the table may be
   outdated. Please file a bug report if you find errors (see REPORTING
   BUGS).

   parallel:

     I1 I2 I3 I4 I5 I6 I7
     M1 M2 M3 M4 M5 M6
     O1 O2 O3 O4 O5 O6 O7 O8 O9 O10
     E1 E2 E3 E4 E5 E6 E7
     R1 R2 R3 R4 R5 R6 R7 R8 R9
     S1 S2

DIFFERENCES BETWEEN xargs AND GNU Parallel
   Summary (see legend above):

     I1 I2 - - - - -
     - M2 M3 - - -
     - O2 O3 - O5 O6
     E1 - - - - - -
     - - - - - x - - -
     - -

   xargs offers some of the same possibilities as GNU parallel.

   xargs deals badly with special characters (such as space, \, ' and
   "). To see the problem try this:

      touch important_file
      touch 'not important_file'
      ls not* | xargs rm
      mkdir -p "My brother's 12\" records"
      ls | xargs rmdir
      touch 'c:\windows\system32\clfs.sys'
      echo 'c:\windows\system32\clfs.sys' | xargs ls -l

   You can specify -0, but many input generators are not optimized for
   using NUL as separator, but are optimized for newline as separator:
   e.g. awk, ls, echo, tar -v, head (requires using -z), tail (requires
   using -z), sed (requires using -z), perl (-0 and \0 instead of \n),
   locate (requires using -0), find (requires using -print0), grep
   (requires using -z or -Z), and sort (requires using -z).
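   When every producer and consumer in the pipeline speaks NUL, any
   filename is safe. A minimal sketch of such a pipeline, using a
   throwaway directory (the directory and file names are illustrative):

```shell
# A filename with a space survives the pipeline when both ends
# agree on NUL as the separator:
dir=$(mktemp -d)
touch "$dir/a file" "$dir/b file"
find "$dir" -type f -print0 | xargs -0 -n1 echo found
rm -rf "$dir"
```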

   GNU parallel's newline separation can be emulated with:

      cat | xargs -d "\n" -n1 command

   xargs can run a given number of jobs in parallel, but has no support
   for running number-of-cpu-cores jobs in parallel.
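   The core count that GNU parallel uses by default can be supplied to
   xargs by hand with nproc from GNU coreutils. A sketch, assuming
   nproc is available (this is not a full equivalent of parallel's -j
   option):

```shell
# Ask coreutils' nproc for the core count and hand it to xargs;
# GNU parallel does the equivalent of this by default:
cores=$(nproc)
seq 8 | xargs -P "$cores" -n1 echo job
```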

   xargs has no support for grouping the output, therefore output may
   run together, e.g. the first half of a line is from one process and
   the last half of the line is from another process. The example
   Parallel grep cannot be done reliably with xargs because of this. To
   see this in action try:

      parallel perl -e '\$a=\"1\".\"{}\"x10000000\;print\ \$a,\"\\n\"' \
        '>' {} ::: a b c d e f g h
      # Serial = no mixing = the wanted result
      # 'tr -s a-z' squeezes repeating letters into a single letter
      echo a b c d e f g h | xargs -P1 -n1 grep 1 | tr -s a-z
      # Compare to 8 jobs in parallel
      parallel -kP8 -n1 grep 1 ::: a b c d e f g h | tr -s a-z
      echo a b c d e f g h | xargs -P8 -n1 grep 1 | tr -s a-z
      echo a b c d e f g h | xargs -P8 -n1 grep --line-buffered 1 | \
        tr -s a-z

   Or try this:

      slow_seq() {
        echo Count to "$@"
        seq "$@" |
          perl -ne '$|=1; for(split//){ print; select($a,$a,$a,0.100);}'
      }
      export -f slow_seq
      # Serial = no mixing = the wanted result
      seq 8 | xargs -n1 -P1 -I {} bash -c 'slow_seq {}'
      # Compare to 8 jobs in parallel
      seq 8 | parallel -P8 slow_seq {}
      seq 8 | xargs -n1 -P8 -I {} bash -c 'slow_seq {}'

   xargs has no support for keeping the order of the output, therefore
   if running jobs in parallel using xargs the output of the second job
   cannot be postponed till the first job is done.
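   The effect is easy to provoke by giving the earliest jobs the
   longest sleeps; the sketch below shows only the xargs side (GNU
   parallel's -k option is what restores input order):

```shell
# Job 1 sleeps longest and job 4 shortest, so with -P4 the output
# typically arrives in roughly reverse order of the input:
seq 4 | xargs -P4 -n1 -I{} sh -c 'sleep "0.$((5 - {}))"; echo {}'
```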

   xargs has no support for running jobs on remote computers.

   xargs has no support for context replace, so you will have to create
   the arguments.

   If you use a replace string in xargs (-I) you cannot force xargs to
   use more than one argument.
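   This is easy to demonstrate with GNU xargs, where -I silently
   overrides an earlier -n (a sketch; the exact warning behaviour may
   differ between xargs versions):

```shell
# -I takes effect and -n2 is ignored (GNU xargs warns about the
# conflict), so each input line becomes its own command:
printf 'a\nb\nc\n' | xargs -n2 -I{} echo "[{}]" 2>/dev/null
```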

   Quoting in xargs works like -q in GNU parallel. This means composed
   commands and redirection require using bash -c.

      ls | parallel "wc {} >{}.wc"
      ls | parallel "echo {}; ls {}|wc"

   becomes (assuming you have 8 cores and that none of the filenames
   contain space, " or '):

      ls | xargs -d "\n" -P8 -I {} bash -c "wc {} >{}.wc"
      ls | xargs -d "\n" -P8 -I {} bash -c "echo {}; ls {}|wc"

   A more extreme example can be found at:
   https://unix.stackexchange.com/q/405552/

   https://www.gnu.org/software/findutils/

DIFFERENCES BETWEEN find -exec AND GNU Parallel
   Summary (see legend above):

     - - - x - x -
     - M2 M3 - - - -
     - O2 O3 O4 O5 O6
     - - - - - - -
     - - - - - - - - -
     x x

   find -exec offers some of the same possibilities as GNU parallel.

   find -exec only works on files. Processing other input (such as
   hosts or URLs) will require creating these inputs as files. find
   -exec has no support for running commands in parallel.

   https://www.gnu.org/software/findutils/ (Last checked: 2019-01)

DIFFERENCES BETWEEN make -j AND GNU Parallel
   Summary (see legend above):

     - - - - - - -
     - - - - - -
     O1 O2 O3 - x O6
     E1 - - - E5 -
     - - - - - - - - -
     - -

   make -j can run jobs in parallel, but requires a crafted Makefile to
   do this. That results in extra quoting to get filenames containing
   newlines to work correctly.

   make -j computes a dependency graph before running jobs. Jobs run by
   GNU parallel do not depend on each other.

   (Very early versions of GNU parallel were coincidentally implemented
   using make -j.)

   https://www.gnu.org/software/make/ (Last checked: 2019-01)

DIFFERENCES BETWEEN ppss AND GNU Parallel
   Summary (see legend above):

     I1 I2 - - - - I7
     M1 - M3 - - M6
     O1 - - x - -
     E1 E2 ?E3 E4 - - -
     R1 R2 R3 R4 - - ?R7 ? ?
     - -

   ppss is also a tool for running jobs in parallel.

   The output of ppss is status information and thus not useful as
   input for another command. The output from the jobs is put into
   files.

   The argument replace string ($ITEM) cannot be changed. Arguments
   must be quoted - thus arguments containing special characters (space
   '"&!*) may cause problems. More than one argument is not supported.
   Filenames containing newlines are not processed correctly. When
   reading input from a file null cannot be used as a terminator. ppss
   needs to read the whole input file before starting any jobs.

   Output and status information is stored in ppss_dir and thus
   requires cleanup when completed. If the dir is not removed before
   running ppss again it may cause nothing to happen as ppss thinks the
   task is already done. GNU parallel will normally not need cleaning
   up if running locally and will only need cleaning up if stopped
   abnormally and running remotely (--cleanup may not complete if
   stopped abnormally). The example Parallel grep would require extra
   postprocessing if written using ppss.

   For remote systems PPSS requires 3 steps: config, deploy, and start.
   GNU parallel only requires one step.

  EXAMPLES FROM ppss MANUAL

   Here are the examples from ppss's manual page with the equivalent
   using GNU parallel:

     1$ ./ppss.sh standalone -d /path/to/files -c 'gzip '

     1$ find /path/to/files -type f | parallel gzip

     2$ ./ppss.sh standalone -d /path/to/files -c 'cp "$ITEM" /destination/dir '

     2$ find /path/to/files -type f | parallel cp {} /destination/dir

     3$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q '

     3$ parallel -a list-of-urls.txt wget -q

     4$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q "$ITEM"'

     4$ parallel -a list-of-urls.txt wget -q {}

     5$ ./ppss config -C config.cfg -c 'encode.sh ' -d /source/dir \
          -m 192.168.1.100 -u ppss -k ppss-key.key -S ./encode.sh \
          -n nodes.txt -o /some/output/dir --upload --download;
        ./ppss deploy -C config.cfg
        ./ppss start -C config

     5$ # parallel does not use configs. If you want a different
        # username put it in nodes.txt: user@hostname
        find source/dir -type f |
          parallel --sshloginfile nodes.txt --trc {.}.mp3 lame -a {} \
            -o {.}.mp3 --preset standard --quiet

     6$ ./ppss stop -C config.cfg

     6$ killall -TERM parallel

     7$ ./ppss pause -C config.cfg

     7$ Press: CTRL-Z or killall -SIGTSTP parallel

     8$ ./ppss continue -C config.cfg

     8$ Enter: fg or killall -SIGCONT parallel

     9$ ./ppss.sh status -C config.cfg

     9$ killall -SIGUSR2 parallel

   https://github.com/louwrentius/PPSS

DIFFERENCES BETWEEN pexec AND GNU Parallel
   Summary (see legend above):

     I1 I2 - I4 I5 - -
     M1 - M3 - - M6
     O1 O2 O3 - O5 O6
     E1 - - E4 - E6 -
     R1 - - - - R6 - - -
     S1 -

   pexec is also a tool for running jobs in parallel.

  EXAMPLES FROM pexec MANUAL

   Here are the examples from pexec's info page with the equivalent
   using GNU parallel:

     1$ pexec -o sqrt-%s.dat -p "$(seq 10)" -e NUM -n 4 -c -- \
          'echo "scale=10000;sqrt($NUM)" | bc'

     1$ seq 10 | parallel -j4 'echo "scale=10000;sqrt({})" | \
          bc > sqrt-{}.dat'

     2$ pexec -p "$(ls myfiles*.ext)" -i %s -o %s.sort -- sort

     2$ ls myfiles*.ext | parallel sort {} ">{}.sort"

     3$ pexec -f image.list -n auto -e B -u star.log -c -- \
          'fistar $B.fits -f 100 -F id,x,y,flux -o $B.star'

     3$ parallel -a image.list \
          'fistar {}.fits -f 100 -F id,x,y,flux -o {}.star' 2>star.log

     4$ pexec -r *.png -e IMG -c -o - -- \
          'convert $IMG ${IMG%.png}.jpeg ; "echo $IMG: done"'

     4$ ls *.png | parallel 'convert {} {.}.jpeg; echo {}: done'

     5$ pexec -r *.png -i %s -o %s.jpg -c 'pngtopnm | pnmtojpeg'

     5$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {}.jpg'

     6$ for p in *.png ; do echo ${p%.png} ; done | \
          pexec -f - -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'

     6$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'

     7$ LIST=$(for p in *.png ; do echo ${p%.png} ; done)
        pexec -r $LIST -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'

     7$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'

     8$ pexec -n 8 -r *.jpg -y unix -e IMG -c \
          'pexec -j -m blockread -d $IMG | \
           jpegtopnm | pnmscale 0.5 | pnmtojpeg | \
           pexec -j -m blockwrite -s th_$IMG'

     8$ # Combining GNU parallel and GNU sem.
        ls *jpg | parallel -j8 'sem --id blockread cat {} | jpegtopnm |' \
          'pnmscale 0.5 | pnmtojpeg | sem --id blockwrite cat > th_{}'

        # If reading and writing is done to the same disk, this may be
        # faster as only one process will be either reading or writing:
        ls *jpg | parallel -j8 'sem --id diskio cat {} | jpegtopnm |' \
          'pnmscale 0.5 | pnmtojpeg | sem --id diskio cat > th_{}'

   https://www.gnu.org/software/pexec/

DIFFERENCES BETWEEN xjobs AND GNU Parallel
   xjobs is also a tool for running jobs in parallel. It only supports
   running jobs on your local computer.

   xjobs deals badly with special characters just like xargs. See the
   section DIFFERENCES BETWEEN xargs AND GNU Parallel.

  EXAMPLES FROM xjobs MANUAL

   Here are the examples from xjobs's man page with the equivalent
   using GNU parallel:

     1$ ls -1 *.zip | xjobs unzip

     1$ ls *.zip | parallel unzip

     2$ ls -1 *.zip | xjobs -n unzip

     2$ ls *.zip | parallel unzip >/dev/null

     3$ find . -name '*.bak' | xjobs gzip

     3$ find . -name '*.bak' | parallel gzip

     4$ ls -1 *.jar | sed 's/\(.*\)/\1 > \1.idx/' | xjobs jar tf

     4$ ls *.jar | parallel jar tf {} '>' {}.idx

     5$ xjobs -s script

     5$ cat script | parallel

     6$ mkfifo /var/run/my_named_pipe;
        xjobs -s /var/run/my_named_pipe &
        echo unzip 1.zip >> /var/run/my_named_pipe;
        echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe

     6$ mkfifo /var/run/my_named_pipe;
        cat /var/run/my_named_pipe | parallel &
        echo unzip 1.zip >> /var/run/my_named_pipe;
        echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe

   https://www.maier-komor.de/xjobs.html (Last checked: 2019-01)

DIFFERENCES BETWEEN prll AND GNU Parallel
   prll is also a tool for running jobs in parallel. It does not
   support running jobs on remote computers.

   prll encourages using BASH aliases and BASH functions instead of
   scripts. GNU parallel supports scripts directly, functions if they
   are exported using export -f, and aliases if using env_parallel.
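   A minimal sketch of the export -f mechanism mentioned above (bash
   only; the parallel invocation is shown as a comment because the
   exported function is picked up the same way):

```shell
# Everything runs under bash, since export -f is a bash feature.
bash -c '
  myfunc() { echo "got $1"; }
  export -f myfunc
  # With GNU parallel this would be:  seq 3 | parallel myfunc
  # Any bash child process sees the exported function:
  seq 3 | xargs -n1 -I{} bash -c "myfunc {}"
'
```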

   prll generates a lot of status information on stderr (standard
   error) which makes it harder to use the stderr (standard error)
   output of the job directly as input for another program.

  EXAMPLES FROM prll's MANUAL

   Here is the example from prll's man page with the equivalent using
   GNU parallel:

     1$ prll -s 'mogrify -flip $1' *.jpg

     1$ parallel mogrify -flip ::: *.jpg

   https://github.com/exzombie/prll (Last checked: 2019-01)

DIFFERENCES BETWEEN dxargs AND GNU Parallel
   dxargs is also a tool for running jobs in parallel.

   dxargs does not deal well with more simultaneous jobs than SSHD's
   MaxStartups. dxargs is built only for running jobs remotely, but it
   does not support transferring files.

   https://web.archive.org/web/20120518070250/http://www.semicomplete.com/blog/geekery/distributed-xargs.html
   (Last checked: 2019-01)

DIFFERENCES BETWEEN mdm/middleman AND GNU Parallel
   middleman (mdm) is also a tool for running jobs in parallel.

  EXAMPLES FROM middleman's WEBSITE

   Here are the shell scripts of
   https://web.archive.org/web/20110728064735/http://mdm.berlios.de/usage.html
   ported to GNU parallel:

     1$ seq 19 | parallel buffon -o - | sort -n > result
        cat files | parallel cmd
        find dir -execdir sem cmd {} \;

   https://github.com/cklin/mdm (Last checked: 2019-01)

DIFFERENCES BETWEEN xapply AND GNU Parallel
   xapply can run jobs in parallel on the local computer.

  EXAMPLES FROM xapply's MANUAL

   Here are the examples from xapply's man page with the equivalent
   using GNU parallel:

     1$ xapply '(cd %1 && make all)' */

     1$ parallel 'cd {} && make all' ::: */

     2$ xapply -f 'diff %1 ../version5/%1' manifest | more

     2$ parallel diff {} ../version5/{} < manifest | more

     3$ xapply -p/dev/null -f 'diff %1 %2' manifest1 checklist1

     3$ parallel --link diff {1} {2} :::: manifest1 checklist1

     4$ xapply 'indent' *.c

     4$ parallel indent ::: *.c

     5$ find ~ksb/bin -type f ! -perm -111 -print | \
          xapply -f -v 'chmod a+x' -

     5$ find ~ksb/bin -type f ! -perm -111 -print | \
          parallel -v chmod a+x

     6$ find */ -... | fmt 960 1024 | xapply -f -i /dev/tty 'vi' -

     6$ sh <(find */ -... | parallel -s 1024 echo vi)

     6$ find */ -... | parallel -s 1024 -Xuj1 vi

     7$ find ... | xapply -f -5 -i /dev/tty 'vi' - - - - -

     7$ sh <(find ... | parallel -n5 echo vi)

     7$ find ... | parallel -n5 -uj1 vi

     8$ xapply -fn "" /etc/passwd

     8$ parallel -k echo < /etc/passwd

     9$ tr ':' '\012' < /etc/passwd | \
          xapply -7 -nf 'chown %1 %6' - - - - - - -

     9$ tr ':' '\012' < /etc/passwd | parallel -N7 chown {1} {6}

     10$ xapply '[ -d %1/RCS ] || echo %1' */

     10$ parallel '[ -d {}/RCS ] || echo {}' ::: */

     11$ xapply -f '[ -f %1 ] && echo %1' List | ...

     11$ parallel '[ -f {} ] && echo {}' < List | ...

   https://www.databits.net/~ksb/msrc/local/bin/xapply/xapply.html

DIFFERENCES BETWEEN AIX apply AND GNU Parallel
   apply can build command lines based on a template and arguments -
   very much like GNU parallel. apply does not run jobs in parallel.
   apply does not use an argument separator (like :::); instead the
   template must be the first argument.

  EXAMPLES FROM IBM's KNOWLEDGE CENTER

   Here are the examples from IBM's Knowledge Center and the
   corresponding command using GNU parallel:

   To obtain results similar to those of the ls command, enter:

     1$ apply echo *
     1$ parallel echo ::: *

   To compare the file named a1 to the file named b1, and the file
   named a2 to the file named b2, enter:

     2$ apply -2 cmp a1 b1 a2 b2
     2$ parallel -N2 cmp ::: a1 b1 a2 b2

   To run the who command five times, enter:

     3$ apply -0 who 1 2 3 4 5
     3$ parallel -N0 who ::: 1 2 3 4 5

   To link all files in the current directory to the directory
   /usr/joe, enter:

     4$ apply 'ln %1 /usr/joe' *
     4$ parallel ln {} /usr/joe ::: *

   https://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.cmds1/apply.htm
   (Last checked: 2019-01)

DIFFERENCES BETWEEN paexec AND GNU Parallel
   paexec can run jobs in parallel on both the local and remote
   computers.

   paexec requires commands to print a blank line as the last output.
   This means you will have to write a wrapper for most programs.

   paexec has a job dependency facility so a job can depend on another
   job to be executed successfully. Sort of a poor-man's make.

  EXAMPLES FROM paexec's EXAMPLE CATALOG

   Here are the examples from paexec's example catalog with the
   equivalent using GNU parallel:

   1_div_X_run

     1$ ../../paexec -s -l -c "`pwd`/1_div_X_cmd" -n +1 <<EOF [...]

     1$ parallel echo {} '|' `pwd`/1_div_X_cmd <<EOF [...]

   all_substr_run

     2$ ../../paexec -lp -c "`pwd`/all_substr_cmd" -n +3 <<EOF [...]

     2$ parallel echo {} '|' `pwd`/all_substr_cmd <<EOF [...]

   cc_wrapper_run

     3$ ../../paexec -c "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
          -n 'host1 host2' \
          -t '/usr/bin/ssh -x' <<EOF [...]

     3$ parallel echo {} '|' "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
          -S host1,host2 <<EOF [...]

        # This is not exactly the same, but avoids the wrapper
        parallel gcc -O2 -c -o {.}.o {} \
          -S host1,host2 <<EOF [...]

   toupper_run

     4$ ../../paexec -lp -c "`pwd`/toupper_cmd" -n +10 <<EOF [...]

     4$ parallel echo {} '|' ./toupper_cmd <<EOF [...]

        # Without the wrapper:
        parallel echo {} '| awk {print\ toupper\(\$0\)}' <<EOF [...]

   https://github.com/cheusov/paexec

DIFFERENCES BETWEEN map(sitaramc) AND GNU Parallel
   Summary (see legend above):

     I1 - - I4 - - (I7)
     M1 (M2) M3 (M4) M5 M6
     - O2 O3 - O5 - - N/A N/A O10
     E1 - - - - - -
     - - - - - - - - -
     - -

   (I7): Only under special circumstances. See below.

   (M2+M4): Only if there is a single replacement string.

   map rejects input with special characters:

     echo "The Cure" > My\ brother\'s\ 12\"\ records

     ls | map 'echo %; wc %'

   It works with GNU parallel:

     ls | parallel 'echo {}; wc {}'

   Under some circumstances it also works with map:

     ls | map 'echo % works %'

   But tiny changes make it reject the input with special characters:

     ls | map 'echo % does not work "%"'

   This means that many UTF-8 characters will be rejected. This is by
   design. From the web page: "As such, programs that quietly handle
   them, with no warnings at all, are doing their users a disservice."

   map delays each job by 0.01 s. This can be emulated by using
   parallel --delay 0.01.

   map prints '+' on stderr when a job starts, and '-' when a job
   finishes. This cannot be disabled. parallel has --bar if you need to
   see progress.

   map's replacement strings (% %D %B %E) can be simulated in GNU
   parallel by putting this in ~/.parallel/config:

     --rpl '%'
     --rpl '%D $_=Q(::dirname($_));'
     --rpl '%B s:.*/::;s:\.[^/.]+$::;'
     --rpl '%E s:.*\.::'

   map does not have an argument separator on the command line, but
   uses the first argument as command. This makes quoting harder which
   again may affect readability. Compare:

     map -p 2 'perl -ne '"'"'/^\S+\s+\S+$/ and print $ARGV,"\n"'"'" *

     parallel -q perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' ::: *

   map can do multiple arguments with context replace, but not without
   context replace:

     parallel --xargs echo 'BEGIN{'{}'}END' ::: 1 2 3

     map "echo 'BEGIN{'%'}END'" 1 2 3

   map has no support for grouping. So this gives the wrong results:

     parallel perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} \
       ::: a b c d e f
     ls -l a b c d e f
     parallel -kP4 -n1 grep 1 ::: a b c d e f > out.par
     map -n1 -p 4 'grep 1' a b c d e f > out.map-unbuf
     map -n1 -p 4 'grep --line-buffered 1' a b c d e f > out.map-linebuf
     map -n1 -p 1 'grep --line-buffered 1' a b c d e f > out.map-serial
     ls -l out*
     md5sum out*

  EXAMPLES FROM map's WEBSITE

   Here are the examples from map's web page with the equivalent using
   GNU parallel:

     1$ ls *.gif | map convert % %B.png   # default max-args: 1

     1$ ls *.gif | parallel convert {} {.}.png

     2$ map "mkdir %B; tar -C %B -xf %" *.tgz   # default max-args: 1

     2$ parallel 'mkdir {.}; tar -C {.} -xf {}' ::: *.tgz

     3$ ls *.gif | map cp % /tmp   # default max-args: 100

     3$ ls *.gif | parallel -X cp {} /tmp

     4$ ls *.tar | map -n 1 tar -xf %

     4$ ls *.tar | parallel tar -xf

     5$ map "cp % /tmp" *.tgz

     5$ parallel cp {} /tmp ::: *.tgz

     6$ map "du -sm /home/%/mail" alice bob carol

     6$ parallel "du -sm /home/{}/mail" ::: alice bob carol
        or if you prefer running a single job with multiple args:
     6$ parallel -Xj1 "du -sm /home/{}/mail" ::: alice bob carol

     7$ cat /etc/passwd | map -d: 'echo user %1 has shell %7'

     7$ cat /etc/passwd | parallel --colsep : 'echo user {1} has shell {7}'

     8$ export MAP_MAX_PROCS=$(( `nproc` / 2 ))

     8$ export PARALLEL=-j50%

   https://github.com/sitaramc/map (Last checked: 2020-05)

DIFFERENCES BETWEEN ladon AND GNU Parallel
   ladon can run multiple jobs on files in parallel.

   ladon only works on files, and the only way to specify files is
   using a quoted glob string (such as \*.jpg). It is not possible to
   list the files manually.

   As replacement strings it uses FULLPATH, DIRNAME, BASENAME, EXT,
   RELDIR, and RELPATH.

   These can be simulated using GNU parallel by putting this in
   ~/.parallel/config:

     --rpl 'FULLPATH $_=Q($_);chomp($_=qx{readlink -f $_});'
     --rpl 'DIRNAME $_=Q(::dirname($_));chomp($_=qx{readlink -f $_});'
     --rpl 'BASENAME s:.*/::;s:\.[^/.]+$::;'
     --rpl 'EXT s:.*\.::'
     --rpl 'RELDIR $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
            s:\Q$c/\E::;$_=::dirname($_);'
     --rpl 'RELPATH $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
            s:\Q$c/\E::;'

   ladon deals badly with filenames containing " and newline, and it
   fails for output larger than 200k:

     ladon '*' -- seq 36000 | wc

  EXAMPLES FROM ladon MANUAL

   It is assumed that the '--rpl's above are put in ~/.parallel/config
   and that it is run under a shell that supports '**' globbing (such
   as zsh):

     1$ ladon "**/*.txt" -- echo RELPATH

     1$ parallel echo RELPATH ::: **/*.txt

     2$ ladon "~/Documents/**/*.pdf" -- shasum FULLPATH >hashes.txt

     2$ parallel shasum FULLPATH ::: ~/Documents/**/*.pdf >hashes.txt

     3$ ladon -m thumbs/RELDIR "**/*.jpg" -- convert FULLPATH \
          -thumbnail 100x100^ -gravity center -extent 100x100 \
          thumbs/RELPATH

     3$ parallel mkdir -p thumbs/RELDIR\; convert FULLPATH \
          -thumbnail 100x100^ -gravity center -extent 100x100 \
          thumbs/RELPATH ::: **/*.jpg

     4$ ladon "~/Music/*.wav" -- lame -V 2 FULLPATH DIRNAME/BASENAME.mp3

     4$ parallel lame -V 2 FULLPATH DIRNAME/BASENAME.mp3 ::: ~/Music/*.wav

   https://github.com/danielgtaylor/ladon (Last checked: 2019-01)

DIFFERENCES BETWEEN jobflow AND GNU Parallel
   jobflow can run multiple jobs in parallel.

   Just like with xargs, the output from jobflow jobs running in
   parallel mixes together by default. jobflow can buffer into files
   (placed in /run/shm), but these are not cleaned up if jobflow dies
   unexpectedly (e.g. by Ctrl-C). If the total output is big (in the
   order of RAM+swap) it can cause the system to slow to a crawl and
   eventually run out of memory.

   jobflow gives no error if the command is unknown, and as with xargs,
   redirection and composed commands require wrapping with bash -c.

   Input lines can at most be 4096 bytes. You can at most have 16 {}'s
   in the command template. More than that either crashes the program
   or simply does not execute the command.

   jobflow has no equivalent for --pipe, or --sshlogin.

   jobflow makes it possible to set resource limits on the running
   jobs. This can be emulated by GNU parallel using bash's ulimit:

     jobflow -limits=mem=100M,cpu=3,fsize=20M,nofiles=300 myjob

     parallel 'ulimit -v 102400 -t 3 -f 204800 -n 300; myjob'
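   Since the ulimit call runs inside the subshell that runs the job,
   each job gets its own limits and the invoking shell is untouched. A
   quick illustration (the -v value is the 100 MB from the example
   above, in KB):

```shell
# The limit set inside one shell is invisible outside it:
bash -c 'ulimit -v 102400; ulimit -v'   # the job's shell reports 102400
bash -c 'ulimit -v'                     # a fresh shell keeps its own limit
```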

  EXAMPLES FROM jobflow README

     1$ cat things.list | jobflow -threads=8 -exec ./mytask {}

     1$ cat things.list | parallel -j8 ./mytask {}

     2$ seq 100 | jobflow -threads=100 -exec echo {}

     2$ seq 100 | parallel -j100 echo {}

     3$ cat urls.txt | jobflow -threads=32 -exec wget {}

     3$ cat urls.txt | parallel -j32 wget {}

     4$ find . -name '*.bmp' | \
          jobflow -threads=8 -exec bmp2jpeg {.}.bmp {.}.jpg

     4$ find . -name '*.bmp' | \
          parallel -j8 bmp2jpeg {.}.bmp {.}.jpg

   https://github.com/rofl0r/jobflow

DIFFERENCES BETWEEN gargs AND GNU Parallel
   gargs can run multiple jobs in parallel.

   Older versions cache output in memory. This causes it to be
   extremely slow when the output is larger than the physical RAM, and
   can cause the system to run out of memory.

   See more details on this in man parallel_design.

   Newer versions cache output in files, but leave the files in
   $TMPDIR if gargs is killed.

   Output to stderr (standard error) is changed if the command fails.

  EXAMPLES FROM gargs WEBSITE

     1$ seq 12 -1 1 | gargs -p 4 -n 3 "sleep {0}; echo {1} {2}"

     1$ seq 12 -1 1 | parallel -P 4 -n 3 "sleep {1}; echo {2} {3}"

     2$ cat t.txt | gargs --sep "\s+" \
          -p 2 "echo '{0}:{1}-{2}' full-line: \'{}\'"

     2$ cat t.txt | parallel --colsep "\\s+" \
          -P 2 "echo '{1}:{2}-{3}' full-line: \'{}\'"

   https://github.com/brentp/gargs

DIFFERENCES BETWEEN orgalorg AND GNU Parallel
   orgalorg can run the same job on multiple machines. This is related
   to --onall and --nonall.

   orgalorg supports entering the SSH password - provided it is the
   same for all servers. GNU parallel advocates using ssh-agent
   instead, but it is possible to emulate orgalorg's behavior by
   setting SSHPASS and by using --ssh "sshpass ssh".

   To make the emulation easier, make a simple alias:

     alias par_emul="parallel -j0 --ssh 'sshpass ssh' --nonall --tag --lb"

   If you want to supply a password run:

     SSHPASS=`ssh-askpass`

   or set the password directly:

     SSHPASS=P4$$w0rd!

   If the above is set up you can then do:

     orgalorg -o frontend1 -o frontend2 -p -C uptime
     par_emul -S frontend1 -S frontend2 uptime

     orgalorg -o frontend1 -o frontend2 -p -C top -bid 1
     par_emul -S frontend1 -S frontend2 top -bid 1

     orgalorg -o frontend1 -o frontend2 -p -er /tmp -n \
       'md5sum /tmp/bigfile' -S bigfile
     par_emul -S frontend1 -S frontend2 --basefile bigfile \
       --workdir /tmp md5sum /tmp/bigfile

   orgalorg has a progress indicator for the transferring of a file.
   GNU parallel does not.

   https://github.com/reconquest/orgalorg
921
922 DIFFERENCES BETWEEN Rust parallel AND GNU Parallel
923 Rust parallel focuses on speed. It is almost as fast as xargs, but not
924 as fast as parallel-bash. It implements a few features from GNU
925 parallel, but lacks many functions. All these fail:
926
927 # Read arguments from file
928 parallel -a file echo
929 # Changing the delimiter
930 parallel -d _ echo ::: a_b_c_
931
932 These do something different from GNU parallel
933
934 # -q to protect quoted $ and space
935 parallel -q perl -e '$a=shift; print "$a"x10000000' ::: a b c
936 # Generation of combination of inputs
937 parallel echo {1} {2} ::: red green blue ::: S M L XL XXL
938 # {= perl expression =} replacement string
939 parallel echo '{= s/new/old/ =}' ::: my.new your.new
940 # --pipe
941 seq 100000 | parallel --pipe wc
942 # linked arguments
943 parallel echo ::: S M L :::+ sml med lrg ::: R G B :::+ red grn blu
944 # Run different shell dialects
945 zsh -c 'parallel echo \={} ::: zsh && true'
946 csh -c 'parallel echo \$\{\} ::: shell && true'
947 bash -c 'parallel echo \$\({}\) ::: pwd && true'
948 # Rust parallel does not start before the last argument is read
949 (seq 10; sleep 5; echo 2) | time parallel -j2 'sleep 2; echo'
950 tail -f /var/log/syslog | parallel echo
951
952 Most of the examples from the book GNU Parallel 2018 do not work, thus
953 Rust parallel is not close to being a compatible replacement.
954
955 Rust parallel has no remote facilities.
956
957 It uses /tmp/parallel for tmp files and does not clean up if terminated
958 abruptly. If another user on the system uses Rust parallel, then
959 /tmp/parallel will have the wrong permissions and Rust parallel will
960 fail. A malicious user can set up the right permissions, symlink the
961 output file to one of the user's files, and the next time the user runs
962 Rust parallel it will overwrite this file.
963
964 attacker$ mkdir /tmp/parallel
965 attacker$ chmod a+rwX /tmp/parallel
966 # Symlink to the file the attacker wants to zero out
967 attacker$ ln -s ~victim/.important-file /tmp/parallel/stderr_1
968 victim$ seq 1000 | parallel echo
969 # This file is now overwritten with stderr from 'echo'
970 victim$ cat ~victim/.important-file
971
972 If /tmp/parallel runs full during the run, Rust parallel does not
973 report this, but finishes with success - thereby risking data loss.
974
975 https://github.com/mmstick/parallel
976
977 DIFFERENCES BETWEEN Rush AND GNU Parallel
978 rush (https://github.com/shenwei356/rush) is written in Go and based on
979 gargs.
980
981 Just like GNU parallel, rush buffers output in temporary files. But
982 unlike GNU parallel, rush does not clean up if the process dies abnormally.
983
984 rush has some string manipulations that can be emulated by putting this
985 into ~/.parallel/config (/ is used instead of %, and % is used instead
986 of ^ as that is closer to bash's ${var%postfix}):
987
988 --rpl '{:} s:(\.[^/]+)*$::'
989 --rpl '{:%([^}]+?)} s:$$1(\.[^/]+)*$::'
990 --rpl '{/:%([^}]*?)} s:.*/(.*)$$1(\.[^/]+)*$:$1:'
991 --rpl '{/:} s:(.*/)?([^/.]+)(\.[^/]+)*$:$2:'
992 --rpl '{@(.*?)} /$$1/ and $_=$1;'
993
994 EXAMPLES FROM rush's WEBSITE
995
996 Here are the examples from rush's website with the equivalent command
997 in GNU parallel.
998
999 1. Simple run, quoting is not necessary
1000
1001 1$ seq 1 3 | rush echo {}
1002
1003 1$ seq 1 3 | parallel echo {}
1004
1005 2. Read data from file (`-i`)
1006
1007 2$ rush echo {} -i data1.txt -i data2.txt
1008
1009 2$ cat data1.txt data2.txt | parallel echo {}
1010
1011 3. Keep output order (`-k`)
1012
1013 3$ seq 1 3 | rush 'echo {}' -k
1014
1015 3$ seq 1 3 | parallel -k echo {}
1016
1017 4. Timeout (`-t`)
1018
1019 4$ time seq 1 | rush 'sleep 2; echo {}' -t 1
1020
1021 4$ time seq 1 | parallel --timeout 1 'sleep 2; echo {}'
1022
1023 5. Retry (`-r`)
1024
1025 5$ seq 1 | rush 'python unexisted_script.py' -r 1
1026
1027 5$ seq 1 | parallel --retries 2 'python unexisted_script.py'
1028
1029 Use -u to see it is really run twice:
1030
1031 5$ seq 1 | parallel -u --retries 2 'python unexisted_script.py'
1032
1033 6. Dirname (`{/}`) and basename (`{%}`) and remove custom suffix
1034 (`{^suffix}`)
1035
1036 6$ echo dir/file_1.txt.gz | rush 'echo {/} {%} {^_1.txt.gz}'
1037
1038 6$ echo dir/file_1.txt.gz |
1039 parallel --plus echo {//} {/} {%_1.txt.gz}
1040
1041 7. Get basename, and remove last (`{.}`) or any (`{:}`) extension
1042
1043 7$ echo dir.d/file.txt.gz | rush 'echo {.} {:} {%.} {%:}'
1044
1045 7$ echo dir.d/file.txt.gz | parallel 'echo {.} {:} {/.} {/:}'
1046
1047 8. Job ID, combine fields index and other replacement strings
1048
1049 8$ echo 12 file.txt dir/s_1.fq.gz |
1050 rush 'echo job {#}: {2} {2.} {3%:^_1}'
1051
1052 8$ echo 12 file.txt dir/s_1.fq.gz |
1053 parallel --colsep ' ' 'echo job {#}: {2} {2.} {3/:%_1}'
1054
1055 9. Capture submatch using regular expression (`{@regexp}`)
1056
1057 9$ echo read_1.fq.gz | rush 'echo {@(.+)_\d}'
1058
1059 9$ echo read_1.fq.gz | parallel 'echo {@(.+)_\d}'
1060
1061 10. Custom field delimiter (`-d`)
1062
1063 10$ echo a=b=c | rush 'echo {1} {2} {3}' -d =
1064
1065 10$ echo a=b=c | parallel -d = echo {1} {2} {3}
1066
1067 11. Send multi-lines to every command (`-n`)
1068
1069 11$ seq 5 | rush -n 2 -k 'echo "{}"; echo'
1070
1071 11$ seq 5 |
1072 parallel -n 2 -k \
1073 'echo {=-1 $_=join"\n",@arg[1..$#arg] =}; echo'
1074
1075 11$ seq 5 | rush -n 2 -k 'echo "{}"; echo' -J ' '
1076
1077 11$ seq 5 | parallel -n 2 -k 'echo {}; echo'
1078
1079 12. Custom record delimiter (`-D`), note that empty records are not
1080 used.
1081
1082 12$ echo a b c d | rush -D " " -k 'echo {}'
1083
1084 12$ echo a b c d | parallel -d " " -k 'echo {}'
1085
1086 12$ echo abcd | rush -D "" -k 'echo {}'
1087
1088 Cannot be done by GNU Parallel
1089
1090 12$ cat fasta.fa
1091 >seq1
1092 tag
1093 >seq2
1094 cat
1095 gat
1096 >seq3
1097 attac
1098 a
1099 cat
1100
1101 12$ cat fasta.fa | rush -D ">" \
1102 'echo FASTA record {#}: name: {1} sequence: {2}' -k -d "\n"
1103 # rush fails to join the multiline sequences
1104
1105 12$ cat fasta.fa | (read -n1 ignore_first_char;
1106 parallel -d '>' --colsep '\n' echo FASTA record {#}: \
1107 name: {1} sequence: '{=2 $_=join"",@arg[2..$#arg]=}'
1108 )
1109
1110 13. Assign value to variable, like `awk -v` (`-v`)
1111
1112 13$ seq 1 |
1113 rush 'echo Hello, {fname} {lname}!' -v fname=Wei -v lname=Shen
1114
1115 13$ seq 1 |
1116 parallel -N0 \
1117 'fname=Wei; lname=Shen; echo Hello, ${fname} ${lname}!'
1118
1119 13$ for var in a b; do \
1120 13$ seq 1 3 | rush -k -v var=$var 'echo var: {var}, data: {}'; \
1121 13$ done
1122
1123 In GNU parallel you would typically do:
1124
1125 13$ seq 1 3 | parallel -k echo var: {1}, data: {2} ::: a b :::: -
1126
1127 If you really want the var:
1128
1129 13$ seq 1 3 |
1130 parallel -k var={1} ';echo var: $var, data: {}' ::: a b :::: -
1131
1132 If you really want the for-loop:
1133
1134 13$ for var in a b; do
1135 export var;
1136 seq 1 3 | parallel -k 'echo var: $var, data: {}';
1137 done
1138
1139 Unlike rush, this also works if the value is complex, like:
1140
1141 My brother's 12" records
1142
1143 14. Preset variable (`-v`), avoid repeatedly writing verbose
1144 replacement strings
1145
1146 14$ # naive way
1147 echo read_1.fq.gz | rush 'echo {:^_1} {:^_1}_2.fq.gz'
1148
1149 14$ echo read_1.fq.gz | parallel 'echo {:%_1} {:%_1}_2.fq.gz'
1150
1151 14$ # macro + removing suffix
1152 echo read_1.fq.gz |
1153 rush -v p='{:^_1}' 'echo {p} {p}_2.fq.gz'
1154
1155 14$ echo read_1.fq.gz |
1156 parallel 'p={:%_1}; echo $p ${p}_2.fq.gz'
1157
1158 14$ # macro + regular expression
1159 echo read_1.fq.gz | rush -v p='{@(.+?)_\d}' 'echo {p} {p}_2.fq.gz'
1160
1161 14$ echo read_1.fq.gz | parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1162
1163 Unlike rush, GNU parallel works with complex values:
1164
1165 14$ echo "My brother's 12\"read_1.fq.gz" |
1166 parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1167
1168 15. Interrupt jobs by `Ctrl-C`, rush will stop unfinished commands and
1169 exit.
1170
1171 15$ seq 1 20 | rush 'sleep 1; echo {}'
1172 ^C
1173
1174 15$ seq 1 20 | parallel 'sleep 1; echo {}'
1175 ^C
1176
1177 16. Continue/resume jobs (`-c`). When some jobs failed (by execution
1178 failure, timeout, or canceling by user with `Ctrl + C`), please switch
1179 flag `-c/--continue` on and run again, so that `rush` can save
1180 successful commands and ignore them in NEXT run.
1181
1182 16$ seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1183 cat successful_cmds.rush
1184 seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1185
1186 16$ seq 1 3 | parallel --joblog mylog --timeout 2 \
1187 'sleep {}; echo {}'
1188 cat mylog
1189 seq 1 3 | parallel --joblog mylog --retry-failed \
1190 'sleep {}; echo {}'
1191
1192 Multi-line jobs:
1193
1194 16$ seq 1 3 | rush 'sleep {}; echo {}; \
1195 echo finish {}' -t 3 -c -C finished.rush
1196 cat finished.rush
1197 seq 1 3 | rush 'sleep {}; echo {}; \
1198 echo finish {}' -t 3 -c -C finished.rush
1199
1200 16$ seq 1 3 |
1201 parallel --joblog mylog --timeout 2 'sleep {}; echo {}; \
1202 echo finish {}'
1203 cat mylog
1204 seq 1 3 |
1205 parallel --joblog mylog --retry-failed 'sleep {}; echo {}; \
1206 echo finish {}'
1207
1208 17. A comprehensive example: downloading 1K+ pages given by three URL
1209 list files using `phantomjs save_page.js` (some page contents are
1210 dynamically generated by JavaScript, so `wget` does not work). Here I
1211 set the max number of jobs (`-j`) to `20`; each job has a max running
1212 time (`-t`) of `60` seconds and `3` retry chances (`-r`). Continue flag `-c`
1213 is also switched on, so we can continue unfinished jobs. Luckily, it's
1214 accomplished in one run :)
1215
1216 17$ for f in $(seq 2014 2016); do \
1217 /bin/rm -rf $f; mkdir -p $f; \
1218 cat $f.html.txt | rush -v d=$f -d = \
1219 'phantomjs save_page.js "{}" > {d}/{3}.html' \
1220 -j 20 -t 60 -r 3 -c; \
1221 done
1222
1223 GNU parallel can append to an existing joblog with '+':
1224
1225 17$ rm mylog
1226 for f in $(seq 2014 2016); do
1227 /bin/rm -rf $f; mkdir -p $f;
1228 cat $f.html.txt |
1229 parallel -j20 --timeout 60 --retries 4 --joblog +mylog \
1230 --colsep = \
1231 phantomjs save_page.js {1}={2}={3} '>' $f/{3}.html
1232 done
1233
1234 18. A bioinformatics example: mapping with `bwa`, and processing result
1235 with `samtools`:
1236
1237 18$ ref=ref/xxx.fa
1238 threads=25
1239 ls -d raw.cluster.clean.mapping/* \
1240 | rush -v ref=$ref -v j=$threads -v p='{}/{%}' \
1241 'bwa mem -t {j} -M -a {ref} {p}_1.fq.gz {p}_2.fq.gz >{p}.sam;\
1242 samtools view -bS {p}.sam > {p}.bam; \
1243 samtools sort -T {p}.tmp -@ {j} {p}.bam -o {p}.sorted.bam; \
1244 samtools index {p}.sorted.bam; \
1245 samtools flagstat {p}.sorted.bam > {p}.sorted.bam.flagstat; \
1246 /bin/rm {p}.bam {p}.sam;' \
1247 -j 2 --verbose -c -C mapping.rush
1248
1249 GNU parallel would use a function:
1250
1251 18$ ref=ref/xxx.fa
1252 export ref
1253 thr=25
1254 export thr
1255 bwa_sam() {
1256 p="$1"
1257 bam="$p".bam
1258 sam="$p".sam
1259 sortbam="$p".sorted.bam
1260 bwa mem -t $thr -M -a $ref ${p}_1.fq.gz ${p}_2.fq.gz > "$sam"
1261 samtools view -bS "$sam" > "$bam"
1262 samtools sort -T ${p}.tmp -@ $thr "$bam" -o "$sortbam"
1263 samtools index "$sortbam"
1264 samtools flagstat "$sortbam" > "$sortbam".flagstat
1265 /bin/rm "$bam" "$sam"
1266 }
1267 export -f bwa_sam
1268 ls -d raw.cluster.clean.mapping/* |
1269 parallel -j 2 --verbose --joblog mylog bwa_sam
1270
1271 Other rush features
1272
1273 rush has:
1274
1275 • awk -v like custom defined variables (-v)
1276
1277 With GNU parallel you would simply set a shell variable:
1278
1279 parallel 'v={}; echo "$v"' ::: foo
1280 echo foo | rush -v v={} 'echo {v}'
1281
1282 Also rush does not like special chars. So these do not work:
1283
1284 echo does not work | rush -v v=\" 'echo {v}'
1285 echo "My brother's 12\" records" | rush -v v={} 'echo {v}'
1286
1287 Whereas the corresponding GNU parallel version works:
1288
1289 parallel 'v=\"; echo "$v"' ::: works
1290 parallel 'v={}; echo "$v"' ::: "My brother's 12\" records"
1291
1292 • Exit on first error(s) (-e)
1293
1294 This is called --halt now,fail=1 (or shorter: --halt 2) when used
1295 with GNU parallel.
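
A minimal sketch of that behaviour (assuming 3 jobslots so all jobs start at once): the failing job makes GNU parallel kill the still-running jobs immediately:

```shell
# 'false' fails right away; --halt now,fail=1 then kills the
# running 'sleep 10' jobs instead of letting them finish.
parallel -j3 --halt now,fail=1 ::: 'sleep 10' false 'sleep 10'
```

With fail=1, the exit value of GNU parallel is the exit value of the failing job.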
1296
1297 • Settable number of records sent to every command (-n, default 1)
1298
1299 This is also called -n in GNU parallel.
1300
1301 • Practical replacement strings
1302
1303 {:} remove any extension
1304 With GNU parallel this can be emulated by:
1305
1306 parallel --plus echo '{/\..*/}' ::: foo.ext.bar.gz
1307
1308 {^suffix}, remove suffix
1309 With GNU parallel this can be emulated by:
1310
1311 parallel --plus echo '{%.bar.gz}' ::: foo.ext.bar.gz
1312
1313 {@regexp}, capture submatch using regular expression
1314 With GNU parallel this can be emulated by:
1315
1316 parallel --rpl '{@(.*?)} /$$1/ and $_=$1;' \
1317 echo '{@\d_(.*).gz}' ::: 1_foo.gz
1318
1319 {%.}, {%:}, basename without extension
1320 With GNU parallel this can be emulated by:
1321
1322 parallel echo '{= s:.*/::;s/\..*// =}' ::: dir/foo.bar.gz
1323
1324 And if you need it often, you define a --rpl in
1325 $HOME/.parallel/config:
1326
1327 --rpl '{%.} s:.*/::;s/\..*//'
1328 --rpl '{%:} s:.*/::;s/\..*//'
1329
1330 Then you can use them as:
1331
1332 parallel echo {%.} {%:} ::: dir/foo.bar.gz
1333
1334 • Preset variable (macro)
1335
1336 E.g.
1337
1338 echo foosuffix | rush -v p={^suffix} 'echo {p}_new_suffix'
1339
1340 With GNU parallel this can be emulated by:
1341
1342 echo foosuffix |
1343 parallel --plus 'p={%suffix}; echo ${p}_new_suffix'
1344
1345 Unlike rush, GNU parallel works fine if the input contains double
1346 spaces, ' and ":
1347
1348 echo "1'6\" foosuffix" |
1349 parallel --plus 'p={%suffix}; echo "${p}"_new_suffix'
1350
1351 • Commands of multi-lines
1352
1353 While you can use multi-line commands in GNU parallel, GNU parallel
1354 discourages them, as functions improve readability. For example, this
1355 multi-line command:
1356
1357 seq 1 3 |
1358 parallel --timeout 2 --joblog my.log 'sleep {}; echo {}; \
1359 echo finish {}'
1360
1361 Could be written as:
1362
1363 doit() {
1364 sleep "$1"
1365 echo "$1"
1366 echo finish "$1"
1367 }
1368 export -f doit
1369 seq 1 3 | parallel --timeout 2 --joblog my.log doit
1370
1371 The failed commands can be resumed with:
1372
1373 seq 1 3 |
1374 parallel --resume-failed --joblog my.log 'sleep {}; echo {};\
1375 echo finish {}'
1376
1377 https://github.com/shenwei356/rush
1378
1379 DIFFERENCES BETWEEN ClusterSSH AND GNU Parallel
1380 ClusterSSH solves a different problem than GNU parallel.
1381
1382 ClusterSSH opens a terminal window for each computer, and using a
1383 master window you can run the same command on all the computers. This
1384 is typically used for administering several computers that are almost
1385 identical.
1386
1387 GNU parallel runs the same (or different) commands with different
1388 arguments in parallel possibly using remote computers to help
1389 computing. If more than one computer is listed in -S GNU parallel may
1390 only use one of these (e.g. if there are 8 jobs to be run and one
1391 computer has 8 cores).
1392
1393 GNU parallel can be used as a poor-man's version of ClusterSSH:
1394
1395 parallel --nonall -S server-a,server-b do_stuff foo bar
1396
1397 https://github.com/duncs/clusterssh
1398
1399 DIFFERENCES BETWEEN coshell AND GNU Parallel
1400 coshell only accepts full commands on standard input. Any quoting needs
1401 to be done by the user.
1402
1403 Commands are run in sh so any bash/tcsh/zsh specific syntax will not
1404 work.
1405
1406 Output can be buffered by using -d. Output is buffered in memory, so
1407 big output can cause swapping and therefore be terribly slow, or even
1408 make the system run out of memory.
1409
1410 https://github.com/gdm85/coshell (Last checked: 2019-01)
1411
1412 DIFFERENCES BETWEEN spread AND GNU Parallel
1413 spread runs commands on all directories.
1414
1415 It can be emulated with GNU parallel using this Bash function:
1416
1417 spread() {
1418 _cmds() {
1419 perl -e '$"=" && ";print "@ARGV"' "cd {}" "$@"
1420 }
1421 parallel $(_cmds "$@")'|| echo exit status $?' ::: */
1422 }
1423
1424 This works except for the --exclude option.
1425
1426 (Last checked: 2017-11)
1427
1428 DIFFERENCES BETWEEN pyargs AND GNU Parallel
1429 pyargs deals badly with input containing spaces. It buffers stdout, but
1430 not stderr. It buffers in RAM. {} does not work as replacement string.
1431 It does not support running functions.
1432
1433 pyargs does not support composed commands if run with --lines, and
1434 fails on pyargs traceroute gnu.org fsf.org.
1435
1436 Examples
1437
1438 seq 5 | pyargs -P50 -L seq
1439 seq 5 | parallel -P50 --lb seq
1440
1441 seq 5 | pyargs -P50 --mark -L seq
1442 seq 5 | parallel -P50 --lb \
1443 --tagstring OUTPUT'[{= $_=$job->replaced()=}]' seq
1444 # Similar, but not precisely the same
1445 seq 5 | parallel -P50 --lb --tag seq
1446
1447 seq 5 | pyargs -P50 --mark command
1448 # Somewhat longer with GNU Parallel due to the special
1449 # --mark formatting
1450 cmd="$(echo "command" | parallel --shellquote)"
1451 wrap_cmd() {
1452 echo "MARK $cmd $@================================" >&3
1453 echo "OUTPUT START[$cmd $@]:"
1454 eval $cmd "$@"
1455 echo "OUTPUT END[$cmd $@]"
1456 }
1457 (seq 5 | env_parallel -P2 wrap_cmd) 3>&1
1458 # Similar, but not exactly the same
1459 seq 5 | parallel -t --tag command
1460
1461 (echo '1 2 3';echo 4 5 6) | pyargs --stream seq
1462 (echo '1 2 3';echo 4 5 6) | perl -pe 's/\n/ /' |
1463 parallel -r -d' ' seq
1464 # Similar, but not exactly the same
1465 parallel seq ::: 1 2 3 4 5 6
1466
1467 https://github.com/robertblackwell/pyargs (Last checked: 2019-01)
1468
1469 DIFFERENCES BETWEEN concurrently AND GNU Parallel
1470 concurrently runs jobs in parallel.
1471
1472 The output is prepended with the job number, and may be incomplete:
1473
1474 $ concurrently 'seq 100000' | (sleep 3;wc -l)
1475 7165
1476
1477 When pretty printing, it caches output in memory. Output from
1478 different jobs mixes (see test MIX) whether or not output is cached.
1479
1480 There seems to be no way of making a template command and having
1481 concurrently fill it in with different args. The full commands must be
1482 given on the command line.
1483
1484 There is also no way of controlling how many jobs should be run in
1485 parallel at a time - i.e. "number of jobslots". Instead all jobs are
1486 simply started in parallel.
1487
1488 https://github.com/kimmobrunfeldt/concurrently (Last checked: 2019-01)
1489
1490 DIFFERENCES BETWEEN map(soveran) AND GNU Parallel
1491 map does not run jobs in parallel by default. The README suggests
1492 using:
1493
1494 ... | map t 'sleep $t && say done &'
1495
1496 But this fails if more jobs are run in parallel than the number of
1497 available processes. Since there is no support for parallelization in
1498 map itself, the output also mixes:
1499
1500 seq 10 | map i 'echo start-$i && sleep 0.$i && echo end-$i &'
1501
1502 The major difference is that GNU parallel is built for parallelization
1503 and map is not. So GNU parallel has lots of ways of dealing with the
1504 issues that parallelization raises:
1505
1506 • Keep the number of processes manageable
1507
1508 • Make sure output does not mix
1509
1510 • Make Ctrl-C kill all running processes
1511
1512 EXAMPLES FROM map's WEBSITE
1513
1514 Here are the 5 examples converted to GNU Parallel:
1515
1516 1$ ls *.c | map f 'foo $f'
1517 1$ ls *.c | parallel foo
1518
1519 2$ ls *.c | map f 'foo $f; bar $f'
1520 2$ ls *.c | parallel 'foo {}; bar {}'
1521
1522 3$ cat urls | map u 'curl -O $u'
1523 3$ cat urls | parallel curl -O
1524
1525 4$ printf "1\n1\n1\n" | map t 'sleep $t && say done'
1526 4$ printf "1\n1\n1\n" | parallel 'sleep {} && say done'
1527 4$ parallel 'sleep {} && say done' ::: 1 1 1
1528
1529 5$ printf "1\n1\n1\n" | map t 'sleep $t && say done &'
1530 5$ printf "1\n1\n1\n" | parallel -j0 'sleep {} && say done'
1531 5$ parallel -j0 'sleep {} && say done' ::: 1 1 1
1532
1533 https://github.com/soveran/map (Last checked: 2019-01)
1534
1535 DIFFERENCES BETWEEN loop AND GNU Parallel
1536 loop mixes stdout and stderr:
1537
1538 loop 'ls /no-such-file' >/dev/null
1539
1540 loop's replacement string $ITEM does not quote strings:
1541
1542 echo 'two spaces' | loop 'echo $ITEM'
1543
1544 loop cannot run functions:
1545
1546 myfunc() { echo joe; }
1547 export -f myfunc
1548 loop 'myfunc this fails'
1549
1550 EXAMPLES FROM loop's WEBSITE
1551
1552 Some of the examples from https://github.com/Miserlou/Loop/ can be
1553 emulated with GNU parallel:
1554
1555 # A couple of functions will make the code easier to read
1556 $ loopy() {
1557 yes | parallel -uN0 -j1 "$@"
1558 }
1559 $ export -f loopy
1560 $ time_out() {
1561 parallel -uN0 -q --timeout "$@" ::: 1
1562 }
1563 $ match() {
1564 perl -0777 -ne 'grep /'"$1"'/,$_ and print or exit 1'
1565 }
1566 $ export -f match
1567
1568 $ loop 'ls' --every 10s
1569 $ loopy --delay 10s ls
1570
1571 $ loop 'touch $COUNT.txt' --count-by 5
1572 $ loopy touch '{= $_=seq()*5 =}'.txt
1573
1574 $ loop --until-contains 200 -- \
./get_response_code.sh --site mysite.biz
1576 $ loopy --halt now,success=1 \
1577 './get_response_code.sh --site mysite.biz | match 200'
1578
1579 $ loop './poke_server' --for-duration 8h
1580 $ time_out 8h loopy ./poke_server
1581
1582 $ loop './poke_server' --until-success
1583 $ loopy --halt now,success=1 ./poke_server
1584
1585 $ cat files_to_create.txt | loop 'touch $ITEM'
1586 $ cat files_to_create.txt | parallel touch {}
1587
1588 $ loop 'ls' --for-duration 10min --summary
1589 # --joblog is somewhat more verbose than --summary
1590 $ time_out 10m loopy --joblog my.log ./poke_server; cat my.log
1591
1592 $ loop 'echo hello'
1593 $ loopy echo hello
1594
1595 $ loop 'echo $COUNT'
1596 # GNU Parallel counts from 1
1597 $ loopy echo {#}
1598 # Counting from 0 can be forced
1599 $ loopy echo '{= $_=seq()-1 =}'
1600
1601 $ loop 'echo $COUNT' --count-by 2
1602 $ loopy echo '{= $_=2*(seq()-1) =}'
1603
1604 $ loop 'echo $COUNT' --count-by 2 --offset 10
1605 $ loopy echo '{= $_=10+2*(seq()-1) =}'
1606
1607 $ loop 'echo $COUNT' --count-by 1.1
1608 # GNU Parallel rounds 3.3000000000000003 to 3.3
1609 $ loopy echo '{= $_=1.1*(seq()-1) =}'
1610
1611 $ loop 'echo $COUNT $ACTUALCOUNT' --count-by 2
1612 $ loopy echo '{= $_=2*(seq()-1) =} {#}'
1613
1614 $ loop 'echo $COUNT' --num 3 --summary
1615 # --joblog is somewhat more verbose than --summary
1616 $ seq 3 | parallel --joblog my.log echo; cat my.log
1617
1618 $ loop 'ls -foobarbatz' --num 3 --summary
1619 # --joblog is somewhat more verbose than --summary
1620 $ seq 3 | parallel --joblog my.log -N0 ls -foobarbatz; cat my.log
1621
1622 $ loop 'echo $COUNT' --count-by 2 --num 50 --only-last
1623 # Can be emulated by running 2 jobs
1624 $ seq 49 | parallel echo '{= $_=2*(seq()-1) =}' >/dev/null
1625 $ echo 50| parallel echo '{= $_=2*(seq()-1) =}'
1626
1627 $ loop 'date' --every 5s
1628 $ loopy --delay 5s date
1629
1630 $ loop 'date' --for-duration 8s --every 2s
1631 $ time_out 8s loopy --delay 2s date
1632
1633 $ loop 'date -u' --until-time '2018-05-25 20:50:00' --every 5s
1634 $ seconds=$((`date -d 2018-05-25T20:50:00 +%s` - `date +%s`))s
1635 $ time_out $seconds loopy --delay 5s date -u
1636
1637 $ loop 'echo $RANDOM' --until-contains "666"
1638 $ loopy --halt now,success=1 'echo $RANDOM | match 666'
1639
1640 $ loop 'if (( RANDOM % 2 )); then
1641 (echo "TRUE"; true);
1642 else
1643 (echo "FALSE"; false);
1644 fi' --until-success
1645 $ loopy --halt now,success=1 'if (( $RANDOM % 2 )); then
1646 (echo "TRUE"; true);
1647 else
1648 (echo "FALSE"; false);
1649 fi'
1650
1651 $ loop 'if (( RANDOM % 2 )); then
1652 (echo "TRUE"; true);
1653 else
1654 (echo "FALSE"; false);
1655 fi' --until-error
1656 $ loopy --halt now,fail=1 'if (( $RANDOM % 2 )); then
1657 (echo "TRUE"; true);
1658 else
1659 (echo "FALSE"; false);
1660 fi'
1661
1662 $ loop 'date' --until-match "(\d{4})"
1663 $ loopy --halt now,success=1 'date | match [0-9][0-9][0-9][0-9]'
1664
1665 $ loop 'echo $ITEM' --for red,green,blue
1666 $ parallel echo ::: red green blue
1667
1668 $ cat /tmp/my-list-of-files-to-create.txt | loop 'touch $ITEM'
1669 $ cat /tmp/my-list-of-files-to-create.txt | parallel touch
1670
1671 $ ls | loop 'cp $ITEM $ITEM.bak'; ls
1672 $ ls | parallel cp {} {}.bak; ls
1673
1674 $ loop 'echo $ITEM | tr a-z A-Z' -i
1675 $ parallel 'echo {} | tr a-z A-Z'
1676 # Or more efficiently:
1677 $ parallel --pipe tr a-z A-Z
1678
1679 $ loop 'echo $ITEM' --for "`ls`"
1680 $ parallel echo {} ::: "`ls`"
1681
1682 $ ls | loop './my_program $ITEM' --until-success;
1683 $ ls | parallel --halt now,success=1 ./my_program {}
1684
1685 $ ls | loop './my_program $ITEM' --until-fail;
1686 $ ls | parallel --halt now,fail=1 ./my_program {}
1687
1688 $ ./deploy.sh;
1689 loop 'curl -sw "%{http_code}" http://coolwebsite.biz' \
1690 --every 5s --until-contains 200;
1691 ./announce_to_slack.sh
1692 $ ./deploy.sh;
1693 loopy --delay 5s --halt now,success=1 \
1694 'curl -sw "%{http_code}" http://coolwebsite.biz | match 200';
1695 ./announce_to_slack.sh
1696
1697 $ loop "ping -c 1 mysite.com" --until-success; ./do_next_thing
1698 $ loopy --halt now,success=1 ping -c 1 mysite.com; ./do_next_thing
1699
1700 $ ./create_big_file -o my_big_file.bin;
1701 loop 'ls' --until-contains 'my_big_file.bin';
1702 ./upload_big_file my_big_file.bin
1703 # inotifywait is a better tool to detect file system changes.
1704 # It can even make sure the file is complete
1705 # so you are not uploading an incomplete file
1706 $ inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f . |
1707 grep my_big_file.bin
1708
1709 $ ls | loop 'cp $ITEM $ITEM.bak'
1710 $ ls | parallel cp {} {}.bak
1711
1712 $ loop './do_thing.sh' --every 15s --until-success --num 5
1713 $ parallel --retries 5 --delay 15s ::: ./do_thing.sh
1714
1715 https://github.com/Miserlou/Loop/ (Last checked: 2018-10)
1716
1717 DIFFERENCES BETWEEN lorikeet AND GNU Parallel
1718 lorikeet can run jobs in parallel. It does this based on a dependency
1719 graph described in a file, so this is similar to make.
1720
1721 https://github.com/cetra3/lorikeet (Last checked: 2018-10)
1722
1723 DIFFERENCES BETWEEN spp AND GNU Parallel
1724 spp can run jobs in parallel. spp does not use a command template to
1725 generate the jobs, but requires jobs to be in a file. Output from the
1726 jobs mix.
1727
1728 https://github.com/john01dav/spp (Last checked: 2019-01)
1729
1730 DIFFERENCES BETWEEN paral AND GNU Parallel
1731 paral prints a lot of status information and stores the output from the
1732 commands run into files. This means it cannot be used in the middle of
1733 a pipe like this:
1734
1735 paral "echo this" "echo does not" "echo work" | wc
1736
1737 Instead it puts the output into files named like out_#_command.out.log.
1738 To get a very similar behaviour with GNU parallel use --results
1739 'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta
1740
1741 paral only takes arguments on the command line and each argument should
1742 be a full command. Thus it does not use command templates.
1743
1744 This limits how many jobs it can run in total, because they all need to
1745 fit on a single command line.
1746
1747 paral has no support for running jobs remotely.
1748
1749 EXAMPLES FROM README.markdown
1750
1751 The examples from README.markdown and the corresponding command run
1752 with GNU parallel (--results
1753 'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta is omitted from the
1754 GNU parallel command):
1755
1756 1$ paral "command 1" "command 2 --flag" "command arg1 arg2"
1757 1$ parallel ::: "command 1" "command 2 --flag" "command arg1 arg2"
1758
1759 2$ paral "sleep 1 && echo c1" "sleep 2 && echo c2" \
1760 "sleep 3 && echo c3" "sleep 4 && echo c4" "sleep 5 && echo c5"
1761 2$ parallel ::: "sleep 1 && echo c1" "sleep 2 && echo c2" \
1762 "sleep 3 && echo c3" "sleep 4 && echo c4" "sleep 5 && echo c5"
1763 # Or shorter:
1764 parallel "sleep {} && echo c{}" ::: {1..5}
1765
1766 3$ paral -n=0 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1767 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1768 3$ parallel ::: "sleep 5 && echo c5" "sleep 4 && echo c4" \
1769 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1770 # Or shorter:
1771 parallel -j0 "sleep {} && echo c{}" ::: 5 4 3 2 1
1772
1773 4$ paral -n=1 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1774 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1775 4$ parallel -j1 "sleep {} && echo c{}" ::: 5 4 3 2 1
1776
1777 5$ paral -n=2 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1778 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1779 5$ parallel -j2 "sleep {} && echo c{}" ::: 5 4 3 2 1
1780
1781 6$ paral -n=5 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1782 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1783 6$ parallel -j5 "sleep {} && echo c{}" ::: 5 4 3 2 1
1784
1785 7$ paral -n=1 "echo a && sleep 0.5 && echo b && sleep 0.5 && \
1786 echo c && sleep 0.5 && echo d && sleep 0.5 && \
1787 echo e && sleep 0.5 && echo f && sleep 0.5 && \
1788 echo g && sleep 0.5 && echo h"
1789 7$ parallel ::: "echo a && sleep 0.5 && echo b && sleep 0.5 && \
1790 echo c && sleep 0.5 && echo d && sleep 0.5 && \
1791 echo e && sleep 0.5 && echo f && sleep 0.5 && \
1792 echo g && sleep 0.5 && echo h"
1793
1794 https://github.com/amattn/paral (Last checked: 2019-01)
1795
1796 DIFFERENCES BETWEEN concurr AND GNU Parallel
1797 concurr is built to run jobs in parallel using a client/server model.
1798
1799 EXAMPLES FROM README.md
1800
1801 The examples from README.md:
1802
1803 1$ concurr 'echo job {#} on slot {%}: {}' : arg1 arg2 arg3 arg4
1804 1$ parallel 'echo job {#} on slot {%}: {}' ::: arg1 arg2 arg3 arg4
1805
1806 2$ concurr 'echo job {#} on slot {%}: {}' :: file1 file2 file3
1807 2$ parallel 'echo job {#} on slot {%}: {}' :::: file1 file2 file3
1808
1809 3$ concurr 'echo {}' < input_file
1810 3$ parallel 'echo {}' < input_file
1811
1812 4$ cat file | concurr 'echo {}'
1813 4$ cat file | parallel 'echo {}'
1814
1815 concurr deals badly with empty input files and with output larger than
1816 64 KB.
1817
1818 https://github.com/mmstick/concurr (Last checked: 2019-01)
1819
1820 DIFFERENCES BETWEEN lesser-parallel AND GNU Parallel
1821 lesser-parallel is the inspiration for parallel --embed. Both lesser-
1822 parallel and parallel --embed define bash functions that can be
1823 included as part of a bash script to run jobs in parallel.
1824
1825 lesser-parallel implements a few of the replacement strings, but hardly
1826 any options, whereas parallel --embed gives you the full GNU parallel
1827 experience.
1828
1829 https://github.com/kou1okada/lesser-parallel (Last checked: 2019-01)
1830
1831 DIFFERENCES BETWEEN npm-parallel AND GNU Parallel
1832 npm-parallel can run npm tasks in parallel.
1833
1834 There are no examples and very little documentation, so it is hard to
1835 compare to GNU parallel.
1836
1837 https://github.com/spion/npm-parallel (Last checked: 2019-01)
1838
1839 DIFFERENCES BETWEEN machma AND GNU Parallel
1840 machma runs tasks in parallel. It gives time stamped output. It buffers
1841 in RAM.
1842
1843 EXAMPLES FROM README.md
1844
1845 The examples from README.md:
1846
1847 1$ # Put shorthand for timestamp in config for the examples
1848 echo '--rpl '\
1849 \''{time} $_=::strftime("%Y-%m-%d %H:%M:%S",localtime())'\' \
1850 > ~/.parallel/machma
1851 echo '--line-buffer --tagstring "{#} {time} {}"' \
1852 >> ~/.parallel/machma
1853
1854 2$ find . -iname '*.jpg' |
1855 machma -- mogrify -resize 1200x1200 -filter Lanczos {}
1856 find . -iname '*.jpg' |
1857 parallel --bar -Jmachma mogrify -resize 1200x1200 \
1858 -filter Lanczos {}
1859
1860 3$ cat /tmp/ips | machma -p 2 -- ping -c 2 -q {}
1861 3$ cat /tmp/ips | parallel -j2 -Jmachma ping -c 2 -q {}
1862
1863 4$ cat /tmp/ips |
1864 machma -- sh -c 'ping -c 2 -q $0 > /dev/null && echo alive' {}
1865 4$ cat /tmp/ips |
1866 parallel -Jmachma 'ping -c 2 -q {} > /dev/null && echo alive'
1867
1868 5$ find . -iname '*.jpg' |
1869 machma --timeout 5s -- mogrify -resize 1200x1200 \
1870 -filter Lanczos {}
1871 5$ find . -iname '*.jpg' |
1872 parallel --timeout 5s --bar mogrify -resize 1200x1200 \
1873 -filter Lanczos {}
1874
1875 6$ find . -iname '*.jpg' -print0 |
1876 machma --null -- mogrify -resize 1200x1200 -filter Lanczos {}
1877 6$ find . -iname '*.jpg' -print0 |
1878 parallel --null --bar mogrify -resize 1200x1200 \
1879 -filter Lanczos {}
1880
1881 https://github.com/fd0/machma (Last checked: 2019-06)
1882
1883 DIFFERENCES BETWEEN interlace AND GNU Parallel
1884 Summary (see legend above):
1885
1886 - I2 I3 I4 - - -
1887 M1 - M3 - - M6
1888 - O2 O3 - - - - x x
1889 E1 E2 - - - - -
1890 - - - - - - - - -
1891 - -
1892
1893 interlace is built for network analysis to run network tools in
1894 parallel.
1895
1896 interlace does not buffer output, so output from different jobs mixes.
1897
1898 The overhead for each target is O(n*n), so with 1000 targets it becomes
1899 very slow, with an overhead on the order of 500 ms/target.
1900
1901 EXAMPLES FROM interlace's WEBSITE
1902
1903 Using prips most of the examples from
1904 https://github.com/codingo/Interlace can be run with GNU parallel:
1905
1906 Blocker
1907
1908 commands.txt:
1909 mkdir -p _output_/_target_/scans/
1910 _blocker_
1911 nmap _target_ -oA _output_/_target_/scans/_target_-nmap
1912 interlace -tL ./targets.txt -cL commands.txt -o $output
1913
1914 parallel -a targets.txt \
1915 mkdir -p $output/{}/scans/\; nmap {} -oA $output/{}/scans/{}-nmap
1916
1917 Blocks
1918
1919 commands.txt:
1920 _block:nmap_
1921 mkdir -p _target_/output/scans/
1922 nmap _target_ -oN _target_/output/scans/_target_-nmap
1923 _block:nmap_
1924 nikto --host _target_
1925 interlace -tL ./targets.txt -cL commands.txt
1926
1927 _nmap() {
1928 mkdir -p $1/output/scans/
1929 nmap $1 -oN $1/output/scans/$1-nmap
1930 }
1931 export -f _nmap
1932 parallel ::: _nmap "nikto --host" :::: targets.txt
1933
1934 Run Nikto Over Multiple Sites
1935
1936 interlace -tL ./targets.txt -threads 5 \
1937 -c "nikto --host _target_ > ./_target_-nikto.txt" -v
1938
1939 parallel -a targets.txt -P5 nikto --host {} \> ./{}-nikto.txt
1940
1941 Run Nikto Over Multiple Sites and Ports
1942
1943 interlace -tL ./targets.txt -threads 5 -c \
1944 "nikto --host _target_:_port_ > ./_target_-_port_-nikto.txt" \
1945 -p 80,443 -v
1946
1947 parallel -P5 nikto --host {1}:{2} \> ./{1}-{2}-nikto.txt \
1948 :::: targets.txt ::: 80 443
1949
1950 Run a List of Commands against Target Hosts
1951
1952 commands.txt:
1953 nikto --host _target_:_port_ > _output_/_target_-nikto.txt
1954 sslscan _target_:_port_ > _output_/_target_-sslscan.txt
1955 testssl.sh _target_:_port_ > _output_/_target_-testssl.txt
1956 interlace -t example.com -o ~/Engagements/example/ \
1957 -cL ./commands.txt -p 80,443
1958
1959 parallel --results ~/Engagements/example/{2}:{3}{1} {1} {2}:{3} \
1960 ::: "nikto --host" sslscan testssl.sh ::: example.com ::: 80 443
1961
1962 CIDR notation with an application that doesn't support it
1963
1964 interlace -t 192.168.12.0/24 -c "vhostscan _target_ \
1965 -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
1966
1967 prips 192.168.12.0/24 |
1968 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
1969
1970 Glob notation with an application that doesn't support it
1971
1972 interlace -t 192.168.12.* -c "vhostscan _target_ \
1973 -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
1974
1975 # Glob is not supported in prips
1976 prips 192.168.12.0/24 |
1977 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
1978
1979 Dash (-) notation with an application that doesn't support it
1980
1981 interlace -t 192.168.12.1-15 -c \
1982 "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
1983 -o ~/scans/ -threads 50
1984
1985 # Dash notation is not supported in prips
1986 prips 192.168.12.1 192.168.12.15 |
1987 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
1988
1989 Threading Support for an application that doesn't support it
1990
1991 interlace -tL ./target-list.txt -c \
1992 "vhostscan -t _target_ -oN _output_/_target_-vhosts.txt" \
1993 -o ~/scans/ -threads 50
1994
1995 cat ./target-list.txt |
1996 parallel -P50 vhostscan -t {} -oN ~/scans/{}-vhosts.txt
1997
1998 alternatively
1999
2000 ./vhosts-commands.txt:
2001 vhostscan -t $target -oN _output_/_target_-vhosts.txt
2002 interlace -cL ./vhosts-commands.txt -tL ./target-list.txt \
2003 -threads 50 -o ~/scans
2004
2005 ./vhosts-commands.txt:
2006 vhostscan -t "$1" -oN "$2"
2007 parallel -P50 ./vhosts-commands.txt {} ~/scans/{}-vhosts.txt \
2008 :::: ./target-list.txt
2009
2010 Exclusions
2011
2012 interlace -t 192.168.12.0/24 -e 192.168.12.0/26 -c \
2013 "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
2014 -o ~/scans/ -threads 50
2015
2016 prips 192.168.12.0/24 | grep -xv -Ff <(prips 192.168.12.0/26) |
2017 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2018
2019 Run Nikto Using Multiple Proxies
2020
2021 interlace -tL ./targets.txt -pL ./proxies.txt -threads 5 -c \
2022 "nikto --host _target_:_port_ -useproxy _proxy_ > \
2023 ./_target_-_port_-nikto.txt" -p 80,443 -v
2024
2025 parallel -j5 \
2026 "nikto --host {1}:{2} -useproxy {3} > ./{1}-{2}-nikto.txt" \
2027 :::: ./targets.txt ::: 80 443 :::: ./proxies.txt
2028
2029 https://github.com/codingo/Interlace (Last checked: 2019-09)
2030
2031 DIFFERENCES BETWEEN otonvm Parallel AND GNU Parallel
2032 I have been unable to get the code to run at all. It seems unfinished.
2033
2034 https://github.com/otonvm/Parallel (Last checked: 2019-02)
2035
2036 DIFFERENCES BETWEEN k-bx par AND GNU Parallel
2037 par requires Haskell to work. This limits the number of platforms on
2038 which it can run.
2039
2040 par does line buffering in memory. The memory usage is 3x the longest
2041 line (compared to 1x for parallel --lb). Commands must be given as
2042 arguments. There is no template.
2043
2044 These are the examples from https://github.com/k-bx/par with the
2045 corresponding GNU parallel command.
2046
2047 par "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
2048 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2049 parallel --lb ::: "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
2050 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2051
2052 par "echo foo; sleep 1; foofoo" \
2053 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2054 parallel --lb --halt 1 ::: "echo foo; sleep 1; foofoo" \
2055 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2056
2057 par "PARPREFIX=[fooechoer] echo foo" "PARPREFIX=[bar] echo bar"
2058 parallel --lb --colsep , --tagstring {1} {2} \
2059 ::: "[fooechoer],echo foo" "[bar],echo bar"
2060
2061 par --succeed "foo" "bar" && echo 'wow'
2062 parallel "foo" "bar"; true && echo 'wow'
2063
2064 https://github.com/k-bx/par (Last checked: 2019-02)
2065
2066 DIFFERENCES BETWEEN parallelshell AND GNU Parallel
2067 parallelshell does not allow for composed commands:
2068
2069 # This does not work
2070 parallelshell 'echo foo;echo bar' 'echo baz;echo quuz'
2071
2072 Instead you have to wrap that in a shell:
2073
2074 parallelshell 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2075
2076 It buffers output in RAM. All commands must be given on the command
2077 line and all commands are started in parallel at the same time. This
2078 will cause the system to freeze if there are so many jobs that there is
2079 not enough memory to run them all at the same time.
2080
2081 https://github.com/keithamus/parallelshell (Last checked: 2019-02)
2082
2083 https://github.com/darkguy2008/parallelshell (Last checked: 2019-03)
2084
2085 DIFFERENCES BETWEEN shell-executor AND GNU Parallel
2086 shell-executor does not allow for composed commands:
2087
2088 # This does not work
2089 sx 'echo foo;echo bar' 'echo baz;echo quuz'
2090
2091 Instead you have to wrap that in a shell:
2092
2093 sx 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2094
2095 It buffers output in RAM. All commands must be given on the command
2096 line and all commands are started in parallel at the same time. This
2097 will cause the system to freeze if there are so many jobs that there is
2098 not enough memory to run them all at the same time.
2099
2100 https://github.com/royriojas/shell-executor (Last checked: 2019-02)
2101
2102 DIFFERENCES BETWEEN non-GNU par AND GNU Parallel
2103 par buffers in memory to avoid mixing of jobs. It takes 1s per 1
2104 million output lines.
2105
2106 par needs to have all commands before starting the first job. The jobs
2107 are read from stdin (standard input) so any quoting will have to be
2108 done by the user.
2109
2110 Stdout (standard output) is prepended with o:. Stderr (standard error)
2111 is sent to stdout (standard output) and prepended with e:.
2112
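The o:/e: labelling can be sketched in plain shell. This is an assumed
reconstruction of the output format only, not par's implementation;
label_output and job are hypothetical names, and unlike par this buffers
stderr until the job finishes:

```shell
# label_output CMD...: run CMD, prefix stdout lines with "o:" and stderr
# lines with "e:", and merge both onto stdout (the format described above).
# Unlike par, this buffers stderr in a temp file until the command exits.
label_output() {
  t=$(mktemp)
  "$@" 2>"$t" | sed 's/^/o:/'
  sed 's/^/e:/' "$t"
  rm -f "$t"
}

# a job writing to both streams
job() { echo result; echo warning >&2; }
label_output job
```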
2113 For short jobs with little output par is 20% faster than GNU parallel
2114 and 60% slower than xargs.
2115
2116 https://github.com/UnixJunkie/PAR
2117
2118 https://savannah.nongnu.org/projects/par (Last checked: 2019-02)
2119
2120 DIFFERENCES BETWEEN fd AND GNU Parallel
2121 fd does not support composed commands, so commands must be wrapped in
2122 sh -c.
2123
2124 It buffers output in RAM.
2125
2126 It only takes file names from the filesystem as input (similar to
2127 find).
2128
2129 https://github.com/sharkdp/fd (Last checked: 2019-02)
2130
2131 DIFFERENCES BETWEEN lateral AND GNU Parallel
2132 lateral is very similar to sem: It takes a single command and runs it
2133 in the background. The design means that output from parallel running
2134 jobs may mix. If it dies unexpectedly it leaves a socket in
2135 ~/.lateral/socket.PID.
2136
2137 lateral deals badly with too long command lines. This makes the lateral
2138 server crash:
2139
2140 lateral run echo `seq 100000| head -c 1000k`
2141
2142 Any options will be read by lateral so this does not work (lateral
2143 interprets the -l):
2144
2145 lateral run ls -l
2146
2147 Composed commands do not work:
2148
2149 lateral run pwd ';' ls
2150
2151 Functions do not work:
2152
2153 myfunc() { echo a; }
2154 export -f myfunc
2155 lateral run myfunc
2156
2157 Running emacs in the terminal causes the parent shell to die:
2158
2159 echo '#!/bin/bash' > mycmd
2160 echo emacs -nw >> mycmd
2161 chmod +x mycmd
2162 lateral start
2163 lateral run ./mycmd
2164
2165 Here are the examples from https://github.com/akramer/lateral with the
2166 corresponding GNU sem and GNU parallel commands:
2167
2168 1$ lateral start
2169 for i in $(cat /tmp/names); do
2170 lateral run -- some_command $i
2171 done
2172 lateral wait
2173
2174 1$ for i in $(cat /tmp/names); do
2175 sem some_command $i
2176 done
2177 sem --wait
2178
2179 1$ parallel some_command :::: /tmp/names
2180
2181 2$ lateral start
2182 for i in $(seq 1 100); do
2183 lateral run -- my_slow_command < workfile$i > /tmp/logfile$i
2184 done
2185 lateral wait
2186
2187 2$ for i in $(seq 1 100); do
2188 sem my_slow_command < workfile$i > /tmp/logfile$i
2189 done
2190 sem --wait
2191
2192 2$ parallel 'my_slow_command < workfile{} > /tmp/logfile{}' \
2193 ::: {1..100}
2194
2195 3$ lateral start -p 0 # yup, it will just queue tasks
2196 for i in $(seq 1 100); do
2197 lateral run -- command_still_outputs_but_wont_spam inputfile$i
2198 done
2199 # command output spam can commence
2200 lateral config -p 10; lateral wait
2201
2202 3$ for i in $(seq 1 100); do
2203 echo "command inputfile$i" >> joblist
2204 done
2205 parallel -j 10 :::: joblist
2206
2207 3$ echo 1 > /tmp/njobs
2208 parallel -j /tmp/njobs command inputfile{} \
2209 ::: {1..100} &
2210 echo 10 >/tmp/njobs
2211 wait
2212
2213 https://github.com/akramer/lateral (Last checked: 2019-03)
2214
2215 DIFFERENCES BETWEEN with-this AND GNU Parallel
2216 The examples from https://github.com/amritb/with-this.git and the
2217 corresponding GNU parallel command:
2218
2219 with -v "$(cat myurls.txt)" "curl -L this"
2220 parallel curl -L :::: myurls.txt
2221
2222 with -v "$(cat myregions.txt)" \
2223 "aws --region=this ec2 describe-instance-status"
2224 parallel aws --region={} ec2 describe-instance-status \
2225 :::: myregions.txt
2226
2227 with -v "$(ls)" "kubectl --kubeconfig=this get pods"
2228 ls | parallel kubectl --kubeconfig={} get pods
2229
2230 with -v "$(ls | grep config)" "kubectl --kubeconfig=this get pods"
2231 ls | grep config | parallel kubectl --kubeconfig={} get pods
2232
2233 with -v "$(echo {1..10})" "echo 123"
2234 parallel -N0 echo 123 ::: {1..10}
2235
2236 Stderr is merged with stdout. with-this buffers in RAM. It uses 3x the
2237 output size, so you cannot have output larger than 1/3rd the amount of
2238 RAM. The input values cannot contain spaces. Composed commands do not
2239 work.
2240
2241 with-this gives some additional information, so the output has to be
2242 cleaned before piping it to the next command.
2243
2244 https://github.com/amritb/with-this.git (Last checked: 2019-03)
2245
2246 DIFFERENCES BETWEEN Tollef's parallel (moreutils) AND GNU Parallel
2247 Summary (see legend above):
2248
2249 - - - I4 - - I7
2250 - - M3 - - M6
2251 - O2 O3 - O5 O6 - x x
2252 E1 - - - - - E7
2253 - x x x x x x x x
2254 - -
2255
2256 EXAMPLES FROM Tollef's parallel MANUAL
2257
2258 Tollef parallel sh -c "echo hi; sleep 2; echo bye" -- 1 2 3
2259
2260 GNU parallel "echo hi; sleep 2; echo bye" ::: 1 2 3
2261
2262 Tollef parallel -j 3 ufraw -o processed -- *.NEF
2263
2264 GNU parallel -j 3 ufraw -o processed ::: *.NEF
2265
2266 Tollef parallel -j 3 -- ls df "echo hi"
2267
2268 GNU parallel -j 3 ::: ls df "echo hi"
2269
2270 (Last checked: 2019-08)
2271
2272 DIFFERENCES BETWEEN rargs AND GNU Parallel
2273 Summary (see legend above):
2274
2275 I1 - - - - - I7
2276 - - M3 M4 - -
2277 - O2 O3 - O5 O6 - O8 -
2278 E1 - - E4 - - -
2279 - - - - - - - - -
2280 - -
2281
2282 rargs has elegant ways of doing named regexp capture and field ranges.
2283
2284 With GNU parallel you can use --rpl to get a similar functionality as
2285 regexp capture gives, and use join and @arg to get the field ranges.
2286 But the syntax is longer. This:
2287
2288 --rpl '{r(\d+)\.\.(\d+)} $_=join"$opt::colsep",@arg[$$1..$$2]'
2289
2290 would make it possible to use:
2291
2292 {1r3..6}
2293
2294 for field 3..6.
2295
2296 For full support of {n..m:s} including negative numbers use a dynamic
2297 replacement string like this:
2298
2299 PARALLEL=--rpl\ \''{r((-?\d+)?)\.\.((-?\d+)?)((:([^}]*))?)}
2300 $a = defined $$2 ? $$2 < 0 ? 1+$#arg+$$2 : $$2 : 1;
2301 $b = defined $$4 ? $$4 < 0 ? 1+$#arg+$$4 : $$4 : $#arg+1;
2302 $s = defined $$6 ? $$7 : " ";
2303 $_ = join $s,@arg[$a..$b]'\'
2304 export PARALLEL
2305
2306 You can then do:
2307
2308 head /etc/passwd | parallel --colsep : echo ..={1r..} ..3={1r..3} \
2309 4..={1r4..} 2..4={1r2..4} 3..3={1r3..3} ..3:-={1r..3:-} \
2310 ..3:/={1r..3:/} -1={-1} -5={-5} -6={-6} -3..={1r-3..}
2311
2312 EXAMPLES FROM rargs MANUAL
2313
2314 ls *.bak | rargs -p '(.*)\.bak' mv {0} {1}
2315 ls *.bak | parallel mv {} {.}
2316
2317 cat download-list.csv | rargs -p '(?P<url>.*),(?P<filename>.*)' wget {url} -O {filename}
2318 cat download-list.csv | parallel --csv wget {1} -O {2}
2319 # or use regexps:
2320 cat download-list.csv |
2321 parallel --rpl '{url} s/,.*//' --rpl '{filename} s/.*?,//' wget {url} -O {filename}
2322
2323 cat /etc/passwd | rargs -d: echo -e 'id: "{1}"\t name: "{5}"\t rest: "{6..::}"'
2324 cat /etc/passwd |
2325 parallel -q --colsep : echo -e 'id: "{1}"\t name: "{5}"\t rest: "{=6 $_=join":",@arg[6..$#arg]=}"'
2326
2327 https://github.com/lotabout/rargs (Last checked: 2020-01)
2328
2329 DIFFERENCES BETWEEN threader AND GNU Parallel
2330 Summary (see legend above):
2331
2332 I1 - - - - - -
2333 M1 - M3 - - M6
2334 O1 - O3 - O5 - - N/A N/A
2335 E1 - - E4 - - -
2336 - - - - - - - - -
2337 - -
2338
2339 Newline separates arguments, but a newline at the end of the file is
2340 treated as an extra, empty argument. So this runs 2 jobs:
2341
2342 echo two_jobs | threader -run 'echo "$THREADID"'
2343
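The splitting behaviour can be sketched with awk. count_args is a
hypothetical helper that mimics the splitting described above (it is not
threader's code) and assumes the input contains no \001 bytes:

```shell
# count_args: slurp stdin as one record and split it on '\n', keeping a
# trailing empty field -- mimics the splitting described above.
# Assumes the input contains no \001 bytes (used as a never-matching RS).
count_args() {
  awk 'BEGIN { RS = "\001" }      # read the whole input as one record
       { print split($0, a, "\n") }'
}

printf 'two_jobs\n' | count_args
```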
2344 threader ignores stderr, so any output to stderr is lost. threader
2345 buffers in RAM, so output bigger than the machine's virtual memory will
2346 cause the machine to crash.
2347
2348 https://github.com/voodooEntity/threader (Last checked: 2020-04)
2349
2350 DIFFERENCES BETWEEN runp AND GNU Parallel
2351 Summary (see legend above):
2352
2353 I1 I2 - - - - -
2354 M1 - (M3) - - M6
2355 O1 O2 O3 - O5 O6 - N/A N/A -
2356 E1 - - - - - -
2357 - - - - - - - - -
2358 - -
2359
2360 (M3): You can add a prefix and a postfix to the input, so the argument
2361 can only be inserted into the command line once.
2362
2363 runp runs 10 jobs in parallel by default. runp blocks if output of a
2364 command is > 64 Kbytes. Quoting of input is needed. It adds output to
2365 stderr (this can be prevented with -q).
2366
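The prefix/postfix idea can be sketched in shell. run_wrapped is a
hypothetical helper, not runp's code, and it splices each input line into
a shell command, so the input must be trusted:

```shell
# run_wrapped PREFIX POSTFIX: build "PREFIX <line> POSTFIX" from each
# stdin line and run it -- a sketch of the (M3) prefix/postfix scheme
# described above. Input is spliced into a shell command, so it must
# be trusted.
run_wrapped() {
  while IFS= read -r line; do
    sh -c "$1 $line $2"
  done
}

printf 'a\nb\n' | run_wrapped 'echo got' 'ok'
```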
2367 Examples as GNU Parallel
2368
2369 base='https://images-api.nasa.gov/search'
2370 query='jupiter'
2371 desc='planet'
2372 type='image'
2373 url="$base?q=$query&description=$desc&media_type=$type"
2374
2375 # Download the images in parallel using runp
2376 curl -s $url | jq -r .collection.items[].href | \
2377 runp -p 'curl -s' | jq -r .[] | grep large | \
2378 runp -p 'curl -s -L -O'
2379
2380 time curl -s $url | jq -r .collection.items[].href | \
2381 runp -g 1 -q -p 'curl -s' | jq -r .[] | grep large | \
2382 runp -g 1 -q -p 'curl -s -L -O'
2383
2384 # Download the images in parallel
2385 curl -s $url | jq -r .collection.items[].href | \
2386 parallel curl -s | jq -r .[] | grep large | \
2387 parallel curl -s -L -O
2388
2389 time curl -s $url | jq -r .collection.items[].href | \
2390 parallel -j 1 curl -s | jq -r .[] | grep large | \
2391 parallel -j 1 curl -s -L -O
2392
2393 Run some test commands (read from file)
2394
2395 # Create a file containing commands to run in parallel.
2396 cat << EOF > /tmp/test-commands.txt
2397 sleep 5
2398 sleep 3
2399 blah # this will fail
2400 ls $PWD # PWD shell variable is used here
2401 EOF
2402
2403 # Run commands from the file.
2404 runp /tmp/test-commands.txt > /dev/null
2405
2406 parallel -a /tmp/test-commands.txt > /dev/null
2407
2408 Ping several hosts and see packet loss (read from stdin)
2409
2410 # First copy this line and press Enter
2411 runp -p 'ping -c 5 -W 2' -s '| grep loss'
2412 localhost
2413 1.1.1.1
2414 8.8.8.8
2415 # Press Enter and Ctrl-D when done entering the hosts
2416
2417 # First copy this line and press Enter
2418 parallel ping -c 5 -W 2 {} '| grep loss'
2419 localhost
2420 1.1.1.1
2421 8.8.8.8
2422 # Press Enter and Ctrl-D when done entering the hosts
2423
2424 Get directories' sizes (read from stdin)
2425
2426 echo -e "$HOME\n/etc\n/tmp" | runp -q -p 'sudo du -sh'
2427
2428 echo -e "$HOME\n/etc\n/tmp" | parallel sudo du -sh
2429 # or:
2430 parallel sudo du -sh ::: "$HOME" /etc /tmp
2431
2432 Compress files
2433
2434 find . -iname '*.txt' | runp -p 'gzip --best'
2435
2436 find . -iname '*.txt' | parallel gzip --best
2437
2438 Measure HTTP request + response time
2439
2440 export CURL="curl -w 'time_total: %{time_total}\n'"
2441 CURL="$CURL -o /dev/null -s https://golang.org/"
2442 perl -wE 'for (1..10) { say $ENV{CURL} }' |
2443 runp -q # Make 10 requests
2444
2445 perl -wE 'for (1..10) { say $ENV{CURL} }' | parallel
2446 # or:
2447 parallel -N0 "$CURL" ::: {1..10}
2448
2449 Find open TCP ports
2450
2451 cat << EOF > /tmp/host-port.txt
2452 localhost 22
2453 localhost 80
2454 localhost 81
2455 127.0.0.1 443
2456 127.0.0.1 444
2457 scanme.nmap.org 22
2458 scanme.nmap.org 23
2459 scanme.nmap.org 443
2460 EOF
2461
2462 1$ cat /tmp/host-port.txt |
2463 runp -q -p 'netcat -v -w2 -z' 2>&1 | egrep '(succeeded!|open)$'
2464
2465 # --colsep is needed to split the line
2466 1$ cat /tmp/host-port.txt |
2467 parallel --colsep ' ' netcat -v -w2 -z 2>&1 |
2468 egrep '(succeeded!|open)$'
2469 # or use uq for unquoted:
2470 1$ cat /tmp/host-port.txt |
2471 parallel netcat -v -w2 -z {=uq=} 2>&1 |
2472 egrep '(succeeded!|open)$'
2473
2474 https://github.com/jreisinger/runp (Last checked: 2020-04)
2475
2476 DIFFERENCES BETWEEN papply AND GNU Parallel
2477 Summary (see legend above):
2478
2479 - - - I4 - - -
2480 M1 - M3 - - M6
2481 - - O3 - O5 - - N/A N/A O10
2482 E1 - - E4 - - -
2483 - - - - - - - - -
2484 - -
2485
2486 papply does not print the output if the command fails:
2487
2488 $ papply 'echo %F; false' foo
2489 "echo foo; false" did not succeed
2490
2491 papply's replacement strings (%F %d %f %n %e %z) can be simulated in
2492 GNU parallel by putting this in ~/.parallel/config:
2493
2494 --rpl '%F'
2495 --rpl '%d $_=Q(::dirname($_));'
2496 --rpl '%f s:.*/::;'
2497 --rpl '%n s:.*/::;s:\.[^/.]+$::;'
2498 --rpl '%e s:.*\.:.:'
2499 --rpl '%z $_=""'
2500
2501 papply buffers in RAM, using twice as much RAM as there is output: 5 GB
2502 of output takes 10 GB of RAM.
2503
2504 The buffering is very CPU intensive: Buffering a line of 5 GB takes 40
2505 seconds (compared to 10 seconds with GNU parallel).
2506
2507 Examples as GNU Parallel
2508
2509 1$ papply gzip *.txt
2510
2511 1$ parallel gzip ::: *.txt
2512
2513 2$ papply "convert %F %n.jpg" *.png
2514
2515 2$ parallel convert {} {.}.jpg ::: *.png
2516
2517 https://pypi.org/project/papply/ (Last checked: 2020-04)
2518
2519 DIFFERENCES BETWEEN async AND GNU Parallel
2520 Summary (see legend above):
2521
2522 - - - I4 - - I7
2523 - - - - - M6
2524 - O2 O3 - O5 O6 - N/A N/A O10
2525 E1 - - E4 - E6 -
2526 - - - - - - - - -
2527 S1 S2
2528
2529 async is very similar to GNU parallel's --semaphore mode (aka sem).
2530 async requires the user to start a server process.
2531
2532 The input is quoted like -q so you need bash -c "...;..." to run
2533 composed commands.
2534
2535 Examples as GNU Parallel
2536
2537 1$ S="/tmp/example_socket"
2538
2539 1$ ID=myid
2540
2541 2$ async -s="$S" server --start
2542
2543 2$ # GNU Parallel does not need a server to run
2544
2545 3$ for i in {1..20}; do
2546 # prints command output to stdout
2547 async -s="$S" cmd -- bash -c "sleep 1 && echo test $i"
2548 done
2549
2550 3$ for i in {1..20}; do
2551 # prints command output to stdout
2552 sem --id "$ID" -j100% "sleep 1 && echo test $i"
2553 # GNU Parallel will only print job when it is done
2554 # If you need output from different jobs to mix
2555 # use -u or --line-buffer
2556 sem --id "$ID" -j100% --line-buffer "sleep 1 && echo test $i"
2557 done
2558
2559 4$ # wait until all commands are finished
2560 async -s="$S" wait
2561
2562 4$ sem --id "$ID" --wait
2563
2564 5$ # configure the server to run four commands in parallel
2565 async -s="$S" server -j4
2566
2567 5$ export PARALLEL=-j4
2568
2569 6$ mkdir "/tmp/ex_dir"
2570 for i in {21..40}; do
2571 # redirects command output to /tmp/ex_dir/file*
2572 async -s="$S" cmd -o "/tmp/ex_dir/file$i" -- \
2573 bash -c "sleep 1 && echo test $i"
2574 done
2575
2576 6$ mkdir "/tmp/ex_dir"
2577 for i in {21..40}; do
2578 # redirects command output to /tmp/ex_dir/file*
2579 sem --id "$ID" --result '/tmp/ex_dir/file{=$_=""=}'"$i" \
2580 "sleep 1 && echo test $i"
2581 done
2582
2583 7$ sem --id "$ID" --wait
2584
2585 7$ async -s="$S" wait
2586
2587 8$ # stops server
2588 async -s="$S" server --stop
2589
2590 8$ # GNU Parallel does not need to stop a server
2591
2592 https://github.com/ctbur/async/ (Last checked: 2020-11)
2593
2594 DIFFERENCES BETWEEN pardi AND GNU Parallel
2595 Summary (see legend above):
2596
2597 I1 I2 - - - - I7
2598 M1 - - - - M6
2599 O1 O2 O3 O4 O5 - O7 - - O10
2600 E1 - - E4 - - -
2601 - - - - - - - - -
2602 - -
2603
2604 pardi is very similar to parallel --pipe --cat: It reads blocks of data
2605 and not arguments. So it cannot insert an argument in the command line.
2606 It puts the block into a temporary file, and this file name (%IN) can
2607 be put in the command line. You can only use %IN once.
2608
2609 It can also run full command lines in parallel (like: cat file |
2610 parallel).
2611
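The block/tempfile scheme can be sketched in plain shell. chunk_run is a
hypothetical helper; wc -l stands in for the real command, and blocks are
counted in lines rather than pardi's records:

```shell
# chunk_run N: read stdin in blocks of N lines, write each block to a
# temporary file and run a command on that file name -- a sketch of the
# %IN scheme described above ('wc -l' stands in for the real tool).
chunk_run() {
  n=$1; tmp=$(mktemp); count=0
  while IFS= read -r line; do
    printf '%s\n' "$line" >> "$tmp"
    count=$((count + 1))
    if [ "$count" -eq "$n" ]; then
      wc -l < "$tmp"              # the command sees a file name, not args
      : > "$tmp"; count=0
    fi
  done
  [ "$count" -gt 0 ] && wc -l < "$tmp"
  rm -f "$tmp"
}

printf 'a\nb\nc\n' | chunk_run 2
```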
2612 EXAMPLES FROM pardi test.sh
2613
2614 1$ time pardi -v -c 100 -i data/decoys.smi -ie .smi -oe .smi \
2615 -o data/decoys_std_pardi.smi \
2616 -w '(standardiser -i %IN -o %OUT 2>&1) > /dev/null'
2617
2618 1$ cat data/decoys.smi |
2619 time parallel -N 100 --pipe --cat \
2620 '(standardiser -i {} -o {#} 2>&1) > /dev/null; cat {#}; rm {#}' \
2621 > data/decoys_std_pardi.smi
2622
2623 2$ pardi -n 1 -i data/test_in.types -o data/test_out.types \
2624 -d 'r:^#atoms:' -w 'cat %IN > %OUT'
2625
2626 2$ cat data/test_in.types | parallel -n 1 -k --pipe --cat \
2627 --regexp --recstart '^#atoms' 'cat {}' > data/test_out.types
2628
2629 3$ pardi -c 6 -i data/test_in.types -o data/test_out.types \
2630 -d 'r:^#atoms:' -w 'cat %IN > %OUT'
2631
2632 3$ cat data/test_in.types | parallel -n 6 -k --pipe --cat \
2633 --regexp --recstart '^#atoms' 'cat {}' > data/test_out.types
2634
2635 4$ pardi -i data/decoys.mol2 -o data/still_decoys.mol2 \
2636 -d 's:@<TRIPOS>MOLECULE' -w 'cp %IN %OUT'
2637
2638 4$ cat data/decoys.mol2 |
2639 parallel -n 1 --pipe --cat --recstart '@<TRIPOS>MOLECULE' \
2640 'cp {} {#}; cat {#}; rm {#}' > data/still_decoys.mol2
2641
2642 5$ pardi -i data/decoys.mol2 -o data/decoys2.mol2 \
2643 -d b:10000 -w 'cp %IN %OUT' --preserve
2644
2645 5$ cat data/decoys.mol2 |
2646 parallel -k --pipe --block 10k --recend '' --cat \
2647 'cat {} > {#}; cat {#}; rm {#}' > data/decoys2.mol2
2648
2649 https://github.com/UnixJunkie/pardi (Last checked: 2021-01)
2650
2651 DIFFERENCES BETWEEN bthread AND GNU Parallel
2652 Summary (see legend above):
2653
2654 - - - I4 - - -
2655 - - - - - M6
2656 O1 - O3 - - - O7 O8 - -
2657 E1 - - - - - -
2658 - - - - - - - - -
2659 - -
2660
2661 bthread takes around 1 sec per MB of output. The maximal output line
2662 length is 1073741759.
2663
2664 You cannot quote space in the command, so you cannot run composed
2665 commands like sh -c "echo a; echo b".
2666
2667 https://gitlab.com/netikras/bthread (Last checked: 2021-01)
2668
2669 DIFFERENCES BETWEEN simple_gpu_scheduler AND GNU Parallel
2670 Summary (see legend above):
2671
2672 I1 - - - - - I7
2673 M1 - - - - M6
2674 - O2 O3 - - O6 - x x O10
2675 E1 - - - - - -
2676 - - - - - - - - -
2677 - -
2678
2679 EXAMPLES FROM simple_gpu_scheduler MANUAL
2680
2681 1$ simple_gpu_scheduler --gpus 0 1 2 < gpu_commands.txt
2682
2683 1$ parallel -j3 --shuf \
2684 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' < gpu_commands.txt
2685
2686 2$ simple_hypersearch "python3 train_dnn.py --lr {lr} --batch_size {bs}" \
2687 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 |
2688 simple_gpu_scheduler --gpus 0,1,2
2689
2690 2$ parallel --header : --shuf -j3 -v \
2691 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =}' \
2692 python3 train_dnn.py --lr {lr} --batch_size {bs} \
2693 ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
2694
2695 3$ simple_hypersearch \
2696 "python3 train_dnn.py --lr {lr} --batch_size {bs}" \
2697 --n-samples 5 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 |
2698 simple_gpu_scheduler --gpus 0,1,2
2699
2700 3$ parallel --header : --shuf \
2701 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1; seq() > 5 and skip() =}' \
2702 python3 train_dnn.py --lr {lr} --batch_size {bs} \
2703 ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
2704
2705 4$ touch gpu.queue
2706 tail -f -n 0 gpu.queue | simple_gpu_scheduler --gpus 0,1,2 &
2707 echo "my_command_with | and stuff > logfile" >> gpu.queue
2708
2709 4$ touch gpu.queue
2710 tail -f -n 0 gpu.queue |
2711 parallel -j3 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' &
2712 # Needed to fill job slots once
2713 seq 3 | parallel echo true >> gpu.queue
2714 # Add jobs
2715 echo "my_command_with | and stuff > logfile" >> gpu.queue
2716 # Needed to flush output from completed jobs
2717 seq 3 | parallel echo true >> gpu.queue
2718
2719 https://github.com/ExpectationMax/simple_gpu_scheduler (Last checked:
2720 2021-01)
2721
2722 DIFFERENCES BETWEEN parasweep AND GNU Parallel
2723 parasweep is a Python module for facilitating parallel parameter
2724 sweeps.
2725
2726 A parasweep job will normally take a text file as input. The text file
2727 contains arguments for the job. Some of these arguments will be fixed
2728 and some of them will be changed by parasweep.
2729
2730 It does this by having a template file such as template.txt:
2731
2732 Xval: {x}
2733 Yval: {y}
2734 FixedValue: 9
2735 # x with 2 decimals
2736 DecimalX: {x:.2f}
2737 TenX: ${x*10}
2738 RandomVal: {r}
2739
2740 and from this template it generates the file to be used by the job by
2741 replacing the replacement strings.
2742
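The substitution step can be sketched with sed, using hypothetical values
for x and y. This only handles plain {x}/{y} placeholders; parasweep's
format specifiers such as {x:.2f} need real format support:

```shell
# Fill a template by replacing the {x} and {y} placeholders (a simplified
# sketch of the substitution step described above).
cat > template.txt << 'EOF'
Xval: {x}
Yval: {y}
FixedValue: 9
EOF

x=1.5
y=2
sed -e "s/{x}/$x/g" -e "s/{y}/$y/g" template.txt > job1.txt
cat job1.txt
```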
2743 Being a Python module parasweep integrates tighter with Python than GNU
2744 parallel. You get the parameters directly in a Python data structure.
2745 With GNU parallel you can use the JSON or CSV output format to get
2746 something similar, but you would have to read the output.
2747
2748 parasweep has a filtering method to ignore parameter combinations you
2749 do not need.
2750
2751 Instead of calling the jobs directly, parasweep can use Python's
2752 Distributed Resource Management Application API to make jobs run with
2753 different cluster software.
2754
2755 GNU parallel --tmpl supports templates with replacement strings. Such
2756 as:
2757
2758 Xval: {x}
2759 Yval: {y}
2760 FixedValue: 9
2761 # x with 2 decimals
2762 DecimalX: {=x $_=sprintf("%.2f",$_) =}
2763 TenX: {=x $_=$_*10 =}
2764 RandomVal: {=1 $_=rand() =}
2765
2766 that can be used like:
2767
2768 parallel --header : --tmpl my.tmpl={#}.t myprog {#}.t \
2769 ::: x 1 2 3 ::: y 1 2 3
2770
2771 Filtering is supported as:
2772
2773 parallel --filter '{1} > {2}' echo ::: 1 2 3 ::: 1 2 3
2774
2775 https://github.com/eviatarbach/parasweep (Last checked: 2021-01)
2776
2777 DIFFERENCES BETWEEN parallel-bash AND GNU Parallel
2778 Summary (see legend above):
2779
2780 I1 I2 - - - - -
2781 - - M3 - - M6
2782 - O2 O3 - O5 O6 - O8 x O10
2783 E1 - - - - - -
2784 - - - - - - - - -
2785 - -
2786
2787 parallel-bash is written in pure bash. It is really fast (overhead of
2788 ~0.05 ms/job compared to GNU parallel's ~3 ms/job). So if your jobs are
2789 extremely short lived, and you can live with the quite limited command
2790 syntax, this may be useful.
2791
2792 It works by making a queue for each process. Then the jobs are
2793 distributed to the queues in a round robin fashion. Finally the queues
2794 are started in parallel. This works fine, if you are lucky, but if not,
2795 all the long jobs may end up in the same queue, so you may see:
2796
2797 $ printf "%b\n" 1 1 1 4 1 1 1 4 1 1 1 4 |
2798 time parallel -P4 sleep {}
2799 (7 seconds)
2800 $ printf "%b\n" 1 1 1 4 1 1 1 4 1 1 1 4 |
2801 time ./parallel-bash.bash -p 4 -c sleep {}
2802 (12 seconds)
2803
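The round-robin distribution described above can be sketched in plain
shell (an assumed simplification, not parallel-bash's actual code):

```shell
# Distribute jobs round robin into one queue file per worker, then run
# each queue sequentially, with all queues in parallel (a simplified
# model of the scheme described above).
N=2
rm -f queue.0 queue.1
i=0
printf 'echo a\necho b\necho c\necho d\n' |
while IFS= read -r job; do
  echo "$job" >> "queue.$((i % N))"
  i=$((i + 1))
done
for q in queue.0 queue.1; do
  sh "$q" &                      # each worker runs its queue in order
done
wait
```

This shows why unlucky distribution hurts: the queues are fixed up front,
so a worker cannot steal jobs from a slower queue.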
2804 Because it uses bash lists, the total number of jobs is limited to
2805 167000..265000 depending on your environment. You get a segmentation
2806 fault, when you reach the limit.
2807
2808 Ctrl-C does not stop spawning new jobs. Ctrl-Z does not suspend running
2809 jobs.
2810
2811 EXAMPLES FROM parallel-bash
2812
2813 1$ some_input | parallel-bash -p 5 -c echo
2814
2815 1$ some_input | parallel -j 5 echo
2816
2817 2$ parallel-bash -p 5 -c echo < some_file
2818
2819 2$ parallel -j 5 echo < some_file
2820
2821 3$ parallel-bash -p 5 -c echo <<< 'some string'
2822
2823 3$ parallel -j 5 echo <<< 'some string'
2824
2825 4$ something | parallel-bash -p 5 -c echo {} {}
2826
2827 4$ something | parallel -j 5 echo {} {}
2828
2829 https://reposhub.com/python/command-line-tools/Akianonymus-parallel-bash.html
2830 (Last checked: 2021-06)
2831
2832 DIFFERENCES BETWEEN bash-concurrent AND GNU Parallel
2833 bash-concurrent is more an alternative to make than to GNU parallel.
2834 Its input is very similar to a Makefile, where jobs depend on other
2835 jobs.
2836
2837 It has a nice progress indicator where you can see which jobs completed
2838 successfully, which jobs are currently running, which jobs failed, and
2839 which jobs were skipped due to a depending job failed. The indicator
2840 does not deal well with resizing the window.
2841
2842 Output is cached in tempfiles on disk, but is only shown if there is an
2843 error, so it is not meant to be part of a UNIX pipeline. If bash-
2844 concurrent crashes these tempfiles are not removed.
2845
2846 It uses an O(n*n) algorithm, so if you have 1000 independent jobs it
2847 takes 22 seconds to start it.
2848
2849 https://github.com/themattrix/bash-concurrent (Last checked: 2021-02)
2850
2851 DIFFERENCES BETWEEN spawntool AND GNU Parallel
2852 Summary (see legend above):
2853
2854 I1 - - - - - -
2855 M1 - - - - M6
2856 - O2 O3 - O5 O6 - x x O10
2857 E1 - - - - - -
2858 - - - - - - - - -
2859 - -
2860
2861 spawn reads full command lines from stdin and executes them in
2862 parallel.
2863
2864 http://code.google.com/p/spawntool/ (Last checked: 2021-07)
2865
DIFFERENCES BETWEEN go-pssh AND GNU Parallel
Summary (see legend above):

- - - - - - -
M1 - - - - -
O1 - - - - - - x x O10
E1 - - - - - -
R1 R2 - - - R6 - - -
- -

go-pssh does ssh in parallel to multiple machines. It runs the same
command on multiple machines, similar to --nonall.

Hosts must be given as IP addresses, not as hostnames.

Output is sent to stdout (standard output) if the command is
successful, and to stderr (standard error) if the command fails.

EXAMPLES FROM go-pssh

1$ go-pssh -l <ip>,<ip> -u <user> -p <port> -P <passwd> -c "<command>"

1$ parallel -S 'sshpass -p <passwd> ssh -p <port> <user>@<ip>' \
     --nonall "<command>"

2$ go-pssh scp -f host.txt -u <user> -p <port> -P <password> \
     -s /local/file_or_directory -d /remote/directory

2$ parallel --nonall --slf host.txt \
     --basefile /local/file_or_directory/./ --wd /remote/directory \
     --ssh 'sshpass -p <password> ssh -p <port> -l <user>' true

3$ go-pssh scp -l <ip>,<ip> -u <user> -p <port> -P <password> \
     -s /local/file_or_directory -d /remote/directory

3$ parallel --nonall -S <ip>,<ip> \
     --basefile /local/file_or_directory/./ --wd /remote/directory \
     --ssh 'sshpass -p <password> ssh -p <port> -l <user>' true

https://github.com/xuchenCN/go-pssh (Last checked: 2021-07)

DIFFERENCES BETWEEN go-parallel AND GNU Parallel
Summary (see legend above):

I1 I2 - - - - I7
- - M3 - - M6
- O2 O3 - O5 - - x x - O10
E1 - - E4 - - -
- - - - - - - - -
- -

go-parallel uses Go templates for replacement strings, quite similar
to the {= perl expr =} replacement string.

EXAMPLES FROM go-parallel

1$ go-parallel -a ./files.txt -t 'cp {{.Input}} {{.Input | dirname | dirname}}'

1$ parallel -a ./files.txt cp {} '{= $_=::dirname(::dirname($_)) =}'

2$ go-parallel -a ./files.txt -t 'mkdir -p {{.Input}} {{noExt .Input}}'

2$ parallel -a ./files.txt echo mkdir -p {} {.}

3$ go-parallel -a ./files.txt -t 'mkdir -p {{.Input}} {{.Input | basename | noExt}}'

3$ parallel -a ./files.txt echo mkdir -p {} {/.}

https://github.com/mylanconnolly/parallel (Last checked: 2021-07)

Todo
http://code.google.com/p/push/ (cannot compile)

https://github.com/krashanoff/parallel

https://github.com/Nukesor/pueue

https://arxiv.org/pdf/2012.15443.pdf KumQuat

https://arxiv.org/pdf/2007.09436.pdf PaSH: Light-touch Data-Parallel
Shell Processing

https://github.com/JeiKeiLim/simple_distribute_job

https://github.com/reggi/pkgrun - not obvious how to use

https://github.com/benoror/better-npm-run - not obvious how to use

https://github.com/bahmutov/with-package

https://github.com/flesler/parallel

https://github.com/Julian/Verge

https://manpages.ubuntu.com/manpages/xenial/man1/tsp.1.html

https://vicerveza.homeunix.net/~viric/soft/ts/

https://github.com/chapmanjacobd/que

TESTING OTHER TOOLS
There are certain issues that are very common in parallelizing tools.
Here are a few stress tests. Be warned: if the tool is badly coded it
may overload your machine.

MIX: Output mixes
Output from 2 jobs should not mix. If the output is not used, this does
not matter; but if the output is used then it is important that you do
not get half a line from one job followed by half a line from another
job.

If the tool does not buffer, output will most likely mix now and then.

This test stresses whether output mixes.

#!/bin/bash

paralleltool="parallel -j0"

cat <<-EOF > mycommand
#!/bin/bash

# If a, b, c, d, e, and f mix: Very bad
perl -e 'print STDOUT "a"x3000_000," "'
perl -e 'print STDERR "b"x3000_000," "'
perl -e 'print STDOUT "c"x3000_000," "'
perl -e 'print STDERR "d"x3000_000," "'
perl -e 'print STDOUT "e"x3000_000," "'
perl -e 'print STDERR "f"x3000_000," "'
echo
echo >&2
EOF
chmod +x mycommand

# Run 30 jobs in parallel
seq 30 |
$paralleltool ./mycommand > >(tr -s abcdef) 2> >(tr -s abcdef >&2)

# 'a c e' and 'b d f' should always stay together
# and there should only be a single line per job

STDERRMERGE: Stderr is merged with stdout
Output from stdout and stderr should not be merged, but kept separated.

This test shows whether stdout is mixed with stderr.

#!/bin/bash

paralleltool="parallel -j0"

cat <<-EOF > mycommand
#!/bin/bash

echo stdout
echo stderr >&2
echo stdout
echo stderr >&2
EOF
chmod +x mycommand

# Run one job
echo |
$paralleltool ./mycommand > stdout 2> stderr
cat stdout
cat stderr

RAM: Output limited by RAM
Some tools cache output in RAM. This makes them extremely slow if the
output is bigger than physical memory, and makes them crash if the
output is bigger than the virtual memory.

#!/bin/bash

paralleltool="parallel -j0"

cat <<'EOF' > mycommand
#!/bin/bash

# Generate 1 GB output
yes "`perl -e 'print \"c\"x30_000'`" | head -c 1G
EOF
chmod +x mycommand

# Run 20 jobs in parallel
# Adjust 20 to be > physical RAM and < free space on /tmp
seq 20 | time $paralleltool ./mycommand | wc -c

DISKFULL: Incomplete data if /tmp runs full
If caching is done on disk, the disk can run full during the run. Not
all programs discover this. GNU Parallel discovers it if the disk stays
full for at least 2 seconds.

#!/bin/bash

paralleltool="parallel -j0"

# This should be a dir with less than 100 GB free space
smalldisk=/tmp/shm/parallel

TMPDIR="$smalldisk"
export TMPDIR

max_output() {
    # Force worst case scenario:
    # Make GNU Parallel only check once per second
    sleep 10
    # Generate 100 GB to fill $TMPDIR
    # Adjust if /tmp is bigger than 100 GB
    yes | head -c 100G >$TMPDIR/$$
    # Generate 10 MB output that will not be buffered due to full disk
    perl -e 'print "X"x10_000_000' | head -c 10M
    echo This part is missing from incomplete output
    sleep 2
    rm $TMPDIR/$$
    echo Final output
}

export -f max_output
seq 10 | $paralleltool max_output | tr -s X

CLEANUP: Leaving tmp files at unexpected death
Some tools do not clean up their tmp files if they are killed. Tools
that buffer on disk may leave these files behind when killed.

#!/bin/bash

paralleltool=parallel

ls /tmp >/tmp/before
seq 10 | $paralleltool sleep &
pid=$!
# Give the tool time to start up
sleep 1
# Kill it without giving it a chance to clean up
kill -9 $pid
# Should be empty: No files should be left behind
diff <(ls /tmp) /tmp/before

SPCCHAR: Dealing badly with special file names
It is not uncommon for users to create files like:

My brother's 12" *** record (costs $$$).jpg

Some tools break on this.

#!/bin/bash

paralleltool=parallel

touch "My brother's 12\" *** record (costs \$\$\$).jpg"
ls My*jpg | $paralleltool ls -l

COMPOSED: Composed commands do not work
Some tools require you to wrap composed commands in bash -c.

echo bar | $paralleltool echo foo';' echo {}

ONEREP: Only one replacement string allowed
Some tools can only insert the argument once.

echo bar | $paralleltool echo {} foo {}

INPUTSIZE: Length of input should not be limited
Some tools limit the length of the input lines artificially with no
good reason. GNU parallel does not:

perl -e 'print "foo."."x"x100_000_000' | parallel echo {.}

GNU parallel limits the command to run to 128 KB due to execve(2):

perl -e 'print "x"x131_000' | parallel echo {} | wc

NUMWORDS: Speed depends on number of words
Some tools become very slow if output lines have many words.

#!/bin/bash

paralleltool=parallel

cat <<-EOF > mycommand
#!/bin/bash

# 10 MB of lines with 1000 words
yes "`seq 1000`" | head -c 10M
EOF
chmod +x mycommand

# Run 30 jobs in parallel
seq 30 | time $paralleltool -j0 ./mycommand > /dev/null

4GB: Output with a line > 4GB should be OK
#!/bin/bash

paralleltool="parallel -j0"

cat <<-EOF > mycommand
#!/bin/bash

perl -e '\$a="a"x1000_000; for(1..5000) { print \$a }'
EOF
chmod +x mycommand

# Run 1 job
seq 1 | $paralleltool ./mycommand | LC_ALL=C wc

AUTHOR
When using GNU parallel for a publication please cite:

O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
The USENIX Magazine, February 2011:42-47.

This helps funding further development; and it won't cost you a cent.
If you pay 10000 EUR you should feel free to use GNU Parallel without
citing.

Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk

Copyright (C) 2008-2010 Ole Tange, http://ole.tange.dk

Copyright (C) 2010-2022 Ole Tange, http://ole.tange.dk and Free
Software Foundation, Inc.

Parts of the manual concerning xargs compatibility are inspired by the
manual of xargs from GNU findutils 4.4.2.

LICENSE
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3 of the License, or at your
option any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program. If not, see <https://www.gnu.org/licenses/>.

Documentation license I
Permission is granted to copy, distribute and/or modify this
documentation under the terms of the GNU Free Documentation License,
Version 1.3 or any later version published by the Free Software
Foundation; with no Invariant Sections, with no Front-Cover Texts, and
with no Back-Cover Texts. A copy of the license is included in the
file LICENSES/GFDL-1.3-or-later.txt.

Documentation license II
You are free:

to Share to copy, distribute and transmit the work

to Remix to adapt the work

Under the following conditions:

Attribution
You must attribute the work in the manner specified by the
author or licensor (but not in any way that suggests that they
endorse you or your use of the work).

Share Alike
If you alter, transform, or build upon this work, you may
distribute the resulting work only under the same, similar or
a compatible license.

With the understanding that:

Waiver Any of the above conditions can be waived if you get
permission from the copyright holder.

Public Domain
Where the work or any of its elements is in the public domain
under applicable law, that status is in no way affected by the
license.

Other Rights
In no way are any of the following rights affected by the
license:

• Your fair dealing or fair use rights, or other applicable
  copyright exceptions and limitations;

• The author's moral rights;

• Rights other persons may have either in the work itself or
  in how the work is used, such as publicity or privacy
  rights.

Notice For any reuse or distribution, you must make clear to others
the license terms of this work.

A copy of the full license is included in the file
LICENCES/CC-BY-SA-4.0.txt.

DEPENDENCIES
GNU parallel uses Perl, and the Perl modules Getopt::Long, IPC::Open3,
Symbol, IO::File, POSIX, and File::Temp. For remote usage it also uses
rsync with ssh.

SEE ALSO
find(1), xargs(1), make(1), pexec(1), ppss(1), xjobs(1), prll(1),
dxargs(1), mdm(1)



20211222                         2022-01-22         PARALLEL_ALTERNATIVES(7)