PARALLEL_ALTERNATIVES(7)           parallel           PARALLEL_ALTERNATIVES(7)



NAME
       parallel_alternatives - Alternatives to GNU parallel


DIFFERENCES BETWEEN GNU Parallel AND ALTERNATIVES

       There are a lot of programs with some of the functionality of GNU
       parallel. GNU parallel strives to include the best of that
       functionality without sacrificing ease of use.

       parallel has existed since 2002 and as GNU parallel since 2010. A lot
       of the alternatives have not had the vitality to survive that long,
       but have come and gone during that time.

       GNU parallel is actively maintained with a new release every month
       since 2010. Most other alternatives are fleeting interests of their
       developers, with irregular releases, and are only maintained for a
       few years.

   SUMMARY TABLE
       The following features are in some of the comparable tools:

       Inputs
        I1. Arguments can be read from stdin
        I2. Arguments can be read from a file
        I3. Arguments can be read from multiple files
        I4. Arguments can be read from the command line
        I5. Arguments can be read from a table
        I6. Arguments can be read from the same file using #! (shebang)
        I7. Line oriented input as default (Quoting of special chars not
       needed)

       Manipulation of input
        M1. Composed command
        M2. Multiple arguments can fill up an execution line
        M3. Arguments can be put anywhere in the execution line
        M4. Multiple arguments can be put anywhere in the execution line
        M5. Arguments can be replaced with context
        M6. Input can be treated as the complete command line

       Outputs
        O1. Grouping output so output from different jobs do not mix
        O2. Send stderr (standard error) to stderr (standard error)
        O3. Send stdout (standard output) to stdout (standard output)
        O4. Order of output can be same as order of input
        O5. Stdout only contains stdout (standard output) from the command
        O6. Stderr only contains stderr (standard error) from the command
        O7. Buffering on disk
        O8. Cleanup of file if killed
        O9. Test if disk runs full during run
        O10. Output of a line bigger than 4 GB

       Execution
        E1. Running jobs in parallel
        E2. List running jobs
        E3. Finish running jobs, but do not start new jobs
        E4. Number of running jobs can depend on number of cpus
        E5. Finish running jobs, but do not start new jobs after first failure
        E6. Number of running jobs can be adjusted while running
        E7. Only spawn new jobs if load is less than a limit

       Remote execution
        R1. Jobs can be run on remote computers
        R2. Basefiles can be transferred
        R3. Argument files can be transferred
        R4. Result files can be transferred
        R5. Cleanup of transferred files
        R6. No config files needed
        R7. Do not run more than SSHD's MaxStartups can handle
        R8. Configurable SSH command
        R9. Retry if connection breaks occasionally

       Semaphore
        S1. Possibility to work as a mutex
        S2. Possibility to work as a counting semaphore

       Legend
        - = no
        x = not applicable
        ID = yes

       As not every new version of the programs is tested, the table may be
       outdated. Please file a bug report if you find errors (See REPORTING
       BUGS).

       parallel: I1 I2 I3 I4 I5 I6 I7 M1 M2 M3 M4 M5 M6 O1 O2 O3 O4 O5 O6 O7
       O8 O9 O10 E1 E2 E3 E4 E5 E6 E7 R1 R2 R3 R4 R5 R6 R7 R8 R9 S1 S2

       find -exec: -  -  -  x  -  x  - -  M2 M3 -  -  -  - -  O2 O3 O4 O5 O6 -
       -  -  -  -  -  - -  -  -  -  -  -  -  -  - x  x

       make -j: -  -  -  -  -  -  - -  -  -  -  -  - O1 O2 O3 -  x  O6 E1 -  -
       -  E5 - -  -  -  -  -  -  -  -  - -  -

   DIFFERENCES BETWEEN xargs AND GNU Parallel
       Summary table (see legend above): I1 I2 - - - - - - M2 M3 - - - - O2 O3
       - O5 O6 E1 - - - - - - - - - - - x - - - - -

       xargs offers some of the same possibilities as GNU parallel.

       xargs deals badly with special characters (such as space, \, ' and ").
       To see the problem try this:

         touch important_file
         touch 'not important_file'
         ls not* | xargs rm
         mkdir -p "My brother's 12\" records"
         ls | xargs rmdir
         touch 'c:\windows\system32\clfs.sys'
         echo 'c:\windows\system32\clfs.sys' | xargs ls -l

       You can specify -0, but many input generators are not optimized for
       using NUL as separator but are optimized for newline as separator.
       E.g. awk, ls, echo, tar -v, head (requires using -z), tail (requires
       using -z), sed (requires using -z), perl (-0 and \0 instead of \n),
       locate (requires using -0), find (requires using -print0), grep
       (requires using -z or -Z), sort (requires using -z).

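       The pitfall can be demonstrated with plain xargs and find, no GNU
       parallel needed (a small sketch using a throwaway mktemp directory):
       a filename containing a newline is silently split when newline is the
       separator, but survives NUL separation.

```shell
# Scratch dir with one normal file and one whose name contains a newline
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/"$'evil\nname.txt'

# Newline-separated: the embedded newline splits one filename in two,
# so xargs sees 3 arguments for 2 files
find "$dir" -type f | xargs -n1 echo | wc -l              # 3

# NUL-separated: both filenames arrive intact
find "$dir" -type f -print0 | xargs -0 -n1 echo | wc -l   # 2

rm -rf "$dir"
```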
       GNU parallel's newline separation can be emulated with:

         cat | xargs -d "\n" -n1 command

       xargs can run a given number of jobs in parallel, but has no support
       for running number-of-cpu-cores jobs in parallel.

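       A common workaround is to query nproc(1) from GNU coreutils at
       invocation time; a sketch (Linux-flavoured, since nproc is assumed):

```shell
# Emulate "one job per CPU core": ask nproc once at startup.
# Unlike GNU parallel's default, this cannot adapt while jobs are running.
seq 1 6 | xargs -P "$(nproc)" -n1 -I{} sh -c 'echo got {}' | sort
```

       The sort is only there to make the otherwise nondeterministic
       completion order readable.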
       xargs has no support for grouping the output, therefore output may run
       together, e.g. the first half of a line is from one process and the
       last half of the line is from another process. The example Parallel
       grep cannot be done reliably with xargs because of this. To see this
       in action try:

         parallel perl -e '\$a=\"1\".\"{}\"x10000000\;print\ \$a,\"\\n\"' \
           '>' {} ::: a b c d e f g h
         # Serial = no mixing = the wanted result
         # 'tr -s a-z' squeezes repeating letters into a single letter
         echo a b c d e f g h | xargs -P1 -n1 grep 1 | tr -s a-z
         # Compare to 8 jobs in parallel
         parallel -kP8 -n1 grep 1 ::: a b c d e f g h | tr -s a-z
         echo a b c d e f g h | xargs -P8 -n1 grep 1 | tr -s a-z
         echo a b c d e f g h | xargs -P8 -n1 grep --line-buffered 1 | \
           tr -s a-z

       Or try this:

         slow_seq() {
           echo Count to "$@"
           seq "$@" |
             perl -ne '$|=1; for(split//){ print; select($a,$a,$a,0.100);}'
         }
         export -f slow_seq
         # Serial = no mixing = the wanted result
         seq 8 | xargs -n1 -P1 -I {} bash -c 'slow_seq {}'
         # Compare to 8 jobs in parallel
         seq 8 | parallel -P8 slow_seq {}
         seq 8 | xargs -n1 -P8 -I {} bash -c 'slow_seq {}'

       xargs has no support for keeping the order of the output, therefore
       if running jobs in parallel using xargs the output of the second job
       cannot be postponed till the first job is done.

       xargs has no support for running jobs on remote computers.

       xargs has no support for context replace, so you will have to create
       the arguments.

       If you use a replace string in xargs (-I) you cannot force xargs to
       use more than one argument.

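       This limitation is easy to see with stock xargs: -n packs several
       arguments into each invocation, but -I switches to exactly one input
       line per command. A sketch:

```shell
# -n2 packs two arguments into each invocation:
printf '%s\n' a b c d | xargs -n2 echo pair:
# pair: a b
# pair: c d

# -I forces one input line per invocation; packing is no longer possible:
printf '%s\n' a b c d | xargs -I XX echo got: XX
# got: a
# got: b
# got: c
# got: d
```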
       Quoting in xargs works like -q in GNU parallel. This means composed
       commands and redirection require using bash -c.

         ls | parallel "wc {} >{}.wc"
         ls | parallel "echo {}; ls {}|wc"

       becomes (assuming you have 8 cores and that none of the filenames
       contain space, " or ').

         ls | xargs -d "\n" -P8 -I {} bash -c "wc {} >{}.wc"
         ls | xargs -d "\n" -P8 -I {} bash -c "echo {}; ls {}|wc"

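       That the wrapper really is needed for composed commands can be
       checked without GNU parallel installed; in this sketch the pipe
       inside the job only works because bash -c interprets it:

```shell
# xargs itself never interprets '|'; the inner shell does.
printf '%s\n' hi there | xargs -d "\n" -I {} bash -c 'echo {} | wc -c'
# 3
# 6
```

       Substituting {} straight into the bash -c string is exactly the
       quoting hazard described earlier; it is only safe here because the
       inputs are plain words.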
       https://www.gnu.org/software/findutils/

   DIFFERENCES BETWEEN find -exec AND GNU Parallel
       find -exec offers some of the same possibilities as GNU parallel.

       find -exec only works on files. Processing other input (such as hosts
       or URLs) will require creating these inputs as files. find -exec has
       no support for running commands in parallel.

       https://www.gnu.org/software/findutils/ (Last checked: 2019-01)

   DIFFERENCES BETWEEN make -j AND GNU Parallel
       make -j can run jobs in parallel, but requires a crafted Makefile to
       do this. That results in extra quoting to get filenames containing
       newlines to work correctly.

       make -j computes a dependency graph before running jobs. Jobs run by
       GNU parallel do not depend on each other.

       (Very early versions of GNU parallel were coincidentally implemented
       using make -j.)

       https://www.gnu.org/software/make/ (Last checked: 2019-01)

   DIFFERENCES BETWEEN ppss AND GNU Parallel
       Summary table (see legend above): I1 I2 - - - - I7 M1 - M3 - - M6 O1 -
       - x - - E1 E2 ?E3 E4 - - - R1 R2 R3 R4 - - ?R7 ? ?  - -

       ppss is also a tool for running jobs in parallel.

       The output of ppss is status information and thus not useful as input
       for another command. The output from the jobs is put into files.

       The argument replace string ($ITEM) cannot be changed. Arguments must
       be quoted - thus arguments containing special characters (space '"&!*)
       may cause problems. More than one argument is not supported. Filenames
       containing newlines are not processed correctly. When reading input
       from a file null cannot be used as a terminator. ppss needs to read
       the whole input file before starting any jobs.

       Output and status information is stored in ppss_dir and thus requires
       cleanup when completed. If the dir is not removed before running ppss
       again it may cause nothing to happen as ppss thinks the task is
       already done. GNU parallel will normally not need cleaning up if
       running locally and will only need cleaning up if stopped abnormally
       and running remote (--cleanup may not complete if stopped abnormally).
       The example Parallel grep would require extra postprocessing if
       written using ppss.

       For remote systems PPSS requires 3 steps: config, deploy, and start.
       GNU parallel only requires one step.

       EXAMPLES FROM ppss MANUAL

       Here are the examples from ppss's manual page with the equivalent
       using GNU parallel:

         1$ ./ppss.sh standalone -d /path/to/files -c 'gzip '

         1$ find /path/to/files -type f | parallel gzip

         2$ ./ppss.sh standalone -d /path/to/files -c 'cp "$ITEM" /destination/dir '

         2$ find /path/to/files -type f | parallel cp {} /destination/dir

         3$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q '

         3$ parallel -a list-of-urls.txt wget -q

         4$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q "$ITEM"'

         4$ parallel -a list-of-urls.txt wget -q {}

         5$ ./ppss config -C config.cfg -c 'encode.sh ' -d /source/dir \
              -m 192.168.1.100 -u ppss -k ppss-key.key -S ./encode.sh \
              -n nodes.txt -o /some/output/dir --upload --download;
            ./ppss deploy -C config.cfg
            ./ppss start -C config

         5$ # parallel does not use configs. If you want a different
            # username put it in nodes.txt: user@hostname
            find source/dir -type f |
              parallel --sshloginfile nodes.txt --trc {.}.mp3 lame -a {} \
                -o {.}.mp3 --preset standard --quiet

         6$ ./ppss stop -C config.cfg

         6$ killall -TERM parallel

         7$ ./ppss pause -C config.cfg

         7$ Press: CTRL-Z or killall -SIGTSTP parallel

         8$ ./ppss continue -C config.cfg

         8$ Enter: fg or killall -SIGCONT parallel

         9$ ./ppss.sh status -C config.cfg

         9$ killall -SIGUSR2 parallel

       https://github.com/louwrentius/PPSS

   DIFFERENCES BETWEEN pexec AND GNU Parallel
       Summary table (see legend above): I1 I2 - I4 I5 - - M1 - M3 - - M6 O1
       O2 O3 - O5 O6 E1 - - E4 - E6 - R1 - - - - R6 - - - S1 -

       pexec is also a tool for running jobs in parallel.

       EXAMPLES FROM pexec MANUAL

       Here are the examples from pexec's info page with the equivalent
       using GNU parallel:

         1$ pexec -o sqrt-%s.dat -p "$(seq 10)" -e NUM -n 4 -c -- \
              'echo "scale=10000;sqrt($NUM)" | bc'

         1$ seq 10 | parallel -j4 'echo "scale=10000;sqrt({})" | \
              bc > sqrt-{}.dat'

         2$ pexec -p "$(ls myfiles*.ext)" -i %s -o %s.sort -- sort

         2$ ls myfiles*.ext | parallel sort {} ">{}.sort"

         3$ pexec -f image.list -n auto -e B -u star.log -c -- \
              'fistar $B.fits -f 100 -F id,x,y,flux -o $B.star'

         3$ parallel -a image.list \
              'fistar {}.fits -f 100 -F id,x,y,flux -o {}.star' 2>star.log

         4$ pexec -r *.png -e IMG -c -o - -- \
              'convert $IMG ${IMG%.png}.jpeg ; "echo $IMG: done"'

         4$ ls *.png | parallel 'convert {} {.}.jpeg; echo {}: done'

         5$ pexec -r *.png -i %s -o %s.jpg -c 'pngtopnm | pnmtojpeg'

         5$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {}.jpg'

         6$ for p in *.png ; do echo ${p%.png} ; done | \
              pexec -f - -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'

         6$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'

         7$ LIST=$(for p in *.png ; do echo ${p%.png} ; done)
            pexec -r $LIST -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'

         7$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'

         8$ pexec -n 8 -r *.jpg -y unix -e IMG -c \
              'pexec -j -m blockread -d $IMG | \
               jpegtopnm | pnmscale 0.5 | pnmtojpeg | \
               pexec -j -m blockwrite -s th_$IMG'

         8$ # Combining GNU parallel and GNU sem.
            ls *jpg | parallel -j8 'sem --id blockread cat {} | jpegtopnm |' \
              'pnmscale 0.5 | pnmtojpeg | sem --id blockwrite cat > th_{}'

            # If reading and writing is done to the same disk, this may be
            # faster as only one process will be either reading or writing:
            ls *jpg | parallel -j8 'sem --id diskio cat {} | jpegtopnm |' \
              'pnmscale 0.5 | pnmtojpeg | sem --id diskio cat > th_{}'

       https://www.gnu.org/software/pexec/

   DIFFERENCES BETWEEN xjobs AND GNU Parallel
       xjobs is also a tool for running jobs in parallel. It only supports
       running jobs on your local computer.

       xjobs deals badly with special characters just like xargs. See the
       section DIFFERENCES BETWEEN xargs AND GNU Parallel.

       EXAMPLES FROM xjobs MANUAL

       Here are the examples from xjobs's man page with the equivalent using
       GNU parallel:

         1$ ls -1 *.zip | xjobs unzip

         1$ ls *.zip | parallel unzip

         2$ ls -1 *.zip | xjobs -n unzip

         2$ ls *.zip | parallel unzip >/dev/null

         3$ find . -name '*.bak' | xjobs gzip

         3$ find . -name '*.bak' | parallel gzip

         4$ ls -1 *.jar | sed 's/\(.*\)/\1 > \1.idx/' | xjobs jar tf

         4$ ls *.jar | parallel jar tf {} '>' {}.idx

         5$ xjobs -s script

         5$ cat script | parallel

         6$ mkfifo /var/run/my_named_pipe;
            xjobs -s /var/run/my_named_pipe &
            echo unzip 1.zip >> /var/run/my_named_pipe;
            echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe

         6$ mkfifo /var/run/my_named_pipe;
            cat /var/run/my_named_pipe | parallel &
            echo unzip 1.zip >> /var/run/my_named_pipe;
            echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe

       http://www.maier-komor.de/xjobs.html (Last checked: 2019-01)

   DIFFERENCES BETWEEN prll AND GNU Parallel
       prll is also a tool for running jobs in parallel. It does not support
       running jobs on remote computers.

       prll encourages using BASH aliases and BASH functions instead of
       scripts. GNU parallel supports scripts directly, functions if they
       are exported using export -f, and aliases if using env_parallel.

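       The export -f mechanism that GNU parallel relies on can be sketched
       with plain bash and xargs (shout is a made-up example function): an
       exported function is visible in every child bash, which is how
       parallel can run exported functions in its job shells.

```shell
# Export a bash function so that child bash processes inherit it
shout() { echo "HI $1"; }
export -f shout

# Each job is a fresh 'bash -c', yet the function is visible there
printf '%s\n' alice bob | xargs -n1 -I{} bash -c 'shout {}'
# HI alice
# HI bob
```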
       prll generates a lot of status information on stderr (standard error)
       which makes it harder to use the stderr (standard error) output of
       the job directly as input for another program.

       EXAMPLES FROM prll's MANUAL

       Here is the example from prll's man page with the equivalent using
       GNU parallel:

         1$ prll -s 'mogrify -flip $1' *.jpg

         1$ parallel mogrify -flip ::: *.jpg

       https://github.com/exzombie/prll (Last checked: 2019-01)

   DIFFERENCES BETWEEN dxargs AND GNU Parallel
       dxargs is also a tool for running jobs in parallel.

       dxargs does not deal well with more simultaneous jobs than SSHD's
       MaxStartups. dxargs is only built for running jobs remotely, but does
       not support transferring files.

       https://web.archive.org/web/20120518070250/http://www.
       semicomplete.com/blog/geekery/distributed-xargs.html (Last checked:
       2019-01)

   DIFFERENCES BETWEEN mdm/middleman AND GNU Parallel
       middleman (mdm) is also a tool for running jobs in parallel.

       EXAMPLES FROM middleman's WEBSITE

       Here are the shell scripts of
       https://web.archive.org/web/20110728064735/http://mdm.
       berlios.de/usage.html ported to GNU parallel:

         1$ seq 19 | parallel buffon -o - | sort -n > result
            cat files | parallel cmd
            find dir -execdir sem cmd {} \;

       https://github.com/cklin/mdm (Last checked: 2019-01)

   DIFFERENCES BETWEEN xapply AND GNU Parallel
       xapply can run jobs in parallel on the local computer.

       EXAMPLES FROM xapply's MANUAL

       Here are the examples from xapply's man page with the equivalent
       using GNU parallel:

         1$ xapply '(cd %1 && make all)' */

         1$ parallel 'cd {} && make all' ::: */

         2$ xapply -f 'diff %1 ../version5/%1' manifest | more

         2$ parallel diff {} ../version5/{} < manifest | more

         3$ xapply -p/dev/null -f 'diff %1 %2' manifest1 checklist1

         3$ parallel --link diff {1} {2} :::: manifest1 checklist1

         4$ xapply 'indent' *.c

         4$ parallel indent ::: *.c

         5$ find ~ksb/bin -type f ! -perm -111 -print | \
              xapply -f -v 'chmod a+x' -

         5$ find ~ksb/bin -type f ! -perm -111 -print | \
              parallel -v chmod a+x

         6$ find */ -... | fmt 960 1024 | xapply -f -i /dev/tty 'vi' -

         6$ sh <(find */ -... | parallel -s 1024 echo vi)

         6$ find */ -... | parallel -s 1024 -Xuj1 vi

         7$ find ... | xapply -f -5 -i /dev/tty 'vi' - - - - -

         7$ sh <(find ... | parallel -n5 echo vi)

         7$ find ... | parallel -n5 -uj1 vi

         8$ xapply -fn "" /etc/passwd

         8$ parallel -k echo < /etc/passwd

         9$ tr ':' '\012' < /etc/passwd | \
              xapply -7 -nf 'chown %1 %6' - - - - - - -

         9$ tr ':' '\012' < /etc/passwd | parallel -N7 chown {1} {6}

         10$ xapply '[ -d %1/RCS ] || echo %1' */

         10$ parallel '[ -d {}/RCS ] || echo {}' ::: */

         11$ xapply -f '[ -f %1 ] && echo %1' List | ...

         11$ parallel '[ -f {} ] && echo {}' < List | ...

       https://web.archive.org/web/20160702211113/
       http://carrera.databits.net/~ksb/msrc/local/bin/xapply/xapply.html

   DIFFERENCES BETWEEN AIX apply AND GNU Parallel
       apply can build command lines based on a template and arguments -
       very much like GNU parallel. apply does not run jobs in parallel.
       apply does not use an argument separator (like :::); instead the
       template must be the first argument.

       EXAMPLES FROM IBM's KNOWLEDGE CENTER

       Here are the examples from IBM's Knowledge Center and the
       corresponding command using GNU parallel:

       To obtain results similar to those of the ls command, enter:

         1$ apply echo *
         1$ parallel echo ::: *

       To compare the file named a1 to the file named b1, and the file named
       a2 to the file named b2, enter:

         2$ apply -2 cmp a1 b1 a2 b2
         2$ parallel -N2 cmp ::: a1 b1 a2 b2

       To run the who command five times, enter:

         3$ apply -0 who 1 2 3 4 5
         3$ parallel -N0 who ::: 1 2 3 4 5

       To link all files in the current directory to the directory /usr/joe,
       enter:

         4$ apply 'ln %1 /usr/joe' *
         4$ parallel ln {} /usr/joe ::: *

       https://www-01.ibm.com/support/knowledgecenter/
       ssw_aix_71/com.ibm.aix.cmds1/apply.htm (Last checked: 2019-01)

   DIFFERENCES BETWEEN paexec AND GNU Parallel
       paexec can run jobs in parallel on both the local and remote
       computers.

       paexec requires commands to print a blank line as the last output.
       This means you will have to write a wrapper for most programs.

       paexec has a job dependency facility so a job can depend on another
       job to be executed successfully. Sort of a poor man's make.

       EXAMPLES FROM paexec's EXAMPLE CATALOG

       Here are the examples from paexec's example catalog with the
       equivalent using GNU parallel:

       1_div_X_run

         1$ ../../paexec -s -l -c "`pwd`/1_div_X_cmd" -n +1 <<EOF [...]

         1$ parallel echo {} '|' `pwd`/1_div_X_cmd <<EOF [...]

       all_substr_run

         2$ ../../paexec -lp -c "`pwd`/all_substr_cmd" -n +3 <<EOF [...]

         2$ parallel echo {} '|' `pwd`/all_substr_cmd <<EOF [...]

       cc_wrapper_run

         3$ ../../paexec -c "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
                    -n 'host1 host2' \
                    -t '/usr/bin/ssh -x' <<EOF [...]

         3$ parallel echo {} '|' "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
                    -S host1,host2 <<EOF [...]

            # This is not exactly the same, but avoids the wrapper
            parallel gcc -O2 -c -o {.}.o {} \
                    -S host1,host2 <<EOF [...]

       toupper_run

         4$ ../../paexec -lp -c "`pwd`/toupper_cmd" -n +10 <<EOF [...]

         4$ parallel echo {} '|' ./toupper_cmd <<EOF [...]

            # Without the wrapper:
            parallel echo {} '| awk {print\ toupper\(\$0\)}' <<EOF [...]

       https://github.com/cheusov/paexec

   DIFFERENCES BETWEEN map(sitaramc) AND GNU Parallel
       Summary table (see legend above): I1 - - I4 - - (I7) M1 (M2) M3 (M4) M5
       M6 - O2 O3 - O5 - - N/A N/A O10 E1 - - - - - - - - - - - - - - - - -

       (I7): Only under special circumstances. See below.

       (M2+M4): Only if there is a single replacement string.

       map rejects input with special characters:

         echo "The Cure" > My\ brother\'s\ 12\"\ records

         ls | map 'echo %; wc %'

       It works with GNU parallel:

         ls | parallel 'echo {}; wc {}'

       Under some circumstances it also works with map:

         ls | map 'echo % works %'

       But tiny changes make it reject the input with special characters:

         ls | map 'echo % does not work "%"'

       This means that many UTF-8 characters will be rejected. This is by
       design. From the web page: "As such, programs that quietly handle
       them, with no warnings at all, are doing their users a disservice."

       map delays each job by 0.01 s. This can be emulated by using
       parallel --delay 0.01.

       map prints '+' on stderr when a job starts, and '-' when a job
       finishes. This cannot be disabled. parallel has --bar if you need to
       see progress.

       map's replacement strings (% %D %B %E) can be simulated in GNU
       parallel by putting this in ~/.parallel/config:

         --rpl '%'
         --rpl '%D $_=Q(::dirname($_));'
         --rpl '%B s:.*/::;s:\.[^/.]+$::;'
         --rpl '%E s:.*\.::'

       map does not have an argument separator on the command line, but uses
       the first argument as the command. This makes quoting harder, which
       again may affect readability. Compare:

         map -p 2 'perl -ne '"'"'/^\S+\s+\S+$/ and print $ARGV,"\n"'"'" *

         parallel -q perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' ::: *

       map can do multiple arguments with context replace, but not without
       context replace:

         parallel --xargs echo 'BEGIN{'{}'}END' ::: 1 2 3

         map "echo 'BEGIN{'%'}END'" 1 2 3

       map has no support for grouping. So this gives the wrong results:

         parallel perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} \
           ::: a b c d e f
         ls -l a b c d e f
         parallel -kP4 -n1 grep 1 ::: a b c d e f > out.par
         map -n1 -p 4 'grep 1' a b c d e f > out.map-unbuf
         map -n1 -p 4 'grep --line-buffered 1' a b c d e f > out.map-linebuf
         map -n1 -p 1 'grep --line-buffered 1' a b c d e f > out.map-serial
         ls -l out*
         md5sum out*

       EXAMPLES FROM map's WEBSITE

       Here are the examples from map's web page with the equivalent using
       GNU parallel:

         1$ ls *.gif | map convert % %B.png         # default max-args: 1

         1$ ls *.gif | parallel convert {} {.}.png

         2$ map "mkdir %B; tar -C %B -xf %" *.tgz   # default max-args: 1

         2$ parallel 'mkdir {.}; tar -C {.} -xf {}' ::: *.tgz

         3$ ls *.gif | map cp % /tmp                # default max-args: 100

         3$ ls *.gif | parallel -X cp {} /tmp

         4$ ls *.tar | map -n 1 tar -xf %

         4$ ls *.tar | parallel tar -xf

         5$ map "cp % /tmp" *.tgz

         5$ parallel cp {} /tmp ::: *.tgz

         6$ map "du -sm /home/%/mail" alice bob carol

         6$ parallel "du -sm /home/{}/mail" ::: alice bob carol
            # or if you prefer running a single job with multiple args:
         6$ parallel -Xj1 "du -sm /home/{}/mail" ::: alice bob carol

         7$ cat /etc/passwd | map -d: 'echo user %1 has shell %7'

         7$ cat /etc/passwd | parallel --colsep : \
              'echo user {1} has shell {7}'

         8$ export MAP_MAX_PROCS=$(( `nproc` / 2 ))

         8$ export PARALLEL=-j50%

       https://github.com/sitaramc/map (Last checked: 2020-05)

   DIFFERENCES BETWEEN ladon AND GNU Parallel
       ladon can run multiple jobs on files in parallel.

       ladon only works on files and the only way to specify files is using
       a quoted glob string (such as \*.jpg). It is not possible to list the
       files manually.

       As replacement strings it uses FULLPATH DIRNAME BASENAME EXT RELDIR
       RELPATH.

       These can be simulated using GNU parallel by putting this in
       ~/.parallel/config:

           --rpl 'FULLPATH $_=Q($_);chomp($_=qx{readlink -f $_});'
           --rpl 'DIRNAME $_=Q(::dirname($_));chomp($_=qx{readlink -f $_});'
           --rpl 'BASENAME s:.*/::;s:\.[^/.]+$::;'
           --rpl 'EXT s:.*\.::'
           --rpl 'RELDIR $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
                  s:\Q$c/\E::;$_=::dirname($_);'
           --rpl 'RELPATH $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
                  s:\Q$c/\E::;'

       ladon deals badly with filenames containing " and newline, and it
       fails for output larger than 200k:

           ladon '*' -- seq 36000 | wc

       EXAMPLES FROM ladon MANUAL

       It is assumed that the '--rpl's above are put in ~/.parallel/config
       and that it is run under a shell that supports '**' globbing (such as
       zsh):

         1$ ladon "**/*.txt" -- echo RELPATH

         1$ parallel echo RELPATH ::: **/*.txt

         2$ ladon "~/Documents/**/*.pdf" -- shasum FULLPATH >hashes.txt

         2$ parallel shasum FULLPATH ::: ~/Documents/**/*.pdf >hashes.txt

         3$ ladon -m thumbs/RELDIR "**/*.jpg" -- convert FULLPATH \
              -thumbnail 100x100^ -gravity center -extent 100x100 \
              thumbs/RELPATH

         3$ parallel mkdir -p thumbs/RELDIR\; convert FULLPATH \
              -thumbnail 100x100^ -gravity center -extent 100x100 \
              thumbs/RELPATH ::: **/*.jpg

         4$ ladon "~/Music/*.wav" -- lame -V 2 FULLPATH DIRNAME/BASENAME.mp3

         4$ parallel lame -V 2 FULLPATH DIRNAME/BASENAME.mp3 ::: ~/Music/*.wav

       https://github.com/danielgtaylor/ladon (Last checked: 2019-01)

   DIFFERENCES BETWEEN jobflow AND GNU Parallel
       jobflow can run multiple jobs in parallel.

       Just like xargs, output from jobflow jobs running in parallel mixes
       together by default. jobflow can buffer into files (placed in
       /run/shm), but these are not cleaned up if jobflow dies unexpectedly
       (e.g. by Ctrl-C). If the total output is big (in the order of
       RAM+swap) it can cause the system to slow to a crawl and eventually
       run out of memory.

       jobflow gives no error if the command is unknown, and, like xargs,
       redirection and composed commands require wrapping with bash -c.

       Input lines can at most be 4096 bytes. You can at most have 16 {}'s
       in the command template. More than that either crashes the program or
       simply does not execute the command.

       jobflow has no equivalent for --pipe or --sshlogin.

       jobflow makes it possible to set resource limits on the running jobs.
       This can be emulated by GNU parallel using bash's ulimit:

         jobflow -limits=mem=100M,cpu=3,fsize=20M,nofiles=300 myjob

         parallel 'ulimit -v 102400 -t 3 -f 204800 -n 300; myjob'

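       Because ulimit is per-process, the limits set inside one job's shell
       stay inside that job; a quick check (assumes bash and that the hard
       CPU limit allows 3 seconds):

```shell
# The child shell caps its own CPU time; the parent shell is unaffected
bash -c 'ulimit -t 3; echo "child CPU limit: $(ulimit -t)"'
echo "parent CPU limit: $(ulimit -t)"
```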
       EXAMPLES FROM jobflow README

         1$ cat things.list | jobflow -threads=8 -exec ./mytask {}

         1$ cat things.list | parallel -j8 ./mytask {}

         2$ seq 100 | jobflow -threads=100 -exec echo {}

         2$ seq 100 | parallel -j100 echo {}

         3$ cat urls.txt | jobflow -threads=32 -exec wget {}

         3$ cat urls.txt | parallel -j32 wget {}

         4$ find . -name '*.bmp' | \
              jobflow -threads=8 -exec bmp2jpeg {.}.bmp {.}.jpg

         4$ find . -name '*.bmp' | \
              parallel -j8 bmp2jpeg {.}.bmp {.}.jpg

       https://github.com/rofl0r/jobflow

802   DIFFERENCES BETWEEN gargs AND GNU Parallel
803       gargs can run multiple jobs in parallel.
804
805       Older versions cache output in memory. This causes it to be extremely
806       slow when the output is larger than the physical RAM, and can cause the
807       system to run out of memory.
808
809       See more details on this in man parallel_design.
810
       Newer versions cache output in files, but leave the files in $TMPDIR
       if gargs is killed.
813
814       Output to stderr (standard error) is changed if the command fails.
815
816       EXAMPLES FROM gargs WEBSITE
817
818         1$ seq 12 -1 1 | gargs -p 4 -n 3 "sleep {0}; echo {1} {2}"
819
820         1$ seq 12 -1 1 | parallel -P 4 -n 3 "sleep {1}; echo {2} {3}"
821
822         2$ cat t.txt | gargs --sep "\s+" \
823              -p 2 "echo '{0}:{1}-{2}' full-line: \'{}\'"
824
825         2$ cat t.txt | parallel --colsep "\\s+" \
826              -P 2 "echo '{1}:{2}-{3}' full-line: \'{}\'"
827
828       https://github.com/brentp/gargs
829
830   DIFFERENCES BETWEEN orgalorg AND GNU Parallel
831       orgalorg can run the same job on multiple machines. This is related to
832       --onall and --nonall.
833
834       orgalorg supports entering the SSH password - provided it is the same
835       for all servers. GNU parallel advocates using ssh-agent instead, but it
836       is possible to emulate orgalorg's behavior by setting SSHPASS and by
837       using --ssh "sshpass ssh".
838
839       To make the emulation easier, make a simple alias:
840
841         alias par_emul="parallel -j0 --ssh 'sshpass ssh' --nonall --tag --lb"
842
843       If you want to supply a password run:
844
845         SSHPASS=`ssh-askpass`
846
847       or set the password directly:
848
         SSHPASS='P4$$w0rd!'
850
851       If the above is set up you can then do:
852
853         orgalorg -o frontend1 -o frontend2 -p -C uptime
854         par_emul -S frontend1 -S frontend2 uptime
855
856         orgalorg -o frontend1 -o frontend2 -p -C top -bid 1
857         par_emul -S frontend1 -S frontend2 top -bid 1
858
859         orgalorg -o frontend1 -o frontend2 -p -er /tmp -n \
860           'md5sum /tmp/bigfile' -S bigfile
861         par_emul -S frontend1 -S frontend2 --basefile bigfile \
862           --workdir /tmp md5sum /tmp/bigfile
863
       orgalorg has a progress indicator for transferring a file. GNU
       parallel does not.
866
867       https://github.com/reconquest/orgalorg
868
869   DIFFERENCES BETWEEN Rust parallel AND GNU Parallel
       Rust parallel focuses on speed. It is almost as fast as xargs. It
       implements a few features from GNU parallel, but lacks many others.
       All of these fail:
873
874         # Read arguments from file
875         parallel -a file echo
876         # Changing the delimiter
877         parallel -d _ echo ::: a_b_c_
878
       These behave differently from GNU parallel:
880
881         # -q to protect quoted $ and space
882         parallel -q perl -e '$a=shift; print "$a"x10000000' ::: a b c
883         # Generation of combination of inputs
884         parallel echo {1} {2} ::: red green blue ::: S M L XL XXL
885         # {= perl expression =} replacement string
886         parallel echo '{= s/new/old/ =}' ::: my.new your.new
887         # --pipe
888         seq 100000 | parallel --pipe wc
889         # linked arguments
890         parallel echo ::: S M L :::+ sml med lrg ::: R G B :::+ red grn blu
891         # Run different shell dialects
892         zsh -c 'parallel echo \={} ::: zsh && true'
893         csh -c 'parallel echo \$\{\} ::: shell && true'
894         bash -c 'parallel echo \$\({}\) ::: pwd && true'
895         # Rust parallel does not start before the last argument is read
896         (seq 10; sleep 5; echo 2) | time parallel -j2 'sleep 2; echo'
897         tail -f /var/log/syslog | parallel echo
898
899       Most of the examples from the book GNU Parallel 2018 do not work, thus
900       Rust parallel is not close to being a compatible replacement.
901
902       Rust parallel has no remote facilities.
903
       It uses /tmp/parallel for tmp files and does not clean up if
       terminated abruptly. If another user on the system uses Rust
       parallel, then /tmp/parallel will have the wrong permissions and
       Rust parallel will fail. A malicious user can set up the right
       permissions and symlink the output file to one of the victim's
       files; the next time the victim uses Rust parallel, it will
       overwrite this file.
910
911         attacker$ mkdir /tmp/parallel
912         attacker$ chmod a+rwX /tmp/parallel
913         # Symlink to the file the attacker wants to zero out
914         attacker$ ln -s ~victim/.important-file /tmp/parallel/stderr_1
915         victim$ seq 1000 | parallel echo
916         # This file is now overwritten with stderr from 'echo'
917         victim$ cat ~victim/.important-file
918
919       If /tmp/parallel runs full during the run, Rust parallel does not
920       report this, but finishes with success - thereby risking data loss.
921
922       https://github.com/mmstick/parallel
923
924   DIFFERENCES BETWEEN Rush AND GNU Parallel
925       rush (https://github.com/shenwei356/rush) is written in Go and based on
926       gargs.
927
       Just like GNU parallel, rush buffers output in temporary files. But
       unlike GNU parallel, rush does not clean up if the process dies
       abnormally.
930
       rush has some string manipulations that can be emulated by putting
       this into ~/.parallel/config (/ is used instead of %, and % is used
       instead of ^, as that is closer to bash's ${var%postfix}):
934
935         --rpl '{:} s:(\.[^/]+)*$::'
936         --rpl '{:%([^}]+?)} s:$$1(\.[^/]+)*$::'
937         --rpl '{/:%([^}]*?)} s:.*/(.*)$$1(\.[^/]+)*$:$1:'
938         --rpl '{/:} s:(.*/)?([^/.]+)(\.[^/]+)*$:$2:'
939         --rpl '{@(.*?)} /$$1/ and $_=$1;'
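       As a quick check, the first definition above can also be given
       directly on the command line; {:} then strips all extensions:

```shell
# {:} removes every extension: dir/file.tar.gz -> dir/file
parallel --rpl '{:} s:(\.[^/]+)*$::' echo {:} ::: dir/file.tar.gz
# dir/file
```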
940
941       EXAMPLES FROM rush's WEBSITE
942
943       Here are the examples from rush's website with the equivalent command
944       in GNU parallel.
945
946       1. Simple run, quoting is not necessary
947
948         1$ seq 1 3 | rush echo {}
949
950         1$ seq 1 3 | parallel echo {}
951
952       2. Read data from file (`-i`)
953
954         2$ rush echo {} -i data1.txt -i data2.txt
955
956         2$ cat data1.txt data2.txt | parallel echo {}
957
958       3. Keep output order (`-k`)
959
960         3$ seq 1 3 | rush 'echo {}' -k
961
962         3$ seq 1 3 | parallel -k echo {}
963
964       4. Timeout (`-t`)
965
966         4$ time seq 1 | rush 'sleep 2; echo {}' -t 1
967
968         4$ time seq 1 | parallel --timeout 1 'sleep 2; echo {}'
969
970       5. Retry (`-r`)
971
972         5$ seq 1 | rush 'python unexisted_script.py' -r 1
973
974         5$ seq 1 | parallel --retries 2 'python unexisted_script.py'
975
976       Use -u to see it is really run twice:
977
978         5$ seq 1 | parallel -u --retries 2 'python unexisted_script.py'
979
980       6. Dirname (`{/}`) and basename (`{%}`) and remove custom suffix
981       (`{^suffix}`)
982
983         6$ echo dir/file_1.txt.gz | rush 'echo {/} {%} {^_1.txt.gz}'
984
985         6$ echo dir/file_1.txt.gz |
986              parallel --plus echo {//} {/} {%_1.txt.gz}
987
988       7. Get basename, and remove last (`{.}`) or any (`{:}`) extension
989
990         7$ echo dir.d/file.txt.gz | rush 'echo {.} {:} {%.} {%:}'
991
992         7$ echo dir.d/file.txt.gz | parallel 'echo {.} {:} {/.} {/:}'
993
994       8. Job ID, combine fields index and other replacement strings
995
996         8$ echo 12 file.txt dir/s_1.fq.gz |
997              rush 'echo job {#}: {2} {2.} {3%:^_1}'
998
999         8$ echo 12 file.txt dir/s_1.fq.gz |
1000              parallel --colsep ' ' 'echo job {#}: {2} {2.} {3/:%_1}'
1001
1002       9. Capture submatch using regular expression (`{@regexp}`)
1003
1004         9$ echo read_1.fq.gz | rush 'echo {@(.+)_\d}'
1005
1006         9$ echo read_1.fq.gz | parallel 'echo {@(.+)_\d}'
1007
1008       10. Custom field delimiter (`-d`)
1009
1010         10$ echo a=b=c | rush 'echo {1} {2} {3}' -d =
1011
         10$ echo a=b=c | parallel --colsep = echo {1} {2} {3}
1013
1014       11. Send multi-lines to every command (`-n`)
1015
1016         11$ seq 5 | rush -n 2 -k 'echo "{}"; echo'
1017
1018         11$ seq 5 |
1019               parallel -n 2 -k \
1020                 'echo {=-1 $_=join"\n",@arg[1..$#arg] =}; echo'
1021
1022         11$ seq 5 | rush -n 2 -k 'echo "{}"; echo' -J ' '
1023
1024         11$ seq 5 | parallel -n 2 -k 'echo {}; echo'
1025
1026       12. Custom record delimiter (`-D`), note that empty records are not
1027       used.
1028
1029         12$ echo a b c d | rush -D " " -k 'echo {}'
1030
1031         12$ echo a b c d | parallel -d " " -k 'echo {}'
1032
1033         12$ echo abcd | rush -D "" -k 'echo {}'
1034
1035         Cannot be done by GNU Parallel
1036
1037         12$ cat fasta.fa
1038         >seq1
1039         tag
1040         >seq2
1041         cat
1042         gat
1043         >seq3
1044         attac
1045         a
1046         cat
1047
1048         12$ cat fasta.fa | rush -D ">" \
1049               'echo FASTA record {#}: name: {1} sequence: {2}' -k -d "\n"
1050             # rush fails to join the multiline sequences
1051
1052         12$ cat fasta.fa | (read -n1 ignore_first_char;
1053               parallel -d '>' --colsep '\n' echo FASTA record {#}: \
1054                 name: {1} sequence: '{=2 $_=join"",@arg[2..$#arg]=}'
1055             )
1056
1057       13. Assign value to variable, like `awk -v` (`-v`)
1058
1059         13$ seq 1 |
1060               rush 'echo Hello, {fname} {lname}!' -v fname=Wei -v lname=Shen
1061
1062         13$ seq 1 |
1063               parallel -N0 \
1064                 'fname=Wei; lname=Shen; echo Hello, ${fname} ${lname}!'
1065
1066         13$ for var in a b; do \
1067         13$   seq 1 3 | rush -k -v var=$var 'echo var: {var}, data: {}'; \
1068         13$ done
1069
1070       In GNU parallel you would typically do:
1071
1072         13$ seq 1 3 | parallel -k echo var: {1}, data: {2} ::: a b :::: -
1073
1074       If you really want the var:
1075
1076         13$ seq 1 3 |
1077               parallel -k var={1} ';echo var: $var, data: {}' ::: a b :::: -
1078
1079       If you really want the for-loop:
1080
1081         13$ for var in a b; do
1082               export var;
1083               seq 1 3 | parallel -k 'echo var: $var, data: {}';
1084             done
1085
       Unlike rush, this also works if the value is complex, like:
1087
1088         My brother's 12" records
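       A quick check of the for-loop emulation with such a value (the value
       passes through unmangled, quotes and all):

```shell
# The exported variable survives quotes and double quotes:
var="My brother's 12\" records"
export var
seq 1 2 | parallel -k 'echo var: $var, data: {}'
# var: My brother's 12" records, data: 1
# var: My brother's 12" records, data: 2
```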
1089
1090       14. Preset variable (`-v`), avoid repeatedly writing verbose
1091       replacement strings
1092
1093         14$ # naive way
1094             echo read_1.fq.gz | rush 'echo {:^_1} {:^_1}_2.fq.gz'
1095
1096         14$ echo read_1.fq.gz | parallel 'echo {:%_1} {:%_1}_2.fq.gz'
1097
1098         14$ # macro + removing suffix
1099             echo read_1.fq.gz |
1100               rush -v p='{:^_1}' 'echo {p} {p}_2.fq.gz'
1101
1102         14$ echo read_1.fq.gz |
1103               parallel 'p={:%_1}; echo $p ${p}_2.fq.gz'
1104
1105         14$ # macro + regular expression
1106             echo read_1.fq.gz | rush -v p='{@(.+?)_\d}' 'echo {p} {p}_2.fq.gz'
1107
1108         14$ echo read_1.fq.gz | parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1109
       Unlike rush, GNU parallel works with complex values:
1111
1112         14$ echo "My brother's 12\"read_1.fq.gz" |
1113               parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1114
       15. Interrupt jobs with `Ctrl-C`: rush will stop unfinished commands
       and exit.
1117
1118         15$ seq 1 20 | rush 'sleep 1; echo {}'
1119             ^C
1120
1121         15$ seq 1 20 | parallel 'sleep 1; echo {}'
1122             ^C
1123
       16. Continue/resume jobs (`-c`). When some jobs fail (due to
       execution failure, timeout, or cancellation by the user with
       `Ctrl + C`), switch on the `-c/--continue` flag and run again, so
       that `rush` can save the successful commands and ignore them in the
       NEXT run.
1128
1129         16$ seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1130             cat successful_cmds.rush
1131             seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1132
1133         16$ seq 1 3 | parallel --joblog mylog --timeout 2 \
1134               'sleep {}; echo {}'
1135             cat mylog
1136             seq 1 3 | parallel --joblog mylog --retry-failed \
1137               'sleep {}; echo {}'
1138
1139       Multi-line jobs:
1140
1141         16$ seq 1 3 | rush 'sleep {}; echo {}; \
1142               echo finish {}' -t 3 -c -C finished.rush
1143             cat finished.rush
1144             seq 1 3 | rush 'sleep {}; echo {}; \
1145               echo finish {}' -t 3 -c -C finished.rush
1146
1147         16$ seq 1 3 |
1148               parallel --joblog mylog --timeout 2 'sleep {}; echo {}; \
1149                 echo finish {}'
1150             cat mylog
1151             seq 1 3 |
1152               parallel --joblog mylog --retry-failed 'sleep {}; echo {}; \
1153                 echo finish {}'
1154
       17. A comprehensive example: downloading 1K+ pages given by three URL
       list files using `phantomjs save_page.js` (some page contents are
       dynamically generated by Javascript, so `wget` does not work). Here I
       set the max number of jobs (`-j`) to `20`; each job has a max running
       time (`-t`) of `60` seconds and `3` retry chances (`-r`). The
       continue flag `-c` is also switched on, so we can continue unfinished
       jobs. Luckily, it's accomplished in one run :)
1162
1163         17$ for f in $(seq 2014 2016); do \
1164               /bin/rm -rf $f; mkdir -p $f; \
1165               cat $f.html.txt | rush -v d=$f -d = \
1166                 'phantomjs save_page.js "{}" > {d}/{3}.html' \
1167                 -j 20 -t 60 -r 3 -c; \
1168             done
1169
1170       GNU parallel can append to an existing joblog with '+':
1171
1172         17$ rm mylog
1173             for f in $(seq 2014 2016); do
1174               /bin/rm -rf $f; mkdir -p $f;
1175               cat $f.html.txt |
1176                 parallel -j20 --timeout 60 --retries 4 --joblog +mylog \
1177                   --colsep = \
1178                   phantomjs save_page.js {1}={2}={3} '>' $f/{3}.html
1179             done
1180
1181       18. A bioinformatics example: mapping with `bwa`, and processing result
1182       with `samtools`:
1183
1184         18$ ref=ref/xxx.fa
1185             threads=25
1186             ls -d raw.cluster.clean.mapping/* \
1187               | rush -v ref=$ref -v j=$threads -v p='{}/{%}' \
1188               'bwa mem -t {j} -M -a {ref} {p}_1.fq.gz {p}_2.fq.gz >{p}.sam;\
1189               samtools view -bS {p}.sam > {p}.bam; \
1190               samtools sort -T {p}.tmp -@ {j} {p}.bam -o {p}.sorted.bam; \
1191               samtools index {p}.sorted.bam; \
1192               samtools flagstat {p}.sorted.bam > {p}.sorted.bam.flagstat; \
1193               /bin/rm {p}.bam {p}.sam;' \
1194               -j 2 --verbose -c -C mapping.rush
1195
1196       GNU parallel would use a function:
1197
1198         18$ ref=ref/xxx.fa
1199             export ref
1200             thr=25
1201             export thr
1202             bwa_sam() {
1203               p="$1"
1204               bam="$p".bam
1205               sam="$p".sam
1206               sortbam="$p".sorted.bam
1207               bwa mem -t $thr -M -a $ref ${p}_1.fq.gz ${p}_2.fq.gz > "$sam"
1208               samtools view -bS "$sam" > "$bam"
1209               samtools sort -T ${p}.tmp -@ $thr "$bam" -o "$sortbam"
1210               samtools index "$sortbam"
1211               samtools flagstat "$sortbam" > "$sortbam".flagstat
1212               /bin/rm "$bam" "$sam"
1213             }
1214             export -f bwa_sam
1215             ls -d raw.cluster.clean.mapping/* |
1216               parallel -j 2 --verbose --joblog mylog bwa_sam
1217
1218       Other rush features
1219
1220       rush has:
1221
       •   awk -v like custom defined variables (-v)
1223
1224           With GNU parallel you would simply set a shell variable:
1225
1226              parallel 'v={}; echo "$v"' ::: foo
1227              echo foo | rush -v v={} 'echo {v}'
1228
           Also, rush does not like special chars, so these do not work:
1230
1231              echo does not work | rush -v v=\" 'echo {v}'
1232              echo "My  brother's  12\"  records" | rush -v v={} 'echo {v}'
1233
1234           Whereas the corresponding GNU parallel version works:
1235
1236              parallel 'v=\"; echo "$v"' ::: works
1237              parallel 'v={}; echo "$v"' ::: "My  brother's  12\"  records"
1238
1239       •   Exit on first error(s) (-e)
1240
1241           This is called --halt now,fail=1 (or shorter: --halt 2) when used
1242           with GNU parallel.
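           A minimal sketch of this behaviour (a hypothetical
           'echo {}; exit {}' job is used to simulate failure; -j1 makes
           the order deterministic):

```shell
# The second job exits non-zero, so --halt now,fail=1 stops
# GNU parallel before the third job is started.
parallel -j1 --halt now,fail=1 'echo {}; exit {}' ::: 0 1 0
# 0
# 1
```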
1243
1244       •   Settable records sending to every command (-n, default 1)
1245
1246           This is also called -n in GNU parallel.
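           A minimal sketch of -n in GNU parallel (-k keeps the output in
           input order):

```shell
# Two records per job: each job runs 'echo' with two arguments.
seq 6 | parallel -k -n 2 echo
# 1 2
# 3 4
# 5 6
```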
1247
1248       •   Practical replacement strings
1249
1250           {:} remove any extension
1251               With GNU parallel this can be emulated by:
1252
1253                 parallel --plus echo '{/\..*/}' ::: foo.ext.bar.gz
1254
1255           {^suffix}, remove suffix
1256               With GNU parallel this can be emulated by:
1257
1258                 parallel --plus echo '{%.bar.gz}' ::: foo.ext.bar.gz
1259
1260           {@regexp}, capture submatch using regular expression
1261               With GNU parallel this can be emulated by:
1262
1263                 parallel --rpl '{@(.*?)} /$$1/ and $_=$1;' \
1264                   echo '{@\d_(.*).gz}' ::: 1_foo.gz
1265
1266           {%.}, {%:}, basename without extension
1267               With GNU parallel this can be emulated by:
1268
1269                 parallel echo '{= s:.*/::;s/\..*// =}' ::: dir/foo.bar.gz
1270
1271               And if you need it often, you define a --rpl in
1272               $HOME/.parallel/config:
1273
1274                 --rpl '{%.} s:.*/::;s/\..*//'
1275                 --rpl '{%:} s:.*/::;s/\..*//'
1276
1277               Then you can use them as:
1278
1279                 parallel echo {%.} {%:} ::: dir/foo.bar.gz
1280
1281       •   Preset variable (macro)
1282
1283           E.g.
1284
1285             echo foosuffix | rush -v p={^suffix} 'echo {p}_new_suffix'
1286
1287           With GNU parallel this can be emulated by:
1288
1289             echo foosuffix |
1290               parallel --plus 'p={%suffix}; echo ${p}_new_suffix'
1291
           Unlike rush, GNU parallel works fine if the input contains
           double spaces, ' and ":
1294
1295             echo "1'6\"  foosuffix" |
1296               parallel --plus 'p={%suffix}; echo "${p}"_new_suffix'
1297
1298       •   Commands of multi-lines
1299
           While you can use multi-line commands in GNU parallel, GNU
           parallel discourages them to improve readability. In most cases
           the command can be written as a function:
1303
1304             seq 1 3 |
1305               parallel --timeout 2 --joblog my.log 'sleep {}; echo {}; \
1306                 echo finish {}'
1307
1308           Could be written as:
1309
1310             doit() {
1311               sleep "$1"
1312               echo "$1"
1313               echo finish "$1"
1314             }
1315             export -f doit
1316             seq 1 3 | parallel --timeout 2 --joblog my.log doit
1317
1318           The failed commands can be resumed with:
1319
1320             seq 1 3 |
1321               parallel --resume-failed --joblog my.log 'sleep {}; echo {};\
1322                 echo finish {}'
1323
1324       https://github.com/shenwei356/rush
1325
1326   DIFFERENCES BETWEEN ClusterSSH AND GNU Parallel
1327       ClusterSSH solves a different problem than GNU parallel.
1328
       ClusterSSH opens a terminal window for each computer, and using a
       master window you can run the same command on all the computers.
       This is typically used for administering several computers that are
       almost identical.
1333
1334       GNU parallel runs the same (or different) commands with different
1335       arguments in parallel possibly using remote computers to help
1336       computing. If more than one computer is listed in -S GNU parallel may
1337       only use one of these (e.g. if there are 8 jobs to be run and one
1338       computer has 8 cores).
1339
1340       GNU parallel can be used as a poor-man's version of ClusterSSH:
1341
1342       parallel --nonall -S server-a,server-b do_stuff foo bar
1343
1344       https://github.com/duncs/clusterssh
1345
1346   DIFFERENCES BETWEEN coshell AND GNU Parallel
1347       coshell only accepts full commands on standard input. Any quoting needs
1348       to be done by the user.
1349
1350       Commands are run in sh so any bash/tcsh/zsh specific syntax will not
1351       work.
1352
       Output can be buffered by using -d. Output is buffered in memory, so
       big output can cause swapping and therefore be terribly slow, or
       even cause the system to run out of memory.
1356
1357       https://github.com/gdm85/coshell (Last checked: 2019-01)
1358
1359   DIFFERENCES BETWEEN spread AND GNU Parallel
1360       spread runs commands on all directories.
1361
1362       It can be emulated with GNU parallel using this Bash function:
1363
1364         spread() {
1365           _cmds() {
1366             perl -e '$"=" && ";print "@ARGV"' "cd {}" "$@"
1367           }
1368           parallel $(_cmds "$@")'|| echo exit status $?' ::: */
1369         }
1370
1371       This works except for the --exclude option.
1372
1373       (Last checked: 2017-11)
1374
1375   DIFFERENCES BETWEEN pyargs AND GNU Parallel
1376       pyargs deals badly with input containing spaces. It buffers stdout, but
1377       not stderr. It buffers in RAM. {} does not work as replacement string.
1378       It does not support running functions.
1379
1380       pyargs does not support composed commands if run with --lines, and
1381       fails on pyargs traceroute gnu.org fsf.org.
1382
1383       Examples
1384
1385         seq 5 | pyargs -P50 -L seq
1386         seq 5 | parallel -P50 --lb seq
1387
1388         seq 5 | pyargs -P50 --mark -L seq
1389         seq 5 | parallel -P50 --lb \
1390           --tagstring OUTPUT'[{= $_=$job->replaced()=}]' seq
1391         # Similar, but not precisely the same
1392         seq 5 | parallel -P50 --lb --tag seq
1393
1394         seq 5 | pyargs -P50  --mark command
1395         # Somewhat longer with GNU Parallel due to the special
1396         #   --mark formatting
1397         cmd="$(echo "command" | parallel --shellquote)"
1398         wrap_cmd() {
1399            echo "MARK $cmd $@================================" >&3
1400            echo "OUTPUT START[$cmd $@]:"
1401            eval $cmd "$@"
1402            echo "OUTPUT END[$cmd $@]"
1403         }
1404         (seq 5 | env_parallel -P2 wrap_cmd) 3>&1
1405         # Similar, but not exactly the same
1406         seq 5 | parallel -t --tag command
1407
1408         (echo '1  2  3';echo 4 5 6) | pyargs  --stream seq
1409         (echo '1  2  3';echo 4 5 6) | perl -pe 's/\n/ /' |
1410           parallel -r -d' ' seq
1411         # Similar, but not exactly the same
1412         parallel seq ::: 1 2 3 4 5 6
1413
1414       https://github.com/robertblackwell/pyargs (Last checked: 2019-01)
1415
1416   DIFFERENCES BETWEEN concurrently AND GNU Parallel
1417       concurrently runs jobs in parallel.
1418
1419       The output is prepended with the job number, and may be incomplete:
1420
1421         $ concurrently 'seq 100000' | (sleep 3;wc -l)
1422         7165
1423
       When pretty printing, concurrently caches output in memory. Output
       mixes (as test MIX shows) whether or not output is cached.
1426
       There seems to be no way of making a template command and having
       concurrently fill it with different args. The full commands must be
       given on the command line.
1430
1431       There is also no way of controlling how many jobs should be run in
1432       parallel at a time - i.e. "number of jobslots". Instead all jobs are
1433       simply started in parallel.
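       For contrast, a minimal sketch of GNU parallel's jobslots: with -j2
       at most 2 of the 4 jobs run at a time:

```shell
# Only 2 sleeps run simultaneously, so the total runtime is
# roughly 2 seconds instead of 1.
seq 4 | parallel -j2 'sleep 1; echo {}'
```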
1434
1435       https://github.com/kimmobrunfeldt/concurrently (Last checked: 2019-01)
1436
1437   DIFFERENCES BETWEEN map(soveran) AND GNU Parallel
1438       map does not run jobs in parallel by default. The README suggests
1439       using:
1440
1441         ... | map t 'sleep $t && say done &'
1442
1443       But this fails if more jobs are run in parallel than the number of
1444       available processes. Since there is no support for parallelization in
1445       map itself, the output also mixes:
1446
1447         seq 10 | map i 'echo start-$i && sleep 0.$i && echo end-$i &'
1448
1449       The major difference is that GNU parallel is built for parallelization
1450       and map is not. So GNU parallel has lots of ways of dealing with the
1451       issues that parallelization raises:
1452
1453       •   Keep the number of processes manageable
1454
1455       •   Make sure output does not mix
1456
1457       •   Make Ctrl-C kill all running processes
1458
       EXAMPLES FROM map's WEBSITE
1460
1461       Here are the 5 examples converted to GNU Parallel:
1462
1463         1$ ls *.c | map f 'foo $f'
1464         1$ ls *.c | parallel foo
1465
1466         2$ ls *.c | map f 'foo $f; bar $f'
1467         2$ ls *.c | parallel 'foo {}; bar {}'
1468
1469         3$ cat urls | map u 'curl -O $u'
1470         3$ cat urls | parallel curl -O
1471
1472         4$ printf "1\n1\n1\n" | map t 'sleep $t && say done'
1473         4$ printf "1\n1\n1\n" | parallel 'sleep {} && say done'
1474         4$ parallel 'sleep {} && say done' ::: 1 1 1
1475
1476         5$ printf "1\n1\n1\n" | map t 'sleep $t && say done &'
1477         5$ printf "1\n1\n1\n" | parallel -j0 'sleep {} && say done'
1478         5$ parallel -j0 'sleep {} && say done' ::: 1 1 1
1479
1480       https://github.com/soveran/map (Last checked: 2019-01)
1481
1482   DIFFERENCES BETWEEN loop AND GNU Parallel
1483       loop mixes stdout and stderr:
1484
1485           loop 'ls /no-such-file' >/dev/null
1486
1487       loop's replacement string $ITEM does not quote strings:
1488
1489           echo 'two  spaces' | loop 'echo $ITEM'
1490
1491       loop cannot run functions:
1492
1493           myfunc() { echo joe; }
1494           export -f myfunc
1495           loop 'myfunc this fails'
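       GNU parallel, by contrast, can run exported shell functions (this
       assumes bash, as export -f is bash-specific):

```shell
# Define and export a bash function; GNU parallel passes the
# environment on, so the spawned bash can run it:
myfunc() { echo joe "$1"; }
export -f myfunc
parallel myfunc ::: works
# joe works
```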
1496
1497       EXAMPLES FROM loop's WEBSITE
1498
1499       Some of the examples from https://github.com/Miserlou/Loop/ can be
1500       emulated with GNU parallel:
1501
1502           # A couple of functions will make the code easier to read
1503           $ loopy() {
1504               yes | parallel -uN0 -j1 "$@"
1505             }
1506           $ export -f loopy
1507           $ time_out() {
1508               parallel -uN0 -q --timeout "$@" ::: 1
1509             }
1510           $ match() {
1511               perl -0777 -ne 'grep /'"$1"'/,$_ and print or exit 1'
1512             }
1513           $ export -f match
1514
1515           $ loop 'ls' --every 10s
1516           $ loopy --delay 10s ls
1517
1518           $ loop 'touch $COUNT.txt' --count-by 5
1519           $ loopy touch '{= $_=seq()*5 =}'.txt
1520
1521           $ loop --until-contains 200 -- \
               ./get_response_code.sh --site mysite.biz
1523           $ loopy --halt now,success=1 \
1524               './get_response_code.sh --site mysite.biz | match 200'
1525
1526           $ loop './poke_server' --for-duration 8h
1527           $ time_out 8h loopy ./poke_server
1528
1529           $ loop './poke_server' --until-success
1530           $ loopy --halt now,success=1 ./poke_server
1531
1532           $ cat files_to_create.txt | loop 'touch $ITEM'
1533           $ cat files_to_create.txt | parallel touch {}
1534
1535           $ loop 'ls' --for-duration 10min --summary
1536           # --joblog is somewhat more verbose than --summary
1537           $ time_out 10m loopy --joblog my.log ./poke_server; cat my.log
1538
1539           $ loop 'echo hello'
1540           $ loopy echo hello
1541
1542           $ loop 'echo $COUNT'
1543           # GNU Parallel counts from 1
1544           $ loopy echo {#}
1545           # Counting from 0 can be forced
1546           $ loopy echo '{= $_=seq()-1 =}'
1547
1548           $ loop 'echo $COUNT' --count-by 2
1549           $ loopy echo '{= $_=2*(seq()-1) =}'
1550
1551           $ loop 'echo $COUNT' --count-by 2 --offset 10
1552           $ loopy echo '{= $_=10+2*(seq()-1) =}'
1553
1554           $ loop 'echo $COUNT' --count-by 1.1
1555           # GNU Parallel rounds 3.3000000000000003 to 3.3
1556           $ loopy echo '{= $_=1.1*(seq()-1) =}'
1557
1558           $ loop 'echo $COUNT $ACTUALCOUNT' --count-by 2
1559           $ loopy echo '{= $_=2*(seq()-1) =} {#}'
1560
1561           $ loop 'echo $COUNT' --num 3 --summary
1562           # --joblog is somewhat more verbose than --summary
1563           $ seq 3 | parallel --joblog my.log echo; cat my.log
1564
1565           $ loop 'ls -foobarbatz' --num 3 --summary
1566           # --joblog is somewhat more verbose than --summary
1567           $ seq 3 | parallel --joblog my.log -N0 ls -foobarbatz; cat my.log
1568
1569           $ loop 'echo $COUNT' --count-by 2 --num 50 --only-last
1570           # Can be emulated by running 2 jobs
1571           $ seq 49 | parallel echo '{= $_=2*(seq()-1) =}' >/dev/null
           $ echo 50 | parallel echo '{= $_=2*(seq()-1) =}'
1573
1574           $ loop 'date' --every 5s
1575           $ loopy --delay 5s date
1576
1577           $ loop 'date' --for-duration 8s --every 2s
1578           $ time_out 8s loopy --delay 2s date
1579
1580           $ loop 'date -u' --until-time '2018-05-25 20:50:00' --every 5s
           $ seconds=$((`date -d 2018-05-25T20:50:00 +%s` - `date  +%s`))s
1582           $ time_out $seconds loopy --delay 5s date -u
1583
1584           $ loop 'echo $RANDOM' --until-contains "666"
1585           $ loopy --halt now,success=1 'echo $RANDOM | match 666'
1586
1587           $ loop 'if (( RANDOM % 2 )); then
1588                     (echo "TRUE"; true);
1589                   else
1590                     (echo "FALSE"; false);
1591                   fi' --until-success
1592           $ loopy --halt now,success=1 'if (( $RANDOM % 2 )); then
1593                                           (echo "TRUE"; true);
1594                                         else
1595                                           (echo "FALSE"; false);
1596                                         fi'
1597
1598           $ loop 'if (( RANDOM % 2 )); then
1599               (echo "TRUE"; true);
1600             else
1601               (echo "FALSE"; false);
1602             fi' --until-error
1603           $ loopy --halt now,fail=1 'if (( $RANDOM % 2 )); then
1604                                        (echo "TRUE"; true);
1605                                      else
1606                                        (echo "FALSE"; false);
1607                                      fi'
1608
1609           $ loop 'date' --until-match "(\d{4})"
1610           $ loopy --halt now,success=1 'date | match [0-9][0-9][0-9][0-9]'
1611
1612           $ loop 'echo $ITEM' --for red,green,blue
1613           $ parallel echo ::: red green blue
1614
1615           $ cat /tmp/my-list-of-files-to-create.txt | loop 'touch $ITEM'
1616           $ cat /tmp/my-list-of-files-to-create.txt | parallel touch
1617
1618           $ ls | loop 'cp $ITEM $ITEM.bak'; ls
1619           $ ls | parallel cp {} {}.bak; ls
1620
1621           $ loop 'echo $ITEM | tr a-z A-Z' -i
1622           $ parallel 'echo {} | tr a-z A-Z'
1623           # Or more efficiently:
1624           $ parallel --pipe tr a-z A-Z
1625
1626           $ loop 'echo $ITEM' --for "`ls`"
1627           $ parallel echo {} ::: "`ls`"
1628
1629           $ ls | loop './my_program $ITEM' --until-success;
1630           $ ls | parallel --halt now,success=1 ./my_program {}
1631
1632           $ ls | loop './my_program $ITEM' --until-fail;
1633           $ ls | parallel --halt now,fail=1 ./my_program {}
1634
1635           $ ./deploy.sh;
1636             loop 'curl -sw "%{http_code}" http://coolwebsite.biz' \
1637               --every 5s --until-contains 200;
1638             ./announce_to_slack.sh
1639           $ ./deploy.sh;
1640             loopy --delay 5s --halt now,success=1 \
1641             'curl -sw "%{http_code}" http://coolwebsite.biz | match 200';
1642             ./announce_to_slack.sh
1643
1644           $ loop "ping -c 1 mysite.com" --until-success; ./do_next_thing
1645           $ loopy --halt now,success=1 ping -c 1 mysite.com; ./do_next_thing
1646
1647           $ ./create_big_file -o my_big_file.bin;
1648             loop 'ls' --until-contains 'my_big_file.bin';
1649             ./upload_big_file my_big_file.bin
1650           # inotifywait is a better tool to detect file system changes.
1651           # It can even make sure the file is complete
1652           # so you are not uploading an incomplete file
1653           $ inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f . |
1654               grep my_big_file.bin
1655
1656           $ ls | loop 'cp $ITEM $ITEM.bak'
1657           $ ls | parallel cp {} {}.bak
1658
1659           $ loop './do_thing.sh' --every 15s --until-success --num 5
1660           $ parallel --retries 5 --delay 15s ::: ./do_thing.sh
1661
1662       https://github.com/Miserlou/Loop/ (Last checked: 2018-10)
1663
1664   DIFFERENCES BETWEEN lorikeet AND GNU Parallel
1665       lorikeet can run jobs in parallel. It does this based on a dependency
1666       graph described in a file, so this is similar to make.
1667
1668       https://github.com/cetra3/lorikeet (Last checked: 2018-10)
1669
1670   DIFFERENCES BETWEEN spp AND GNU Parallel
1671       spp can run jobs in parallel. spp does not use a command template to
1672       generate the jobs, but requires jobs to be in a file. Output from the
1673       jobs mix.
1674
1675       https://github.com/john01dav/spp (Last checked: 2019-01)
1676
1677   DIFFERENCES BETWEEN paral AND GNU Parallel
1678       paral prints a lot of status information and stores the output from the
1679       commands run into files. This means it cannot be used in the middle
1680       of a pipe like this:
1681
1682         paral "echo this" "echo does not" "echo work" | wc
1683
1684       Instead it puts the output into files named like out_#_command.out.log.
1685       To get a very similar behaviour with GNU parallel use --results
1686       'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta
1687
1688       paral only takes arguments on the command line and each argument should
1689       be a full command. Thus it does not use command templates.
1690
1691       This limits how many jobs it can run in total, because they all need to
1692       fit on a single command line.
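       How many jobs fit depends on the system's argument-list limit, which
       can be inspected with getconf (a quick check; the exact limit and how
       the environment counts against it vary by OS):

```shell
# argv plus the environment must fit below ARG_MAX, so a tool
# taking every full command as an argument is capped by it.
getconf ARG_MAX
```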
1693
1694       paral has no support for running jobs remotely.
1695
1696       EXAMPLES FROM README.markdown
1697
1698       The examples from README.markdown and the corresponding commands run
1699       with GNU parallel (--results
1700       'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta is omitted from the
1701       GNU parallel command):
1702
1703         1$ paral "command 1" "command 2 --flag" "command arg1 arg2"
1704         1$ parallel ::: "command 1" "command 2 --flag" "command arg1 arg2"
1705
1706         2$ paral "sleep 1 && echo c1" "sleep 2 && echo c2" \
1707              "sleep 3 && echo c3" "sleep 4 && echo c4"  "sleep 5 && echo c5"
1708         2$ parallel ::: "sleep 1 && echo c1" "sleep 2 && echo c2" \
1709              "sleep 3 && echo c3" "sleep 4 && echo c4"  "sleep 5 && echo c5"
1710            # Or shorter:
1711            parallel "sleep {} && echo c{}" ::: {1..5}
1712
1713         3$ paral -n=0 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1714              "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1715         3$ parallel ::: "sleep 5 && echo c5" "sleep 4 && echo c4" \
1716              "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1717            # Or shorter:
1718            parallel -j0 "sleep {} && echo c{}" ::: 5 4 3 2 1
1719
1720         4$ paral -n=1 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1721              "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1722         4$ parallel -j1 "sleep {} && echo c{}" ::: 5 4 3 2 1
1723
1724         5$ paral -n=2 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1725              "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1726         5$ parallel -j2 "sleep {} && echo c{}" ::: 5 4 3 2 1
1727
1728         6$ paral -n=5 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1729              "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1730         6$ parallel -j5 "sleep {} && echo c{}" ::: 5 4 3 2 1
1731
1732         7$ paral -n=1 "echo a && sleep 0.5 && echo b && sleep 0.5 && \
1733              echo c && sleep 0.5 && echo d && sleep 0.5 && \
1734              echo e && sleep 0.5 && echo f && sleep 0.5 && \
1735              echo g && sleep 0.5 && echo h"
1736         7$ parallel ::: "echo a && sleep 0.5 && echo b && sleep 0.5 && \
1737              echo c && sleep 0.5 && echo d && sleep 0.5 && \
1738              echo e && sleep 0.5 && echo f && sleep 0.5 && \
1739              echo g && sleep 0.5 && echo h"
1740
1741       https://github.com/amattn/paral (Last checked: 2019-01)
1742
1743   DIFFERENCES BETWEEN concurr AND GNU Parallel
1744       concurr is built to run jobs in parallel using a client/server model.
1745
1746       EXAMPLES FROM README.md
1747
1748       The examples from README.md:
1749
1750         1$ concurr 'echo job {#} on slot {%}: {}' : arg1 arg2 arg3 arg4
1751         1$ parallel 'echo job {#} on slot {%}: {}' ::: arg1 arg2 arg3 arg4
1752
1753         2$ concurr 'echo job {#} on slot {%}: {}' :: file1 file2 file3
1754         2$ parallel 'echo job {#} on slot {%}: {}' :::: file1 file2 file3
1755
1756         3$ concurr 'echo {}' < input_file
1757         3$ parallel 'echo {}' < input_file
1758
1759         4$ cat file | concurr 'echo {}'
1760         4$ cat file | parallel 'echo {}'
1761
1762       concurr deals badly with empty input files and with output larger
1763       than 64 KB.
1764
1765       https://github.com/mmstick/concurr (Last checked: 2019-01)
1766
1767   DIFFERENCES BETWEEN lesser-parallel AND GNU Parallel
1768       lesser-parallel is the inspiration for parallel --embed. Both lesser-
1769       parallel and parallel --embed define bash functions that can be
1770       included as part of a bash script to run jobs in parallel.
1771
1772       lesser-parallel implements a few of the replacement strings, but hardly
1773       any options, whereas parallel --embed gives you the full GNU parallel
1774       experience.
1775
1776       https://github.com/kou1okada/lesser-parallel (Last checked: 2019-01)
1777
1778   DIFFERENCES BETWEEN npm-parallel AND GNU Parallel
1779       npm-parallel can run npm tasks in parallel.
1780
1781       There are no examples and very little documentation, so it is hard to
1782       compare to GNU parallel.
1783
1784       https://github.com/spion/npm-parallel (Last checked: 2019-01)
1785
1786   DIFFERENCES BETWEEN machma AND GNU Parallel
1787       machma runs tasks in parallel. It gives time stamped output. It buffers
1788       in RAM.
1789
1790       EXAMPLES FROM README.md
1791
1792       The examples from README.md:
1793
1794         1$ # Put shorthand for timestamp in config for the examples
1795            echo '--rpl '\
1796              \''{time} $_=::strftime("%Y-%m-%d %H:%M:%S",localtime())'\' \
1797              > ~/.parallel/machma
1798            echo '--line-buffer --tagstring "{#} {time} {}"' \
1799              >> ~/.parallel/machma
1800
1801         2$ find . -iname '*.jpg' |
1802              machma --  mogrify -resize 1200x1200 -filter Lanczos {}
1803            find . -iname '*.jpg' |
1804              parallel --bar -Jmachma mogrify -resize 1200x1200 \
1805                -filter Lanczos {}
1806
1807         3$ cat /tmp/ips | machma -p 2 -- ping -c 2 -q {}
1808         3$ cat /tmp/ips | parallel -j2 -Jmachma ping -c 2 -q {}
1809
1810         4$ cat /tmp/ips |
1811              machma -- sh -c 'ping -c 2 -q $0 > /dev/null && echo alive' {}
1812         4$ cat /tmp/ips |
1813              parallel -Jmachma 'ping -c 2 -q {} > /dev/null && echo alive'
1814
1815         5$ find . -iname '*.jpg' |
1816              machma --timeout 5s -- mogrify -resize 1200x1200 \
1817                -filter Lanczos {}
1818         5$ find . -iname '*.jpg' |
1819              parallel --timeout 5s --bar mogrify -resize 1200x1200 \
1820                -filter Lanczos {}
1821
1822         6$ find . -iname '*.jpg' -print0 |
1823              machma --null --  mogrify -resize 1200x1200 -filter Lanczos {}
1824         6$ find . -iname '*.jpg' -print0 |
1825              parallel --null --bar mogrify -resize 1200x1200 \
1826                -filter Lanczos {}
1827
1828       https://github.com/fd0/machma (Last checked: 2019-06)
1829
1830   DIFFERENCES BETWEEN interlace AND GNU Parallel
1831       Summary table (see legend above): - I2 I3 I4 - - - M1 - M3 - - M6 - O2
1832       O3 - - - - x x E1 E2 - - - - - - - - - - - - - - - -
1833
1834       interlace is built for network analysis to run network tools in
1835       parallel.
1836
1837       interlace does not buffer output, so output from different jobs mixes.
1838
1839       The overhead for each target is O(n*n), so with 1000 targets it becomes
1840       very slow, with an overhead on the order of 500 ms/target.
1841
1842       EXAMPLES FROM interlace's WEBSITE
1843
1844       Using prips most of the examples from
1845       https://github.com/codingo/Interlace can be run with GNU parallel:
1846
1847       Blocker
1848
1849         commands.txt:
1850           mkdir -p _output_/_target_/scans/
1851           _blocker_
1852           nmap _target_ -oA _output_/_target_/scans/_target_-nmap
1853         interlace -tL ./targets.txt -cL commands.txt -o $output
1854
1855         parallel -a targets.txt \
1856           mkdir -p $output/{}/scans/\; nmap {} -oA $output/{}/scans/{}-nmap
1857
1858       Blocks
1859
1860         commands.txt:
1861           _block:nmap_
1862           mkdir -p _target_/output/scans/
1863           nmap _target_ -oN _target_/output/scans/_target_-nmap
1864           _block:nmap_
1865           nikto --host _target_
1866         interlace -tL ./targets.txt -cL commands.txt
1867
1868         _nmap() {
1869           mkdir -p $1/output/scans/
1870           nmap $1 -oN $1/output/scans/$1-nmap
1871         }
1872         export -f _nmap
1873         parallel ::: _nmap "nikto --host" :::: targets.txt
1874
1875       Run Nikto Over Multiple Sites
1876
1877         interlace -tL ./targets.txt -threads 5 \
1878           -c "nikto --host _target_ > ./_target_-nikto.txt" -v
1879
1880         parallel -a targets.txt -P5 nikto --host {} \> ./{}-nikto.txt
1881
1882       Run Nikto Over Multiple Sites and Ports
1883
1884         interlace -tL ./targets.txt -threads 5 -c \
1885           "nikto --host _target_:_port_ > ./_target_-_port_-nikto.txt" \
1886           -p 80,443 -v
1887
1888         parallel -P5 nikto --host {1}:{2} \> ./{1}-{2}-nikto.txt \
1889           :::: targets.txt ::: 80 443
1890
1891       Run a List of Commands against Target Hosts
1892
1893         commands.txt:
1894           nikto --host _target_:_port_ > _output_/_target_-nikto.txt
1895           sslscan _target_:_port_ >  _output_/_target_-sslscan.txt
1896           testssl.sh _target_:_port_ > _output_/_target_-testssl.txt
1897         interlace -t example.com -o ~/Engagements/example/ \
1898           -cL ./commands.txt -p 80,443
1899
1900         parallel --results ~/Engagements/example/{2}:{3}{1} {1} {2}:{3} \
1901           ::: "nikto --host" sslscan testssl.sh ::: example.com ::: 80 443
1902
1903       CIDR notation with an application that doesn't support it
1904
1905         interlace -t 192.168.12.0/24 -c "vhostscan _target_ \
1906           -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
1907
1908         prips 192.168.12.0/24 |
1909           parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
1910
1911       Glob notation with an application that doesn't support it
1912
1913         interlace -t 192.168.12.* -c "vhostscan _target_ \
1914           -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
1915
1916         # Glob is not supported in prips
1917         prips 192.168.12.0/24 |
1918           parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
1919
1920       Dash (-) notation with an application that doesn't support it
1921
1922         interlace -t 192.168.12.1-15 -c \
1923           "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
1924           -o ~/scans/ -threads 50
1925
1926         # Dash notation is not supported in prips
1927         prips 192.168.12.1 192.168.12.15 |
1928           parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
1929
1930       Threading Support for an application that doesn't support it
1931
1932         interlace -tL ./target-list.txt -c \
1933           "vhostscan -t _target_ -oN _output_/_target_-vhosts.txt" \
1934           -o ~/scans/ -threads 50
1935
1936         cat ./target-list.txt |
1937           parallel -P50 vhostscan -t {} -oN ~/scans/{}-vhosts.txt
1938
1939       alternatively
1940
1941         ./vhosts-commands.txt:
1942           vhostscan -t $target -oN _output_/_target_-vhosts.txt
1943         interlace -cL ./vhosts-commands.txt -tL ./target-list.txt \
1944           -threads 50 -o ~/scans
1945
1946         ./vhosts-commands.txt:
1947           vhostscan -t "$1" -oN "$2"
1948         parallel -P50 ./vhosts-commands.txt {} ~/scans/{}-vhosts.txt \
1949           :::: ./target-list.txt
1950
1951       Exclusions
1952
1953         interlace -t 192.168.12.0/24 -e 192.168.12.0/26 -c \
1954           "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
1955           -o ~/scans/ -threads 50
1956
1957         prips 192.168.12.0/24 | grep -xv -Ff <(prips 192.168.12.0/26) |
1958           parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
1959
1960       Run Nikto Using Multiple Proxies
1961
1962          interlace -tL ./targets.txt -pL ./proxies.txt -threads 5 -c \
1963            "nikto --host _target_:_port_ -useproxy _proxy_ > \
1964             ./_target_-_port_-nikto.txt" -p 80,443 -v
1965
1966          parallel -j5 \
1967            "nikto --host {1}:{2} -useproxy {3} > ./{1}-{2}-nikto.txt" \
1968            :::: ./targets.txt ::: 80 443 :::: ./proxies.txt
1969
1970       https://github.com/codingo/Interlace (Last checked: 2019-09)
1971
1972   DIFFERENCES BETWEEN otonvm Parallel AND GNU Parallel
1973       I have been unable to get the code to run at all. It seems unfinished.
1974
1975       https://github.com/otonvm/Parallel (Last checked: 2019-02)
1976
1977   DIFFERENCES BETWEEN k-bx par AND GNU Parallel
1978       par requires Haskell to work. This limits the number of platforms this
1979       can work on.
1980
1981       par does line buffering in memory. The memory usage is 3x the longest
1982       line (compared to 1x for parallel --lb). Commands must be given as
1983       arguments. There is no template.
1984
1985       These are the examples from https://github.com/k-bx/par with the
1986       corresponding GNU parallel commands:
1987
1988         par "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
1989             "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
1990         parallel --lb ::: "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
1991             "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
1992
1993         par "echo foo; sleep 1; foofoo" \
1994             "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
1995         parallel --lb --halt 1 ::: "echo foo; sleep 1; foofoo" \
1996             "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
1997
1998         par "PARPREFIX=[fooechoer] echo foo" "PARPREFIX=[bar] echo bar"
1999         parallel --lb --colsep , --tagstring {1} {2} \
2000           ::: "[fooechoer],echo foo" "[bar],echo bar"
2001
2002         par --succeed "foo" "bar" && echo 'wow'
2003         parallel "foo" "bar"; true && echo 'wow'
2004
2005       https://github.com/k-bx/par (Last checked: 2019-02)
2006
2007   DIFFERENCES BETWEEN parallelshell AND GNU Parallel
2008       parallelshell does not allow for composed commands:
2009
2010         # This does not work
2011         parallelshell 'echo foo;echo bar' 'echo baz;echo quuz'
2012
2013       Instead you have to wrap that in a shell:
2014
2015         parallelshell 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2016
2017       It buffers output in RAM. All commands must be given on the command
2018       line and all commands are started in parallel at the same time. This
2019       will cause the system to freeze if there are so many jobs that there is
2020       not enough memory to run them all at the same time.
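       A bounded worker pool avoids this failure mode. Even without GNU
       parallel, xargs -P keeps a fixed number of jobs running (a minimal
       sketch; -P is a common but non-POSIX xargs extension):

```shell
# Run at most 2 jobs at a time instead of all 4 at once.
# Completion order can vary, so sort for stable output.
printf '%s\n' foo bar baz quux | xargs -n1 -P2 echo | sort
```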
2021
2022       https://github.com/keithamus/parallelshell (Last checked: 2019-02)
2023
2024       https://github.com/darkguy2008/parallelshell (Last checked: 2019-03)
2025
2026   DIFFERENCES BETWEEN shell-executor AND GNU Parallel
2027       shell-executor does not allow for composed commands:
2028
2029         # This does not work
2030         sx 'echo foo;echo bar' 'echo baz;echo quuz'
2031
2032       Instead you have to wrap that in a shell:
2033
2034         sx 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2035
2036       It buffers output in RAM. All commands must be given on the command
2037       line and all commands are started in parallel at the same time. This
2038       will cause the system to freeze if there are so many jobs that there is
2039       not enough memory to run them all at the same time.
2040
2041       https://github.com/royriojas/shell-executor (Last checked: 2019-02)
2042
2043   DIFFERENCES BETWEEN non-GNU par AND GNU Parallel
2044       par buffers in memory to avoid mixing of jobs. It takes 1s per 1
2045       million output lines.
2046
2047       par needs to have all commands before starting the first job. The jobs
2048       are read from stdin (standard input) so any quoting will have to be
2049       done by the user.
2050
2051       Stdout (standard output) is prepended with o:. Stderr (standard error)
2052       is sent to stdout (standard output) and prepended with e:.
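       The o:/e: labelling can be approximated in plain sh with file
       descriptor juggling (a sketch of the convention for a single
       line-oriented command, not par itself):

```shell
# Prefix stdout lines with o: and stderr lines with e:,
# and deliver both streams on stdout, like par does.
prefix_oe() {
  { { "$@" 2>&3 | sed 's/^/o:/'; } 3>&1 1>&2 | sed 's/^/e:/'; } 2>&1
}

# The two streams may interleave in any order, hence the sort.
prefix_oe sh -c 'echo out-line; echo err-line >&2' | sort
```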
2053
2054       For short jobs with little output par is 20% faster than GNU parallel
2055       and 60% slower than xargs.
2056
2057       http://savannah.nongnu.org/projects/par (Last checked: 2019-02)
2058
2059   DIFFERENCES BETWEEN fd AND GNU Parallel
2060       fd does not support composed commands, so commands must be wrapped in
2061       sh -c.
2062
2063       It buffers output in RAM.
2064
2065       It only takes file names from the filesystem as input (similar to
2066       find).
2067
2068       https://github.com/sharkdp/fd (Last checked: 2019-02)
2069
2070   DIFFERENCES BETWEEN lateral AND GNU Parallel
2071       lateral is very similar to sem: It takes a single command and runs it
2072       in the background. The design means that output from parallel running
2073       jobs may mix. If it dies unexpectedly, it leaves a socket in
2074       ~/.lateral/socket.PID.
2075
2076       lateral deals badly with too long command lines. This makes the lateral
2077       server crash:
2078
2079         lateral run echo `seq 100000| head -c 1000k`
2080
2081       Any options will be read by lateral so this does not work (lateral
2082       interprets the -l):
2083
2084         lateral run ls -l
2085
2086       Composed commands do not work:
2087
2088         lateral run pwd ';' ls
2089
2090       Functions do not work:
2091
2092         myfunc() { echo a; }
2093         export -f myfunc
2094         lateral run myfunc
2095
2096       Running emacs in the terminal causes the parent shell to die:
2097
2098         echo '#!/bin/bash' > mycmd
2099         echo emacs -nw >> mycmd
2100         chmod +x mycmd
2101         lateral start
2102         lateral run ./mycmd
2103
2104       Here are the examples from https://github.com/akramer/lateral with the
2105       corresponding GNU sem and GNU parallel commands:
2106
2107         1$ lateral start
2108            for i in $(cat /tmp/names); do
2109              lateral run -- some_command $i
2110            done
2111            lateral wait
2112
2113         1$ for i in $(cat /tmp/names); do
2114              sem some_command $i
2115            done
2116            sem --wait
2117
2118         1$ parallel some_command :::: /tmp/names
2119
2120         2$ lateral start
2121            for i in $(seq 1 100); do
2122              lateral run -- my_slow_command < workfile$i > /tmp/logfile$i
2123            done
2124            lateral wait
2125
2126         2$ for i in $(seq 1 100); do
2127              sem my_slow_command < workfile$i > /tmp/logfile$i
2128            done
2129            sem --wait
2130
2131         2$ parallel 'my_slow_command < workfile{} > /tmp/logfile{}' \
2132              ::: {1..100}
2133
2134         3$ lateral start -p 0 # yup, it will just queue tasks
2135            for i in $(seq 1 100); do
2136              lateral run -- command_still_outputs_but_wont_spam inputfile$i
2137            done
2138            # command output spam can commence
2139            lateral config -p 10; lateral wait
2140
2141         3$ for i in $(seq 1 100); do
2142              echo "command inputfile$i" >> joblist
2143            done
2144            parallel -j 10 :::: joblist
2145
2146         3$ echo 1 > /tmp/njobs
2147            parallel -j /tmp/njobs command inputfile{} \
2148              ::: {1..100} &
2149            echo 10 >/tmp/njobs
2150            wait
2151
2152       https://github.com/akramer/lateral (Last checked: 2019-03)
2153
2154   DIFFERENCES BETWEEN with-this AND GNU Parallel
2155       The examples from https://github.com/amritb/with-this.git and the
2156       corresponding GNU parallel commands:
2157
2158         with -v "$(cat myurls.txt)" "curl -L this"
2159         parallel curl -L :::: myurls.txt
2160
2161         with -v "$(cat myregions.txt)" \
2162           "aws --region=this ec2 describe-instance-status"
2163         parallel aws --region={} ec2 describe-instance-status \
2164           :::: myregions.txt
2165
2166         with -v "$(ls)" "kubectl --kubeconfig=this get pods"
2167         ls | parallel kubectl --kubeconfig={} get pods
2168
2169         with -v "$(ls | grep config)" "kubectl --kubeconfig=this get pods"
2170         ls | grep config | parallel kubectl --kubeconfig={} get pods
2171
2172         with -v "$(echo {1..10})" "echo 123"
2173         parallel -N0 echo 123 ::: {1..10}
2174
2175       Stderr is merged with stdout. with-this buffers in RAM. It uses 3x the
2176       output size, so you cannot have output larger than 1/3rd the amount of
2177       RAM. The input values cannot contain spaces. Composed commands do not
2178       work.
2179
2180       with-this gives some additional information, so the output has to be
2181       cleaned before piping it to the next command.
2182
2183       https://github.com/amritb/with-this.git (Last checked: 2019-03)
2184
2185   DIFFERENCES BETWEEN Tollef's parallel (moreutils) AND GNU Parallel
2186       Summary table (see legend above): - - - I4 - - I7 - - M3 - - M6 - O2 O3
2187       - O5 O6 - x x E1 - - - - - E7 - x x x x x x x x - -
2188
2189       EXAMPLES FROM Tollef's parallel MANUAL
2190
2191       Tollef parallel sh -c "echo hi; sleep 2; echo bye" -- 1 2 3
2192
2193       GNU parallel "echo hi; sleep 2; echo bye" ::: 1 2 3
2194
2195       Tollef parallel -j 3 ufraw -o processed -- *.NEF
2196
2197       GNU parallel -j 3 ufraw -o processed ::: *.NEF
2198
2199       Tollef parallel -j 3 -- ls df "echo hi"
2200
2201       GNU parallel -j 3 ::: ls df "echo hi"
2202
2203       (Last checked: 2019-08)
2204
2205   DIFFERENCES BETWEEN rargs AND GNU Parallel
2206       Summary table (see legend above): I1 - - - - - I7 - - M3 M4 - - - O2 O3
2207       - O5 O6 - O8 - E1 - - E4 - - - - - - - - - - - - - -
2208
2209       rargs has elegant ways of doing named regexp capture and field ranges.
2210
2211       With GNU parallel you can use --rpl to get a similar functionality as
2212       regexp capture gives, and use join and @arg to get the field ranges.
2213       But the syntax is longer. This:
2214
2215         --rpl '{r(\d+)\.\.(\d+)} $_=join"$opt::colsep",@arg[$$1..$$2]'
2216
2217       would make it possible to use:
2218
2219         {1r3..6}
2220
2221       for field 3..6.
2222
2223       For full support of {n..m:s} including negative numbers use a dynamic
2224       replacement string like this:
2225
2226         PARALLEL=--rpl\ \''{r((-?\d+)?)\.\.((-?\d+)?)((:([^}]*))?)}
2227                 $a = defined $$2 ? $$2 < 0 ? 1+$#arg+$$2 : $$2 : 1;
2228                 $b = defined $$4 ? $$4 < 0 ? 1+$#arg+$$4 : $$4 : $#arg+1;
2229                 $s = defined $$6 ? $$7 : " ";
2230                 $_ = join $s,@arg[$a..$b]'\'
2231         export PARALLEL
2232
2233       You can then do:
2234
2235         head /etc/passwd | parallel --colsep : echo ..={1r..} ..3={1r..3} \
2236           4..={1r4..} 2..4={1r2..4} 3..3={1r3..3} ..3:-={1r..3:-} \
2237           ..3:/={1r..3:/} -1={-1} -5={-5} -6={-6} -3..={1r-3..}
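       The field ranges these replacement strings produce can be
       cross-checked with plain cut and awk on a fixed separator (a small
       illustration that needs neither rargs nor GNU parallel; negative
       indices count from the end):

```shell
# Fields 3..6 of a colon-separated line, like {1r3..6}:
printf 'a:b:c:d:e:f:g\n' | cut -d: -f3-6
# The last 3 fields, like {1r-3..}:
printf 'a:b:c:d:e:f:g\n' |
  awk -F: '{ s = $(NF-2); for (i = NF-1; i <= NF; i++) s = s ":" $i; print s }'
```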
2238
2239       EXAMPLES FROM rargs MANUAL
2240
2241         ls *.bak | rargs -p '(.*)\.bak' mv {0} {1}
2242         ls *.bak | parallel mv {} {.}
2243
2244         cat download-list.csv | rargs -p '(?P<url>.*),(?P<filename>.*)' wget {url} -O {filename}
2245         cat download-list.csv | parallel --csv wget {1} -O {2}
2246         # or use regexps:
2247         cat download-list.csv |
2248           parallel --rpl '{url} s/,.*//' --rpl '{filename} s/.*?,//' wget {url} -O {filename}
2249
2250         cat /etc/passwd | rargs -d: echo -e 'id: "{1}"\t name: "{5}"\t rest: "{6..::}"'
2251         cat /etc/passwd |
2252           parallel -q --colsep : echo -e 'id: "{1}"\t name: "{5}"\t rest: "{=6 $_=join":",@arg[6..$#arg]=}"'
2253
2254       https://github.com/lotabout/rargs (Last checked: 2020-01)
2255
2256   DIFFERENCES BETWEEN threader AND GNU Parallel
2257       Summary table (see legend above): I1 - - - - - - M1 - M3 - - M6 O1 - O3
2258       - O5 - - N/A N/A E1 - - E4 - - - - - - - - - - - - - -
2259
2260       Newline separates arguments, but newline at the end of file is treated
2261       as an empty argument. So this runs 2 jobs:
2262
2263         echo two_jobs | threader -run 'echo "$THREADID"'
2264
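       The extra job comes from the trailing newline that echo adds:
       splitting on newline without discarding the empty tail yields two
       pieces. awk's split() shows the count (an illustration of the
       splitting behaviour, not of threader itself):

```shell
# "two_jobs\n" split on "\n" gives "two_jobs" and "" -- two
# arguments, so a splitter keeping empty fields sees two jobs.
awk 'BEGIN { n = split("two_jobs\n", a, "\n"); print n }'
```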
2265       threader ignores stderr, so any output to stderr is lost. threader
2266       buffers in RAM, so output bigger than the machine's virtual memory will
2267       cause the machine to crash.
2268
2269       https://github.com/voodooEntity/threader (Last checked: 2020-04)
2270
2271   DIFFERENCES BETWEEN runp AND GNU Parallel
2272       Summary table (see legend above): I1 I2 - - - - - M1 - (M3) - - M6 O1
2273       O2 O3 - O5 O6 - N/A N/A - E1 - - - - - - - - - - - - - - - - -
2274
2275       (M3): You can add a prefix and a postfix to the input, so it means you
2276       can only insert the argument on the command line once.
2277
2278       runp runs 10 jobs in parallel by default.  runp blocks if output of a
2279       command is > 64 Kbytes.  Quoting of input is needed.  It adds output to
2280       stderr (this can be prevented with -q).
2281
2282       Examples as GNU Parallel
2283
2284         base='https://images-api.nasa.gov/search'
2285         query='jupiter'
2286         desc='planet'
2287         type='image'
2288         url="$base?q=$query&description=$desc&media_type=$type"
2289
2290         # Download the images in parallel using runp
2291         curl -s $url | jq -r .collection.items[].href | \
2292           runp -p 'curl -s' | jq -r .[] | grep large | \
2293           runp -p 'curl -s -L -O'
2294
2295         time curl -s $url | jq -r .collection.items[].href | \
2296           runp -g 1 -q -p 'curl -s' | jq -r .[] | grep large | \
2297           runp -g 1 -q -p 'curl -s -L -O'
2298
2299         # Download the images in parallel
2300         curl -s $url | jq -r .collection.items[].href | \
2301           parallel curl -s | jq -r .[] | grep large | \
2302           parallel curl -s -L -O
2303
2304         time curl -s $url | jq -r .collection.items[].href | \
2305           parallel -j 1 curl -s | jq -r .[] | grep large | \
2306           parallel -j 1 curl -s -L -O
2307
2308       Run some test commands (read from file)
2309
2310         # Create a file containing commands to run in parallel.
2311         cat << EOF > /tmp/test-commands.txt
2312         sleep 5
2313         sleep 3
2314         blah     # this will fail
2315         ls $PWD  # PWD shell variable is used here
2316         EOF
2317
2318         # Run commands from the file.
2319         runp /tmp/test-commands.txt > /dev/null
2320
2321         parallel -a /tmp/test-commands.txt > /dev/null
2322
2323       Ping several hosts and see packet loss (read from stdin)
2324
2325         # First copy this line and press Enter
2326         runp -p 'ping -c 5 -W 2' -s '| grep loss'
2327         localhost
2328         1.1.1.1
2329         8.8.8.8
2330         # Press Enter and Ctrl-D when done entering the hosts
2331
2332         # First copy this line and press Enter
2333         parallel ping -c 5 -W 2 {} '| grep loss'
2334         localhost
2335         1.1.1.1
2336         8.8.8.8
2337         # Press Enter and Ctrl-D when done entering the hosts
2338
2339       Get directories' sizes (read from stdin)
2340
2341         echo -e "$HOME\n/etc\n/tmp" | runp -q -p 'sudo du -sh'
2342
2343         echo -e "$HOME\n/etc\n/tmp" | parallel sudo du -sh
2344         # or:
2345         parallel sudo du -sh ::: "$HOME" /etc /tmp
2346
2347       Compress files
2348
2349         find . -iname '*.txt' | runp -p 'gzip --best'
2350
2351         find . -iname '*.txt' | parallel gzip --best
2352
2353       Measure HTTP request + response time
2354
2355         export CURL="curl -w 'time_total:  %{time_total}\n'"
2356         CURL="$CURL -o /dev/null -s https://golang.org/"
2357         perl -wE 'for (1..10) { say $ENV{CURL} }' |
2358            runp -q  # Make 10 requests
2359
2360         perl -wE 'for (1..10) { say $ENV{CURL} }' | parallel
2361         # or:
2362         parallel -N0 "$CURL" ::: {1..10}
2363
2364       Find open TCP ports
2365
2366         cat << EOF > /tmp/host-port.txt
2367         localhost 22
2368         localhost 80
2369         localhost 81
2370         127.0.0.1 443
2371         127.0.0.1 444
2372         scanme.nmap.org 22
2373         scanme.nmap.org 23
2374         scanme.nmap.org 443
2375         EOF
2376
2377         cat /tmp/host-port.txt | \
2378           runp -q -p 'netcat -v -w2 -z' 2>&1 | egrep '(succeeded!|open)$'
2379
2380         # --colsep is needed to split the line
2381         cat /tmp/host-port.txt | \
2382           parallel --colsep ' ' netcat -v -w2 -z 2>&1 | egrep '(succeeded!|open)$'
2383         # or use uq for unquoted:
2384         cat /tmp/host-port.txt | \
2385           parallel netcat -v -w2 -z {=uq=} 2>&1 | egrep '(succeeded!|open)$'
2386
2387       https://github.com/jreisinger/runp (Last checked: 2020-04)
2388
2389   DIFFERENCES BETWEEN papply AND GNU Parallel
2390       Summary table (see legend above): - - - I4 - - - M1 - M3 - - M6 - - O3
2391       - O5 - - N/A N/A O10 E1 - - E4 - - - - - - - - - - - - - -
2392
2393       papply does not print the output if the command fails:
2394
2395         $ papply 'echo %F; false' foo
2396         "echo foo; false" did not succeed
2397
2398       papply's replacement strings (%F %d %f %n %e %z) can be simulated in
2399       GNU parallel by putting this in ~/.parallel/config:
2400
2401         --rpl '%F'
2402         --rpl '%d $_=Q(::dirname($_));'
2403         --rpl '%f s:.*/::;'
2404         --rpl '%n s:.*/::;s:\.[^/.]+$::;'
2405         --rpl '%e s:.*\.:.:'
2406         --rpl '%z $_=""'
2407
2408       papply buffers output in RAM, and uses twice the size of the output:
2409       5 GB of output takes 10 GB of RAM.
2410
2411       The buffering is very CPU intensive: buffering 5 GB of output takes
2412       40 seconds (compared to 10 seconds with GNU parallel).
2413
2414       Examples as GNU Parallel
2415
2416         1$ papply gzip *.txt
2417
2418         1$ parallel gzip ::: *.txt
2419
2420         2$ papply "convert %F %n.jpg" *.png
2421
2422         2$ parallel convert {} {.}.jpg ::: *.png
2423
2424       https://pypi.org/project/papply/ (Last checked: 2020-04)
2425
2426   DIFFERENCES BETWEEN async AND GNU Parallel
2427       Summary table (see legend above): - - - I4 - - I7 - - - - - M6 - O2 O3
2428       - O5 O6 - N/A N/A O10 E1 - - E4 - E6 - - - - - - - - - - S1 S2
2429
2430       async is very similar to GNU parallel's --semaphore mode (aka sem).
2431       async requires the user to start a server process.
2432
2433       The input is quoted as if you used -q, so you need bash -c "...;..."
2434       to run composed commands.
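The bash -c mechanics by themselves, without any async server (a minimal sketch):

```shell
# When the tool quotes its input as one unit, the ';' is not
# interpreted; handing the string to an explicit shell makes it a
# composed command again:
bash -c "echo a; echo b"
# prints a and b on separate lines
```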
2435
2436       Examples as GNU Parallel
2437
2438         1$ S="/tmp/example_socket"
2439
2440         1$ ID=myid
2441
2442         2$ async -s="$S" server --start
2443
2444         2$ # GNU Parallel does not need a server to run
2445
2446         3$ for i in {1..20}; do
2447                # prints command output to stdout
2448                async -s="$S" cmd -- bash -c "sleep 1 && echo test $i"
2449            done
2450
2451         3$ for i in {1..20}; do
2452                # prints command output to stdout
2453                sem --id "$ID" -j100% "sleep 1 && echo test $i"
2454                # GNU Parallel will only print job when it is done
2455                # If you need output from different jobs to mix
2456                # use -u or --line-buffer
2457                sem --id "$ID" -j100% --line-buffer "sleep 1 && echo test $i"
2458            done
2459
2460         4$ # wait until all commands are finished
2461            async -s="$S" wait
2462
2463         4$ sem --id "$ID" --wait
2464
2465         5$ # configure the server to run four commands in parallel
2466            async -s="$S" server -j4
2467
2468         5$ export PARALLEL=-j4
2469
2470         6$ mkdir "/tmp/ex_dir"
2471            for i in {21..40}; do
2472              # redirects command output to /tmp/ex_dir/file*
2473              async -s="$S" cmd -o "/tmp/ex_dir/file$i" -- \
2474                bash -c "sleep 1 && echo test $i"
2475            done
2476
2477         6$ mkdir "/tmp/ex_dir"
2478            for i in {21..40}; do
2479              # redirects command output to /tmp/ex_dir/file*
2480              sem --id "$ID" --result '/tmp/my-ex/file-{=$_=""=}'"$i" \
2481                "sleep 1 && echo test $i"
2482            done
2483
2484         7$ sem --id "$ID" --wait
2485
2486         7$ async -s="$S" wait
2487
2488         8$ # stops server
2489            async -s="$S" server --stop
2490
2491         8$ # GNU Parallel does not need to stop a server
2492
2493       https://github.com/ctbur/async/ (Last checked: 2020-11)
2494
2495   Todo
         test_many_var() {
           gen500k() {
             seq -f %f 1000000000000000 1000000000050000 | head -c 131000
           }
           for a in `seq 11000`; do eval "export a$a=1" ; done
           gen500k | stdout parallel --timeout 5 -Xj1 'echo {} {} {} {} | wc' |
             perl -pe 's/\d{3,5} //g'
         }
2500
         test_many_var_func() {
           gen500k() {
             seq -f %f 1000000000000000 1000000000050000 | head -c 131000
           }
           for a in `seq 5100`; do eval "export a$a=1" ; done
           for a in `seq 5100`; do eval "a$a() { 1; }" ; done
           for a in `seq 5100`; do eval export -f a$a ; done
           gen500k | stdout parallel --timeout 21 -Xj1 'echo {} {} {} {} | wc' |
             perl -pe 's/\d{3,5} //g'
         }
2507
         # Only defines functions, so presumably meant to be test_many_func
         # (the original reused the name test_many_var_func):
         test_many_func() {
           gen500k() {
             seq -f %f 1000000000000000 1000000000050000 | head -c 131000
           }
           for a in `seq 8000`; do eval "a$a() { 1; }" ; done
           for a in `seq 8000`; do eval export -f a$a ; done
           gen500k | stdout parallel --timeout 6 -Xj1 'echo {} {} {} {} | wc' |
             perl -pe 's/\d{3,5} //g'
         }
2513
         test_big_func() {
           gen500k() {
             seq -f %f 1000000000000000 1000000000050000 | head -c 131000
           }
           big=`seq 1000`
           for a in `seq 50`; do eval "a$a() { '$big'; }" ; done
           for a in `seq 50`; do eval export -f a$a ; done
           gen500k | stdout parallel --timeout 4 -Xj1 'echo {} {} {} {} | wc' |
             perl -pe 's/\d{3,5} //g'
         }
2519
         test_many_var_big_func() {
           gen500k() {
             seq -f %f 1000000000000000 1000000000050000 | head -c 131000
           }
           big=`seq 1000`
           for a in `seq 5100`; do eval "export a$a=1" ; done
           for a in `seq 20`; do eval "a$a() { '$big'; }" ; done
           for a in `seq 20`; do eval export -f a$a ; done
           gen500k | stdout parallel --timeout 6 -Xj1 'echo {} {} {} {} | wc' |
             perl -pe 's/\d{3,5} //g'
         }
2526
         test_big_func_name() {
           gen500k() {
             seq -f %f 1000000000000000 1000000000050000 | head -c 131000
           }
           big=`perl -e print\"x\"x10000`
           for a in `seq 20`; do eval "export a$big$a=1" ; done
           gen500k | stdout parallel --timeout 8 -Xj1 'echo {} {} {} {} | wc' |
             perl -pe 's/\d{3,5} //g'
         }
2532
         test_big_var_func_name() {
           gen500k() {
             seq -f %f 1000000000000000 1000000000050000 | head -c 131000
           }
           big=`perl -e print\"x\"x10000`
           for a in `seq 2`; do eval "export a$big$a=1" ; done
           for a in `seq 2`; do eval "a$big$a() { '$big'; }" ; done
           for a in `seq 2`; do eval export -f a$big$a ; done
           gen500k | stdout parallel --timeout 1000 -Xj1 'echo {} {} {} {} | wc' |
             perl -pe 's/\d{3,5} //g'
         }
2539
         tange@macosx:~$ for a in `seq 100`; do
                           eval export a$a=fffffffffffffffffffffffff
                         done
         tange@macosx:~$ seq 50000 | stdout parallel -Xj1 'echo {} {} | wc' |
                           perl -pe 's/\d{3,5} //g'
         tange@macosx:~$ for a in `seq 100`; do eval export -f a$a ; done

         seq 100000 | stdout parallel -Xj1 'echo {} {} | wc'
         export a=`seq 10000`
         seq 100000 | stdout parallel -Xj1 'echo {} {} | wc'
2550
2551           my $already_spread;
2552           my $env_size;
2553
2554               if($^O eq "darwin") {
2555                   $env_size ||= 500+length(join'',%ENV);
2556                   $max_len -= $env_size;
2557               }
2558
2559       PASH: Light-touch Data-Parallel Shell Processing
2560       https://arxiv.org/pdf/2007.09436.pdf
2561
2562       https://github.com/UnixJunkie/pardi
2563
2564       https://github.com/UnixJunkie/PAR (Same as
2565       http://savannah.nongnu.org/projects/par above?)
2566
2567       https://gitlab.com/netikras/bthread
2568
2569       https://github.com/JeiKeiLim/simple_distribute_job
2570
2571       https://github.com/reggi/pkgrun
2572
2573       https://github.com/benoror/better-npm-run - not obvious how to use
2574
2575       https://github.com/bahmutov/with-package
2576
2577       https://github.com/xuchenCN/go-pssh
2578
2579       https://github.com/flesler/parallel
2580
2581       https://github.com/Julian/Verge
2582
2583       https://github.com/ExpectationMax/simple_gpu_scheduler
2584           simple_gpu_scheduler --gpus 0 1 2 < gpu_commands.txt
2585           parallel -j3 --shuf CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =}
2586       {=uq;=}' < gpu_commands.txt
2587
2588           simple_hypersearch "python3 train_dnn.py --lr {lr} --batch_size {bs}" -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 | simple_gpu_scheduler --gpus 0,1,2
2589           parallel --header : --shuf -j3 -v CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =}' python3 train_dnn.py --lr {lr} --batch_size {bs} ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
2590
2591           simple_hypersearch "python3 train_dnn.py --lr {lr} --batch_size {bs}" --n-samples 5 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 | simple_gpu_scheduler --gpus 0,1,2
2592           parallel --header : --shuf CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1; seq() > 5 and skip() =}' python3 train_dnn.py --lr {lr} --batch_size {bs} ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
2593
2594           touch gpu.queue
2595           tail -f -n 0 gpu.queue | simple_gpu_scheduler --gpus 0,1,2 &
2596           echo "my_command_with | and stuff > logfile" >> gpu.queue
2597
2598           touch gpu.queue
2599           tail -f -n 0 gpu.queue | parallel -j3 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' &
2600           # Needed to fill job slots once
2601           seq 3 | parallel echo true >> gpu.queue
2602           # Add jobs
2603           echo "my_command_with | and stuff > logfile" >> gpu.queue
2604           # Needed to flush output from completed jobs
2605           seq 3 | parallel echo true >> gpu.queue
2606

TESTING OTHER TOOLS

2608       There are certain issues that are very common among parallelizing
2609       tools. Here are a few stress tests. Be warned: if the tool is badly
2610       coded, it may overload your machine.
2611
2612   MIX: Output mixes
2613       Output from 2 jobs should not mix. If the output is not used, this does
2614       not matter; but if the output is used then it is important that you do
2615       not get half a line from one job followed by half a line from another
2616       job.
2617
2618       If the tool does not buffer, output will most likely mix now and then.
2619
2620       This test stresses whether output mixes.
2621
2622         #!/bin/bash
2623
2624         paralleltool="parallel -j0"
2625
2626         cat <<-EOF > mycommand
2627         #!/bin/bash
2628
2629         # If a, b, c, d, e, and f mix: Very bad
2630         perl -e 'print STDOUT "a"x3000_000," "'
2631         perl -e 'print STDERR "b"x3000_000," "'
2632         perl -e 'print STDOUT "c"x3000_000," "'
2633         perl -e 'print STDERR "d"x3000_000," "'
2634         perl -e 'print STDOUT "e"x3000_000," "'
2635         perl -e 'print STDERR "f"x3000_000," "'
2636         echo
2637         echo >&2
2638         EOF
2639         chmod +x mycommand
2640
2641         # Run 30 jobs in parallel
2642         seq 30 |
2643           $paralleltool ./mycommand > >(tr -s abcdef) 2> >(tr -s abcdef >&2)
2644
2645         # 'a c e' and 'b d f' should always stay together
2646         # and there should only be a single line per job
2647
2648   STDERRMERGE: Stderr is merged with stdout
2649       Output from stdout and stderr should not be merged, but kept separated.
2650
2651       This test shows whether stdout is mixed with stderr.
2652
2653         #!/bin/bash
2654
2655         paralleltool="parallel -j0"
2656
2657         cat <<-EOF > mycommand
2658         #!/bin/bash
2659
2660         echo stdout
2661         echo stderr >&2
2662         echo stdout
2663         echo stderr >&2
2664         EOF
2665         chmod +x mycommand
2666
2667         # Run one job
2668         echo |
2669           $paralleltool ./mycommand > stdout 2> stderr
2670         cat stdout
2671         cat stderr
2672
2673   RAM: Output limited by RAM
2674       Some tools cache output in RAM. This makes them extremely slow if the
2675       output is bigger than physical memory, and makes them crash if the
2676       output is bigger than the virtual memory.
2677
2678         #!/bin/bash
2679
2680         paralleltool="parallel -j0"
2681
2682         cat <<'EOF' > mycommand
2683         #!/bin/bash
2684
2685         # Generate 1 GB output
2686         yes "`perl -e 'print \"c\"x30_000'`" | head -c 1G
2687         EOF
2688         chmod +x mycommand
2689
2690         # Run 20 jobs in parallel
2691         # Adjust 20 to be > physical RAM and < free space on /tmp
2692         seq 20 | time $paralleltool ./mycommand | wc -c
2693
2694   DISKFULL: Incomplete data if /tmp runs full
2695       If caching is done on disk, the disk can run full during the run. Not
2696       all programs discover this. GNU parallel discovers it if the disk
2697       stays full for at least 2 seconds.
2698
2699         #!/bin/bash
2700
2701         paralleltool="parallel -j0"
2702
2703         # This should be a dir with less than 100 GB free space
2704         smalldisk=/tmp/shm/parallel
2705
2706         TMPDIR="$smalldisk"
2707         export TMPDIR
2708
2709         max_output() {
2710             # Force worst case scenario:
2711             # Make GNU Parallel only check once per second
2712             sleep 10
2713             # Generate 100 GB to fill $TMPDIR
2714             # Adjust if /tmp is bigger than 100 GB
2715             yes | head -c 100G >$TMPDIR/$$
2716             # Generate 10 MB output that will not be buffered due to full disk
2717             perl -e 'print "X"x10_000_000' | head -c 10M
2718             echo This part is missing from incomplete output
2719             sleep 2
2720             rm $TMPDIR/$$
2721             echo Final output
2722         }
2723
2724         export -f max_output
2725         seq 10 | $paralleltool max_output | tr -s X
2726
2727   CLEANUP: Leaving tmp files at unexpected death
2728       Some tools do not clean up their tmp files if they are killed. In
2729       particular, tools that buffer on disk may leave those files behind.
2730
2731         #!/bin/bash
2732
2733         paralleltool=parallel
2734
2735         ls /tmp >/tmp/before
2736         seq 10 | $paralleltool sleep &
2737         pid=$!
2738         # Give the tool time to start up
2739         sleep 1
2740         # Kill it without giving it a chance to cleanup
2741         kill -9 $pid
2742         # Should be empty: No files should be left behind
2743         diff <(ls /tmp) /tmp/before
2744
2745   SPCCHAR: Dealing badly with special file names
2746       It is not uncommon for users to create files like:
2747
2748         My brother's 12" *** record  (costs $$$).jpg
2749
2750       Some tools break on this.
2751
2752         #!/bin/bash
2753
2754         paralleltool=parallel
2755
2756         touch "My brother's 12\" *** record  (costs \$\$\$).jpg"
2757         ls My*jpg | $paralleltool ls -l
2758
2759   COMPOSED: Composed commands do not work
2760       Some tools require you to wrap composed commands into bash -c.
2761
2762         echo bar | $paralleltool echo foo';' echo {}
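For such tools the usual workaround is to hand the composition to an explicit shell; sketched here with xargs (which also lacks composed commands), passing the argument positionally so it is not injected into the shell string:

```shell
# bash receives the composed command as one string; the input line
# arrives as "$1" instead of being pasted into the command.
echo bar | xargs -I{} bash -c 'echo foo; echo "$1"' _ {}
# prints: foo
#         bar
```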
2763
2764   ONEREP: Only one replacement string allowed
2765       Some tools can only insert the argument once.
2766
2767         echo bar | $paralleltool echo {} foo {}
2768
2769   INPUTSIZE: Length of input should not be limited
2770       Some tools limit the length of the input lines artificially with no
2771       good reason. GNU parallel does not:
2772
2773         perl -e 'print "foo."."x"x100_000_000' | parallel echo {.}
2774
2775       GNU parallel limits the command to run to 128 KB due to execve(2):
2776
2777         perl -e 'print "x"x131_000' | parallel echo {} | wc
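The 128 KB figure stems from the kernel's limit on the combined size of arguments and environment passed to execve(2); what a given system actually allows can be inspected with getconf (the exact value varies by OS):

```shell
# ARG_MAX is the maximum number of bytes of argv + environ for a
# newly exec'ed process; GNU parallel packs arguments to stay below it.
getconf ARG_MAX
```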
2778
2779   NUMWORDS: Speed depends on number of words
2780       Some tools become very slow if output lines have many words.
2781
2782         #!/bin/bash
2783
2784         paralleltool=parallel
2785
2786         cat <<-EOF > mycommand
2787         #!/bin/bash
2788
2789         # 10 MB of lines with 1000 words
2790         yes "`seq 1000`" | head -c 10M
2791         EOF
2792         chmod +x mycommand
2793
2794         # Run 30 jobs in parallel
2795         seq 30 | time $paralleltool -j0 ./mycommand > /dev/null
2796
2797   4GB: Output with a line > 4GB should be OK
2798         #!/bin/bash
2799
2800         paralleltool="parallel -j0"
2801
2802         cat <<-EOF > mycommand
2803         #!/bin/bash
2804
2805         perl -e '\$a="a"x1000_000; for(1..5000) { print \$a }'
2806         EOF
2807         chmod +x mycommand
2808
2809         # Run 1 job
2810         seq 1 | $paralleltool ./mycommand | LC_ALL=C wc
2811

AUTHOR

2813       When using GNU parallel for a publication please cite:
2814
2815       O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
2816       The USENIX Magazine, February 2011:42-47.
2817
2818       This helps funding further development; and it won't cost you a cent.
2819       If you pay 10000 EUR you should feel free to use GNU Parallel without
2820       citing.
2821
2822       Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk
2823
2824       Copyright (C) 2008-2010 Ole Tange, http://ole.tange.dk
2825
2826       Copyright (C) 2010-2020 Ole Tange, http://ole.tange.dk and Free
2827       Software Foundation, Inc.
2828
2829       Parts of the manual concerning xargs compatibility are inspired by
2830       the manual of xargs from GNU findutils 4.4.2.
2831

LICENSE

2833       This program is free software; you can redistribute it and/or modify it
2834       under the terms of the GNU General Public License as published by the
2835       Free Software Foundation; either version 3 of the License, or at your
2836       option any later version.
2837
2838       This program is distributed in the hope that it will be useful, but
2839       WITHOUT ANY WARRANTY; without even the implied warranty of
2840       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
2841       General Public License for more details.
2842
2843       You should have received a copy of the GNU General Public License along
2844       with this program.  If not, see <http://www.gnu.org/licenses/>.
2845
2846   Documentation license I
2847       Permission is granted to copy, distribute and/or modify this
2848       documentation under the terms of the GNU Free Documentation License,
2849       Version 1.3 or any later version published by the Free Software
2850       Foundation; with no Invariant Sections, with no Front-Cover Texts, and
2851       with no Back-Cover Texts.  A copy of the license is included in the
2852       file fdl.txt.
2853
2854   Documentation license II
2855       You are free:
2856
2857       to Share to copy, distribute and transmit the work
2858
2859       to Remix to adapt the work
2860
2861       Under the following conditions:
2862
2863       Attribution
2864                You must attribute the work in the manner specified by the
2865                author or licensor (but not in any way that suggests that they
2866                endorse you or your use of the work).
2867
2868       Share Alike
2869                If you alter, transform, or build upon this work, you may
2870                distribute the resulting work only under the same, similar or
2871                a compatible license.
2872
2873       With the understanding that:
2874
2875       Waiver   Any of the above conditions can be waived if you get
2876                permission from the copyright holder.
2877
2878       Public Domain
2879                Where the work or any of its elements is in the public domain
2880                under applicable law, that status is in no way affected by the
2881                license.
2882
2883       Other Rights
2884                In no way are any of the following rights affected by the
2885                license:
2886
2887                • Your fair dealing or fair use rights, or other applicable
2888                  copyright exceptions and limitations;
2889
2890                • The author's moral rights;
2891
2892                • Rights other persons may have either in the work itself or
2893                  in how the work is used, such as publicity or privacy
2894                  rights.
2895
2896       Notice   For any reuse or distribution, you must make clear to others
2897                the license terms of this work.
2898
2899       A copy of the full license is included in the file as cc-by-sa.txt.
2900

DEPENDENCIES

2902       GNU parallel uses Perl, and the Perl modules Getopt::Long, IPC::Open3,
2903       Symbol, IO::File, POSIX, and File::Temp. For remote usage it also uses
2904       rsync with ssh.
2905

SEE ALSO

2907       find(1), xargs(1), make(1), pexec(1), ppss(1), xjobs(1), prll(1),
2908       dxargs(1), mdm(1)
2909
2910
2911
20201122                          2020-12-20          PARALLEL_ALTERNATIVES(7)