PARALLEL_ALTERNATIVES(7)           parallel           PARALLEL_ALTERNATIVES(7)



NAME
       parallel_alternatives - Alternatives to GNU parallel

DIFFERENCES BETWEEN GNU Parallel AND ALTERNATIVES
       There are a lot of programs that offer some of the functionality of
       GNU parallel. GNU parallel strives to include the best of their
       functionality without sacrificing ease of use.

   SUMMARY TABLE
       The following features are in some of the comparable tools:

       Inputs
        I1. Arguments can be read from stdin
        I2. Arguments can be read from a file
        I3. Arguments can be read from multiple files
        I4. Arguments can be read from the command line
        I5. Arguments can be read from a table
        I6. Arguments can be read from the same file using #! (shebang)
        I7. Line-oriented input as default (quoting of special characters
       not needed)

       Manipulation of input
        M1. Composed command
        M2. Multiple arguments can fill up an execution line
        M3. Arguments can be put anywhere in the execution line
        M4. Multiple arguments can be put anywhere in the execution line
        M5. Arguments can be replaced with context
        M6. Input can be treated as the complete command line

       Outputs
        O1. Grouping output so output from different jobs does not mix
        O2. Send stderr (standard error) to stderr (standard error)
        O3. Send stdout (standard output) to stdout (standard output)
        O4. Order of output can be same as order of input
        O5. Stdout only contains stdout (standard output) from the command
        O6. Stderr only contains stderr (standard error) from the command

       Execution
        E1. Running jobs in parallel
        E2. List running jobs
        E3. Finish running jobs, but do not start new jobs
        E4. Number of running jobs can depend on number of CPUs
        E5. Finish running jobs, but do not start new jobs after first failure
        E6. Number of running jobs can be adjusted while running

       Remote execution
        R1. Jobs can be run on remote computers
        R2. Basefiles can be transferred
        R3. Argument files can be transferred
        R4. Result files can be transferred
        R5. Cleanup of transferred files
        R6. No config files needed
        R7. Do not run more than SSHD's MaxStartups can handle
        R8. Configurable SSH command
        R9. Retry if connection breaks occasionally

       Semaphore
        S1. Possibility to work as a mutex
        S2. Possibility to work as a counting semaphore

       Legend
        - = no
        x = not applicable
        ID = yes

       As not every new version of every program is tested, the table may be
       outdated. Please file a bug report if you find errors (see REPORTING
       BUGS).

       parallel: I1 I2 I3 I4 I5 I6 I7 M1 M2 M3 M4 M5 M6 O1 O2 O3 O4 O5 O6 E1
       E2 E3 E4 E5 E6 R1 R2 R3 R4 R5 R6 R7 R8 R9 S1 S2

       xargs: I1 I2 -  -  -  -  - -  M2 M3 -  -  - -  O2 O3 -  O5 O6 E1 -  -
       -  -  - -  -  -  -  -  x  -  -  - -  -

       find -exec: -  -  -  x  -  x  - -  M2 M3 -  -  -  - -  O2 O3 O4 O5 O6 -
       -  -  -  -  -  - -  -  -  -  -  -  -  -  - x  x

       make -j: -  -  -  -  -  -  - -  -  -  -  -  - O1 O2 O3 -  x  O6 E1 -  -
       -  E5 - -  -  -  -  -  -  -  -  - -  -

       ppss: I1 I2 -  -  -  -  I7 M1 -  M3 -  -  M6 O1 -  -  x  -  - E1 E2 ?E3
       E4 - - R1 R2 R3 R4 -  -  ?R7 ? ?  -  -

       pexec: I1 I2 -  I4 I5 -  - M1 -  M3 -  -  M6 O1 O2 O3 -  O5 O6 E1 -  -
       E4 -  E6 R1 -  -  -  -  R6 -  -  - S1 -

       xjobs, prll, dxargs, mdm/middleman, xapply, paexec, ladon, jobflow,
       ClusterSSH: TODO - Please file a bug report if you know what features
       they support (see REPORTING BUGS).

   DIFFERENCES BETWEEN xargs AND GNU Parallel
       xargs offers some of the same possibilities as GNU parallel.

       xargs deals badly with special characters (such as space, \, ' and ").
       To see the problem try this:

         touch important_file
         touch 'not important_file'
         ls not* | xargs rm
         mkdir -p "My brother's 12\" records"
         ls | xargs rmdir
         touch 'c:\windows\system32\clfs.sys'
         echo 'c:\windows\system32\clfs.sys' | xargs ls -l

       You can specify -0, but many input generators are not optimized for
       using NUL as separator but are optimized for newline as separator,
       e.g. head, tail, awk, ls, echo, sed, tar -v, perl (-0 and \0 instead
       of \n), locate (requires using -0), find (requires using -print0),
       grep (requires using -z or -Z), and sort (requires using -z).
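
As an illustrative sketch (not from the original manual): with NUL separation even a filename containing a newline survives the pipeline intact:

```shell
# Create a file whose name contains a newline, then process it safely.
# find -print0 emits NUL-terminated names; xargs -0 reads them back,
# so the embedded newline is not mistaken for a separator.
dir=$(mktemp -d)
touch "$dir/$(printf 'bad\nname')"
find "$dir" -type f -print0 | xargs -0 ls -l
rm -rf "$dir"
```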

       GNU parallel's newline separation can be emulated with:

         cat | xargs -d "\n" -n1 command

       xargs can run a given number of jobs in parallel, but has no support
       for running number-of-cpu-cores jobs in parallel.
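
       A common workaround (a sketch, not an xargs feature) is to query the
       core count explicitly; nproc is part of GNU coreutils, so this
       assumes a GNU system:

```shell
# Start one echo job per CPU core; GNU parallel does the equivalent of
# this by default with no extra flags.
ncpu=$(nproc)
printf '%s\n' a b c d | xargs -d '\n' -P "$ncpu" -n 1 echo
```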

       xargs has no support for grouping the output, therefore output may run
       together, e.g. the first half of a line is from one process and the
       last half of the line is from another process. The example Parallel
       grep cannot be done reliably with xargs because of this. To see this
       in action try:

         parallel perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} \
           ::: a b c d e f
         ls -l a b c d e f
         parallel -kP4 -n1 grep 1 > out.par ::: a b c d e f
         echo a b c d e f | xargs -P4 -n1 grep 1 > out.xargs-unbuf
         echo a b c d e f | \
           xargs -P4 -n1 grep --line-buffered 1 > out.xargs-linebuf
         echo a b c d e f | xargs -n1 grep 1 > out.xargs-serial
         ls -l out*
         md5sum out*

       Or try this:

         slow_seq() {
           seq "$@" |
             perl -ne '$|=1; for(split//){ print; select($a,$a,$a,0.100);}'
         }
         export -f slow_seq
         seq 5 | xargs -n1 -P0 -I {} bash -c 'slow_seq {}'
         seq 5 | parallel -P0 slow_seq {}

       xargs has no support for keeping the order of the output, therefore if
       running jobs in parallel using xargs the output of the second job
       cannot be postponed until the first job is done.
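
       GNU parallel's -k option provides exactly this ordering guarantee. A
       small sketch (the sleep times are chosen so that later jobs finish
       first, yet the output still appears in input order):

```shell
# Job 3 sleeps 0 seconds and finishes first, but -k buffers the output
# and releases it in input order: 1, 2, 3.
seq 3 | parallel -k 'sleep $((3 - {})); echo {}'
```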

       xargs has no support for running jobs on remote computers.

       xargs has no support for context replace, so you will have to create
       the full arguments yourself.

       If you use a replace string in xargs (-I) you cannot force xargs to
       use more than one argument.
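
       GNU parallel's -X option lifts this restriction by combining context
       replace with multiple arguments per command line; a sketch (-j1
       forces a single job slot so all arguments land on one line):

```shell
# Each argument gets the surrounding context, and as many as fit are
# packed onto a single command line.
seq 4 | parallel -j1 -X echo pre-{}-post
```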

       Quoting in xargs works like -q in GNU parallel. This means composed
       commands and redirection require using bash -c.

         ls | parallel "wc {} >{}.wc"
         ls | parallel "echo {}; ls {}|wc"

       becomes (assuming you have 8 cores):

         ls | xargs -d "\n" -P8 -I {} bash -c "wc {} >{}.wc"
         ls | xargs -d "\n" -P8 -I {} bash -c "echo {}; ls {}|wc"

       https://www.gnu.org/software/findutils/

   DIFFERENCES BETWEEN find -exec AND GNU Parallel
       find -exec offers some of the same possibilities as GNU parallel.

       find -exec only works on files. So processing other input (such as
       hosts or URLs) will require creating these inputs as files. find -exec
       has no support for running commands in parallel.

       https://www.gnu.org/software/findutils/

   DIFFERENCES BETWEEN make -j AND GNU Parallel
       make -j can run jobs in parallel, but requires a crafted Makefile to
       do this. That results in extra quoting to get filenames containing
       newlines to work correctly.

       make -j computes a dependency graph before running jobs. Jobs run by
       GNU parallel do not depend on each other.

       (Very early versions of GNU parallel were coincidentally implemented
       using make -j.)
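
       As a hedged sketch of such a crafted Makefile (the file names and the
       gzip rule are hypothetical), with independent targets that make -j
       can schedule in parallel:

```shell
# Each .txt.gz target depends only on its own .txt file, so make -j can
# build them all concurrently (\t in the printf format supplies the
# mandatory recipe tab).
printf 'SRC := $(wildcard *.txt)\nGZ := $(SRC:.txt=.txt.gz)\n\nall: $(GZ)\n\n%%.txt.gz: %%.txt\n\tgzip -c $< > $@\n' > Makefile
make -j"$(nproc)"
```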

       https://www.gnu.org/software/make/

   DIFFERENCES BETWEEN ppss AND GNU Parallel
       ppss is also a tool for running jobs in parallel.

       The output of ppss is status information and thus not useful as input
       for another command. The output from the jobs is put into files.

       The argument replace string ($ITEM) cannot be changed. Arguments must
       be quoted - thus arguments containing special characters (space '"&!*)
       may cause problems. More than one argument is not supported. File
       names containing newlines are not processed correctly. When reading
       input from a file, NUL cannot be used as a terminator. ppss needs to
       read the whole input file before starting any jobs.
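
       GNU parallel, by contrast, reads NUL-terminated input with -0 and
       starts jobs as soon as the first arguments arrive; a sketch (the path
       is hypothetical):

```shell
# -print0/-0 keep file names with spaces or newlines intact, and gzip
# jobs start before find has finished listing the directory tree.
find /path/to/files -type f -print0 | parallel -0 gzip {}
```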

       Output and status information is stored in ppss_dir and thus requires
       cleanup when completed. If the dir is not removed before running ppss
       again, it may cause nothing to happen, as ppss thinks the task is
       already done. GNU parallel will normally not need cleaning up if
       running locally and will only need cleaning up if stopped abnormally
       and running remotely (--cleanup may not complete if stopped
       abnormally). The example Parallel grep would require extra
       postprocessing if written using ppss.

       For remote systems ppss requires 3 steps: config, deploy, and start.
       GNU parallel only requires one step.

       EXAMPLES FROM ppss MANUAL

       Here are the examples from ppss's manual page with the equivalent
       using GNU parallel:

       1 ./ppss.sh standalone -d /path/to/files -c 'gzip '

       1 find /path/to/files -type f | parallel gzip

       2 ./ppss.sh standalone -d /path/to/files -c 'cp "$ITEM"
       /destination/dir '

       2 find /path/to/files -type f | parallel cp {} /destination/dir

       3 ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q '

       3 parallel -a list-of-urls.txt wget -q

       4 ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q "$ITEM"'

       4 parallel -a list-of-urls.txt wget -q {}

       5 ./ppss config -C config.cfg -c 'encode.sh ' -d /source/dir -m
       192.168.1.100 -u ppss -k ppss-key.key -S ./encode.sh -n nodes.txt -o
       /some/output/dir --upload --download ; ./ppss deploy -C config.cfg ;
       ./ppss start -C config

       5 # parallel does not use configs. If you want a different username
       put it in nodes.txt: user@hostname

       5 find source/dir -type f | parallel --sshloginfile nodes.txt --trc
       {.}.mp3 lame -a {} -o {.}.mp3 --preset standard --quiet

       6 ./ppss stop -C config.cfg

       6 killall -TERM parallel

       7 ./ppss pause -C config.cfg

       7 Press: CTRL-Z or killall -SIGTSTP parallel

       8 ./ppss continue -C config.cfg

       8 Enter: fg or killall -SIGCONT parallel

       9 ./ppss.sh status -C config.cfg

       9 killall -SIGUSR2 parallel

       https://github.com/louwrentius/PPSS

   DIFFERENCES BETWEEN pexec AND GNU Parallel
       pexec is also a tool for running jobs in parallel.

       EXAMPLES FROM pexec MANUAL

       Here are the examples from pexec's info page with the equivalent
       using GNU parallel:

       1 pexec -o sqrt-%s.dat -p "$(seq 10)" -e NUM -n 4 -c -- \
         'echo "scale=10000;sqrt($NUM)" | bc'

       1 seq 10 | parallel -j4 'echo "scale=10000;sqrt({})" | bc >
       sqrt-{}.dat'

       2 pexec -p "$(ls myfiles*.ext)" -i %s -o %s.sort -- sort

       2 ls myfiles*.ext | parallel sort {} ">{}.sort"

       3 pexec -f image.list -n auto -e B -u star.log -c -- \
         'fistar $B.fits -f 100 -F id,x,y,flux -o $B.star'

       3 parallel -a image.list \
         'fistar {}.fits -f 100 -F id,x,y,flux -o {}.star' 2>star.log

       4 pexec -r *.png -e IMG -c -o - -- \
         'convert $IMG ${IMG%.png}.jpeg ; "echo $IMG: done"'

       4 ls *.png | parallel 'convert {} {.}.jpeg; echo {}: done'

       5 pexec -r *.png -i %s -o %s.jpg -c 'pngtopnm | pnmtojpeg'

       5 ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {}.jpg'

       6 for p in *.png ; do echo ${p%.png} ; done | \
         pexec -f - -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'

       6 ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'

       7 LIST=$(for p in *.png ; do echo ${p%.png} ; done)
         pexec -r $LIST -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'

       7 ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'

       8 pexec -n 8 -r *.jpg -y unix -e IMG -c \
         'pexec -j -m blockread -d $IMG | \
         jpegtopnm | pnmscale 0.5 | pnmtojpeg | \
         pexec -j -m blockwrite -s th_$IMG'

       8 Combining GNU parallel and GNU sem:

       8 ls *jpg | parallel -j8 'sem --id blockread cat {} | jpegtopnm |' \
         'pnmscale 0.5 | pnmtojpeg | sem --id blockwrite cat > th_{}'

       8 If reading and writing is done to the same disk, this may be
       faster, as only one process will be either reading or writing:

       8 ls *jpg | parallel -j8 'sem --id diskio cat {} | jpegtopnm |' \
         'pnmscale 0.5 | pnmtojpeg | sem --id diskio cat > th_{}'

       https://www.gnu.org/software/pexec/

   DIFFERENCES BETWEEN xjobs AND GNU Parallel
       xjobs is also a tool for running jobs in parallel. It only supports
       running jobs on your local computer.

       xjobs deals badly with special characters, just like xargs. See the
       section DIFFERENCES BETWEEN xargs AND GNU Parallel.

       Here are the examples from xjobs's man page with the equivalent using
       GNU parallel:

       1 ls -1 *.zip | xjobs unzip

       1 ls *.zip | parallel unzip

       2 ls -1 *.zip | xjobs -n unzip

       2 ls *.zip | parallel unzip >/dev/null

       3 find . -name '*.bak' | xjobs gzip

       3 find . -name '*.bak' | parallel gzip

       4 ls -1 *.jar | sed 's/\(.*\)/\1 > \1.idx/' | xjobs jar tf

       4 ls *.jar | parallel jar tf {} '>' {}.idx

       5 xjobs -s script

       5 cat script | parallel

       6 mkfifo /var/run/my_named_pipe; xjobs -s /var/run/my_named_pipe &
       echo unzip 1.zip >> /var/run/my_named_pipe; echo tar cf
       /backup/myhome.tar /home/me >> /var/run/my_named_pipe

       6 mkfifo /var/run/my_named_pipe; cat /var/run/my_named_pipe |
       parallel & echo unzip 1.zip >> /var/run/my_named_pipe; echo tar cf
       /backup/myhome.tar /home/me >> /var/run/my_named_pipe

       http://www.maier-komor.de/xjobs.html

   DIFFERENCES BETWEEN prll AND GNU Parallel
       prll is also a tool for running jobs in parallel. It does not support
       running jobs on remote computers.

       prll encourages using BASH aliases and BASH functions instead of
       scripts. GNU parallel supports scripts directly, functions if they are
       exported using export -f, and aliases if using env_parallel.
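
       A minimal sketch of the export -f route (the function name is made up
       for illustration; requires bash):

```shell
# An exported bash function is visible to the shells GNU parallel
# spawns, so it can be used like any other command.
doubleit() { echo $(( $1 * 2 )); }
export -f doubleit
seq 3 | parallel doubleit
```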

       prll generates a lot of status information on stderr (standard
       error), which makes it harder to use the stderr (standard error)
       output of the job directly as input for another program.

       Here is the example from prll's man page with the equivalent using
       GNU parallel:

         prll -s 'mogrify -flip $1' *.jpg
         parallel mogrify -flip ::: *.jpg

       https://github.com/exzombie/prll

   DIFFERENCES BETWEEN dxargs AND GNU Parallel
       dxargs is also a tool for running jobs in parallel.

       dxargs does not deal well with more simultaneous jobs than SSHD's
       MaxStartups. dxargs is only built for running jobs remotely and does
       not support transferring of files.

       http://www.semicomplete.com/blog/geekery/distributed-xargs.html

   DIFFERENCES BETWEEN mdm/middleman AND GNU Parallel
       middleman (mdm) is also a tool for running jobs in parallel.

       Here are the shell scripts of http://mdm.berlios.de/usage.html ported
       to GNU parallel:

         seq 19 | parallel buffon -o - | sort -n > result
         cat files | parallel cmd
         find dir -execdir sem cmd {} \;

       https://github.com/cklin/mdm

   DIFFERENCES BETWEEN xapply AND GNU Parallel
       xapply can run jobs in parallel on the local computer.

       Here are the examples from xapply's man page with the equivalent
       using GNU parallel:

       1 xapply '(cd %1 && make all)' */

       1 parallel 'cd {} && make all' ::: */

       2 xapply -f 'diff %1 ../version5/%1' manifest | more

       2 parallel diff {} ../version5/{} < manifest | more

       3 xapply -p/dev/null -f 'diff %1 %2' manifest1 checklist1

       3 parallel --link diff {1} {2} :::: manifest1 checklist1

       4 xapply 'indent' *.c

       4 parallel indent ::: *.c

       5 find ~ksb/bin -type f ! -perm -111 -print | xapply -f -v 'chmod a+x'
       -

       5 find ~ksb/bin -type f ! -perm -111 -print | parallel -v chmod a+x

       6 find */ -... | fmt 960 1024 | xapply -f -i /dev/tty 'vi' -

       6 sh <(find */ -... | parallel -s 1024 echo vi)

       6 find */ -... | parallel -s 1024 -Xuj1 vi

       7 find ... | xapply -f -5 -i /dev/tty 'vi' - - - - -

       7 sh <(find ... | parallel -n5 echo vi)

       7 find ... | parallel -n5 -uj1 vi

       8 xapply -fn "" /etc/passwd

       8 parallel -k echo < /etc/passwd

       9 tr ':' '\012' < /etc/passwd | xapply -7 -nf 'chown %1 %6' - - - - -
       - -

       9 tr ':' '\012' < /etc/passwd | parallel -N7 chown {1} {6}

       10 xapply '[ -d %1/RCS ] || echo %1' */

       10 parallel '[ -d {}/RCS ] || echo {}' ::: */

       11 xapply -f '[ -f %1 ] && echo %1' List | ...

       11 parallel '[ -f {} ] && echo {}' < List | ...

       http://carrera.databits.net/~ksb/msrc/local/bin/xapply/xapply.html

   DIFFERENCES BETWEEN AIX apply AND GNU Parallel
       apply can build command lines based on a template and arguments - very
       much like GNU parallel. apply does not run jobs in parallel. apply
       does not use an argument separator (like :::); instead the template
       must be the first argument.

       Here are the examples from
       https://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.cmds1/apply.htm

       1. To obtain results similar to those of the ls command, enter:

         apply echo *
         parallel echo ::: *

       2. To compare the file named a1 to the file named b1, and the file
       named a2 to the file named b2, enter:

         apply -2 cmp a1 b1 a2 b2
         parallel -N2 cmp ::: a1 b1 a2 b2

       3. To run the who command five times, enter:

         apply -0 who 1 2 3 4 5
         parallel -N0 who ::: 1 2 3 4 5

       4. To link all files in the current directory to the directory
       /usr/joe, enter:

         apply 'ln %1 /usr/joe' *
         parallel ln {} /usr/joe ::: *

       https://www.ibm.com/support/knowledgecenter/en/ssw_aix_61/com.ibm.aix.cmds1/apply.htm

   DIFFERENCES BETWEEN paexec AND GNU Parallel
       paexec can run jobs in parallel on both the local and remote
       computers.

       paexec requires commands to print a blank line as the last output.
       This means you will have to write a wrapper for most programs.

       paexec has a job dependency facility, so a job can depend on another
       job being executed successfully. Sort of a poor man's make.

       Here are the examples from paexec's example catalog with the
       equivalent using GNU parallel:

       1_div_X_run:
          ../../paexec -s -l -c "`pwd`/1_div_X_cmd" -n +1 <<EOF [...]
          parallel echo {} '|' `pwd`/1_div_X_cmd <<EOF [...]

       all_substr_run:
          ../../paexec -lp -c "`pwd`/all_substr_cmd" -n +3 <<EOF [...]
          parallel echo {} '|' `pwd`/all_substr_cmd <<EOF [...]

       cc_wrapper_run:
          ../../paexec -c "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
                     -n 'host1 host2' \
                     -t '/usr/bin/ssh -x' <<EOF [...]
          parallel echo {} '|' "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
                     -S host1,host2 <<EOF [...]
          # This is not exactly the same, but avoids the wrapper
          parallel gcc -O2 -c -o {.}.o {} \
                     -S host1,host2 <<EOF [...]

       toupper_run:
          ../../paexec -lp -c "`pwd`/toupper_cmd" -n +10 <<EOF [...]
          parallel echo {} '|' ./toupper_cmd <<EOF [...]
          # Without the wrapper:
          parallel echo {} '| awk {print\ toupper\(\$0\)}' <<EOF [...]

       https://github.com/cheusov/paexec

   DIFFERENCES BETWEEN map(sitaramc) AND GNU Parallel
       map sees it as a feature to have fewer features, and in doing so it
       also handles corner cases incorrectly. A lot of GNU parallel's code is
       there to handle corner cases correctly on every platform, so you will
       not get a nasty surprise if a user, for example, saves a file called:
       My brother's 12" records.txt

       map's example showing how to deal with special characters fails on
       special characters:

         echo "The Cure" > My\ brother\'s\ 12\"\ records

         ls | \
           map 'echo -n `gzip < "%" | wc -c`; echo -n '*100/'; wc -c < "%"' | bc

       It works with GNU parallel:

         ls | \
           parallel 'echo -n `gzip < {} | wc -c`; echo -n '*100/'; wc -c < {}' | bc

       And you can even get the file name prepended:

         ls | \
           parallel --tag '(echo -n `gzip < {} | wc -c`'*100/'; wc -c < {}) | bc'

       map has no support for grouping. So this gives the wrong results
       without any warnings:

         parallel perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} \
           ::: a b c d e f
         ls -l a b c d e f
         parallel -kP4 -n1 grep 1 > out.par ::: a b c d e f
         map -p 4 'grep 1' a b c d e f > out.map-unbuf
         map -p 4 'grep --line-buffered 1' a b c d e f > out.map-linebuf
         map -p 1 'grep --line-buffered 1' a b c d e f > out.map-serial
         ls -l out*
         md5sum out*

       The documentation shows a workaround, but not only does that mix
       stdout (standard output) with stderr (standard error), it also fails
       completely for certain jobs (and may even be considered less
       readable):

         parallel echo -n {} ::: 1 2 3

         map -p 4 'echo -n % 2>&1 | sed -e "s/^/$$:/"' 1 2 3 | sort | cut -f2- -d:

       map's replacement strings (% %D %B %E) can be simulated in GNU
       parallel by putting this in ~/.parallel/config:

         --rpl '%'
         --rpl '%D $_=::shell_quote(::dirname($_));'
         --rpl '%B s:.*/::;s:\.[^/.]+$::;'
         --rpl '%E s:.*\.::'
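
       The same definitions can be tried directly on the command line before
       committing them to ~/.parallel/config; e.g. the %D (dirname)
       replacement:

```shell
# Define %D ad hoc with --rpl; it expands to the directory part of the
# argument.
parallel --rpl '%D $_=::shell_quote(::dirname($_));' echo %D ::: /tmp/foo.txt
```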

       map cannot handle bundled options: map -vp 0 echo this fails

       map does not have an argument separator on the command line, but uses
       the first argument as the command. This makes quoting harder, which
       again may affect readability. Compare:

         map -p 2 perl\\\ -ne\\\ \\\'/^\\\\S+\\\\s+\\\\S+\\\$/\\\ and\\\ print\\\ \\\$ARGV,\\\"\\\\n\\\"\\\' *

         parallel -q perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' ::: *

       map can do multiple arguments with context replace, but not without
       context replace:

         parallel --xargs echo 'BEGIN{'{}'}END' ::: 1 2 3

       map does not set the exit value according to whether one of the jobs
       failed:

         parallel false ::: 1 || echo Job failed

         map false 1 || echo Never run

       map requires Perl v5.10.0, making it harder to use on old systems.

       map has no way of using % in the command (GNU parallel has -I to
       specify another replacement string than {}).

       By design map is option-incompatible with xargs; it does not have
       remote job execution, a structured way of saving results, multiple
       input sources, a progress indicator, a configurable record delimiter
       (only a field delimiter), logging of jobs run with the possibility to
       resume, keeping the output in the same order as the input, --pipe
       processing, or dynamic timeouts.

       https://github.com/sitaramc/map

   DIFFERENCES BETWEEN ladon AND GNU Parallel
       ladon can run multiple jobs on files in parallel.

       ladon only works on files, and the only way to specify files is using
       a quoted glob string (such as \*.jpg). It is not possible to list the
       files manually.

       As replacement strings it uses FULLPATH DIRNAME BASENAME EXT RELDIR
       RELPATH.

       These can be simulated using GNU parallel by putting this in
       ~/.parallel/config:

           --rpl 'FULLPATH $_=::shell_quote($_);chomp($_=qx{readlink -f $_});'
           --rpl 'DIRNAME $_=::shell_quote(::dirname($_));chomp($_=qx{readlink -f $_});'
           --rpl 'BASENAME s:.*/::;s:\.[^/.]+$::;'
           --rpl 'EXT s:.*\.::'
           --rpl 'RELDIR $_=::shell_quote($_);chomp(($_,$c)=qx{readlink -f $_;pwd});s:\Q$c/\E::;$_=::dirname($_);'
           --rpl 'RELPATH $_=::shell_quote($_);chomp(($_,$c)=qx{readlink -f $_;pwd});s:\Q$c/\E::;'

       ladon deals badly with filenames containing " and newline, and it
       fails for output larger than 200k:

           ladon '*' -- seq 36000 | wc

       EXAMPLES FROM ladon MANUAL

       It is assumed that the --rpl definitions above are put in
       ~/.parallel/config and that it is run under a shell that supports
       '**' globbing (such as zsh):

       1 ladon "**/*.txt" -- echo RELPATH

       1 parallel echo RELPATH ::: **/*.txt

       2 ladon "~/Documents/**/*.pdf" -- shasum FULLPATH >hashes.txt

       2 parallel shasum FULLPATH ::: ~/Documents/**/*.pdf >hashes.txt

       3 ladon -m thumbs/RELDIR "**/*.jpg" -- convert FULLPATH -thumbnail
       100x100^ -gravity center -extent 100x100 thumbs/RELPATH

       3 parallel mkdir -p thumbs/RELDIR\; convert FULLPATH -thumbnail
       100x100^ -gravity center -extent 100x100 thumbs/RELPATH ::: **/*.jpg

       4 ladon "~/Music/*.wav" -- lame -V 2 FULLPATH DIRNAME/BASENAME.mp3

       4 parallel lame -V 2 FULLPATH DIRNAME/BASENAME.mp3 ::: ~/Music/*.wav

       https://github.com/danielgtaylor/ladon

   DIFFERENCES BETWEEN jobflow AND GNU Parallel
       jobflow can run multiple jobs in parallel.

       Just like with xargs, output from jobflow jobs running in parallel
       mixes together by default. jobflow can buffer into files (placed in
       /run/shm), but these are not cleaned up - not even if jobflow dies
       unexpectedly. If the total output is big (in the order of RAM+swap)
       it can cause the system to run out of memory.

       jobflow gives no error if the command is unknown, and like xargs
       redirection requires wrapping with bash -c.
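
       To illustrate the redirection point (paths are hypothetical):
       xargs-style tools need an explicit shell for per-job redirection,
       whereas GNU parallel's command line is already run by a shell:

```shell
# Without the sh -c wrapper, '>' would redirect the whole xargs pipeline
# once instead of once per job.
dir=$(mktemp -d)
seq 3 | xargs -n1 -I{} sh -c "echo {} > $dir/out.{}"
ls "$dir"
rm -rf "$dir"
```

(Note that substituting {} into a shell command like this is unsafe with untrusted input; GNU parallel quotes arguments for you.)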

       jobflow makes it possible to set resource limits on the running jobs.
       This can be emulated by GNU parallel using bash's ulimit:

         jobflow -limits=mem=100M,cpu=3,fsize=20M,nofiles=300 myjob

         parallel 'ulimit -v 102400 -t 3 -f 204800 -n 300; myjob'

       EXAMPLES FROM jobflow README

       1 cat things.list | jobflow -threads=8 -exec ./mytask {}

       1 cat things.list | parallel -j8 ./mytask {}

       2 seq 100 | jobflow -threads=100 -exec echo {}

       2 seq 100 | parallel -j100 echo {}

       3 cat urls.txt | jobflow -threads=32 -exec wget {}

       3 cat urls.txt | parallel -j32 wget {}

       4 find . -name '*.bmp' | jobflow -threads=8 -exec bmp2jpeg {.}.bmp
       {.}.jpg

       4 find . -name '*.bmp' | parallel -j8 bmp2jpeg {.}.bmp {.}.jpg

       https://github.com/rofl0r/jobflow

   DIFFERENCES BETWEEN gargs AND GNU Parallel
       gargs can run multiple jobs in parallel.

       It caches output in memory. This causes it to be extremely slow when
       the output is larger than the physical RAM, and can cause the system
       to run out of memory.

       See more details on this in man parallel_design.

       Output to stderr (standard error) is changed if the command fails.

       Here are the two examples from gargs's website:

       1 seq 12 -1 1 | gargs -p 4 -n 3 "sleep {0}; echo {1} {2}"

       1 seq 12 -1 1 | parallel -P 4 -n 3 "sleep {1}; echo {2} {3}"

       2 cat t.txt | gargs --sep "\s+" -p 2 "echo '{0}:{1}-{2}' full-line:
       \'{}\'"

       2 cat t.txt | parallel --colsep "\\s+" -P 2 "echo '{1}:{2}-{3}' full-
       line: \'{}\'"

       https://github.com/brentp/gargs

   DIFFERENCES BETWEEN orgalorg AND GNU Parallel
       orgalorg can run the same job on multiple machines. This is related to
       --onall and --nonall.

       orgalorg supports entering the SSH password - provided it is the same
       for all servers. GNU parallel advocates using ssh-agent instead, but
       it is possible to emulate orgalorg's behavior by setting SSHPASS and
       by using --ssh "sshpass ssh".

       To make the emulation easier, make a simple alias:

         alias par_emul="parallel -j0 --ssh 'sshpass ssh' --nonall --tag --linebuffer"

       If you want to supply a password, run:

         SSHPASS=`ssh-askpass`

       or set the password directly:

         SSHPASS=P4$$w0rd!

       If the above is set up you can then do:

         orgalorg -o frontend1 -o frontend2 -p -C uptime
         par_emul -S frontend1 -S frontend2 uptime

         orgalorg -o frontend1 -o frontend2 -p -C top -bid 1
         par_emul -S frontend1 -S frontend2 top -bid 1

         orgalorg -o frontend1 -o frontend2 -p -er /tmp -n 'md5sum /tmp/bigfile' -S bigfile
         par_emul -S frontend1 -S frontend2 --basefile bigfile --workdir /tmp md5sum /tmp/bigfile

       orgalorg has a progress indicator for the transferring of a file. GNU
       parallel does not.

       https://github.com/reconquest/orgalorg
786
787   DIFFERENCES BETWEEN Rust parallel AND GNU Parallel
788       Rust parallel focuses on speed. It is almost as fast as xargs. It
789       implements a few features from GNU parallel, but lacks many functions.
790       All these fail:
791
792         # Show what would be executed
793         parallel --dry-run echo ::: a
794         # Read arguments from file
795         parallel -a file echo
796         # Changing the delimiter
797         parallel -d _ echo ::: a_b_c_
798
799       These do something different from GNU parallel
800
801         # Read more arguments at a time -n
802         parallel -n 2 echo ::: 1 a 2 b
803         # -q to protect quoted $ and space
804         parallel -q perl -e '$a=shift; print "$a"x10000000' ::: a b c
805         # Generation of combination of inputs
806         parallel echo {1} {2} ::: red green blue ::: S M L XL XXL
807         # {= perl expression =} replacement string
808         parallel echo '{= s/new/old/ =}' ::: my.new your.new
809         # --pipe
810         seq 100000 | parallel --pipe wc
811         # linked arguments
812         parallel echo ::: S M L :::+ small medium large ::: R G B :::+ red green blue
813         # Run different shell dialects
814         zsh -c 'parallel echo \={} ::: zsh && true'
815         csh -c 'parallel echo \$\{\} ::: shell && true'
816         bash -c 'parallel echo \$\({}\) ::: pwd && true'
817         # Rust parallel does not start before the last argument is read
818         (seq 10; sleep 5; echo 2) | time parallel -j2 'sleep 2; echo'
819         tail -f /var/log/syslog | parallel echo
820
821       Rust parallel has no remote facilities.
822
       It uses /tmp/parallel for tmp files and does not clean up if
       terminated abruptly. If another user on the system uses Rust
       parallel, then /tmp/parallel will have the wrong permissions and Rust
       parallel will fail. A malicious user can set up the right permissions
       and symlink the output file to one of the victim's files; the next
       time the victim runs Rust parallel, that file is overwritten.
829
830       If /tmp/parallel runs full during the run, Rust parallel does not
831       report this, but finishes with success - thereby risking data loss.
832
833       https://github.com/mmstick/parallel
834
835   DIFFERENCES BETWEEN Rush AND GNU Parallel
836       rush (https://github.com/shenwei356/rush) is written in Go and based on
837       gargs.
838
       Just like GNU parallel, rush buffers output in temporary files. But
       unlike GNU parallel, rush does not clean up if the process dies
       abnormally.
841
842       rush has some string manipulations that can be emulated by putting this
843       into ~/.parallel/config (/ is used instead of %, and % is used instead
844       of ^ as that is closer to bash's ${var%postfix}):
845
846         --rpl '{:} s:(\.[^/]+)*$::'
847         --rpl '{:%([^}]+?)} s:$$1(\.[^/]+)*$::'
848         --rpl '{/:%([^}]*?)} s:.*/(.*)$$1(\.[^/]+)*$:$1:'
849         --rpl '{/:} s:(.*/)?([^/.]+)(\.[^/]+)*$:$2:'
850         --rpl '{@(.*?)} /$$1/ and $_=$1;'
851
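       For quick experiments, such a definition can also be passed directly
       with --rpl on the command line instead of living in
       ~/.parallel/config. A minimal sketch using the {:} definition above:

```shell
# Define {:} inline: strip all extensions from the last path component.
parallel --rpl '{:} s:(\.[^/]+)*$::' echo '{:}' ::: dir/foo.tar.gz
# → dir/foo
```
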
852       Here are the examples from rush's website with the equivalent command
853       in GNU parallel.
854
855       EXAMPLES
856
857       1. Simple run, quoting is not necessary
858
859         $ seq 1 3 | rush echo {}
860
861         $ seq 1 3 | parallel echo {}
862
863       2. Read data from file (`-i`)
864
865         $ rush echo {} -i data1.txt -i data2.txt
866
867         $ cat data1.txt data2.txt | parallel echo {}
868
869       3. Keep output order (`-k`)
870
871         $ seq 1 3 | rush 'echo {}' -k
872
873         $ seq 1 3 | parallel -k echo {}
874
875       4. Timeout (`-t`)
876
877         $ time seq 1 | rush 'sleep 2; echo {}' -t 1
878
879         $ time seq 1 | parallel --timeout 1 'sleep 2; echo {}'
880
881       5. Retry (`-r`)
882
883         $ seq 1 | rush 'python unexisted_script.py' -r 1
884
885         $ seq 1 | parallel --retries 2 'python unexisted_script.py'
886
887       Use -u to see it is really run twice:
888
889         $ seq 1 | parallel -u --retries 2 'python unexisted_script.py'
890
891       6. Dirname (`{/}`) and basename (`{%}`) and remove custom suffix
892       (`{^suffix}`)
893
894         $ echo dir/file_1.txt.gz | rush 'echo {/} {%} {^_1.txt.gz}'
895
896         $ echo dir/file_1.txt.gz |
897             parallel --plus echo {//} {/} {%_1.txt.gz}
898
899       7. Get basename, and remove last (`{.}`) or any (`{:}`) extension
900
901         $ echo dir.d/file.txt.gz | rush 'echo {.} {:} {%.} {%:}'
902
903         $ echo dir.d/file.txt.gz | parallel 'echo {.} {:} {/.} {/:}'
904
905       8. Job ID, combine fields index and other replacement strings
906
907         $ echo 12 file.txt dir/s_1.fq.gz |
908             rush 'echo job {#}: {2} {2.} {3%:^_1}'
909
910         $ echo 12 file.txt dir/s_1.fq.gz |
911             parallel --colsep ' ' 'echo job {#}: {2} {2.} {3/:%_1}'
912
913       9. Capture submatch using regular expression (`{@regexp}`)
914
915         $ echo read_1.fq.gz | rush 'echo {@(.+)_\d}'
916
917         $ echo read_1.fq.gz | parallel 'echo {@(.+)_\d}'
918
919       10. Custom field delimiter (`-d`)
920
921         $ echo a=b=c | rush 'echo {1} {2} {3}' -d =
922
923         $ echo a=b=c | parallel -d = echo {1} {2} {3}
924
925       11. Send multi-lines to every command (`-n`)
926
927         $ seq 5 | rush -n 2 -k 'echo "{}"; echo'
928
929         $ seq 5 |
930             parallel -n 2 -k \
931               'echo {=-1 $_=join"\n",@arg[1..$#arg] =}; echo'
932
933         $ seq 5 | rush -n 2 -k 'echo "{}"; echo' -J ' '
934
935         $ seq 5 | parallel -n 2 -k 'echo {}; echo'
936
937       12. Custom record delimiter (`-D`), note that empty records are not
938       used.
939
940         $ echo a b c d | rush -D " " -k 'echo {}'
941
942         $ echo a b c d | parallel -d " " -k 'echo {}'
943
944         $ echo abcd | rush -D "" -k 'echo {}'
945
946         Cannot be done by GNU Parallel
947
948         $ cat fasta.fa
949         >seq1
950         tag
951         >seq2
952         cat
953         gat
954         >seq3
955         attac
956         a
957         cat
958
959         $ cat fasta.fa | rush -D ">" \
960             'echo FASTA record {#}: name: {1} sequence: {2}' -k -d "\n"
961         # rush fails to join the multiline sequences
962
963         $ cat fasta.fa | (read -n1 ignore_first_char;
964             parallel -d '>' --colsep '\n' echo FASTA record {#}: \
965               name: {1} sequence: '{=2 $_=join"",@arg[2..$#arg]=}'
966           )
967
968       13. Assign value to variable, like `awk -v` (`-v`)
969
970         $ seq 1 |
971             rush 'echo Hello, {fname} {lname}!' -v fname=Wei -v lname=Shen
972
973         $ seq 1 |
974             parallel -N0 \
975               'fname=Wei; lname=Shen; echo Hello, ${fname} ${lname}!'
976
977         $ for var in a b; do \
978         $   seq 1 3 | rush -k -v var=$var 'echo var: {var}, data: {}'; \
979         $ done
980
981       In GNU parallel you would typically do:
982
983         $ seq 1 3 | parallel -k echo var: {1}, data: {2} ::: a b :::: -
984
985       If you really want the var:
986
987         $ seq 1 3 |
988             parallel -k var={1} ';echo var: $var, data: {}' ::: a b :::: -
989
990       If you really want the for-loop:
991
992         $ for var in a b; do
993         >   export var;
994         >   seq 1 3 | parallel -k 'echo var: $var, data: {}';
995         > done
996
997       Contrary to rush this also works if the value is complex like:
998
999         My brother's 12" records
1000
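       This can be checked with a value containing both an apostrophe and a
       double quote (a sketch):

```shell
# The exported value survives the special characters untouched.
var="My brother's 12\" records"
export var
seq 1 2 | parallel -k 'echo var: "$var", data: {}'
```
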
1001       14. Preset variable (`-v`), avoid repeatedly writing verbose
1002       replacement strings
1003
1004         # naive way
1005         $ echo read_1.fq.gz | rush 'echo {:^_1} {:^_1}_2.fq.gz'
1006
1007         $ echo read_1.fq.gz | parallel 'echo {:%_1} {:%_1}_2.fq.gz'
1008
1009         # macro + removing suffix
1010         $ echo read_1.fq.gz |
1011             rush -v p='{:^_1}' 'echo {p} {p}_2.fq.gz'
1012
1013         $ echo read_1.fq.gz |
1014             parallel 'p={:%_1}; echo $p ${p}_2.fq.gz'
1015
1016         # macro + regular expression
1017         $ echo read_1.fq.gz | rush -v p='{@(.+?)_\d}' 'echo {p} {p}_2.fq.gz'
1018
1019         $ echo read_1.fq.gz | parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1020
1021       Contrary to rush GNU parallel works with complex values:
1022
1023         echo "My brother's 12\"read_1.fq.gz" |
1024           parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1025
1026       15. Interrupt jobs by `Ctrl-C`, rush will stop unfinished commands and
1027       exit.
1028
1029         $ seq 1 20 | rush 'sleep 1; echo {}'
1030         ^C
1031
1032         $ seq 1 20 | parallel 'sleep 1; echo {}'
1033         ^C
1034
1035       16. Continue/resume jobs (`-c`). When some jobs failed (by execution
1036       failure, timeout, or cancelling by user with `Ctrl + C`), please switch
1037       flag `-c/--continue` on and run again, so that `rush` can save
1038       successful commands and ignore them in NEXT run.
1039
1040         $ seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1041         $ cat successful_cmds.rush
1042         $ seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1043
1044         $ seq 1 3 | parallel --joblog mylog --timeout 2 \
1045             'sleep {}; echo {}'
1046         $ cat mylog
1047         $ seq 1 3 | parallel --joblog mylog --retry-failed \
1048             'sleep {}; echo {}'
1049
1050       Multi-line jobs:
1051
1052         $ seq 1 3 | rush 'sleep {}; echo {}; \
1053           echo finish {}' -t 3 -c -C finished.rush
1054         $ cat finished.rush
1055         $ seq 1 3 | rush 'sleep {}; echo {}; \
1056           echo finish {}' -t 3 -c -C finished.rush
1057
1058         $ seq 1 3 |
1059             parallel --joblog mylog --timeout 2 'sleep {}; echo {}; \
1060           echo finish {}'
1061         $ cat mylog
1062         $ seq 1 3 |
1063             parallel --joblog mylog --retry-failed 'sleep {}; echo {}; \
1064               echo finish {}'
1065
       17. A comprehensive example: downloading 1K+ pages given by three URL
       list files using `phantomjs save_page.js` (some page contents are
       dynamically generated by Javascript, so `wget` does not work). Here I
       set the max number of jobs (`-j`) to `20`; each job has a max running
       time (`-t`) of `60` seconds and `3` retry chances (`-r`). The
       continue flag `-c` is also switched on, so we can continue unfinished
       jobs. Luckily, it's accomplished in one run :)
1073
1074         $ for f in $(seq 2014 2016); do \
1075         $    /bin/rm -rf $f; mkdir -p $f; \
1076         $    cat $f.html.txt | rush -v d=$f -d = \
1077                'phantomjs save_page.js "{}" > {d}/{3}.html' \
1078                -j 20 -t 60 -r 3 -c; \
1079         $ done
1080
1081       GNU parallel can append to an existing joblog with '+':
1082
1083         $ rm mylog
1084         $ for f in $(seq 2014 2016); do
1085             /bin/rm -rf $f; mkdir -p $f;
1086             cat $f.html.txt |
1087               parallel -j20 --timeout 60 --retries 4 --joblog +mylog \
1088                 --colsep = \
1089                 phantomjs save_page.js {1}={2}={3} '>' $f/{3}.html
1090           done
1091
1092       18. A bioinformatics example: mapping with `bwa`, and processing result
1093       with `samtools`:
1094
1095         $ ref=ref/xxx.fa
1096         $ threads=25
1097         $ ls -d raw.cluster.clean.mapping/* \
1098           | rush -v ref=$ref -v j=$threads -v p='{}/{%}' \
1099               'bwa mem -t {j} -M -a {ref} {p}_1.fq.gz {p}_2.fq.gz > {p}.sam; \
1100               samtools view -bS {p}.sam > {p}.bam; \
1101               samtools sort -T {p}.tmp -@ {j} {p}.bam -o {p}.sorted.bam; \
1102               samtools index {p}.sorted.bam; \
1103               samtools flagstat {p}.sorted.bam > {p}.sorted.bam.flagstat; \
1104               /bin/rm {p}.bam {p}.sam;' \
1105               -j 2 --verbose -c -C mapping.rush
1106
1107       GNU parallel would use a function:
1108
1109         $ ref=ref/xxx.fa
1110         $ export ref
1111         $ thr=25
1112         $ export thr
1113         $ bwa_sam() {
1114             p="$1"
1115             bam="$p".bam
1116             sam="$p".sam
1117             sortbam="$p".sorted.bam
1118             bwa mem -t $thr -M -a $ref ${p}_1.fq.gz ${p}_2.fq.gz > "$sam"
1119             samtools view -bS "$sam" > "$bam"
1120             samtools sort -T ${p}.tmp -@ $thr "$bam" -o "$sortbam"
1121             samtools index "$sortbam"
1122             samtools flagstat "$sortbam" > "$sortbam".flagstat
1123             /bin/rm "$bam" "$sam"
1124           }
1125         $ export -f bwa_sam
1126         $ ls -d raw.cluster.clean.mapping/* |
1127             parallel -j 2 --verbose --joblog mylog bwa_sam
1128
1129       Other rush features
1130
1131       rush has:
1132
1133       ·   awk -v like custom defined variables (-v)
1134
           With GNU parallel you would simply set a shell variable:
1136
1137              parallel 'v={}; echo "$v"' ::: foo
1138              echo foo | rush -v v={} 'echo {v}'
1139
1140           Also rush does not like special chars. So these do not work:
1141
1142              echo does not work | rush -v v=\" 'echo {v}'
1143              echo "My  brother's  12\"  records" | rush -v v={} 'echo {v}'
1144
1145           Whereas the corresponding GNU parallel version works:
1146
1147              parallel 'v=\"; echo "$v"' ::: works
1148              parallel 'v={}; echo "$v"' ::: "My  brother's  12\"  records"
1149
1150       ·   Exit on first error(s) (-e)
1151
1152           This is called --halt now,fail=1 (or shorter: --halt 2) when used
1153           with GNU parallel.
1154
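           A minimal sketch: with -j1 the jobs run in input order, and no
           job after the first failing one is started.

```shell
# Job 3 exits non-zero, so jobs 4 and 5 are never started.
parallel -j1 --halt now,fail=1 'test {} -lt 3 && echo {}' ::: 1 2 3 4 5
# → prints 1 and 2, then stops
```
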
       ·   Settable number of records sent to every command (-n, default 1)
1156
1157           This is also called -n in GNU parallel.
1158
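           For example, -n 2 packs two records into each command line (a
           sketch):

```shell
# -k keeps the output in input order; each job receives 2 arguments.
seq 4 | parallel -k -n 2 echo
# → "1 2" then "3 4"
```
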
1159       ·   Practical replacement strings
1160
1161           {:} remove any extension
1162               With GNU parallel this can be emulated by:
1163
1164                 parallel --plus echo '{/\..*/}' ::: foo.ext.bar.gz
1165
1166           {^suffix}, remove suffix
1167               With GNU parallel this can be emulated by:
1168
1169                 parallel --plus echo '{%.bar.gz}' ::: foo.ext.bar.gz
1170
1171           {@regexp}, capture submatch using regular expression
1172               With GNU parallel this can be emulated by:
1173
1174                 parallel --rpl '{@(.*?)} /$$1/ and $_=$1;' \
1175                   echo '{@\d_(.*).gz}' ::: 1_foo.gz
1176
1177           {%.}, {%:}, basename without extension
1178               With GNU parallel this can be emulated by:
1179
1180                 parallel echo '{= s:.*/::;s/\..*// =}' ::: dir/foo.bar.gz
1181
1182               And if you need it often, you define a --rpl in
1183               $HOME/.parallel/config:
1184
1185                 --rpl '{%.} s:.*/::;s/\..*//'
1186                 --rpl '{%:} s:.*/::;s/\..*//'
1187
1188               Then you can use them as:
1189
1190                 parallel echo {%.} {%:} ::: dir/foo.bar.gz
1191
1192       ·   Preset variable (macro)
1193
1194           E.g.
1195
1196             echo foosuffix | rush -v p={^suffix} 'echo {p}_new_suffix'
1197
1198           With GNU parallel this can be emulated by:
1199
1200             echo foosuffix | parallel --plus 'p={%suffix}; echo ${p}_new_suffix'
1201
           Unlike rush, GNU parallel works fine if the input contains double
           space, ' and ":
1204
1205             echo "1'6\"  foosuffix" |
1206               parallel --plus 'p={%suffix}; echo "${p}"_new_suffix'
1207
1208       ·   Commands of multi-lines
1209
           While you can use multi-line commands in GNU parallel, GNU
           parallel discourages their use to improve readability. In most
           cases a multi-line command can be written as a function:
1213
1214             seq 1 3 | parallel --timeout 2 --joblog my.log 'sleep {}; echo {}; \
1215             echo finish {}'
1216
1217           Could be written as:
1218
1219             doit() {
1220               sleep "$1"
1221               echo "$1"
1222               echo finish "$1"
1223             }
1224             export -f doit
1225             seq 1 3 | parallel --timeout 2 --joblog my.log doit
1226
1227           The failed commands can be resumed with:
1228
1229             seq 1 3 |
1230               parallel --resume-failed --joblog my.log 'sleep {}; echo {};\
1231             echo finish {}'
1232
1233       https://github.com/shenwei356/rush
1234
1235   DIFFERENCES BETWEEN ClusterSSH AND GNU Parallel
1236       ClusterSSH solves a different problem than GNU parallel.
1237
1238       ClusterSSH opens a terminal window for each computer and using a master
1239       window you can run the same command on all the computers. This is
1240       typically used for administrating several computers that are almost
1241       identical.
1242
1243       GNU parallel runs the same (or different) commands with different
1244       arguments in parallel possibly using remote computers to help
1245       computing. If more than one computer is listed in -S GNU parallel may
1246       only use one of these (e.g. if there are 8 jobs to be run and one
1247       computer has 8 cores).
1248
1249       GNU parallel can be used as a poor-man's version of ClusterSSH:
1250
1251       parallel --nonall -S server-a,server-b do_stuff foo bar
1252
1253       https://github.com/duncs/clusterssh
1254
1255   DIFFERENCES BETWEEN coshell AND GNU Parallel
1256       coshell only accepts full commands on standard input. Any quoting needs
1257       to be done by the user.
1258
1259       Commands are run in sh so any bash/tcsh/zsh specific syntax will not
1260       work.
1261
       Output can be buffered by using -d. Output is buffered in memory, so
       big output can cause swapping and therefore be terribly slow, or even
       cause the system to run out of memory.
1265
1266       https://github.com/gdm85/coshell
1267
1268   DIFFERENCES BETWEEN spread AND GNU Parallel
1269       spread runs commands on all directories.
1270
1271       It can be emulated with GNU parallel using this Bash function:
1272
1273         spread() {
1274           _cmds() {
1275             perl -e '$"=" && ";print "@ARGV"' "cd {}" "$@"
1276           }
1277           parallel $(_cmds "$@")'|| echo exit status $?' ::: */
1278         }
1279
       This works except for the --exclude option.
1281
1282   DIFFERENCES BETWEEN pyargs AND GNU Parallel
1283       pyargs deals badly with input containing spaces. It buffers stdout, but
1284       not stderr. It buffers in RAM. {} does not work as replacement string.
1285       It does not support running functions.
1286
1287       pyargs does not support composed commands if run with --lines, and
1288       fails on pyargs traceroute gnu.org fsf.org.
1289
1290       Examples
1291
1292         seq 5 | pyargs -P50 -L seq
1293         seq 5 | parallel -P50 --lb seq
1294
1295         seq 5 | pyargs -P50 --mark -L seq
1296         seq 5 | parallel -P50 --lb \
1297           --tagstring OUTPUT'[{= $_=$job->replaced()=}]' seq
1298         # Similar, but not precisely the same
1299         seq 5 | parallel -P50 --lb --tag seq
1300
1301         seq 5 | pyargs -P50  --mark command
1302         # Somewhat longer with GNU Parallel due to the special
1303         #   --mark formatting
1304         cmd="$(echo "command" | parallel --shellquote)"
1305         wrap_cmd() {
1306            echo "MARK $cmd $@================================" >&3
1307            echo "OUTPUT START[$cmd $@]:"
1308            eval $cmd "$@"
1309            echo "OUTPUT END[$cmd $@]"
1310         }
1311         (seq 5 | env_parallel -P2 wrap_cmd) 3>&1
1312         # Similar, but not exactly the same
1313         seq 5 | parallel -t --tag command
1314
1315         (echo '1  2  3';echo 4 5 6) | pyargs  --stream seq
1316         (echo '1  2  3';echo 4 5 6) | perl -pe 's/\n/ /' |
1317           parallel -r -d' ' seq
1318         # Similar, but not exactly the same
1319         parallel seq ::: 1 2 3 4 5 6
1320
1321       https://github.com/robertblackwell/pyargs
1322
1323   DIFFERENCES BETWEEN concurrently AND GNU Parallel
1324       concurrently runs jobs in parallel.
1325
1326       The output is prepended with the job number, and may be incomplete:
1327
1328         $ concurrently 'seq 100000' | (sleep 3;wc -l)
1329         7165
1330
       When pretty printing, it caches output in memory. Test MIX below
       shows that output mixes whether or not output is cached.
1333
1334       There seems to be no way of making a template command and have
1335       concurrently fill that with different args. The full commands must be
1336       given on the command line.
1337
1338       There is also no way of controlling how many jobs should be run in
1339       parallel at a time - i.e. "number of jobslots". Instead all jobs are
1340       simply started in parallel.
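
       For comparison, GNU parallel controls the number of jobslots with
       -j. A sketch where at most 2 of the 6 jobs run at any one time:

```shell
# Only 2 sleeps run concurrently; the rest wait for a free jobslot.
seq 6 | parallel -j2 'sleep 0.1; echo {}'
```
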
1341
1342       https://github.com/kimmobrunfeldt/concurrently
1343
1344   DIFFERENCES BETWEEN map(soveran) AND GNU Parallel
1345       map does not run jobs in parallel by default. The README suggests
1346       using:
1347
1348         ... | map t 'sleep $t && say done &'
1349
1350       But this fails if more jobs are run in parallel than the number of
1351       available processes. Since there is no support for parallelization in
1352       map itself, the output also mixes:
1353
1354         seq 10 | map i 'echo start-$i && sleep 0.$i && echo end-$i &'
1355
       The major difference is that GNU parallel is built for
       parallelization and map is not. So GNU parallel has lots of ways of
       dealing with the issues that parallelization raises:
1359
1360       ·   Keep the number of processes manageable
1361
1362       ·   Make sure output does not mix
1363
1364       ·   Make Ctrl-C kill all running processes
1365
1366       Here are the 5 examples converted to GNU Parallel:
1367
1368         1$ ls *.c | map f 'foo $f'
1369         1$ ls *.c | parallel foo
1370
1371         2$ ls *.c | map f 'foo $f; bar $f'
1372         2$ ls *.c | parallel 'foo {}; bar {}'
1373
1374         3$ cat urls | map u 'curl -O $u'
1375         3$ cat urls | parallel curl -O
1376
1377         4$ printf "1\n1\n1\n" | map t 'sleep $t && say done'
1378         4$ printf "1\n1\n1\n" | parallel 'sleep {} && say done'
         4$ parallel 'sleep {} && say done' ::: 1 1 1
1380
1381         5$ printf "1\n1\n1\n" | map t 'sleep $t && say done &'
1382         5$ printf "1\n1\n1\n" | parallel -j0 'sleep {} && say done'
1383         5$ parallel -j0 'sleep {} && say done' ::: 1 1 1
1384
1385       https://github.com/soveran/map
1386
1387   Todo
1388       Url for map, spread
1389
1390       machma. Requires Go >= 1.7.
1391
1392       https://github.com/k-bx/par requires Haskell to work. This limits the
1393       number of platforms this can work on.
1394
1395       https://github.com/otonvm/Parallel
1396
1397       https://github.com/flesler/parallel
1398
1399       https://github.com/kou1okada/lesser-parallel
1400
1401       https://github.com/Julian/Verge
1402
1403       https://github.com/amattn/paral
1404
1405       pyargs
1406

TESTING OTHER TOOLS

       There are certain issues that are very common in parallelizing
       tools. Here are a few stress tests. Be warned: if the tool is badly
       coded it may overload your machine.
1411
1412   MIX: Output mixes
1413       Output from 2 jobs should not mix. If the output is not used, this does
1414       not matter; but if the output is used then it is important that you do
1415       not get half a line from one job followed by half a line from another
1416       job.
1417
1418       If the tool does not buffer, output will most likely mix now and then.
1419
1420       This test stresses whether output mixes.
1421
1422         #!/bin/bash
1423
1424         paralleltool="parallel -j0"
1425
1426         cat <<-EOF > mycommand
1427         #!/bin/bash
1428
1429         # If a, b, c, d, e, and f mix: Very bad
1430         perl -e 'print STDOUT "a"x3000_000," "'
1431         perl -e 'print STDERR "b"x3000_000," "'
1432         perl -e 'print STDOUT "c"x3000_000," "'
1433         perl -e 'print STDERR "d"x3000_000," "'
1434         perl -e 'print STDOUT "e"x3000_000," "'
1435         perl -e 'print STDERR "f"x3000_000," "'
1436         echo
1437         echo >&2
1438         EOF
1439         chmod +x mycommand
1440
1441         # Run 30 jobs in parallel
1442         seq 30 | $paralleltool ./mycommand > >(tr -s abcdef) 2> >(tr -s abcdef >&2)
1443
1444         # 'a c e' and 'b d f' should always stay together
1445         # and there should only be a single line per job
1446
1447   RAM: Output limited by RAM
       Some tools cache output in RAM. This makes them extremely slow if
       the output is bigger than physical memory and crash if the output is
       bigger than the virtual memory.
1451
1452         #!/bin/bash
1453
1454         paralleltool="parallel -j0"
1455
1456         cat <<'EOF' > mycommand
1457         #!/bin/bash
1458
1459         # Generate 1 GB output
1460         yes "`perl -e 'print \"c\"x30_000'`" | head -c 1G
1461         EOF
1462         chmod +x mycommand
1463
1464         # Run 20 jobs in parallel
1465         # Adjust 20 to be > physical RAM and < free space on /tmp
1466         seq 20 | time $paralleltool ./mycommand | wc -c
1467
1468   DISKFULL: Incomplete data if /tmp runs full
       If caching is done on disk, the disk can run full during the run.
       Not all programs discover this. GNU parallel discovers it if the
       disk stays full for at least 2 seconds.
1472
1473         #!/bin/bash
1474
1475         paralleltool="parallel -j0"
1476
1477         # This should be a dir with less than 100 GB free space
1478         smalldisk=/tmp/shm/parallel
1479
1480         TMPDIR="$smalldisk"
1481         export TMPDIR
1482
1483         max_output() {
1484             # Force worst case scenario:
1485             # Make GNU Parallel only check once per second
1486             sleep 10
1487             # Generate 100 GB to fill $TMPDIR
1488             # Adjust if /tmp is bigger than 100 GB
1489             yes | head -c 100G >$TMPDIR/$$
1490             # Generate 10 MB output that will not be buffered due to full disk
1491             perl -e 'print "X"x10_000_000' | head -c 10M
1492             echo This part is missing from incomplete output
1493             sleep 2
1494             rm $TMPDIR/$$
1495             echo Final output
1496         }
1497
1498         export -f max_output
1499         seq 10 | $paralleltool max_output | tr -s X
1500
1501   CLEANUP: Leaving tmp files at unexpected death
       Some tools do not clean up their tmp files if they are killed: a
       tool that buffers on disk may leave the tmp files behind when killed
       abruptly.
1504
1505         #!/bin/bash
1506
1507         paralleltool=parallel
1508
1509         ls /tmp >/tmp/before
1510         seq 10 | $paralleltool sleep &
1511         pid=$!
1512         # Give the tool time to start up
1513         sleep 1
1514         # Kill it without giving it a chance to cleanup
         kill -9 $pid
1516         # Should be empty: No files should be left behind
1517         diff <(ls /tmp) /tmp/before
1518
   SPCCHAR: Dealing badly with special file names
1520       It is not uncommon for users to create files like:
1521
1522         My brother's 12" *** record  (costs $$$).jpg
1523
1524       Some tools break on this.
1525
1526         #!/bin/bash
1527
1528         paralleltool=parallel
1529
1530         touch "My brother's 12\" *** record  (costs \$\$\$).jpg"
1531         ls My*jpg | $paralleltool ls -l
1532
1533   COMPOSED: Composed commands do not work
1534       Some tools require you to wrap composed commands into bash -c.
1535
1536         echo bar | $paralleltool echo foo';' echo {}
1537
1538   ONEREP: Only one replacement string allowed
1539       Some tools can only insert the argument once.
1540
1541         echo bar | $paralleltool echo {} foo {}
1542
1543   NUMWORDS: Speed depends on number of words
1544       Some tools become very slow if output lines have many words.
1545
1546         #!/bin/bash
1547
1548         paralleltool=parallel
1549
1550         cat <<-EOF > mycommand
1551         #!/bin/bash
1552
1553         # 10 MB of lines with 1000 words
1554         yes "`seq 1000`" | head -c 10M
1555         EOF
1556         chmod +x mycommand
1557
1558         # Run 30 jobs in parallel
1559         seq 30 | time $paralleltool -j0 ./mycommand > /dev/null
1560

AUTHOR

1562       When using GNU parallel for a publication please cite:
1563
1564       O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
1565       The USENIX Magazine, February 2011:42-47.
1566
1567       This helps funding further development; and it won't cost you a cent.
1568       If you pay 10000 EUR you should feel free to use GNU Parallel without
1569       citing.
1570
1571       Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk
1572
1573       Copyright (C) 2008,2009,2010 Ole Tange, http://ole.tange.dk
1574
1575       Copyright (C) 2010,2011,2012,2013,2014,2015,2016,2017,2018 Ole Tange,
1576       http://ole.tange.dk and Free Software Foundation, Inc.
1577
       Parts of the manual concerning xargs compatibility are inspired by
       the manual of xargs from GNU findutils 4.4.2.
1580

LICENSE

1582       Copyright (C)
1583       2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017,2018 Free
1584       Software Foundation, Inc.
1585
1586       This program is free software; you can redistribute it and/or modify it
1587       under the terms of the GNU General Public License as published by the
1588       Free Software Foundation; either version 3 of the License, or at your
1589       option any later version.
1590
1591       This program is distributed in the hope that it will be useful, but
1592       WITHOUT ANY WARRANTY; without even the implied warranty of
1593       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
1594       General Public License for more details.
1595
1596       You should have received a copy of the GNU General Public License along
1597       with this program.  If not, see <http://www.gnu.org/licenses/>.
1598
1599   Documentation license I
1600       Permission is granted to copy, distribute and/or modify this
1601       documentation under the terms of the GNU Free Documentation License,
1602       Version 1.3 or any later version published by the Free Software
1603       Foundation; with no Invariant Sections, with no Front-Cover Texts, and
1604       with no Back-Cover Texts.  A copy of the license is included in the
1605       file fdl.txt.
1606
1607   Documentation license II
1608       You are free:
1609
1610       to Share to copy, distribute and transmit the work
1611
1612       to Remix to adapt the work
1613
1614       Under the following conditions:
1615
1616       Attribution
1617                You must attribute the work in the manner specified by the
1618                author or licensor (but not in any way that suggests that they
1619                endorse you or your use of the work).
1620
1621       Share Alike
1622                If you alter, transform, or build upon this work, you may
1623                distribute the resulting work only under the same, similar or
1624                a compatible license.
1625
1626       With the understanding that:
1627
1628       Waiver   Any of the above conditions can be waived if you get
1629                permission from the copyright holder.
1630
1631       Public Domain
1632                Where the work or any of its elements is in the public domain
1633                under applicable law, that status is in no way affected by the
1634                license.
1635
1636       Other Rights
1637                In no way are any of the following rights affected by the
1638                license:
1639
1640                · Your fair dealing or fair use rights, or other applicable
1641                  copyright exceptions and limitations;
1642
1643                · The author's moral rights;
1644
1645                · Rights other persons may have either in the work itself or
1646                  in how the work is used, such as publicity or privacy
1647                  rights.
1648
1649       Notice   For any reuse or distribution, you must make clear to others
1650                the license terms of this work.
1651
1652       A copy of the full license is included in the file as cc-by-sa.txt.
1653

DEPENDENCIES

1655       GNU parallel uses Perl, and the Perl modules Getopt::Long, IPC::Open3,
1656       Symbol, IO::File, POSIX, and File::Temp. For remote usage it also uses
1657       rsync with ssh.
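
       All of these modules ship with a standard Perl installation, so
       their presence can be checked with a one-liner (a sketch):

```shell
# Load every required module; the message prints only if all of them load.
perl -MGetopt::Long -MIPC::Open3 -MSymbol -MIO::File -MPOSIX -MFile::Temp \
  -e 'print "all modules found\n"'
```
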
1658

SEE ALSO

1660       find(1), xargs(1), make(1), pexec(1), ppss(1), xjobs(1), prll(1),
1661       dxargs(1), mdm(1)
1662
1663
1664
20180322                          2018-03-21          PARALLEL_ALTERNATIVES(7)