PARALLEL_ALTERNATIVES(7)            parallel            PARALLEL_ALTERNATIVES(7)

NAME
   parallel_alternatives - Alternatives to GNU parallel

DESCRIPTION
   There are a lot of programs that share functionality with GNU parallel.
   Some of these are specialized tools, and while GNU parallel can emulate
   many of them, a specialized tool can be better at a given task. GNU
   parallel strives to include the best of the general functionality
   without sacrificing ease of use.

   parallel has existed since 2002-01-06 and as GNU parallel since 2010.
   Many of the alternatives have not had the vitality to survive that
   long, but have come and gone during that time.

   GNU parallel is actively maintained, with a new release every month
   since 2010. Most other alternatives are fleeting interests of their
   developers, with irregular releases, and are only maintained for a
   few years.

SUMMARY LEGEND
   The following features are found in some of the comparable tools:

  Inputs

     I1. Arguments can be read from stdin
     I2. Arguments can be read from a file
     I3. Arguments can be read from multiple files
     I4. Arguments can be read from the command line
     I5. Arguments can be read from a table
     I6. Arguments can be read from the same file using #! (shebang)
     I7. Line-oriented input as default (quoting of special chars not
         needed)

  Manipulation of input

     M1. Composed command
     M2. Multiple arguments can fill up an execution line
     M3. Arguments can be put anywhere in the execution line
     M4. Multiple arguments can be put anywhere in the execution line
     M5. Arguments can be replaced with context
     M6. Input can be treated as the complete command line

  Outputs

     O1. Grouping output so output from different jobs do not mix
     O2. Send stderr (standard error) to stderr (standard error)
     O3. Send stdout (standard output) to stdout (standard output)
     O4. Order of output can be same as order of input
     O5. Stdout only contains stdout (standard output) from the command
     O6. Stderr only contains stderr (standard error) from the command
     O7. Buffering on disk
     O8. No temporary files left if killed
     O9. Test if disk runs full during run
     O10. Output of a line bigger than 4 GB

  Execution

     E1. Run jobs in parallel
     E2. List running jobs
     E3. Finish running jobs, but do not start new jobs
     E4. Number of running jobs can depend on number of CPUs
     E5. Finish running jobs, but do not start new jobs after first
         failure
     E6. Number of running jobs can be adjusted while running
     E7. Only spawn new jobs if load is less than a limit

  Remote execution

     R1. Jobs can be run on remote computers
     R2. Basefiles can be transferred
     R3. Argument files can be transferred
     R4. Result files can be transferred
     R5. Cleanup of transferred files
     R6. No config files needed
     R7. Do not run more than SSHD's MaxStartups can handle
     R8. Configurable SSH command
     R9. Retry if connection breaks occasionally

  Semaphore

     S1. Possibility to work as a mutex
     S2. Possibility to work as a counting semaphore

  Legend

     -  = no
     x  = not applicable
     ID = yes

   As not every new version of the programs is tested, the table may be
   outdated. Please file a bug report if you find errors (see REPORTING
   BUGS).

   parallel:

     I1 I2 I3 I4 I5 I6 I7
     M1 M2 M3 M4 M5 M6
     O1 O2 O3 O4 O5 O6 O7 O8 O9 O10
     E1 E2 E3 E4 E5 E6 E7
     R1 R2 R3 R4 R5 R6 R7 R8 R9
     S1 S2

DIFFERENCES BETWEEN xargs AND GNU Parallel
   Summary (see legend above):

     I1 I2 - - - - -
     - M2 M3 - - -
     - O2 O3 - O5 O6
     E1 - - - - - -
     - - - - - x - - -
     - -

   xargs offers some of the same possibilities as GNU parallel.

   xargs deals badly with special characters (such as space, \, ' and ").
   To see the problem, try this:

     touch important_file
     touch 'not important_file'
     ls not* | xargs rm
     mkdir -p "My brother's 12\" records"
     ls | xargs rmdir
     touch 'c:\windows\system32\clfs.sys'
     echo 'c:\windows\system32\clfs.sys' | xargs ls -l

   You can specify -0, but many input generators are not optimized for
   using NUL as separator - they are optimized for newline as separator,
   e.g. awk, ls, echo, tar -v, head (requires using -z), tail (requires
   using -z), sed (requires using -z), perl (-0 and \0 instead of \n),
   locate (requires using -0), find (requires using -print0), grep
   (requires using -z or -Z), and sort (requires using -z).
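
   The effect of pairing a NUL producer with a NUL consumer can be
   sketched in plain shell. This is an illustration of the problem
   described above, not an example from the xargs manual; the directory
   and filenames are made up:

```shell
#!/bin/sh
# Sketch: why NUL separation matters. A filename may legally contain a
# newline, so a newline-separated stream miscounts arguments, while a
# NUL-separated stream does not.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch "$(printf 'bad\nname')" normal_file

# Newline-separated: xargs sees 3 arguments (the bad name splits in two).
ls | xargs sh -c 'echo "$#"' sh

# NUL-separated: find -print0 paired with xargs -0 sees the correct 2.
find . -type f -print0 | xargs -0 sh -c 'echo "$#"' sh
```

   The `sh -c 'echo "$#"' sh` trick only counts the arguments xargs
   actually passed, which makes the miscounting visible.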

   GNU parallel's newline separation can be emulated with:

     cat | xargs -d "\n" -n1 command

   xargs can run a given number of jobs in parallel, but has no support
   for running number-of-cpu-cores jobs in parallel.
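
   GNU parallel sizes its job slots from the CPU count (the default is
   one job per core, and relative values such as -j 50% scale with the
   cores). With xargs the nearest equivalent is to compute the number
   yourself; this sketch assumes nproc from GNU coreutils is available:

```shell
#!/bin/sh
# Sketch: emulate "one job per CPU core" with xargs by computing the
# core count by hand. GNU parallel does this by default.
cores=$(nproc)
seq 8 | xargs -P "$cores" -n1 echo
```

   Note that once the jobs run in parallel the order of the 8 output
   lines is no longer guaranteed.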

   xargs has no support for grouping the output, therefore output may
   run together; e.g. the first half of a line is from one process and
   the last half of the line is from another process. The example
   Parallel grep cannot be done reliably with xargs because of this. To
   see this in action try:

     parallel perl -e "'"'$a="1"."{}"x10000000;print $a,"\n"'"'" \
       '>' {} ::: a b c d e f g h
     # Serial = no mixing = the wanted result
     # 'tr -s a-z' squeezes repeating letters into a single letter
     echo a b c d e f g h | xargs -P1 -n1 grep 1 | tr -s a-z
     # Compare to 8 jobs in parallel
     parallel -kP8 -n1 grep 1 ::: a b c d e f g h | tr -s a-z
     echo a b c d e f g h | xargs -P8 -n1 grep 1 | tr -s a-z
     echo a b c d e f g h | xargs -P8 -n1 grep --line-buffered 1 | \
       tr -s a-z

   Or try this:

     slow_seq() {
       echo Count to "$@"
       seq "$@" |
         perl -ne '$|=1; for(split//){ print; select($a,$a,$a,0.100);}'
     }
     export -f slow_seq
     # Serial = no mixing = the wanted result
     seq 8 | xargs -n1 -P1 -I {} bash -c 'slow_seq {}'
     # Compare to 8 jobs in parallel
     seq 8 | parallel -P8 slow_seq {}
     seq 8 | xargs -n1 -P8 -I {} bash -c 'slow_seq {}'

   xargs has no support for keeping the order of the output, so when
   running jobs in parallel with xargs, the output of the second job
   cannot be postponed until the first job is done.

   xargs has no support for running jobs on remote computers.

   xargs has no support for context replace, so you will have to create
   the arguments yourself.
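
   What "creating the arguments yourself" amounts to can be sketched
   like this. With GNU parallel, context replace (-X) wraps every
   argument in the surrounding text of the command line; with xargs the
   context has to be baked into each argument first, e.g. with sed. The
   pre-/.post strings are made-up placeholders:

```shell
#!/bin/sh
# Sketch: emulating GNU parallel's context replace (-X) with xargs.
#   parallel -X echo 'pre-{}.post' ::: 1 2 3
# would run: echo pre-1.post pre-2.post pre-3.post
# With xargs, sed must add the context before xargs fills the line:
printf '1\n2\n3\n' | sed 's/.*/pre-&.post/' | xargs echo
```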

   If you use a replace string in xargs (-I) you cannot force xargs to
   use more than one argument per command line.
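
   This limitation is easy to demonstrate: with -I, xargs switches to
   one input line per command invocation, so three inputs always mean
   three separate runs:

```shell
#!/bin/sh
# Sketch: xargs -I implies one argument (one input line) per command,
# so three inputs produce three separate echo invocations:
seq 3 | xargs -I {} echo "got {}"

# Count the invocations: always 3, -I cannot be made to batch them.
seq 3 | xargs -I {} echo "got {}" | wc -l   # -> 3
```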

   Quoting in xargs works like -q in GNU parallel. This means composed
   commands and redirection require using bash -c.

     ls | parallel "wc {} >{}.wc"
     ls | parallel "echo {}; ls {}|wc"

   becomes (assuming you have 8 cores and that none of the filenames
   contain space, " or '):

     ls | xargs -d "\n" -P8 -I {} bash -c "wc {} >{}.wc"
     ls | xargs -d "\n" -P8 -I {} bash -c "echo {}; ls {}|wc"

   A more extreme example can be found on:
   https://unix.stackexchange.com/q/405552/

   https://www.gnu.org/software/findutils/

DIFFERENCES BETWEEN find -exec AND GNU Parallel
   Summary (see legend above):

     - - - x - x -
     - M2 M3 - - - -
     - O2 O3 O4 O5 O6
     - - - - - - -
     - - - - - - - - -
     x x

   find -exec offers some of the same possibilities as GNU parallel.

   find -exec only works on files. Processing other input (such as hosts
   or URLs) will require creating these inputs as files. find -exec has
   no support for running commands in parallel.

   https://www.gnu.org/software/findutils/ (Last checked: 2019-01)

DIFFERENCES BETWEEN make -j AND GNU Parallel
   Summary (see legend above):

     - - - - - - -
     - - - - - -
     O1 O2 O3 - x O6
     E1 - - - E5 -
     - - - - - - - - -
     - -

   make -j can run jobs in parallel, but requires a crafted Makefile to
   do this. That results in extra quoting to get filenames containing
   newlines to work correctly.

   make -j computes a dependency graph before running jobs. Jobs run by
   GNU parallel do not depend on each other.

   (Very early versions of GNU parallel were coincidentally implemented
   using make -j.)

   https://www.gnu.org/software/make/ (Last checked: 2019-01)

DIFFERENCES BETWEEN ppss AND GNU Parallel
   Summary (see legend above):

     I1 I2 - - - - I7
     M1 - M3 - - M6
     O1 - - x - -
     E1 E2 ?E3 E4 - - -
     R1 R2 R3 R4 - - ?R7 ? ?
     - -

   ppss is also a tool for running jobs in parallel.

   The output of ppss is status information and thus not useful as input
   for another command. The output from the jobs is put into files.

   The argument replace string ($ITEM) cannot be changed. Arguments must
   be quoted - thus arguments containing special characters (space '"&!*)
   may cause problems. More than one argument is not supported. Filenames
   containing newlines are not processed correctly. When reading input
   from a file, NUL cannot be used as a terminator. ppss needs to read
   the whole input file before starting any jobs.

   Output and status information is stored in ppss_dir and thus requires
   cleanup when completed. If the dir is not removed before running ppss
   again, it may cause nothing to happen, as ppss thinks the task is
   already done. GNU parallel will normally not need cleaning up if
   running locally, and will only need cleaning up if stopped abnormally
   while running remotely (--cleanup may not complete if stopped
   abnormally). The example Parallel grep would require extra
   postprocessing if written using ppss.

   For remote systems PPSS requires 3 steps: config, deploy, and start.
   GNU parallel only requires one step.

  EXAMPLES FROM ppss MANUAL

   Here are the examples from ppss's manual page with the equivalent
   using GNU parallel:

     1$ ./ppss.sh standalone -d /path/to/files -c 'gzip '

     1$ find /path/to/files -type f | parallel gzip

     2$ ./ppss.sh standalone -d /path/to/files \
          -c 'cp "$ITEM" /destination/dir '

     2$ find /path/to/files -type f | parallel cp {} /destination/dir

     3$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q '

     3$ parallel -a list-of-urls.txt wget -q

     4$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q "$ITEM"'

     4$ parallel -a list-of-urls.txt wget -q {}

     5$ ./ppss config -C config.cfg -c 'encode.sh ' -d /source/dir \
          -m 192.168.1.100 -u ppss -k ppss-key.key -S ./encode.sh \
          -n nodes.txt -o /some/output/dir --upload --download;
        ./ppss deploy -C config.cfg
        ./ppss start -C config

     5$ # parallel does not use configs. If you want
        # a different username put it in nodes.txt: user@hostname
        find source/dir -type f |
          parallel --sshloginfile nodes.txt --trc {.}.mp3 \
            lame -a {} -o {.}.mp3 --preset standard --quiet

     6$ ./ppss stop -C config.cfg

     6$ killall -TERM parallel

     7$ ./ppss pause -C config.cfg

     7$ Press: CTRL-Z or killall -SIGTSTP parallel

     8$ ./ppss continue -C config.cfg

     8$ Enter: fg or killall -SIGCONT parallel

     9$ ./ppss.sh status -C config.cfg

     9$ killall -SIGUSR2 parallel

   https://github.com/louwrentius/PPSS (Last checked: 2010-12)

DIFFERENCES BETWEEN pexec AND GNU Parallel
   Summary (see legend above):

     I1 I2 - I4 I5 - -
     M1 - M3 - - M6
     O1 O2 O3 - O5 O6
     E1 - - E4 - E6 -
     R1 - - - - R6 - - -
     S1 -

   pexec is also a tool for running jobs in parallel.

  EXAMPLES FROM pexec MANUAL

   Here are the examples from pexec's info page with the equivalent
   using GNU parallel:

     1$ pexec -o sqrt-%s.dat -p "$(seq 10)" -e NUM -n 4 -c -- \
          'echo "scale=10000;sqrt($NUM)" | bc'

     1$ seq 10 | parallel -j4 'echo "scale=10000;sqrt({})" | \
          bc > sqrt-{}.dat'

     2$ pexec -p "$(ls myfiles*.ext)" -i %s -o %s.sort -- sort

     2$ ls myfiles*.ext | parallel sort {} ">{}.sort"

     3$ pexec -f image.list -n auto -e B -u star.log -c -- \
          'fistar $B.fits -f 100 -F id,x,y,flux -o $B.star'

     3$ parallel -a image.list \
          'fistar {}.fits -f 100 -F id,x,y,flux -o {}.star' 2>star.log

     4$ pexec -r *.png -e IMG -c -o - -- \
          'convert $IMG ${IMG%.png}.jpeg ; "echo $IMG: done"'

     4$ ls *.png | parallel 'convert {} {.}.jpeg; echo {}: done'

     5$ pexec -r *.png -i %s -o %s.jpg -c 'pngtopnm | pnmtojpeg'

     5$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {}.jpg'

     6$ for p in *.png ; do echo ${p%.png} ; done | \
          pexec -f - -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'

     6$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'

     7$ LIST=$(for p in *.png ; do echo ${p%.png} ; done)
        pexec -r $LIST -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'

     7$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'

     8$ pexec -n 8 -r *.jpg -y unix -e IMG -c \
          'pexec -j -m blockread -d $IMG | \
           jpegtopnm | pnmscale 0.5 | pnmtojpeg | \
           pexec -j -m blockwrite -s th_$IMG'

     8$ # Combining GNU parallel and GNU sem.
        ls *jpg | parallel -j8 'sem --id blockread cat {} | jpegtopnm |' \
          'pnmscale 0.5 | pnmtojpeg | sem --id blockwrite cat > th_{}'

        # If reading and writing is done to the same disk, this may be
        # faster as only one process will be either reading or writing:
        ls *jpg | parallel -j8 'sem --id diskio cat {} | jpegtopnm |' \
          'pnmscale 0.5 | pnmtojpeg | sem --id diskio cat > th_{}'

   https://www.gnu.org/software/pexec/ (Last checked: 2010-12)

DIFFERENCES BETWEEN xjobs AND GNU Parallel
   xjobs is also a tool for running jobs in parallel. It only supports
   running jobs on your local computer.

   xjobs deals badly with special characters, just like xargs. See the
   section DIFFERENCES BETWEEN xargs AND GNU Parallel.

  EXAMPLES FROM xjobs MANUAL

   Here are the examples from xjobs's man page with the equivalent using
   GNU parallel:

     1$ ls -1 *.zip | xjobs unzip

     1$ ls *.zip | parallel unzip

     2$ ls -1 *.zip | xjobs -n unzip

     2$ ls *.zip | parallel unzip >/dev/null

     3$ find . -name '*.bak' | xjobs gzip

     3$ find . -name '*.bak' | parallel gzip

     4$ ls -1 *.jar | sed 's/\(.*\)/\1 > \1.idx/' | xjobs jar tf

     4$ ls *.jar | parallel jar tf {} '>' {}.idx

     5$ xjobs -s script

     5$ cat script | parallel

     6$ mkfifo /var/run/my_named_pipe;
        xjobs -s /var/run/my_named_pipe &
        echo unzip 1.zip >> /var/run/my_named_pipe;
        echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe

     6$ mkfifo /var/run/my_named_pipe;
        cat /var/run/my_named_pipe | parallel &
        echo unzip 1.zip >> /var/run/my_named_pipe;
        echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe

   https://www.maier-komor.de/xjobs.html (Last checked: 2019-01)

DIFFERENCES BETWEEN prll AND GNU Parallel
   prll is also a tool for running jobs in parallel. It does not support
   running jobs on remote computers.

   prll encourages using BASH aliases and BASH functions instead of
   scripts. GNU parallel supports scripts directly, functions if they
   are exported using export -f, and aliases if using env_parallel.
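
   The export -f mechanism is bash-specific and can be sketched without
   GNU parallel at all: exporting a function makes it visible to child
   bash processes, which is what parallel relies on when it runs an
   exported function in each job. The function name greet is made up:

```shell
#!/bin/bash
# Sketch: export -f makes a bash function visible to child bash
# processes - the mechanism behind running exported shell functions
# in GNU parallel.
greet() { echo "hello $1"; }
export -f greet

# A fresh bash child can call the function, just as a parallel job would:
bash -c 'greet world'
```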

   prll generates a lot of status information on stderr (standard
   error), which makes it harder to use the stderr (standard error)
   output of the job directly as input for another program.

  EXAMPLES FROM prll's MANUAL

   Here is the example from prll's man page with the equivalent using
   GNU parallel:

     1$ prll -s 'mogrify -flip $1' *.jpg

     1$ parallel mogrify -flip ::: *.jpg

   https://github.com/exzombie/prll (Last checked: 2019-01)

DIFFERENCES BETWEEN dxargs AND GNU Parallel
   dxargs is also a tool for running jobs in parallel.

   dxargs does not deal well with more simultaneous jobs than SSHD's
   MaxStartups. dxargs is only built for running jobs remotely, but does
   not support transferring files.

   https://web.archive.org/web/20120518070250/http://www.semicomplete.com/blog/geekery/distributed-xargs.html
   (Last checked: 2019-01)

DIFFERENCES BETWEEN mdm/middleman AND GNU Parallel
   middleman (mdm) is also a tool for running jobs in parallel.

  EXAMPLES FROM middleman's WEBSITE

   Here are the shell scripts of
   https://web.archive.org/web/20110728064735/http://mdm.berlios.de/usage.html
   ported to GNU parallel:

     1$ seq 19 | parallel buffon -o - | sort -n > result
        cat files | parallel cmd
        find dir -execdir sem cmd {} \;

   https://github.com/cklin/mdm (Last checked: 2019-01)

DIFFERENCES BETWEEN xapply AND GNU Parallel
   xapply can run jobs in parallel on the local computer.

  EXAMPLES FROM xapply's MANUAL

   Here are the examples from xapply's man page with the equivalent
   using GNU parallel:

     1$ xapply '(cd %1 && make all)' */

     1$ parallel 'cd {} && make all' ::: */

     2$ xapply -f 'diff %1 ../version5/%1' manifest | more

     2$ parallel diff {} ../version5/{} < manifest | more

     3$ xapply -p/dev/null -f 'diff %1 %2' manifest1 checklist1

     3$ parallel --link diff {1} {2} :::: manifest1 checklist1

     4$ xapply 'indent' *.c

     4$ parallel indent ::: *.c

     5$ find ~ksb/bin -type f ! -perm -111 -print | \
          xapply -f -v 'chmod a+x' -

     5$ find ~ksb/bin -type f ! -perm -111 -print | \
          parallel -v chmod a+x

     6$ find */ -... | fmt 960 1024 | xapply -f -i /dev/tty 'vi' -

     6$ sh <(find */ -... | parallel -s 1024 echo vi)

     6$ find */ -... | parallel -s 1024 -Xuj1 vi

     7$ find ... | xapply -f -5 -i /dev/tty 'vi' - - - - -

     7$ sh <(find ... | parallel -n5 echo vi)

     7$ find ... | parallel -n5 -uj1 vi

     8$ xapply -fn "" /etc/passwd

     8$ parallel -k echo < /etc/passwd

     9$ tr ':' '\012' < /etc/passwd | \
          xapply -7 -nf 'chown %1 %6' - - - - - - -

     9$ tr ':' '\012' < /etc/passwd | parallel -N7 chown {1} {6}

     10$ xapply '[ -d %1/RCS ] || echo %1' */

     10$ parallel '[ -d {}/RCS ] || echo {}' ::: */

     11$ xapply -f '[ -f %1 ] && echo %1' List | ...

     11$ parallel '[ -f {} ] && echo {}' < List | ...

   https://www.databits.net/~ksb/msrc/local/bin/xapply/xapply.html
   (Last checked: 2010-12)

DIFFERENCES BETWEEN AIX apply AND GNU Parallel
   apply can build command lines based on a template and arguments -
   very much like GNU parallel. apply does not run jobs in parallel.
   apply does not use an argument separator (like :::); instead the
   template must be the first argument.

  EXAMPLES FROM IBM's KNOWLEDGE CENTER

   Here are the examples from IBM's Knowledge Center and the
   corresponding command using GNU parallel:

   To obtain results similar to those of the ls command, enter:

     1$ apply echo *
     1$ parallel echo ::: *

   To compare the file named a1 to the file named b1, and the file named
   a2 to the file named b2, enter:

     2$ apply -2 cmp a1 b1 a2 b2
     2$ parallel -N2 cmp ::: a1 b1 a2 b2

   To run the who command five times, enter:

     3$ apply -0 who 1 2 3 4 5
     3$ parallel -N0 who ::: 1 2 3 4 5

   To link all files in the current directory to the directory /usr/joe,
   enter:

     4$ apply 'ln %1 /usr/joe' *
     4$ parallel ln {} /usr/joe ::: *

   https://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.cmds1/apply.htm
   (Last checked: 2019-01)

DIFFERENCES BETWEEN paexec AND GNU Parallel
   paexec can run jobs in parallel on both the local and remote
   computers.

   paexec requires commands to print a blank line as the last output.
   This means you will have to write a wrapper for most programs.
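
   Such a wrapper can be sketched as a small shell function that runs
   the real command and then appends the blank line; the name
   paexec_wrap is invented for illustration and is not part of paexec:

```shell
#!/bin/sh
# Sketch of a wrapper for paexec's protocol: run the real command,
# then print the blank line that marks end-of-output.
# 'paexec_wrap' is a made-up name, not part of paexec.
paexec_wrap() {
  "$@"
  echo
}

paexec_wrap echo hi   # prints "hi" followed by an empty line
```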

   paexec has a job dependency facility, so a job can depend on another
   job being executed successfully. Sort of a poor man's make.

  EXAMPLES FROM paexec's EXAMPLE CATALOG

   Here are the examples from paexec's example catalog with the
   equivalent using GNU parallel:

   1_div_X_run

     1$ ../../paexec -s -l -c "`pwd`/1_div_X_cmd" -n +1 <<EOF [...]

     1$ parallel echo {} '|' `pwd`/1_div_X_cmd <<EOF [...]

   all_substr_run

     2$ ../../paexec -lp -c "`pwd`/all_substr_cmd" -n +3 <<EOF [...]

     2$ parallel echo {} '|' `pwd`/all_substr_cmd <<EOF [...]

   cc_wrapper_run

     3$ ../../paexec -c "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
          -n 'host1 host2' \
          -t '/usr/bin/ssh -x' <<EOF [...]

     3$ parallel echo {} '|' "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
          -S host1,host2 <<EOF [...]

        # This is not exactly the same, but avoids the wrapper
        parallel gcc -O2 -c -o {.}.o {} \
          -S host1,host2 <<EOF [...]

   toupper_run

     4$ ../../paexec -lp -c "`pwd`/toupper_cmd" -n +10 <<EOF [...]

     4$ parallel echo {} '|' ./toupper_cmd <<EOF [...]

        # Without the wrapper:
        parallel echo {} '| awk {print\ toupper\(\$0\)}' <<EOF [...]

   https://github.com/cheusov/paexec (Last checked: 2010-12)

DIFFERENCES BETWEEN map(sitaramc) AND GNU Parallel
   Summary (see legend above):

     I1 - - I4 - - (I7)
     M1 (M2) M3 (M4) M5 M6
     - O2 O3 - O5 - - N/A N/A O10
     E1 - - - - - -
     - - - - - - - - -
     - -

   (I7): Only under special circumstances. See below.

   (M2+M4): Only if there is a single replacement string.

   map rejects input with special characters:

     echo "The Cure" > My\ brother\'s\ 12\"\ records

     ls | map 'echo %; wc %'

   It works with GNU parallel:

     ls | parallel 'echo {}; wc {}'

   Under some circumstances it also works with map:

     ls | map 'echo % works %'

   But tiny changes make it reject the input with special characters:

     ls | map 'echo % does not work "%"'

   This means that many UTF-8 characters will be rejected. This is by
   design. From the web page: "As such, programs that quietly handle
   them, with no warnings at all, are doing their users a disservice."

   map delays each job by 0.01 s. This can be emulated by using parallel
   --delay 0.01.

   map prints '+' on stderr when a job starts, and '-' when a job
   finishes. This cannot be disabled. parallel has --bar if you need to
   see progress.

   map's replacement strings (% %D %B %E) can be simulated in GNU
   parallel by putting this in ~/.parallel/config:

     --rpl '%'
     --rpl '%D $_=Q(::dirname($_));'
     --rpl '%B s:.*/::;s:\.[^/.]+$::;'
     --rpl '%E s:.*\.::'

   map does not have an argument separator on the command line, but uses
   the first argument as the command. This makes quoting harder, which
   in turn may affect readability. Compare:

     map -p 2 'perl -ne '"'"'/^\S+\s+\S+$/ and print $ARGV,"\n"'"'" *

     parallel -q perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' ::: *

   map can do multiple arguments with context replace, but not without
   context replace:

     parallel --xargs echo 'BEGIN{'{}'}END' ::: 1 2 3

     map "echo 'BEGIN{'%'}END'" 1 2 3

   map has no support for grouping. So this gives the wrong results:

     parallel perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} \
       ::: a b c d e f
     ls -l a b c d e f
     parallel -kP4 -n1 grep 1 ::: a b c d e f > out.par
     map -n1 -p 4 'grep 1' a b c d e f > out.map-unbuf
     map -n1 -p 4 'grep --line-buffered 1' a b c d e f > out.map-linebuf
     map -n1 -p 1 'grep --line-buffered 1' a b c d e f > out.map-serial
     ls -l out*
     md5sum out*

  EXAMPLES FROM map's WEBSITE

   Here are the examples from map's web page with the equivalent using
   GNU parallel:

     1$ ls *.gif | map convert % %B.png   # default max-args: 1

     1$ ls *.gif | parallel convert {} {.}.png

     2$ map "mkdir %B; tar -C %B -xf %" *.tgz   # default max-args: 1

     2$ parallel 'mkdir {.}; tar -C {.} -xf {}' ::: *.tgz

     3$ ls *.gif | map cp % /tmp   # default max-args: 100

     3$ ls *.gif | parallel -X cp {} /tmp

     4$ ls *.tar | map -n 1 tar -xf %

     4$ ls *.tar | parallel tar -xf

     5$ map "cp % /tmp" *.tgz

     5$ parallel cp {} /tmp ::: *.tgz

     6$ map "du -sm /home/%/mail" alice bob carol

     6$ parallel "du -sm /home/{}/mail" ::: alice bob carol

        or if you prefer running a single job with multiple args:

     6$ parallel -Xj1 "du -sm /home/{}/mail" ::: alice bob carol

     7$ cat /etc/passwd | map -d: 'echo user %1 has shell %7'

     7$ cat /etc/passwd | parallel --colsep : 'echo user {1} has shell {7}'

     8$ export MAP_MAX_PROCS=$(( `nproc` / 2 ))

     8$ export PARALLEL=-j50%

   https://github.com/sitaramc/map (Last checked: 2020-05)

DIFFERENCES BETWEEN ladon AND GNU Parallel
   ladon can run multiple jobs on files in parallel.

   ladon only works on files, and the only way to specify files is using
   a quoted glob string (such as \*.jpg). It is not possible to list the
   files manually.

   As replacement strings it uses FULLPATH DIRNAME BASENAME EXT RELDIR
   RELPATH.

   These can be simulated using GNU parallel by putting this in
   ~/.parallel/config:

     --rpl 'FULLPATH $_=Q($_);chomp($_=qx{readlink -f $_});'
     --rpl 'DIRNAME $_=Q(::dirname($_));chomp($_=qx{readlink -f $_});'
     --rpl 'BASENAME s:.*/::;s:\.[^/.]+$::;'
     --rpl 'EXT s:.*\.::'
     --rpl 'RELDIR $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
            s:\Q$c/\E::;$_=::dirname($_);'
     --rpl 'RELPATH $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
            s:\Q$c/\E::;'

   ladon deals badly with filenames containing " and newline, and it
   fails for output larger than 200k:

     ladon '*' -- seq 36000 | wc

  EXAMPLES FROM ladon MANUAL

   It is assumed that the '--rpl's above are put in ~/.parallel/config
   and that it is run under a shell that supports '**' globbing (such as
   zsh):

     1$ ladon "**/*.txt" -- echo RELPATH

     1$ parallel echo RELPATH ::: **/*.txt

     2$ ladon "~/Documents/**/*.pdf" -- shasum FULLPATH >hashes.txt

     2$ parallel shasum FULLPATH ::: ~/Documents/**/*.pdf >hashes.txt

     3$ ladon -m thumbs/RELDIR "**/*.jpg" -- convert FULLPATH \
          -thumbnail 100x100^ -gravity center -extent 100x100 \
          thumbs/RELPATH

     3$ parallel mkdir -p thumbs/RELDIR\; convert FULLPATH \
          -thumbnail 100x100^ -gravity center -extent 100x100 \
          thumbs/RELPATH ::: **/*.jpg

     4$ ladon "~/Music/*.wav" -- lame -V 2 FULLPATH DIRNAME/BASENAME.mp3

     4$ parallel lame -V 2 FULLPATH DIRNAME/BASENAME.mp3 ::: ~/Music/*.wav

   https://github.com/danielgtaylor/ladon (Last checked: 2019-01)

DIFFERENCES BETWEEN jobflow AND GNU Parallel
   Summary (see legend above):

     I1 - - - - - I7
     - - M3 - - (M6)
     O1 O2 O3 - O5 O6 (O7) - - O10
     E1 - - - - E6 -
     - - - - - - - - -
     - -

   jobflow can run multiple jobs in parallel.

   Just like with xargs, output from jobflow jobs running in parallel
   mixes together by default. jobflow can buffer into files with
   -buffered (placed in /run/shm), but these are not cleaned up if
   jobflow dies unexpectedly (e.g. by Ctrl-C). If the total output is
   big (on the order of RAM+swap) it can cause the system to slow to a
   crawl and eventually run out of memory.

   Just like with xargs, redirection and composed commands require
   wrapping with bash -c.

   Input lines can be at most 4096 bytes.

   jobflow is faster than GNU parallel but around 6 times slower than
   parallel-bash.

   jobflow has no equivalent for --pipe or --sshlogin.

   jobflow makes it possible to set resource limits on the running jobs.
   This can be emulated by GNU parallel using bash's ulimit:

     jobflow -limits=mem=100M,cpu=3,fsize=20M,nofiles=300 myjob

     parallel 'ulimit -v 102400 -t 3 -f 204800 -n 300; myjob'

  EXAMPLES FROM jobflow README

     1$ cat things.list | jobflow -threads=8 -exec ./mytask {}

     1$ cat things.list | parallel -j8 ./mytask {}

     2$ seq 100 | jobflow -threads=100 -exec echo {}

     2$ seq 100 | parallel -j100 echo {}

     3$ cat urls.txt | jobflow -threads=32 -exec wget {}

     3$ cat urls.txt | parallel -j32 wget {}

     4$ find . -name '*.bmp' | \
          jobflow -threads=8 -exec bmp2jpeg {.}.bmp {.}.jpg

     4$ find . -name '*.bmp' | \
          parallel -j8 bmp2jpeg {.}.bmp {.}.jpg

     5$ seq 100 | jobflow -skip 10 -count 10

     5$ seq 100 | parallel --filter '{1} > 10 and {1} <= 20' echo

     5$ seq 100 | parallel echo '{= $_>10 and $_<=20 or skip() =}'

   https://github.com/rofl0r/jobflow (Last checked: 2022-05)

DIFFERENCES BETWEEN gargs AND GNU Parallel
   gargs can run multiple jobs in parallel.

   Older versions cache output in memory. This causes it to be extremely
   slow when the output is larger than the physical RAM, and can cause
   the system to run out of memory.

   See more details on this in man parallel_design.

   Newer versions cache output in files, but leave the files in $TMPDIR
   if gargs is killed.

   Output to stderr (standard error) is changed if the command fails.

  EXAMPLES FROM gargs WEBSITE

     1$ seq 12 -1 1 | gargs -p 4 -n 3 "sleep {0}; echo {1} {2}"

     1$ seq 12 -1 1 | parallel -P 4 -n 3 "sleep {1}; echo {2} {3}"

     2$ cat t.txt | gargs --sep "\s+" \
          -p 2 "echo '{0}:{1}-{2}' full-line: \'{}\'"

     2$ cat t.txt | parallel --colsep "\\s+" \
          -P 2 "echo '{1}:{2}-{3}' full-line: \'{}\'"

   https://github.com/brentp/gargs (Last checked: 2016-08)

DIFFERENCES BETWEEN orgalorg AND GNU Parallel
   orgalorg can run the same job on multiple machines. This is related
   to --onall and --nonall.

   orgalorg supports entering the SSH password - provided it is the same
   for all servers. GNU parallel advocates using ssh-agent instead, but
   it is possible to emulate orgalorg's behavior by setting SSHPASS and
   by using --ssh "sshpass ssh".

   To make the emulation easier, make a simple alias:

     alias par_emul="parallel -j0 --ssh 'sshpass ssh' --nonall --tag --lb"

   If you want to supply a password run:

     SSHPASS=`ssh-askpass`

   or set the password directly:

     SSHPASS=P4$$w0rd!

   If the above is set up you can then do:

     orgalorg -o frontend1 -o frontend2 -p -C uptime
     par_emul -S frontend1 -S frontend2 uptime

     orgalorg -o frontend1 -o frontend2 -p -C top -bid 1
     par_emul -S frontend1 -S frontend2 top -bid 1

     orgalorg -o frontend1 -o frontend2 -p -er /tmp -n \
       'md5sum /tmp/bigfile' -S bigfile
     par_emul -S frontend1 -S frontend2 --basefile bigfile \
       --workdir /tmp md5sum /tmp/bigfile

   orgalorg has a progress indicator for the transferring of a file.
   GNU parallel does not.

   https://github.com/reconquest/orgalorg (Last checked: 2016-08)
941
942 DIFFERENCES BETWEEN Rust parallel(mmstick) AND GNU Parallel
943 Rust parallel focuses on speed. It is almost as fast as xargs, but not
944 as fast as parallel-bash. It implements a few features from GNU
945 parallel, but lacks many functions. All these fail:
946
947 # Read arguments from file
948 parallel -a file echo
949 # Changing the delimiter
950 parallel -d _ echo ::: a_b_c_
951
952 These do something different from GNU parallel
953
954 # -q to protect quoted $ and space
955 parallel -q perl -e '$a=shift; print "$a"x10000000' ::: a b c
956 # Generation of combination of inputs
957 parallel echo {1} {2} ::: red green blue ::: S M L XL XXL
958 # {= perl expression =} replacement string
959 parallel echo '{= s/new/old/ =}' ::: my.new your.new
960 # --pipe
961 seq 100000 | parallel --pipe wc
962 # linked arguments
963 parallel echo ::: S M L :::+ sml med lrg ::: R G B :::+ red grn blu
964 # Run different shell dialects
965 zsh -c 'parallel echo \={} ::: zsh && true'
966 csh -c 'parallel echo \$\{\} ::: shell && true'
967 bash -c 'parallel echo \$\({}\) ::: pwd && true'
968 # Rust parallel does not start before the last argument is read
969 (seq 10; sleep 5; echo 2) | time parallel -j2 'sleep 2; echo'
970 tail -f /var/log/syslog | parallel echo
971
972 Most of the examples from the book GNU Parallel 2018 do not work, thus
973 Rust parallel is not close to being a compatible replacement.
974
975 Rust parallel has no remote facilities.
976
977       It uses /tmp/parallel for tmp files and does not clean up if terminated
978       abruptly. If another user on the system uses Rust parallel, then
979       /tmp/parallel will have the wrong permissions and Rust parallel will
980       fail. A malicious user can set up the right permissions, symlink the
981       output file to one of the user's files, and the next time the user runs
982       Rust parallel it will overwrite this file.
983
984 attacker$ mkdir /tmp/parallel
985 attacker$ chmod a+rwX /tmp/parallel
986 # Symlink to the file the attacker wants to zero out
987 attacker$ ln -s ~victim/.important-file /tmp/parallel/stderr_1
988 victim$ seq 1000 | parallel echo
989 # This file is now overwritten with stderr from 'echo'
990 victim$ cat ~victim/.important-file
991
992       If /tmp/parallel runs full during the run, Rust parallel does not
993       report this, but finishes with success, thereby risking data loss.
994
995 https://github.com/mmstick/parallel (Last checked: 2016-08)
996
997 DIFFERENCES BETWEEN Rush AND GNU Parallel
998 rush (https://github.com/shenwei356/rush) is written in Go and based on
999 gargs.
1000
1001      Just like GNU parallel, rush buffers output in temporary files. But
1002      unlike GNU parallel, rush does not clean up if the process dies abnormally.
1003
1004 rush has some string manipulations that can be emulated by putting this
1005 into ~/.parallel/config (/ is used instead of %, and % is used instead
1006      of ^, as that is closer to bash's ${var%postfix}):
1007
1008 --rpl '{:} s:(\.[^/]+)*$::'
1009 --rpl '{:%([^}]+?)} s:$$1(\.[^/]+)*$::'
1010 --rpl '{/:%([^}]*?)} s:.*/(.*)$$1(\.[^/]+)*$:$1:'
1011 --rpl '{/:} s:(.*/)?([^/.]+)(\.[^/]+)*$:$2:'
1012 --rpl '{@(.*?)} /$$1/ and $_=$1;'
1013
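With those definitions in place the rush-style replacement strings work directly. A quick sketch; the perl line applies the same substitution as the {:} definition above, so it shows the expected result without touching any config:

```shell
# {:} strips every extension, using the s:(\.[^/]+)*$:: substitution above
parallel --rpl '{:} s:(\.[^/]+)*$::' echo {:} ::: dir/file.tar.gz
# the same substitution applied with plain perl prints: dir/file
echo dir/file.tar.gz | perl -pe 's:(\.[^/]+)*$::'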
1014 EXAMPLES FROM rush's WEBSITE
1015
1016 Here are the examples from rush's website with the equivalent command
1017 in GNU parallel.
1018
1019 1. Simple run, quoting is not necessary
1020
1021 1$ seq 1 3 | rush echo {}
1022
1023 1$ seq 1 3 | parallel echo {}
1024
1025 2. Read data from file (`-i`)
1026
1027 2$ rush echo {} -i data1.txt -i data2.txt
1028
1029 2$ cat data1.txt data2.txt | parallel echo {}
1030
1031 3. Keep output order (`-k`)
1032
1033 3$ seq 1 3 | rush 'echo {}' -k
1034
1035 3$ seq 1 3 | parallel -k echo {}
1036
1037 4. Timeout (`-t`)
1038
1039 4$ time seq 1 | rush 'sleep 2; echo {}' -t 1
1040
1041 4$ time seq 1 | parallel --timeout 1 'sleep 2; echo {}'
1042
1043 5. Retry (`-r`)
1044
1045 5$ seq 1 | rush 'python unexisted_script.py' -r 1
1046
1047 5$ seq 1 | parallel --retries 2 'python unexisted_script.py'
1048
1049 Use -u to see it is really run twice:
1050
1051 5$ seq 1 | parallel -u --retries 2 'python unexisted_script.py'
1052
1053 6. Dirname (`{/}`) and basename (`{%}`) and remove custom suffix
1054 (`{^suffix}`)
1055
1056 6$ echo dir/file_1.txt.gz | rush 'echo {/} {%} {^_1.txt.gz}'
1057
1058 6$ echo dir/file_1.txt.gz |
1059 parallel --plus echo {//} {/} {%_1.txt.gz}
1060
1061 7. Get basename, and remove last (`{.}`) or any (`{:}`) extension
1062
1063 7$ echo dir.d/file.txt.gz | rush 'echo {.} {:} {%.} {%:}'
1064
1065 7$ echo dir.d/file.txt.gz | parallel 'echo {.} {:} {/.} {/:}'
1066
1067 8. Job ID, combine fields index and other replacement strings
1068
1069 8$ echo 12 file.txt dir/s_1.fq.gz |
1070 rush 'echo job {#}: {2} {2.} {3%:^_1}'
1071
1072 8$ echo 12 file.txt dir/s_1.fq.gz |
1073 parallel --colsep ' ' 'echo job {#}: {2} {2.} {3/:%_1}'
1074
1075 9. Capture submatch using regular expression (`{@regexp}`)
1076
1077 9$ echo read_1.fq.gz | rush 'echo {@(.+)_\d}'
1078
1079 9$ echo read_1.fq.gz | parallel 'echo {@(.+)_\d}'
1080
1081 10. Custom field delimiter (`-d`)
1082
1083 10$ echo a=b=c | rush 'echo {1} {2} {3}' -d =
1084
1085      10$ echo a=b=c | parallel --colsep = echo {1} {2} {3}
1086
1087 11. Send multi-lines to every command (`-n`)
1088
1089 11$ seq 5 | rush -n 2 -k 'echo "{}"; echo'
1090
1091 11$ seq 5 |
1092 parallel -n 2 -k \
1093 'echo {=-1 $_=join"\n",@arg[1..$#arg] =}; echo'
1094
1095 11$ seq 5 | rush -n 2 -k 'echo "{}"; echo' -J ' '
1096
1097 11$ seq 5 | parallel -n 2 -k 'echo {}; echo'
1098
1099 12. Custom record delimiter (`-D`), note that empty records are not
1100 used.
1101
1102 12$ echo a b c d | rush -D " " -k 'echo {}'
1103
1104 12$ echo a b c d | parallel -d " " -k 'echo {}'
1105
1106 12$ echo abcd | rush -D "" -k 'echo {}'
1107
1108      This cannot be done by GNU parallel.
1109
1110 12$ cat fasta.fa
1111 >seq1
1112 tag
1113 >seq2
1114 cat
1115 gat
1116 >seq3
1117 attac
1118 a
1119 cat
1120
1121 12$ cat fasta.fa | rush -D ">" \
1122 'echo FASTA record {#}: name: {1} sequence: {2}' -k -d "\n"
1123 # rush fails to join the multiline sequences
1124
1125 12$ cat fasta.fa | (read -n1 ignore_first_char;
1126 parallel -d '>' --colsep '\n' echo FASTA record {#}: \
1127 name: {1} sequence: '{=2 $_=join"",@arg[2..$#arg]=}'
1128 )
1129
1130 13. Assign value to variable, like `awk -v` (`-v`)
1131
1132 13$ seq 1 |
1133 rush 'echo Hello, {fname} {lname}!' -v fname=Wei -v lname=Shen
1134
1135 13$ seq 1 |
1136 parallel -N0 \
1137 'fname=Wei; lname=Shen; echo Hello, ${fname} ${lname}!'
1138
1139 13$ for var in a b; do \
1140 13$ seq 1 3 | rush -k -v var=$var 'echo var: {var}, data: {}'; \
1141 13$ done
1142
1143 In GNU parallel you would typically do:
1144
1145 13$ seq 1 3 | parallel -k echo var: {1}, data: {2} ::: a b :::: -
1146
1147 If you really want the var:
1148
1149 13$ seq 1 3 |
1150 parallel -k var={1} ';echo var: $var, data: {}' ::: a b :::: -
1151
1152 If you really want the for-loop:
1153
1154 13$ for var in a b; do
1155 export var;
1156 seq 1 3 | parallel -k 'echo var: $var, data: {}';
1157 done
1158
1159      Contrary to rush, this also works if the value is complex, like:
1160
1161 My brother's 12" records
1162
1163 14. Preset variable (`-v`), avoid repeatedly writing verbose
1164 replacement strings
1165
1166 14$ # naive way
1167 echo read_1.fq.gz | rush 'echo {:^_1} {:^_1}_2.fq.gz'
1168
1169 14$ echo read_1.fq.gz | parallel 'echo {:%_1} {:%_1}_2.fq.gz'
1170
1171 14$ # macro + removing suffix
1172 echo read_1.fq.gz |
1173 rush -v p='{:^_1}' 'echo {p} {p}_2.fq.gz'
1174
1175 14$ echo read_1.fq.gz |
1176 parallel 'p={:%_1}; echo $p ${p}_2.fq.gz'
1177
1178 14$ # macro + regular expression
1179 echo read_1.fq.gz | rush -v p='{@(.+?)_\d}' 'echo {p} {p}_2.fq.gz'
1180
1181 14$ echo read_1.fq.gz | parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1182
1183      Contrary to rush, GNU parallel works with complex values:
1184
1185 14$ echo "My brother's 12\"read_1.fq.gz" |
1186 parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1187
1188 15. Interrupt jobs by `Ctrl-C`, rush will stop unfinished commands and
1189 exit.
1190
1191 15$ seq 1 20 | rush 'sleep 1; echo {}'
1192 ^C
1193
1194 15$ seq 1 20 | parallel 'sleep 1; echo {}'
1195 ^C
1196
1197      16. Continue/resume jobs (`-c`). When some jobs fail (by execution
1198      failure, timeout, or cancellation by the user with `Ctrl + C`), switch
1199      flag `-c/--continue` on and run again, so that `rush` can save
1200      successful commands and ignore them in the NEXT run.
1201
1202 16$ seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1203 cat successful_cmds.rush
1204 seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1205
1206 16$ seq 1 3 | parallel --joblog mylog --timeout 2 \
1207 'sleep {}; echo {}'
1208 cat mylog
1209 seq 1 3 | parallel --joblog mylog --retry-failed \
1210 'sleep {}; echo {}'
1211
1212 Multi-line jobs:
1213
1214 16$ seq 1 3 | rush 'sleep {}; echo {}; \
1215 echo finish {}' -t 3 -c -C finished.rush
1216 cat finished.rush
1217 seq 1 3 | rush 'sleep {}; echo {}; \
1218 echo finish {}' -t 3 -c -C finished.rush
1219
1220 16$ seq 1 3 |
1221 parallel --joblog mylog --timeout 2 'sleep {}; echo {}; \
1222 echo finish {}'
1223 cat mylog
1224 seq 1 3 |
1225 parallel --joblog mylog --retry-failed 'sleep {}; echo {}; \
1226 echo finish {}'
1227
1228 17. A comprehensive example: downloading 1K+ pages given by three URL
1229 list files using `phantomjs save_page.js` (some page contents are
1230 dynamically generated by Javascript, so `wget` does not work). Here I
1231 set max jobs number (`-j`) as `20`, each job has a max running time
1232      (`-t`) of `60` seconds and `3` retry chances (`-r`). Continue flag `-c`
1233 is also switched on, so we can continue unfinished jobs. Luckily, it's
1234 accomplished in one run :)
1235
1236 17$ for f in $(seq 2014 2016); do \
1237 /bin/rm -rf $f; mkdir -p $f; \
1238 cat $f.html.txt | rush -v d=$f -d = \
1239 'phantomjs save_page.js "{}" > {d}/{3}.html' \
1240 -j 20 -t 60 -r 3 -c; \
1241 done
1242
1243 GNU parallel can append to an existing joblog with '+':
1244
1245 17$ rm mylog
1246 for f in $(seq 2014 2016); do
1247 /bin/rm -rf $f; mkdir -p $f;
1248 cat $f.html.txt |
1249 parallel -j20 --timeout 60 --retries 4 --joblog +mylog \
1250 --colsep = \
1251 phantomjs save_page.js {1}={2}={3} '>' $f/{3}.html
1252 done
1253
1254 18. A bioinformatics example: mapping with `bwa`, and processing result
1255 with `samtools`:
1256
1257 18$ ref=ref/xxx.fa
1258 threads=25
1259 ls -d raw.cluster.clean.mapping/* \
1260 | rush -v ref=$ref -v j=$threads -v p='{}/{%}' \
1261 'bwa mem -t {j} -M -a {ref} {p}_1.fq.gz {p}_2.fq.gz >{p}.sam;\
1262 samtools view -bS {p}.sam > {p}.bam; \
1263 samtools sort -T {p}.tmp -@ {j} {p}.bam -o {p}.sorted.bam; \
1264 samtools index {p}.sorted.bam; \
1265 samtools flagstat {p}.sorted.bam > {p}.sorted.bam.flagstat; \
1266 /bin/rm {p}.bam {p}.sam;' \
1267 -j 2 --verbose -c -C mapping.rush
1268
1269 GNU parallel would use a function:
1270
1271 18$ ref=ref/xxx.fa
1272 export ref
1273 thr=25
1274 export thr
1275 bwa_sam() {
1276 p="$1"
1277 bam="$p".bam
1278 sam="$p".sam
1279 sortbam="$p".sorted.bam
1280 bwa mem -t $thr -M -a $ref ${p}_1.fq.gz ${p}_2.fq.gz > "$sam"
1281 samtools view -bS "$sam" > "$bam"
1282 samtools sort -T ${p}.tmp -@ $thr "$bam" -o "$sortbam"
1283 samtools index "$sortbam"
1284 samtools flagstat "$sortbam" > "$sortbam".flagstat
1285 /bin/rm "$bam" "$sam"
1286 }
1287 export -f bwa_sam
1288 ls -d raw.cluster.clean.mapping/* |
1289 parallel -j 2 --verbose --joblog mylog bwa_sam
1290
1291 Other rush features
1292
1293 rush has:
1294
1295 • awk -v like custom defined variables (-v)
1296
1297 With GNU parallel you would simply set a shell variable:
1298
1299 parallel 'v={}; echo "$v"' ::: foo
1300 echo foo | rush -v v={} 'echo {v}'
1301
1302        rush also mishandles special characters, so these do not work:
1303
1304 echo does not work | rush -v v=\" 'echo {v}'
1305 echo "My brother's 12\" records" | rush -v v={} 'echo {v}'
1306
1307 Whereas the corresponding GNU parallel version works:
1308
1309 parallel 'v=\"; echo "$v"' ::: works
1310 parallel 'v={}; echo "$v"' ::: "My brother's 12\" records"
1311
1312 • Exit on first error(s) (-e)
1313
1314 This is called --halt now,fail=1 (or shorter: --halt 2) when used
1315 with GNU parallel.
1316
1317 • Settable records sending to every command (-n, default 1)
1318
1319 This is also called -n in GNU parallel.
1320
1321 • Practical replacement strings
1322
1323 {:} remove any extension
1324 With GNU parallel this can be emulated by:
1325
1326 parallel --plus echo '{/\..*/}' ::: foo.ext.bar.gz
1327
1328 {^suffix}, remove suffix
1329 With GNU parallel this can be emulated by:
1330
1331 parallel --plus echo '{%.bar.gz}' ::: foo.ext.bar.gz
1332
1333 {@regexp}, capture submatch using regular expression
1334 With GNU parallel this can be emulated by:
1335
1336 parallel --rpl '{@(.*?)} /$$1/ and $_=$1;' \
1337 echo '{@\d_(.*).gz}' ::: 1_foo.gz
1338
1339 {%.}, {%:}, basename without extension
1340 With GNU parallel this can be emulated by:
1341
1342 parallel echo '{= s:.*/::;s/\..*// =}' ::: dir/foo.bar.gz
1343
1344 And if you need it often, you define a --rpl in
1345 $HOME/.parallel/config:
1346
1347 --rpl '{%.} s:.*/::;s/\..*//'
1348 --rpl '{%:} s:.*/::;s/\..*//'
1349
1350 Then you can use them as:
1351
1352 parallel echo {%.} {%:} ::: dir/foo.bar.gz
1353
1354 • Preset variable (macro)
1355
1356 E.g.
1357
1358 echo foosuffix | rush -v p={^suffix} 'echo {p}_new_suffix'
1359
1360 With GNU parallel this can be emulated by:
1361
1362 echo foosuffix |
1363 parallel --plus 'p={%suffix}; echo ${p}_new_suffix'
1364
1365        Unlike rush, GNU parallel works fine if the input contains double
1366        spaces, ' and ":
1367
1368 echo "1'6\" foosuffix" |
1369 parallel --plus 'p={%suffix}; echo "${p}"_new_suffix'
1370
1371 • Commands of multi-lines
1372
1373        While you can use multi-line commands in GNU parallel, GNU parallel
1374        discourages them to improve readability. In most cases they can be
1375        written as a function:
1376
1377 seq 1 3 |
1378 parallel --timeout 2 --joblog my.log 'sleep {}; echo {}; \
1379 echo finish {}'
1380
1381 Could be written as:
1382
1383 doit() {
1384 sleep "$1"
1385 echo "$1"
1386 echo finish "$1"
1387 }
1388 export -f doit
1389 seq 1 3 | parallel --timeout 2 --joblog my.log doit
1390
1391 The failed commands can be resumed with:
1392
1393 seq 1 3 |
1394 parallel --resume-failed --joblog my.log 'sleep {}; echo {};\
1395 echo finish {}'
1396
1397 https://github.com/shenwei356/rush (Last checked: 2017-05)
1398
1399 DIFFERENCES BETWEEN ClusterSSH AND GNU Parallel
1400 ClusterSSH solves a different problem than GNU parallel.
1401
1402      ClusterSSH opens a terminal window for each computer and, using a
1403      master window, you can run the same command on all the computers. This
1404      is typically used for administering several computers that are almost
1405      identical.
1406
1407 GNU parallel runs the same (or different) commands with different
1408 arguments in parallel possibly using remote computers to help
1409 computing. If more than one computer is listed in -S GNU parallel may
1410 only use one of these (e.g. if there are 8 jobs to be run and one
1411 computer has 8 cores).
1412
1413 GNU parallel can be used as a poor-man's version of ClusterSSH:
1414
1415 parallel --nonall -S server-a,server-b do_stuff foo bar
1416
1417 https://github.com/duncs/clusterssh (Last checked: 2010-12)
1418
1419 DIFFERENCES BETWEEN coshell AND GNU Parallel
1420 coshell only accepts full commands on standard input. Any quoting needs
1421 to be done by the user.
1422
1423 Commands are run in sh so any bash/tcsh/zsh specific syntax will not
1424 work.
1425
1426      Output can be buffered by using -d. Output is buffered in memory, so
1427      big output can cause swapping and therefore be terribly slow, or even
1428      cause the system to run out of memory.
1429
1430 https://github.com/gdm85/coshell (Last checked: 2019-01)
1431
1432 DIFFERENCES BETWEEN spread AND GNU Parallel
1433 spread runs commands on all directories.
1434
1435 It can be emulated with GNU parallel using this Bash function:
1436
1437 spread() {
1438 _cmds() {
1439 perl -e '$"=" && ";print "@ARGV"' "cd {}" "$@"
1440 }
1441 parallel $(_cmds "$@")'|| echo exit status $?' ::: */
1442 }
1443
1444 This works except for the --exclude option.
1445
1446 (Last checked: 2017-11)
1447
1448 DIFFERENCES BETWEEN pyargs AND GNU Parallel
1449 pyargs deals badly with input containing spaces. It buffers stdout, but
1450 not stderr. It buffers in RAM. {} does not work as replacement string.
1451 It does not support running functions.
1452
1453 pyargs does not support composed commands if run with --lines, and
1454 fails on pyargs traceroute gnu.org fsf.org.
1455
1456 Examples
1457
1458 seq 5 | pyargs -P50 -L seq
1459 seq 5 | parallel -P50 --lb seq
1460
1461 seq 5 | pyargs -P50 --mark -L seq
1462 seq 5 | parallel -P50 --lb \
1463 --tagstring OUTPUT'[{= $_=$job->replaced() =}]' seq
1464 # Similar, but not precisely the same
1465 seq 5 | parallel -P50 --lb --tag seq
1466
1467 seq 5 | pyargs -P50 --mark command
1468 # Somewhat longer with GNU Parallel due to the special
1469 # --mark formatting
1470 cmd="$(echo "command" | parallel --shellquote)"
1471 wrap_cmd() {
1472 echo "MARK $cmd $@================================" >&3
1473 echo "OUTPUT START[$cmd $@]:"
1474 eval $cmd "$@"
1475 echo "OUTPUT END[$cmd $@]"
1476 }
1477 (seq 5 | env_parallel -P2 wrap_cmd) 3>&1
1478 # Similar, but not exactly the same
1479 seq 5 | parallel -t --tag command
1480
1481 (echo '1 2 3';echo 4 5 6) | pyargs --stream seq
1482 (echo '1 2 3';echo 4 5 6) | perl -pe 's/\n/ /' |
1483 parallel -r -d' ' seq
1484 # Similar, but not exactly the same
1485 parallel seq ::: 1 2 3 4 5 6
1486
1487 https://github.com/robertblackwell/pyargs (Last checked: 2019-01)
1488
1489 DIFFERENCES BETWEEN concurrently AND GNU Parallel
1490 concurrently runs jobs in parallel.
1491
1492 The output is prepended with the job number, and may be incomplete:
1493
1494 $ concurrently 'seq 100000' | (sleep 3;wc -l)
1495 7165
1496
1497      When pretty printing, it caches output in memory. Output mixes (as the
1498      test MIX below shows) whether or not output is cached.
1499
1500 There seems to be no way of making a template command and have
1501 concurrently fill that with different args. The full commands must be
1502 given on the command line.
1503
1504 There is also no way of controlling how many jobs should be run in
1505 parallel at a time - i.e. "number of jobslots". Instead all jobs are
1506 simply started in parallel.
1507
1508 https://github.com/kimmobrunfeldt/concurrently (Last checked: 2019-01)
1509
1510 DIFFERENCES BETWEEN map(soveran) AND GNU Parallel
1511 map does not run jobs in parallel by default. The README suggests
1512 using:
1513
1514 ... | map t 'sleep $t && say done &'
1515
1516 But this fails if more jobs are run in parallel than the number of
1517 available processes. Since there is no support for parallelization in
1518 map itself, the output also mixes:
1519
1520 seq 10 | map i 'echo start-$i && sleep 0.$i && echo end-$i &'
1521
1522 The major difference is that GNU parallel is built for parallelization
1523 and map is not. So GNU parallel has lots of ways of dealing with the
1524 issues that parallelization raises:
1525
1526 • Keep the number of processes manageable
1527
1528 • Make sure output does not mix
1529
1530 • Make Ctrl-C kill all running processes
1531
1532      EXAMPLES FROM map's WEBSITE
1533
1534 Here are the 5 examples converted to GNU Parallel:
1535
1536 1$ ls *.c | map f 'foo $f'
1537 1$ ls *.c | parallel foo
1538
1539 2$ ls *.c | map f 'foo $f; bar $f'
1540 2$ ls *.c | parallel 'foo {}; bar {}'
1541
1542 3$ cat urls | map u 'curl -O $u'
1543 3$ cat urls | parallel curl -O
1544
1545 4$ printf "1\n1\n1\n" | map t 'sleep $t && say done'
1546 4$ printf "1\n1\n1\n" | parallel 'sleep {} && say done'
1547 4$ parallel 'sleep {} && say done' ::: 1 1 1
1548
1549 5$ printf "1\n1\n1\n" | map t 'sleep $t && say done &'
1550 5$ printf "1\n1\n1\n" | parallel -j0 'sleep {} && say done'
1551 5$ parallel -j0 'sleep {} && say done' ::: 1 1 1
1552
1553 https://github.com/soveran/map (Last checked: 2019-01)
1554
1555 DIFFERENCES BETWEEN loop AND GNU Parallel
1556 loop mixes stdout and stderr:
1557
1558 loop 'ls /no-such-file' >/dev/null
1559
1560 loop's replacement string $ITEM does not quote strings:
1561
1562 echo 'two spaces' | loop 'echo $ITEM'
1563
1564 loop cannot run functions:
1565
1566 myfunc() { echo joe; }
1567 export -f myfunc
1568 loop 'myfunc this fails'
1569
1570 EXAMPLES FROM loop's WEBSITE
1571
1572 Some of the examples from https://github.com/Miserlou/Loop/ can be
1573 emulated with GNU parallel:
1574
1575 # A couple of functions will make the code easier to read
1576 $ loopy() {
1577 yes | parallel -uN0 -j1 "$@"
1578 }
1579 $ export -f loopy
1580 $ time_out() {
1581 parallel -uN0 -q --timeout "$@" ::: 1
1582 }
1583 $ match() {
1584 perl -0777 -ne 'grep /'"$1"'/,$_ and print or exit 1'
1585 }
1586 $ export -f match
1587
1588 $ loop 'ls' --every 10s
1589 $ loopy --delay 10s ls
1590
1591 $ loop 'touch $COUNT.txt' --count-by 5
1592 $ loopy touch '{= $_=seq()*5 =}'.txt
1593
1594 $ loop --until-contains 200 -- \
1595            ./get_response_code.sh --site mysite.biz
1596 $ loopy --halt now,success=1 \
1597 './get_response_code.sh --site mysite.biz | match 200'
1598
1599 $ loop './poke_server' --for-duration 8h
1600 $ time_out 8h loopy ./poke_server
1601
1602 $ loop './poke_server' --until-success
1603 $ loopy --halt now,success=1 ./poke_server
1604
1605 $ cat files_to_create.txt | loop 'touch $ITEM'
1606 $ cat files_to_create.txt | parallel touch {}
1607
1608 $ loop 'ls' --for-duration 10min --summary
1609 # --joblog is somewhat more verbose than --summary
1610        $ time_out 10m loopy --joblog my.log ls; cat my.log
1611
1612 $ loop 'echo hello'
1613 $ loopy echo hello
1614
1615 $ loop 'echo $COUNT'
1616 # GNU Parallel counts from 1
1617 $ loopy echo {#}
1618 # Counting from 0 can be forced
1619 $ loopy echo '{= $_=seq()-1 =}'
1620
1621 $ loop 'echo $COUNT' --count-by 2
1622 $ loopy echo '{= $_=2*(seq()-1) =}'
1623
1624 $ loop 'echo $COUNT' --count-by 2 --offset 10
1625 $ loopy echo '{= $_=10+2*(seq()-1) =}'
1626
1627 $ loop 'echo $COUNT' --count-by 1.1
1628 # GNU Parallel rounds 3.3000000000000003 to 3.3
1629 $ loopy echo '{= $_=1.1*(seq()-1) =}'
1630
1631 $ loop 'echo $COUNT $ACTUALCOUNT' --count-by 2
1632 $ loopy echo '{= $_=2*(seq()-1) =} {#}'
1633
1634 $ loop 'echo $COUNT' --num 3 --summary
1635 # --joblog is somewhat more verbose than --summary
1636 $ seq 3 | parallel --joblog my.log echo; cat my.log
1637
1638 $ loop 'ls -foobarbatz' --num 3 --summary
1639 # --joblog is somewhat more verbose than --summary
1640 $ seq 3 | parallel --joblog my.log -N0 ls -foobarbatz; cat my.log
1641
1642 $ loop 'echo $COUNT' --count-by 2 --num 50 --only-last
1643 # Can be emulated by running 2 jobs
1644 $ seq 49 | parallel echo '{= $_=2*(seq()-1) =}' >/dev/null
1645        $ echo 50 | parallel echo '{= $_=2*(seq()-1) =}'
1646
1647 $ loop 'date' --every 5s
1648 $ loopy --delay 5s date
1649
1650 $ loop 'date' --for-duration 8s --every 2s
1651 $ time_out 8s loopy --delay 2s date
1652
1653 $ loop 'date -u' --until-time '2018-05-25 20:50:00' --every 5s
1654        $ seconds=$((`date -d 2018-05-25T20:50:00 +%s` - `date +%s`))s
1655 $ time_out $seconds loopy --delay 5s date -u
1656
1657 $ loop 'echo $RANDOM' --until-contains "666"
1658 $ loopy --halt now,success=1 'echo $RANDOM | match 666'
1659
1660 $ loop 'if (( RANDOM % 2 )); then
1661 (echo "TRUE"; true);
1662 else
1663 (echo "FALSE"; false);
1664 fi' --until-success
1665 $ loopy --halt now,success=1 'if (( $RANDOM % 2 )); then
1666 (echo "TRUE"; true);
1667 else
1668 (echo "FALSE"; false);
1669 fi'
1670
1671 $ loop 'if (( RANDOM % 2 )); then
1672 (echo "TRUE"; true);
1673 else
1674 (echo "FALSE"; false);
1675 fi' --until-error
1676 $ loopy --halt now,fail=1 'if (( $RANDOM % 2 )); then
1677 (echo "TRUE"; true);
1678 else
1679 (echo "FALSE"; false);
1680 fi'
1681
1682 $ loop 'date' --until-match "(\d{4})"
1683 $ loopy --halt now,success=1 'date | match [0-9][0-9][0-9][0-9]'
1684
1685 $ loop 'echo $ITEM' --for red,green,blue
1686 $ parallel echo ::: red green blue
1687
1688 $ cat /tmp/my-list-of-files-to-create.txt | loop 'touch $ITEM'
1689 $ cat /tmp/my-list-of-files-to-create.txt | parallel touch
1690
1691 $ ls | loop 'cp $ITEM $ITEM.bak'; ls
1692 $ ls | parallel cp {} {}.bak; ls
1693
1694 $ loop 'echo $ITEM | tr a-z A-Z' -i
1695 $ parallel 'echo {} | tr a-z A-Z'
1696 # Or more efficiently:
1697 $ parallel --pipe tr a-z A-Z
1698
1699 $ loop 'echo $ITEM' --for "`ls`"
1700 $ parallel echo {} ::: "`ls`"
1701
1702 $ ls | loop './my_program $ITEM' --until-success;
1703 $ ls | parallel --halt now,success=1 ./my_program {}
1704
1705 $ ls | loop './my_program $ITEM' --until-fail;
1706 $ ls | parallel --halt now,fail=1 ./my_program {}
1707
1708 $ ./deploy.sh;
1709 loop 'curl -sw "%{http_code}" http://coolwebsite.biz' \
1710 --every 5s --until-contains 200;
1711 ./announce_to_slack.sh
1712 $ ./deploy.sh;
1713 loopy --delay 5s --halt now,success=1 \
1714 'curl -sw "%{http_code}" http://coolwebsite.biz | match 200';
1715 ./announce_to_slack.sh
1716
1717 $ loop "ping -c 1 mysite.com" --until-success; ./do_next_thing
1718 $ loopy --halt now,success=1 ping -c 1 mysite.com; ./do_next_thing
1719
1720 $ ./create_big_file -o my_big_file.bin;
1721 loop 'ls' --until-contains 'my_big_file.bin';
1722 ./upload_big_file my_big_file.bin
1723 # inotifywait is a better tool to detect file system changes.
1724 # It can even make sure the file is complete
1725 # so you are not uploading an incomplete file
1726 $ inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f . |
1727 grep my_big_file.bin
1728
1729 $ ls | loop 'cp $ITEM $ITEM.bak'
1730 $ ls | parallel cp {} {}.bak
1731
1732 $ loop './do_thing.sh' --every 15s --until-success --num 5
1733 $ parallel --retries 5 --delay 15s ::: ./do_thing.sh
1734
1735 https://github.com/Miserlou/Loop/ (Last checked: 2018-10)
1736
1737 DIFFERENCES BETWEEN lorikeet AND GNU Parallel
1738 lorikeet can run jobs in parallel. It does this based on a dependency
1739 graph described in a file, so this is similar to make.
1740
1741 https://github.com/cetra3/lorikeet (Last checked: 2018-10)
1742
1743 DIFFERENCES BETWEEN spp AND GNU Parallel
1744 spp can run jobs in parallel. spp does not use a command template to
1745 generate the jobs, but requires jobs to be in a file. Output from the
1746 jobs mix.
1747
1748 https://github.com/john01dav/spp (Last checked: 2019-01)
1749
1750 DIFFERENCES BETWEEN paral AND GNU Parallel
1751 paral prints a lot of status information and stores the output from the
1752      commands run into files. This means it cannot be used in the middle
1753      of a pipe like this:
1754
1755 paral "echo this" "echo does not" "echo work" | wc
1756
1757 Instead it puts the output into files named like out_#_command.out.log.
1758 To get a very similar behaviour with GNU parallel use --results
1759 'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta
1760
1761 paral only takes arguments on the command line and each argument should
1762 be a full command. Thus it does not use command templates.
1763
1764 This limits how many jobs it can run in total, because they all need to
1765 fit on a single command line.
1766
1767 paral has no support for running jobs remotely.
1768
1769 EXAMPLES FROM README.markdown
1770
1771 The examples from README.markdown and the corresponding command run
1772 with GNU parallel (--results
1773 'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta is omitted from the
1774 GNU parallel command):
1775
1776 1$ paral "command 1" "command 2 --flag" "command arg1 arg2"
1777 1$ parallel ::: "command 1" "command 2 --flag" "command arg1 arg2"
1778
1779 2$ paral "sleep 1 && echo c1" "sleep 2 && echo c2" \
1780 "sleep 3 && echo c3" "sleep 4 && echo c4" "sleep 5 && echo c5"
1781 2$ parallel ::: "sleep 1 && echo c1" "sleep 2 && echo c2" \
1782 "sleep 3 && echo c3" "sleep 4 && echo c4" "sleep 5 && echo c5"
1783 # Or shorter:
1784 parallel "sleep {} && echo c{}" ::: {1..5}
1785
1786 3$ paral -n=0 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1787 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1788 3$ parallel ::: "sleep 5 && echo c5" "sleep 4 && echo c4" \
1789 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1790 # Or shorter:
1791 parallel -j0 "sleep {} && echo c{}" ::: 5 4 3 2 1
1792
1793 4$ paral -n=1 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1794 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1795 4$ parallel -j1 "sleep {} && echo c{}" ::: 5 4 3 2 1
1796
1797 5$ paral -n=2 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1798 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1799 5$ parallel -j2 "sleep {} && echo c{}" ::: 5 4 3 2 1
1800
1801 6$ paral -n=5 "sleep 5 && echo c5" "sleep 4 && echo c4" \
1802 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
1803 6$ parallel -j5 "sleep {} && echo c{}" ::: 5 4 3 2 1
1804
1805 7$ paral -n=1 "echo a && sleep 0.5 && echo b && sleep 0.5 && \
1806 echo c && sleep 0.5 && echo d && sleep 0.5 && \
1807 echo e && sleep 0.5 && echo f && sleep 0.5 && \
1808 echo g && sleep 0.5 && echo h"
1809 7$ parallel ::: "echo a && sleep 0.5 && echo b && sleep 0.5 && \
1810 echo c && sleep 0.5 && echo d && sleep 0.5 && \
1811 echo e && sleep 0.5 && echo f && sleep 0.5 && \
1812 echo g && sleep 0.5 && echo h"
1813
1814 https://github.com/amattn/paral (Last checked: 2019-01)
1815
1816 DIFFERENCES BETWEEN concurr AND GNU Parallel
1817 concurr is built to run jobs in parallel using a client/server model.
1818
1819 EXAMPLES FROM README.md
1820
1821 The examples from README.md:
1822
1823 1$ concurr 'echo job {#} on slot {%}: {}' : arg1 arg2 arg3 arg4
1824 1$ parallel 'echo job {#} on slot {%}: {}' ::: arg1 arg2 arg3 arg4
1825
1826 2$ concurr 'echo job {#} on slot {%}: {}' :: file1 file2 file3
1827 2$ parallel 'echo job {#} on slot {%}: {}' :::: file1 file2 file3
1828
1829 3$ concurr 'echo {}' < input_file
1830 3$ parallel 'echo {}' < input_file
1831
1832 4$ cat file | concurr 'echo {}'
1833 4$ cat file | parallel 'echo {}'
1834
1835      concurr deals badly with empty input files and with output larger than 64
1836 KB.
1837
1838 https://github.com/mmstick/concurr (Last checked: 2019-01)
1839
1840 DIFFERENCES BETWEEN lesser-parallel AND GNU Parallel
1841 lesser-parallel is the inspiration for parallel --embed. Both lesser-
1842 parallel and parallel --embed define bash functions that can be
1843 included as part of a bash script to run jobs in parallel.
1844
1845 lesser-parallel implements a few of the replacement strings, but hardly
1846 any options, whereas parallel --embed gives you the full GNU parallel
1847 experience.
1848
1849 https://github.com/kou1okada/lesser-parallel (Last checked: 2019-01)
1850
1851 DIFFERENCES BETWEEN npm-parallel AND GNU Parallel
1852 npm-parallel can run npm tasks in parallel.
1853
1854 There are no examples and very little documentation, so it is hard to
1855 compare to GNU parallel.
1856
1857 https://github.com/spion/npm-parallel (Last checked: 2019-01)
1858
1859 DIFFERENCES BETWEEN machma AND GNU Parallel
1860 machma runs tasks in parallel. It gives time stamped output. It buffers
1861 in RAM.
1862
1863 EXAMPLES FROM README.md
1864
1865 The examples from README.md:
1866
1867 1$ # Put shorthand for timestamp in config for the examples
1868 echo '--rpl '\
1869 \''{time} $_=::strftime("%Y-%m-%d %H:%M:%S",localtime())'\' \
1870 > ~/.parallel/machma
1871 echo '--line-buffer --tagstring "{#} {time} {}"' \
1872 >> ~/.parallel/machma
1873
1874 2$ find . -iname '*.jpg' |
1875 machma -- mogrify -resize 1200x1200 -filter Lanczos {}
1876 find . -iname '*.jpg' |
1877 parallel --bar -Jmachma mogrify -resize 1200x1200 \
1878 -filter Lanczos {}
1879
1880 3$ cat /tmp/ips | machma -p 2 -- ping -c 2 -q {}
1881 3$ cat /tmp/ips | parallel -j2 -Jmachma ping -c 2 -q {}
1882
1883 4$ cat /tmp/ips |
1884 machma -- sh -c 'ping -c 2 -q $0 > /dev/null && echo alive' {}
1885 4$ cat /tmp/ips |
1886 parallel -Jmachma 'ping -c 2 -q {} > /dev/null && echo alive'
1887
1888 5$ find . -iname '*.jpg' |
1889 machma --timeout 5s -- mogrify -resize 1200x1200 \
1890 -filter Lanczos {}
1891 5$ find . -iname '*.jpg' |
1892 parallel --timeout 5s --bar mogrify -resize 1200x1200 \
1893 -filter Lanczos {}
1894
1895 6$ find . -iname '*.jpg' -print0 |
1896 machma --null -- mogrify -resize 1200x1200 -filter Lanczos {}
1897 6$ find . -iname '*.jpg' -print0 |
1898 parallel --null --bar mogrify -resize 1200x1200 \
1899 -filter Lanczos {}
1900
1901 https://github.com/fd0/machma (Last checked: 2019-06)
1902
1903 DIFFERENCES BETWEEN interlace AND GNU Parallel
1904 Summary (see legend above):
1905
1906 - I2 I3 I4 - - -
1907 M1 - M3 - - M6
1908 - O2 O3 - - - - x x
1909 E1 E2 - - - - -
1910 - - - - - - - - -
1911 - -
1912
1913 interlace is built for network analysis to run network tools in
1914 parallel.
1915
1916 interlace does not buffer output, so output from different jobs mixes.
1917
1918 The total overhead grows as O(n*n) in the number of targets: with 1000
1919 targets it becomes very slow, around 500 ms of overhead per target.
1920
1921 EXAMPLES FROM interlace's WEBSITE
1922
1923 Using prips, most of the examples from
1924 https://github.com/codingo/Interlace can be run with GNU parallel:
1925
1926 Blocker
1927
1928 commands.txt:
1929 mkdir -p _output_/_target_/scans/
1930 _blocker_
1931 nmap _target_ -oA _output_/_target_/scans/_target_-nmap
1932 interlace -tL ./targets.txt -cL commands.txt -o $output
1933
1934 parallel -a targets.txt \
1935 mkdir -p $output/{}/scans/\; nmap {} -oA $output/{}/scans/{}-nmap
1936
1937 Blocks
1938
1939 commands.txt:
1940 _block:nmap_
1941 mkdir -p _target_/output/scans/
1942 nmap _target_ -oN _target_/output/scans/_target_-nmap
1943 _block:nmap_
1944 nikto --host _target_
1945 interlace -tL ./targets.txt -cL commands.txt
1946
1947 _nmap() {
1948 mkdir -p $1/output/scans/
1949 nmap $1 -oN $1/output/scans/$1-nmap
1950 }
1951 export -f _nmap
1952 parallel ::: _nmap "nikto --host" :::: targets.txt
1953
1954 Run Nikto Over Multiple Sites
1955
1956 interlace -tL ./targets.txt -threads 5 \
1957 -c "nikto --host _target_ > ./_target_-nikto.txt" -v
1958
1959 parallel -a targets.txt -P5 nikto --host {} \> ./{}-nikto.txt
1960
1961 Run Nikto Over Multiple Sites and Ports
1962
1963 interlace -tL ./targets.txt -threads 5 -c \
1964 "nikto --host _target_:_port_ > ./_target_-_port_-nikto.txt" \
1965 -p 80,443 -v
1966
1967 parallel -P5 nikto --host {1}:{2} \> ./{1}-{2}-nikto.txt \
1968 :::: targets.txt ::: 80 443
1969
1970 Run a List of Commands against Target Hosts
1971
1972 commands.txt:
1973 nikto --host _target_:_port_ > _output_/_target_-nikto.txt
1974 sslscan _target_:_port_ > _output_/_target_-sslscan.txt
1975 testssl.sh _target_:_port_ > _output_/_target_-testssl.txt
1976 interlace -t example.com -o ~/Engagements/example/ \
1977 -cL ./commands.txt -p 80,443
1978
1979 parallel --results ~/Engagements/example/{2}:{3}{1} {1} {2}:{3} \
1980 ::: "nikto --host" sslscan testssl.sh ::: example.com ::: 80 443
1981
1982 CIDR notation with an application that doesn't support it
1983
1984 interlace -t 192.168.12.0/24 -c "vhostscan _target_ \
1985 -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
1986
1987 prips 192.168.12.0/24 |
1988 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
1989
1990 Glob notation with an application that doesn't support it
1991
1992 interlace -t 192.168.12.* -c "vhostscan _target_ \
1993 -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
1994
1995 # Glob is not supported in prips
1996 prips 192.168.12.0/24 |
1997 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
1998
1999 Dash (-) notation with an application that doesn't support it
2000
2001 interlace -t 192.168.12.1-15 -c \
2002 "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
2003 -o ~/scans/ -threads 50
2004
2005 # Dash notation is not supported in prips
2006 prips 192.168.12.1 192.168.12.15 |
2007 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2008
2009 Threading Support for an application that doesn't support it
2010
2011 interlace -tL ./target-list.txt -c \
2012 "vhostscan -t _target_ -oN _output_/_target_-vhosts.txt" \
2013 -o ~/scans/ -threads 50
2014
2015 cat ./target-list.txt |
2016 parallel -P50 vhostscan -t {} -oN ~/scans/{}-vhosts.txt
2017
2018 alternatively
2019
2020 ./vhosts-commands.txt:
2021 vhostscan -t $target -oN _output_/_target_-vhosts.txt
2022 interlace -cL ./vhosts-commands.txt -tL ./target-list.txt \
2023 -threads 50 -o ~/scans
2024
2025 ./vhosts-commands.txt:
2026 vhostscan -t "$1" -oN "$2"
2027 parallel -P50 ./vhosts-commands.txt {} ~/scans/{}-vhosts.txt \
2028 :::: ./target-list.txt
2029
2030 Exclusions
2031
2032 interlace -t 192.168.12.0/24 -e 192.168.12.0/26 -c \
2033 "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
2034 -o ~/scans/ -threads 50
2035
2036 prips 192.168.12.0/24 | grep -xv -Ff <(prips 192.168.12.0/26) |
2037 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
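
        The grep invocation above is a generic set difference: -F -f reads
        the exclusion list as fixed strings, -x matches whole lines, and -v
        inverts the match. A minimal sketch with made-up addresses:

```shell
# Set difference with grep: keep the lines of the full list that are
# not in the exclusion list. Addresses are examples only.
printf '%s\n' 10.0.0.2 > excluded.txt
printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.3 | grep -xv -Ff excluded.txt
rm excluded.txt
```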
2038
2039 Run Nikto Using Multiple Proxies
2040
2041 interlace -tL ./targets.txt -pL ./proxies.txt -threads 5 -c \
2042 "nikto --host _target_:_port_ -useproxy _proxy_ > \
2043 ./_target_-_port_-nikto.txt" -p 80,443 -v
2044
2045 parallel -j5 \
2046 "nikto --host {1}:{2} -useproxy {3} > ./{1}-{2}-nikto.txt" \
2047 :::: ./targets.txt ::: 80 443 :::: ./proxies.txt
2048
2049 https://github.com/codingo/Interlace (Last checked: 2019-09)
2050
2051 DIFFERENCES BETWEEN otonvm Parallel AND GNU Parallel
2052 I have been unable to get the code to run at all. It seems unfinished.
2053
2054 https://github.com/otonvm/Parallel (Last checked: 2019-02)
2055
2056 DIFFERENCES BETWEEN k-bx par AND GNU Parallel
2057 par requires Haskell to work. This limits the number of platforms this
2058 can work on.
2059
2060 par does line buffering in memory. The memory usage is 3x the longest
2061 line (compared to 1x for parallel --lb). Commands must be given as
2062 arguments. There is no template.
2063
2064 These are the examples from https://github.com/k-bx/par with the
2065 corresponding GNU parallel command.
2066
2067 par "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
2068 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2069 parallel --lb ::: "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
2070 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2071
2072 par "echo foo; sleep 1; foofoo" \
2073 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2074 parallel --lb --halt 1 ::: "echo foo; sleep 1; foofoo" \
2075 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2076
2077 par "PARPREFIX=[fooechoer] echo foo" "PARPREFIX=[bar] echo bar"
2078 parallel --lb --colsep , --tagstring {1} {2} \
2079 ::: "[fooechoer],echo foo" "[bar],echo bar"
2080
2081 par --succeed "foo" "bar" && echo 'wow'
2082 parallel ::: "foo" "bar"; true && echo 'wow'
2083
2084 https://github.com/k-bx/par (Last checked: 2019-02)
2085
2086 DIFFERENCES BETWEEN parallelshell AND GNU Parallel
2087 parallelshell does not allow for composed commands:
2088
2089 # This does not work
2090 parallelshell 'echo foo;echo bar' 'echo baz;echo quuz'
2091
2092 Instead you have to wrap that in a shell:
2093
2094 parallelshell 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
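
        The wrapper works because the composed command is passed as a single
        argument, and the inner sh does the splitting on ';'. A minimal
        illustration:

```shell
# One argument, two commands: the inner shell splits on ';'.
sh -c 'echo foo; echo bar'
```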
2095
2096 It buffers output in RAM. All commands must be given on the command
2097 line and all commands are started in parallel at the same time. This
2098 will cause the system to freeze if there are so many jobs that there is
2099 not enough memory to run them all at the same time.
2100
2101 https://github.com/keithamus/parallelshell (Last checked: 2019-02)
2102
2103 https://github.com/darkguy2008/parallelshell (Last checked: 2019-03)
2104
2105 DIFFERENCES BETWEEN shell-executor AND GNU Parallel
2106 shell-executor does not allow for composed commands:
2107
2108 # This does not work
2109 sx 'echo foo;echo bar' 'echo baz;echo quuz'
2110
2111 Instead you have to wrap that in a shell:
2112
2113 sx 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2114
2115 It buffers output in RAM. All commands must be given on the command
2116 line and all commands are started in parallel at the same time. This
2117 will cause the system to freeze if there are so many jobs that there is
2118 not enough memory to run them all at the same time.
2119
2120 https://github.com/royriojas/shell-executor (Last checked: 2019-02)
2121
2122 DIFFERENCES BETWEEN non-GNU par AND GNU Parallel
2123 par buffers in memory to avoid mixing of jobs. It takes 1s per 1
2124 million output lines.
2125
2126 par needs to have all commands before starting the first job. The jobs
2127 are read from stdin (standard input) so any quoting will have to be
2128 done by the user.
2129
2130 Stdout (standard output) is prepended with o:. Stderr (standard error)
2131 is sent to stdout (standard output) and prepended with e:.
2132
2133 For short jobs with little output par is 20% faster than GNU parallel
2134 and 60% slower than xargs.
2135
2136 https://github.com/UnixJunkie/PAR
2137
2138 https://savannah.nongnu.org/projects/par (Last checked: 2019-02)
2139
2140 DIFFERENCES BETWEEN fd AND GNU Parallel
2141 fd does not support composed commands, so commands must be wrapped in
2142 sh -c.
2143
2144 It buffers output in RAM.
2145
2146 It only takes file names from the filesystem as input (similar to
2147 find).
2148
2149 https://github.com/sharkdp/fd (Last checked: 2019-02)
2150
2151 DIFFERENCES BETWEEN lateral AND GNU Parallel
2152 lateral is very similar to sem: It takes a single command and runs it
2153 in the background. The design means that output from parallel running
2154 jobs may mix. If it dies unexpectedly, it leaves a socket in
2155 ~/.lateral/socket.PID.
2156
2157 lateral deals badly with too long command lines. This makes the lateral
2158 server crash:
2159
2160 lateral run echo `seq 100000| head -c 1000k`
2161
2162 Any options will be read by lateral, so this does not work (lateral
2163 interprets the -l):
2164
2165 lateral run ls -l
2166
2167 Composed commands do not work:
2168
2169 lateral run pwd ';' ls
2170
2171 Functions do not work:
2172
2173 myfunc() { echo a; }
2174 export -f myfunc
2175 lateral run myfunc
2176
2177 Running emacs in the terminal causes the parent shell to die:
2178
2179 echo '#!/bin/bash' > mycmd
2180 echo emacs -nw >> mycmd
2181 chmod +x mycmd
2182 lateral start
2183 lateral run ./mycmd
2184
2185 Here are the examples from https://github.com/akramer/lateral with the
2186 corresponding GNU sem and GNU parallel commands:
2187
2188 1$ lateral start
2189 for i in $(cat /tmp/names); do
2190 lateral run -- some_command $i
2191 done
2192 lateral wait
2193
2194 1$ for i in $(cat /tmp/names); do
2195 sem some_command $i
2196 done
2197 sem --wait
2198
2199 1$ parallel some_command :::: /tmp/names
2200
2201 2$ lateral start
2202 for i in $(seq 1 100); do
2203 lateral run -- my_slow_command < workfile$i > /tmp/logfile$i
2204 done
2205 lateral wait
2206
2207 2$ for i in $(seq 1 100); do
2208 sem my_slow_command < workfile$i > /tmp/logfile$i
2209 done
2210 sem --wait
2211
2212 2$ parallel 'my_slow_command < workfile{} > /tmp/logfile{}' \
2213 ::: {1..100}
2214
2215 3$ lateral start -p 0 # yup, it will just queue tasks
2216 for i in $(seq 1 100); do
2217 lateral run -- command_still_outputs_but_wont_spam inputfile$i
2218 done
2219 # command output spam can commence
2220 lateral config -p 10; lateral wait
2221
2222 3$ for i in $(seq 1 100); do
2223 echo "command inputfile$i" >> joblist
2224 done
2225 parallel -j 10 :::: joblist
2226
2227 3$ echo 1 > /tmp/njobs
2228 parallel -j /tmp/njobs command inputfile{} \
2229 ::: {1..100} &
2230 echo 10 >/tmp/njobs
2231 wait
2232
2233 https://github.com/akramer/lateral (Last checked: 2019-03)
2234
2235 DIFFERENCES BETWEEN with-this AND GNU Parallel
2236 The examples from https://github.com/amritb/with-this.git and the
2237 corresponding GNU parallel command:
2238
2239 with -v "$(cat myurls.txt)" "curl -L this"
2240 parallel curl -L :::: myurls.txt
2241
2242 with -v "$(cat myregions.txt)" \
2243 "aws --region=this ec2 describe-instance-status"
2244 parallel aws --region={} ec2 describe-instance-status \
2245 :::: myregions.txt
2246
2247 with -v "$(ls)" "kubectl --kubeconfig=this get pods"
2248 ls | parallel kubectl --kubeconfig={} get pods
2249
2250 with -v "$(ls | grep config)" "kubectl --kubeconfig=this get pods"
2251 ls | grep config | parallel kubectl --kubeconfig={} get pods
2252
2253 with -v "$(echo {1..10})" "echo 123"
2254 parallel -N0 echo 123 ::: {1..10}
2255
2256 Stderr is merged with stdout. with-this buffers in RAM. It uses 3x the
2257 output size, so you cannot have output larger than 1/3rd of the
2258 RAM. The input values cannot contain spaces. Composed commands do not
2259 work.
2260
2261 with-this gives some additional information, so the output has to be
2262 cleaned before piping it to the next command.
2263
2264 https://github.com/amritb/with-this.git (Last checked: 2019-03)
2265
2266 DIFFERENCES BETWEEN Tollef's parallel (moreutils) AND GNU Parallel
2267 Summary (see legend above):
2268
2269 - - - I4 - - I7
2270 - - M3 - - M6
2271 - O2 O3 - O5 O6 - x x
2272 E1 - - - - - E7
2273 - x x x x x x x x
2274 - -
2275
2276 EXAMPLES FROM Tollef's parallel MANUAL
2277
2278 Tollef parallel sh -c "echo hi; sleep 2; echo bye" -- 1 2 3
2279
2280 GNU parallel "echo hi; sleep 2; echo bye" ::: 1 2 3
2281
2282 Tollef parallel -j 3 ufraw -o processed -- *.NEF
2283
2284 GNU parallel -j 3 ufraw -o processed ::: *.NEF
2285
2286 Tollef parallel -j 3 -- ls df "echo hi"
2287
2288 GNU parallel -j 3 ::: ls df "echo hi"
2289
2290 (Last checked: 2019-08)
2291
2292 DIFFERENCES BETWEEN rargs AND GNU Parallel
2293 Summary (see legend above):
2294
2295 I1 - - - - - I7
2296 - - M3 M4 - -
2297 - O2 O3 - O5 O6 - O8 -
2298 E1 - - E4 - - -
2299 - - - - - - - - -
2300 - -
2301
2302 rargs has elegant ways of doing named regexp capture and field ranges.
2303
2304 With GNU parallel you can use --rpl to get functionality similar to
2305 regexp capture, and use join and @arg to get the field ranges, but
2306 the syntax is longer. This:
2307
2308 --rpl '{r(\d+)\.\.(\d+)} $_=join"$opt::colsep",@arg[$$1..$$2]'
2309
2310 would make it possible to use:
2311
2312 {1r3..6}
2313
2314 for field 3..6.
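
        For reference, and independent of GNU parallel, this is the field
        selection {1r3..6} performs, sketched with plain awk on
        colon-separated input:

```shell
# Select fields 3..6 of a colon-separated line, joined with the same
# separator -- the operation a field range like {r3..6} denotes.
echo 'a:b:c:d:e:f:g' |
  awk -F: '{ out = $3; for (i = 4; i <= 6; i++) out = out FS $i; print out }'
```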
2315
2316 For full support of {n..m:s} including negative numbers use a dynamic
2317 replacement string like this:
2318
2319 PARALLEL=--rpl\ \''{r((-?\d+)?)\.\.((-?\d+)?)((:([^}]*))?)}
2320 $a = defined $$2 ? $$2 < 0 ? 1+$#arg+$$2 : $$2 : 1;
2321 $b = defined $$4 ? $$4 < 0 ? 1+$#arg+$$4 : $$4 : $#arg+1;
2322 $s = defined $$6 ? $$7 : " ";
2323 $_ = join $s,@arg[$a..$b]'\'
2324 export PARALLEL
2325
2326 You can then do:
2327
2328 head /etc/passwd | parallel --colsep : echo ..={1r..} ..3={1r..3} \
2329 4..={1r4..} 2..4={1r2..4} 3..3={1r3..3} ..3:-={1r..3:-} \
2330 ..3:/={1r..3:/} -1={-1} -5={-5} -6={-6} -3..={1r-3..}
2331
2332 EXAMPLES FROM rargs MANUAL
2333
2334 1$ ls *.bak | rargs -p '(.*)\.bak' mv {0} {1}
2335
2336 1$ ls *.bak | parallel mv {} {.}
2337
2338 2$ cat download-list.csv |
2339 rargs -p '(?P<url>.*),(?P<filename>.*)' wget {url} -O {filename}
2340
2341 2$ cat download-list.csv |
2342 parallel --csv wget {1} -O {2}
2343 # or use regexps:
2344 2$ cat download-list.csv |
2345 parallel --rpl '{url} s/,.*//' --rpl '{filename} s/.*?,//' \
2346 wget {url} -O {filename}
2347
2348 3$ cat /etc/passwd |
2349 rargs -d: echo -e 'id: "{1}"\t name: "{5}"\t rest: "{6..::}"'
2350
2351 3$ cat /etc/passwd |
2352 parallel -q --colsep : \
2353 echo -e 'id: "{1}"\t name: "{5}"\t rest: "{=6 $_=join":",@arg[6..$#arg]=}"'
2354
2355 https://github.com/lotabout/rargs (Last checked: 2020-01)
2356
2357 DIFFERENCES BETWEEN threader AND GNU Parallel
2358 Summary (see legend above):
2359
2360 I1 - - - - - -
2361 M1 - M3 - - M6
2362 O1 - O3 - O5 - - N/A N/A
2363 E1 - - E4 - - -
2364 - - - - - - - - -
2365 - -
2366
2367 Newline separates arguments, but a newline at the end of the file is
2368 treated as an empty argument. So this runs 2 jobs:
2369
2370 echo two_jobs | threader -run 'echo "$THREADID"'
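
        The quirk is easy to reproduce with awk's split, which (like
        threader, apparently) produces an empty field after the final
        newline:

```shell
# Splitting "two_jobs\n" on every newline yields a trailing empty
# field, so one input line counts as two jobs.
awk 'BEGIN { n = split("two_jobs\n", a, "\n"); print n, "jobs" }'
```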
2371
2372 threader ignores stderr, so any output to stderr is lost. threader
2373 buffers in RAM, so output bigger than the machine's virtual memory will
2374 cause the machine to crash.
2375
2376 https://github.com/voodooEntity/threader (Last checked: 2020-04)
2377
2378 DIFFERENCES BETWEEN runp AND GNU Parallel
2379 Summary (see legend above):
2380
2381 I1 I2 - - - - -
2382 M1 - (M3) - - M6
2383 O1 O2 O3 - O5 O6 - N/A N/A -
2384 E1 - - - - - -
2385 - - - - - - - - -
2386 - -
2387
2388 (M3): You can add a prefix and a postfix to the input, which means the
2389 argument can only be inserted on the command line once.
2390
2391 runp runs 10 jobs in parallel by default. runp blocks if the output of
2392 a command is > 64 Kbytes. Quoting of input is needed. It adds output
2393 to stderr (this can be prevented with -q).
2394
2395 Examples as GNU Parallel
2396
2397 base='https://images-api.nasa.gov/search'
2398 query='jupiter'
2399 desc='planet'
2400 type='image'
2401 url="$base?q=$query&description=$desc&media_type=$type"
2402
2403 # Download the images in parallel using runp
2404 curl -s $url | jq -r .collection.items[].href | \
2405 runp -p 'curl -s' | jq -r .[] | grep large | \
2406 runp -p 'curl -s -L -O'
2407
2408 time curl -s $url | jq -r .collection.items[].href | \
2409 runp -g 1 -q -p 'curl -s' | jq -r .[] | grep large | \
2410 runp -g 1 -q -p 'curl -s -L -O'
2411
2412 # Download the images in parallel
2413 curl -s $url | jq -r .collection.items[].href | \
2414 parallel curl -s | jq -r .[] | grep large | \
2415 parallel curl -s -L -O
2416
2417 time curl -s $url | jq -r .collection.items[].href | \
2418 parallel -j 1 curl -s | jq -r .[] | grep large | \
2419 parallel -j 1 curl -s -L -O
2420
2421 Run some test commands (read from file)
2422
2423 # Create a file containing commands to run in parallel.
2424 cat << EOF > /tmp/test-commands.txt
2425 sleep 5
2426 sleep 3
2427 blah # this will fail
2428 ls $PWD # PWD shell variable is used here
2429 EOF
2430
2431 # Run commands from the file.
2432 runp /tmp/test-commands.txt > /dev/null
2433
2434 parallel -a /tmp/test-commands.txt > /dev/null
2435
2436 Ping several hosts and see packet loss (read from stdin)
2437
2438 # First copy this line and press Enter
2439 runp -p 'ping -c 5 -W 2' -s '| grep loss'
2440 localhost
2441 1.1.1.1
2442 8.8.8.8
2443 # Press Enter and Ctrl-D when done entering the hosts
2444
2445 # First copy this line and press Enter
2446 parallel ping -c 5 -W 2 {} '| grep loss'
2447 localhost
2448 1.1.1.1
2449 8.8.8.8
2450 # Press Enter and Ctrl-D when done entering the hosts
2451
2452 Get directories' sizes (read from stdin)
2453
2454 echo -e "$HOME\n/etc\n/tmp" | runp -q -p 'sudo du -sh'
2455
2456 echo -e "$HOME\n/etc\n/tmp" | parallel sudo du -sh
2457 # or:
2458 parallel sudo du -sh ::: "$HOME" /etc /tmp
2459
2460 Compress files
2461
2462 find . -iname '*.txt' | runp -p 'gzip --best'
2463
2464 find . -iname '*.txt' | parallel gzip --best
2465
2466 Measure HTTP request + response time
2467
2468 export CURL="curl -w 'time_total: %{time_total}\n'"
2469 CURL="$CURL -o /dev/null -s https://golang.org/"
2470 perl -wE 'for (1..10) { say $ENV{CURL} }' |
2471 runp -q # Make 10 requests
2472
2473 perl -wE 'for (1..10) { say $ENV{CURL} }' | parallel
2474 # or:
2475 parallel -N0 "$CURL" ::: {1..10}
2476
2477 Find open TCP ports
2478
2479 cat << EOF > /tmp/host-port.txt
2480 localhost 22
2481 localhost 80
2482 localhost 81
2483 127.0.0.1 443
2484 127.0.0.1 444
2485 scanme.nmap.org 22
2486 scanme.nmap.org 23
2487 scanme.nmap.org 443
2488 EOF
2489
2490 1$ cat /tmp/host-port.txt |
2491 runp -q -p 'netcat -v -w2 -z' 2>&1 | egrep '(succeeded!|open)$'
2492
2493 # --colsep is needed to split the line
2494 1$ cat /tmp/host-port.txt |
2495 parallel --colsep ' ' netcat -v -w2 -z 2>&1 |
2496 egrep '(succeeded!|open)$'
2497 # or use uq for unquoted:
2498 1$ cat /tmp/host-port.txt |
2499 parallel netcat -v -w2 -z {=uq=} 2>&1 |
2500 egrep '(succeeded!|open)$'
2501
2502 https://github.com/jreisinger/runp (Last checked: 2020-04)
2503
2504 DIFFERENCES BETWEEN papply AND GNU Parallel
2505 Summary (see legend above):
2506
2507 - - - I4 - - -
2508 M1 - M3 - - M6
2509 - - O3 - O5 - - N/A N/A O10
2510 E1 - - E4 - - -
2511 - - - - - - - - -
2512 - -
2513
2514 papply does not print the output if the command fails:
2515
2516 $ papply 'echo %F; false' foo
2517 "echo foo; false" did not succeed
2518
2519 papply's replacement strings (%F %d %f %n %e %z) can be simulated in
2520 GNU parallel by putting this in ~/.parallel/config:
2521
2522 --rpl '%F'
2523 --rpl '%d $_=Q(::dirname($_));'
2524 --rpl '%f s:.*/::;'
2525 --rpl '%n s:.*/::;s:\.[^/.]+$::;'
2526 --rpl '%e s:.*\.:.:'
2527 --rpl '%z $_=""'
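
        For reference, the path manipulations behind %d, %f, %n and %e can
        be sketched with plain bash parameter expansion (example path only):

```shell
# What papply's path replacements compute, shown with bash
# parameter expansion on a made-up path.
p='/tmp/dir/photo.old.png'
d=${p%/*}         # %d: dirname            -> /tmp/dir
f=${p##*/}        # %f: basename           -> photo.old.png
n=${f%.*}         # %n: basename, no ext   -> photo.old
e=".${f##*.}"     # %e: extension with dot -> .png
printf '%s\n' "$d" "$f" "$n" "$e"
```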
2528
2529 papply buffers in RAM, and uses twice the output size in memory: output
2530 of 5 GB takes 10 GB of RAM.
2531
2532 The buffering is very CPU intensive: Buffering a line of 5 GB takes 40
2533 seconds (compared to 10 seconds with GNU parallel).
2534
2535 Examples as GNU Parallel
2536
2537 1$ papply gzip *.txt
2538
2539 1$ parallel gzip ::: *.txt
2540
2541 2$ papply "convert %F %n.jpg" *.png
2542
2543 2$ parallel convert {} {.}.jpg ::: *.png
2544
2545 https://pypi.org/project/papply/ (Last checked: 2020-04)
2546
2547 DIFFERENCES BETWEEN async AND GNU Parallel
2548 Summary (see legend above):
2549
2550 - - - I4 - - I7
2551 - - - - - M6
2552 - O2 O3 - O5 O6 - N/A N/A O10
2553 E1 - - E4 - E6 -
2554 - - - - - - - - -
2555 S1 S2
2556
2557 async is very similar to GNU parallel's --semaphore mode (aka sem).
2558 async requires the user to start a server process.
2559
2560 The input is quoted like -q so you need bash -c "...;..." to run
2561 composed commands.
2562
2563 Examples as GNU Parallel
2564
2565 1$ S="/tmp/example_socket"
2566
2567 1$ ID=myid
2568
2569 2$ async -s="$S" server --start
2570
2571 2$ # GNU Parallel does not need a server to run
2572
2573 3$ for i in {1..20}; do
2574 # prints command output to stdout
2575 async -s="$S" cmd -- bash -c "sleep 1 && echo test $i"
2576 done
2577
2578 3$ for i in {1..20}; do
2579 # prints command output to stdout
2580 sem --id "$ID" -j100% "sleep 1 && echo test $i"
2581 # GNU Parallel will only print job when it is done
2582 # If you need output from different jobs to mix
2583 # use -u or --line-buffer
2584 sem --id "$ID" -j100% --line-buffer "sleep 1 && echo test $i"
2585 done
2586
2587 4$ # wait until all commands are finished
2588 async -s="$S" wait
2589
2590 4$ sem --id "$ID" --wait
2591
2592 5$ # configure the server to run four commands in parallel
2593 async -s="$S" server -j4
2594
2595 5$ export PARALLEL=-j4
2596
2597 6$ mkdir "/tmp/ex_dir"
2598 for i in {21..40}; do
2599 # redirects command output to /tmp/ex_dir/file*
2600 async -s="$S" cmd -o "/tmp/ex_dir/file$i" -- \
2601 bash -c "sleep 1 && echo test $i"
2602 done
2603
2604 6$ mkdir "/tmp/ex_dir"
2605 for i in {21..40}; do
2606 # redirects command output to /tmp/ex_dir/file*
2607 sem --id "$ID" --result '/tmp/my-ex/file-{=$_=""=}'"$i" \
2608 "sleep 1 && echo test $i"
2609 done
2610
2611 7$ sem --id "$ID" --wait
2612
2613 7$ async -s="$S" wait
2614
2615 8$ # stops server
2616 async -s="$S" server --stop
2617
2618 8$ # GNU Parallel does not need to stop a server
2619
2620 https://github.com/ctbur/async/ (Last checked: 2023-01)
2621
2622 DIFFERENCES BETWEEN pardi AND GNU Parallel
2623 Summary (see legend above):
2624
2625 I1 I2 - - - - I7
2626 M1 - - - - M6
2627 O1 O2 O3 O4 O5 - O7 - - O10
2628 E1 - - E4 - - -
2629 - - - - - - - - -
2630 - -
2631
2632 pardi is very similar to parallel --pipe --cat: It reads blocks of data
2633 and not arguments. So it cannot insert an argument in the command line.
2634 It puts the block into a temporary file, and this file name (%IN) can
2635 be put in the command line. You can only use %IN once.
2636
2637 It can also run full command lines in parallel (like: cat file |
2638 parallel).
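
        A rough sketch, in plain bash, of the block-to-tempfile model both
        tools use (the helper name and block size are made up for
        illustration):

```shell
# Sketch (assumed behavior): group stdin into fixed-size blocks, write
# each block to a temp file, and hand the file name (pardi's %IN) to
# the worker command -- here just "wc -l".
split_to_files() {
  local n=$1 count=0 tmp line
  tmp=$(mktemp)
  while IFS= read -r line; do
    printf '%s\n' "$line" >> "$tmp"
    count=$((count + 1))
    if [ "$count" -eq "$n" ]; then
      echo "$(wc -l < "$tmp") lines"   # the command runs on the file
      rm -f "$tmp"; tmp=$(mktemp); count=0
    fi
  done
  if [ "$count" -gt 0 ]; then
    echo "$(wc -l < "$tmp") lines"     # flush the last partial block
  fi
  rm -f "$tmp"
}
seq 7 | split_to_files 3   # 7 input lines in blocks of 3
```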
2639
2640 EXAMPLES FROM pardi test.sh
2641
2642 1$ time pardi -v -c 100 -i data/decoys.smi -ie .smi -oe .smi \
2643 -o data/decoys_std_pardi.smi \
2644 -w '(standardiser -i %IN -o %OUT 2>&1) > /dev/null'
2645
2646 1$ cat data/decoys.smi |
2647 time parallel -N 100 --pipe --cat \
2648 '(standardiser -i {} -o {#} 2>&1) > /dev/null; cat {#}; rm {#}' \
2649 > data/decoys_std_pardi.smi
2650
2651 2$ pardi -n 1 -i data/test_in.types -o data/test_out.types \
2652 -d 'r:^#atoms:' -w 'cat %IN > %OUT'
2653
2654 2$ cat data/test_in.types |
2655 parallel -n 1 -k --pipe --cat --regexp --recstart '^#atoms' \
2656 'cat {}' > data/test_out.types
2657
2658 3$ pardi -c 6 -i data/test_in.types -o data/test_out.types \
2659 -d 'r:^#atoms:' -w 'cat %IN > %OUT'
2660
2661 3$ cat data/test_in.types |
2662 parallel -n 6 -k --pipe --cat --regexp --recstart '^#atoms' \
2663 'cat {}' > data/test_out.types
2664
2665 4$ pardi -i data/decoys.mol2 -o data/still_decoys.mol2 \
2666 -d 's:@<TRIPOS>MOLECULE' -w 'cp %IN %OUT'
2667
2668 4$ cat data/decoys.mol2 |
2669 parallel -n 1 --pipe --cat --recstart '@<TRIPOS>MOLECULE' \
2670 'cp {} {#}; cat {#}; rm {#}' > data/still_decoys.mol2
2671
2672 5$ pardi -i data/decoys.mol2 -o data/decoys2.mol2 \
2673 -d b:10000 -w 'cp %IN %OUT' --preserve
2674
2675 5$ cat data/decoys.mol2 |
2676 parallel -k --pipe --block 10k --recend '' --cat \
2677 'cat {} > {#}; cat {#}; rm {#}' > data/decoys2.mol2
2678
2679 https://github.com/UnixJunkie/pardi (Last checked: 2021-01)
2680
2681 DIFFERENCES BETWEEN bthread AND GNU Parallel
2682 Summary (see legend above):
2683
2684 - - - I4 - - -
2685 - - - - - M6
2686 O1 - O3 - - - O7 O8 - -
2687 E1 - - - - - -
2688 - - - - - - - - -
2689 - -
2690
2691 bthread takes around 1 sec per MB of output. The maximal output line
2692 length is 1073741759.
2693
2694 You cannot quote space in the command, so you cannot run composed
2695 commands like sh -c "echo a; echo b".
2696
2697 https://gitlab.com/netikras/bthread (Last checked: 2021-01)
2698
2699 DIFFERENCES BETWEEN simple_gpu_scheduler AND GNU Parallel
2700 Summary (see legend above):
2701
2702 I1 - - - - - I7
2703 M1 - - - - M6
2704 - O2 O3 - - O6 - x x O10
2705 E1 - - - - - -
2706 - - - - - - - - -
2707 - -
2708
2709 EXAMPLES FROM simple_gpu_scheduler MANUAL
2710
2711 1$ simple_gpu_scheduler --gpus 0 1 2 < gpu_commands.txt
2712
2713 1$ parallel -j3 --shuf \
2714 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' \
2715 < gpu_commands.txt
2716
2717 2$ simple_hypersearch \
2718 "python3 train_dnn.py --lr {lr} --batch_size {bs}" \
2719 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 |
2720 simple_gpu_scheduler --gpus 0,1,2
2721
2722 2$ parallel --header : --shuf -j3 -v \
2723 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =}' \
2724 python3 train_dnn.py --lr {lr} --batch_size {bs} \
2725 ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
2726
2727 3$ simple_hypersearch \
2728 "python3 train_dnn.py --lr {lr} --batch_size {bs}" \
2729 --n-samples 5 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 |
2730 simple_gpu_scheduler --gpus 0,1,2
2731
2732 3$ parallel --header : --shuf \
2733 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1; seq()>5 and skip() =}' \
2734 python3 train_dnn.py --lr {lr} --batch_size {bs} \
2735 ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
2736
2737 4$ touch gpu.queue
2738 tail -f -n 0 gpu.queue | simple_gpu_scheduler --gpus 0,1,2 &
2739 echo "my_command_with | and stuff > logfile" >> gpu.queue
2740
2741 4$ touch gpu.queue
2742 tail -f -n 0 gpu.queue |
2743 parallel -j3 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' &
2744 # Needed to fill job slots once
2745 seq 3 | parallel echo true >> gpu.queue
2746 # Add jobs
2747 echo "my_command_with | and stuff > logfile" >> gpu.queue
2748 # Needed to flush output from completed jobs
2749 seq 3 | parallel echo true >> gpu.queue
2750
2751 https://github.com/ExpectationMax/simple_gpu_scheduler (Last checked:
2752 2021-01)
2753
2754 DIFFERENCES BETWEEN parasweep AND GNU Parallel
2755 parasweep is a Python module for facilitating parallel parameter
2756 sweeps.
2757
2758 A parasweep job will normally take a text file as input. The text file
2759 contains arguments for the job. Some of these arguments will be fixed
2760 and some of them will be changed by parasweep.
2761
2762 It does this by having a template file such as template.txt:
2763
2764 Xval: {x}
2765 Yval: {y}
2766 FixedValue: 9
2767 # x with 2 decimals
2768 DecimalX: {x:.2f}
2769 TenX: ${x*10}
2770 RandomVal: {r}
2771
2772 and from this template it generates the file to be used by the job by
2773 replacing the replacement strings.
2774
2775 Being a Python module, parasweep integrates more tightly with Python than
2776 GNU parallel does. You get the parameters directly in a Python data structure.
2777 With GNU parallel you can use the JSON or CSV output format to get
2778 something similar, but you would have to read the output.
2779
2780 parasweep has a filtering method to ignore parameter combinations you
2781 do not need.
2782
2783 Instead of calling the jobs directly, parasweep can use Python's
2784 Distributed Resource Management Application API to make jobs run with
2785 different cluster software.
2786
2787 GNU parallel --tmpl supports templates with replacement strings. Such
2788 as:
2789
2790 Xval: {x}
2791 Yval: {y}
2792 FixedValue: 9
2793 # x with 2 decimals
2794 DecimalX: {=x $_=sprintf("%.2f",$_) =}
2795 TenX: {=x $_=$_*10 =}
2796 RandomVal: {=1 $_=rand() =}
2797
2798 that can be used like:
2799
2800 parallel --header : --tmpl my.tmpl={#}.t myprog {#}.t \
2801 ::: x 1 2 3 ::: y 1 2 3
2802
2803 Filtering is supported as:
2804
2805 parallel --filter '{1} > {2}' echo ::: 1 2 3 ::: 1 2 3
2806
2807 https://github.com/eviatarbach/parasweep (Last checked: 2021-01)
2808
2809 DIFFERENCES BETWEEN parallel-bash AND GNU Parallel
2810 Summary (see legend above):
2811
2812 I1 I2 - - - - -
2813 - - M3 - - M6
2814 - O2 O3 - O5 O6 - O8 x O10
2815 E1 - - - - - -
2816 - - - - - - - - -
2817 - -
2818
2819 parallel-bash is written in pure bash. It is really fast (overhead of
2820 ~0.05 ms/job compared to GNU parallel's 3-10 ms/job). So if your jobs
2821 are extremely short-lived, and you can live with its quite limited
2822 command syntax, this may be useful.
2823
2824 It works by making a queue for each process. Then the jobs are
2825 distributed to the queues in a round robin fashion. Finally the queues
2826 are started in parallel. This works fine if you are lucky, but if not,
2827 all the long jobs may end up in the same queue, so you may see:
2828
2829 $ printf "%b\n" 1 1 1 4 1 1 1 4 1 1 1 4 |
2830 time parallel -P4 sleep {}
2831 (7 seconds)
2832 $ printf "%b\n" 1 1 1 4 1 1 1 4 1 1 1 4 |
2833 time ./parallel-bash.bash -p 4 -c sleep {}
2834 (12 seconds)
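
        The effect is easy to reproduce with a sketch of round-robin
        assignment in plain bash: with 4 queues, every 4th job lands in the
        same queue, so the three slow "4" jobs all serialize.

```shell
# Round-robin distribution: jobs at indexes 3, 7 and 11 (the slow
# "sleep 4" jobs) all land in queue 3 and would run one after another.
jobs=(1 1 1 4 1 1 1 4 1 1 1 4)
q0=() q1=() q2=() q3=()
for i in "${!jobs[@]}"; do
  case $((i % 4)) in
    0) q0+=("${jobs[i]}") ;;
    1) q1+=("${jobs[i]}") ;;
    2) q2+=("${jobs[i]}") ;;
    3) q3+=("${jobs[i]}") ;;
  esac
done
echo "queue 3 gets: ${q3[*]}"   # all the slow jobs
```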
2835
2836 Because it uses bash lists, the total number of jobs is limited to
2837 167000..265000 depending on your environment. You get a segmentation
2838 fault, when you reach the limit.
2839
2840 Ctrl-C does not stop spawning new jobs. Ctrl-Z does not suspend running
2841 jobs.
2842
2843 EXAMPLES FROM parallel-bash
2844
2845 1$ some_input | parallel-bash -p 5 -c echo
2846
2847 1$ some_input | parallel -j 5 echo
2848
2849 2$ parallel-bash -p 5 -c echo < some_file
2850
2851 2$ parallel -j 5 echo < some_file
2852
2853 3$ parallel-bash -p 5 -c echo <<< 'some string'
2854
2855 3$ parallel -j 5 echo <<< 'some string'
2856
2857 4$ something | parallel-bash -p 5 -c echo {} {}
2858
2859 4$ something | parallel -j 5 echo {} {}
2860
2861 https://reposhub.com/python/command-line-tools/Akianonymus-parallel-bash.html
2862 (Last checked: 2021-06)
2863
2864 DIFFERENCES BETWEEN bash-concurrent AND GNU Parallel
2865 bash-concurrent is more an alternative to make than to GNU parallel.
2866 Its input is very similar to a Makefile, where jobs depend on other
2867 jobs.
2868
2869 It has a nice progress indicator where you can see which jobs completed
2870 successfully, which jobs are currently running, which jobs failed, and
2871 which jobs were skipped because a job they depend on failed. The indicator
2872 does not deal well with resizing the window.
2873
2874 Output is cached in tempfiles on disk, but is only shown if there is an
2875 error, so it is not meant to be part of a UNIX pipeline. If bash-
2876 concurrent crashes these tempfiles are not removed.
2877
2878 It uses an O(n*n) algorithm, so if you have 1000 independent jobs it
2879 takes 22 seconds to start them.
2880
2881 https://github.com/themattrix/bash-concurrent (Last checked: 2021-02)
2882
2883 DIFFERENCES BETWEEN spawntool AND GNU Parallel
2884 Summary (see legend above):
2885
2886 I1 - - - - - -
2887 M1 - - - - M6
2888 - O2 O3 - O5 O6 - x x O10
2889 E1 - - - - - -
2890 - - - - - - - - -
2891 - -
2892
2893 spawn reads a full command line from stdin, which it executes in
2894 parallel.
2895
2896 http://code.google.com/p/spawntool/ (Last checked: 2021-07)
2897
2898 DIFFERENCES BETWEEN go-pssh AND GNU Parallel
2899 Summary (see legend above):
2900
2901 - - - - - - -
2902 M1 - - - - -
2903 O1 - - - - - - x x O10
2904 E1 - - - - - -
2905 R1 R2 - - - R6 - - -
2906 - -
2907
2908 go-pssh does ssh in parallel to multiple machines. It runs the same
2909 command on multiple machines similar to --nonall.
2910
2911       Hosts must be given as IP addresses, not as hostnames.
2912
2913 Output is sent to stdout (standard output) if command is successful,
2914 and to stderr (standard error) if the command fails.
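
       The success/failure routing can be emulated in plain bash. This is a
       sketch of the principle, not go-pssh's actual implementation, and
       run_route is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: send a command's combined output to stdout if it succeeds
# and to stderr if it fails (go-pssh-like routing). run_route is a
# hypothetical helper, not part of go-pssh.
run_route() {
    local out rc
    out=$("$@" 2>&1)
    rc=$?
    if [ "$rc" -eq 0 ]; then
        printf '%s\n' "$out"          # success: stdout
    else
        printf '%s\n' "$out" >&2      # failure: stderr
    fi
    return "$rc"
}

run_route echo ok                           # "ok" goes to stdout
run_route sh -c 'echo bad; exit 1' || true  # "bad" goes to stderr
```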
2915
2916 EXAMPLES FROM go-pssh
2917
2918 1$ go-pssh -l <ip>,<ip> -u <user> -p <port> -P <passwd> -c "<command>"
2919
2920 1$ parallel -S 'sshpass -p <passwd> ssh -p <port> <user>@<ip>' \
2921 --nonall "<command>"
2922
2923 2$ go-pssh scp -f host.txt -u <user> -p <port> -P <password> \
2924 -s /local/file_or_directory -d /remote/directory
2925
2926 2$ parallel --nonall --slf host.txt \
2927         --basefile /local/file_or_directory/./ --wd /remote/directory \
2928 --ssh 'sshpass -p <password> ssh -p <port> -l <user>' true
2929
2930 3$ go-pssh scp -l <ip>,<ip> -u <user> -p <port> -P <password> \
2931 -s /local/file_or_directory -d /remote/directory
2932
2933 3$ parallel --nonall -S <ip>,<ip> \
2934         --basefile /local/file_or_directory/./ --wd /remote/directory \
2935 --ssh 'sshpass -p <password> ssh -p <port> -l <user>' true
2936
2937 https://github.com/xuchenCN/go-pssh (Last checked: 2021-07)
2938
2939 DIFFERENCES BETWEEN go-parallel AND GNU Parallel
2940 Summary (see legend above):
2941
2942 I1 I2 - - - - I7
2943 - - M3 - - M6
2944 - O2 O3 - O5 - - x x - O10
2945 E1 - - E4 - - -
2946 - - - - - - - - -
2947 - -
2948
2949       go-parallel uses Go templates for replacement strings. These are quite
2950       similar to GNU parallel's {= perl expr =} replacement string.
2951
2952 EXAMPLES FROM go-parallel
2953
2954 1$ go-parallel -a ./files.txt -t 'cp {{.Input}} {{.Input | dirname | dirname}}'
2955
2956 1$ parallel -a ./files.txt cp {} '{= $_=::dirname(::dirname($_)) =}'
2957
2958 2$ go-parallel -a ./files.txt -t 'mkdir -p {{.Input}} {{noExt .Input}}'
2959
2960 2$ parallel -a ./files.txt echo mkdir -p {} {.}
2961
2962 3$ go-parallel -a ./files.txt -t 'mkdir -p {{.Input}} {{.Input | basename | noExt}}'
2963
2964 3$ parallel -a ./files.txt echo mkdir -p {} {/.}
2965
2966 https://github.com/mylanconnolly/parallel (Last checked: 2021-07)
2967
2968 DIFFERENCES BETWEEN p AND GNU Parallel
2969 Summary (see legend above):
2970
2971 - - - I4 - - N/A
2972 - - - - - M6
2973 - O2 O3 - O5 O6 - x x - O10
2974 E1 - - - - - -
2975 - - - - - - - - -
2976 - -
2977
2978 p is a tiny shell script. It can color output with some predefined
2979 colors, but is otherwise quite limited.
2980
2981 It maxes out at around 116000 jobs (probably due to limitations in
2982 Bash).
2983
2984 EXAMPLES FROM p
2985
2986       Some of the examples from p cannot be implemented 100% by GNU parallel:
2987       the coloring is a bit different, and GNU parallel cannot apply --tag to
2988       some inputs and not to others.
2991
2992 1$ p -bc blue "ping 127.0.0.1" -uc red "ping 192.168.0.1" \
2993 -rc yellow "ping 192.168.1.1" -t example "ping example.com"
2994
2995 1$ parallel --lb -j0 --color --tag ping \
2996 ::: 127.0.0.1 192.168.0.1 192.168.1.1 example.com
2997
2998 2$ p "tail -f /var/log/httpd/access_log" \
2999 -bc red "tail -f /var/log/httpd/error_log"
3000
3001 2$ cd /var/log/httpd;
3002 parallel --lb --color --tag tail -f ::: access_log error_log
3003
3004 3$ p tail -f "some file" \& p tail -f "other file with space.txt"
3005
3006 3$ parallel --lb tail -f ::: 'some file' "other file with space.txt"
3007
3008 4$ p -t project1 "hg pull project1" -t project2 \
3009 "hg pull project2" -t project3 "hg pull project3"
3010
3011 4$ parallel --lb hg pull ::: project{1..3}
3012
3013 https://github.com/rudymatela/evenmoreutils/blob/master/man/p.1.adoc
3014 (Last checked: 2022-04)
3015
3016   DIFFERENCES BETWEEN seneschal AND GNU Parallel
3017 Summary (see legend above):
3018
3019 I1 - - - - - -
3020 M1 - M3 - - M6
3021 O1 - O3 O4 - - - x x -
3022 E1 - - - - - -
3023 - - - - - - - - -
3024 - -
3025
3026 seneschal only starts the first job after reading the last job, and
3027 output from the first job is only printed after the last job finishes.
3028
3029       Each byte of output requires 3.5 bytes of RAM.
3030
3031 This makes it impossible to have a total output bigger than the virtual
3032 memory.
3033
3034       Even though output is kept in RAM, outputting it is quite slow: 30 MB/s.
3035
3036 Output larger than 4 GB causes random problems - it looks like a race
3037 condition.
3038
3039 This:
3040
3041 echo 1 | seneschal --prefix='yes `seq 1000`|head -c 1G' >/dev/null
3042
3043 takes 4100(!) CPU seconds to run on a 64C64T server, but only 140 CPU
3044 seconds on a 4C8T laptop. So it looks like seneschal wastes a lot of
3045 CPU time coordinating the CPUs.
3046
3047 Compare this to:
3048
3049 echo 1 | time -v parallel -N0 'yes `seq 1000`|head -c 1G' >/dev/null
3050
3051 which takes 3-8 CPU seconds.
3052
3053 EXAMPLES FROM seneschal README.md
3054
3055 1$ echo $REPOS | seneschal --prefix="cd {} && git pull"
3056
3057 # If $REPOS is newline separated
3058 1$ echo "$REPOS" | parallel -k "cd {} && git pull"
3059 # If $REPOS is space separated
3060 1$ echo -n "$REPOS" | parallel -d' ' -k "cd {} && git pull"
3061
3062 COMMANDS="pwd
3063 sleep 5 && echo boom
3064 echo Howdy
3065 whoami"
3066
3067 2$ echo "$COMMANDS" | seneschal --debug
3068
3069 2$ echo "$COMMANDS" | parallel -k -v
3070
3071 3$ ls -1 | seneschal --prefix="pushd {}; git pull; popd;"
3072
3073 3$ ls -1 | parallel -k "pushd {}; git pull; popd;"
3074 # Or if current dir also contains files:
3075 3$ parallel -k "pushd {}; git pull; popd;" ::: */
3076
3077 https://github.com/TheWizardTower/seneschal (Last checked: 2022-06)
3078
3079 DIFFERENCES BETWEEN async AND GNU Parallel
3080 Summary (see legend above):
3081
3082 x x x x x x x
3083 - x x x x x
3084 x O2 O3 O4 O5 O6 - x x O10
3085 E1 - - E4 - - -
3086 - - - - - - - - -
3087 S1 S2
3088
3089 async works like sem.
3090
3091 EXAMPLES FROM async
3092
3093 1$ S="/tmp/example_socket"
3094
3095 async -s="$S" server --start
3096
3097 for i in {1..20}; do
3098 # prints command output to stdout
3099 async -s="$S" cmd -- bash -c "sleep 1 && echo test $i"
3100 done
3101
3102 # wait until all commands are finished
3103 async -s="$S" wait
3104
3105 1$ S="example_id"
3106
3107 # server not needed
3108
3109 for i in {1..20}; do
3110 # prints command output to stdout
3111 sem --bg --id "$S" -j100% "sleep 1 && echo test $i"
3112 done
3113
3114 # wait until all commands are finished
3115 sem --fg --id "$S" --wait
3116
3117 2$ # configure the server to run four commands in parallel
3118 async -s="$S" server -j4
3119
3120 mkdir "/tmp/ex_dir"
3121 for i in {21..40}; do
3122 # redirects command output to /tmp/ex_dir/file*
3123 async -s="$S" cmd -o "/tmp/ex_dir/file$i" -- \
3124 bash -c "sleep 1 && echo test $i"
3125 done
3126
3127 async -s="$S" wait
3128
3129 # stops server
3130 async -s="$S" server --stop
3131
3132 2$ # starting server not needed
3133
3134 mkdir "/tmp/ex_dir"
3135 for i in {21..40}; do
3136 # redirects command output to /tmp/ex_dir/file*
3137 sem --bg --id "$S" --results "/tmp/ex_dir/file$i{}" \
3138 "sleep 1 && echo test $i"
3139 done
3140
3141 sem --fg --id "$S" --wait
3142
3143 # there is no server to stop
3144
3145 https://github.com/ctbur/async (Last checked: 2023-01)
3146
3147 DIFFERENCES BETWEEN tandem AND GNU Parallel
3148 Summary (see legend above):
3149
3150 - - - I4 - - N/A
3151 M1 - - - - M6
3152 - - O3 - - - - N/A - -
3153 E1 - E3 - E5 - -
3154 - - - - - - - - -
3155 - -
3156
3157       tandem runs full commands in parallel. It is made for starting a
3158       "server", running a job against the server, and killing the server
3159       when the job is done.
3160
3161 More generally: it kills all jobs when the first job completes -
3162 similar to '--halt now,done=1'.
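
       The kill-on-first-completion behaviour can be sketched in plain bash.
       This illustrates the principle only, not tandem's implementation, and
       needs bash >= 4.3 for wait -n:

```shell
#!/bin/bash
# Kill all remaining jobs as soon as the first one finishes -
# the behaviour of tandem and of GNU parallel's --halt now,done=1.
( sleep 2; echo slow ) &
( sleep 0.1; echo fast ) &
wait -n                        # returns when the first job exits
kill $(jobs -p) 2>/dev/null    # terminate the rest
wait 2>/dev/null || true
```

       This prints only "fast": the slow job is killed before it can produce
       output.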
3163
3164 tandem silently discards some output. It is unclear exactly when this
3165 happens. It looks like a race condition, because it varies for each
3166 run.
3167
3168 $ tandem "seq 10000" | wc -l
3169 6731 <- This should always be 10002
3170
3171 EXAMPLES FROM Demo
3172
3173 tandem \
3174 'php -S localhost:8000' \
3175 'esbuild src/*.ts --bundle --outdir=dist --watch' \
3176 'tailwind -i src/index.css -o dist/index.css --watch'
3177
3178 # Emulate tandem's behaviour
3179 PARALLEL='--color --lb --halt now,done=1 --tagstring '
3180 PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
3181 export PARALLEL
3182
3183 parallel ::: \
3184 'php -S localhost:8000' \
3185 'esbuild src/*.ts --bundle --outdir=dist --watch' \
3186 'tailwind -i src/index.css -o dist/index.css --watch'
3187
3188 EXAMPLES FROM tandem -h
3189
3190 # Emulate tandem's behaviour
3191 PARALLEL='--color --lb --halt now,done=1 --tagstring '
3192 PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
3193 export PARALLEL
3194
3195 1$ tandem 'sleep 5 && echo "hello"' 'sleep 2 && echo "world"'
3196
3197 1$ parallel ::: 'sleep 5 && echo "hello"' 'sleep 2 && echo "world"'
3198
3199       # '-t 0' fails, but '--timeout 0' works
3200 2$ tandem --timeout 0 'sleep 5 && echo "hello"' \
3201 'sleep 2 && echo "world"'
3202
3203 2$ parallel --timeout 0 ::: 'sleep 5 && echo "hello"' \
3204 'sleep 2 && echo "world"'
3205
3206 EXAMPLES FROM tandem's readme.md
3207
3208 # Emulate tandem's behaviour
3209 PARALLEL='--color --lb --halt now,done=1 --tagstring '
3210 PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
3211 export PARALLEL
3212
3213 1$ tandem 'next dev' 'nodemon --quiet ./server.js'
3214
3215 1$ parallel ::: 'next dev' 'nodemon --quiet ./server.js'
3216
3217 2$ cat package.json
3218 {
3219 "scripts": {
3220 "dev:php": "...",
3221 "dev:js": "...",
3222 "dev:css": "..."
3223 }
3224 }
3225
3226 tandem 'npm:dev:php' 'npm:dev:js' 'npm:dev:css'
3227
3228 # GNU Parallel uses bash functions instead
3229 2$ cat package.sh
3230 dev:php() { ... ; }
3231 dev:js() { ... ; }
3232 dev:css() { ... ; }
3233 export -f dev:php dev:js dev:css
3234
3235 . package.sh
3236 parallel ::: dev:php dev:js dev:css
3237
3238 3$ tandem 'npm:dev:*'
3239
3240 3$ compgen -A function | grep ^dev: | parallel
3241
3242 For usage in Makefiles, include a copy of GNU Parallel with your source
3243 using `parallel --embed`. This has the added benefit of also working if
3244 access to the internet is down or restricted.
3245
3246 https://github.com/rosszurowski/tandem (Last checked: 2023-01)
3247
3248 DIFFERENCES BETWEEN rust-parallel(aaronriekenberg) AND GNU Parallel
3249 Summary (see legend above):
3250
3251 I1 I2 I3 - - - -
3252 - - - - - M6
3253 O1 O2 O3 - O5 O6 - N/A - O10
3254 E1 - - E4 - - -
3255 - - - - - - - - -
3256 - -
3257
3258       rust-parallel has a goal of only using Rust. It seems impossible to
3259       call bash functions from the command line; you would need to put
3260       these in a script.
3261
3262       Calling a script that lacks the shebang line (#! as first line) fails.
3263
3264 EXAMPLES FROM rust-parallel's README.md
3265
3266 $ cat >./test <<EOL
3267 echo hi
3268 echo there
3269 echo how
3270 echo are
3271 echo you
3272 EOL
3273
3274 1$ cat test | rust-parallel -j5
3275
3276 1$ cat test | parallel -j5
3277
3278 2$ cat test | rust-parallel -j1
3279
3280 2$ cat test | parallel -j1
3281
3282 3$ head -100 /usr/share/dict/words | rust-parallel md5 -s
3283
3284 3$ head -100 /usr/share/dict/words | parallel md5 -s
3285
3286 4$ find . -type f -print0 | rust-parallel -0 gzip -f -k
3287
3288 4$ find . -type f -print0 | parallel -0 gzip -f -k
3289
3290 5$ head -100 /usr/share/dict/words |
3291 awk '{printf "md5 -s %s\n", $1}' | rust-parallel
3292
3293 5$ head -100 /usr/share/dict/words |
3294 awk '{printf "md5 -s %s\n", $1}' | parallel
3295
3296 6$ head -100 /usr/share/dict/words | rust-parallel md5 -s |
3297 grep -i abba
3298
3299 6$ head -100 /usr/share/dict/words | parallel md5 -s |
3300 grep -i abba
3301
3302 https://github.com/aaronriekenberg/rust-parallel (Last checked:
3303 2023-01)
3304
3305 DIFFERENCES BETWEEN parallelium AND GNU Parallel
3306 Summary (see legend above):
3307
3308 - I2 - - - - -
3309 M1 - - - - M6
3310 O1 - O3 - - - - N/A - -
3311 E1 - - E4 - - -
3312 - - - - - - - - -
3313 - -
3314
3315       parallelium merges standard output (stdout) and standard error
3316       (stderr). The maximum output of a command is 8192 bytes; bigger
3317       output sends parallelium into an infinite loop.
3318
3319       In the input file for parallelium you can define tags, so you can run
3320       only the commands with a given tag, a bit like a target in a Makefile.
3321
3322 Progress is printed on standard output (stdout) prepended with '#' with
3323 similar information as GNU parallel's --bar.
3324
3325 EXAMPLES
3326
3327 $ cat testjobs.txt
3328 #tag common sleeps classA
3329 (sleep 4.495;echo "job 000")
3330 :
3331 (sleep 2.587;echo "job 016")
3332
3333 #tag common sleeps classB
3334 (sleep 0.218;echo "job 017")
3335 :
3336 (sleep 2.269;echo "job 040")
3337
3338 #tag common sleeps classC
3339 (sleep 2.586;echo "job 041")
3340 :
3341 (sleep 1.626;echo "job 099")
3342
3343 #tag lasthalf, sleeps, classB
3344 (sleep 1.540;echo "job 100")
3345 :
3346 (sleep 2.001;echo "job 199")
3347
3348 1$ parallelium -f testjobs.txt -l logdir -t classB,classC
3349
3350 1$ cat testjobs.txt |
3351 parallel --plus --results logdir/testjobs.txt_{0#}.output \
3352 '{= if(/^#tag /) { @tag = split/,|\s+/ }
3353 (grep /^(classB|classC)$/, @tag) or skip =}'
3354
3355 https://github.com/beomagi/parallelium (Last checked: 2023-01)
3356
3357 DIFFERENCES BETWEEN forkrun AND GNU Parallel
3358 Summary (see legend above):
3359
3360 I1 - - - - - I7
3361 - - - - - -
3362 - O2 O3 - O5 - - - - O10
3363 E1 - - E4 - - -
3364 - - - - - - - - -
3365 - -
3366
3367 forkrun blocks if it receives fewer jobs than slots:
3368
3369 echo | forkrun -p 2 echo
3370
3371 or when it gets some specific commands e.g.:
3372
3373 f() { seq "$@" | pv -qL 3; }
3374 seq 10 | forkrun f
3375
3376 It is not clear why.
3377
3378 It is faster than GNU parallel (overhead: 1.2 ms/job vs 3 ms/job), but
3379 way slower than parallel-bash (0.059 ms/job).
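
       Per-job overhead figures like these can be estimated by timing a large
       batch of no-op jobs and dividing by the job count. A sketch using
       xargs as a neutral baseline; substitute the tool under test, and
       expect the absolute numbers to vary by machine:

```shell
#!/bin/bash
# Estimate per-job overhead: time 1000 no-op jobs.
# Swap the 'xargs -n1 -P8 true' stage for the tool being measured,
# e.g. 'parallel -j8 true' or 'forkrun true'.
start=$(date +%s%N)
seq 1000 | xargs -n1 -P8 true
end=$(date +%s%N)
printf 'overhead: %d ns/job\n' $(( (end - start) / 1000 ))
```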
3380
3381 Running jobs cannot be stopped by pressing CTRL-C.
3382
3383       -k is supposed to keep the order, but fails on the MIX test below.
3384       With -k, output is cached in RAM.
3385
3386       If forkrun is killed, it leaves temporary files in /tmp/.forkrun.* that
3387       have to be cleaned up manually.
3388
3389 EXAMPLES
3390
3391 1$ time find ./ -type f |
3392 forkrun -l512 -- sha256sum 2>/dev/null | wc -l
3393 1$ time find ./ -type f |
3394 parallel -j28 -m -- sha256sum 2>/dev/null | wc -l
3395
3396 2$ time find ./ -type f |
3397 forkrun -l512 -k -- sha256sum 2>/dev/null | wc -l
3398 2$ time find ./ -type f |
3399 parallel -j28 -k -m -- sha256sum 2>/dev/null | wc -l
3400
3401 https://github.com/jkool702/forkrun (Last checked: 2023-02)
3402
3403 DIFFERENCES BETWEEN parallel-sh AND GNU Parallel
3404 Summary (see legend above):
3405
3406 I1 I2 - I4 - - -
3407 M1 - - - - M6
3408 O1 O2 O3 - O5 O6 - - - O10
3409 E1 - - E4 - - -
3410 - - - - - - - - -
3411 - -
3412
3413       parallel-sh buffers in RAM. Buffering the data takes O(n^1.5) time:
3414
3415 2MB=0.107s 4MB=0.175s 8MB=0.342s 16MB=0.766s 32MB=2.2s 64MB=6.7s
3416 128MB=20s 256MB=64s 512MB=248s 1024MB=998s 2048MB=3756s
3417
3418       This limits practical usability to jobs outputting < 256 MB. GNU
3419       parallel buffers on disk, yet is faster for jobs with outputs > 16 MB
3420       and is only limited by the free space in $TMPDIR.
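
       Since GNU parallel's disk buffering lives in $TMPDIR, it can be worth
       checking the available space there before starting a job with large
       output. A simple sanity check, not a GNU parallel feature:

```shell
#!/bin/bash
# GNU parallel buffers job output in $TMPDIR (default /tmp).
# Show how much space is available there before a big run.
df -h "${TMPDIR:-/tmp}"
```

       If the filesystem is too small, point TMPDIR somewhere bigger, e.g.
       TMPDIR=/var/tmp parallel ...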
3421
3422       parallel-sh can kill running jobs if a job fails (similar to --halt
3423       now,fail=1).
3424
3425 EXAMPLES
3426
3427 1$ parallel-sh "sleep 2 && echo first" "sleep 1 && echo second"
3428
3429 1$ parallel ::: "sleep 2 && echo first" "sleep 1 && echo second"
3430
3431 2$ cat /tmp/commands
3432 sleep 2 && echo first
3433 sleep 1 && echo second
3434
3435 2$ parallel-sh -f /tmp/commands
3436
3437 2$ parallel -a /tmp/commands
3438
3439 3$ echo -e 'sleep 2 && echo first\nsleep 1 && echo second' |
3440 parallel-sh
3441
3442 3$ echo -e 'sleep 2 && echo first\nsleep 1 && echo second' |
3443 parallel
3444
3445 https://github.com/thyrc/parallel-sh (Last checked: 2023-04)
3446
3447 DIFFERENCES BETWEEN bash-parallel AND GNU Parallel
3448 Summary (see legend above):
3449
3450 - I2 - - - - I7
3451 M1 - M3 - M5 M6
3452 - O2 O3 - - O6 - O8 - O10
3453 E1 - - - - - -
3454 - - - - - - - - -
3455 - -
3456
3457 bash-parallel is not as much a command as it is a shell script that you
3458 have to alter. It requires you to change the shell function process_job
3459 that runs the job, and set $MAX_POOL_SIZE to the number of jobs to run
3460 in parallel.
3461
3462 It is half as fast as GNU parallel for short jobs.
3463
3464 https://github.com/thilinaba/bash-parallel (Last checked: 2023-05)
3465
3466 DIFFERENCES BETWEEN PaSH AND GNU Parallel
3467 Summary (see legend above): N/A
3468
3469       pash is quite different from GNU parallel. It is not a general
3470       parallelizer. It takes a shell script, analyses it, and parallelizes
3471       parts of it by replacing those parts with commands that give the
3472       same result.
3473
3474 This will replace sort with a command that does pretty much the same as
3475 parsort --parallel=8 (except somewhat slower):
3476
3477 pa.sh --width 8 -c 'cat bigfile | sort'
3478
3479 However, even a simple change will confuse pash and you will get no
3480 parallelization:
3481
3482 pa.sh --width 8 -c 'mysort() { sort; }; cat bigfile | mysort'
3483 pa.sh --width 8 -c 'cat bigfile | sort | md5sum'
3484
3485 From the source it seems pash only looks at: awk cat col comm cut diff
3486 grep head mkfifo mv rm sed seq sort tail tee tr uniq wc xargs
3487
3488 For pipelines where these commands are bottlenecks, it might be worth
3489 testing if pash is faster than GNU parallel.
3490
3491       pash does not respect $TMPDIR but always uses /tmp. If pash dies
3492       unexpectedly it does not clean up.
3493
3494 https://github.com/binpash/pash (Last checked: 2023-05)
3495
3496 DIFFERENCES BETWEEN korovkin-parallel AND GNU Parallel
3497 Summary (see legend above):
3498
3499 I1 - - - - - -
3500 M1 - - - - M6
3501 - - O3 - - - - N/A N/A -
3502 E1 - - - - - -
3503 R1 - - - - R6 N/A N/A -
3504 - -
3505
3506       korovkin-parallel prepends each output line with job and timing info.
3507
3508       The output is colored with 6 color combinations, so jobs 1 and 7 will
3509       get the same color.
3510
3511 You can get similar output with:
3512
3513 (echo ...) |
3514 parallel --color -j 10 --lb --tagstring \
3515 '[l:{#}:{=$_=sprintf("%7.03f",::now()-$^T)=} {=$_=hh_mm_ss($^T)=} {%}]'
3516
3517 Lines longer than 8192 chars are broken into lines shorter than 8192.
3518 korovkin-parallel loses the last char for lines exactly 8193 chars
3519 long.
3520
3521 Short lines from different jobs do not mix, but long lines do:
3522
3523 fun() {
3524 perl -e '$a="'$1'"x1000000; for(1..'$2') { print $a };';
3525 echo;
3526 }
3527 export -f fun
3528 (echo fun a 100;echo fun b 100) | korovkin-parallel | tr -s abcdef
3529 # Compare to:
3530 (echo fun a 100;echo fun b 100) | parallel | tr -s abcdef
3531
3532 There should be only one line of a's and one line of b's.
3533
3534 Just like GNU parallel korovkin-parallel offers a master/slave model,
3535 so workers on other servers can do some of the tasks. But contrary to
3536 GNU parallel you must manually start workers on these servers. The
3537 communication is neither authenticated nor encrypted.
3538
3539       It caches output in RAM: a 1 GB line uses ~2.5 GB RAM.
3540
3541 https://github.com/korovkin/parallel (Last checked: 2023-07)
3542
3543 Todo
3544 https://www.npmjs.com/package/concurrently
3545
3546 http://code.google.com/p/push/ (cannot compile)
3547
3548 https://github.com/krashanoff/parallel
3549
3550 https://github.com/Nukesor/pueue
3551
3552 https://arxiv.org/pdf/2012.15443.pdf KumQuat
3553
3554 https://github.com/JeiKeiLim/simple_distribute_job
3555
3556 https://github.com/reggi/pkgrun - not obvious how to use
3557
3558 https://github.com/benoror/better-npm-run - not obvious how to use
3559
3560 https://github.com/bahmutov/with-package
3561
3562 https://github.com/flesler/parallel
3563
3564 https://github.com/Julian/Verge
3565
3566 https://manpages.ubuntu.com/manpages/xenial/man1/tsp.1.html
3567
3568 https://vicerveza.homeunix.net/~viric/soft/ts/
3569
3570 https://github.com/chapmanjacobd/que
3571
3572 TESTING OTHER TOOLS
3573       There are certain issues that are very common in parallelizing tools.
3574       Here are a few stress tests. Be warned: if the tool is badly coded it
3575       may overload your machine.
3576
3577 MIX: Output mixes
3578 Output from 2 jobs should not mix. If the output is not used, this does
3579 not matter; but if the output is used then it is important that you do
3580 not get half a line from one job followed by half a line from another
3581 job.
3582
3583 If the tool does not buffer, output will most likely mix now and then.
3584
3585 This test stresses whether output mixes.
3586
3587 #!/bin/bash
3588
3589 paralleltool="parallel -j 30"
3590
3591 cat <<-EOF > mycommand
3592 #!/bin/bash
3593
3594 # If a, b, c, d, e, and f mix: Very bad
3595 perl -e 'print STDOUT "a"x3000_000," "'
3596 perl -e 'print STDERR "b"x3000_000," "'
3597 perl -e 'print STDOUT "c"x3000_000," "'
3598 perl -e 'print STDERR "d"x3000_000," "'
3599 perl -e 'print STDOUT "e"x3000_000," "'
3600 perl -e 'print STDERR "f"x3000_000," "'
3601 echo
3602 echo >&2
3603 EOF
3604 chmod +x mycommand
3605
3606 # Run 30 jobs in parallel
3607 seq 30 |
3608 $paralleltool ./mycommand > >(tr -s abcdef) 2> >(tr -s abcdef >&2)
3609
3610 # 'a c e' and 'b d f' should always stay together
3611 # and there should only be a single line per job
3612
3613 STDERRMERGE: Stderr is merged with stdout
3614 Output from stdout and stderr should not be merged, but kept separated.
3615
3616 This test shows whether stdout is mixed with stderr.
3617
3618 #!/bin/bash
3619
3620 paralleltool="parallel -j0"
3621
3622 cat <<-EOF > mycommand
3623 #!/bin/bash
3624
3625 echo stdout
3626 echo stderr >&2
3627 echo stdout
3628 echo stderr >&2
3629 EOF
3630 chmod +x mycommand
3631
3632 # Run one job
3633 echo |
3634 $paralleltool ./mycommand > stdout 2> stderr
3635 cat stdout
3636 cat stderr
3637
3638 RAM: Output limited by RAM
3639       Some tools cache output in RAM. This makes them extremely slow if the
3640       output is bigger than physical memory, and makes them crash if the
3641       output is bigger than the virtual memory.
3642
3643 #!/bin/bash
3644
3645 paralleltool="parallel -j0"
3646
3647 cat <<'EOF' > mycommand
3648 #!/bin/bash
3649
3650 # Generate 1 GB output
3651 yes "`perl -e 'print \"c\"x30_000'`" | head -c 1G
3652 EOF
3653 chmod +x mycommand
3654
3655 # Run 20 jobs in parallel
3656 # Adjust 20 to be > physical RAM and < free space on /tmp
3657 seq 20 | time $paralleltool ./mycommand | wc -c
3658
3659 DISKFULL: Incomplete data if /tmp runs full
3660       If caching is done on disk, the disk can run full during the run. Not
3661       all programs discover this. GNU parallel discovers it if the disk
3662       stays full for at least 2 seconds.
3663
3664 #!/bin/bash
3665
3666 paralleltool="parallel -j0"
3667
3668 # This should be a dir with less than 100 GB free space
3669 smalldisk=/tmp/shm/parallel
3670
3671 TMPDIR="$smalldisk"
3672 export TMPDIR
3673
3674 max_output() {
3675 # Force worst case scenario:
3676 # Make GNU Parallel only check once per second
3677 sleep 10
3678 # Generate 100 GB to fill $TMPDIR
3679 # Adjust if /tmp is bigger than 100 GB
3680 yes | head -c 100G >$TMPDIR/$$
3681 # Generate 10 MB output that will not be buffered
3682 # due to full disk
3683 perl -e 'print "X"x10_000_000' | head -c 10M
3684 echo This part is missing from incomplete output
3685 sleep 2
3686 rm $TMPDIR/$$
3687 echo Final output
3688 }
3689
3690 export -f max_output
3691 seq 10 | $paralleltool max_output | tr -s X
3692
3693 CLEANUP: Leaving tmp files at unexpected death
3694       Some tools do not clean up tmp files if they are killed. This is
3695       especially a risk for tools that buffer on disk.
3696
3697 #!/bin/bash
3698
3699 paralleltool=parallel
3700
3701 ls /tmp >/tmp/before
3702 seq 10 | $paralleltool sleep &
3703 pid=$!
3704 # Give the tool time to start up
3705 sleep 1
3706 # Kill it without giving it a chance to cleanup
3707       kill -9 $pid
3708 # Should be empty: No files should be left behind
3709 diff <(ls /tmp) /tmp/before
3710
3711   SPCCHAR: Dealing badly with special file names
3712 It is not uncommon for users to create files like:
3713
3714 My brother's 12" *** record (costs $$$).jpg
3715
3716 Some tools break on this.
3717
3718 #!/bin/bash
3719
3720 paralleltool=parallel
3721
3722 touch "My brother's 12\" *** record (costs \$\$\$).jpg"
3723 ls My*jpg | $paralleltool ls -l
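
       As a point of comparison, a NUL-delimited pipeline using only find and
       xargs handles such names correctly; any parallelizer should do at
       least as well. A minimal sketch:

```shell
#!/bin/bash
# NUL-delimited file names survive spaces, quotes, globs and $ intact.
dir=$(mktemp -d)
touch "$dir/My brother's 12\" *** record (costs \$\$\$).jpg"
find "$dir" -type f -print0 | xargs -0 ls -l
rm -r "$dir"
```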
3724
3725 COMPOSED: Composed commands do not work
3726 Some tools require you to wrap composed commands into bash -c.
3727
3728 echo bar | $paralleltool echo foo';' echo {}
3729
3730 ONEREP: Only one replacement string allowed
3731 Some tools can only insert the argument once.
3732
3733 echo bar | $paralleltool echo {} foo {}
3734
3735 INPUTSIZE: Length of input should not be limited
3736       Some tools artificially limit the length of input lines for no good
3737       reason. GNU parallel does not:
3738
3739 perl -e 'print "foo."."x"x100_000_000' | parallel echo {.}
3740
3741       GNU parallel limits the command to run to 128 KB due to execve(2):
3742
3743 perl -e 'print "x"x131_000' | parallel echo {} | wc
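
       The 128 KB ceiling stems from the kernel's limit on the arguments
       passed to execve(2); the exact limit on a given system can be
       inspected with getconf:

```shell
#!/bin/bash
# Maximum combined size (bytes) of argv and environment accepted by
# execve(2) on this system; GNU parallel keeps each generated command
# line well below this limit.
getconf ARG_MAX
```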
3744
3745 NUMWORDS: Speed depends on number of words
3746 Some tools become very slow if output lines have many words.
3747
3748 #!/bin/bash
3749
3750 paralleltool=parallel
3751
3752 cat <<-EOF > mycommand
3753 #!/bin/bash
3754
3755 # 10 MB of lines with 1000 words
3756 yes "`seq 1000`" | head -c 10M
3757 EOF
3758 chmod +x mycommand
3759
3760 # Run 30 jobs in parallel
3761 seq 30 | time $paralleltool -j0 ./mycommand > /dev/null
3762
3763 4GB: Output with a line > 4GB should be OK
3764 #!/bin/bash
3765
3766 paralleltool="parallel -j0"
3767
3768 cat <<-EOF > mycommand
3769 #!/bin/bash
3770
3771 perl -e '\$a="a"x1000_000; for(1..5000) { print \$a }'
3772 EOF
3773 chmod +x mycommand
3774
3775 # Run 1 job
3776 seq 1 | $paralleltool ./mycommand | LC_ALL=C wc
3777
3778 AUTHOR
3779       When using GNU parallel for a publication please cite:
3780
3781 O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
3782 The USENIX Magazine, February 2011:42-47.
3783
3784       This helps fund further development; and it won't cost you a cent.
3785 If you pay 10000 EUR you should feel free to use GNU Parallel without
3786 citing.
3787
3788 Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk
3789
3790 Copyright (C) 2008-2010 Ole Tange, http://ole.tange.dk
3791
3792 Copyright (C) 2010-2023 Ole Tange, http://ole.tange.dk and Free
3793 Software Foundation, Inc.
3794
3795       Parts of the manual concerning xargs compatibility are inspired by the
3796       manual of xargs from GNU findutils 4.4.2.
3797
3798 LICENSE
3799       This program is free software; you can redistribute it and/or modify it
3800 under the terms of the GNU General Public License as published by the
3801 Free Software Foundation; either version 3 of the License, or at your
3802 option any later version.
3803
3804 This program is distributed in the hope that it will be useful, but
3805 WITHOUT ANY WARRANTY; without even the implied warranty of
3806 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
3807 General Public License for more details.
3808
3809 You should have received a copy of the GNU General Public License along
3810 with this program. If not, see <https://www.gnu.org/licenses/>.
3811
3812 Documentation license I
3813 Permission is granted to copy, distribute and/or modify this
3814 documentation under the terms of the GNU Free Documentation License,
3815 Version 1.3 or any later version published by the Free Software
3816 Foundation; with no Invariant Sections, with no Front-Cover Texts, and
3817 with no Back-Cover Texts. A copy of the license is included in the
3818 file LICENSES/GFDL-1.3-or-later.txt.
3819
3820 Documentation license II
3821 You are free:
3822
3823 to Share to copy, distribute and transmit the work
3824
3825 to Remix to adapt the work
3826
3827 Under the following conditions:
3828
3829 Attribution
3830 You must attribute the work in the manner specified by the
3831 author or licensor (but not in any way that suggests that they
3832 endorse you or your use of the work).
3833
3834 Share Alike
3835 If you alter, transform, or build upon this work, you may
3836 distribute the resulting work only under the same, similar or
3837 a compatible license.
3838
3839 With the understanding that:
3840
3841 Waiver Any of the above conditions can be waived if you get
3842 permission from the copyright holder.
3843
3844 Public Domain
3845 Where the work or any of its elements is in the public domain
3846 under applicable law, that status is in no way affected by the
3847 license.
3848
3849 Other Rights
3850 In no way are any of the following rights affected by the
3851 license:
3852
3853 • Your fair dealing or fair use rights, or other applicable
3854 copyright exceptions and limitations;
3855
3856 • The author's moral rights;
3857
3858 • Rights other persons may have either in the work itself or
3859 in how the work is used, such as publicity or privacy
3860 rights.
3861
3862 Notice For any reuse or distribution, you must make clear to others
3863 the license terms of this work.
3864
3865 A copy of the full license is included in the file as
3866 LICENCES/CC-BY-SA-4.0.txt
3867
3868 DEPENDENCIES
3869       GNU parallel uses Perl, and the Perl modules Getopt::Long, IPC::Open3,
3870 Symbol, IO::File, POSIX, and File::Temp. For remote usage it also uses
3871 rsync with ssh.
3872
3873 SEE ALSO
3874       find(1), xargs(1), make(1), pexec(1), ppss(1), xjobs(1), prll(1),
3875 dxargs(1), mdm(1)
3876
3877
3878
3879 20230722                          2023-07-28         PARALLEL_ALTERNATIVES(7)