PARALLEL(1)                        parallel                        PARALLEL(1)




NAME

       parallel - build and execute shell command lines from standard input in
       parallel


SYNOPSIS

       parallel [options] [command [arguments]] < list_of_arguments

       parallel [options] [command [arguments]] ( ::: arguments | :::+
       arguments | :::: argfile(s) | ::::+ argfile(s) ) ...

       parallel --semaphore [options] command

       #!/usr/bin/parallel --shebang [options] [command [arguments]]

       #!/usr/bin/parallel --shebang-wrap [options] [command [arguments]]


DESCRIPTION

       STOP!

       Read the Reader's guide below if you are new to GNU parallel.

       GNU parallel is a shell tool for executing jobs in parallel using one
       or more computers. A job can be a single command or a small script that
       has to be run for each of the lines in the input. The typical input is
       a list of files, a list of hosts, a list of users, a list of URLs, or a
       list of tables. A job can also be a command that reads from a pipe. GNU
       parallel can then split the input into blocks and pipe a block into
       each command in parallel.

       If you use xargs and tee today you will find GNU parallel very easy to
       use as GNU parallel is written to have the same options as xargs. If
       you write loops in shell, you will find GNU parallel may be able to
       replace most of the loops and make them run faster by running several
       jobs in parallel.

       GNU parallel makes sure output from the commands is the same output as
       you would get had you run the commands sequentially. This makes it
       possible to use output from GNU parallel as input for other programs.

       For each line of input GNU parallel will execute command with the line
       as arguments. If no command is given, the line of input is executed.
       Several lines will be run in parallel. GNU parallel can often be used
       as a substitute for xargs or cat | bash.

   Reader's guide
       GNU parallel includes the 4 types of documentation: Tutorial, how-to,
       reference and explanation.

       Tutorial

       If you prefer reading a book buy GNU Parallel 2018 at
       http://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html
       or download it at: https://doi.org/10.5281/zenodo.1146014. Read at
       least chapter 1+2. It should take you less than 20 minutes.

       Otherwise start by watching the intro videos for a quick introduction:
       http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

       If you want to dive deeper: spend a couple of hours walking through the
       tutorial (man parallel_tutorial). Your command line will love you for
       it.

       How-to

       You can find a lot of EXAMPLEs of use after the list of OPTIONS in man
       parallel (Use LESS=+/EXAMPLE: man parallel). That will give you an idea
       of what GNU parallel is capable of, and you may find a solution you can
       simply adapt to your situation.

       Reference

       If you need a one page printable cheat sheet you can find it on:
       https://www.gnu.org/software/parallel/parallel_cheat.pdf

       The man page is the reference for all options.

       Design discussion

       If you want to know the design decisions behind GNU parallel, try: man
       parallel_design. This is also a good intro if you intend to change GNU
       parallel.


OPTIONS

       command
           Command to execute.  If command or the following arguments contain
           replacement strings (such as {}) every instance will be substituted
           with the input.

           If command is given, GNU parallel solves the same tasks as xargs.
           If command is not given GNU parallel will behave similarly to
           cat | sh.

           The command must be an executable, a script, a composed command, an
           alias, or a function.

           Bash functions: export -f the function first or use env_parallel.

           Bash, Csh, or Tcsh aliases: Use env_parallel.

           Zsh, Fish, Ksh, and Pdksh functions and aliases: Use env_parallel.

       {}  Input line. This replacement string will be replaced by a full line
           read from the input source. The input source is normally stdin
           (standard input), but can also be given with -a, :::, or ::::.

           The replacement string {} can be changed with -I.

           If the command line contains no replacement strings then {} will be
           appended to the command line.

           Replacement strings are normally quoted, so special characters are
           not parsed by the shell. The exception is if the command starts
           with a replacement string; then the string is not quoted.

       {.} Input line without extension. This replacement string will be
           replaced by the input with the extension removed. If the input line
           contains . after the last /, the last . until the end of the string
           will be removed and {.} will be replaced with the remainder. E.g.
           foo.jpg becomes foo, subdir/foo.jpg becomes subdir/foo,
           sub.dir/foo.jpg becomes sub.dir/foo, sub.dir/bar remains
           sub.dir/bar. If the input line does not contain . it will remain
           unchanged.

           The replacement string {.} can be changed with --er.

           To understand replacement strings see {}.

       {/} Basename of input line. This replacement string will be replaced by
           the input with the directory part removed.

           The replacement string {/} can be changed with --basenamereplace.

           To understand replacement strings see {}.

       {//}
           Dirname of input line. This replacement string will be replaced by
           the dir of the input line. See dirname(1).

           The replacement string {//} can be changed with --dirnamereplace.

           To understand replacement strings see {}.

       {/.}
           Basename of input line without extension. This replacement string
           will be replaced by the input with the directory and extension part
           removed. It is a combination of {/} and {.}.

           The replacement string {/.} can be changed with
           --basenameextensionreplace.

           To understand replacement strings see {}.

       {#} Sequence number of the job to run. This replacement string will be
           replaced by the sequence number of the job being run. It contains
           the same number as $PARALLEL_SEQ.

           The replacement string {#} can be changed with --seqreplace.

           To understand replacement strings see {}.

       {%} Job slot number. This replacement string will be replaced by the
           job's slot number between 1 and number of jobs to run in parallel.
           There will never be 2 jobs running at the same time with the same
           job slot number.

           The replacement string {%} can be changed with --slotreplace.

           If the job needs to be retried (e.g. using --retries or
           --retry-failed) the job slot is not automatically updated. You
           should then instead use $PARALLEL_JOBSLOT:

             $ do_test() {
                 id="$3 {%}=$1 PARALLEL_JOBSLOT=$2"
                 echo run "$id";
                 sleep 1
                 # fail if {%} is odd
                 return `echo $1%2 | bc`
               }
             $ export -f do_test
             $ parallel -j3 --jl mylog do_test {%} \$PARALLEL_JOBSLOT {} ::: A B C D
             run A {%}=1 PARALLEL_JOBSLOT=1
             run B {%}=2 PARALLEL_JOBSLOT=2
             run C {%}=3 PARALLEL_JOBSLOT=3
             run D {%}=1 PARALLEL_JOBSLOT=1
             $ parallel --retry-failed -j3 --jl mylog do_test {%} \$PARALLEL_JOBSLOT {} ::: A B C D
             run A {%}=1 PARALLEL_JOBSLOT=1
             run C {%}=3 PARALLEL_JOBSLOT=2
             run D {%}=1 PARALLEL_JOBSLOT=3

           Notice how {%} and $PARALLEL_JOBSLOT differ in the retry run of C
           and D.

           To understand replacement strings see {}.

       {n} Argument from input source n or the n'th argument. This positional
           replacement string will be replaced by the input from input source
           n (when used with -a or ::::) or with the n'th argument (when used
           with -N). If n is negative it refers to the n'th last argument.

           To understand replacement strings see {}.

       {n.}
           Argument from input source n or the n'th argument without
           extension. It is a combination of {n} and {.}.

           This positional replacement string will be replaced by the input
           from input source n (when used with -a or ::::) or with the n'th
           argument (when used with -N). The input will have the extension
           removed.

           To understand positional replacement strings see {n}.

       {n/}
           Basename of argument from input source n or the n'th argument.  It
           is a combination of {n} and {/}.

           This positional replacement string will be replaced by the input
           from input source n (when used with -a or ::::) or with the n'th
           argument (when used with -N). The input will have the directory (if
           any) removed.

           To understand positional replacement strings see {n}.

       {n//}
           Dirname of argument from input source n or the n'th argument.  It
           is a combination of {n} and {//}.

           This positional replacement string will be replaced by the dir of
           the input from input source n (when used with -a or ::::) or with
           the n'th argument (when used with -N). See dirname(1).

           To understand positional replacement strings see {n}.

       {n/.}
           Basename of argument from input source n or the n'th argument
           without extension.  It is a combination of {n}, {/}, and {.}.

           This positional replacement string will be replaced by the input
           from input source n (when used with -a or ::::) or with the n'th
           argument (when used with -N). The input will have the directory (if
           any) and extension removed.

           To understand positional replacement strings see {n}.

       {=perl expression=}
           Replace with calculated perl expression. $_ will contain the same
           as {}. After evaluating perl expression $_ will be used as the
           value. It is recommended to only change $_ but you have full access
           to all of GNU parallel's internal functions and data structures. A
           few convenience functions and data structures have been made:

            Q(string)     shell quote a string

            pQ(string)    perl quote a string

            uq() (or uq)  do not quote current replacement string

            total_jobs()  number of jobs in total

            slot()        slot number of job

            seq()         sequence number of job

            @arg          the arguments

           Example:

             seq 10 | parallel echo {} + 1 is {= '$_++' =}
             parallel csh -c {= '$_="mkdir ".Q($_)' =} ::: '12" dir'
             seq 50 | parallel echo job {#} of {= '$_=total_jobs()' =}

           See also: --rpl --parens

       {=n perl expression=}
           Positional equivalent to {=perl expression=}. To understand
           positional replacement strings see {n}.

           See also: {=perl expression=} {n}.

       ::: arguments
           Use arguments from the command line as input source instead of
           stdin (standard input). Unlike other options for GNU parallel :::
           is placed after the command and before the arguments.

           The following are equivalent:

             (echo file1; echo file2) | parallel gzip
             parallel gzip ::: file1 file2
             parallel gzip {} ::: file1 file2
             parallel --arg-sep ,, gzip {} ,, file1 file2
             parallel --arg-sep ,, gzip ,, file1 file2
             parallel ::: "gzip file1" "gzip file2"

           To avoid treating ::: as special use --arg-sep to set the argument
           separator to something else. See also --arg-sep.

           If multiple ::: are given, each group will be treated as an input
           source, and all combinations of input sources will be generated.
           E.g. ::: 1 2 ::: a b c will result in the combinations (1,a) (1,b)
           (1,c) (2,a) (2,b) (2,c). This is useful for replacing nested for-
           loops.

           ::: and :::: can be mixed. So these are equivalent:

             parallel echo {1} {2} {3} ::: 6 7 ::: 4 5 ::: 1 2 3
             parallel echo {1} {2} {3} :::: <(seq 6 7) <(seq 4 5) \
               :::: <(seq 1 3)
             parallel -a <(seq 6 7) echo {1} {2} {3} :::: <(seq 4 5) \
               :::: <(seq 1 3)
             parallel -a <(seq 6 7) -a <(seq 4 5) echo {1} {2} {3} \
               ::: 1 2 3
             seq 6 7 | parallel -a - -a <(seq 4 5) echo {1} {2} {3} \
               ::: 1 2 3
             seq 4 5 | parallel echo {1} {2} {3} :::: <(seq 6 7) - \
               ::: 1 2 3

       :::+ arguments
           Like ::: but linked like --link to the previous input source.

           Contrary to --link, values do not wrap: The shortest input source
           determines the length.

           Example:

             parallel echo ::: a b c :::+ 1 2 3 ::: X Y :::+ 11 22

       :::: argfiles
           Another way to write -a argfile1 -a argfile2 ...

           ::: and :::: can be mixed.

           See -a, ::: and --link.

       ::::+ argfiles
           Like :::: but linked like --link to the previous input source.

           Contrary to --link, values do not wrap: The shortest input source
           determines the length.

       --null
       -0  Use NUL as delimiter.  Normally input lines will end in \n
           (newline). If they end in \0 (NUL), then use this option. It is
           useful for processing arguments that may contain \n (newline).

       --arg-file input-file
       -a input-file
           Use input-file as input source. If you use this option, stdin
           (standard input) is given to the first process run.  Otherwise,
           stdin (standard input) is redirected from /dev/null.

           If multiple -a are given, each input-file will be treated as an
           input source, and all combinations of input sources will be
           generated. E.g. the file foo contains 1 2, the file bar contains a
           b c.  -a foo -a bar will result in the combinations (1,a) (1,b)
           (1,c) (2,a) (2,b) (2,c). This is useful for replacing nested for-
           loops.

           See also --link and {n}.

       --arg-file-sep sep-str
           Use sep-str instead of :::: as separator string between command and
           argument files. Useful if :::: is used for something else by the
           command.

           See also: ::::.

       --arg-sep sep-str
           Use sep-str instead of ::: as separator string. Useful if ::: is
           used for something else by the command.

           Also useful if your command uses ::: but you still want to read
           arguments from stdin (standard input): Simply change --arg-sep to a
           string that is not in the command line.

           See also: :::.

       --bar
           Show progress as a progress bar. The bar shows: % of jobs
           completed, estimated seconds left, and number of jobs started.

           It is compatible with zenity:

             seq 1000 | parallel -j30 --bar '(echo {};sleep 0.1)' \
               2> >(zenity --progress --auto-kill) | wc

       --basefile file
       --bf file
           file will be transferred to each sshlogin before a job is started.
           It will be removed if --cleanup is active. The file may be a script
           to run or some common base data needed for the job.  Multiple --bf
           can be specified to transfer more basefiles. The file will be
           transferred the same way as --transferfile.

       --basenamereplace replace-str
       --bnr replace-str
           Use the replacement string replace-str instead of {/} for basename
           of input line.

       --basenameextensionreplace replace-str
       --bner replace-str
           Use the replacement string replace-str instead of {/.} for basename
           of input line without extension.

       --bin binexpr (beta testing)
           Use binexpr as binning key and bin input to the jobs.

           binexpr is [column number|column name] [perlexpression] e.g. 3,
           Address, 3 $_%=100, Address s/\D//g.

           Each input line is split using --colsep. The value of the column is
           put into $_, the perl expression is executed, and the resulting
           value is the job slot that will be given the line. If the value is
           bigger than the number of jobslots the value will be modulo number
           of jobslots.

           This is similar to --shard but the hashing algorithm is a simple
           modulo, which makes it predictable which jobslot will receive which
           value.

           The performance is in the order of 100K rows per second. Faster if
           the bincol is small (<10), slower if it is big (>100).

           --bin requires --pipe and a fixed numeric value for --jobs.

           See also --shard, --group-by, --roundrobin.

       --bg
           Run command in the background, thus GNU parallel will not wait for
           completion of the command before exiting. This is the default if
           --semaphore is set.

           See also: --fg, man sem.

           Implies --semaphore.

       --bibtex
       --citation
           Print the citation notice and BibTeX entry for GNU parallel,
           silence citation notice for all future runs, and exit. It will not
           run any commands.

           If it is impossible for you to run --citation you can instead use
           --will-cite, which will run commands, but which will only silence
           the citation notice for this single run.

           If you use --will-cite in scripts to be run by others you are
           making it harder for others to see the citation notice.  The
           development of GNU parallel is indirectly financed through
           citations, so if your users do not know they should cite then you
           are making it harder to finance development. However, if you pay
           10000 EUR, you have done your part to finance future development
           and should feel free to use --will-cite in scripts.

           If you do not want to help financing future development by letting
           other users see the citation notice or by paying, then please use
           another tool instead of GNU parallel. You can find some of the
           alternatives in man parallel_alternatives.

       --block size
       --block-size size
           Size of block in bytes to read at a time. The size can be postfixed
           with K, M, G, T, P, E, k, m, g, t, p, or e which would multiply the
           size with 1024, 1048576, 1073741824, 1099511627776,
           1125899906842624, 1152921504606846976, 1000, 1000000, 1000000000,
           1000000000000, 1000000000000000, or 1000000000000000000
           respectively.

           GNU parallel tries to meet the block size but can be off by the
           length of one record. For performance reasons size should be bigger
           than two records. GNU parallel will warn you and automatically
           increase the size if you choose a size that is too small.

           If you use -N, --block-size should be bigger than N+1 records.

           size defaults to 1M.

           When using --pipepart a negative block size is not interpreted as a
           blocksize but as the number of blocks each jobslot should have. So
           this will run 10*5 = 50 jobs in total:

             parallel --pipepart -a myfile --block -10 -j5 wc

           This is an efficient alternative to --roundrobin because data is
           never read by GNU parallel, but you can still have very few
           jobslots process a large amount of data.

           See --pipe and --pipepart for use of this.

       --blocktimeout duration
       --bt duration
           Time out for reading block when using --pipe. If it takes longer
           than duration to read a full block, use the partial block read so
           far.

           duration must be in whole seconds, but can be expressed as floats
           postfixed with s, m, h, or d which would multiply the float by 1,
           60, 3600, or 86400. Thus these are equivalent: --blocktimeout
           100000 and --blocktimeout 1d3.5h16.6m4s.

       --cat
           Create a temporary file with content. Normally --pipe/--pipepart
           will give data to the program on stdin (standard input). With --cat
           GNU parallel will create a temporary file with the name in {}, so
           you can do: parallel --pipe --cat wc {}.

           Implies --pipe unless --pipepart is used.

           See also --fifo.

       --cleanup
           Remove transferred files. --cleanup will remove the transferred
           files on the remote computer after processing is done.

             find log -name '*gz' | parallel \
               --sshlogin server.example.com --transferfile {} \
               --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"

           With --transferfile {} the file transferred to the remote computer
           will be removed on the remote computer.  Directories created will
           not be removed - even if they are empty.

           With --return the file transferred from the remote computer will be
           removed on the remote computer.  Directories created will not be
           removed - even if they are empty.

           --cleanup is ignored when not used with --transferfile or --return.

       --colsep regexp
       -C regexp
           Column separator. The input will be treated as a table with regexp
           separating the columns. The n'th column can be accessed using {n}
           or {n.}. E.g. {3} is the 3rd column.

           If there are more input sources, each input source will be
           separated, but the columns from each input source will be linked
           (see --link).

             parallel --colsep '-' echo {4} {3} {2} {1} \
               ::: A-B C-D ::: e-f g-h

           --colsep implies --trim rl, which can be overridden with --trim n.

           regexp is a Perl Regular Expression:
           http://perldoc.perl.org/perlre.html

       --compress
           Compress temporary files. If the output is big and very
           compressible this will take up less disk space in $TMPDIR and
           possibly be faster due to less disk I/O.

           GNU parallel will try pzstd, lbzip2, pbzip2, zstd, pigz, lz4, lzop,
           plzip, lzip, lrz, gzip, pxz, lzma, bzip2, xz, clzip, in that order,
           and use the first available.

       --compress-program prg
       --decompress-program prg
           Use prg for (de)compressing temporary files. It is assumed that prg
           -dc will decompress stdin (standard input) to stdout (standard
           output) unless --decompress-program is given.

       --csv
           Treat input as CSV-format. --colsep sets the field delimiter. It
           works very much like --colsep except it deals correctly with
           quoting:

              echo '"1 big, 2 small","2""x4"" plank",12.34' |
                parallel --csv echo {1} of {2} at {3}

           Even quoted newlines are parsed correctly:

              (echo '"Start of field 1 with newline'
               echo 'Line 2 in field 1";value 2') |
                parallel --csv --colsep ';' echo Field 1: {1} Field 2: {2}

           When used with --pipe only pass full CSV-records.

       --delay mytime
           Delay starting next job by mytime. GNU parallel will pause mytime
           after starting each job. mytime is normally in seconds, but can be
           floats postfixed with s, m, h, or d which would multiply the float
           by 1, 60, 3600, or 86400. Thus these are equivalent: --delay 100000
           and --delay 1d3.5h16.6m4s.

       --delimiter delim
       -d delim
           Input items are terminated by delim.  Quotes and backslash are not
           special; every character in the input is taken literally.  Disables
           the end-of-file string, which is treated like any other argument.
           The specified delimiter may be characters, C-style character
           escapes such as \n, or octal or hexadecimal escape codes.  Octal
           and hexadecimal escape codes are understood as for the printf
           command.  Multibyte characters are not supported.

       --dirnamereplace replace-str
       --dnr replace-str
           Use the replacement string replace-str instead of {//} for dirname
           of input line.

       --dry-run
           Print the job to run on stdout (standard output), but do not run
           the job. Use -v -v to include the wrapping that GNU parallel
           generates (for remote jobs, --tmux, --nice, --pipe, --pipepart,
           --fifo and --cat). Do not count on this literally, though, as the
           job may be scheduled on another computer or the local computer if :
           is in the list.

       -E eof-str
           Set the end of file string to eof-str.  If the end of file string
           occurs as a line of input, the rest of the input is not read.  If
           neither -E nor -e is used, no end of file string is used.

       --eof[=eof-str]
       -e[eof-str]
           This option is a synonym for the -E option.  Use -E instead,
           because it is POSIX compliant for xargs while this option is not.
           If eof-str is omitted, there is no end of file string.  If neither
           -E nor -e is used, no end of file string is used.

       --embed
           Embed GNU parallel in a shell script. If you need to distribute
           your script to someone who does not want to install GNU parallel
           you can embed GNU parallel in your own shell script:

             parallel --embed > new_script

           After which you add your code at the end of new_script. This is
           tested on ash, bash, dash, ksh, sh, and zsh.

       --env var
           Copy environment variable var. This will copy var to the
           environment that the command is run in. This is especially useful
           for remote execution.

           In Bash var can also be a Bash function - just remember to export
           -f the function, see command.

           The variable '_' is special. It will copy all exported environment
           variables except for the ones mentioned in
           ~/.parallel/ignored_vars.

           To copy the full environment (both exported and not exported
           variables, arrays, and functions) use env_parallel.

           See also: --record-env, --session.

       --eta
           Show the estimated number of seconds before finishing. This forces
           GNU parallel to read all jobs before starting to find the number of
           jobs. GNU parallel normally only reads the next job to run.

           The estimate is based on the runtime of finished jobs, so the first
           estimate will only be shown when the first job has finished.

           Implies --progress.

           See also: --bar, --progress.

       --fg
           Run command in foreground.

           With --tmux and --tmuxpane GNU parallel will start tmux in the
           foreground.

           With --semaphore GNU parallel will run the command in the
           foreground (opposite --bg), and wait for completion of the command
           before exiting.

           See also --bg, man sem.

       --fifo
           Create a temporary fifo with content. Normally --pipe and
           --pipepart will give data to the program on stdin (standard input).
           With --fifo GNU parallel will create a temporary fifo with the name
           in {}, so you can do: parallel --pipe --fifo wc {}.

           Beware: If data is not read from the fifo, the job will block
           forever.

           Implies --pipe unless --pipepart is used.

           See also --cat.

       --filter-hosts
           Remove down hosts. For each remote host: check that login through
           ssh works. If not: do not use this host.

           For performance reasons, this check is performed only at the start
           and every time --sshloginfile is changed. If a host goes down
           after the first check, it will go undetected until --sshloginfile
           is changed; --retries can be used to mitigate this.

           Currently you can not put --filter-hosts in a profile, $PARALLEL,
           /etc/parallel/config or similar. This is because GNU parallel uses
           GNU parallel to compute this, so you will get an infinite loop.
           This will likely be fixed in a later release.

       --gnu
           Behave like GNU parallel. This option historically took precedence
           over --tollef. The --tollef option is now retired, and therefore
           may not be used. --gnu is kept for compatibility.

       --group
           Group output. Output from each job is grouped together and is only
           printed when the command is finished. Stdout (standard output)
           first followed by stderr (standard error).

           This takes in the order of 0.5ms per job and depends on the speed
           of your disk for larger output. It can be disabled with -u, but
           this means output from different commands can get mixed.

           --group is the default. Can be reversed with -u.

           See also: --line-buffer --ungroup

718       --group-by val
719           Group input by value. Combined with --pipe/--pipepart --group-by
720           groups lines with the same value into a record.
721
722           The value can be computed from the full line or from a single
723           column.
724
725           val can be:
726
727            column number Use the value in the column numbered.
728
729            column name   Treat the first line as a header and use the value
730                          in the column named.
731
732                          (Not supported with --pipepart).
733
734            perl expression
735                          Run the perl expression and use $_ as the value.
736
737            column number perl expression
                          Put the value of the column in $_, run the perl
739                          expression, and use $_ as the value.
740
741            column name perl expression
                          Put the value of the column in $_, run the perl
743                          expression, and use $_ as the value.
744
745                          (Not supported with --pipepart).
746
747           Example:
748
749             UserID, Consumption
750             123,    1
751             123,    2
752             12-3,   1
753             221,    3
754             221,    1
755             2/21,   5
756
757           If you want to group 123, 12-3, 221, and 2/21 into 4 records and
758           pass one record at a time to wc:
759
760             tail -n +2 table.csv | \
761               parallel --pipe --colsep , --group-by 1 -kN1 wc
762
763           Make GNU parallel treat the first line as a header:
764
765             cat table.csv | \
766               parallel --pipe --colsep , --header : --group-by 1 -kN1 wc
767
768           Address column by column name:
769
770             cat table.csv | \
771               parallel --pipe --colsep , --header : --group-by UserID -kN1 wc
772
773           If 12-3 and 123 are really the same UserID, remove non-digits in
774           UserID when grouping:
775
776             cat table.csv | parallel --pipe --colsep , --header : \
777               --group-by 'UserID s/\D//g' -kN1 wc
778
779           See also --shard, --roundrobin.
780
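           The first example above can be made self-contained; this
           sketch writes the sample table to a temporary file
           (location arbitrary) and counts the lines in each of the
           4 records:

```shell
# Write the sample table from the example above to a temp file.
tmp=$(mktemp)
printf '%s\n' 'UserID, Consumption' '123, 1' '123, 2' \
  '12-3, 1' '221, 3' '221, 1' '2/21, 5' > "$tmp"
# 4 groups -> 4 jobs -> 4 line counts (one per record).
tail -n +2 "$tmp" | parallel --pipe --colsep , --group-by 1 -kN1 wc -l
```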
781       --help
782       -h  Print a summary of the options to GNU parallel and exit.
783
784       --halt-on-error val
785       --halt val
786           When should GNU parallel terminate? In some situations it makes no
787           sense to run all jobs. GNU parallel should simply give up as soon
788           as a condition is met.
789
790           val defaults to never, which runs all jobs no matter what.
791
792           val can also take on the form of when,why.
793
794           when can be 'now' which means kill all running jobs and halt
795           immediately, or it can be 'soon' which means wait for all running
796           jobs to complete, but start no new jobs.
797
798           why can be 'fail=X', 'fail=Y%', 'success=X', 'success=Y%',
           'done=X', or 'done=Y%' where X is the number of jobs that
           have to fail, succeed, or be done before halting, and Y is
           the percentage of jobs that have to fail, succeed, or be
           done before halting.
802
803           Example:
804
805            --halt now,fail=1     exit when the first job fails. Kill running
806                                  jobs.
807
808            --halt soon,fail=3    exit when 3 jobs fail, but wait for running
809                                  jobs to complete.
810
811            --halt soon,fail=3%   exit when 3% of the jobs have failed, but
812                                  wait for running jobs to complete.
813
814            --halt now,success=1  exit when a job succeeds. Kill running jobs.
815
            --halt soon,success=3 exit when 3 jobs succeed, but wait for
817                                  running jobs to complete.
818
819            --halt now,success=3% exit when 3% of the jobs have succeeded.
820                                  Kill running jobs.
821
822            --halt now,done=1     exit when one of the jobs finishes. Kill
823                                  running jobs.
824
            --halt soon,done=3    exit when 3 jobs finish, but wait for
826                                  running jobs to complete.
827
828            --halt now,done=3%    exit when 3% of the jobs have finished. Kill
829                                  running jobs.
830
831           For backwards compatibility these also work:
832
833           0           never
834
835           1           soon,fail=1
836
837           2           now,fail=1
838
839           -1          soon,success=1
840
841           -2          now,success=1
842
843           1-99%       soon,fail=1-99%
844
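           A runnable sketch (the jobs here are made up and simply
           exit with their argument): with -j1 the jobs run in order,
           the third job fails, and the fourth is never started:

```shell
# --halt now,fail=1 stops parallel as soon as the job with
# argument 1 fails, so only three jobs run and parallel
# itself exits non-zero.
parallel -j1 --halt now,fail=1 'echo {}; exit {}' ::: 0 0 1 0
```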
845       --header regexp
846           Use regexp as header. For normal usage the matched header
847           (typically the first line: --header '.*\n') will be split using
848           --colsep (which will default to '\t') and column names can be used
849           as replacement variables: {column name}, {column name/}, {column
           name//}, {column name/.}, {column name.}, {=column name
           perl expression =}, etc.
852
853           For --pipe the matched header will be prepended to each output.
854
855           --header : is an alias for --header '.*\n'.
856
           If regexp is a number, that fixed number of lines is used
           as the header.
858
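           For example (column names and values are made up), the
           named columns become replacement strings:

```shell
# The first line is the header; {dir} and {file} refer to the
# columns by name.  Prints: /tmp/log.txt
printf 'dir file\n/tmp log.txt\n' |
  parallel --colsep ' ' --header : echo {dir}/{file}
```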
859       --hostgroups
860       --hgrp
861           Enable hostgroups on arguments. If an argument contains '@' the
862           string after '@' will be removed and treated as a list of
863           hostgroups on which this job is allowed to run. If there is no
864           --sshlogin with a corresponding group, the job will run on any
865           hostgroup.
866
867           Example:
868
869             parallel --hostgroups \
870               --sshlogin @grp1/myserver1 -S @grp1+grp2/myserver2 \
871               --sshlogin @grp3/myserver3 \
872               echo ::: my_grp1_arg@grp1 arg_for_grp2@grp2 third@grp1+grp3
873
           my_grp1_arg may be run on either myserver1 or myserver2,
           third may be run on any of the three servers, but
           arg_for_grp2 will only be run on myserver2.
877
878           See also: --sshlogin.
879
880       -I replace-str
881           Use the replacement string replace-str instead of {}.
882
883       --replace[=replace-str]
884       -i[replace-str]
885           This option is a synonym for -Ireplace-str if replace-str is
886           specified, and for -I {} otherwise.  This option is deprecated; use
887           -I instead.
888
889       --joblog logfile
890           Logfile for executed jobs. Save a list of the executed jobs to
891           logfile in the following TAB separated format: sequence number,
892           sshlogin, start time as seconds since epoch, run time in seconds,
893           bytes in files transferred, bytes in files returned, exit status,
894           signal, and command run.
895
           For --pipe, bytes transferred and bytes returned are the
           number of bytes of input and output, respectively.
898
899           If logfile is prepended with '+' log lines will be appended to the
900           logfile.
901
           To convert the times into strict ISO-8601 format do:
903
904             cat logfile | perl -a -F"\t" -ne \
905               'chomp($F[2]=`date -d \@$F[2] +%FT%T`); print join("\t",@F)'
906
           If the sshlogin column is long, you can use column -t to
           pretty print the log:
908
909             cat joblog | column -t
910
911           See also --resume --resume-failed.
912
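           A minimal sketch (the log file location is arbitrary): the
           joblog contains a header line plus one line per job:

```shell
log=$(mktemp)
parallel --joblog "$log" echo ::: a b
# Header (Seq, Host, Starttime, ...) plus one line per job:
column -t "$log"
```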
913       --jobs N
914       -j N
915       --max-procs N
916       -P N
917           Number of jobslots on each machine. Run up to N jobs in parallel.
918           0 means as many as possible. Default is 100% which will run one job
919           per CPU on each machine.
920
921           If --semaphore is set, the default is 1 thus making a mutex.
922
923       --jobs +N
924       -j +N
925       --max-procs +N
926       -P +N
927           Add N to the number of CPUs.  Run this many jobs in parallel.  See
928           also --use-cores-instead-of-threads and
929           --use-sockets-instead-of-threads.
930
931       --jobs -N
932       -j -N
933       --max-procs -N
934       -P -N
935           Subtract N from the number of CPUs.  Run this many jobs in
936           parallel.  If the evaluated number is less than 1 then 1 will be
937           used.  See also --use-cores-instead-of-threads and
938           --use-sockets-instead-of-threads.
939
940       --jobs N%
941       -j N%
942       --max-procs N%
943       -P N%
           Multiply the number of CPUs by N%.  Run this many jobs in
945           parallel. See also --use-cores-instead-of-threads and
946           --use-sockets-instead-of-threads.
947
948       --jobs procfile
949       -j procfile
950       --max-procs procfile
951       -P procfile
952           Read parameter from file. Use the content of procfile as parameter
953           for -j. E.g. procfile could contain the string 100% or +2 or 10. If
954           procfile is changed when a job completes, procfile is read again
955           and the new number of jobs is computed. If the number is lower than
956           before, running jobs will be allowed to finish but new jobs will
957           not be started until the wanted number of jobs has been reached.
958           This makes it possible to change the number of simultaneous running
959           jobs while GNU parallel is running.
960
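           For example (the file location is arbitrary):

```shell
procfile=$(mktemp)
echo 2 > "$procfile"
# Runs at most 2 jobs at a time; writing e.g. 4 to the file
# while parallel is running would raise the limit to 4.
parallel -j "$procfile" echo ::: a b c d
```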
961       --keep-order
       -k  Keep the sequence of output the same as the order of
           input. Normally the output of a job will be printed as
           soon as the job completes. Try this to see the difference:
965
966             parallel -j4 sleep {}\; echo {} ::: 2 1 4 3
967             parallel -j4 -k sleep {}\; echo {} ::: 2 1 4 3
968
           If used with --onall or --nonall the output will be grouped by
970           sshlogin in sorted order.
971
972           If used with --pipe --roundrobin and the same input, the jobslots
973           will get the same blocks in the same order in every run.
974
975           -k only affects the order in which the output is printed - not the
976           order in which jobs are run.
977
978       -L recsize
979           When used with --pipe: Read records of recsize.
980
981           When used otherwise: Use at most recsize nonblank input lines per
982           command line.  Trailing blanks cause an input line to be logically
983           continued on the next input line.
984
985           -L 0 means read one line, but insert 0 arguments on the command
986           line.
987
988           Implies -X unless -m, --xargs, or --pipe is set.
989
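           For example, with -k the grouping is deterministic:

```shell
# Two input lines per command line:
# prints "a b" then "c d"
printf '%s\n' a b c d | parallel -k -L2 echo
```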
990       --max-lines[=recsize]
991       -l[recsize]
992           When used with --pipe: Read records of recsize lines.
993
994           When used otherwise: Synonym for the -L option.  Unlike -L, the
995           recsize argument is optional.  If recsize is not specified, it
996           defaults to one.  The -l option is deprecated since the POSIX
997           standard specifies -L instead.
998
999           -l 0 is an alias for -l 1.
1000
1001           Implies -X unless -m, --xargs, or --pipe is set.
1002
1003       --limit "command args"
1004           Dynamic job limit. Before starting a new job run command with args.
1005           The exit value of command determines what GNU parallel will do:
1006
1007           0   Below limit. Start another job.
1008
1009           1   Over limit. Start no jobs.
1010
1011           2   Way over limit. Kill the youngest job.
1012
1013           You can use any shell command. There are 3 predefined commands:
1014
1015           "io n"    Limit for I/O. The amount of disk I/O will be computed as
                     a value 0-100, where 0 means no I/O and 100 means
                     at least one disk is 100% saturated.
1018
1019           "load n"  Similar to --load.
1020
1021           "mem n"   Similar to --memfree.
1022
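           As a sketch of a custom limit command (the script
           /tmp/df_limit.sh and its thresholds are invented for
           illustration; they are not part of GNU parallel), any
           measurement can be mapped to the exit values above:

```shell
cat > /tmp/df_limit.sh <<'EOF'
#!/bin/sh
# Hypothetical limit command: below limit while /tmp is under
# 90% full (exit 0), over limit under 95% (exit 1), way over
# limit above that (exit 2, killing the youngest job).
pct=$(df -P /tmp | awk 'NR==2 {gsub(/%/,""); print $5}')
[ "$pct" -lt 90 ] && exit 0
[ "$pct" -lt 95 ] && exit 1
exit 2
EOF
chmod +x /tmp/df_limit.sh
parallel --limit /tmp/df_limit.sh echo ::: a b c
```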
1023       --line-buffer
1024       --lb
           Buffer output on a line basis. --group will keep the
           output together for a whole job. --ungroup allows output
           to mix, with half a line coming from one job and half a
           line coming from another job.
1028           --line-buffer fits between these two: GNU parallel will print a
1029           full line, but will allow for mixing lines of different jobs.
1030
1031           --line-buffer takes more CPU power than both --group and --ungroup,
1032           but can be much faster than --group if the CPU is not the limiting
1033           factor.
1034
1035           Normally --line-buffer does not buffer on disk, and can thus
1036           process an infinite amount of data, but it will buffer on disk when
1037           combined with: --keep-order, --results, --compress, and --files.
1038           This will make it as slow as --group and will limit output to the
1039           available disk space.
1040
1041           With --keep-order --line-buffer will output lines from the first
1042           job continuously while it is running, then lines from the second
1043           job while that is running. It will buffer full lines, but jobs will
1044           not mix. Compare:
1045
1046             parallel -j0 'echo {};sleep {};echo {}' ::: 1 3 2 4
1047             parallel -j0 --lb 'echo {};sleep {};echo {}' ::: 1 3 2 4
1048             parallel -j0 -k --lb 'echo {};sleep {};echo {}' ::: 1 3 2 4
1049
1050           See also: --group --ungroup
1051
1052       --xapply
1053       --link
1054           Link input sources. Read multiple input sources like xapply. If
1055           multiple input sources are given, one argument will be read from
1056           each of the input sources. The arguments can be accessed in the
1057           command as {1} .. {n}, so {1} will be a line from the first input
1058           source, and {6} will refer to the line with the same line number
1059           from the 6th input source.
1060
1061           Compare these two:
1062
1063             parallel echo {1} {2} ::: 1 2 3 ::: a b c
1064             parallel --link echo {1} {2} ::: 1 2 3 ::: a b c
1065
1066           Arguments will be recycled if one input source has more arguments
1067           than the others:
1068
1069             parallel --link echo {1} {2} {3} \
1070               ::: 1 2 ::: I II III ::: a b c d e f g
1071
1072           See also --header, :::+, ::::+.
1073
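           With -k added the pairing is easy to see:

```shell
# prints 1a, 2b, 3c on separate lines
parallel -k --link echo {1}{2} ::: 1 2 3 ::: a b c
```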
1074       --load max-load
1075           Do not start new jobs on a given computer unless the number of
1076           running processes on the computer is less than max-load. max-load
           uses the same syntax as --jobs, so 100% for one per CPU is
           a valid setting. The only difference is that 0 is
           interpreted as 0.01.
1079
1080       --controlmaster
1081       -M  Use ssh's ControlMaster to make ssh connections faster. Useful if
1082           jobs run remote and are very fast to run. This is disabled for
1083           sshlogins that specify their own ssh command.
1084
1085       -m  Multiple arguments. Insert as many arguments as the command line
1086           length permits. If multiple jobs are being run in parallel:
1087           distribute the arguments evenly among the jobs. Use -j1 or --xargs
1088           to avoid this.
1089
1090           If {} is not used the arguments will be appended to the line.  If
1091           {} is used multiple times each {} will be replaced with all the
1092           arguments.
1093
1094           Support for -m with --sshlogin is limited and may fail.
1095
1096           See also -X for context replace. If in doubt use -X as that will
1097           most likely do what is needed.
1098
1099       --memfree size
1100           Minimum memory free when starting another job. The size can be
1101           postfixed with K, M, G, T, P, k, m, g, t, or p which would multiply
1102           the size with 1024, 1048576, 1073741824, 1099511627776,
1103           1125899906842624, 1000, 1000000, 1000000000, 1000000000000, or
1104           1000000000000000, respectively.
1105
           If the jobs take up very different amounts of RAM, GNU
           parallel will only start as many as there is memory for.
           If less than size bytes are free, no more jobs will be
           started. If less than 50% of size bytes are free, the
           youngest job will be killed, and put back on the queue to
           be run later.
1111
1112           --retries must be set to determine how many times GNU parallel
1113           should retry a given job.
1114
1115       --minversion version
           Print the version of GNU parallel and exit.  If the
           current version of GNU parallel is less than version the
           exit code is 255. Otherwise it is 0.
1119
1120           This is useful for scripts that depend on features only available
1121           from a certain version of GNU parallel.
1122
1123       --max-args=max-args
1124       -n max-args
1125           Use at most max-args arguments per command line.  Fewer than max-
1126           args arguments will be used if the size (see the -s option) is
1127           exceeded, unless the -x option is given, in which case GNU parallel
1128           will exit.
1129
1130           -n 0 means read one argument, but insert 0 arguments on the command
1131           line.
1132
1133           Implies -X unless -m is set.
1134
1135       --max-replace-args=max-args
1136       -N max-args
1137           Use at most max-args arguments per command line. Like -n but also
           makes the replacement strings {1} .. {max-args} that
           represent arguments 1 .. max-args. If there are too few
           arguments, the remaining {n} will be empty.
1140
1141           -N 0 means read one argument, but insert 0 arguments on the command
1142           line.
1143
1144           This will set the owner of the homedir to the user:
1145
1146             tr ':' '\n' < /etc/passwd | parallel -N7 chown {1} {6}
1147
1148           Implies -X unless -m or --pipe is set.
1149
1150           When used with --pipe -N is the number of records to read. This is
1151           somewhat slower than --block.
1152
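           A side-effect-free sketch of the same idea, using echo
           instead of chown:

```shell
# Three arguments per job; {1} and {3} are the first and last
# of each group: prints "1 3", then "4 6", then "7 9".
seq 9 | parallel -k -N3 echo {1} {3}
```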
1153       --nonall
1154           --onall with no arguments. Run the command on all computers given
1155           with --sshlogin but take no arguments. GNU parallel will log into
           --jobs number of computers in parallel and run the job on
           each computer. -j adjusts how many computers to log into
           in parallel.
1158
1159           This is useful for running the same command (e.g. uptime) on a list
1160           of servers.
1161
1162       --onall
1163           Run all the jobs on all computers given with --sshlogin. GNU
           parallel will log into --jobs number of computers in
           parallel and run one job at a time on each computer. The
           order of the jobs will
1166           not be changed, but some computers may finish before others.
1167
1168           When using --group the output will be grouped by each server, so
1169           all the output from one server will be grouped together.
1170
1171           --joblog will contain an entry for each job on each server, so
           there will be several entries with job sequence 1.
1173
1174       --output-as-files
1175       --outputasfiles
1176       --files
1177           Instead of printing the output to stdout (standard output) the
1178           output of each job is saved in a file and the filename is then
1179           printed.
1180
1181           See also: --results
1182
1183       --pipe
1184       --spreadstdin
1185           Spread input to jobs on stdin (standard input). Read a block of
1186           data from stdin (standard input) and give one block of data as
1187           input to one job.
1188
1189           The block size is determined by --block. The strings --recstart and
1190           --recend tell GNU parallel how a record starts and/or ends. The
1191           block read will have the final partial record removed before the
           block is passed on to the job. The partial record will be
           prepended to the next block.
1194
1195           If --recstart is given this will be used to split at record start.
1196
1197           If --recend is given this will be used to split at record end.
1198
1199           If both --recstart and --recend are given both will have to match
1200           to find a split position.
1201
1202           If neither --recstart nor --recend are given --recend defaults to
1203           '\n'. To have no record separator use --recend "".
1204
1205           --files is often used with --pipe.
1206
           --pipe maxes out at around 1 GB/s input and 100 MB/s output. If
1208           performance is important use --pipepart.
1209
1210           See also: --recstart, --recend, --fifo, --cat, --pipepart, --files.
1211
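           Because blocks are split at record boundaries (by default
           '\n'), counts over the blocks add up exactly. For example
           (the block size is arbitrary):

```shell
# Splits stdin into roughly 100 KB blocks; the per-block
# line counts sum to exactly 100000.
seq 100000 | parallel --pipe --block 100k wc -l
```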
1212       --pipepart
1213           Pipe parts of a physical file. --pipepart works similar to --pipe,
1214           but is much faster.
1215
1216           --pipepart has a few limitations:
1217
1218           ·  The file must be a normal file or a block device (technically it
1219              must be seekable) and must be given using -a or ::::. The file
1220              cannot be a pipe or a fifo as they are not seekable.
1221
              If using a block device with a lot of NUL bytes, remember to set
1223              --recend ''.
1224
1225           ·  Record counting (-N) and line counting (-L/-l) do not work.
1226
1227       --plain
1228           Ignore any --profile, $PARALLEL, and ~/.parallel/config to get full
1229           control on the command line (used by GNU parallel internally when
1230           called with --sshlogin).
1231
1232       --plus
1233           Activate additional replacement strings: {+/} {+.} {+..} {+...}
1234           {..} {...} {/..} {/...} {##}. The idea being that '{+foo}' matches
1235           the opposite of '{foo}' and {} = {+/}/{/} = {.}.{+.} =
1236           {+/}/{/.}.{+.} = {..}.{+..} = {+/}/{/..}.{+..} = {...}.{+...} =
1237           {+/}/{/...}.{+...}
1238
1239           {##} is the total number of jobs to be run. It is incompatible with
1240           -X/-m/--xargs.
1241
1242           {choose_k} is inspired by n choose k: Given a list of n elements,
1243           choose k. k is the number of input sources and n is the number of
1244           arguments in an input source.  The content of the input sources
1245           must be the same and the arguments must be unique.
1246
1247           Shorthands for variables:
1248
1249             {slot}        $PARALLEL_JOBSLOT (see {%})
1250             {sshlogin}    $PARALLEL_SSHLOGIN
1251             {host}        $PARALLEL_SSHHOST
1252
1253           The following dynamic replacement strings are also activated. They
1254           are inspired by bash's parameter expansion:
1255
1256             {:-str}       str if the value is empty
1257             {:num}        remove the first num characters
1258             {:num1:num2}  characters from num1 to num2
1259             {#str}        remove prefix str
1260             {%str}        remove postfix str
1261             {/str1/str2}  replace str1 with str2
1262             {^str}        uppercase str if found at the start
1263             {^^str}       uppercase str
1264             {,str}        lowercase str if found at the start
1265             {,,str}       lowercase str
1266
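           For example (the file names are made up):

```shell
# {%str} removes the postfix: prints "error"
parallel --plus echo '{%.log}' ::: error.log
# {/str1/str2} substitutes: prints "newfile"
parallel --plus echo '{/old/new}' ::: oldfile
```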
1267       --progress
1268           Show progress of computations. List the computers involved in the
1269           task with number of CPUs detected and the max number of jobs to
1270           run. After that show progress for each computer: number of running
1271           jobs, number of completed jobs, and percentage of all jobs done by
1272           this computer. The percentage will only be available after all jobs
           have been scheduled, as GNU parallel only reads the next job when
1274           ready to schedule it - this is to avoid wasting time and memory by
1275           reading everything at startup.
1276
1277           By sending GNU parallel SIGUSR2 you can toggle turning on/off
1278           --progress on a running GNU parallel process.
1279
1280           See also --eta and --bar.
1281
1282       --max-line-length-allowed
1283           Print the maximal number of characters allowed on the command line
1284           and exit (used by GNU parallel itself to determine the line length
1285           on remote computers).
1286
1287       --number-of-cpus (obsolete)
1288           Print the number of physical CPU cores and exit.
1289
1290       --number-of-cores
1291           Print the number of physical CPU cores and exit (used by GNU
1292           parallel itself to determine the number of physical CPU cores on
1293           remote computers).
1294
1295       --number-of-sockets
1296           Print the number of filled CPU sockets and exit (used by GNU
1297           parallel itself to determine the number of filled CPU sockets on
1298           remote computers).
1299
1300       --number-of-threads
1301           Print the number of hyperthreaded CPU cores and exit (used by GNU
1302           parallel itself to determine the number of hyperthreaded CPU cores
1303           on remote computers).
1304
1305       --no-keep-order
1306           Overrides an earlier --keep-order (e.g. if set in
1307           ~/.parallel/config).
1308
1309       --nice niceness
1310           Run the command at this niceness.
1311
1312           By default GNU parallel will run jobs at the same nice level as GNU
1313           parallel is started - both on the local machine and remote servers,
1314           so you are unlikely to ever use this option.
1315
1316           Setting --nice will override this nice level. If the nice level is
1317           smaller than the current nice level, it will only affect remote
           jobs (e.g. if the current level is 10 then --nice 5 will
           cause local jobs to be run at level 10, but remote jobs to
           be run at nice level 5).
1320
1321       --interactive
1322       -p  Prompt the user about whether to run each command line and read a
1323           line from the terminal.  Only run the command line if the response
1324           starts with 'y' or 'Y'.  Implies -t.
1325
1326       --parens parensstring
1327           Define start and end parenthesis for {= perl expression =}. The
1328           left and the right parenthesis can be multiple characters and are
1329           assumed to be the same length. The default is {==} giving {= as the
1330           start parenthesis and =} as the end parenthesis.
1331
           Another useful setting is ,,,, which would make both
           parentheses ,,:
1334
1335             parallel --parens ,,,, echo foo is ,,s/I/O/g,, ::: FII
1336
1337           See also: --rpl {= perl expression =}
1338
1339       --profile profilename
1340       -J profilename
1341           Use profile profilename for options. This is useful if you want to
1342           have multiple profiles. You could have one profile for running jobs
1343           in parallel on the local computer and a different profile for
1344           running jobs on remote computers. See the section PROFILE FILES for
1345           examples.
1346
1347           profilename corresponds to the file ~/.parallel/profilename.
1348
1349           You can give multiple profiles by repeating --profile. If parts of
1350           the profiles conflict, the later ones will be used.
1351
1352           Default: config
1353
1354       --quote
1355       -q  Quote command. If your command contains special characters that
1356           should not be interpreted by the shell (e.g. ; \ | *), use --quote
1357           to escape these. The command must be a simple command (see man
1358           bash) without redirections and without variable assignments.
1359
1360           See the section QUOTING. Most people will not need this.  Quoting
1361           is disabled by default.
1362
1363       --no-run-if-empty
1364       -r  If the stdin (standard input) only contains whitespace, do not run
1365           the command.
1366
1367           If used with --pipe this is slow.
1368
1369       --noswap
1370           Do not start new jobs on a given computer if there is both swap-in
1371           and swap-out activity.
1372
1373           The swap activity is only sampled every 10 seconds as the sampling
1374           takes 1 second to do.
1375
1376           Swap activity is computed as (swap-in)*(swap-out) which in practice
1377           is a good value: swapping out is not a problem, swapping in is not
1378           a problem, but both swapping in and out usually indicates a
1379           problem.
1380
1381           --memfree may give better results, so try using that first.
1382
1383       --record-env
1384           Record current environment variables in ~/.parallel/ignored_vars.
1385           This is useful before using --env _.
1386
1387           See also --env, --session.
1388
1389       --recstart startstring
1390       --recend endstring
1391           If --recstart is given startstring will be used to split at record
1392           start.
1393
1394           If --recend is given endstring will be used to split at record end.
1395
1396           If both --recstart and --recend are given the combined string
1397           endstringstartstring will have to match to find a split position.
1398           This is useful if either startstring or endstring match in the
1399           middle of a record.
1400
1401           If neither --recstart nor --recend are given then --recend defaults
1402           to '\n'. To have no record separator use --recend "".
1403
1404           --recstart and --recend are used with --pipe.
1405
1406           Use --regexp to interpret --recstart and --recend as regular
1407           expressions. This is slow, however.
1408
1409       --regexp
1410           Use --regexp to interpret --recstart and --recend as regular
1411           expressions. This is slow, however.
1412
1413       --remove-rec-sep
1414       --removerecsep
1415       --rrs
1416           Remove the text matched by --recstart and --recend before piping it
1417           to the command.
1418
1419           Only used with --pipe.
1420
1421       --results name
1422       --res name
1423           Save the output into files.
1424
1425           Simple string output dir
1426
1427           If name does not contain replacement strings and does not end in
1428           .csv/.tsv, the output will be stored in a directory tree rooted at
1429           name.  Within this directory tree, each command will result in
           three files: name/<ARGS>/stdout, name/<ARGS>/stderr, and
           name/<ARGS>/seq, where <ARGS> is a sequence of directories
1432           representing the header of the input source (if using --header :)
1433           or the number of the input source and corresponding values.
1434
1435           E.g:
1436
1437             parallel --header : --results foo echo {a} {b} \
1438               ::: a I II ::: b III IIII
1439
1440           will generate the files:
1441
1442             foo/a/II/b/III/seq
1443             foo/a/II/b/III/stderr
1444             foo/a/II/b/III/stdout
1445             foo/a/II/b/IIII/seq
1446             foo/a/II/b/IIII/stderr
1447             foo/a/II/b/IIII/stdout
1448             foo/a/I/b/III/seq
1449             foo/a/I/b/III/stderr
1450             foo/a/I/b/III/stdout
1451             foo/a/I/b/IIII/seq
1452             foo/a/I/b/IIII/stderr
1453             foo/a/I/b/IIII/stdout
1454
1455           and
1456
1457             parallel --results foo echo {1} {2} ::: I II ::: III IIII
1458
1459           will generate the files:
1460
1461             foo/1/II/2/III/seq
1462             foo/1/II/2/III/stderr
1463             foo/1/II/2/III/stdout
1464             foo/1/II/2/IIII/seq
1465             foo/1/II/2/IIII/stderr
1466             foo/1/II/2/IIII/stdout
1467             foo/1/I/2/III/seq
1468             foo/1/I/2/III/stderr
1469             foo/1/I/2/III/stdout
1470             foo/1/I/2/IIII/seq
1471             foo/1/I/2/IIII/stderr
1472             foo/1/I/2/IIII/stdout
1473
1474           CSV file output
1475
1476           If name ends in .csv/.tsv the output will be a CSV-file named name.
1477
1478           .csv gives a comma separated value file. .tsv gives a TAB separated
1479           value file.
1480
1481           -.csv/-.tsv are special: the output will be sent to stdout
1482           (standard output) instead of to a file.
1483
1484           Replacement string output file
1485
1486           If name contains a replacement string and the replaced result does
1487           not end in /, then the standard output will be stored in a file
1488           named by this result. Standard error will be stored in the same
1489           file name with '.err' added, and the sequence number will be stored
1490           in the same file name with '.seq' added.
1491
1492           E.g.
1493
1494             parallel --results my_{} echo ::: foo bar baz
1495
1496           will generate the files:
1497
1498             my_bar
1499             my_bar.err
1500             my_bar.seq
1501             my_baz
1502             my_baz.err
1503             my_baz.seq
1504             my_foo
1505             my_foo.err
1506             my_foo.seq
1507
1508           Replacement string output dir
1509
1510           If name contains a replacement string and the replaced result ends
1511           in /, then output files will be stored in the resulting dir.
1512
1513           E.g.
1514
1515             parallel --results my_{}/ echo ::: foo bar baz
1516
1517           will generate the files:
1518
1519             my_bar/seq
1520             my_bar/stderr
1521             my_bar/stdout
1522             my_baz/seq
1523             my_baz/stderr
1524             my_baz/stdout
1525             my_foo/seq
1526             my_foo/stderr
1527             my_foo/stdout
1528
1529           See also --files, --tag, --header, --joblog.
1530
1531       --resume
1532           Resumes from the last unfinished job. By reading --joblog or the
1533           --results dir GNU parallel will figure out the last unfinished job
1534           and continue from there. As GNU parallel only looks at the sequence
1535           numbers in --joblog, the input, the command, and --joblog all
1536           have to remain unchanged; otherwise GNU parallel may run wrong
1537           commands.
1538
1539           See also --joblog, --results, --resume-failed, --retries.
1540
1541       --resume-failed
1542           Retry all failed and resume from the last unfinished job. By
1543           reading --joblog GNU parallel will figure out the failed jobs and
1544           run those again. After that it will resume the last unfinished
1545           job and continue from there. As GNU parallel only looks at the
1546           sequence numbers in --joblog, the input, the command, and
1547           --joblog all have to remain unchanged; otherwise GNU parallel
1548           may run wrong commands.
1549
1550           See also --joblog, --resume, --retry-failed, --retries.
1551
1552       --retry-failed
1553           Retry all failed jobs in joblog. By reading --joblog GNU parallel
1554           will figure out the failed jobs and run those again.
1555
1556           --retry-failed ignores the command and arguments on the command
1557           line: It only looks at the joblog.
1558
1559           Differences between --resume, --resume-failed, --retry-failed
1560
1561           In this example exit {= $_%=2 =} will cause every other job to
1562           fail.
1563
1564             timeout -k 1 4 parallel --joblog log -j10 \
1565               'sleep {}; exit {= $_%=2 =}' ::: {10..1}
1566
1567           4 jobs completed. 2 failed:
1568
1569             Seq   [...]   Exitval Signal  Command
1570             10    [...]   1       0       sleep 1; exit 1
1571             9     [...]   0       0       sleep 2; exit 0
1572             8     [...]   1       0       sleep 3; exit 1
1573             7     [...]   0       0       sleep 4; exit 0
1574
1575           --resume does not care about the Exitval, but only looks at Seq. If
1576           the Seq is run, it will not be run again. So if needed, you can
1577           change the command for the seqs not run yet:
1578
1579             parallel --resume --joblog log -j10 \
1580               'sleep .{}; exit {= $_%=2 =}' ::: {10..1}
1581
1582             Seq   [...]   Exitval Signal  Command
1583             [... as above ...]
1584             1     [...]   0       0       sleep .10; exit 0
1585             6     [...]   1       0       sleep .5; exit 1
1586             5     [...]   0       0       sleep .6; exit 0
1587             4     [...]   1       0       sleep .7; exit 1
1588             3     [...]   0       0       sleep .8; exit 0
1589             2     [...]   1       0       sleep .9; exit 1
1590
1591           --resume-failed cares about the Exitval, but also only looks at Seq
1592           to figure out which commands to run. Again this means you can
1593           change the command, but not the arguments. It will run the failed
1594           seqs and the seqs not yet run:
1595
1596             parallel --resume-failed --joblog log -j10 \
1597               'echo {};sleep .{}; exit {= $_%=3 =}' ::: {10..1}
1598
1599             Seq   [...]   Exitval Signal  Command
1600             [... as above ...]
1601             10    [...]   1       0       echo 1;sleep .1; exit 1
1602             8     [...]   0       0       echo 3;sleep .3; exit 0
1603             6     [...]   2       0       echo 5;sleep .5; exit 2
1604             4     [...]   1       0       echo 7;sleep .7; exit 1
1605             2     [...]   0       0       echo 9;sleep .9; exit 0
1606
1607           --retry-failed cares about the Exitval, but takes the command from
1608           the joblog. It ignores any arguments or commands given on the
1609           command line:
1610
1611             parallel --retry-failed --joblog log -j10 this part is ignored
1612
1613             Seq   [...]   Exitval Signal  Command
1614             [... as above ...]
1615             10    [...]   1       0       echo 1;sleep .1; exit 1
1616             6     [...]   2       0       echo 5;sleep .5; exit 2
1617             4     [...]   1       0       echo 7;sleep .7; exit 1
1618
1619           See also --joblog, --resume, --resume-failed, --retries.
1620
1621       --retries n
1622           If a job fails, retry it on another computer on which it has not
1623           failed. Do this n times. If there are fewer than n computers in
1624           --sshlogin GNU parallel will re-use all the computers. This is
1625           useful if some jobs fail for no apparent reason (such as network
1626           failure).
1627
1628       --return filename
1629           Transfer files from remote computers. --return is used with
1630           --sshlogin when the arguments are files on the remote computers.
1631           When processing is done the file filename will be transferred from
1632           the remote computer using rsync and will be put relative to the
1633           default login dir. E.g.
1634
1635             echo foo/bar.txt | parallel --return {.}.out \
1636               --sshlogin server.example.com touch {.}.out
1637
1638           This will transfer the file $HOME/foo/bar.out from the computer
1639           server.example.com to the file foo/bar.out after running touch
1640           foo/bar.out on server.example.com.
1641
1642             parallel -S server --trc out/./{}.out touch {}.out ::: in/file
1643
1644           This will transfer the file in/file.out from the computer
1645           server to the file out/in/file.out after running touch
1646           in/file.out on server.
1647
1648             echo /tmp/foo/bar.txt | parallel --return {.}.out \
1649               --sshlogin server.example.com touch {.}.out
1650
1651           This will transfer the file /tmp/foo/bar.out from the computer
1652           server.example.com to the file /tmp/foo/bar.out after running touch
1653           /tmp/foo/bar.out on server.example.com.
1654
1655           Multiple files can be transferred by repeating the option multiple
1656           times:
1657
1658             echo /tmp/foo/bar.txt | parallel \
1659               --sshlogin server.example.com \
1660               --return {.}.out --return {.}.out2 touch {.}.out {.}.out2
1661
1662           --return is often used with --transferfile and --cleanup.
1663
1664           --return is ignored when used with --sshlogin : or when not used
1665           with --sshlogin.
1666
1667       --round-robin
1668       --round
1669           Normally --pipe will give a single block to each instance of the
1670           command. With --roundrobin all blocks will be written, at random,
1671           to commands already running. This is useful if the command takes a
1672           long time to initialize.
1673
1674           --keep-order will not work with --roundrobin as it is impossible to
1675           track which input block corresponds to which output.
1676
1677           --roundrobin implies --pipe, except if --pipepart is given.
1678
1679           See also --group-by, --shard.
1680
1681       --rpl 'tag perl expression'
1682           Use tag as a replacement string for perl expression. This makes it
1683           possible to define your own replacement strings. GNU parallel's 7
1684           replacement strings are implemented as:
1685
1686             --rpl '{} '
1687             --rpl '{#} 1 $_=$job->seq()'
1688             --rpl '{%} 1 $_=$job->slot()'
1689             --rpl '{/} s:.*/::'
1690             --rpl '{//} $Global::use{"File::Basename"} ||=
1691               eval "use File::Basename; 1;"; $_ = dirname($_);'
1692             --rpl '{/.} s:.*/::; s:\.[^/.]+$::;'
1693             --rpl '{.} s:\.[^/.]+$::'
1694
1695           The --plus replacement strings are implemented as:
1696
1697             --rpl '{+/} s:/[^/]*$::'
1698             --rpl '{+.} s:.*\.::'
1699             --rpl '{+..} s:.*\.([^.]*\.):$1:'
1700             --rpl '{+...} s:.*\.([^.]*\.[^.]*\.):$1:'
1701             --rpl '{..} s:\.[^/.]+$::; s:\.[^/.]+$::'
1702             --rpl '{...} s:\.[^/.]+$::; s:\.[^/.]+$::; s:\.[^/.]+$::'
1703             --rpl '{/..} s:.*/::; s:\.[^/.]+$::; s:\.[^/.]+$::'
1704             --rpl '{/...} s:.*/::;s:\.[^/.]+$::;s:\.[^/.]+$::;s:\.[^/.]+$::'
1705             --rpl '{##} $_=total_jobs()'
1706             --rpl '{:-(.+?)} $_ ||= $$1'
1707             --rpl '{:(\d+?)} substr($_,0,$$1) = ""'
1708             --rpl '{:(\d+?):(\d+?)} $_ = substr($_,$$1,$$2);'
1709             --rpl '{#([^#].*?)} s/^$$1//;'
1710             --rpl '{%(.+?)} s/$$1$//;'
1711             --rpl '{/(.+?)/(.*?)} s/$$1/$$2/;'
1712             --rpl '{^(.+?)} s/^($$1)/uc($1)/e;'
1713             --rpl '{^^(.+?)} s/($$1)/uc($1)/eg;'
1714             --rpl '{,(.+?)} s/^($$1)/lc($1)/e;'
1715             --rpl '{,,(.+?)} s/($$1)/lc($1)/eg;'
1716
1717           If the user-defined replacement string starts with '{' it can also
1718           be used as a positional replacement string (like {2.}).
1719
1720           It is recommended to only change $_ but you have full access to all
1721           of GNU parallel's internal functions and data structures.
1722
1723           Here are a few examples:
1724
1725             Is the job sequence even or odd?
1726             --rpl '{odd} $_ = seq() % 2 ? "odd" : "even"'
1727             Pad job sequence with leading zeros to get equal width
1728             --rpl '{0#} $f=1+int("".(log(total_jobs())/log(10)));
1729               $_=sprintf("%0${f}d",seq())'
1730             Job sequence counting from 0
1731             --rpl '{#0} $_ = seq() - 1'
1732             Job slot counting from 2
1733             --rpl '{%1} $_ = slot() + 1'
1734             Remove all extensions
1735             --rpl '{:} s:(\.[^/]+)*$::'
1736
1737           You can have dynamic replacement strings by including parentheses
1738           in the replacement string and adding a regular expression between
1739           the parentheses. The matching string will be inserted as $$1:
1740
1741             parallel --rpl '{%(.*?)} s/$$1//' echo {%.tar.gz} ::: my.tar.gz
1742             parallel --rpl '{:%(.+?)} s:$$1(\.[^/]+)*$::' \
1743               echo {:%_file} ::: my_file.tar.gz
1744             parallel -n3 --rpl '{/:%(.*?)} s:.*/(.*)$$1(\.[^/]+)*$:$1:' \
1745               echo job {#}: {2} {2.} {3/:%_1} ::: a/b.c c/d.e f/g_1.h.i
1746
1747           You can even use multiple matches:
1748
1749             parallel --rpl '{/(.+?)/(.*?)} s/$$1/$$2/;' \
1750               echo {/replacethis/withthis} {/b/C} ::: a_replacethis_b
1751
1752             parallel --rpl '{(.*?)/(.*?)} $_="$$2$_$$1"' \
1753               echo {swap/these} ::: -middle-
1754
1755           See also: {= perl expression =}, --parens.
1756
1757       --rsync-opts options
1758           Options to pass on to rsync. Setting --rsync-opts takes precedence
1759           over setting the environment variable $PARALLEL_RSYNC_OPTS.
1760
1761       --max-chars=max-chars
1762       -s max-chars
1763           Use at most max-chars characters per command line, including the
1764           command and initial-arguments and the terminating nulls at the ends
1765           of the argument strings.  The largest allowed value is system-
1766           dependent, and is calculated as the argument length limit for exec,
1767           less the size of your environment.  The default value is the
1768           maximum.
1769
1770           Implies -X unless -m is set.
1771
1772       --show-limits
1773           Display the limits on the command-line length which are imposed by
1774           the operating system and the -s option.  Pipe the input from
1775           /dev/null (and perhaps specify --no-run-if-empty) if you don't want
1776           GNU parallel to do anything.
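
           For example, this prints the limits without running any
           job (the exact wording of the output varies between
           versions and systems):

```shell
# Show the command-line length limits; /dev/null supplies no
# arguments and --no-run-if-empty prevents running echo at all.
parallel --show-limits --no-run-if-empty echo < /dev/null
```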
1777
1778       --semaphore
1779           Work as a counting semaphore. --semaphore will cause GNU parallel
1780           to start command in the background. When the number of jobs given
1781           by --jobs is reached, GNU parallel will wait for one of these to
1782           complete before starting another command.
1783
1784           --semaphore implies --bg unless --fg is specified.
1785
1786           --semaphore implies --semaphorename `tty` unless --semaphorename is
1787           specified.
1788
1789           Used with --fg, --wait, and --semaphorename.
1790
1791           The command sem is an alias for parallel --semaphore.
1792
1793           See also man sem.
1794
1795       --semaphorename name
1796       --id name
1797           Use name as the name of the semaphore. Default is the name of the
1798           controlling tty (output from tty).
1799
1800           The default normally works as expected when used interactively,
1801           but when used in a script, name should be set. $$ or my_task_name
1802           is often a good value.
1803
1804           The semaphore is stored in ~/.parallel/semaphores/
1805
1806           Implies --semaphore.
1807
1808           See also man sem.
1809
1810       --semaphoretimeout secs
1811       --st secs
1812           If secs > 0: If the semaphore is not released within secs seconds,
1813           take it anyway.
1814
1815           If secs < 0: If the semaphore is not released within -secs
1816           seconds, exit.
1817
1818           Implies --semaphore.
1819
1820           See also man sem.
1821
1822       --seqreplace replace-str
1823           Use the replacement string replace-str instead of {#} for job
1824           sequence number.
1825
1826       --session
1827           Record names in current environment in $PARALLEL_IGNORED_NAMES and
1828           exit. Only used with env_parallel. Aliases, functions, and
1829           variables with names in $PARALLEL_IGNORED_NAMES will not be copied.
1830
1831           Only supported in Ash, Bash, Dash, Ksh, Sh, and Zsh.
1832
1833           See also --env, --record-env.
1834
1835       --shard shardexpr
1836           Use shardexpr as shard key and shard input to the jobs.
1837
1838           shardexpr is [column number|column name] [perlexpression] e.g. 3,
1839           Address, 3 $_%=100, Address s/\d//g.
1840
1841           Each input line is split using --colsep. The value of the column is
1842           put into $_, the perl expression is executed, the resulting value
1843           is hashed so that all lines with a given value are given to the same
1844           job slot.
1845
1846           This is similar to sharding in databases.
1847
1848           The performance is on the order of 100K rows per second. It is
1849           faster if the shardcol is small (<10), slower if it is big (>100).
1850
1851           --shard requires --pipe and a fixed numeric value for --jobs.
1852
1853           See also --bin, --group-by, --roundrobin.
1854
1855       --shebang
1856       --hashbang
1857           GNU parallel can be called as a shebang (#!) command as the first
1858           line of a script. The content of the file will be treated as the
1859           input source.
1860
1861           Like this:
1862
1863             #!/usr/bin/parallel --shebang -r wget
1864
1865             https://ftpmirror.gnu.org/parallel/parallel-20120822.tar.bz2
1866             https://ftpmirror.gnu.org/parallel/parallel-20130822.tar.bz2
1867             https://ftpmirror.gnu.org/parallel/parallel-20140822.tar.bz2
1868
1869           --shebang must be set as the first option.
1870
1871           On FreeBSD env is needed:
1872
1873             #!/usr/bin/env -S parallel --shebang -r wget
1874
1875             https://ftpmirror.gnu.org/parallel/parallel-20120822.tar.bz2
1876             https://ftpmirror.gnu.org/parallel/parallel-20130822.tar.bz2
1877             https://ftpmirror.gnu.org/parallel/parallel-20140822.tar.bz2
1878
1879           There are many limitations of shebang (#!) depending on your
1880           operating system. See details on
1881           http://www.in-ulm.de/~mascheck/various/shebang/
1882
1883       --shebang-wrap
1884           GNU parallel can parallelize scripts by wrapping the shebang line.
1885           If the program can be run like this:
1886
1887             cat arguments | parallel the_program
1888
1889           then the script can be changed to:
1890
1891             #!/usr/bin/parallel --shebang-wrap /original/parser --options
1892
1893           E.g.
1894
1895             #!/usr/bin/parallel --shebang-wrap /usr/bin/python
1896
1897           If the program can be run like this:
1898
1899             cat data | parallel --pipe the_program
1900
1901           then the script can be changed to:
1902
1903             #!/usr/bin/parallel --shebang-wrap --pipe /orig/parser --opts
1904
1905           E.g.
1906
1907             #!/usr/bin/parallel --shebang-wrap --pipe /usr/bin/perl -w
1908
1909           --shebang-wrap must be set as the first option.
1910
1911       --shellquote
1912           Does not run the command but quotes it. Useful for making quoted
1913           composed commands for GNU parallel.
1914
1915           Multiple --shellquote will quote the string multiple times, so
1916           parallel --shellquote | parallel --shellquote can be written as
1917           parallel --shellquote --shellquote.
1918
1919       --shuf
1920           Shuffle jobs. With multiple input sources it is hard to
1921           randomize jobs. --shuf will generate all jobs, and shuffle them
1922           before running them. This is useful to get a quick preview of the
1923           results before running the full batch.
1924
1925       --skip-first-line
1926           Do not use the first line of input (used by GNU parallel itself
1927           when called with --shebang).
1928
1929       --sql DBURL (obsolete)
1930           Use --sqlmaster instead.
1931
1932       --sqlmaster DBURL (beta testing)
1933           Submit jobs via SQL server. DBURL must point to a table, which will
1934           contain the same information as --joblog, the values from the input
1935           sources (stored in columns V1 .. Vn), and the output (stored in
1936           columns Stdout and Stderr).
1937
1938           If DBURL is prepended with '+' GNU parallel assumes the table is
1939           already made with the correct columns and appends the jobs to it.
1940
1941           If DBURL is not prepended with '+' the table will be dropped and
1942           created with the correct number of V-columns.
1943
1944           --sqlmaster does not run any jobs, but it creates the values for
1945           the jobs to be run. One or more --sqlworker must be run to actually
1946           execute the jobs.
1947
1948           If --wait is set, GNU parallel will wait for the jobs to complete.
1949
1950           The format of a DBURL is:
1951
1952             [sql:]vendor://[[user][:pwd]@][host][:port]/[db]/table
1953
1954           E.g.
1955
1956             sql:mysql://hr:hr@localhost:3306/hrdb/jobs
1957             mysql://scott:tiger@my.example.com/pardb/paralleljobs
1958             sql:oracle://scott:tiger@ora.example.com/xe/parjob
1959             postgresql://scott:tiger@pg.example.com/pgdb/parjob
1960             pg:///parjob
1961             sqlite3:///%2Ftmp%2Fpardb.sqlite/parjob
1962             csv:///%2Ftmp%2Fpardb/parjob
1963
1964           Notice how / in the path of sqlite and CSV must be encoded as
1965           %2F, except the last / in CSV, which must remain a /.
1966
1967           It can also be an alias from ~/.sql/aliases:
1968
1969             :myalias mysql:///mydb/paralleljobs
1970
1971       --sqlandworker DBURL (beta testing)
1972           Shorthand for: --sqlmaster DBURL --sqlworker DBURL.
1973
1974       --sqlworker DBURL (beta testing)
1975           Execute jobs via SQL server. Read the input sources variables from
1976           the table pointed to by DBURL. The command on the command line
1977           should be the same as given by --sqlmaster.
1978
1979           If you have more than one --sqlworker, jobs may be run more than
1980           once.
1981
1982           If --sqlworker runs on the local machine, the hostname in the SQL
1983           table will not be ':' but instead the hostname of the machine.
1984
1985       --ssh sshcommand
1986           GNU parallel defaults to using ssh for remote access. This can be
1987           overridden with --ssh. It can also be set on a per server basis
1988           (see --sshlogin).
1989
1990       --sshdelay secs
1991           Delay starting next ssh by secs seconds. GNU parallel will pause
1992           secs seconds after starting each ssh. secs can be less than 1
1993           second.
1994
1995       -S
1996       [@hostgroups/][ncpus/]sshlogin[,[@hostgroups/][ncpus/]sshlogin[,...]]
1997       -S @hostgroup
1998       --sshlogin
1999       [@hostgroups/][ncpus/]sshlogin[,[@hostgroups/][ncpus/]sshlogin[,...]]
2000       --sshlogin @hostgroup
2001           Distribute jobs to remote computers. The jobs will be run on a list
2002           of remote computers.
2003
2004           If hostgroups is given, the sshlogin will be added to that
2005           hostgroup. Multiple hostgroups are separated by '+'. The sshlogin
2006           will always be added to a hostgroup named the same as sshlogin.
2007
2008           If only the @hostgroup is given, only the sshlogins in that
2009           hostgroup will be used. Multiple @hostgroup can be given.
2010
2011           GNU parallel will determine the number of CPUs on the remote
2012           computers and run the number of jobs as specified by -j.  If the
2013           number ncpus is given GNU parallel will use this number for number
2014           of CPUs on the host. Normally ncpus will not be needed.
2015
2016           An sshlogin is of the form:
2017
2018             [sshcommand [options]] [username@]hostname
2019
2020           The sshlogin must not require a password (ssh-agent, ssh-copy-id,
2021           and sshpass may help with that).
2022
2023           The sshlogin ':' is special: it means 'no ssh' and will therefore
2024           run on the local computer.
2025
2026           The sshlogin '..' is special: it reads sshlogins from
2027           ~/.parallel/sshloginfile or $XDG_CONFIG_HOME/parallel/sshloginfile.
2028
2029           The sshlogin '-' is special, too: it reads sshlogins from stdin
2030           (standard input).
2031
2032           To specify more sshlogins separate the sshlogins by comma, newline
2033           (in the same string), or repeat the option multiple times.
2034
2035           For examples: see --sshloginfile.
2036
2037           The remote host must have GNU parallel installed.
2038
2039           --sshlogin is known to cause problems with -m and -X.
2040
2041           --sshlogin is often used with --transferfile, --return, --cleanup,
2042           and --trc.
2043
2044       --sshloginfile filename
2045       --slf filename
2046           File with sshlogins. The file consists of sshlogins on separate
2047           lines. Empty lines and lines starting with '#' are ignored.
2048           Example:
2049
2050             server.example.com
2051             username@server2.example.com
2052             8/my-8-cpu-server.example.com
2053             2/my_other_username@my-dualcore.example.net
2054             # This server has SSH running on port 2222
2055             ssh -p 2222 server.example.net
2056             4/ssh -p 2222 quadserver.example.net
2057             # Use a different ssh program
2058             myssh -p 2222 -l myusername hexacpu.example.net
2059             # Use a different ssh program with default number of CPUs
2060             //usr/local/bin/myssh -p 2222 -l myusername hexacpu
2061             # Use a different ssh program with 6 CPUs
2062             6//usr/local/bin/myssh -p 2222 -l myusername hexacpu
2063             # Assume 16 CPUs on the local computer
2064             16/:
2065             # Put server1 in hostgroup1
2066             @hostgroup1/server1
2067             # Put myusername@server2 in hostgroup1+hostgroup2
2068             @hostgroup1+hostgroup2/myusername@server2
2069             # Force 4 CPUs and put 'ssh -p 2222 server3' in hostgroup1
2070             @hostgroup1/4/ssh -p 2222 server3
2071
2072           When using a different ssh program the last argument must be the
2073           hostname.
2074
2075           Multiple --sshloginfile are allowed.
2076
2077           GNU parallel will first look for the file in the current dir; if
2078           that fails it looks for the file in ~/.parallel.
2079
2080           The sshloginfile '..' is special: it reads sshlogins from
2081           ~/.parallel/sshloginfile.
2082
2083           The sshloginfile '.' is special: it reads sshlogins from
2084           /etc/parallel/sshloginfile.
2085
2086           The sshloginfile '-' is special, too: it reads sshlogins from
2087           stdin (standard input).
2088
2089           If the sshloginfile is changed it will be re-read when a job
2090           finishes, though at most once per second. This makes it possible to
2091           add and remove hosts while running.
2092
2093           This can be used to have a daemon that updates the sshloginfile to
2094           only contain servers that are up:
2095
2096               cp original.slf tmp2.slf
2097               while [ 1 ] ; do
2098                 nice parallel --nonall -j0 -k --slf original.slf \
2099                   --tag echo | perl -pe 's/\t$//' > tmp.slf
2100                 if ! diff -q tmp.slf tmp2.slf >/dev/null; then
2101                   mv tmp.slf tmp2.slf
2102                 fi
2103                 sleep 10
2104               done &
2105               parallel --slf tmp2.slf ...
2106
2107       --slotreplace replace-str
2108           Use the replacement string replace-str instead of {%} for job slot
2109           number.
2110
2111       --silent
2112           Silent.  The job to be run will not be printed. This is the
2113           default.  Can be reversed with -v.
2114
2115       --tty
2116           Open terminal tty. If GNU parallel is used for starting a program
2117           that accesses the tty (such as an interactive program) then this
2118           option may be needed. It will default to starting only one job at a
2119           time (i.e. -j1), not buffer the output (i.e. -u), and it will open
2120           a tty for the job.
2121
2122           You can of course override -j1 and -u.
2123
2124           Using --tty unfortunately means that GNU parallel cannot kill the
2125           jobs (with --timeout, --memfree, or --halt). This is due to GNU
2126           parallel giving each child its own process group, which is then
2127           killed. Process groups are dependent on the tty.
2128
2129       --tag
2130           Tag lines with arguments. Each output line will be prepended with
2131           the arguments and TAB (\t). When combined with --onall or --nonall
2132           the lines will be prepended with the sshlogin instead.
2133
2134           --tag is ignored when using -u.
2135
2136       --tagstring str
2137           Tag lines with a string. Each output line will be prepended with
2138           str and TAB (\t). str can contain replacement strings such as {}.
2139
2140           --tagstring is ignored when using -u, --onall, and --nonall.
2141
2142       --tee
2143           Pipe all data to all jobs. Used with --pipe/--pipepart and :::.
2144
2145             seq 1000 | parallel --pipe --tee -v wc {} ::: -w -l -c
2146
2147           How many numbers in 1..1000 contain 0..9, and how many bytes do
2148           they fill:
2149
2150             seq 1000 | parallel --pipe --tee --tag \
2151               'grep {1} | wc {2}' ::: {0..9} ::: -l -c
2152
2153           How many words contain a..z and how many bytes do they fill?
2154
2155             parallel -a /usr/share/dict/words --pipepart --tee --tag \
2156               'grep {1} | wc {2}' ::: {a..z} ::: -l -c
2157
2158       --termseq sequence
2159           Termination sequence. When a job is killed due to --timeout,
2160           --memfree, --halt, or abnormal termination of GNU parallel,
2161           sequence determines how the job is killed. The default is:
2162
2163               TERM,200,TERM,100,TERM,50,KILL,25
2164
2165           which sends a TERM signal, waits 200 ms, sends another TERM signal,
2166           waits 100 ms, sends another TERM signal, waits 50 ms, sends a KILL
2167           signal, waits 25 ms, and exits. GNU parallel detects if a process
2168           dies before the waiting time is up.
2169
2170       --tmpdir dirname
2171           Directory for temporary files. GNU parallel normally buffers output
2172           into temporary files in /tmp. By setting --tmpdir you can use a
2173           different dir for the files. Setting --tmpdir is equivalent to
2174           setting $TMPDIR.
2175
2176       --tmux (Long beta testing)
2177           Use tmux for output. Start a tmux session and run each job in a
2178           window in that session. No other output will be produced.
2179
2180       --tmuxpane (Long beta testing)
2181           Use tmux for output but put output into panes in the first window.
2182           Useful if you want to monitor the progress of fewer than 100
2183           concurrent jobs.
2184
2185       --timeout duration
2186           Time out for command. If the command runs for longer than duration
2187           seconds it will get killed as per --termseq.
2188
2189           If duration is followed by a % then the timeout will dynamically
2190           be computed as a percentage of the median runtime of successful
2191           jobs. Only values > 100% make sense.
2192
2193           duration is normally in seconds, but can be floats postfixed with
2194           s, m, h, or d which would multiply the float by 1, 60, 3600, or
2195           86400. Thus these are equivalent: --timeout 100000 and --timeout
2196           1d3.5h16.6m4s.
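Two sketches (the job lengths are arbitrary). The first uses a unit postfix, the second the dynamic percentage form; the leading ! only acknowledges that parallel exits non-zero when a job is killed:

```shell
# 0.05m = 3 seconds; sleep 10 exceeds it and is killed
! parallel --timeout 0.05m sleep 10 ::: 1

# Kill any job running longer than 200% of the median runtime of the
# successful jobs (-j4 so all four jobs start together)
! parallel -j4 --timeout 200% sleep ::: 1 1 1 8
```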
2197
2198       --verbose
2199       -t  Print the job to be run on stderr (standard error).
2200
2201           See also -v, -p.
2202
2203       --transfer
2204           Transfer files to remote computers. Shorthand for: --transferfile
2205           {}.
2206
2207       --transferfile filename
2208       --tf filename
2209           --transferfile is used with --sshlogin to transfer files to the
2210           remote computers. The files will be transferred using rsync and
2211           will be put relative to the default work dir. If the path contains
2212           /./ the remaining path will be relative to the work dir. E.g.
2213
2214             echo foo/bar.txt | parallel --transferfile {} \
2215               --sshlogin server.example.com wc
2216
2217           This will transfer the file foo/bar.txt to the computer
2218           server.example.com to the file $HOME/foo/bar.txt before running wc
2219           foo/bar.txt on server.example.com.
2220
2221             echo /tmp/foo/bar.txt | parallel --transferfile {} \
2222               --sshlogin server.example.com wc
2223
2224           This will transfer the file /tmp/foo/bar.txt to the computer
2225           server.example.com to the file /tmp/foo/bar.txt before running wc
2226           /tmp/foo/bar.txt on server.example.com.
2227
2228             echo /tmp/./foo/bar.txt | parallel --transferfile {} \
2229               --sshlogin server.example.com wc {= s:.*/./:./: =}
2230
2231           This will transfer the file /tmp/foo/bar.txt to the computer
2232           server.example.com to the file foo/bar.txt before running wc
2233           ./foo/bar.txt on server.example.com.
2234
2235           --transferfile is often used with --return and --cleanup. A
2236           shorthand for --transferfile {} is --transfer.
2237
2238           --transferfile is ignored when used with --sshlogin : or when not
2239           used with --sshlogin.
2240
2241       --trc filename
2242           Transfer, Return, Cleanup. Shorthand for:
2243
2244           --transferfile {} --return filename --cleanup
2245
2246       --trim <n|l|r|lr|rl>
2247           Trim white space in input.
2248
2249           n   No trim. Input is not modified. This is the default.
2250
2251           l   Left trim. Remove white space from start of input. E.g. " a bc
2252               " -> "a bc ".
2253
2254           r   Right trim. Remove white space from end of input. E.g. " a bc "
2255               -> " a bc".
2256
2257           lr
2258           rl  Both trim. Remove white space from both start and end of input.
2259               E.g. " a bc " -> "a bc". This is the default if --colsep is
2260               used.
2261
2262       --ungroup
2263       -u  Ungroup output.  Output is printed as soon as possible and bypasses
2264           GNU parallel's internal processing. This may cause output from
2265           different commands to be mixed, so it should only be used if you
2266           do not care about the output. Compare these:
2267
2268             seq 4 | parallel -j0 \
2269               'sleep {};echo -n start{};sleep {};echo {}end'
2270             seq 4 | parallel -u -j0 \
2271               'sleep {};echo -n start{};sleep {};echo {}end'
2272
2273           It also disables --tag. GNU parallel outputs faster with -u.
2274           Compare the speeds of these:
2275
2276             parallel seq ::: 300000000 >/dev/null
2277             parallel -u seq ::: 300000000 >/dev/null
2278             parallel --line-buffer seq ::: 300000000 >/dev/null
2279
2280           Can be reversed with --group.
2281
2282           See also: --line-buffer --group
2283
2284       --extensionreplace replace-str
2285       --er replace-str
2286           Use the replacement string replace-str instead of {.} for the
2287           input line without its extension.
2288
2289       --use-sockets-instead-of-threads
2290       --use-cores-instead-of-threads
2291       --use-cpus-instead-of-cores (obsolete)
2292           Determine how GNU parallel counts the number of CPUs. GNU parallel
2293           uses this number when the number of jobslots is computed relative
2294           to the number of CPUs (e.g. 100% or +1).
2295
2296           CPUs can be counted in three different ways:
2297
2298           sockets The number of filled CPU sockets (i.e. the number of
2299                   physical chips).
2300
2301           cores   The number of physical cores (i.e. the number of physical
2302                   compute cores).
2303
2304           threads The number of hyperthreaded cores (i.e. the number of
2305                   virtual cores - with some of them possibly being
2306                   hyperthreaded).
2307
2308           Normally the number of CPUs is computed as the number of CPU
2309           threads. With --use-sockets-instead-of-threads or
2310           --use-cores-instead-of-threads you can force it to be computed as
2311           the number of filled sockets or number of cores instead.
2312
2313           Most users will not need these options.
2314
2315           --use-cpus-instead-of-cores is a (misleading) alias for
2316           --use-sockets-instead-of-threads and is kept for backwards
2317           compatibility.
2318
2319       -v  Verbose.  Print the job to be run on stdout (standard output). Can
2320           be reversed with --silent. See also -t.
2321
2322           Use -v -v to print the wrapping ssh command when running remotely.
2323
2324       --version
2325       -V  Print the version of GNU parallel and exit.
2326
2327       --workdir mydir
2328       --wd mydir
2329           Jobs will be run in the dir mydir.
2330
2331           Files transferred using --transferfile and --return will be
2332           relative to mydir on remote computers.
2333
2334           The special mydir value ... will create working dirs under
2335           ~/.parallel/tmp/. If --cleanup is given these dirs will be removed.
2336
2337           The special mydir value . uses the current working dir.  If the
2338           current working dir is beneath your home dir, the value . is
2339           treated as the relative path to your home dir. This means that if
2340           your home dir is different on remote computers (e.g. if your login
2341           is different) the relative path will still be relative to your home
2342           dir.
2343
2344           To see the difference try:
2345
2346             parallel -S server pwd ::: ""
2347             parallel --wd . -S server pwd ::: ""
2348             parallel --wd ... -S server pwd ::: ""
2349
2350           mydir can contain GNU parallel's replacement strings.
2351
2352       --wait
2353           Wait for all commands to complete.
2354
2355           Used with --semaphore or --sqlmaster.
2356
2357           See also man sem.
2358
2359       -X  Multiple arguments with context replace. Insert as many arguments
2360           as the command line length permits. If multiple jobs are being run
2361           in parallel: distribute the arguments evenly among the jobs. Use
2362           -j1 to avoid this.
2363
2364           If {} is not used the arguments will be appended to the line.  If
2365           {} is used as part of a word (like pic{}.jpg) then the whole word
2366           will be repeated. If {} is used multiple times each {} will be
2367           replaced with the arguments.
2368
2369           Normally -X will do the right thing, whereas -m can give unexpected
2370           results if {} is used as part of a word.
2371
2372           Support for -X with --sshlogin is limited and may fail.
2373
2374           See also -m.
2375
2376       --exit
2377       -x  Exit if the size (see the -s option) is exceeded.
2378
2379       --xargs
2380           Multiple arguments. Insert as many arguments as the command line
2381           length permits.
2382
2383           If {} is not used the arguments will be appended to the line.  If
2384           {} is used multiple times each {} will be replaced with all the
2385           arguments.
2386
2387           Support for --xargs with --sshlogin is limited and may fail.
2388
2389           See also -X for context replace. If in doubt use -X as that will
2390           most likely do what is needed.
2391

EXAMPLE: Working as xargs -n1. Argument appending

2393       GNU parallel can work similarly to xargs -n1.
2394
2395       To compress all html files using gzip run:
2396
2397         find . -name '*.html' | parallel gzip --best
2398
2399       If the file names may contain a newline use -0. Substitute FOO BAR with
2400       FUBAR in all files in this dir and subdirs:
2401
2402         find . -type f -print0 | \
2403           parallel -q0 perl -i -pe 's/FOO BAR/FUBAR/g'
2404
2405       Note -q is needed because of the space in 'FOO BAR'.
2406

EXAMPLE: Simple network scanner

2408       prips can generate IP-addresses from CIDR notation. With GNU parallel
2409       you can build a simple network scanner to see which addresses respond
2410       to ping:
2411
2412         prips 130.229.16.0/20 | \
2413           parallel --timeout 2 -j0 \
2414             'ping -c 1 {} >/dev/null && echo {}' 2>/dev/null
2415

EXAMPLE: Reading arguments from command line

2417       GNU parallel can take the arguments from the command line instead of stdin
2418       (standard input). To compress all html files in the current dir using
2419       gzip run:
2420
2421         parallel gzip --best ::: *.html
2422
2423       To convert *.wav to *.mp3 using LAME running one process per CPU run:
2424
2425         parallel lame {} -o {.}.mp3 ::: *.wav
2426

EXAMPLE: Inserting multiple arguments

2428       When moving a lot of files like this: mv *.log destdir you will
2429       sometimes get the error:
2430
2431         bash: /bin/mv: Argument list too long
2432
2433       because there are too many files. You can instead do:
2434
2435         ls | grep -E '\.log$' | parallel mv {} destdir
2436
2437       This will run mv for each file. It can be done faster if mv gets as
2438       many arguments as will fit on the line:
2439
2440         ls | grep -E '\.log$' | parallel -m mv {} destdir
2441
2442       In many shells you can also use printf:
2443
2444         printf '%s\0' *.log | parallel -0 -m mv {} destdir
2445

EXAMPLE: Context replace

2447       To remove the files pict0000.jpg .. pict9999.jpg you could do:
2448
2449         seq -w 0 9999 | parallel rm pict{}.jpg
2450
2451       You could also do:
2452
2453         seq -w 0 9999 | perl -pe 's/(.*)/pict$1.jpg/' | parallel -m rm
2454
2455       The first will run rm 10000 times, while the last will only run rm
2456       as many times as needed to keep the command line length short enough
2457       to avoid Argument list too long (it typically runs 1-2 times).
2458
2459       You could also run:
2460
2461         seq -w 0 9999 | parallel -X rm pict{}.jpg
2462
2463       This will also only run rm as many times as needed to keep the
2464       command line length short enough.
2465

EXAMPLE: Compute intensive jobs and substitution

2467       If ImageMagick is installed this will generate a thumbnail of a jpg
2468       file:
2469
2470         convert -geometry 120 foo.jpg thumb_foo.jpg
2471
2472       This will run with number-of-cpus jobs in parallel for all jpg files in
2473       a directory:
2474
2475         ls *.jpg | parallel convert -geometry 120 {} thumb_{}
2476
2477       To do it recursively use find:
2478
2479         find . -name '*.jpg' | \
2480           parallel convert -geometry 120 {} {}_thumb.jpg
2481
2482       Notice how the argument has to start with {} as {} will include path
2483       (e.g. running convert -geometry 120 ./foo/bar.jpg thumb_./foo/bar.jpg
2484       would clearly be wrong). The command will generate files like
2485       ./foo/bar.jpg_thumb.jpg.
2486
2487       Use {.} to avoid the extra .jpg in the file name. This command will
2488       make files like ./foo/bar_thumb.jpg:
2489
2490         find . -name '*.jpg' | \
2491           parallel convert -geometry 120 {} {.}_thumb.jpg
2492

EXAMPLE: Substitution and redirection

2494       This will generate an uncompressed version of .gz-files next to the
2495       .gz-file:
2496
2497         parallel zcat {} ">"{.} ::: *.gz
2498
2499       Quoting of > is necessary to postpone the redirection. Another solution
2500       is to quote the whole command:
2501
2502         parallel "zcat {} >{.}" ::: *.gz
2503
2504       Other special shell characters (such as * ; $ > < | >> <<) also need to
2505       be put in quotes, as they may otherwise be interpreted by the shell and
2506       not given to GNU parallel.
2507

EXAMPLE: Composed commands

2509       A job can consist of several commands. This will print the number of
2510       files in each directory:
2511
2512         ls | parallel 'echo -n {}" "; ls {}|wc -l'
2513
2514       To put the output in a file called <name>.dir:
2515
2516         ls | parallel '(echo -n {}" "; ls {}|wc -l) >{}.dir'
2517
2518       Even small shell scripts can be run by GNU parallel:
2519
2520         find . | parallel 'a={}; name=${a##*/};' \
2521           'upper=$(echo "$name" | tr "[:lower:]" "[:upper:]");'\
2522           'echo "$name - $upper"'
2523
2524         ls | parallel 'mv {} "$(echo {} | tr "[:upper:]" "[:lower:]")"'
2525
2526       Given a list of URLs, list all URLs that fail to download. Print the
2527       line number and the URL.
2528
2529         cat urlfile | parallel "wget {} 2>/dev/null || grep -n {} urlfile"
2530
2531       Create a mirror directory with the same filenames except all files and
2532       symlinks are empty files.
2533
2534         cp -rs /the/source/dir mirror_dir
2535         find mirror_dir -type l | parallel -m rm {} '&&' touch {}
2536
2537       Find the files in a list that do not exist
2538
2539         cat file_list | parallel 'if [ ! -e {} ] ; then echo {}; fi'
2540

EXAMPLE: Composed command with perl replacement string

2542       You have a bunch of files. You want them sorted into dirs. The dir of
2543       each file should be named the first letter of the file name.
2544
2545         parallel 'mkdir -p {=s/(.).*/$1/=}; mv {} {=s/(.).*/$1/=}' ::: *
2546

EXAMPLE: Composed command with multiple input sources

2548       You have a dir with files named as 24 hours in 5 minute intervals:
2549       00:00, 00:05, 00:10 .. 23:55. You want to find the files missing:
2550
2551         parallel [ -f {1}:{2} ] "||" echo {1}:{2} does not exist \
2552           ::: {00..23} ::: {00..55..5}
2553

EXAMPLE: Calling Bash functions

2555       If the composed command is longer than a line, it becomes hard to read.
2556       In Bash you can use functions. Just remember to export -f the function.
2557
2558         doit() {
2559           echo Doing it for $1
2560           sleep 2
2561           echo Done with $1
2562         }
2563         export -f doit
2564         parallel doit ::: 1 2 3
2565
2566         doubleit() {
2567           echo Doing it for $1 $2
2568           sleep 2
2569           echo Done with $1 $2
2570         }
2571         export -f doubleit
2572         parallel doubleit ::: 1 2 3 ::: a b
2573
2574       To do this on remote servers you need to transfer the function using
2575       --env:
2576
2577         parallel --env doit -S server doit ::: 1 2 3
2578         parallel --env doubleit -S server doubleit ::: 1 2 3 ::: a b
2579
2580       If your environment (aliases, variables, and functions) is small you
2581       can copy the full environment without having to export -f anything. See
2582       env_parallel.
2583

EXAMPLE: Function tester

2585       To test a program with different parameters:
2586
2587         tester() {
2588           if (eval "$@") >&/dev/null; then
2589             perl -e 'printf "\033[30;102m[ OK ]\033[0m @ARGV\n"' "$@"
2590           else
2591             perl -e 'printf "\033[30;101m[FAIL]\033[0m @ARGV\n"' "$@"
2592           fi
2593         }
2594         export -f tester
2595         parallel tester my_program ::: arg1 arg2
2596         parallel tester exit ::: 1 0 2 0
2597
2598       If my_program fails a red FAIL will be printed followed by the failing
2599       command; otherwise a green OK will be printed followed by the command.
2600

EXAMPLE: Continuously show the latest line of output

2602       It can be useful to monitor the output of running jobs.
2603
2604       This shows the most recent output line until a job finishes, after
2605       which the output of the job is printed in full:
2606
2607         parallel '{} | tee >(cat >&3)' ::: 'command 1' 'command 2' \
2608           3> >(perl -ne '$|=1;chomp;printf"%.'$COLUMNS's\r",$_." "x100')
2609

EXAMPLE: Log rotate

2611       Log rotation renames a logfile to an extension with a higher number:
2612       log.1 becomes log.2, log.2 becomes log.3, and so on. The oldest log is
2613       removed. To avoid overwriting files the process starts backwards from
2614       the high number to the low number.  This will keep 10 old versions of
2615       the log:
2616
2617         seq 9 -1 1 | parallel -j1 mv log.{} log.'{= $_++ =}'
2618         mv log log.1
2619

EXAMPLE: Removing file extension when processing files

2621       When processing files removing the file extension using {.} is often
2622       useful.
2623
2624       Create a directory for each zip-file and unzip it in that dir:
2625
2626         parallel 'mkdir {.}; cd {.}; unzip ../{}' ::: *.zip
2627
2628       Recompress all .gz files in current directory using bzip2 running 1 job
2629       per CPU in parallel:
2630
2631         parallel "zcat {} | bzip2 >{.}.bz2 && rm {}" ::: *.gz
2632
2633       Convert all WAV files to MP3 using LAME:
2634
2635         find sounddir -type f -name '*.wav' | parallel lame {} -o {.}.mp3
2636
2637       Put all converted files in the same directory:
2638
2639         find sounddir -type f -name '*.wav' | \
2640           parallel lame {} -o mydir/{/.}.mp3
2641

EXAMPLE: Removing strings from the argument

2643       If you have a directory with tar.gz files and want these extracted
2644       in the corresponding dir (e.g. foo.tar.gz will be extracted in the
2645       dir foo) you can do:
2646
2647         parallel --plus 'mkdir {..}; tar -C {..} -xf {}' ::: *.tar.gz
2648
2649       If you want to remove a different ending, you can use {%string}:
2650
2651         parallel --plus echo {%_demo} ::: mycode_demo keep_demo_here
2652
2653       You can also remove a starting string with {#string}:
2654
2655         parallel --plus echo {#demo_} ::: demo_mycode keep_demo_here
2656
2657       To remove a string anywhere you can use regular expressions with
2658       {/regexp/replacement} and leave the replacement empty:
2659
2660         parallel --plus echo {/demo_/} ::: demo_mycode remove_demo_here
2661

EXAMPLE: Download 24 images for each of the past 30 days

2663       Let us assume a website stores images like:
2664
2665         http://www.example.com/path/to/YYYYMMDD_##.jpg
2666
2667       where YYYYMMDD is the date and ## is the number 01-24. This will
2668       download images for the past 30 days:
2669
2670         getit() {
2671           date=$(date -d "today -$1 days" +%Y%m%d)
2672           num=$2
2673           echo wget http://www.example.com/path/to/${date}_${num}.jpg
2674         }
2675         export -f getit
2676
2677         parallel getit ::: $(seq 30) ::: $(seq -w 24)
2678
2679       $(date -d "today -$1 days" +%Y%m%d) will give the dates in YYYYMMDD
2680       with $1 days subtracted.
2681

EXAMPLE: Download world map from NASA

2683       NASA provides tiles to download on earthdata.nasa.gov. Download tiles
2684       for Blue Marble world map and create a 10240x20480 map.
2685
2686         base=https://map1a.vis.earthdata.nasa.gov/wmts-geo/wmts.cgi
2687         service="SERVICE=WMTS&REQUEST=GetTile&VERSION=1.0.0"
2688         layer="LAYER=BlueMarble_ShadedRelief_Bathymetry"
2689         set="STYLE=&TILEMATRIXSET=EPSG4326_500m&TILEMATRIX=5"
2690         tile="TILEROW={1}&TILECOL={2}"
2691         format="FORMAT=image%2Fjpeg"
2692         url="$base?$service&$layer&$set&$tile&$format"
2693
2694         parallel -j0 -q wget "$url" -O {1}_{2}.jpg ::: {0..19} ::: {0..39}
2695         parallel eval convert +append {}_{0..39}.jpg line{}.jpg ::: {0..19}
2696         convert -append line{0..19}.jpg world.jpg
2697

EXAMPLE: Download Apollo-11 images from NASA using jq

2699       Search NASA using their API to get JSON for images related to
2700       'apollo 11' that have 'moon landing' in the description.
2701
2702       The search query returns JSON containing URLs to JSON containing
2703       collections of pictures. One of the pictures in each of these
2704       collections is large.
2705
2706       wget is used to get the JSON for the search query. jq is then used to
2707       extract the URLs of the collections. parallel then calls wget to get
2708       each collection, which is passed to jq to extract the URLs of all
2709       images. grep filters out the large images, and parallel finally uses
2710       wget to fetch the images.
2711
2712         base="https://images-api.nasa.gov/search"
2713         q="q=apollo 11"
2714         description="description=moon landing"
2715         media_type="media_type=image"
2716         wget -O - "$base?$q&$description&$media_type" |
2717           jq -r .collection.items[].href |
2718           parallel wget -O - |
2719           jq -r .[] |
2720           grep large |
2721           parallel wget
2722

EXAMPLE: Download video playlist in parallel

2724       youtube-dl is an excellent tool to download videos. It cannot,
2725       however, download videos in parallel. This takes a playlist and
2726       downloads 10 videos in parallel.
2727
2728         url='youtu.be/watch?v=0wOf2Fgi3DE&list=UU_cznB5YZZmvAmeq7Y3EriQ'
2729         export url
2730         youtube-dl --flat-playlist "https://$url" |
2731           parallel --tagstring {#} --lb -j10 \
2732             youtube-dl --playlist-start {#} --playlist-end {#} '"https://$url"'
2733

EXAMPLE: Prepend last modified date (ISO8601) to file name

2735         parallel mv {} '{= $a=pQ($_); $b=$_;' \
2736           '$_=qx{date -r "$a" +%FT%T}; chomp; $_="$_ $b" =}' ::: *
2737
2738       {= and =} mark a perl expression. pQ perl-quotes the string. date
2739       +%FT%T is the date in ISO8601 with time.
2740

EXAMPLE: Save output in ISO8601 dirs

2742       Save output from ps aux every second into dirs named
2743       yyyy-mm-ddThh:mm:ss+zz:zz.
2744
2745         seq 1000 | parallel -N0 -j1 --delay 1 \
2746           --results '{= $_=`date -Isec`; chomp=}/' ps aux
2747

EXAMPLE: Digital clock with "blinking" :

2749       The : in a digital clock blinks. To make every other line have a ':'
2750       and the rest a ' ' a perl expression is used to look at the 3rd input
2751       source. If the value modulo 2 is 1: Use ":" otherwise use " ":
2752
2753         parallel -k echo {1}'{=3 $_=$_%2?":":" "=}'{2}{3} \
2754           ::: {0..12} ::: {0..5} ::: {0..9}
2755

EXAMPLE: Aggregating content of files

2757       This:
2758
2759         parallel --header : echo x{X}y{Y}z{Z} \> x{X}y{Y}z{Z} \
2760         ::: X {1..5} ::: Y {01..10} ::: Z {1..5}
2761
2762       will generate the files x1y01z1 .. x5y10z5. If you want to aggregate
2763       the output grouping on x and z you can do this:
2764
2765         parallel eval 'cat {=s/y01/y*/=} > {=s/y01//=}' ::: *y01*
2766
2767       For all values of x and z it runs commands like:
2768
2769         cat x1y*z1 > x1z1
2770
2771       So you end up with x1z1 .. x5z5 each containing the content of all
2772       values of y.
2773

EXAMPLE: Breadth first parallel web crawler/mirrorer

2775       The script below will crawl and mirror a URL in parallel.  It
2776       downloads first pages that are 1 click down, then 2 clicks down, then
2777       3; instead of the normal depth first, where the first link on each
2778       page is fetched first.
2779
2780       Run like this:
2781
2782         PARALLEL=-j100 ./parallel-crawl http://gatt.org.yeslab.org/
2783
2784       Remove the wget part if you only want a web crawler.
2785
2786       It works by fetching a page from a list of URLs and looking for links
2787       in that page that are within the same starting URL and that have not
2788       already been seen. These links are added to a new queue. When all the
2789       pages from the list are done, the new queue is moved to the list of
2790       URLs and the process is started over until no unseen links are found.
2791
2792         #!/bin/bash
2793
2794         # E.g. http://gatt.org.yeslab.org/
2795         URL=$1
2796         # Stay inside the start dir
2797         BASEURL=$(echo $URL | perl -pe 's:#.*::; s:(//.*/)[^/]*:$1:')
2798         URLLIST=$(mktemp urllist.XXXX)
2799         URLLIST2=$(mktemp urllist.XXXX)
2800         SEEN=$(mktemp seen.XXXX)
2801
2802         # Spider to get the URLs
2803         echo $URL >$URLLIST
2804         cp $URLLIST $SEEN
2805
2806         while [ -s $URLLIST ] ; do
2807           cat $URLLIST |
2808             parallel lynx -listonly -image_links -dump {} \; \
2809               wget -qm -l1 -Q1 {} \; echo Spidered: {} \>\&2 |
2810               perl -ne 's/#.*//; s/\s+\d+.\s(\S+)$/$1/ and
2811                 do { $seen{$1}++ or print }' |
2812             grep -F $BASEURL |
2813             grep -v -x -F -f $SEEN | tee -a $SEEN > $URLLIST2
2814           mv $URLLIST2 $URLLIST
2815         done
2816
2817         rm -f $URLLIST $URLLIST2 $SEEN
2818

EXAMPLE: Process files from a tar file while unpacking

2820       If the files to be processed are in a tar file then unpacking one file
2821       and processing it immediately may be faster than first unpacking all
2822       files.
2823
2824         tar xvf foo.tgz | perl -ne 'print $l;$l=$_;END{print $l}' | \
2825           parallel echo
2826
2827       The Perl one-liner is needed to make sure the file is complete before
2828       handing it to GNU parallel.
2829

EXAMPLE: Rewriting a for-loop and a while-read-loop

2831       for-loops like this:
2832
2833         (for x in `cat list` ; do
2834           do_something $x
2835         done) | process_output
2836
2837       and while-read-loops like this:
2838
2839         cat list | (while read x ; do
2840           do_something $x
2841         done) | process_output
2842
2843       can be written like this:
2844
2845         cat list | parallel do_something | process_output
2846
2847       For example: Find which host name in a list has IP address 1.2.3.4:
2848
2849         cat hosts.txt | parallel -P 100 host | grep 1.2.3.4
2850
2851       If the processing requires more steps, a for-loop like this:
2852
2853         (for x in `cat list` ; do
2854           no_extension=${x%.*};
2855           do_step1 $x scale $no_extension.jpg
2856           do_step2 <$x $no_extension
2857         done) | process_output
2858
2859       and while-loops like this:
2860
2861         cat list | (while read x ; do
2862           no_extension=${x%.*};
2863           do_step1 $x scale $no_extension.jpg
2864           do_step2 <$x $no_extension
2865         done) | process_output
2866
2867       can be written like this:
2868
2869         cat list | parallel "do_step1 {} scale {.}.jpg ; do_step2 <{} {.}" |\
2870           process_output
2871
2872       If the body of the loop is bigger, it improves readability to use a
2873       function:
2874
2875         (for x in `cat list` ; do
2876           do_something $x
2877           [... 100 lines that do something with $x ...]
2878         done) | process_output
2879
2880         cat list | (while read x ; do
2881           do_something $x
2882           [... 100 lines that do something with $x ...]
2883         done) | process_output
2884
2885       can both be rewritten as:
2886
2887         doit() {
2888           x=$1
2889           do_something $x
2890           [... 100 lines that do something with $x ...]
2891         }
2892         export -f doit
2893         cat list | parallel doit
2894

EXAMPLE: Rewriting nested for-loops

2896       Nested for-loops like this:
2897
2898         (for x in `cat xlist` ; do
2899           for y in `cat ylist` ; do
2900             do_something $x $y
2901           done
2902         done) | process_output
2903
2904       can be written like this:
2905
2906         parallel do_something {1} {2} :::: xlist ylist | process_output
2907
2908       Nested for-loops like this:
2909
2910         (for colour in red green blue ; do
2911           for size in S M L XL XXL ; do
2912             echo $colour $size
2913           done
2914         done) | sort
2915
2916       can be written like this:
2917
2918         parallel echo {1} {2} ::: red green blue ::: S M L XL XXL | sort
2919

EXAMPLE: Finding the lowest difference between files

2921       diff is good for finding differences in text files. diff | wc -l gives
2922       an indication of the size of the difference. To find the differences
2923       between all files in the current dir do:
2924
2925         parallel --tag 'diff {1} {2} | wc -l' ::: * ::: * | sort -nk3
2926
2927       This way it is possible to see if some files are closer to other files.
2928

EXAMPLE: for-loops with column names

2930       When doing multiple nested for-loops it can be easier to keep track
2931       of the loop variable if it is named instead of just having a number.
2932       Use --header : to let the first argument be a named alias for the
2933       positional replacement string:
2934
2935         parallel --header : echo {colour} {size} \
2936           ::: colour red green blue ::: size S M L XL XXL
2937
2938       This also works if the input file is a file with columns:
2939
2940         cat addressbook.tsv | \
2941           parallel --colsep '\t' --header : echo {Name} {E-mail address}
2942

EXAMPLE: All combinations in a list

2944       GNU parallel makes all combinations when given two lists.
2945
2946       To make all combinations in a single list with unique values, you
2947       repeat the list and use replacement string {choose_k}:
2948
2949         parallel --plus echo {choose_k} ::: A B C D ::: A B C D
2950
2951         parallel --plus echo 2{2choose_k} 1{1choose_k} ::: A B C D ::: A B C D
2952
2953       {choose_k} works for any number of input sources:
2954
2955         parallel --plus echo {choose_k} ::: A B C D ::: A B C D ::: A B C D
2956

EXAMPLE: From a to b and b to c

2958       Assume you have input like:
2959
2960         aardvark
2961         babble
2962         cab
2963         dab
2964         each
2965
2966       and want to run combinations like:
2967
2968         aardvark babble
2969         babble cab
2970         cab dab
2971         dab each
2972
2973       If the input is in the file in.txt:
2974
2975         parallel echo {1} - {2} ::::+ <(head -n -1 in.txt) <(tail -n +2 in.txt)
2976
2977       If the input is in the array $a here are two solutions:
2978
2979         seq $((${#a[@]}-1)) | \
2980           env_parallel --env a echo '${a[{=$_--=}]} - ${a[{}]}'
2981         parallel echo {1} - {2} ::: "${a[@]::${#a[@]}-1}" :::+ "${a[@]:1}"
2982
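       The two array slices in the last command are easy to misread. A
       minimal sketch of what they expand to (bash syntax; no GNU parallel
       involved):

```shell
# "${a[@]::${#a[@]}-1}" expands to all elements but the last;
# "${a[@]:1}" expands to all elements but the first.
bash -c '
  a=(aardvark babble cab dab each)
  echo "${a[@]::${#a[@]}-1}"   # prints: aardvark babble cab dab
  echo "${a[@]:1}"             # prints: babble cab dab each
'
```

       Linked with :::+ these two slices pair element i with element i+1.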

EXAMPLE: Count the differences between all files in a dir

2984       Using --results, the results are saved in /tmp/diffcount*.
2985
2986         parallel --results /tmp/diffcount "diff -U 0 {1} {2} | \
2987           tail -n +3 |grep -v '^@'|wc -l" ::: * ::: *
2988
2989       To see the difference between file A and file B look at the file
2990       '/tmp/diffcount/1/A/2/B'.
2991

EXAMPLE: Speeding up fast jobs

2993       Starting a job on the local machine takes around 10 ms. This can be a
2994       big overhead if the job takes very few ms to run. Often you can group
2995       small jobs together using -X which will make the overhead less
2996       significant. Compare the speed of these:
2997
2998         seq -w 0 9999 | parallel touch pict{}.jpg
2999         seq -w 0 9999 | parallel -X touch pict{}.jpg
3000
3001       If your program cannot take multiple arguments, then you can use GNU
3002       parallel to spawn multiple GNU parallels:
3003
3004         seq -w 0 9999999 | \
3005           parallel -j10 -q -I,, --pipe parallel -j0 touch pict{}.jpg
3006
3007       If -j0 normally spawns 252 jobs, then the above will try to spawn 2520
3008       jobs. On a normal GNU/Linux system you can spawn 32000 jobs using this
3009       technique with no problems. To raise the 32000 jobs limit raise
3010       /proc/sys/kernel/pid_max to 4194303.
3011
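       A sketch of inspecting and raising that limit (requires root;
       kernel.pid_max is the standard sysctl on GNU/Linux):

```shell
# Show the current kernel-wide limit on process IDs:
cat /proc/sys/kernel/pid_max
# Raise it for the running system (as root):
sysctl -w kernel.pid_max=4194303
# To make it permanent, add "kernel.pid_max = 4194303" to /etc/sysctl.conf.
```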
3012       If you do not need GNU parallel to have control over each job (so no
3013       need for --retries or --joblog or similar), then it can be even faster
3014       if you can generate the command lines and pipe those to a shell. So if
3015       you can do this:
3016
3017         mygenerator | sh
3018
3019       Then that can be parallelized like this:
3020
3021         mygenerator | parallel --pipe --block 10M sh
3022
3023       E.g.
3024
3025         mygenerator() {
3026           seq 10000000 | perl -pe 'print "echo This is fast job number "';
3027         }
3028         mygenerator | parallel --pipe --block 10M sh
3029
3030       The overhead is 100000 times smaller, namely around 100 nanoseconds
3031       per job.
3032

EXAMPLE: Using shell variables

3034       When using shell variables you need to quote them correctly as they may
3035       otherwise be interpreted by the shell.
3036
3037       Notice the difference between:
3038
3039         ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)
3040         parallel echo ::: ${ARR[@]} # This is probably not what you want
3041
3042       and:
3043
3044         ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)
3045         parallel echo ::: "${ARR[@]}"
3046
3047       When using variables that contain special characters (e.g. space) in
3048       the actual command, you can quote them using '"$VAR"' or using "'s
3049       and -q:
3050
3051         VAR="My brother's 12\" records are worth <\$\$\$>"
3052         parallel -q echo "$VAR" ::: '!'
3053         export VAR
3054         parallel echo '"$VAR"' ::: '!'
3055
3056       If $VAR does not contain ' then "'$VAR'" will also work (and does not
3057       need export):
3058
3059         VAR="My 12\" records are worth <\$\$\$>"
3060         parallel echo "'$VAR'" ::: '!'
3061
3062       If you use them in a function, you just quote as you normally would:
3063
3064         VAR="My brother's 12\" records are worth <\$\$\$>"
3065         export VAR
3066         myfunc() { echo "$VAR" "$1"; }
3067         export -f myfunc
3068         parallel myfunc ::: '!'
3069

EXAMPLE: Group output lines

3071       When running jobs that output data, you often do not want the output of
3072       multiple jobs to run together. GNU parallel defaults to grouping the
3073       output of each job, so the output is printed when the job finishes. If
3074       you want full lines to be printed while the job is running you can use
3075       --line-buffer. If you want output to be printed as soon as possible you
3076       can use -u.
3077
3078       Compare the output of:
3079
3080         parallel wget --limit-rate=100k \
3081           https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
3082           ::: {12..16}
3083         parallel --line-buffer wget --limit-rate=100k \
3084           https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
3085           ::: {12..16}
3086         parallel -u wget --limit-rate=100k \
3087           https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
3088           ::: {12..16}
3089

EXAMPLE: Tag output lines

3091       GNU parallel groups the output lines, but it can be hard to see where
3092       the different jobs begin. --tag prepends the argument to make that more
3093       visible:
3094
3095         parallel --tag wget --limit-rate=100k \
3096           https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
3097           ::: {12..16}
3098
3099       --tag works with --line-buffer but not with -u:
3100
3101         parallel --tag --line-buffer wget --limit-rate=100k \
3102           https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
3103           ::: {12..16}
3104
3105       Check the uptime of the servers in ~/.parallel/sshloginfile:
3106
3107         parallel --tag -S .. --nonall uptime
3108

EXAMPLE: Colorize output

3110       Give each job a new color. Most terminals support ANSI colors with the
3111       escape code "\033[30;3Xm" where 0 <= X <= 7:
3112
3113           seq 10 | \
3114             parallel --tagstring '\033[30;3{=$_=++$::color%8=}m' seq {}
3115           parallel --rpl '{color} $_="\033[30;3".(++$::color%8)."m"' \
3116             --tagstring {color} seq {} ::: {1..10}
3117
3118       To get rid of the initial \t (which comes from --tagstring):
3119
3120           ... | perl -pe 's/\t//'
3121

EXAMPLE: Keep order of output same as order of input

3123       Normally the output of a job will be printed as soon as it completes.
3124       Sometimes you want the order of the output to remain the same as the
3125       order of the input. This is often important, if the output is used as
3126       input for another system. -k will make sure the order of output will be
3127       in the same order as input even if later jobs end before earlier jobs.
3128
3129       Append a string to every line in a text file:
3130
3131         cat textfile | parallel -k echo {} append_string
3132
3133       If you remove -k some of the lines may come out in the wrong order.
3134
3135       Another example is traceroute:
3136
3137         parallel traceroute ::: qubes-os.org debian.org freenetproject.org
3138
3139       will give traceroute of qubes-os.org, debian.org and
3140       freenetproject.org, but it will be sorted according to which job
3141       completed first.
3142
3143       To keep the order the same as input run:
3144
3145         parallel -k traceroute ::: qubes-os.org debian.org freenetproject.org
3146
3147       This will make sure the traceroute to qubes-os.org will be printed
3148       first.
3149
3150       A bit more complex example is downloading a huge file in chunks in
3151       parallel: Some internet connections will deliver more data if you
3152       download files in parallel. For downloading files in parallel see:
3153       "EXAMPLE: Download 10 images for each of the past 30 days". But if you
3154       are downloading a big file you can download the file in chunks in
3155       parallel.
3156
3157       To download byte 10000000-19999999 you can use curl:
3158
3159         curl -r 10000000-19999999 http://example.com/the/big/file >file.part
3160
3161       To download a 1 GB file we need 100 10MB chunks downloaded and combined
3162       in the correct order.
3163
3164         seq 0 99 | parallel -k curl -r \
3165           {}0000000-{}9999999 http://example.com/the/big/file > file
3166
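       The chunk-and-reassemble idea can be tried locally without a web
       server. This sketch substitutes dd byte ranges on a local file for
       curl -r on a URL (the 400-byte chunk size and the /tmp paths are
       arbitrary):

```shell
# Create a demo "big file", read it back in 10 fixed-size byte ranges,
# and verify that the in-order concatenation matches the original.
seq 1000 > /tmp/big.demo                  # 3893 bytes
for i in $(seq 0 9); do                   # ranges 0-399, 400-799, ...
  dd if=/tmp/big.demo bs=1 skip=$((i*400)) count=400 2>/dev/null
done > /tmp/big.reassembled
cmp /tmp/big.demo /tmp/big.reassembled && echo "reassembly matches"
```

       As in the curl version, order matters: with GNU parallel it is the
       -k flag that guarantees the chunks are concatenated in input order.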

EXAMPLE: Parallel grep

3168       grep -r greps recursively through directories. On multicore CPUs GNU
3169       parallel can often speed this up.
3170
3171         find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}
3172
3173       This will run 1.5 jobs per CPU and give up to 1000 arguments to grep.
3174

EXAMPLE: Grepping n lines for m regular expressions.

3176       The simplest solution to grep a big file for a lot of regexps is:
3177
3178         grep -f regexps.txt bigfile
3179
3180       Or if the regexps are fixed strings:
3181
3182         grep -F -f regexps.txt bigfile
3183
3184       There are 3 limiting factors: CPU, RAM, and disk I/O.
3185
3186       RAM is easy to measure: If the grep process takes up most of your free
3187       memory (e.g. when running top), then RAM is a limiting factor.
3188
3189       CPU is also easy to measure: If the grep takes >90% CPU in top, then
3190       the CPU is a limiting factor, and parallelization will speed this up.
3191
3192       It is harder to see if disk I/O is the limiting factor, and depending
3193       on the disk system it may be faster or slower to parallelize. The only
3194       way to know for certain is to test and measure.
3195
3196   Limiting factor: RAM
3197       The normal grep -f regexps.txt bigfile works no matter the size of
3198       bigfile, but if regexps.txt is so big it cannot fit into memory, then
3199       you need to split this.
3200
3201       grep -F takes around 100 bytes of RAM and grep takes about 500 bytes of
3202       RAM per 1 byte of regexp. So if regexps.txt is 1% of your RAM, then it
3203       may be too big.
3204
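       That rule of thumb can be turned into a quick pre-flight check. A
       sketch, assuming GNU/Linux's /proc/meminfo and using the rough 100
       bytes-of-RAM-per-byte figure for grep -F (the demo file stands in
       for a real regexps.txt):

```shell
# Estimate whether grep -F -f regexps.txt is likely to fit in free RAM.
printf 'ID1 foo bar baz Identifier1\nID2 foo bar baz Identifier2\n' \
  > /tmp/regexps.demo
regexp_bytes=$(wc -c < /tmp/regexps.demo)
needed_kb=$((regexp_bytes * 100 / 1024))   # ~100 bytes of RAM per byte
free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
if [ "$needed_kb" -lt "$free_kb" ]; then
  echo "grep -F should fit in RAM"
else
  echo "consider splitting regexps.txt"
fi
```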
3205       If you can convert your regexps into fixed strings, do that. E.g. if
3206       the lines you are looking for in bigfile all look like:
3207
3208         ID1 foo bar baz Identifier1 quux
3209         fubar ID2 foo bar baz Identifier2
3210
3211       then your regexps.txt can be converted from:
3212
3213         ID1.*Identifier1
3214         ID2.*Identifier2
3215
3216       into:
3217
3218         ID1 foo bar baz Identifier1
3219         ID2 foo bar baz Identifier2
3220
3221       This way you can use grep -F which takes around 80% less memory and is
3222       much faster.
3223
3224       If it still does not fit in memory you can do this:
3225
3226         parallel --pipepart -a regexps.txt --block 1M grep -Ff - -n bigfile | \
3227           sort -un | perl -pe 's/^\d+://'
3228
3229       The 1M should be your free memory divided by the number of CPU threads
3230       and divided by 200 for grep -F and by 1000 for normal grep. On
3231       GNU/Linux you can do:
3232
3233         free=$(awk '/^((Swap)?Cached|MemFree|Buffers):/ { sum += $2 }
3234                     END { print sum }' /proc/meminfo)
3235         percpu=$((free / 200 / $(parallel --number-of-threads)))k
3236
3237         parallel --pipepart -a regexps.txt --block $percpu --compress \
3238           grep -F -f - -n bigfile | \
3239           sort -un | perl -pe 's/^\d+://'
3240
3241       If you can live with duplicated lines and wrong order, it is faster to
3242       do:
3243
3244         parallel --pipepart -a regexps.txt --block $percpu --compress \
3245           grep -F -f - bigfile
3246
3247   Limiting factor: CPU
3248       If the CPU is the limiting factor parallelization should be done on the
3249       regexps:
3250
3251         cat regexp.txt | parallel --pipe -L1000 --roundrobin --compress \
3252           grep -f - -n bigfile | \
3253           sort -un | perl -pe 's/^\d+://'
3254
3255       The command will start one grep per CPU and read bigfile one time per
3256       CPU, but as that is done in parallel, all reads except the first will
3257       be cached in RAM. Depending on the size of regexp.txt it may be faster
3258       to use --block 10m instead of -L1000.
3259
3260       Some storage systems perform better when reading multiple chunks in
3261       parallel. This is true for some RAID systems and for some network file
3262       systems. To parallelize the reading of bigfile:
3263
3264         parallel --pipepart --block 100M -a bigfile -k --compress \
3265           grep -f regexp.txt
3266
3267       This will split bigfile into 100MB chunks and run grep on each of these
3268       chunks. To parallelize both reading of bigfile and regexp.txt combine
3269       the two using --fifo:
3270
3271         parallel --pipepart --block 100M -a bigfile --fifo cat regexp.txt \
3272           \| parallel --pipe -L1000 --roundrobin grep -f - {}
3273
3274       If a line matches multiple regexps, the line may be duplicated.
3275
3276   Bigger problem
3277       If the problem is too big to be solved by this, you are probably ready
3278       for Lucene.
3279

EXAMPLE: Using remote computers

3281       To run commands on a remote computer, SSH needs to be set up and you
3282       must be able to log in without entering a password (the commands
3283       ssh-copy-id, ssh-agent, and sshpass may help you do that).
3284
3285       If you need to login to a whole cluster, you typically do not want to
3286       accept the host key for every host. You want to accept them the first
3287       time and be warned if they are ever changed. To do that:
3288
3289         # Add the servers to the sshloginfile
3290         (echo servera; echo serverb) > .parallel/my_cluster
3291         # Make sure .ssh/config exists
3292         touch .ssh/config
3293         cp .ssh/config .ssh/config.backup
3294         # Disable StrictHostKeyChecking temporarily
3295         (echo 'Host *'; echo StrictHostKeyChecking no) >> .ssh/config
3296         parallel --slf my_cluster --nonall true
3297         # Remove the disabling of StrictHostKeyChecking
3298         mv .ssh/config.backup .ssh/config
3299
3300       The servers in .parallel/my_cluster are now added in .ssh/known_hosts.
3301
3302       To run echo on server.example.com:
3303
3304         seq 10 | parallel --sshlogin server.example.com echo
3305
3306       To run commands on more than one remote computer run:
3307
3308         seq 10 | parallel --sshlogin s1.example.com,s2.example.net echo
3309
3310       Or:
3311
3312         seq 10 | parallel --sshlogin server.example.com \
3313           --sshlogin server2.example.net echo
3314
3315       If the login username is foo on server2.example.net use:
3316
3317         seq 10 | parallel --sshlogin server.example.com \
3318           --sshlogin foo@server2.example.net echo
3319
3320       If your list of hosts is server1-88.example.net with login foo:
3321
3322         seq 10 | parallel -Sfoo@server{1..88}.example.net echo
3323
3324       To distribute the commands to a list of computers, make a file
3325       mycomputers with all the computers:
3326
3327         server.example.com
3328         foo@server2.example.com
3329         server3.example.com
3330
3331       Then run:
3332
3333         seq 10 | parallel --sshloginfile mycomputers echo
3334
3335       To include the local computer add the special sshlogin ':' to the list:
3336
3337         server.example.com
3338         foo@server2.example.com
3339         server3.example.com
3340         :
3341
3342       GNU parallel will try to determine the number of CPUs on each of the
3343       remote computers, and run one job per CPU - even if the remote
3344       computers do not have the same number of CPUs.
3345
3346       If the number of CPUs on the remote computers is not identified
3347       correctly the number of CPUs can be added in front. Here the computer
3348       has 8 CPUs.
3349
3350         seq 10 | parallel --sshlogin 8/server.example.com echo
3351

EXAMPLE: Transferring of files

3353       To recompress gzipped files with bzip2 using a remote computer run:
3354
3355         find logs/ -name '*.gz' | \
3356           parallel --sshlogin server.example.com \
3357           --transfer "zcat {} | bzip2 -9 >{.}.bz2"
3358
3359       This will list the .gz-files in the logs directory and all directories
3360       below. Then it will transfer the files to server.example.com to the
3361       corresponding directory in $HOME/logs. On server.example.com the file
3362       will be recompressed using zcat and bzip2 resulting in the
3363       corresponding file with .gz replaced with .bz2.
3364
3365       If you want the resulting bz2-file to be transferred back to the local
3366       computer add --return {.}.bz2:
3367
3368         find logs/ -name '*.gz' | \
3369           parallel --sshlogin server.example.com \
3370           --transfer --return {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"
3371
3372       After the recompressing is done the .bz2-file is transferred back to
3373       the local computer and put next to the original .gz-file.
3374
3375       If you want to delete the transferred files on the remote computer add
3376       --cleanup. This will remove both the file transferred to the remote
3377       computer and the files transferred from the remote computer:
3378
3379         find logs/ -name '*.gz' | \
3380           parallel --sshlogin server.example.com \
3381           --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"
3382
3383       If you want to run on several computers, add the computers to
3384       --sshlogin either using ',' or multiple --sshlogin:
3385
3386         find logs/ -name '*.gz' | \
3387           parallel --sshlogin server.example.com,server2.example.com \
3388           --sshlogin server3.example.com \
3389           --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"
3390
3391       You can add the local computer using --sshlogin :. This will disable
3392       the removing and transferring for the local computer only:
3393
3394         find logs/ -name '*.gz' | \
3395           parallel --sshlogin server.example.com,server2.example.com \
3396           --sshlogin server3.example.com \
3397           --sshlogin : \
3398           --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"
3399
3400       Often --transfer, --return and --cleanup are used together. They can be
3401       shortened to --trc:
3402
3403         find logs/ -name '*.gz' | \
3404           parallel --sshlogin server.example.com,server2.example.com \
3405           --sshlogin server3.example.com \
3406           --sshlogin : \
3407           --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"
3408
3409       With the file mycomputers containing the list of computers it becomes:
3410
3411         find logs/ -name '*.gz' | parallel --sshloginfile mycomputers \
3412           --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"
3413
3414       If the file ~/.parallel/sshloginfile contains the list of computers,
3415       the special shorthand -S .. can be used:
3416
3417         find logs/ -name '*.gz' | parallel -S .. \
3418           --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"
3419

EXAMPLE: Distributing work to local and remote computers

3421       Convert *.mp3 to *.ogg running one process per CPU on local computer
3422       and server2:
3423
3424         parallel --trc {.}.ogg -S server2,: \
3425           'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg' ::: *.mp3
3426

EXAMPLE: Running the same command on remote computers

3428       To run the command uptime on remote computers you can do:
3429
3430         parallel --tag --nonall -S server1,server2 uptime
3431
3432       --nonall reads no arguments. If you have a list of jobs you want to run
3433       on each computer you can do:
3434
3435         parallel --tag --onall -S server1,server2 echo ::: 1 2 3
3436
3437       Remove --tag if you do not want the sshlogin added before the output.
3438
3439       If you have a lot of hosts use '-j0' to access more hosts in parallel.
3440

EXAMPLE: Using remote computers behind NAT wall

3442       If the workers are behind a NAT wall, you need some trickery to get to
3443       them.
3444
3445       If you can ssh to a jumphost, and reach the workers from there, then
3446       the obvious solution would be this, but it does not work:
3447
3448         parallel --ssh 'ssh jumphost ssh' -S host1 echo ::: DOES NOT WORK
3449
3450       It does not work because the command is dequoted by ssh twice,
3451       whereas GNU parallel only expects it to be dequoted once.
3452
3453       You can use a bash function and have GNU parallel quote the command:
3454
3455         jumpssh() { ssh -A jumphost ssh $(parallel --shellquote ::: "$@"); }
3456         export -f jumpssh
3457         parallel --ssh jumpssh -S host1 echo ::: this works
3458
3459       Or you can instead put this in ~/.ssh/config:
3460
3461         Host host1 host2 host3
3462           ProxyCommand ssh jumphost.domain nc -w 1 %h 22
3463
3464       It requires nc (netcat) to be installed on jumphost. With this you
3465       can simply run:
3466
3467         parallel -S host1,host2,host3 echo ::: This does work
3468
3469   No jumphost, but port forwards
3470       If there is no jumphost but each server has port 22 forwarded from the
3471       firewall (e.g. the firewall's port 22001 = port 22 on host1, 22002 =
3472       host2, 22003 = host3) then you can use ~/.ssh/config:
3473
3474         Host host1.v
3475           Port 22001
3476         Host host2.v
3477           Port 22002
3478         Host host3.v
3479           Port 22003
3480         Host *.v
3481           Hostname firewall
3482
3483       And then use host{1..3}.v as normal hosts:
3484
3485         parallel -S host1.v,host2.v,host3.v echo ::: a b c
3486
3487   No jumphost, no port forwards
3488       If ports cannot be forwarded, you need some sort of VPN to traverse the
3489       NAT-wall. TOR is one option for that, as it is very easy to get
3490       working.
3491
3492       You need to install TOR and set up a hidden service. In torrc put:
3493
3494         HiddenServiceDir /var/lib/tor/hidden_service/
3495         HiddenServicePort 22 127.0.0.1:22
3496
3497       Then start TOR: /etc/init.d/tor restart
3498
3499       The TOR hostname is now in /var/lib/tor/hidden_service/hostname and is
3500       something similar to izjafdceobowklhz.onion. Now you simply prepend
3501       torsocks to ssh:
3502
3503         parallel --ssh 'torsocks ssh' -S izjafdceobowklhz.onion \
3504           -S zfcdaeiojoklbwhz.onion,auclucjzobowklhi.onion echo ::: a b c
3505
3506       If not all hosts are accessible through TOR:
3507
3508         parallel -S 'torsocks ssh izjafdceobowklhz.onion,host2,host3' \
3509           echo ::: a b c
3510
3511       See more ssh tricks on
3512       https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Proxies_and_Jump_Hosts
3513

EXAMPLE: Parallelizing rsync

3515       rsync is a great tool, but sometimes it will not fill up the available
3516       bandwidth. Running multiple rsync in parallel can fix this.
3517
3518         cd src-dir
3519         find . -type f |
3520           parallel -j10 -X rsync -zR -Ha ./{} fooserver:/dest-dir/
3521
3522       Adjust -j10 until you find the optimal number.
3523
3524       rsync -R will create the needed subdirectories, so the files are not
3525       all put into a single dir. The ./ is needed so the resulting command
3526       looks similar to:
3527
3528         rsync -zR ././sub/dir/file fooserver:/dest-dir/
3529
3530       The /./ is what rsync -R works on.
3531
3532       If you are unable to push data but need to pull it, and the files
3533       are called digits.png (e.g. 000000.png), you might be able to do:
3534
3535         seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/
3536

EXAMPLE: Use multiple inputs in one command

3538       Copy files like foo.es.ext to foo.ext:
3539
3540         ls *.es.* | perl -pe 'print; s/\.es//' | parallel -N2 cp {1} {2}
3541
3542       The perl command spits out 2 lines for each input. GNU parallel takes 2
3543       inputs (using -N2) and replaces {1} and {2} with the inputs.
3544
3545       Count in binary:
3546
3547         parallel -k echo ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1
3548
3549       Print the number on the opposing sides of a six sided die:
3550
3551         parallel --link -a <(seq 6) -a <(seq 6 -1 1) echo
3552         parallel --link echo :::: <(seq 6) <(seq 6 -1 1)
3553
3554       Convert files from all subdirs to PNG-files with consecutive numbers
3555       (useful for making input PNG's for ffmpeg):
3556
3557         parallel --link -a <(find . -type f | sort) \
3558           -a <(seq $(find . -type f|wc -l)) convert {1} {2}.png
3559
3560       Alternative version:
3561
3562         find . -type f | sort | parallel convert {} {#}.png
3563

EXAMPLE: Use a table as input

3565       Content of table_file.tsv:
3566
3567         foo<TAB>bar
3568         baz <TAB> quux
3569
3570       To run:
3571
3572         cmd -o bar -i foo
3573         cmd -o quux -i baz
3574
3575       you can run:
3576
3577         parallel -a table_file.tsv --colsep '\t' cmd -o {2} -i {1}
3578
3579       Note: The default for GNU parallel is to remove the spaces around the
3580       columns. To keep the spaces:
3581
3582         parallel -a table_file.tsv --trim n --colsep '\t' cmd -o {2} -i {1}
3583

EXAMPLE: Output to database

3585       GNU parallel can output to a database table and a CSV-file:
3586
3587         dburl=csv:///%2Ftmp%2Fmydir
3588         dbtableurl=$dburl/mytable.csv
3589         parallel --sqlandworker $dbtableurl seq ::: {1..10}
3590
3591       It is rather slow and takes up a lot of CPU time because GNU parallel
3592       parses the whole CSV file for each update.
3593
3594       A better approach is to use an SQLite database and then convert
3595       that to CSV:
3596
3597         dburl=sqlite3:///%2Ftmp%2Fmy.sqlite
3598         dbtableurl=$dburl/mytable
3599         parallel --sqlandworker $dbtableurl seq ::: {1..10}
3600         sql $dburl '.headers on' '.mode csv' 'SELECT * FROM mytable;'
3601
3602       This takes around a second per job.
3603
3604       If you have access to a real database system, such as PostgreSQL, it is
3605       even faster:
3606
3607         dburl=pg://user:pass@host/mydb
3608         dbtableurl=$dburl/mytable
3609         parallel --sqlandworker $dbtableurl seq ::: {1..10}
3610         sql $dburl \
3611           "COPY (SELECT * FROM mytable) TO stdout DELIMITER ',' CSV HEADER;"
3612
3613       Or MySQL:
3614
3615         dburl=mysql://user:pass@host/mydb
3616         dbtableurl=$dburl/mytable
3617         parallel --sqlandworker $dbtableurl seq ::: {1..10}
3618         sql -p -B $dburl "SELECT * FROM mytable;" > mytable.tsv
3619         perl -pe 's/"/""/g; s/\t/","/g; s/^/"/; s/$/"/;
3620           %s=("\\" => "\\", "t" => "\t", "n" => "\n");
3621           s/\\([\\tn])/$s{$1}/g;' mytable.tsv
3622

EXAMPLE: Output to CSV-file for R

3624       If you have no need for the advanced job distribution control that a
3625       database provides, but you simply want output into a CSV file that you
3626       can read into R or LibreCalc, then you can use --results:
3627
3628         parallel --results my.csv seq ::: 10 20 30
3629         R
3630         > mydf <- read.csv("my.csv");
3631         > print(mydf[2,])
3632         > write(as.character(mydf[2,c("Stdout")]),'')
3633

EXAMPLE: Use XML as input

3635       The show Aflyttet on Radio 24syv publishes an RSS feed with their audio
3636       podcasts on: http://arkiv.radio24syv.dk/audiopodcast/channel/4466232
3637
3638       Using xpath you can extract the URLs for 2019 and download them using
3639       GNU parallel:
3640
3641         wget -O - http://arkiv.radio24syv.dk/audiopodcast/channel/4466232 | \
3642           xpath -e "//pubDate[contains(text(),'2019')]/../enclosure/@url" | \
3643           parallel -u wget '{= s/ url="//; s/"//; =}'
3644

EXAMPLE: Run the same command 10 times

3646       If you want to run the same command with the same arguments 10 times in
3647       parallel you can do:
3648
3649         seq 10 | parallel -n0 my_command my_args
3650

EXAMPLE: Working as cat | sh. Resource inexpensive jobs and evaluation

3652       GNU parallel can work similarly to cat | sh.
3653
3654       A resource inexpensive job is a job that takes very little CPU, disk
3655       I/O and network I/O. Ping is an example of a resource inexpensive job.
3656       wget is too - if the webpages are small.
3657
3658       The content of the file jobs_to_run:
3659
3660         ping -c 1 10.0.0.1
3661         wget http://example.com/status.cgi?ip=10.0.0.1
3662         ping -c 1 10.0.0.2
3663         wget http://example.com/status.cgi?ip=10.0.0.2
3664         ...
3665         ping -c 1 10.0.0.255
3666         wget http://example.com/status.cgi?ip=10.0.0.255
3667
3668       To run 100 processes simultaneously do:
3669
3670         parallel -j 100 < jobs_to_run
3671
3672       As no command is given, the jobs will be evaluated by the shell.
3673

EXAMPLE: Call program with FASTA sequence

3675       FASTA files have the format:
3676
3677         >Sequence name1
3678         sequence
3679         sequence continued
3680         >Sequence name2
3681         sequence
3682         sequence continued
3683         more sequence
3684
3685       To call myprog with the sequence as argument run:
3686
3687         cat file.fasta |
3688           parallel --pipe -N1 --recstart '>' --rrs \
3689             'read a; echo Name: "$a"; myprog $(tr -d "\n")'
3690

EXAMPLE: Processing a big file using more CPUs

3692       To process a big file or some output you can use --pipe to split up the
3693       data into blocks and pipe the blocks into the processing program.
3694
3695       If the program is gzip -9 you can do:
3696
3697         cat bigfile | parallel --pipe --recend '' -k gzip -9 > bigfile.gz
3698
3699       This will split bigfile into blocks of 1 MB and pass that to gzip -9 in
3700       parallel. One gzip will be run per CPU. The output of gzip -9 will be
3701       kept in order and saved to bigfile.gz
3702
3703       gzip works fine if the output is appended, but some processing does not
3704       work like that - for example sorting. For this GNU parallel can put the
3705       output of each command into a file. This will sort a big file in
3706       parallel:
3707
3708         cat bigfile | parallel --pipe --files sort |\
3709           parallel -Xj1 sort -m {} ';' rm {} >bigfile.sort
3710
3711       Here bigfile is split into blocks of around 1MB, each block ending in
3712       '\n' (which is the default for --recend). Each block is passed to sort
3713       and the output from sort is saved into files. These files are passed to
3714       the second parallel that runs sort -m on the files before it removes
3715       the files. The output is saved to bigfile.sort.
3716
3717       GNU parallel's --pipe maxes out at around 100 MB/s because every byte
3718       has to be copied through GNU parallel. But if bigfile is a real
3719       (seekable) file GNU parallel can bypass the copying and send the parts
3720       directly to the program:
3721
3722         parallel --pipepart --block 100m -a bigfile --files sort |\
3723           parallel -Xj1 sort -m {} ';' rm {} >bigfile.sort
3724

EXAMPLE: Grouping input lines

       When processing with --pipe you may have lines grouped by a value.
       Here is my.csv:

          Transaction Customer Item
               1       a       53
               2       b       65
               3       b       82
               4       c       96
               5       c       67
               6       c       13
               7       d       90
               8       d       43
               9       d       91
               10      d       84
               11      e       72
               12      e       102
               13      e       63
               14      e       56
               15      e       74

       Let us assume you want GNU parallel to process each customer. In other
       words: you want all the transactions for a single customer to be
       treated as a single record.

       To do this we preprocess the data with a program that inserts a record
       separator before each customer (column 2 = $F[1]). Here we first make
       a 50-character random string, which we then use as the separator:

         sep=`perl -e 'print map { ("a".."z","A".."Z")[rand(52)] } (1..50);'`
         cat my.csv | \
            perl -ape '$F[1] ne $l and print "'$sep'"; $l = $F[1]' | \
            parallel --recend $sep --rrs --pipe -N1 wc

       If your program can process multiple customers, replace -N1 with a
       reasonable --blocksize.


EXAMPLE: Running more than 250 jobs workaround

       If you need to run a massive number of jobs in parallel, you will
       likely hit the filehandle limit, which is often around 250 jobs. If
       you are superuser you can raise the limit in
       /etc/security/limits.conf, but you can also use this workaround. The
       filehandle limit is per process, so if you simply spawn more GNU
       parallels, each of them can run 250 jobs. This will spawn up to 2500
       jobs:

         cat myinput |\
           parallel --pipe -N 50 --roundrobin -j50 parallel -j50 your_prg

       This will spawn up to 62500 jobs (use with caution - you need 64 GB
       RAM to do this, and you may need to increase
       /proc/sys/kernel/pid_max):

         cat myinput |\
           parallel --pipe -N 250 --roundrobin -j250 parallel -j250 your_prg


EXAMPLE: Working as mutex and counting semaphore

       The command sem is an alias for parallel --semaphore.

       A counting semaphore will allow a given number of jobs to be started
       in the background. When that number of jobs are running in the
       background, GNU sem will wait for one of them to complete before
       starting another command. sem --wait will wait for all jobs to
       complete.

       Run 10 jobs concurrently in the background:

         for i in *.log ; do
           echo $i
           sem -j10 gzip $i ";" echo done
         done
         sem --wait

       A mutex is a counting semaphore allowing only one job to run. This
       will edit the file myfile, prepending it with lines containing the
       numbers 1 to 3:

         seq 3 | parallel sem sed -i -e '1i{}' myfile

       As myfile can be very big, it is important that only one process edits
       the file at a time.

       Name the semaphore to have multiple different semaphores active at the
       same time:

         seq 3 | parallel sem --id mymutex sed -i -e '1i{}' myfile


EXAMPLE: Mutex for a script

       Assume a script is called from cron or from a web service, but only
       one instance can be run at a time. With sem and --shebang-wrap the
       script can be made to wait for other instances to finish. Here in
       bash:

         #!/usr/bin/sem --shebang-wrap -u --id $0 --fg /bin/bash

         echo This will run
         sleep 5
         echo exclusively

       Here in perl:

         #!/usr/bin/sem --shebang-wrap -u --id $0 --fg /usr/bin/perl

         print "This will run ";
         sleep 5;
         print "exclusively\n";

       Here in python:

         #!/usr/local/bin/sem --shebang-wrap -u --id $0 --fg /usr/bin/python

         import time
         print "This will run "
         time.sleep(5)
         print "exclusively"


EXAMPLE: Start editor with filenames from stdin (standard input)

       You can use GNU parallel to start interactive programs like emacs or
       vi:

         cat filelist | parallel --tty -X emacs
         cat filelist | parallel --tty -X vi

       If there are more files than will fit on a single command line, the
       editor will be started again with the remaining files.


EXAMPLE: Running sudo

       sudo requires a password to run a command as root. It caches the
       access, so you only need to enter the password again if you have not
       used sudo for a while.

       The command:

         parallel sudo echo ::: This is a bad idea

       is no good, as you would be prompted for the sudo password for each of
       the jobs. You can either do:

         sudo echo This
         parallel sudo echo ::: is a good idea

       or:

         sudo parallel echo ::: This is a good idea

       This way you only have to enter the sudo password once.


EXAMPLE: GNU Parallel as queue system/batch manager

       GNU parallel can work as a simple job queue system or batch manager.
       The idea is to put the jobs into a file and have GNU parallel read
       from that continuously. As GNU parallel will stop at end of file, we
       use tail to continue reading:

         true >jobqueue; tail -n+0 -f jobqueue | parallel

       To submit your jobs to the queue:

         echo my_command my_arg >> jobqueue

       You can of course use -S to distribute the jobs to remote computers:

         true >jobqueue; tail -n+0 -f jobqueue | parallel -S ..

       If you keep this running for a long time, jobqueue will grow. A way of
       removing the jobs already run is by making GNU parallel stop when it
       hits a special value and then restart. To use --eof to make GNU
       parallel exit, tail also needs to be forced to exit:

         true >jobqueue;
         while true; do
           tail -n+0 -f jobqueue |
             (parallel -E StOpHeRe -S ..; echo GNU Parallel is now done;
              perl -e 'while(<>){/StOpHeRe/ and last};print <>' jobqueue > j2;
              (seq 1000 >> jobqueue &);
              echo Done appending dummy data forcing tail to exit)
           echo tail exited;
           mv j2 jobqueue
         done

       In some cases you can run on more CPUs and computers during the night:

         # Day time
         echo 50% > jobfile
         cp day_server_list ~/.parallel/sshloginfile
         # Night time
         echo 100% > jobfile
         cp night_server_list ~/.parallel/sshloginfile
         tail -n+0 -f jobqueue | parallel --jobs jobfile -S ..

       GNU parallel discovers if jobfile or ~/.parallel/sshloginfile changes.

       There is a small issue when using GNU parallel as queue system/batch
       manager: you have to submit one job per job slot before any of them
       will start; after that you can submit jobs one at a time, and each
       job will start immediately if a free slot is available. Output from
       the running or completed jobs is held back and will only be printed
       when that many more jobs have been started (unless you use --ungroup
       or --line-buffer, in which case the output from the jobs is printed
       immediately). E.g. if you have 10 job slots, the output from the
       first completed job will only be printed when job 11 has started, and
       the output of the second completed job will only be printed when job
       12 has started.


EXAMPLE: GNU Parallel as dir processor

       If you have a dir in which users drop files that need to be
       processed, you can do this on GNU/Linux (if you know what inotifywait
       is called on other platforms, file a bug report):

         inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f my_dir |\
           parallel -u echo

       This will run the command echo on each file put into my_dir or subdirs
       of my_dir.

       You can of course use -S to distribute the jobs to remote computers:

         inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f my_dir |\
           parallel -S ..  -u echo

       If the files to be processed are in a tar file, then unpacking one
       file and processing it immediately may be faster than first unpacking
       all files. Set up the dir processor as above and unpack into the dir.

       Using GNU parallel as dir processor has the same limitations as using
       GNU parallel as queue system/batch manager.


EXAMPLE: Locate the missing package

       If you have downloaded source and tried compiling it, you may have
       seen:

         $ ./configure
         [...]
         checking for something.h... no
         configure: error: "libsomething not found"

       Often it is not obvious which package you should install to get that
       file. Debian has `apt-file` to search for a file. `tracefile` from
       https://gitlab.com/ole.tange/tangetools can tell which files a program
       tried to access. In this case we are interested in one of the last
       files:

         $ tracefile -un ./configure | tail | parallel -j0 apt-file search


SPREADING BLOCKS OF DATA

       --round-robin, --pipe-part, --shard, --bin and --group-by are all
       specialized versions of --pipe.

       In the following, n is the number of jobslots given by --jobs. A
       record starts with --recstart and ends with --recend. It is typically
       a full line. A chunk is a number of full records that is approximately
       the size of a block. A block can contain half records; a chunk cannot.

       --pipe starts one job per chunk. It reads blocks from stdin (standard
       input). It finds a record end near a block border and passes a chunk
       to the program.

       --pipe-part starts one job per chunk - just like normal --pipe. It
       first finds record endings near all block borders in the file and then
       starts the jobs. By using --block -1 it will set the block size to
       1/n * size-of-file. Used this way it will start n jobs in total.

       --round-robin starts n jobs in total. It reads a block and passes a
       chunk to whichever job is ready to read. It does not parse the content
       except for identifying where a record ends, to make sure it only
       passes full records.

       --shard starts n jobs in total. It parses each line to read the value
       in the given column. Based on this value the line is passed to one of
       the n jobs. All lines having this value will be given to the same
       jobslot.

       --bin works like --shard, but the value of the column is the jobslot
       number it will be passed to. If the value is bigger than n, then n
       will be subtracted from the value until the value is smaller than or
       equal to n.

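       The wrap-around rule for --bin can be sketched in plain shell
       arithmetic (a toy illustration of the rule above with made-up column
       values and n=4; GNU parallel itself is not invoked):

```shell
# For n jobslots, a column value larger than n has n subtracted
# repeatedly until it is smaller than or equal to n.
n=4
for value in 1 4 5 9; do
  slot=$value
  while [ "$slot" -gt "$n" ]; do
    slot=$((slot - n))
  done
  echo "column value $value -> jobslot $slot"
done
```

       So with 4 jobslots, the values 5 and 9 both end up on jobslot 1.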
       --group-by starts one job per chunk. Record borders are not given by
       --recend/--recstart. Instead a record is defined by a number of lines
       having the same value in a given column, so the value of a given
       column changes at a chunk border. With --pipe every line is parsed;
       with --pipe-part only a few lines are parsed to find the chunk border.

       --group-by can be combined with --round-robin or --pipe-part.


QUOTING

       GNU parallel is very liberal in quoting. You only need to quote
       characters that have special meaning in shell:

         ( ) $ ` ' " < > ; | \

       and depending on context these need to be quoted, too:

         ~ & # ! ? space * {

       Therefore most people will never need more quoting than putting '\'
       in front of the special characters.

       Often you can simply put \' around every ':

         perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' file

       can be quoted:

         parallel perl -ne \''/^\S+\s+\S+$/ and print $ARGV,"\n"'\' ::: file

       However, when you want to use a shell variable you need to quote the
       $-sign. Here is an example using $PARALLEL_SEQ. This variable is set
       by GNU parallel itself, so the evaluation of the $ must be done by
       the sub shell started by GNU parallel:

         seq 10 | parallel -N2 echo seq:\$PARALLEL_SEQ arg1:{1} arg2:{2}

       If the variable is set before GNU parallel starts you can do this:

         VAR=this_is_set_before_starting
         echo test | parallel echo {} $VAR

       Prints: test this_is_set_before_starting

       It is a little more tricky if the variable contains more than one
       space in a row:

         VAR="two  spaces  between  each  word"
         echo test | parallel echo {} \'"$VAR"\'

       Prints: test two  spaces  between  each  word

       If the variable should not be evaluated by the shell starting GNU
       parallel but be evaluated by the sub shell started by GNU parallel,
       then you need to quote it:

         echo test | parallel VAR=this_is_set_after_starting \; echo {} \$VAR

       Prints: test this_is_set_after_starting

       It is a little more tricky if the variable contains space:

         echo test |\
           parallel VAR='"two  spaces  between  each  word"' echo {} \'"$VAR"\'

       Prints: test two  spaces  between  each  word

       $$ is the shell variable containing the process id of the shell. This
       will print the process id of the shell running GNU parallel:

         seq 10 | parallel echo $$

       And this will print the process ids of the sub shells started by GNU
       parallel:

         seq 10 | parallel echo \$\$

       If the special characters should not be evaluated by the sub shell,
       then you need to protect them against evaluation by both the shell
       starting GNU parallel and the sub shell:

         echo test | parallel echo {} \\\$VAR

       Prints: test $VAR

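       The two levels of evaluation above can be sketched with bash -c
       standing in for the sub shell that GNU parallel starts (a toy
       illustration; the variable name and value are made up):

```shell
export VAR=outer
# One backslash: the $ survives the outer shell, the sub shell expands it.
bash -c "echo \$VAR"      # prints: outer
# Three backslashes: the outer shell emits \$, so the sub shell
# prints a literal $VAR.
bash -c "echo \\\$VAR"    # prints: $VAR
```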
       GNU parallel can protect against evaluation by the sub shell by using
       -q:

         echo test | parallel -q echo {} \$VAR

       Prints: test $VAR

       This is particularly useful if you have lots of quoting. If you want
       to run a perl script like this:

         perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' file

       it needs to be quoted like one of these:

         ls | parallel perl -ne '/^\\S+\\s+\\S+\$/\ and\ print\ \$ARGV,\"\\n\"'
         ls | parallel perl -ne \''/^\S+\s+\S+$/ and print $ARGV,"\n"'\'

       Notice how spaces, \'s, "'s, and $'s need to be quoted. GNU parallel
       can do the quoting by using option -q:

         ls | parallel -q perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"'

       However, this means you cannot make the sub shell interpret special
       characters. For example, because of -q this WILL NOT WORK:

         ls *.gz | parallel -q "zcat {} >{.}"
         ls *.gz | parallel -q "zcat {} | bzip2 >{.}.bz2"

       because > and | need to be interpreted by the sub shell.

       If you get errors like:

         sh: -c: line 0: syntax error near unexpected token
         sh: Syntax error: Unterminated quoted string
         sh: -c: line 0: unexpected EOF while looking for matching `''
         sh: -c: line 1: syntax error: unexpected end of file
         zsh:1: no matches found:

       then you might try using -q.

       If you are using bash process substitution like <(cat foo) then you
       may try -q and prepending the command with bash -c:

         ls | parallel -q bash -c 'wc -c <(echo {})'

       Or for substituting output:

         ls | parallel -q bash -c \
           'tar c {} | tee >(gzip >{}.tar.gz) | bzip2 >{}.tar.bz2'

       Conclusion: To avoid dealing with the quoting problems it may be
       easier just to write a small script or a function (remember to
       export -f the function) and have GNU parallel call that.


LIST RUNNING JOBS

       If you want a list of the jobs currently running, you can run:

         killall -USR1 parallel

       GNU parallel will then print the currently running jobs on stderr
       (standard error).


COMPLETE RUNNING JOBS BUT DO NOT START NEW JOBS

       If you regret starting a lot of jobs you can simply break GNU
       parallel, but if you want to make sure you do not have half-completed
       jobs you should send the signal SIGHUP to GNU parallel:

         killall -HUP parallel

       This will tell GNU parallel not to start any new jobs, but to wait
       until the currently running jobs are finished before exiting.


ENVIRONMENT VARIABLES

       $PARALLEL_HOME
                Dir where GNU parallel stores config files, semaphores, and
                caches information between invocations. Default:
                $HOME/.parallel.

       $PARALLEL_JOBSLOT
                Set by GNU parallel and can be used in jobs run by GNU
                parallel.  Remember to quote the $, so it gets evaluated by
                the correct shell. Or use --plus and {slot}.

                $PARALLEL_JOBSLOT is the jobslot of the job. It is equal to
                {%} unless the job is being retried. See {%} for details.

       $PARALLEL_PID
                Set by GNU parallel and can be used in jobs run by GNU
                parallel.  Remember to quote the $, so it gets evaluated by
                the correct shell.

                This makes it possible for the jobs to communicate directly
                with GNU parallel.

                Example: If each of the jobs tests a solution and one of the
                jobs finds the solution, that job can tell GNU parallel not
                to start more jobs by: kill -HUP $PARALLEL_PID. This only
                works on the local computer.

       $PARALLEL_RSYNC_OPTS
                Options to pass on to rsync. Defaults to: -rlDzR.

       $PARALLEL_SHELL
                Use this shell for the commands run by GNU parallel:

                · $PARALLEL_SHELL. If undefined use:

                · The shell that started GNU parallel. If that cannot be
                  determined:

                · $SHELL. If undefined use:

                · /bin/sh

       $PARALLEL_SSH
                GNU parallel defaults to using the ssh command for remote
                access. This can be overridden with $PARALLEL_SSH, which
                again can be overridden with --ssh. It can also be set on a
                per-server basis (see --sshlogin).

       $PARALLEL_SSHHOST
                Set by GNU parallel and can be used in jobs run by GNU
                parallel.  Remember to quote the $, so it gets evaluated by
                the correct shell. Or use --plus and {host}.

                $PARALLEL_SSHHOST is the host part of an sshlogin line. E.g.

                  4//usr/bin/specialssh user@host

                becomes:

                  host

       $PARALLEL_SSHLOGIN
                Set by GNU parallel and can be used in jobs run by GNU
                parallel.  Remember to quote the $, so it gets evaluated by
                the correct shell. Or use --plus and {sshlogin}.

                The value is the sshlogin line with the number of cores
                removed. E.g.

                  4//usr/bin/specialssh user@host

                becomes:

                  /usr/bin/specialssh user@host

       $PARALLEL_SEQ
                Set by GNU parallel and can be used in jobs run by GNU
                parallel.  Remember to quote the $, so it gets evaluated by
                the correct shell.

                $PARALLEL_SEQ is the sequence number of the job running.

                Example:

                  seq 10 | parallel -N2 \
                    echo seq:'$'PARALLEL_SEQ arg1:{1} arg2:{2}

                {#} is a shorthand for $PARALLEL_SEQ.

       $PARALLEL_TMUX
                Path to tmux. If unset, the tmux in $PATH is used.

       $TMPDIR  Directory for temporary files. See: --tmpdir.

       $PARALLEL
                The environment variable $PARALLEL will be used as default
                options for GNU parallel. If the variable contains special
                shell characters (e.g. $, *, or space) then these need to be
                escaped with \.

                Example:

                  cat list | parallel -j1 -k -v ls
                  cat list | parallel -j1 -k -v -S"myssh user@server" ls

                can be written as:

                  cat list | PARALLEL="-kvj1" parallel ls
                  cat list | PARALLEL='-kvj1 -S myssh\ user@server' \
                    parallel echo

                Notice the \ after 'myssh' is needed because 'myssh' and
                'user@server' must be one argument.


DEFAULT PROFILE (CONFIG FILE)

       The global configuration file /etc/parallel/config, followed by the
       user configuration file ~/.parallel/config (formerly known as
       .parallelrc), will be read in turn if they exist.  Lines starting
       with '#' will be ignored. The format can follow that of the
       environment variable $PARALLEL, but it is often easier to simply put
       each option on its own line.

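       For instance, a ~/.parallel/config that always keeps output in the
       order of the input and tags each output line could look like this (a
       hypothetical example; pick options to taste):

```
# ~/.parallel/config - one option per line, '#' starts a comment
--keep-order
--tag
```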
       Options on the command line take precedence, followed by the
       environment variable $PARALLEL, the user configuration file
       ~/.parallel/config, and finally the global configuration file
       /etc/parallel/config.

       Note that no file that is read for options, nor the environment
       variable $PARALLEL, may contain retired options such as --tollef.


PROFILE FILES

       If --profile is set, GNU parallel will read the profile from that
       file rather than the global or user configuration files. You can have
       multiple --profiles.

       Profiles are searched for in ~/.parallel. If the name starts with /,
       it is seen as an absolute path. If the name starts with ./, it is
       seen as a relative path from the current dir.

       Example: Profile for running a command on every sshlogin in
       ~/.ssh/sshlogins and prepending the output with the sshlogin:

         echo --tag -S .. --nonall > ~/.parallel/n
         parallel -Jn uptime

       Example: Profile for running every command with -j-1 and nice:

         echo -j-1 nice > ~/.parallel/nice_profile
         parallel -J nice_profile bzip2 -9 ::: *

       Example: Profile for running a perl script before every command:

         echo "perl -e '\$a=\$\$; print \$a,\" \",'\$PARALLEL_SEQ',\" \";';" \
           > ~/.parallel/pre_perl
         parallel -J pre_perl echo ::: *

       Note how the $ and " need to be quoted using \.

       Example: Profile for running distributed jobs with nice on the remote
       computers:

         echo -S .. nice > ~/.parallel/dist
         parallel -J dist --trc {.}.bz2 bzip2 -9 ::: *


EXIT STATUS

       Exit status depends on --halt-on-error if one of these is used:
       success=X, success=Y%, fail=Y%.

       0     All jobs ran without error. If success=X is used: X jobs ran
             without error. If success=Y% is used: Y% of the jobs ran without
             error.

       1-100 Some of the jobs failed. The exit status gives the number of
             failed jobs. If Y% is used, the exit status is the percentage of
             jobs that failed.

       101   More than 100 jobs failed.

       255   Other error.

       -1 (In joblog and SQL table)
             Killed by Ctrl-C, timeout, not enough memory or similar.

       -2 (In joblog and SQL table)
             skip() was called in {= =}.

       -1000 (In SQL table)
             Job is ready to run (set by --sqlmaster).

       -1220 (In SQL table)
             Job is taken by worker (set by --sqlworker).

       If fail=1 is used, the exit status will be the exit status of the
       failing job.

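       The default rule above (exit status = number of failed jobs, capped
       at 101) can be sketched in plain shell with made-up job exit codes
       (GNU parallel is not invoked):

```shell
# Count the failing jobs and cap the result at 101,
# mirroring the 0 / 1-100 / 101 rows of the table above.
failed=0
for rc in 0 1 0 2 0; do      # made-up exit codes of five jobs
  if [ "$rc" -ne 0 ]; then
    failed=$((failed + 1))
  fi
done
if [ "$failed" -gt 101 ]; then
  failed=101
fi
echo "exit status: $failed"  # prints: exit status: 2
```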

DIFFERENCES BETWEEN GNU Parallel AND ALTERNATIVES

       See: man parallel_alternatives


BUGS

   Quoting of newline
       Because of the way newline is quoted, this will not work:

         echo 1,2,3 | parallel -vkd, "echo 'a{}b'"

       However, these will all work:

         echo 1,2,3 | parallel -vkd, echo a{}b
         echo 1,2,3 | parallel -vkd, "echo 'a'{}'b'"
         echo 1,2,3 | parallel -vkd, "echo 'a'"{}"'b'"

   Speed
       Startup

       GNU parallel is slow at starting up - around 250 ms the first time
       and 150 ms after that.

       Job startup

       Starting a job on the local machine takes around 10 ms. This can be a
       big overhead if the job takes very few ms to run. Often you can group
       small jobs together using -X, which will make the overhead less
       significant. Or you can run multiple GNU parallels as described in
       EXAMPLE: Speeding up fast jobs.

       SSH

       When using multiple computers GNU parallel opens ssh connections to
       them to figure out how many connections can be used reliably
       simultaneously (namely sshd's MaxStartups). This test is done for
       each host in serial, so if your --sshloginfile contains many hosts it
       may be slow.

       If your jobs are short, you may see fewer jobs running on the remote
       systems than expected. This is due to time spent logging in and out.
       -M may help here.

       Disk access

       A single disk can normally read data faster if it reads one file at a
       time instead of reading a lot of files in parallel, as this will
       avoid disk seeks. However, newer disk systems with multiple drives
       can read faster if reading from multiple files in parallel.

       If the jobs are of the form read-all-compute-all-write-all, so
       everything is read before anything is written, it may be faster to
       force only one disk access at a time:

         sem --id diskio cat file | compute | sem --id diskio cat > file

       If the jobs are of the form read-compute-write, so writing starts
       before all reading is done, it may be faster to force only one reader
       and one writer at a time:

         sem --id read cat file | compute | sem --id write cat > file

       If the jobs are of the form read-compute-read-compute, it may be
       faster to run more jobs in parallel than the system has CPUs, as some
       of the jobs will be stuck waiting for disk access.

   --nice limits command length
       The current implementation of --nice is too pessimistic in the max
       allowed command length. It only uses a little more than half of what
       it could. This affects -X and -m. If this becomes a real problem for
       you, file a bug-report.

   Aliases and functions do not work
       If you get:

         Can't exec "command": No such file or directory

       or:

         open3: exec of by command failed

       or:

         /bin/bash: command: command not found

       it may be because command is not known, but it could also be because
       command is an alias or a function. If it is a function you need to
       export -f the function first or use env_parallel. An alias will only
       work if you use env_parallel.

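       The function case can be sketched with bash -c standing in for the
       sub shell that GNU parallel starts (a toy illustration; the function
       name is made up, and export -f is bash-specific):

```shell
# Without export -f the child shell cannot see the function.
myfunc() { echo "hello $1"; }
export -f myfunc
bash -c 'myfunc world'    # prints: hello world
```

       With GNU parallel the pattern is the same: export -f the function
       before calling parallel, or use env_parallel.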
   Database with MySQL fails randomly
       The --sql* options may fail randomly with MySQL. This problem does not
       exist with PostgreSQL.


REPORTING BUGS

       Report bugs to <bug-parallel@gnu.org> or
       https://savannah.gnu.org/bugs/?func=additem&group=parallel

       See a perfect bug report on
       https://lists.gnu.org/archive/html/bug-parallel/2015-01/msg00000.html

       Your bug report should always include:

       · The error message you get (if any). If the error message is not
         from GNU parallel you need to show why you think GNU parallel
         caused it.

       · The complete output of parallel --version. If you are not running
         the latest released version (see http://ftp.gnu.org/gnu/parallel/)
         you should specify why you believe the problem is not fixed in that
         version.

       · A minimal, complete, and verifiable example (see description on
         http://stackoverflow.com/help/mcve).

         It should be a complete example that others can run and that shows
         the problem, including all files needed to run the example. This
         should preferably be small and simple, so try to remove as many
         options as possible. A combination of yes, seq, cat, echo, wc, and
         sleep can reproduce most errors. If your example requires large
         files, see if you can make them with something like seq 100000000 >
         bigfile or yes | head -n 1000000000 > file.

         If your example requires remote execution, see if you can use
         localhost - maybe using another login.

         If you have access to a different system (maybe a VirtualBox on your
         own machine), test if the MCVE shows the problem on that system.

       · The output of your example. If your problem is not easily reproduced
         by others, the output might help them figure out the problem.

       · Whether you have watched the intro videos
         (http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1), walked
         through the tutorial (man parallel_tutorial), and read the EXAMPLE
         section in the man page (man parallel - search for EXAMPLE:).

       If you suspect the error is dependent on your environment or
       distribution, please see if you can reproduce the error on one of
       these VirtualBox images:
       http://sourceforge.net/projects/virtualboximage/files/
       http://www.osboxes.org/virtualbox-images/

       Specifying the name of your distribution is not enough, as you may
       have installed software that is not in the VirtualBox images.

       If you cannot reproduce the error on any of the VirtualBox images
       above, see if you can build a VirtualBox image on which you can
       reproduce the error. If not, you should assume the debugging will be
       done through you. That will put more burden on you, and it is extra
       important that you give any information that may help. In general the
       problem will be fixed faster and with less work for you if you can
       reproduce the error on a VirtualBox.


AUTHOR

       When using GNU parallel for a publication please cite:

       O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
       The USENIX Magazine, February 2011:42-47.

       This helps fund further development, and it won't cost you a cent.
       If you pay 10000 EUR you should feel free to use GNU Parallel
       without citing.

       Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk

       Copyright (C) 2008-2010 Ole Tange, http://ole.tange.dk

       Copyright (C) 2010-2020 Ole Tange, http://ole.tange.dk and Free
       Software Foundation, Inc.

       Parts of the manual concerning xargs compatibility are inspired by
       the manual of xargs from GNU findutils 4.4.2.


LICENSE

       This program is free software; you can redistribute it and/or
       modify it under the terms of the GNU General Public License as
       published by the Free Software Foundation; either version 3 of the
       License, or (at your option) any later version.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
       General Public License for more details.

       You should have received a copy of the GNU General Public License
       along with this program.  If not, see
       <http://www.gnu.org/licenses/>.

   Documentation license I
       Permission is granted to copy, distribute and/or modify this
       documentation under the terms of the GNU Free Documentation
       License, Version 1.3 or any later version published by the Free
       Software Foundation; with no Invariant Sections, with no
       Front-Cover Texts, and with no Back-Cover Texts.  A copy of the
       license is included in the file fdl.txt.

   Documentation license II
       You are free:

       to Share to copy, distribute and transmit the work

       to Remix to adapt the work

       Under the following conditions:

       Attribution
                You must attribute the work in the manner specified by the
                author or licensor (but not in any way that suggests that
                they endorse you or your use of the work).

       Share Alike
                If you alter, transform, or build upon this work, you may
                distribute the resulting work only under the same, similar
                or a compatible license.

       With the understanding that:

       Waiver   Any of the above conditions can be waived if you get
                permission from the copyright holder.

       Public Domain
                Where the work or any of its elements is in the public
                domain under applicable law, that status is in no way
                affected by the license.

       Other Rights
                In no way are any of the following rights affected by the
                license:

                · Your fair dealing or fair use rights, or other
                  applicable copyright exceptions and limitations;

                · The author's moral rights;

                · Rights other persons may have either in the work itself
                  or in how the work is used, such as publicity or privacy
                  rights.

       Notice   For any reuse or distribution, you must make clear to
                others the license terms of this work.

       A copy of the full license is included in the file cc-by-sa.txt.


DEPENDENCIES

       GNU parallel uses Perl, and the Perl modules Getopt::Long,
       IPC::Open3, Symbol, IO::File, POSIX, and File::Temp.

       For --csv it uses the Perl module Text::CSV.

       For remote usage it uses rsync with ssh.


SEE ALSO

       ssh(1), ssh-agent(1), sshpass(1), ssh-copy-id(1), rsync(1),
       find(1), xargs(1), dirname(1), make(1), pexec(1), ppss(1),
       xjobs(1), prll(1), dxargs(1), mdm(1)

20200522                          2020-06-06                       PARALLEL(1)