PARALLEL(1)                        parallel                       PARALLEL(1)


NAME
    parallel - build and execute shell command lines from standard input in
    parallel

SYNOPSIS
    parallel [options] [command [arguments]] < list_of_arguments

    parallel [options] [command [arguments]] ( ::: arguments | :::+
    arguments | :::: argfile(s) | ::::+ argfile(s) ) ...

    parallel --semaphore [options] command

    #!/usr/bin/parallel --shebang [options] [command [arguments]]

    #!/usr/bin/parallel --shebang-wrap [options] [command [arguments]]

DESCRIPTION
    STOP!

    Read the Reader's guide below if you are new to GNU parallel.

    GNU parallel is a shell tool for executing jobs in parallel using one
    or more computers. A job can be a single command or a small script that
    has to be run for each of the lines in the input. The typical input is
    a list of files, a list of hosts, a list of users, a list of URLs, or a
    list of tables. A job can also be a command that reads from a pipe. GNU
    parallel can then split the input into blocks and pipe a block into
    each command in parallel.

    If you use xargs and tee today you will find GNU parallel very easy to
    use, as GNU parallel is written to have the same options as xargs. If
    you write loops in shell, you will find GNU parallel may be able to
    replace most of the loops and make them run faster by running several
    jobs in parallel.

    GNU parallel makes sure output from the commands is the same output as
    you would get had you run the commands sequentially. This makes it
    possible to use output from GNU parallel as input for other programs.

    For each line of input GNU parallel will execute command with the line
    as arguments. If no command is given, the line of input is executed.
    Several lines will be run in parallel. GNU parallel can often be used
    as a substitute for xargs or cat | bash.

  Reader's guide
    GNU parallel includes 4 types of documentation: tutorial, how-to,
    reference, and explanation.

    Tutorial

    If you prefer reading a book, buy GNU Parallel 2018 at
    http://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html
    or download it at: https://doi.org/10.5281/zenodo.1146014 Read at least
    chapters 1 and 2. It should take you less than 20 minutes.

    Otherwise start by watching the intro videos for a quick introduction:
    http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

    If you want to dive deeper: spend a couple of hours walking through the
    tutorial (man parallel_tutorial). Your command line will love you for
    it.

    How-to

    You can find a lot of EXAMPLEs of use after the list of OPTIONS in man
    parallel (use LESS=+/EXAMPLE: man parallel). That will give you an idea
    of what GNU parallel is capable of, and you may find a solution you can
    simply adapt to your situation.

    Reference

    If you need a one-page printable cheat sheet you can find it at:
    https://www.gnu.org/software/parallel/parallel_cheat.pdf

    The man page is the reference for all options.

    Design discussion

    If you want to know the design decisions behind GNU parallel, try: man
    parallel_design. This is also a good intro if you intend to change GNU
    parallel.

OPTIONS
    command
        Command to execute. If command or the following arguments contain
        replacement strings (such as {}) every instance will be substituted
        with the input.

        If command is given, GNU parallel solves the same tasks as xargs.
        If command is not given, GNU parallel will behave similarly to
        cat | sh.

        The command must be an executable, a script, a composed command, an
        alias, or a function.

        Bash functions: export -f the function first or use env_parallel.

        Bash, Csh, or Tcsh aliases: Use env_parallel.

        Zsh, Fish, Ksh, and Pdksh functions and aliases: Use env_parallel.

    {}  Input line. This replacement string will be replaced by a full line
        read from the input source. The input source is normally stdin
        (standard input), but can also be given with -a, :::, or ::::.

        The replacement string {} can be changed with -I.

        If the command line contains no replacement strings then {} will be
        appended to the command line.

        Replacement strings are normally quoted, so special characters are
        not parsed by the shell. The exception is if the command starts
        with a replacement string; then the string is not quoted.

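As a minimal sketch (not part of the reference text itself; -k is added only to keep the output in input order), each argument is substituted for {}, and {} is appended automatically when the command line contains no replacement string:

```shell
# Each argument replaces {}; -k keeps output in input order.
parallel -k echo {} ::: A B C
# prints A, B, C on separate lines

# No replacement string given, so {} is appended - same result:
parallel -k echo ::: A B C
```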
    {.} Input line without extension. This replacement string will be
        replaced by the input with the extension removed. If the input line
        contains . after the last /, the last . until the end of the string
        will be removed and {.} will be replaced with the remainder. E.g.
        foo.jpg becomes foo, subdir/foo.jpg becomes subdir/foo,
        sub.dir/foo.jpg becomes sub.dir/foo, sub.dir/bar remains
        sub.dir/bar. If the input line does not contain . it will remain
        unchanged.

        The replacement string {.} can be changed with --er.

        To understand replacement strings see {}.

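The extension-stripping rule above can be sketched directly (a quick illustration, with -k only for stable output order):

```shell
# Only a '.' after the last '/' counts as an extension separator.
parallel -k echo {.} ::: foo.jpg subdir/foo.jpg sub.dir/bar
# prints: foo / subdir/foo / sub.dir/bar (one per line)
```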
    {/} Basename of input line. This replacement string will be replaced by
        the input with the directory part removed.

        The replacement string {/} can be changed with --basenamereplace.

        To understand replacement strings see {}.

    {//}
        Dirname of input line. This replacement string will be replaced by
        the dir of the input line. See dirname(1).

        The replacement string {//} can be changed with --dirnamereplace.

        To understand replacement strings see {}.

    {/.}
        Basename of input line without extension. This replacement string
        will be replaced by the input with the directory and extension part
        removed. It is a combination of {/} and {.}.

        The replacement string {/.} can be changed with
        --basenameextensionreplace.

        To understand replacement strings see {}.

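A compact comparison of the three path-oriented replacement strings (illustrative sketch only):

```shell
# For the input sub.dir/foo.jpg:
#   {/}  -> basename             (foo.jpg)
#   {//} -> dirname              (sub.dir)
#   {/.} -> basename, no ext.    (foo)
parallel -k echo '{/} {//} {/.}' ::: sub.dir/foo.jpg
# prints: foo.jpg sub.dir foo
```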
    {#} Sequence number of the job to run. This replacement string will be
        replaced by the sequence number of the job being run. It contains
        the same number as $PARALLEL_SEQ.

        The replacement string {#} can be changed with --seqreplace.

        To understand replacement strings see {}.

    {%} Job slot number. This replacement string will be replaced by the
        job's slot number between 1 and the number of jobs to run in
        parallel. There will never be 2 jobs running at the same time with
        the same job slot number.

        The replacement string {%} can be changed with --slotreplace.

        If the job needs to be retried (e.g. using --retries or
        --retry-failed) the job slot is not automatically updated. You
        should then instead use $PARALLEL_JOBSLOT:

          $ do_test() {
              id="$3 {%}=$1 PARALLEL_JOBSLOT=$2"
              echo run "$id";
              sleep 1
              # fail if {%} is odd
              return `echo $1%2 | bc`
            }
          $ export -f do_test
          $ parallel -j3 --jl mylog do_test {%} \$PARALLEL_JOBSLOT {} ::: A B C D
          run A {%}=1 PARALLEL_JOBSLOT=1
          run B {%}=2 PARALLEL_JOBSLOT=2
          run C {%}=3 PARALLEL_JOBSLOT=3
          run D {%}=1 PARALLEL_JOBSLOT=1
          $ parallel --retry-failed -j3 --jl mylog do_test {%} \$PARALLEL_JOBSLOT {} ::: A B C D
          run A {%}=1 PARALLEL_JOBSLOT=1
          run C {%}=3 PARALLEL_JOBSLOT=2
          run D {%}=1 PARALLEL_JOBSLOT=3

        Notice how {%} and $PARALLEL_JOBSLOT differ in the retry run of C
        and D.

        To understand replacement strings see {}.

    {n} Argument from input source n or the n'th argument. This positional
        replacement string will be replaced by the input from input source
        n (when used with -a or ::::) or with the n'th argument (when used
        with -N). If n is negative it refers to the n'th last argument.

        To understand replacement strings see {}.

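Both meanings of {n} can be sketched in two lines (illustrative only; -k keeps the combination order stable):

```shell
# Two input sources: {1} is from the first source, {2} from the second.
parallel -k echo '{2} {1}' ::: a b ::: 1 2
# prints: 1 a / 2 a / 1 b / 2 b

# With -N2 each job gets 2 arguments; {1} is the 1st, {-1} the last.
parallel -k -N2 echo '{-1} {1}' ::: a b c d
# prints: b a / d c
```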
    {n.}
        Argument from input source n or the n'th argument without
        extension. It is a combination of {n} and {.}.

        This positional replacement string will be replaced by the input
        from input source n (when used with -a or ::::) or with the n'th
        argument (when used with -N). The input will have the extension
        removed.

        To understand positional replacement strings see {n}.

    {n/}
        Basename of argument from input source n or the n'th argument. It
        is a combination of {n} and {/}.

        This positional replacement string will be replaced by the input
        from input source n (when used with -a or ::::) or with the n'th
        argument (when used with -N). The input will have the directory (if
        any) removed.

        To understand positional replacement strings see {n}.

    {n//}
        Dirname of argument from input source n or the n'th argument. It
        is a combination of {n} and {//}.

        This positional replacement string will be replaced by the dir of
        the input from input source n (when used with -a or ::::) or with
        the n'th argument (when used with -N). See dirname(1).

        To understand positional replacement strings see {n}.

    {n/.}
        Basename of argument from input source n or the n'th argument
        without extension. It is a combination of {n}, {/}, and {.}.

        This positional replacement string will be replaced by the input
        from input source n (when used with -a or ::::) or with the n'th
        argument (when used with -N). The input will have the directory (if
        any) and extension removed.

        To understand positional replacement strings see {n}.

    {=perl expression=}
        Replace with calculated perl expression. $_ will contain the same
        as {}. After evaluating perl expression $_ will be used as the
        value. It is recommended to only change $_ but you have full access
        to all of GNU parallel's internal functions and data structures. A
        few convenience functions and data structures have been made:

          Q(string)      shell quote a string

          pQ(string)     perl quote a string

          uq() (or uq)   do not quote current replacement string

          total_jobs()   number of jobs in total

          slot()         slot number of job

          seq()          sequence number of job

          @arg           the arguments

        Example:

          seq 10 | parallel echo {} + 1 is {= '$_++' =}
          parallel csh -c {= '$_="mkdir ".Q($_)' =} ::: '12" dir'
          seq 50 | parallel echo job {#} of {= '$_=total_jobs()' =}

        See also: --rpl --parens

    {=n perl expression=}
        Positional equivalent of {=perl expression=}. To understand
        positional replacement strings see {n}.

        See also: {=perl expression=}, {n}.

    ::: arguments
        Use arguments from the command line as input source instead of
        stdin (standard input). Unlike other options for GNU parallel :::
        is placed after the command and before the arguments.

        The following are equivalent:

          (echo file1; echo file2) | parallel gzip
          parallel gzip ::: file1 file2
          parallel gzip {} ::: file1 file2
          parallel --arg-sep ,, gzip {} ,, file1 file2
          parallel --arg-sep ,, gzip ,, file1 file2
          parallel ::: "gzip file1" "gzip file2"

        To avoid treating ::: as special, use --arg-sep to set the argument
        separator to something else. See also --arg-sep.

        If multiple ::: are given, each group will be treated as an input
        source, and all combinations of input sources will be generated.
        E.g. ::: 1 2 ::: a b c will result in the combinations (1,a) (1,b)
        (1,c) (2,a) (2,b) (2,c). This is useful for replacing nested for-
        loops.

        ::: and :::: can be mixed. So these are equivalent:

          parallel echo {1} {2} {3} ::: 6 7 ::: 4 5 ::: 1 2 3
          parallel echo {1} {2} {3} :::: <(seq 6 7) <(seq 4 5) \
            :::: <(seq 1 3)
          parallel -a <(seq 6 7) echo {1} {2} {3} :::: <(seq 4 5) \
            :::: <(seq 1 3)
          parallel -a <(seq 6 7) -a <(seq 4 5) echo {1} {2} {3} \
            ::: 1 2 3
          seq 6 7 | parallel -a - -a <(seq 4 5) echo {1} {2} {3} \
            ::: 1 2 3
          seq 4 5 | parallel echo {1} {2} {3} :::: <(seq 6 7) - \
            ::: 1 2 3

    :::+ arguments
        Like ::: but linked like --link to the previous input source.

        Contrary to --link, values do not wrap: the shortest input source
        determines the length.

        Example:

          parallel echo ::: a b c :::+ 1 2 3 ::: X Y :::+ 11 22

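To make the linking concrete (an illustrative run of the same example, with -k added for stable ordering): a/1, b/2, c/3 form one linked source and X/11, Y/22 another, and only the two groups are combined, giving 3 x 2 = 6 jobs rather than 3 x 3 x 2 x 2:

```shell
# (a,1) (b,2) (c,3) linked; (X,11) (Y,22) linked; groups combined -> 6 lines
parallel -k echo ::: a b c :::+ 1 2 3 ::: X Y :::+ 11 22
```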
    :::: argfiles
        Another way to write -a argfile1 -a argfile2 ...

        ::: and :::: can be mixed.

        See -a, ::: and --link.

    ::::+ argfiles
        Like :::: but linked like --link to the previous input source.

        Contrary to --link, values do not wrap: the shortest input source
        determines the length.

    --null
    -0  Use NUL as delimiter. Normally input lines will end in \n
        (newline). If they end in \0 (NUL), then use this option. It is
        useful for processing arguments that may contain \n (newline).

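A short sketch of why NUL delimiting matters (illustrative; the find pipeline is the typical real-world use):

```shell
# An argument containing a newline survives intact with -0:
printf 'file one\0file\ntwo\0' | parallel -0 -k echo "got: {}"

# Typical pairing with find -print0:
# find . -name '*.log' -print0 | parallel -0 gzip
```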
    --arg-file input-file
    -a input-file
        Use input-file as input source. If you use this option, stdin
        (standard input) is given to the first process run. Otherwise,
        stdin (standard input) is redirected from /dev/null.

        If multiple -a are given, each input-file will be treated as an
        input source, and all combinations of input sources will be
        generated. E.g. if the file foo contains 1 2 and the file bar
        contains a b c, then -a foo -a bar will result in the combinations
        (1,a) (1,b) (1,c) (2,a) (2,b) (2,c). This is useful for replacing
        nested for-loops.

        See also --link and {n}.

    --arg-file-sep sep-str
        Use sep-str instead of :::: as separator string between command and
        argument files. Useful if :::: is used for something else by the
        command.

        See also: ::::.

    --arg-sep sep-str
        Use sep-str instead of ::: as separator string. Useful if ::: is
        used for something else by the command.

        Also useful if your command uses ::: but you still want to read
        arguments from stdin (standard input): simply change --arg-sep to a
        string that is not in the command line.

        See also: :::.

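A minimal sketch of --arg-sep (illustrative; -k only keeps output order stable):

```shell
# ',,' now plays the role of ':::', so ':::' could be passed through
# to the command as an ordinary argument.
parallel --arg-sep ,, -k echo {} ,, a b
```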
    --bar
        Show progress as a progress bar. The bar shows: % of jobs
        completed, estimated seconds left, and number of jobs started.

        It is compatible with zenity:

          seq 1000 | parallel -j30 --bar '(echo {};sleep 0.1)' \
            2> >(perl -pe 'BEGIN{$/="\r";$|=1};s/\r/\n/g' |
                 zenity --progress --auto-kill) | wc

    --basefile file
    --bf file
        file will be transferred to each sshlogin before a job is started.
        It will be removed if --cleanup is active. The file may be a script
        to run or some common base data needed for the job. Multiple --bf
        can be specified to transfer more basefiles. The file will be
        transferred the same way as --transferfile.

    --basenamereplace replace-str
    --bnr replace-str
        Use the replacement string replace-str instead of {/} for basename
        of input line.

    --basenameextensionreplace replace-str
    --bner replace-str
        Use the replacement string replace-str instead of {/.} for basename
        of input line without extension.

    --bin binexpr
        Use binexpr as binning key and bin input to the jobs.

        binexpr is [column number|column name] [perlexpression] e.g. 3,
        Address, 3 $_%=100, Address s/\D//g.

        Each input line is split using --colsep. The value of the column is
        put into $_, the perl expression is executed, and the resulting
        value is the job slot that will be given the line. If the value is
        bigger than the number of jobslots the value will be taken modulo
        the number of jobslots.

        This is similar to --shard but the hashing algorithm is a simple
        modulo, which makes it predictable which jobslot will receive which
        value.

        The performance is in the order of 100K rows per second. Faster if
        the bincol is small (<10), slower if it is big (>100).

        --bin requires --pipe and a fixed numeric value for --jobs.

        See also --shard, --group-by, --roundrobin.

    --bg
        Run command in background, so GNU parallel will not wait for
        completion of the command before exiting. This is the default if
        --semaphore is set.

        See also: --fg, man sem.

        Implies --semaphore.

    --bibtex
    --citation
        Print the citation notice and BibTeX entry for GNU parallel,
        silence the citation notice for all future runs, and exit. It will
        not run any commands.

        If it is impossible for you to run --citation you can instead use
        --will-cite, which will run commands, but which will only silence
        the citation notice for this single run.

        If you use --will-cite in scripts to be run by others you are
        making it harder for others to see the citation notice. The
        development of GNU parallel is indirectly financed through
        citations, so if your users do not know they should cite then you
        are making it harder to finance development. However, if you pay
        10000 EUR, you have done your part to finance future development
        and should feel free to use --will-cite in scripts.

        If you do not want to help finance future development by letting
        other users see the citation notice or by paying, then please use
        another tool instead of GNU parallel. You can find some of the
        alternatives in man parallel_alternatives.

    --block size
    --block-size size
        Size of block in bytes to read at a time. The size can be postfixed
        with K, M, G, T, P, E, k, m, g, t, p, or e which would multiply the
        size by 1024, 1048576, 1073741824, 1099511627776,
        1125899906842624, 1152921504606846976, 1000, 1000000, 1000000000,
        1000000000000, 1000000000000000, or 1000000000000000000
        respectively.

        GNU parallel tries to meet the block size but can be off by the
        length of one record. For performance reasons size should be bigger
        than two records. GNU parallel will warn you and automatically
        increase the size if you choose a size that is too small.

        If you use -N, --block-size should be bigger than N+1 records.

        size defaults to 1M.

        When using --pipepart a negative block size is not interpreted as a
        blocksize but as the number of blocks each jobslot should have. So
        this will run 10*5 = 50 jobs in total:

          parallel --pipepart -a myfile --block -10 -j5 wc

        This is an efficient alternative to --roundrobin because data is
        never read by GNU parallel, but you can still have very few
        jobslots process a large amount of data.

        See --pipe and --pipepart for use of this.

    --blocktimeout duration
    --bt duration
        Time out for reading a block when using --pipe. If it takes longer
        than duration to read a full block, use the partial block read so
        far.

        duration must be in whole seconds, but can be expressed as floats
        postfixed with s, m, h, or d which would multiply the float by 1,
        60, 3600, or 86400. Thus these are equivalent: --blocktimeout
        100000 and --blocktimeout 1d3.5h16.6m4s.

    --cat
        Create a temporary file with content. Normally --pipe/--pipepart
        will give data to the program on stdin (standard input). With --cat
        GNU parallel will create a temporary file with the name in {}, so
        you can do: parallel --pipe --cat wc {}.

        Implies --pipe unless --pipepart is used.

        See also --fifo.

    --cleanup
        Remove transferred files. --cleanup will remove the transferred
        files on the remote computer after processing is done.

          find log -name '*gz' | parallel \
            --sshlogin server.example.com --transferfile {} \
            --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"

        With --transferfile {} the file transferred to the remote computer
        will be removed on the remote computer. Directories created will
        not be removed - even if they are empty.

        With --return the file transferred from the remote computer will be
        removed on the remote computer. Directories created will not be
        removed - even if they are empty.

        --cleanup is ignored when not used with --transferfile or --return.

    --colsep regexp
    -C regexp
        Column separator. The input will be treated as a table with regexp
        separating the columns. The n'th column can be accessed using {n}
        or {n.}. E.g. {3} is the 3rd column.

        If there are more input sources, each input source will be
        separated, but the columns from each input source will be linked
        (see --link).

          parallel --colsep '-' echo {4} {3} {2} {1} \
            ::: A-B C-D ::: e-f g-h

        --colsep implies --trim rl, which can be overridden with --trim n.

        regexp is a Perl Regular Expression:
        http://perldoc.perl.org/perlre.html

    --compress
        Compress temporary files. If the output is big and very
        compressible this will take up less disk space in $TMPDIR and
        possibly be faster due to less disk I/O.

        GNU parallel will try pzstd, lbzip2, pbzip2, zstd, pigz, lz4, lzop,
        plzip, lzip, lrz, gzip, pxz, lzma, bzip2, xz, clzip, in that order,
        and use the first available.

    --compress-program prg
    --decompress-program prg
        Use prg for (de)compressing temporary files. It is assumed that prg
        -dc will decompress stdin (standard input) to stdout (standard
        output) unless --decompress-program is given.

    --csv
        Treat input as CSV-format. --colsep sets the field delimiter. It
        works very much like --colsep except it deals correctly with
        quoting:

          echo '"1 big, 2 small","2""x4"" plank",12.34' |
            parallel --csv echo {1} of {2} at {3}

        Even quoted newlines are parsed correctly:

          (echo '"Start of field 1 with newline'
           echo 'Line 2 in field 1";value 2') |
            parallel --csv --colsep ';' echo Field 1: {1} Field 2: {2}

        When used with --pipe, only full CSV records are passed.

    --delay mytime (alpha testing)
        Delay starting the next job by mytime. GNU parallel will pause
        mytime after starting each job. mytime is normally in seconds, but
        can be floats postfixed with s, m, h, or d which would multiply the
        float by 1, 60, 3600, or 86400. Thus these are equivalent:
        --delay 100000 and --delay 1d3.5h16.6m4s.

        If you append 'auto' to mytime (e.g. 13m3sauto) GNU parallel will
        automatically try to find the optimal value: if a job fails, mytime
        is doubled; if a job succeeds, mytime is decreased by 10%.

    --delimiter delim
    -d delim
        Input items are terminated by delim. Quotes and backslash are not
        special; every character in the input is taken literally. Disables
        the end-of-file string, which is treated like any other argument.
        The specified delimiter may be characters, C-style character
        escapes such as \n, or octal or hexadecimal escape codes. Octal
        and hexadecimal escape codes are understood as for the printf
        command. Multibyte characters are not supported.

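A minimal sketch of a custom delimiter (illustrative; -k keeps output order stable):

```shell
# Split input records on ':' instead of newline
printf 'a:b:c' | parallel -k -d : echo
# prints a, b, c on separate lines
```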
    --dirnamereplace replace-str
    --dnr replace-str
        Use the replacement string replace-str instead of {//} for dirname
        of input line.

    --dry-run
        Print the job to run on stdout (standard output), but do not run
        the job. Use -v -v to include the wrapping that GNU parallel
        generates (for remote jobs, --tmux, --nice, --pipe, --pipepart,
        --fifo and --cat). Do not count on this literally, though, as the
        job may be scheduled on another computer or the local computer if :
        is in the list.

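A quick sketch: --dry-run prints the composed command lines instead of executing them, which is handy for checking replacement-string expansion before a long run:

```shell
# Nothing is executed; the would-be command lines are printed
parallel -k --dry-run echo {} ::: A B
# prints: echo A / echo B
```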
    -E eof-str
        Set the end of file string to eof-str. If the end of file string
        occurs as a line of input, the rest of the input is not read. If
        neither -E nor -e is used, no end of file string is used.

    --eof[=eof-str]
    -e[eof-str]
        This option is a synonym for the -E option. Use -E instead,
        because it is POSIX compliant for xargs while this option is not.
        If eof-str is omitted, there is no end of file string. If neither
        -E nor -e is used, no end of file string is used.

    --embed
        Embed GNU parallel in a shell script. If you need to distribute
        your script to someone who does not want to install GNU parallel
        you can embed GNU parallel in your own shell script:

          parallel --embed > new_script

        After which you add your code at the end of new_script. This is
        tested on ash, bash, dash, ksh, sh, and zsh.

    --env var
        Copy environment variable var. This will copy var to the
        environment that the command is run in. This is especially useful
        for remote execution.

        In Bash var can also be a Bash function - just remember to export
        -f the function, see command.

        The variable '_' is special. It will copy all exported environment
        variables except for the ones mentioned in
        ~/.parallel/ignored_vars.

        To copy the full environment (both exported and not exported
        variables, arrays, and functions) use env_parallel.

        See also: --record-env, --session.

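A minimal local sketch (illustrative; note that for purely local jobs the exported variable would usually be inherited anyway - the option matters most with remote --sshlogin execution):

```shell
# MYVAR is explicitly copied into the environment the job runs in
export MYVAR="hello"
parallel --env MYVAR 'echo "$MYVAR {}"' ::: world
# prints: hello world
```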
    --eta
        Show the estimated number of seconds before finishing. This forces
        GNU parallel to read all jobs before starting to find the number of
        jobs. GNU parallel normally only reads the next job to run.

        The estimate is based on the runtime of finished jobs, so the first
        estimate will only be shown when the first job has finished.

        Implies --progress.

        See also: --bar, --progress.

    --fg
        Run command in foreground.

        With --tmux and --tmuxpane GNU parallel will start tmux in the
        foreground.

        With --semaphore GNU parallel will run the command in the
        foreground (opposite --bg), and wait for completion of the command
        before exiting.

        See also --bg, man sem.

    --fifo
        Create a temporary fifo with content. Normally --pipe and
        --pipepart will give data to the program on stdin (standard input).
        With --fifo GNU parallel will create a temporary fifo with the name
        in {}, so you can do: parallel --pipe --fifo wc {}.

        Beware: if data is not read from the fifo, the job will block
        forever.

        Implies --pipe unless --pipepart is used.

        See also --cat.

    --filter-hosts
        Remove down hosts. For each remote host: check that login through
        ssh works. If not: do not use this host.

        For performance reasons, this check is performed only at the start
        and every time --sshloginfile is changed. If a host goes down
        after the first check, it will go undetected until --sshloginfile
        is changed; --retries can be used to mitigate this.

        Currently you can not put --filter-hosts in a profile, $PARALLEL,
        /etc/parallel/config or similar. This is because GNU parallel uses
        GNU parallel to compute this, so you will get an infinite loop.
        This will likely be fixed in a later release.

    --gnu
        Behave like GNU parallel. This option historically took precedence
        over --tollef. The --tollef option is now retired, and therefore
        may not be used. --gnu is kept for compatibility.

    --group
        Group output. Output from each job is grouped together and is only
        printed when the command is finished. Stdout (standard output)
        first, followed by stderr (standard error).

        This takes in the order of 0.5ms per job and depends on the speed
        of your disk for larger output. It can be disabled with -u, but
        this means output from different commands can get mixed.

        --group is the default. Can be reversed with -u.

        See also: --line-buffer --ungroup

    --group-by val (alpha testing)
        Group input by value. Combined with --pipe/--pipepart, --group-by
        groups lines with the same value into a record.

        The value can be computed from the full line or from a single
        column.

        val can be:

          column number   Use the value in the column numbered.

          column name     Treat the first line as a header and use the
                          value in the column named.

                          (Not supported with --pipepart).

          perl expression
                          Run the perl expression and use $_ as the value.

          column number perl expression
                          Put the value of the column in $_, run the perl
                          expression, and use $_ as the value.

          column name perl expression
                          Put the value of the column in $_, run the perl
                          expression, and use $_ as the value.

                          (Not supported with --pipepart).

        Example:

          UserID, Consumption
          123, 1
          123, 2
          12-3, 1
          221, 3
          221, 1
          2/21, 5

        If you want to group 123, 12-3, 221, and 2/21 into 4 records and
        pass one record at a time to wc:

          tail -n +2 table.csv | \
            parallel --pipe --colsep , --group-by 1 -kN1 wc

        Make GNU parallel treat the first line as a header:

          cat table.csv | \
            parallel --pipe --colsep , --header : --group-by 1 -kN1 wc

        Address the column by column name:

          cat table.csv | \
            parallel --pipe --colsep , --header : --group-by UserID -kN1 wc

        If 12-3 and 123 are really the same UserID, remove non-digits in
        UserID when grouping:

          cat table.csv | parallel --pipe --colsep , --header : \
            --group-by 'UserID s/\D//g' -kN1 wc

        See also --shard, --roundrobin.

    --help
    -h  Print a summary of the options to GNU parallel and exit.

    --halt-on-error val
    --halt val
        When should GNU parallel terminate? In some situations it makes no
        sense to run all jobs. GNU parallel should simply give up as soon
        as a condition is met.

        val defaults to never, which runs all jobs no matter what.

        val can also take on the form of when,why.

        when can be 'now', which means kill all running jobs and halt
        immediately, or it can be 'soon', which means wait for all running
        jobs to complete, but start no new jobs.

        why can be 'fail=X', 'fail=Y%', 'success=X', 'success=Y%',
        'done=X', or 'done=Y%' where X is the number of jobs that have to
        fail, succeed, or be done before halting, and Y is the percentage
        of jobs that have to fail, succeed, or be done before halting.

        Example:

          --halt now,fail=1      exit when the first job fails. Kill
                                 running jobs.

          --halt soon,fail=3     exit when 3 jobs have failed, but wait
                                 for running jobs to complete.

          --halt soon,fail=3%    exit when 3% of the jobs have failed, but
                                 wait for running jobs to complete.

          --halt now,success=1   exit when a job succeeds. Kill running
                                 jobs.

          --halt soon,success=3  exit when 3 jobs have succeeded, but wait
                                 for running jobs to complete.

          --halt now,success=3%  exit when 3% of the jobs have succeeded.
                                 Kill running jobs.

          --halt now,done=1      exit when one of the jobs finishes. Kill
                                 running jobs.

          --halt soon,done=3     exit when 3 jobs have finished, but wait
                                 for running jobs to complete.

          --halt now,done=3%     exit when 3% of the jobs have finished.
                                 Kill running jobs.

        For backwards compatibility these also work:

          0      never

          1      soon,fail=1

          2      now,fail=1

          -1     soon,success=1

          -2     now,success=1

          1-99%  soon,fail=1-99%

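A minimal sketch of the halt-on-first-failure case (illustrative; -j1 serializes the jobs so the effect is deterministic):

```shell
# Job 2 exits non-zero, so the run halts and job 3 is never started.
# parallel itself then exits with a non-zero status.
parallel -j1 -k --halt now,fail=1 'echo {}; exit {}' ::: 0 1 0
# prints 0 and 1, but not the final 0
```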
    --header regexp
        Use regexp as header. For normal usage the matched header
        (typically the first line: --header '.*\n') will be split using
        --colsep (which will default to '\t') and column names can be used
        as replacement variables: {column name}, {column name/}, {column
        name//}, {column name/.}, {column name.}, {=column name perl
        expression =}, ..

        For --pipe the matched header will be prepended to each output.

        --header : is an alias for --header '.*\n'.

        If regexp is a number, it is a fixed number of lines.

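A minimal sketch of header-named replacement strings (illustrative; a single column named "name", with -k only for stable order):

```shell
# The first line 'name' becomes the column name, usable as {name}
printf 'name\nfoo\nbar\n' | parallel -k --header : echo "hello {name}"
# prints: hello foo / hello bar
```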
    --hostgroups (alpha testing)
    --hgrp (alpha testing)
        Enable hostgroups on arguments. If an argument contains '@' the
        string after '@' will be removed and treated as a list of
        hostgroups on which this job is allowed to run. If there is no
        --sshlogin with a corresponding group, the job will run on any
        hostgroup.

        Example:

          parallel --hostgroups \
            --sshlogin @grp1/myserver1 -S @grp1+grp2/myserver2 \
            --sshlogin @grp3/myserver3 \
            echo ::: my_grp1_arg@grp1 arg_for_grp2@grp2 third@grp1+grp3

        my_grp1_arg may be run on either myserver1 or myserver2, third may
        be run on either myserver1 or myserver3, but arg_for_grp2 will only
        be run on myserver2.

        See also: --sshlogin, $PARALLEL_HOSTGROUPS.

    -I replace-str
        Use the replacement string replace-str instead of {}.

    --replace[=replace-str]
    -i[replace-str]
        This option is a synonym for -Ireplace-str if replace-str is
        specified, and for -I {} otherwise. This option is deprecated; use
        -I instead.

894 --joblog logfile
895 Logfile for executed jobs. Save a list of the executed jobs to
896 logfile in the following TAB separated format: sequence number,
897 sshlogin, start time as seconds since epoch, run time in seconds,
898 bytes in files transferred, bytes in files returned, exit status,
899 signal, and command run.
900
901 For --pipe, bytes transferred and bytes returned are the number of
902 bytes of input and output.
903
904 If logfile is prepended with '+' log lines will be appended to the
905 logfile.
906
907 To convert the times into ISO-8601 strict do:
908
909 cat logfile | perl -a -F"\t" -ne \
910 'chomp($F[2]=`date -d \@$F[2] +%FT%T`); print join("\t",@F)'
911
912 If the sshlogin column is long, you can use column -t to pretty
913 print it:
913
914 cat joblog | column -t
915
916 See also --resume --resume-failed.
917
918 --jobs N
919 -j N
920 --max-procs N
921 -P N
922 Number of jobslots on each machine. Run up to N jobs in parallel.
923 0 means as many as possible. Default is 100% which will run one job
924 per CPU on each machine.
925
926 If --semaphore is set, the default is 1 thus making a mutex.
927
928 --jobs +N
929 -j +N
930 --max-procs +N
931 -P +N
932 Add N to the number of CPUs. Run this many jobs in parallel. See
933 also --use-cores-instead-of-threads and
934 --use-sockets-instead-of-threads.
935
936 --jobs -N
937 -j -N
938 --max-procs -N
939 -P -N
940 Subtract N from the number of CPUs. Run this many jobs in
941 parallel. If the evaluated number is less than 1 then 1 will be
942 used. See also --use-cores-instead-of-threads and
943 --use-sockets-instead-of-threads.
944
945 --jobs N%
946 -j N%
947 --max-procs N%
948 -P N%
949 Multiply N% by the number of CPUs. Run this many jobs in
950 parallel. See also --use-cores-instead-of-threads and
951 --use-sockets-instead-of-threads.
952
953 --jobs procfile
954 -j procfile
955 --max-procs procfile
956 -P procfile
957 Read parameter from file. Use the content of procfile as parameter
958 for -j. E.g. procfile could contain the string 100% or +2 or 10. If
959 procfile is changed when a job completes, procfile is read again
960 and the new number of jobs is computed. If the number is lower than
961 before, running jobs will be allowed to finish but new jobs will
962 not be started until the wanted number of jobs has been reached.
963 This makes it possible to change the number of simultaneous running
964 jobs while GNU parallel is running.
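For example, the number of simultaneous jobs can be lowered while GNU parallel runs by overwriting the file (a sketch; /tmp/procfile is an arbitrary name):

```shell
echo 4 > /tmp/procfile
parallel -j /tmp/procfile 'sleep .2; echo {}' ::: 1 2 3 4 5 6 7 8 &
sleep .3
echo 1 > /tmp/procfile    # takes effect as running jobs complete
wait
rm /tmp/procfile
```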
965
966 --keep-order
967 -k Keep sequence of output same as the order of input. Normally the
968 output of a job will be printed as soon as the job completes. Try
969 this to see the difference:
970
971 parallel -j4 sleep {}\; echo {} ::: 2 1 4 3
972 parallel -j4 -k sleep {}\; echo {} ::: 2 1 4 3
973
974 If used with --onall or --nonall the output will be grouped by
975 sshlogin in sorted order.
976
977 If used with --pipe --roundrobin and the same input, the jobslots
978 will get the same blocks in the same order in every run.
979
980 -k only affects the order in which the output is printed - not the
981 order in which jobs are run.
982
983 -L recsize
984 When used with --pipe: Read records of recsize.
985
986 When used otherwise: Use at most recsize nonblank input lines per
987 command line. Trailing blanks cause an input line to be logically
988 continued on the next input line.
989
990 -L 0 means read one line, but insert 0 arguments on the command
991 line.
992
993 Implies -X unless -m, --xargs, or --pipe is set.
994
995 --max-lines[=recsize]
996 -l[recsize]
997 When used with --pipe: Read records of recsize lines.
998
999 When used otherwise: Synonym for the -L option. Unlike -L, the
1000 recsize argument is optional. If recsize is not specified, it
1001 defaults to one. The -l option is deprecated since the POSIX
1002 standard specifies -L instead.
1003
1004 -l 0 is an alias for -l 1.
1005
1006 Implies -X unless -m, --xargs, or --pipe is set.
1007
1008 --limit "command args"
1009 Dynamic job limit. Before starting a new job run command with args.
1010 The exit value of command determines what GNU parallel will do:
1011
1012 0 Below limit. Start another job.
1013
1014 1 Over limit. Start no jobs.
1015
1016 2 Way over limit. Kill the youngest job.
1017
1018 You can use any shell command. There are 3 predefined commands:
1019
1020 "io n" Limit for I/O. The amount of disk I/O will be computed as
1021 a value 0-100, where 0 is no I/O and 100 is at least one
1022 disk is 100% saturated.
1023
1024 "load n" Similar to --load.
1025
1026 "mem n" Similar to --memfree.
1027
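A sketch of a custom limit command (the load threshold 8 is an arbitrary example value; /proc/loadavg is Linux-specific, and jobs pause while the limit command reports over limit):

```shell
# The limit command exits 0 (below limit) while the 1-minute load
# average is under 8, and 1 (over limit) otherwise.
parallel -k -j8 --limit 'test "$(cut -d. -f1 /proc/loadavg)" -lt 8' \
  echo ::: a b c
```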
1028 --line-buffer
1029 --lb
1030 Buffer output on line basis. --group will keep the output together
1031 for a whole job. --ungroup allows output to mix, with half a line
1032 coming from one job and half a line coming from another job.
1033 --line-buffer fits between these two: GNU parallel will print a
1034 full line, but will allow for mixing lines of different jobs.
1035
1036 --line-buffer takes more CPU power than both --group and --ungroup,
1037 but can be much faster than --group if the CPU is not the limiting
1038 factor.
1039
1040 Normally --line-buffer does not buffer on disk, and can thus
1041 process an infinite amount of data, but it will buffer on disk when
1042 combined with: --keep-order, --results, --compress, and --files.
1043 This will make it as slow as --group and will limit output to the
1044 available disk space.
1045
1046 With --keep-order --line-buffer will output lines from the first
1047 job continuously while it is running, then lines from the second
1048 job while that is running. It will buffer full lines, but jobs will
1049 not mix. Compare:
1050
1051 parallel -j0 'echo {};sleep {};echo {}' ::: 1 3 2 4
1052 parallel -j0 --lb 'echo {};sleep {};echo {}' ::: 1 3 2 4
1053 parallel -j0 -k --lb 'echo {};sleep {};echo {}' ::: 1 3 2 4
1054
1055 See also: --group --ungroup
1056
1057 --xapply
1058 --link
1059 Link input sources. Read multiple input sources like xapply. If
1060 multiple input sources are given, one argument will be read from
1061 each of the input sources. The arguments can be accessed in the
1062 command as {1} .. {n}, so {1} will be a line from the first input
1063 source, and {6} will refer to the line with the same line number
1064 from the 6th input source.
1065
1066 Compare these two:
1067
1068 parallel echo {1} {2} ::: 1 2 3 ::: a b c
1069 parallel --link echo {1} {2} ::: 1 2 3 ::: a b c
1070
1071 Arguments will be recycled if one input source has more arguments
1072 than the others:
1073
1074 parallel --link echo {1} {2} {3} \
1075 ::: 1 2 ::: I II III ::: a b c d e f g
1076
1077 See also --header, :::+, ::::+.
1078
1079 --load max-load
1080 Do not start new jobs on a given computer unless the number of
1081 running processes on the computer is less than max-load. max-load
1082 uses the same syntax as --jobs, so 100% for one per CPU is a valid
1083 setting. The only difference is 0, which is interpreted as 0.01.
1084
1085 --controlmaster
1086 -M Use ssh's ControlMaster to make ssh connections faster. Useful if
1087 jobs run remote and are very fast to run. This is disabled for
1088 sshlogins that specify their own ssh command.
1089
1090 -m Multiple arguments. Insert as many arguments as the command line
1091 length permits. If multiple jobs are being run in parallel:
1092 distribute the arguments evenly among the jobs. Use -j1 or --xargs
1093 to avoid this.
1094
1095 If {} is not used the arguments will be appended to the line. If
1096 {} is used multiple times each {} will be replaced with all the
1097 arguments.
1098
1099 Support for -m with --sshlogin is limited and may fail.
1100
1101 See also -X for context replace. If in doubt use -X as that will
1102 most likely do what is needed.
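A sketch of the difference (with 4 arguments and 2 jobslots, -m splits the arguments evenly between the two command lines, while -X repeats the surrounding context for every argument):

```shell
parallel -k -j2 -m echo ::: a b c d     # two command lines: "a b", "c d"
parallel -k -j1 -X echo pre-{} ::: a b  # one command line: "pre-a pre-b"
```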
1103
1104 --memfree size
1105 Minimum memory free when starting another job. The size can be
1106 postfixed with K, M, G, T, P, k, m, g, t, or p which would multiply
1107 the size with 1024, 1048576, 1073741824, 1099511627776,
1108 1125899906842624, 1000, 1000000, 1000000000, 1000000000000, or
1109 1000000000000000, respectively.
1110
1111 If the jobs take up very different amounts of RAM, GNU parallel will
1112 only start as many as there is memory for. If less than size bytes
1113 are free, no more jobs will be started. If less than 50% of size
1114 bytes are free, the youngest job will be killed, and put back on the
1115 queue to be run later.
1116
1117 --retries must be set to determine how many times GNU parallel
1118 should retry a given job.
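A minimal sketch (1G is an example threshold; the echo stands in for a memory-hungry command):

```shell
# Only start a new job while at least 1 GB RAM is free; a job killed
# for lack of memory is put back on the queue and retried up to 5 times.
parallel -k --memfree 1G --retries 5 'echo processing {}' ::: a b c
```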
1119
1120 --minversion version
1121 Print the version of GNU parallel and exit. If the current version of
1122 GNU parallel is less than version the exit code is 255. Otherwise
1123 it is 0.
1124
1125 This is useful for scripts that depend on features only available
1126 from a certain version of GNU parallel.
1127
1128 --max-args=max-args
1129 -n max-args
1130 Use at most max-args arguments per command line. Fewer than max-
1131 args arguments will be used if the size (see the -s option) is
1132 exceeded, unless the -x option is given, in which case GNU parallel
1133 will exit.
1134
1135 -n 0 means read one argument, but insert 0 arguments on the command
1136 line.
1137
1138 Implies -X unless -m is set.
1139
1140 --max-replace-args=max-args
1141 -N max-args
1142 Use at most max-args arguments per command line. Like -n but also
1143 makes replacement strings {1} .. {max-args} representing arguments
1144 1 .. max-args. If there are too few arguments, the {n} will be empty.
1145
1146 -N 0 means read one argument, but insert 0 arguments on the command
1147 line.
1148
1149 This will set the owner of the homedir to the user:
1150
1151 tr ':' '\n' < /etc/passwd | parallel -N7 chown {1} {6}
1152
1153 Implies -X unless -m or --pipe is set.
1154
1155 When used with --pipe -N is the number of records to read. This is
1156 somewhat slower than --block.
1157
1158 --nonall
1159 --onall with no arguments. Run the command on all computers given
1160 with --sshlogin but take no arguments. GNU parallel will log into
1161 --jobs number of computers in parallel and run the job on the
1162 computer. -j adjusts how many computers to log into in parallel.
1163
1164 This is useful for running the same command (e.g. uptime) on a list
1165 of servers.
1166
1167 --onall
1168 Run all the jobs on all computers given with --sshlogin. GNU
1169 parallel will log into --jobs number of computers in parallel and
1170 run one job at a time on the computer. The order of the jobs will
1171 not be changed, but some computers may finish before others.
1172
1173 When using --group the output will be grouped by each server, so
1174 all the output from one server will be grouped together.
1175
1176 --joblog will contain an entry for each job on each server, so
1177 there will be several job sequence 1.
1178
1179 --output-as-files
1180 --outputasfiles
1181 --files
1182 Instead of printing the output to stdout (standard output) the
1183 output of each job is saved in a file and the filename is then
1184 printed.
1185
1186 See also: --results
1187
1188 --pipe (alpha testing)
1189 --spreadstdin (alpha testing)
1190 Spread input to jobs on stdin (standard input). Read a block of
1191 data from stdin (standard input) and give one block of data as
1192 input to one job.
1193
1194 The block size is determined by --block. The strings --recstart and
1195 --recend tell GNU parallel how a record starts and/or ends. The
1196 block read will have the final partial record removed before the
1197 block is passed on to the job. The partial record will be prepended
1198 to the next block.
1199
1200 If --recstart is given this will be used to split at record start.
1201
1202 If --recend is given this will be used to split at record end.
1203
1204 If both --recstart and --recend are given both will have to match
1205 to find a split position.
1206
1207 If neither --recstart nor --recend are given --recend defaults to
1208 '\n'. To have no record separator use --recend "".
1209
1210 --files is often used with --pipe.
1211
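A minimal sketch: split a stream into roughly 1 MB blocks on newline boundaries and count the lines in each block:

```shell
# Each output line is the line count of one block; the counts sum
# to 1000000 since blocks are split at record (newline) boundaries.
seq 1000000 | parallel --pipe --block 1M wc -l
```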
1212 --pipe maxes out at around 1 GB/s input, and 100 MB/s output. If
1213 performance is important use --pipepart.
1214
1215 See also: --recstart, --recend, --fifo, --cat, --pipepart, --files.
1216
1217 --pipepart
1218 Pipe parts of a physical file. --pipepart works similar to --pipe,
1219 but is much faster.
1220
1221 --pipepart has a few limitations:
1222
1223 • The file must be a normal file or a block device (technically it
1224 must be seekable) and must be given using -a or ::::. The file
1225 cannot be a pipe or a fifo as they are not seekable.
1226
1227 If using a block device with a lot of NUL bytes, remember to set
1228 --recend ''.
1229
1230 • Record counting (-N) and line counting (-L/-l) do not work.
1231
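A minimal sketch using a temporary file (any seekable file works; a pipe would not):

```shell
seq 100000 > /tmp/pipepart-demo
# The per-chunk line counts sum to 100000.
parallel -a /tmp/pipepart-demo --pipepart --block 1M wc -l
rm /tmp/pipepart-demo
```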
1232 --plain
1233 Ignore any --profile, $PARALLEL, and ~/.parallel/config to get full
1234 control on the command line (used by GNU parallel internally when
1235 called with --sshlogin).
1236
1237 --plus
1238 Activate additional replacement strings: {+/} {+.} {+..} {+...}
1239 {..} {...} {/..} {/...} {##}. The idea being that '{+foo}' matches
1240 the opposite of '{foo}' and {} = {+/}/{/} = {.}.{+.} =
1241 {+/}/{/.}.{+.} = {..}.{+..} = {+/}/{/..}.{+..} = {...}.{+...} =
1242 {+/}/{/...}.{+...}
1243
1244 {##} is the total number of jobs to be run. It is incompatible with
1245 -X/-m/--xargs.
1246
1247 {choose_k} is inspired by n choose k: Given a list of n elements,
1248 choose k. k is the number of input sources and n is the number of
1249 arguments in an input source. The content of the input sources
1250 must be the same and the arguments must be unique.
1251
1252 Shorthands for variables:
1253
1254 {slot} $PARALLEL_JOBSLOT (see {%})
1255 {sshlogin} $PARALLEL_SSHLOGIN
1256 {host} $PARALLEL_SSHHOST
1257 {hgrp} $PARALLEL_HOSTGROUPS
1258
1259 The following dynamic replacement strings are also activated. They
1260 are inspired by bash's parameter expansion:
1261
1262 {:-str} str if the value is empty
1263 {:num} remove the first num characters
1264 {:num1:num2} characters from num1 to num2
1265 {#str} remove prefix str
1266 {%str} remove postfix str
1267 {/str1/str2} replace str1 with str2
1268 {^str} uppercase str if found at the start
1269 {^^str} uppercase str
1270 {,str} lowercase str if found at the start
1271 {,,str} lowercase str
1272
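A sketch of a few of the dynamic replacement strings:

```shell
parallel --plus echo {%.tar.gz} ::: my.tar.gz      # remove postfix: my
parallel --plus echo {#my.} ::: my.file.txt        # remove prefix: file.txt
parallel --plus echo {/file/data} ::: my_file.txt  # replace: my_data.txt
```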
1273 --progress
1274 Show progress of computations. List the computers involved in the
1275 task with number of CPUs detected and the max number of jobs to
1276 run. After that show progress for each computer: number of running
1277 jobs, number of completed jobs, and percentage of all jobs done by
1278 this computer. The percentage will only be available after all jobs
1279 have been scheduled as GNU parallel only reads the next job when
1280 ready to schedule it - this is to avoid wasting time and memory by
1281 reading everything at startup.
1282
1283 By sending GNU parallel SIGUSR2 you can toggle turning on/off
1284 --progress on a running GNU parallel process.
1285
1286 See also --eta and --bar.
1287
1288 --max-line-length-allowed
1289 Print the maximal number of characters allowed on the command line
1290 and exit (used by GNU parallel itself to determine the line length
1291 on remote computers).
1292
1293 --number-of-cpus (obsolete)
1294 Print the number of physical CPU cores and exit.
1295
1296 --number-of-cores
1297 Print the number of physical CPU cores and exit (used by GNU
1298 parallel itself to determine the number of physical CPU cores on
1299 remote computers).
1300
1301 --number-of-sockets
1302 Print the number of filled CPU sockets and exit (used by GNU
1303 parallel itself to determine the number of filled CPU sockets on
1304 remote computers).
1305
1306 --number-of-threads
1307 Print the number of hyperthreaded CPU cores and exit (used by GNU
1308 parallel itself to determine the number of hyperthreaded CPU cores
1309 on remote computers).
1310
1311 --no-keep-order
1312 Overrides an earlier --keep-order (e.g. if set in
1313 ~/.parallel/config).
1314
1315 --nice niceness
1316 Run the command at this niceness.
1317
1318 By default GNU parallel will run jobs at the same nice level as GNU
1319 parallel is started - both on the local machine and remote servers,
1320 so you are unlikely to ever use this option.
1321
1322 Setting --nice will override this nice level. If the nice level is
1323 smaller than the current nice level, it will only affect remote
1324 jobs (e.g. if current level is 10 then --nice 5 will cause local
1325 jobs to be run at level 10, but remote jobs run at nice level 5).
1326
1327 --interactive
1328 -p Prompt the user about whether to run each command line and read a
1329 line from the terminal. Only run the command line if the response
1330 starts with 'y' or 'Y'. Implies -t.
1331
1332 --parens parensstring
1333 Define start and end parenthesis for {= perl expression =}. The
1334 left and the right parenthesis can be multiple characters and are
1335 assumed to be the same length. The default is {==} giving {= as the
1336 start parenthesis and =} as the end parenthesis.
1337
1338 Another useful setting is ,,,, which would make both parentheses
1339 ,,:
1340
1341 parallel --parens ,,,, echo foo is ,,s/I/O/g,, ::: FII
1342
1343 See also: --rpl {= perl expression =}
1344
1345 --profile profilename
1346 -J profilename
1347 Use profile profilename for options. This is useful if you want to
1348 have multiple profiles. You could have one profile for running jobs
1349 in parallel on the local computer and a different profile for
1350 running jobs on remote computers. See the section PROFILE FILES for
1351 examples.
1352
1353 profilename corresponds to the file ~/.parallel/profilename.
1354
1355 You can give multiple profiles by repeating --profile. If parts of
1356 the profiles conflict, the later ones will be used.
1357
1358 Default: config
1359
1360 --quote
1361 -q Quote command. If your command contains special characters that
1362 should not be interpreted by the shell (e.g. ; \ | *), use --quote
1363 to escape these. The command must be a simple command (see man
1364 bash) without redirections and without variable assignments.
1365
1366 See the section QUOTING. Most people will not need this. Quoting
1367 is disabled by default.
1368
1369 --no-run-if-empty
1370 -r If the stdin (standard input) only contains whitespace, do not run
1371 the command.
1372
1373 If used with --pipe this is slow.
1374
1375 --noswap
1376 Do not start new jobs on a given computer if there is both swap-in
1377 and swap-out activity.
1378
1379 The swap activity is only sampled every 10 seconds as the sampling
1380 takes 1 second to do.
1381
1382 Swap activity is computed as (swap-in)*(swap-out) which in practice
1383 is a good value: swapping out is not a problem, swapping in is not
1384 a problem, but both swapping in and out usually indicates a
1385 problem.
1386
1387 --memfree may give better results, so try using that first.
1388
1389 --record-env
1390 Record current environment variables in ~/.parallel/ignored_vars.
1391 This is useful before using --env _.
1392
1393 See also --env, --session.
1394
1395 --recstart startstring
1396 --recend endstring
1397 If --recstart is given startstring will be used to split at record
1398 start.
1399
1400 If --recend is given endstring will be used to split at record end.
1401
1402 If both --recstart and --recend are given the combined string
1403 endstringstartstring will have to match to find a split position.
1404 This is useful if either startstring or endstring match in the
1405 middle of a record.
1406
1407 If neither --recstart nor --recend are given then --recend defaults
1408 to '\n'. To have no record separator use --recend "".
1409
1410 --recstart and --recend are used with --pipe.
1411
1412 Use --regexp to interpret --recstart and --recend as regular
1413 expressions. This is slow, however.
1414
1415 --regexp
1416 Use --regexp to interpret --recstart and --recend as regular
1417 expressions. This is slow, however.
1418
1419 --remove-rec-sep
1420 --removerecsep
1421 --rrs
1422 Remove the text matched by --recstart and --recend before piping it
1423 to the command.
1424
1425 Only used with --pipe.
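For example (a sketch; records end in ';', and --rrs strips the separator before each record is piped to cat):

```shell
printf 'a;b;c;' | parallel -k --pipe --recend ';' --rrs -N1 cat
```

Without --rrs the separators would be passed through unchanged.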
1426
1427 --results name
1428 --res name
1429 Save the output into files.
1430
1431 Simple string output dir
1432
1433 If name does not contain replacement strings and does not end in
1434 .csv/.tsv, the output will be stored in a directory tree rooted at
1435 name. Within this directory tree, each command will result in
1436 three files: name/<ARGS>/stdout, name/<ARGS>/stderr, and
1437 name/<ARGS>/seq, where <ARGS> is a sequence of directories
1438 representing the header of the input source (if using --header :)
1439 or the number of the input source and corresponding values.
1440
1441 E.g:
1442
1443 parallel --header : --results foo echo {a} {b} \
1444 ::: a I II ::: b III IIII
1445
1446 will generate the files:
1447
1448 foo/a/II/b/III/seq
1449 foo/a/II/b/III/stderr
1450 foo/a/II/b/III/stdout
1451 foo/a/II/b/IIII/seq
1452 foo/a/II/b/IIII/stderr
1453 foo/a/II/b/IIII/stdout
1454 foo/a/I/b/III/seq
1455 foo/a/I/b/III/stderr
1456 foo/a/I/b/III/stdout
1457 foo/a/I/b/IIII/seq
1458 foo/a/I/b/IIII/stderr
1459 foo/a/I/b/IIII/stdout
1460
1461 and
1462
1463 parallel --results foo echo {1} {2} ::: I II ::: III IIII
1464
1465 will generate the files:
1466
1467 foo/1/II/2/III/seq
1468 foo/1/II/2/III/stderr
1469 foo/1/II/2/III/stdout
1470 foo/1/II/2/IIII/seq
1471 foo/1/II/2/IIII/stderr
1472 foo/1/II/2/IIII/stdout
1473 foo/1/I/2/III/seq
1474 foo/1/I/2/III/stderr
1475 foo/1/I/2/III/stdout
1476 foo/1/I/2/IIII/seq
1477 foo/1/I/2/IIII/stderr
1478 foo/1/I/2/IIII/stdout
1479
1480 CSV file output
1481
1482 If name ends in .csv/.tsv the output will be a CSV-file named name.
1483
1484 .csv gives a comma separated value file. .tsv gives a TAB separated
1485 value file.
1486
1487 -.csv/-.tsv are special: It will give the file on stdout (standard
1488 output).
1489
1490 JSON file output (alpha testing)
1491
1492 If name ends in .json the output will be a JSON-file named name.
1493
1494 -.json is special: It will give the file on stdout (standard
1495 output).
1496
1497 Replacement string output file (alpha testing)
1498
1499 If name contains a replacement string and the replaced result does
1500 not end in /, then the standard output will be stored in a file
1501 named by this result. Standard error will be stored in the same
1502 file name with '.err' added, and the sequence number will be stored
1503 in the same file name with '.seq' added.
1504
1505 E.g.
1506
1507 parallel --results my_{} echo ::: foo bar baz
1508
1509 will generate the files:
1510
1511 my_bar
1512 my_bar.err
1513 my_bar.seq
1514 my_baz
1515 my_baz.err
1516 my_baz.seq
1517 my_foo
1518 my_foo.err
1519 my_foo.seq
1520
1521 Replacement string output dir
1522
1523 If name contains a replacement string and the replaced result ends
1524 in /, then output files will be stored in the resulting dir.
1525
1526 E.g.
1527
1528 parallel --results my_{}/ echo ::: foo bar baz
1529
1530 will generate the files:
1531
1532 my_bar/seq
1533 my_bar/stderr
1534 my_bar/stdout
1535 my_baz/seq
1536 my_baz/stderr
1537 my_baz/stdout
1538 my_foo/seq
1539 my_foo/stderr
1540 my_foo/stdout
1541
1542 See also --files, --tag, --header, --joblog.
1543
1544 --resume
1545 Resumes from the last unfinished job. By reading --joblog or the
1546 --results dir GNU parallel will figure out the last unfinished job
1547 and continue from there. As GNU parallel only looks at the sequence
1548 numbers in --joblog, the input, the command, and --joblog all
1549 have to remain unchanged; otherwise GNU parallel may run wrong
1550 commands.
1551
1552 See also --joblog, --results, --resume-failed, --retries.
1553
1554 --resume-failed
1555 Retry all failed and resume from the last unfinished job. By
1556 reading --joblog GNU parallel will figure out the failed jobs and
1557 run those again. After that it will resume the last unfinished job and
1558 continue from there. As GNU parallel only looks at the sequence
1559 numbers in --joblog, the input, the command, and --joblog all
1560 have to remain unchanged; otherwise GNU parallel may run wrong
1561 commands.
1562
1563 See also --joblog, --resume, --retry-failed, --retries.
1564
1565 --retry-failed
1566 Retry all failed jobs in joblog. By reading --joblog GNU parallel
1567 will figure out the failed jobs and run those again.
1568
1569 --retry-failed ignores the command and arguments on the command
1570 line: It only looks at the joblog.
1571
1572 Differences between --resume, --resume-failed, --retry-failed
1573
1574 In this example exit {= $_%=2 =} will cause every other job to
1575 fail.
1576
1577 timeout -k 1 4 parallel --joblog log -j10 \
1578 'sleep {}; exit {= $_%=2 =}' ::: {10..1}
1579
1580 4 jobs completed. 2 failed:
1581
1582 Seq [...] Exitval Signal Command
1583 10 [...] 1 0 sleep 1; exit 1
1584 9 [...] 0 0 sleep 2; exit 0
1585 8 [...] 1 0 sleep 3; exit 1
1586 7 [...] 0 0 sleep 4; exit 0
1587
1588 --resume does not care about the Exitval, but only looks at Seq. If
1589 the Seq is run, it will not be run again. So if needed, you can
1590 change the command for the seqs not run yet:
1591
1592 parallel --resume --joblog log -j10 \
1593 'sleep .{}; exit {= $_%=2 =}' ::: {10..1}
1594
1595 Seq [...] Exitval Signal Command
1596 [... as above ...]
1597 1 [...] 0 0 sleep .10; exit 0
1598 6 [...] 1 0 sleep .5; exit 1
1599 5 [...] 0 0 sleep .6; exit 0
1600 4 [...] 1 0 sleep .7; exit 1
1601 3 [...] 0 0 sleep .8; exit 0
1602 2 [...] 1 0 sleep .9; exit 1
1603
1604 --resume-failed cares about the Exitval, but also only looks at Seq
1605 to figure out which commands to run. Again this means you can
1606 change the command, but not the arguments. It will run the failed
1607 seqs and the seqs not yet run:
1608
1609 parallel --resume-failed --joblog log -j10 \
1610 'echo {};sleep .{}; exit {= $_%=3 =}' ::: {10..1}
1611
1612 Seq [...] Exitval Signal Command
1613 [... as above ...]
1614 10 [...] 1 0 echo 1;sleep .1; exit 1
1615 8 [...] 0 0 echo 3;sleep .3; exit 0
1616 6 [...] 2 0 echo 5;sleep .5; exit 2
1617 4 [...] 1 0 echo 7;sleep .7; exit 1
1618 2 [...] 0 0 echo 9;sleep .9; exit 0
1619
1620 --retry-failed cares about the Exitval, but takes the command from
1621 the joblog. It ignores any arguments or commands given on the
1622 command line:
1623
1624 parallel --retry-failed --joblog log -j10 this part is ignored
1625
1626 Seq [...] Exitval Signal Command
1627 [... as above ...]
1628 10 [...] 1 0 echo 1;sleep .1; exit 1
1629 6 [...] 2 0 echo 5;sleep .5; exit 2
1630 4 [...] 1 0 echo 7;sleep .7; exit 1
1631
1632 See also --joblog, --resume, --resume-failed, --retries.
1633
1634 --retries n
1635 If a job fails, retry it on another computer on which it has not
1636 failed. Do this n times. If there are fewer than n computers in
1637 --sshlogin GNU parallel will re-use all the computers. This is
1638 useful if some jobs fail for no apparent reason (such as network
1639 failure).
1640
1641 --return filename
1642 Transfer files from remote computers. --return is used with
1643 --sshlogin when the arguments are files on the remote computers.
1644 When processing is done the file filename will be transferred from
1645 the remote computer using rsync and will be put relative to the
1646 default login dir. E.g.
1647
1648 echo foo/bar.txt | parallel --return {.}.out \
1649 --sshlogin server.example.com touch {.}.out
1650
1651 This will transfer the file $HOME/foo/bar.out from the computer
1652 server.example.com to the file foo/bar.out after running touch
1653 foo/bar.out on server.example.com.
1654
1655 parallel -S server --trc out/./{}.out touch {}.out ::: in/file
1656
1657 This will transfer the file in/file.out from the computer server to
1658 the file out/in/file.out after running touch in/file.out on
1659 server.
1660
1661 echo /tmp/foo/bar.txt | parallel --return {.}.out \
1662 --sshlogin server.example.com touch {.}.out
1663
1664 This will transfer the file /tmp/foo/bar.out from the computer
1665 server.example.com to the file /tmp/foo/bar.out after running touch
1666 /tmp/foo/bar.out on server.example.com.
1667
1668 Multiple files can be transferred by repeating the option multiple
1669 times:
1670
1671 echo /tmp/foo/bar.txt | parallel \
1672 --sshlogin server.example.com \
1673 --return {.}.out --return {.}.out2 touch {.}.out {.}.out2
1674
1675 --return is often used with --transferfile and --cleanup.
1676
1677 --return is ignored when used with --sshlogin : or when not used
1678 with --sshlogin.
1679
1680 --round-robin
1681 --round
1682 Normally --pipe will give a single block to each instance of the
1683 command. With --roundrobin all blocks will be written at random to
1684 commands already running. This is useful if the command takes a
1685 long time to initialize.
1686
1687 --keep-order will not work with --roundrobin as it is impossible to
1688 track which input block corresponds to which output.
1689
1690 --roundrobin implies --pipe, except if --pipepart is given.
1691
1692 See also --group-by, --shard.
1693
1694 --rpl 'tag perl expression'
1695 Use tag as a replacement string for perl expression. This makes it
1696 possible to define your own replacement strings. GNU parallel's 7
1697 replacement strings are implemented as:
1698
1699 --rpl '{} '
1700 --rpl '{#} 1 $_=$job->seq()'
1701 --rpl '{%} 1 $_=$job->slot()'
1702 --rpl '{/} s:.*/::'
1703 --rpl '{//} $Global::use{"File::Basename"} ||=
1704 eval "use File::Basename; 1;"; $_ = dirname($_);'
1705 --rpl '{/.} s:.*/::; s:\.[^/.]+$::;'
1706 --rpl '{.} s:\.[^/.]+$::'
1707
1708 The --plus replacement strings are implemented as:
1709
1710 --rpl '{+/} s:/[^/]*$::'
1711 --rpl '{+.} s:.*\.::'
1712 --rpl '{+..} s:.*\.([^.]*\.):$1:'
1713 --rpl '{+...} s:.*\.([^.]*\.[^.]*\.):$1:'
1714 --rpl '{..} s:\.[^/.]+$::; s:\.[^/.]+$::'
1715 --rpl '{...} s:\.[^/.]+$::; s:\.[^/.]+$::; s:\.[^/.]+$::'
1716 --rpl '{/..} s:.*/::; s:\.[^/.]+$::; s:\.[^/.]+$::'
1717 --rpl '{/...} s:.*/::;s:\.[^/.]+$::;s:\.[^/.]+$::;s:\.[^/.]+$::'
1718 --rpl '{##} $_=total_jobs()'
1719 --rpl '{:-(.+?)} $_ ||= $$1'
1720 --rpl '{:(\d+?)} substr($_,0,$$1) = ""'
1721 --rpl '{:(\d+?):(\d+?)} $_ = substr($_,$$1,$$2);'
1722 --rpl '{#([^#].*?)} s/^$$1//;'
1723 --rpl '{%(.+?)} s/$$1$//;'
1724 --rpl '{/(.+?)/(.*?)} s/$$1/$$2/;'
1725 --rpl '{^(.+?)} s/^($$1)/uc($1)/e;'
1726 --rpl '{^^(.+?)} s/($$1)/uc($1)/eg;'
1727 --rpl '{,(.+?)} s/^($$1)/lc($1)/e;'
1728 --rpl '{,,(.+?)} s/($$1)/lc($1)/eg;'
1729
1730 If the user defined replacement string starts with '{' it can also
1731 be used as a positional replacement string (like {2.}).
1732
1733 It is recommended to only change $_ but you have full access to all
1734 of GNU parallel's internal functions and data structures.
1735
1736 Here are a few examples:
1737
1738 Is the job sequence even or odd?
1739 --rpl '{odd} $_ = seq() % 2 ? "odd" : "even"'
1740 Pad job sequence with leading zeros to get equal width
1741 --rpl '{0#} $f=1+int("".(log(total_jobs())/log(10)));
1742 $_=sprintf("%0${f}d",seq())'
1743 Job sequence counting from 0
1744 --rpl '{#0} $_ = seq() - 1'
1745 Job slot counting from 2
1746 --rpl '{%1} $_ = slot() + 1'
1747 Remove all extensions
1748 --rpl '{:} s:(\.[^/]+)*$::'
1749
1750 You can have dynamic replacement strings by including parenthesis
1751 in the replacement string and adding a regular expression between
1752 the parenthesis. The matching string will be inserted as $$1:
1753
1754 parallel --rpl '{%(.*?)} s/$$1//' echo {%.tar.gz} ::: my.tar.gz
1755 parallel --rpl '{:%(.+?)} s:$$1(\.[^/]+)*$::' \
1756 echo {:%_file} ::: my_file.tar.gz
1757 parallel -n3 --rpl '{/:%(.*?)} s:.*/(.*)$$1(\.[^/]+)*$:$1:' \
1758 echo job {#}: {2} {2.} {3/:%_1} ::: a/b.c c/d.e f/g_1.h.i
1759
1760 You can even use multiple matches:
1761
1762 parallel --rpl '{/(.+?)/(.*?)} s/$$1/$$2/;'
1763 echo {/replacethis/withthis} {/b/C} ::: a_replacethis_b
1764
1765 parallel --rpl '{(.*?)/(.*?)} $_="$$2$_$$1"' \
1766 echo {swap/these} ::: -middle-
1767
1768 See also: {= perl expression =} --parens
1769
1770 --rsync-opts options
1771 Options to pass on to rsync. Setting --rsync-opts takes precedence
1772 over setting the environment variable $PARALLEL_RSYNC_OPTS.
1773
1774 --max-chars=max-chars
1775 -s max-chars
1776 Use at most max-chars characters per command line, including the
1777 command and initial-arguments and the terminating nulls at the ends
1778 of the argument strings. The largest allowed value is system-
1779 dependent, and is calculated as the argument length limit for exec,
1780 less the size of your environment. The default value is the
1781 maximum.
1782
1783 Implies -X unless -m is set.
1784
1785 --show-limits
1786 Display the limits on the command-line length which are imposed by
1787 the operating system and the -s option. Pipe the input from
1788 /dev/null (and perhaps specify --no-run-if-empty) if you don't want
1789 GNU parallel to do anything.
1790
1791 --semaphore
1792 Work as a counting semaphore. --semaphore will cause GNU parallel
1793 to start command in the background. When the number of jobs given
1794 by --jobs is reached, GNU parallel will wait for one of these to
1795 complete before starting another command.
1796
1797 --semaphore implies --bg unless --fg is specified.
1798
1799 --semaphore implies --semaphorename `tty` unless --semaphorename is
1800 specified.
1801
1802 Used with --fg, --wait, and --semaphorename.
1803
1804 The command sem is an alias for parallel --semaphore.
1805
1806 See also man sem.
1807
1808 --semaphorename name
1809 --id name
1810 Use name as the name of the semaphore. Default is the name of the
1811 controlling tty (output from tty).
1812
1813 The default normally works as expected when used interactively, but
1814      when used in a script name should be set. $$ or my_task_name is
1815      often a good value.
1816
1817 The semaphore is stored in ~/.parallel/semaphores/
1818
1819 Implies --semaphore.
1820
1821 See also man sem.
1822
1823 --semaphoretimeout secs
1824 --st secs
1825 If secs > 0: If the semaphore is not released within secs seconds,
1826 take it anyway.
1827
1828 If secs < 0: If the semaphore is not released within secs seconds,
1829 exit.
1830
1831 Implies --semaphore.
1832
1833 See also man sem.
1834
1835 --seqreplace replace-str
1836 Use the replacement string replace-str instead of {#} for job
1837 sequence number.
1838
1839 --session
1840      Record names in the current environment in $PARALLEL_IGNORED_NAMES
1841 exit. Only used with env_parallel. Aliases, functions, and
1842 variables with names in $PARALLEL_IGNORED_NAMES will not be copied.
1843
1844 Only supported in Ash, Bash, Dash, Ksh, Sh, and Zsh.
1845
1846 See also --env, --record-env.
1847
1848 --shard shardexpr
1849 Use shardexpr as shard key and shard input to the jobs.
1850
1851 shardexpr is [column number|column name] [perlexpression] e.g. 3,
1852 Address, 3 $_%=100, Address s/\d//g.
1853
1854 Each input line is split using --colsep. The value of the column is
1855      put into $_, the perl expression is executed, the resulting value
1856      is hashed so that all lines with a given value are sent to the same
1857      job slot.
1858
1859 This is similar to sharding in databases.
1860
1861      The performance is on the order of 100K rows per second. Faster if
1862 the shardcol is small (<10), slower if it is big (>100).
1863
1864 --shard requires --pipe and a fixed numeric value for --jobs.
1865
1866 See also --bin, --group-by, --roundrobin.
1867
1868 --shebang
1869 --hashbang
1870 GNU parallel can be called as a shebang (#!) command as the first
1871      line of a script. The content of the file will be treated as the
1872      input source.
1873
1874 Like this:
1875
1876 #!/usr/bin/parallel --shebang -r wget
1877
1878 https://ftpmirror.gnu.org/parallel/parallel-20120822.tar.bz2
1879 https://ftpmirror.gnu.org/parallel/parallel-20130822.tar.bz2
1880 https://ftpmirror.gnu.org/parallel/parallel-20140822.tar.bz2
1881
1882 --shebang must be set as the first option.
1883
1884 On FreeBSD env is needed:
1885
1886 #!/usr/bin/env -S parallel --shebang -r wget
1887
1888 https://ftpmirror.gnu.org/parallel/parallel-20120822.tar.bz2
1889 https://ftpmirror.gnu.org/parallel/parallel-20130822.tar.bz2
1890 https://ftpmirror.gnu.org/parallel/parallel-20140822.tar.bz2
1891
1892 There are many limitations of shebang (#!) depending on your
1893 operating system. See details on
1894 http://www.in-ulm.de/~mascheck/various/shebang/
1895
1896 --shebang-wrap
1897 GNU parallel can parallelize scripts by wrapping the shebang line.
1898 If the program can be run like this:
1899
1900 cat arguments | parallel the_program
1901
1902 then the script can be changed to:
1903
1904 #!/usr/bin/parallel --shebang-wrap /original/parser --options
1905
1906 E.g.
1907
1908 #!/usr/bin/parallel --shebang-wrap /usr/bin/python
1909
1910 If the program can be run like this:
1911
1912 cat data | parallel --pipe the_program
1913
1914 then the script can be changed to:
1915
1916 #!/usr/bin/parallel --shebang-wrap --pipe /orig/parser --opts
1917
1918 E.g.
1919
1920 #!/usr/bin/parallel --shebang-wrap --pipe /usr/bin/perl -w
1921
1922 --shebang-wrap must be set as the first option.
1923
1924 --shellquote
1925 Does not run the command but quotes it. Useful for making quoted
1926 composed commands for GNU parallel.
1927
1928      Multiple --shellquote will quote the string multiple times, so
1929 parallel --shellquote | parallel --shellquote can be written as
1930 parallel --shellquote --shellquote.
1931
1932 --shuf
1933      Shuffle jobs. With multiple input sources it is hard to randomize
1934      the job order. --shuf will generate all jobs, and shuffle them
1935 before running them. This is useful to get a quick preview of the
1936 results before running the full batch.
1937
1938 --skip-first-line
1939 Do not use the first line of input (used by GNU parallel itself
1940 when called with --shebang).
1941
1942 --sql DBURL (obsolete)
1943 Use --sqlmaster instead.
1944
1945 --sqlmaster DBURL
1946 Submit jobs via SQL server. DBURL must point to a table, which will
1947 contain the same information as --joblog, the values from the input
1948 sources (stored in columns V1 .. Vn), and the output (stored in
1949 columns Stdout and Stderr).
1950
1951 If DBURL is prepended with '+' GNU parallel assumes the table is
1952 already made with the correct columns and appends the jobs to it.
1953
1954      If DBURL is not prepended with '+' the table will be dropped and
1955      created with the correct number of V-columns.
1956
1957 --sqlmaster does not run any jobs, but it creates the values for
1958 the jobs to be run. One or more --sqlworker must be run to actually
1959 execute the jobs.
1960
1961 If --wait is set, GNU parallel will wait for the jobs to complete.
1962
1963 The format of a DBURL is:
1964
1965 [sql:]vendor://[[user][:pwd]@][host][:port]/[db]/table
1966
1967 E.g.
1968
1969 sql:mysql://hr:hr@localhost:3306/hrdb/jobs
1970 mysql://scott:tiger@my.example.com/pardb/paralleljobs
1971 sql:oracle://scott:tiger@ora.example.com/xe/parjob
1972 postgresql://scott:tiger@pg.example.com/pgdb/parjob
1973 pg:///parjob
1974 sqlite3:///%2Ftmp%2Fpardb.sqlite/parjob
1975 csv:///%2Ftmp%2Fpardb/parjob
1976
1977      Notice how / in the path of sqlite and CSV must be encoded as %2F,
1978      except the last / in CSV which must be a literal /.
1979
1980 It can also be an alias from ~/.sql/aliases:
1981
1982 :myalias mysql:///mydb/paralleljobs
1983
1984 --sqlandworker DBURL
1985 Shorthand for: --sqlmaster DBURL --sqlworker DBURL.
1986
1987 --sqlworker DBURL
1988 Execute jobs via SQL server. Read the input sources variables from
1989 the table pointed to by DBURL. The command on the command line
1990 should be the same as given by --sqlmaster.
1991
1992      If you have more than one --sqlworker, jobs may be run more than
1993      once.
1994
1995 If --sqlworker runs on the local machine, the hostname in the SQL
1996 table will not be ':' but instead the hostname of the machine.
1997
1998 --ssh sshcommand
1999 GNU parallel defaults to using ssh for remote access. This can be
2000 overridden with --ssh. It can also be set on a per server basis
2001 (see --sshlogin).
2002
2003 --sshdelay mytime (alpha testing)
2004      Delay starting the next ssh by mytime. GNU parallel will not start
2005 another ssh for the next mytime.
2006
2007 For details on mytime see --delay.
2008
2009 -S
2010 [@hostgroups/][ncpus/]sshlogin[,[@hostgroups/][ncpus/]sshlogin[,...]]
2011 -S @hostgroup
2012 --sshlogin
2013 [@hostgroups/][ncpus/]sshlogin[,[@hostgroups/][ncpus/]sshlogin[,...]]
2014 --sshlogin @hostgroup
2015 Distribute jobs to remote computers. The jobs will be run on a list
2016 of remote computers.
2017
2018 If hostgroups is given, the sshlogin will be added to that
2019 hostgroup. Multiple hostgroups are separated by '+'. The sshlogin
2020 will always be added to a hostgroup named the same as sshlogin.
2021
2022 If only the @hostgroup is given, only the sshlogins in that
2023 hostgroup will be used. Multiple @hostgroup can be given.
2024
2025 GNU parallel will determine the number of CPUs on the remote
2026 computers and run the number of jobs as specified by -j. If the
2027      number ncpus is given GNU parallel will use this number as the
2028      number of CPUs on the host. Normally ncpus will not be needed.
2029
2030 An sshlogin is of the form:
2031
2032 [sshcommand [options]] [username@]hostname
2033
2034 The sshlogin must not require a password (ssh-agent, ssh-copy-id,
2035 and sshpass may help with that).
2036
2037      The sshlogin ':' is special: it means 'no ssh' and will therefore
2038 run on the local computer.
2039
2040      The sshlogin '..' is special: it reads sshlogins from
2041      ~/.parallel/sshloginfile or $XDG_CONFIG_HOME/parallel/sshloginfile
2042
2043      The sshlogin '-' is special, too: it reads sshlogins from stdin
2044      (standard input).
2045
2046 To specify more sshlogins separate the sshlogins by comma, newline
2047 (in the same string), or repeat the options multiple times.
2048
2049 For examples: see --sshloginfile.
2050
2051 The remote host must have GNU parallel installed.
2052
2053 --sshlogin is known to cause problems with -m and -X.
2054
2055 --sshlogin is often used with --transferfile, --return, --cleanup,
2056 and --trc.
2057
2058 --sshloginfile filename
2059 --slf filename
2060 File with sshlogins. The file consists of sshlogins on separate
2061 lines. Empty lines and lines starting with '#' are ignored.
2062 Example:
2063
2064 server.example.com
2065 username@server2.example.com
2066 8/my-8-cpu-server.example.com
2067 2/my_other_username@my-dualcore.example.net
2068 # This server has SSH running on port 2222
2069 ssh -p 2222 server.example.net
2070 4/ssh -p 2222 quadserver.example.net
2071 # Use a different ssh program
2072 myssh -p 2222 -l myusername hexacpu.example.net
2073 # Use a different ssh program with default number of CPUs
2074 //usr/local/bin/myssh -p 2222 -l myusername hexacpu
2075 # Use a different ssh program with 6 CPUs
2076 6//usr/local/bin/myssh -p 2222 -l myusername hexacpu
2077 # Assume 16 CPUs on the local computer
2078 16/:
2079 # Put server1 in hostgroup1
2080 @hostgroup1/server1
2081 # Put myusername@server2 in hostgroup1+hostgroup2
2082 @hostgroup1+hostgroup2/myusername@server2
2083 # Force 4 CPUs and put 'ssh -p 2222 server3' in hostgroup1
2084 @hostgroup1/4/ssh -p 2222 server3
2085
2086 When using a different ssh program the last argument must be the
2087 hostname.
2088
2089 Multiple --sshloginfile are allowed.
2090
2091      GNU parallel will first look for the file in the current dir; if
2092      that fails it looks for the file in ~/.parallel.
2093
2094      The sshloginfile '..' is special: it reads sshlogins from
2095      ~/.parallel/sshloginfile
2096
2097      The sshloginfile '.' is special: it reads sshlogins from
2098      /etc/parallel/sshloginfile
2099
2100      The sshloginfile '-' is special, too: it reads sshlogins from stdin
2101      (standard input).
2102
2103 If the sshloginfile is changed it will be re-read when a job
2104 finishes though at most once per second. This makes it possible to
2105 add and remove hosts while running.
2106
2107 This can be used to have a daemon that updates the sshloginfile to
2108 only contain servers that are up:
2109
2110        cp original.slf tmp2.slf
2111        while true ; do
2112          nice parallel --nonall -j0 -k --slf original.slf \
2113            --tag echo | perl -pe 's/\t$//' > tmp.slf
2114          if ! diff tmp.slf tmp2.slf >/dev/null; then
2115            mv tmp.slf tmp2.slf
2116          fi
2117          sleep 10
2118        done &
2119        parallel --slf tmp2.slf ...
2120
2121 --slotreplace replace-str
2122 Use the replacement string replace-str instead of {%} for job slot
2123 number.
2124
2125 --silent
2126 Silent. The job to be run will not be printed. This is the
2127 default. Can be reversed with -v.
2128
2129 --tty
2130 Open terminal tty. If GNU parallel is used for starting a program
2131 that accesses the tty (such as an interactive program) then this
2132 option may be needed. It will default to starting only one job at a
2133 time (i.e. -j1), not buffer the output (i.e. -u), and it will open
2134 a tty for the job.
2135
2136 You can of course override -j1 and -u.
2137
2138 Using --tty unfortunately means that GNU parallel cannot kill the
2139 jobs (with --timeout, --memfree, or --halt). This is due to GNU
2140 parallel giving each child its own process group, which is then
2141      killed. Process groups are dependent on the tty.
2142
2143 --tag
2144 Tag lines with arguments. Each output line will be prepended with
2145 the arguments and TAB (\t). When combined with --onall or --nonall
2146 the lines will be prepended with the sshlogin instead.
2147
2148 --tag is ignored when using -u.
2149
2150 --tagstring str
2151 Tag lines with a string. Each output line will be prepended with
2152 str and TAB (\t). str can contain replacement strings such as {}.
2153
2154 --tagstring is ignored when using -u, --onall, and --nonall.
2155
2156 --tee
2157 Pipe all data to all jobs. Used with --pipe/--pipepart and :::.
2158
2159 seq 1000 | parallel --pipe --tee -v wc {} ::: -w -l -c
2160
2161 How many numbers in 1..1000 contain 0..9, and how many bytes do
2162 they fill:
2163
2164 seq 1000 | parallel --pipe --tee --tag \
2165 'grep {1} | wc {2}' ::: {0..9} ::: -l -c
2166
2167 How many words contain a..z and how many bytes do they fill?
2168
2169 parallel -a /usr/share/dict/words --pipepart --tee --tag \
2170 'grep {1} | wc {2}' ::: {a..z} ::: -l -c
2171
2172 --termseq sequence
2173 Termination sequence. When a job is killed due to --timeout,
2174 --memfree, --halt, or abnormal termination of GNU parallel,
2175 sequence determines how the job is killed. The default is:
2176
2177 TERM,200,TERM,100,TERM,50,KILL,25
2178
2179 which sends a TERM signal, waits 200 ms, sends another TERM signal,
2180 waits 100 ms, sends another TERM signal, waits 50 ms, sends a KILL
2181 signal, waits 25 ms, and exits. GNU parallel detects if a process
2182 dies before the waiting time is up.
2183
2184 --tmpdir dirname
2185 Directory for temporary files. GNU parallel normally buffers output
2186 into temporary files in /tmp. By setting --tmpdir you can use a
2187 different dir for the files. Setting --tmpdir is equivalent to
2188 setting $TMPDIR.
2189
2190 --tmux (Long beta testing)
2191 Use tmux for output. Start a tmux session and run each job in a
2192 window in that session. No other output will be produced.
2193
2194 --tmuxpane (Long beta testing)
2195 Use tmux for output but put output into panes in the first window.
2196 Useful if you want to monitor the progress of less than 100
2197 concurrent jobs.
2198
2199 --timeout duration
2200 Time out for command. If the command runs for longer than duration
2201 seconds it will get killed as per --termseq.
2202
2203 If duration is followed by a % then the timeout will dynamically be
2204      computed as a percentage of the median runtime of successful
2205      jobs. Only values > 100% make sense.
2206
2207 duration is normally in seconds, but can be floats postfixed with
2208 s, m, h, or d which would multiply the float by 1, 60, 3600, or
2209 86400. Thus these are equivalent: --timeout 100000 and --timeout
2210 1d3.5h16.6m4s.
2211
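      The duration arithmetic behind that equivalence can be checked
      directly:

```shell
# 1d3.5h16.6m4s = 1*86400 + 3.5*3600 + 16.6*60 + 4 seconds:
awk 'BEGIN { print 1*86400 + 3.5*3600 + 16.6*60 + 4 }'
# prints 100000
```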
2212 --verbose
2213 -t Print the job to be run on stderr (standard error).
2214
2215 See also -v, -p.
2216
2217 --transfer
2218 Transfer files to remote computers. Shorthand for: --transferfile
2219 {}.
2220
2221 --transferfile filename
2222 --tf filename
2223 --transferfile is used with --sshlogin to transfer files to the
2224 remote computers. The files will be transferred using rsync and
2225 will be put relative to the default work dir. If the path contains
2226 /./ the remaining path will be relative to the work dir. E.g.
2227
2228 echo foo/bar.txt | parallel --transferfile {} \
2229 --sshlogin server.example.com wc
2230
2231 This will transfer the file foo/bar.txt to the computer
2232 server.example.com to the file $HOME/foo/bar.txt before running wc
2233 foo/bar.txt on server.example.com.
2234
2235 echo /tmp/foo/bar.txt | parallel --transferfile {} \
2236 --sshlogin server.example.com wc
2237
2238 This will transfer the file /tmp/foo/bar.txt to the computer
2239 server.example.com to the file /tmp/foo/bar.txt before running wc
2240 /tmp/foo/bar.txt on server.example.com.
2241
2242 echo /tmp/./foo/bar.txt | parallel --transferfile {} \
2243 --sshlogin server.example.com wc {= s:.*/./:./: =}
2244
2245 This will transfer the file /tmp/foo/bar.txt to the computer
2246 server.example.com to the file foo/bar.txt before running wc
2247 ./foo/bar.txt on server.example.com.
2248
2249 --transferfile is often used with --return and --cleanup. A
2250 shorthand for --transferfile {} is --transfer.
2251
2252 --transferfile is ignored when used with --sshlogin : or when not
2253 used with --sshlogin.
2254
2255 --trc filename
2256 Transfer, Return, Cleanup. Shorthand for:
2257
2258 --transferfile {} --return filename --cleanup
2259
2260 --trim <n|l|r|lr|rl>
2261 Trim white space in input.
2262
2263 n No trim. Input is not modified. This is the default.
2264
2265 l Left trim. Remove white space from start of input. E.g. " a bc
2266 " -> "a bc ".
2267
2268 r Right trim. Remove white space from end of input. E.g. " a bc "
2269 -> " a bc".
2270
2271 lr
2272 rl Both trim. Remove white space from both start and end of input.
2273 E.g. " a bc " -> "a bc". This is the default if --colsep is
2274 used.
2275
2276 --ungroup
2277 -u Ungroup output. Output is printed as soon as possible and bypasses
2278      GNU parallel's internal processing. This may cause output from
2279      different commands to be mixed and should therefore only be used
2280      if you do not care about the output. Compare these:
2281
2282 seq 4 | parallel -j0 \
2283 'sleep {};echo -n start{};sleep {};echo {}end'
2284 seq 4 | parallel -u -j0 \
2285 'sleep {};echo -n start{};sleep {};echo {}end'
2286
2287 It also disables --tag. GNU parallel outputs faster with -u.
2288 Compare the speeds of these:
2289
2290 parallel seq ::: 300000000 >/dev/null
2291 parallel -u seq ::: 300000000 >/dev/null
2292 parallel --line-buffer seq ::: 300000000 >/dev/null
2293
2294 Can be reversed with --group.
2295
2296 See also: --line-buffer --group
2297
2298 --extensionreplace replace-str
2299 --er replace-str
2300 Use the replacement string replace-str instead of {.} for input
2301 line without extension.
2302
2303 --use-sockets-instead-of-threads
2304 --use-cores-instead-of-threads
2305 --use-cpus-instead-of-cores (obsolete)
2306 Determine how GNU parallel counts the number of CPUs. GNU parallel
2307 uses this number when the number of jobslots is computed relative
2308 to the number of CPUs (e.g. 100% or +1).
2309
2310 CPUs can be counted in three different ways:
2311
2312 sockets The number of filled CPU sockets (i.e. the number of
2313 physical chips).
2314
2315 cores The number of physical cores (i.e. the number of physical
2316 compute cores).
2317
2318 threads The number of hyperthreaded cores (i.e. the number of
2319 virtual cores - with some of them possibly being
2320                 hyperthreaded).
2321
2322 Normally the number of CPUs is computed as the number of CPU
2323 threads. With --use-sockets-instead-of-threads or
2324 --use-cores-instead-of-threads you can force it to be computed as
2325 the number of filled sockets or number of cores instead.
2326
2327 Most users will not need these options.
2328
2329 --use-cpus-instead-of-cores is a (misleading) alias for
2330 --use-sockets-instead-of-threads and is kept for backwards
2331 compatibility.
2332
2333 -v Verbose. Print the job to be run on stdout (standard output). Can
2334 be reversed with --silent. See also -t.
2335
2336 Use -v -v to print the wrapping ssh command when running remotely.
2337
2338 --version
2339   -V Print the version of GNU parallel and exit.
2340
2341 --workdir mydir
2342 --wd mydir
2343 Jobs will be run in the dir mydir.
2344
2345 Files transferred using --transferfile and --return will be
2346 relative to mydir on remote computers.
2347
2348 The special mydir value ... will create working dirs under
2349 ~/.parallel/tmp/. If --cleanup is given these dirs will be removed.
2350
2351 The special mydir value . uses the current working dir. If the
2352 current working dir is beneath your home dir, the value . is
2353 treated as the relative path to your home dir. This means that if
2354 your home dir is different on remote computers (e.g. if your login
2355 is different) the relative path will still be relative to your home
2356 dir.
2357
2358 To see the difference try:
2359
2360 parallel -S server pwd ::: ""
2361 parallel --wd . -S server pwd ::: ""
2362 parallel --wd ... -S server pwd ::: ""
2363
2364 mydir can contain GNU parallel's replacement strings.
2365
2366 --wait
2367 Wait for all commands to complete.
2368
2369 Used with --semaphore or --sqlmaster.
2370
2371 See also man sem.
2372
2373 -X Multiple arguments with context replace. Insert as many arguments
2374 as the command line length permits. If multiple jobs are being run
2375 in parallel: distribute the arguments evenly among the jobs. Use
2376 -j1 to avoid this.
2377
2378 If {} is not used the arguments will be appended to the line. If
2379 {} is used as part of a word (like pic{}.jpg) then the whole word
2380 will be repeated. If {} is used multiple times each {} will be
2381 replaced with the arguments.
2382
2383 Normally -X will do the right thing, whereas -m can give unexpected
2384 results if {} is used as part of a word.
2385
2386 Support for -X with --sshlogin is limited and may fail.
2387
2388 See also -m.
2389
2390 --exit
2391 -x Exit if the size (see the -s option) is exceeded.
2392
2393 --xargs
2394 Multiple arguments. Insert as many arguments as the command line
2395 length permits.
2396
2397 If {} is not used the arguments will be appended to the line. If
2398 {} is used multiple times each {} will be replaced with all the
2399 arguments.
2400
2401 Support for --xargs with --sshlogin is limited and may fail.
2402
2403 See also -X for context replace. If in doubt use -X as that will
2404 most likely do what is needed.
2405
2407   GNU parallel can work similarly to xargs -n1.
2408
2409 To compress all html files using gzip run:
2410
2411 find . -name '*.html' | parallel gzip --best
2412
2413   If the file names may contain a newline, use -0. Substitute FOO BAR with
2414 FUBAR in all files in this dir and subdirs:
2415
2416 find . -type f -print0 | \
2417 parallel -q0 perl -i -pe 's/FOO BAR/FUBAR/g'
2418
2419 Note -q is needed because of the space in 'FOO BAR'.
2420
2422 prips can generate IP-addresses from CIDR notation. With GNU parallel
2423 you can build a simple network scanner to see which addresses respond
2424 to ping:
2425
2426 prips 130.229.16.0/20 | \
2427 parallel --timeout 2 -j0 \
2428 'ping -c 1 {} >/dev/null && echo {}' 2>/dev/null
2429
2431 GNU parallel can take the arguments from command line instead of stdin
2432 (standard input). To compress all html files in the current dir using
2433 gzip run:
2434
2435 parallel gzip --best ::: *.html
2436
2437 To convert *.wav to *.mp3 using LAME running one process per CPU run:
2438
2439 parallel lame {} -o {.}.mp3 ::: *.wav
2440
2442 When moving a lot of files like this: mv *.log destdir you will
2443 sometimes get the error:
2444
2445 bash: /bin/mv: Argument list too long
2446
2447 because there are too many files. You can instead do:
2448
2449 ls | grep -E '\.log$' | parallel mv {} destdir
2450
2451   This will run mv for each file. It can be done faster if mv gets as
2452   many arguments as will fit on the line:
2453
2454 ls | grep -E '\.log$' | parallel -m mv {} destdir
2455
2456 In many shells you can also use printf:
2457
2458 printf '%s\0' *.log | parallel -0 -m mv {} destdir
2459
2461 To remove the files pict0000.jpg .. pict9999.jpg you could do:
2462
2463 seq -w 0 9999 | parallel rm pict{}.jpg
2464
2465 You could also do:
2466
2467 seq -w 0 9999 | perl -pe 's/(.*)/pict$1.jpg/' | parallel -m rm
2468
2469   The first will run rm 10000 times, while the last will only run rm as
2470   many times as needed to keep the command line length short enough to avoid
2471 Argument list too long (it typically runs 1-2 times).
2472
2473 You could also run:
2474
2475 seq -w 0 9999 | parallel -X rm pict{}.jpg
2476
2477   This will also only run rm as many times as needed to keep the command
2478 line length short enough.
2479
2481 If ImageMagick is installed this will generate a thumbnail of a jpg
2482 file:
2483
2484 convert -geometry 120 foo.jpg thumb_foo.jpg
2485
2486 This will run with number-of-cpus jobs in parallel for all jpg files in
2487 a directory:
2488
2489 ls *.jpg | parallel convert -geometry 120 {} thumb_{}
2490
2491 To do it recursively use find:
2492
2493 find . -name '*.jpg' | \
2494 parallel convert -geometry 120 {} {}_thumb.jpg
2495
2496   Notice how the argument has to start with {} as {} will include the path
2497 (e.g. running convert -geometry 120 ./foo/bar.jpg thumb_./foo/bar.jpg
2498 would clearly be wrong). The command will generate files like
2499 ./foo/bar.jpg_thumb.jpg.
2500
2501 Use {.} to avoid the extra .jpg in the file name. This command will
2502 make files like ./foo/bar_thumb.jpg:
2503
2504 find . -name '*.jpg' | \
2505 parallel convert -geometry 120 {} {.}_thumb.jpg
2506
2508 This will generate an uncompressed version of .gz-files next to the
2509 .gz-file:
2510
2511 parallel zcat {} ">"{.} ::: *.gz
2512
2513 Quoting of > is necessary to postpone the redirection. Another solution
2514 is to quote the whole command:
2515
2516 parallel "zcat {} >{.}" ::: *.gz
2517
2518 Other special shell characters (such as * ; $ > < | >> <<) also need to
2519 be put in quotes, as they may otherwise be interpreted by the shell and
2520 not given to GNU parallel.
2521
2523 A job can consist of several commands. This will print the number of
2524 files in each directory:
2525
2526 ls | parallel 'echo -n {}" "; ls {}|wc -l'
2527
2528 To put the output in a file called <name>.dir:
2529
2530 ls | parallel '(echo -n {}" "; ls {}|wc -l) >{}.dir'
2531
2532 Even small shell scripts can be run by GNU parallel:
2533
2534 find . | parallel 'a={}; name=${a##*/};' \
2535 'upper=$(echo "$name" | tr "[:lower:]" "[:upper:]");'\
2536 'echo "$name - $upper"'
2537
2538 ls | parallel 'mv {} "$(echo {} | tr "[:upper:]" "[:lower:]")"'
2539
2540 Given a list of URLs, list all URLs that fail to download. Print the
2541 line number and the URL.
2542
2543 cat urlfile | parallel "wget {} 2>/dev/null || grep -n {} urlfile"
2544
2545 Create a mirror directory with the same filenames except all files and
2546 symlinks are empty files.
2547
2548 cp -rs /the/source/dir mirror_dir
2549 find mirror_dir -type l | parallel -m rm {} '&&' touch {}
2550
2551 Find the files in a list that do not exist
2552
2553 cat file_list | parallel 'if [ ! -e {} ] ; then echo {}; fi'
2554
2556   You have a bunch of files. You want them sorted into dirs. The dir of
2557   each file should be named after the first letter of the file name.
2558
2559 parallel 'mkdir -p {=s/(.).*/$1/=}; mv {} {=s/(.).*/$1/=}' ::: *
2560
2562 You have a dir with files named as 24 hours in 5 minute intervals:
2563 00:00, 00:05, 00:10 .. 23:55. You want to find the files missing:
2564
2565 parallel [ -f {1}:{2} ] "||" echo {1}:{2} does not exist \
2566 ::: {00..23} ::: {00..55..5}
2567
2569 If the composed command is longer than a line, it becomes hard to read.
2570 In Bash you can use functions. Just remember to export -f the function.
2571
2572 doit() {
2573 echo Doing it for $1
2574 sleep 2
2575 echo Done with $1
2576 }
2577 export -f doit
2578 parallel doit ::: 1 2 3
2579
2580 doubleit() {
2581 echo Doing it for $1 $2
2582 sleep 2
2583 echo Done with $1 $2
2584 }
2585 export -f doubleit
2586 parallel doubleit ::: 1 2 3 ::: a b
2587
2588 To do this on remote servers you need to transfer the function using
2589 --env:
2590
2591 parallel --env doit -S server doit ::: 1 2 3
2592 parallel --env doubleit -S server doubleit ::: 1 2 3 ::: a b
2593
2594 If your environment (aliases, variables, and functions) is small you
2595 can copy the full environment without having to export -f anything. See
2596 env_parallel.
2597
2599 To test a program with different parameters:
2600
2601 tester() {
2602 if (eval "$@") >&/dev/null; then
2603 perl -e 'printf "\033[30;102m[ OK ]\033[0m @ARGV\n"' "$@"
2604 else
2605 perl -e 'printf "\033[30;101m[FAIL]\033[0m @ARGV\n"' "$@"
2606 fi
2607 }
2608 export -f tester
2609 parallel tester my_program ::: arg1 arg2
2610 parallel tester exit ::: 1 0 2 0
2611
2612 If my_program fails a red FAIL will be printed followed by the failing
2613 command; otherwise a green OK will be printed followed by the command.
2614
2616 It can be useful to monitor the output of running jobs.
2617
2618   This shows the most recent output line until a job finishes, after
2619   which the output of the job is printed in full:
2620
2621 parallel '{} | tee >(cat >&3)' ::: 'command 1' 'command 2' \
2622 3> >(perl -ne '$|=1;chomp;printf"%.'$COLUMNS's\r",$_." "x100')
2623
2625 Log rotation renames a logfile to an extension with a higher number:
2626 log.1 becomes log.2, log.2 becomes log.3, and so on. The oldest log is
2627 removed. To avoid overwriting files the process starts backwards from
2628 the high number to the low number. This will keep 10 old versions of
2629 the log:
2630
2631 seq 9 -1 1 | parallel -j1 mv log.{} log.'{= $_++ =}'
2632 mv log log.1
2633
2635   When processing files, removing the file extension using {.} is often
2636 useful.
2637
2638 Create a directory for each zip-file and unzip it in that dir:
2639
2640 parallel 'mkdir {.}; cd {.}; unzip ../{}' ::: *.zip
2641
2642 Recompress all .gz files in current directory using bzip2 running 1 job
2643 per CPU in parallel:
2644
2645 parallel "zcat {} | bzip2 >{.}.bz2 && rm {}" ::: *.gz
2646
2647 Convert all WAV files to MP3 using LAME:
2648
2649 find sounddir -type f -name '*.wav' | parallel lame {} -o {.}.mp3
2650
2651   Put all converted files in the same directory:
2652
2653 find sounddir -type f -name '*.wav' | \
2654 parallel lame {} -o mydir/{/.}.mp3
2655
2657   If you have a directory with tar.gz files and want these extracted in
2658   the corresponding dir (e.g. foo.tar.gz will be extracted in the dir foo) you
2659 can do:
2660
2661 parallel --plus 'mkdir {..}; tar -C {..} -xf {}' ::: *.tar.gz
2662
2663 If you want to remove a different ending, you can use {%string}:
2664
2665 parallel --plus echo {%_demo} ::: mycode_demo keep_demo_here
2666
2667   You can also remove a starting string with {#string}:
2668
2669 parallel --plus echo {#demo_} ::: demo_mycode keep_demo_here
2670
2671 To remove a string anywhere you can use regular expressions with
2672 {/regexp/replacement} and leave the replacement empty:
2673
2674 parallel --plus echo {/demo_/} ::: demo_mycode remove_demo_here
2675
EXAMPLE: Download 24 images for each of the past 30 days
       Let us assume a website stores images like:

         http://www.example.com/path/to/YYYYMMDD_##.jpg

       where YYYYMMDD is the date and ## is the number 01-24. This will
       download images for the past 30 days:

         getit() {
           date=$(date -d "today -$1 days" +%Y%m%d)
           num=$2
           echo wget http://www.example.com/path/to/${date}_${num}.jpg
         }
         export -f getit

         parallel getit ::: $(seq 30) ::: $(seq -w 24)

       $(date -d "today -$1 days" +%Y%m%d) will give the dates in YYYYMMDD
       format with $1 days subtracted.

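The date arithmetic can be checked on its own (this assumes GNU date, as the example above already does; BSD date uses a different syntax):

```shell
# Print yesterday's date as YYYYMMDD, exactly as the getit function does
date -d "today -1 days" +%Y%m%d
```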
EXAMPLE: Download world map from NASA
       NASA provides tiles to download on earthdata.nasa.gov. Download
       tiles for the Blue Marble world map and create a 10240x20480 map.

         base=https://map1a.vis.earthdata.nasa.gov/wmts-geo/wmts.cgi
         service="SERVICE=WMTS&REQUEST=GetTile&VERSION=1.0.0"
         layer="LAYER=BlueMarble_ShadedRelief_Bathymetry"
         set="STYLE=&TILEMATRIXSET=EPSG4326_500m&TILEMATRIX=5"
         tile="TILEROW={1}&TILECOL={2}"
         format="FORMAT=image%2Fjpeg"
         url="$base?$service&$layer&$set&$tile&$format"

         parallel -j0 -q wget "$url" -O {1}_{2}.jpg ::: {0..19} ::: {0..39}
         parallel eval convert +append {}_{0..39}.jpg line{}.jpg ::: {0..19}
         convert -append line{0..19}.jpg world.jpg

EXAMPLE: Download Apollo-11 images from NASA using jq
       Search NASA using their API to get JSON for images related to
       'apollo 11' that have 'moon landing' in the description.

       The search query returns JSON containing URLs to JSON containing
       collections of pictures. One of the pictures in each of these
       collections is large.

       wget is used to get the JSON for the search query. jq is then used
       to extract the URLs of the collections. parallel then calls wget to
       get each collection, which is passed to jq to extract the URLs of
       all images. grep selects the large images, and parallel finally
       uses wget to fetch the images.

         base="https://images-api.nasa.gov/search"
         q="q=apollo 11"
         description="description=moon landing"
         media_type="media_type=image"
         wget -O - "$base?$q&$description&$media_type" |
           jq -r .collection.items[].href |
           parallel wget -O - |
           jq -r .[] |
           grep large |
           parallel wget

EXAMPLE: Download video playlist in parallel
       youtube-dl is an excellent tool to download videos. It cannot,
       however, download videos in parallel. This takes a playlist and
       downloads 10 videos in parallel:

         url='youtu.be/watch?v=0wOf2Fgi3DE&list=UU_cznB5YZZmvAmeq7Y3EriQ'
         export url
         youtube-dl --flat-playlist "https://$url" |
           parallel --tagstring {#} --lb -j10 \
             youtube-dl --playlist-start {#} --playlist-end {#} '"https://$url"'

EXAMPLE: Prepend last modified date (ISO8601) to file name
         parallel mv {} '{= $a=pQ($_); $b=$_;' \
           '$_=qx{date -r "$a" +%FT%T}; chomp; $_="$_ $b" =}' ::: *

       {= and =} mark a perl expression. pQ perl-quotes the string. date
       +%FT%T is the date in ISO8601 with time.

EXAMPLE: Save output in ISO8601 dirs
       Save output from ps aux every second into dirs named
       yyyy-mm-ddThh:mm:ss+zz:zz.

         seq 1000 | parallel -N0 -j1 --delay 1 \
           --results '{= $_=`date -Isec`; chomp=}/' ps aux

EXAMPLE: Digital clock with "blinking" :
       The : in a digital clock blinks. To make every other line have a
       ':' and the rest a ' ', a perl expression is used to look at the
       3rd input source. If the value modulo 2 is 1, use ":"; otherwise
       use " ":

         parallel -k echo {1}'{=3 $_=$_%2?":":" "=}'{2}{3} \
           ::: {0..12} ::: {0..5} ::: {0..9}

EXAMPLE: Aggregating content of files
       This:

         parallel --header : echo x{X}y{Y}z{Z} \> x{X}y{Y}z{Z} \
           ::: X {1..5} ::: Y {01..10} ::: Z {1..5}

       will generate the files x1y01z1 .. x5y10z5. If you want to
       aggregate the output grouping on x and z you can do this:

         parallel eval 'cat {=s/y01/y*/=} > {=s/y01//=}' ::: *y01*

       For all values of x and z it runs commands like:

         cat x1y*z1 > x1z1

       So you end up with x1z1 .. x5z5, each containing the content of all
       values of y.

EXAMPLE: Breadth first parallel web crawler mirrorer
       The script below will crawl and mirror a URL in parallel. It first
       downloads pages that are 1 click down, then 2 clicks down, then 3;
       instead of the normal depth-first order, where the first link on
       each page is fetched first.

       Run like this:

         PARALLEL=-j100 ./parallel-crawl http://gatt.org.yeslab.org/

       Remove the wget part if you only want a web crawler.

       It works by fetching a page from a list of URLs and looking for
       links in that page that are within the same starting URL and that
       have not already been seen. These links are added to a new queue.
       When all the pages from the list are done, the new queue is moved
       to the list of URLs and the process starts over until no unseen
       links are found.

         #!/bin/bash

         # E.g. http://gatt.org.yeslab.org/
         URL=$1
         # Stay inside the start dir
         BASEURL=$(echo $URL | perl -pe 's:#.*::; s:(//.*/)[^/]*:$1:')
         URLLIST=$(mktemp urllist.XXXX)
         URLLIST2=$(mktemp urllist.XXXX)
         SEEN=$(mktemp seen.XXXX)

         # Spider to get the URLs
         echo $URL >$URLLIST
         cp $URLLIST $SEEN

         while [ -s $URLLIST ] ; do
           cat $URLLIST |
             parallel lynx -listonly -image_links -dump {} \; \
               wget -qm -l1 -Q1 {} \; echo Spidered: {} \>\&2 |
             perl -ne 's/#.*//; s/\s+\d+.\s(\S+)$/$1/ and
               do { $seen{$1}++ or print }' |
             grep -F $BASEURL |
             grep -v -x -F -f $SEEN | tee -a $SEEN > $URLLIST2
           mv $URLLIST2 $URLLIST
         done

         rm -f $URLLIST $URLLIST2 $SEEN

EXAMPLE: Process files from a tar file while unpacking
       If the files to be processed are in a tar file then unpacking one
       file and processing it immediately may be faster than first
       unpacking all files.

         tar xvf foo.tgz | perl -ne 'print $l;$l=$_;END{print $l}' | \
           parallel echo

       The Perl one-liner is needed to make sure the file is complete
       before handing it to GNU parallel.

EXAMPLE: Rewriting a for-loop and a while-read-loop
       for-loops like this:

         (for x in `cat list` ; do
           do_something $x
         done) | process_output

       and while-read-loops like this:

         cat list | (while read x ; do
           do_something $x
         done) | process_output

       can be written like this:

         cat list | parallel do_something | process_output

       For example: Find which host name in a list has IP address
       1.2.3.4:

         cat hosts.txt | parallel -P 100 host | grep 1.2.3.4

       If the processing requires more steps, a for-loop like this:

         (for x in `cat list` ; do
           no_extension=${x%.*};
           do_step1 $x scale $no_extension.jpg
           do_step2 <$x $no_extension
         done) | process_output

       and a while-loop like this:

         cat list | (while read x ; do
           no_extension=${x%.*};
           do_step1 $x scale $no_extension.jpg
           do_step2 <$x $no_extension
         done) | process_output

       can be written like this:

         cat list | parallel "do_step1 {} scale {.}.jpg ; do_step2 <{} {.}" |\
           process_output

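GNU parallel's {.} corresponds to the shell's ${x%.*} expansion used in the loops above; both strip the last extension:

```shell
# ${x%.*} removes the shortest suffix matching ".*",
# i.e. only the last extension
x=picture.scan.jpg
echo "${x%.*}"    # -> picture.scan
```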
       If the body of the loop is bigger, it improves readability to use
       a function:

         (for x in `cat list` ; do
           do_something $x
           [... 100 lines that do something with $x ...]
         done) | process_output

         cat list | (while read x ; do
           do_something $x
           [... 100 lines that do something with $x ...]
         done) | process_output

       can both be rewritten as:

         doit() {
           x=$1
           do_something $x
           [... 100 lines that do something with $x ...]
         }
         export -f doit
         cat list | parallel doit

EXAMPLE: Rewriting nested for-loops
       Nested for-loops like this:

         (for x in `cat xlist` ; do
           for y in `cat ylist` ; do
             do_something $x $y
           done
         done) | process_output

       can be written like this:

         parallel do_something {1} {2} :::: xlist ylist | process_output

       Nested for-loops like this:

         (for colour in red green blue ; do
           for size in S M L XL XXL ; do
             echo $colour $size
           done
         done) | sort

       can be written like this:

         parallel echo {1} {2} ::: red green blue ::: S M L XL XXL | sort

EXAMPLE: Finding the lowest difference between files
       diff is good for finding differences in text files. diff | wc -l
       gives an indication of the size of the difference. To find the
       differences between all files in the current dir do:

         parallel --tag 'diff {1} {2} | wc -l' ::: * ::: * | sort -nk3

       This way it is possible to see if some files are closer to other
       files.

EXAMPLE: for-loops with column names
       When doing multiple nested for-loops it can be easier to keep track
       of the loop variable if it is named instead of just having a
       number. Use --header : to let the first argument be a named alias
       for the positional replacement string:

         parallel --header : echo {colour} {size} \
           ::: colour red green blue ::: size S M L XL XXL

       This also works if the input is a file with columns:

         cat addressbook.tsv | \
           parallel --colsep '\t' --header : echo {Name} {E-mail address}

EXAMPLE: All combinations in a list
       GNU parallel makes all combinations when given two lists.

       To make all combinations in a single list with unique values, you
       repeat the list and use the replacement string {choose_k}:

         parallel --plus echo {choose_k} ::: A B C D ::: A B C D

         parallel --plus echo 2{2choose_k} 1{1choose_k} ::: A B C D ::: A B C D

       {choose_k} works for any number of input sources:

         parallel --plus echo {choose_k} ::: A B C D ::: A B C D ::: A B C D

EXAMPLE: From a to b and b to c
       Assume you have input like:

         aardvark
         babble
         cab
         dab
         each

       and want to run combinations like:

         aardvark babble
         babble cab
         cab dab
         dab each

       If the input is in the file in.txt:

         parallel echo {1} - {2} ::::+ <(head -n -1 in.txt) <(tail -n +2 in.txt)

       If the input is in the array $a, here are two solutions:

         seq $((${#a[@]}-1)) | \
           env_parallel --env a echo '${a[{=$_--=}]} - ${a[{}]}'
         parallel echo {1} - {2} ::: "${a[@]::${#a[@]}-1}" :::+ "${a[@]:1}"

EXAMPLE: Count the differences between all files in a dir
       Using --results, the results are saved in /tmp/diffcount*.

         parallel --results /tmp/diffcount "diff -U 0 {1} {2} | \
           tail -n +3 |grep -v '^@'|wc -l" ::: * ::: *

       To see the difference between file A and file B look at the file
       '/tmp/diffcount/1/A/2/B'.

EXAMPLE: Speeding up fast jobs
       Starting a job on the local machine takes around 10 ms. This can be
       a big overhead if the job takes very few ms to run. Often you can
       group small jobs together using -X, which will make the overhead
       less significant. Compare the speed of these:

         seq -w 0 9999 | parallel touch pict{}.jpg
         seq -w 0 9999 | parallel -X touch pict{}.jpg

       If your program cannot take multiple arguments, then you can use
       GNU parallel to spawn multiple GNU parallels:

         seq -w 0 9999999 | \
           parallel -j10 -q -I,, --pipe parallel -j0 touch pict{}.jpg

       If -j0 normally spawns 252 jobs, then the above will try to spawn
       2520 jobs. On a normal GNU/Linux system you can spawn 32000 jobs
       using this technique with no problems. To raise the 32000 jobs
       limit raise /proc/sys/kernel/pid_max to 4194303.

       If you do not need GNU parallel to have control over each job (so
       no need for --retries or --joblog or similar), then it can be even
       faster if you can generate the command lines and pipe those to a
       shell. So if you can do this:

         mygenerator | sh

       then that can be parallelized like this:

         mygenerator | parallel --pipe --block 10M sh

       E.g.

         mygenerator() {
           seq 10000000 | perl -pe 'print "echo This is fast job number "';
         }
         mygenerator | parallel --pipe --block 10M sh

       The overhead is 100000 times smaller, namely around 100 nanoseconds
       per job.

EXAMPLE: Using shell variables
       When using shell variables you need to quote them correctly, as
       they may otherwise be interpreted by the shell.

       Notice the difference between:

         ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)
         parallel echo ::: ${ARR[@]} # This is probably not what you want

       and:

         ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)
         parallel echo ::: "${ARR[@]}"

       When using variables that contain special characters (e.g. space)
       in the actual command, you can quote them using '"$VAR"' or using
       "'s and -q:

         VAR="My brother's 12\" records are worth <\$\$\$>"
         parallel -q echo "$VAR" ::: '!'
         export VAR
         parallel echo '"$VAR"' ::: '!'

       If $VAR does not contain ' then "'$VAR'" will also work (and does
       not need export):

         VAR="My 12\" records are worth <\$\$\$>"
         parallel echo "'$VAR'" ::: '!'

       If you use them in a function you just quote as you normally would
       do:

         VAR="My brother's 12\" records are worth <\$\$\$>"
         export VAR
         myfunc() { echo "$VAR" "$1"; }
         export -f myfunc
         parallel myfunc ::: '!'

EXAMPLE: Group output lines
       When running jobs that output data, you often do not want the
       output of multiple jobs to run together. GNU parallel defaults to
       grouping the output of each job, so the output is printed when the
       job finishes. If you want full lines to be printed while the job is
       running you can use --line-buffer. If you want output to be printed
       as soon as possible you can use -u.

       Compare the output of:

         parallel wget --limit-rate=100k \
           https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
           ::: {12..16}
         parallel --line-buffer wget --limit-rate=100k \
           https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
           ::: {12..16}
         parallel -u wget --limit-rate=100k \
           https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
           ::: {12..16}

EXAMPLE: Tag output lines
       GNU parallel groups the output lines, but it can be hard to see
       where the different jobs begin. --tag prepends the argument to make
       that more visible:

         parallel --tag wget --limit-rate=100k \
           https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
           ::: {12..16}

       --tag works with --line-buffer but not with -u:

         parallel --tag --line-buffer wget --limit-rate=100k \
           https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
           ::: {12..16}

       Check the uptime of the servers in ~/.parallel/sshloginfile:

         parallel --tag -S .. --nonall uptime

EXAMPLE: Colorize output
       Give each job a new color. Most terminals support ANSI colors with
       the escape code "\033[30;3Xm" where 0 <= X <= 7:

         seq 10 | \
           parallel --tagstring '\033[30;3{=$_=++$::color%8=}m' seq {}
         parallel --rpl '{color} $_="\033[30;3".(++$::color%8)."m"' \
           --tagstring {color} seq {} ::: {1..10}

       To get rid of the initial \t (which comes from --tagstring):

         ... | perl -pe 's/\t//'

EXAMPLE: Keep order of output same as order of input
       Normally the output of a job will be printed as soon as it
       completes. Sometimes you want the order of the output to remain the
       same as the order of the input. This is often important if the
       output is used as input for another system. -k will make sure the
       order of output will be the same as the order of input, even if
       later jobs end before earlier jobs.

       Append a string to every line in a text file:

         cat textfile | parallel -k echo {} append_string

       If you remove -k some of the lines may come out in the wrong order.

       Another example is traceroute:

         parallel traceroute ::: qubes-os.org debian.org freenetproject.org

       will give the traceroute of qubes-os.org, debian.org, and
       freenetproject.org, but the output will be sorted according to
       which job completed first.

       To keep the order the same as the input, run:

         parallel -k traceroute ::: qubes-os.org debian.org freenetproject.org

       This will make sure the traceroute to qubes-os.org will be printed
       first.

       A bit more complex example is downloading a huge file in chunks in
       parallel: Some internet connections will deliver more data if you
       download files in parallel. For downloading files in parallel see:
       "EXAMPLE: Download 24 images for each of the past 30 days". But if
       you are downloading a big file you can download the file in chunks
       in parallel.

       To download bytes 10000000-19999999 you can use curl:

         curl -r 10000000-19999999 http://example.com/the/big/file >file.part

       To download a 1 GB file we need 100 10MB chunks downloaded and
       combined in the correct order.

         seq 0 99 | parallel -k curl -r \
           {}0000000-{}9999999 http://example.com/the/big/file > file

EXAMPLE: Parallel grep
       grep -r greps recursively through directories. On multicore CPUs
       GNU parallel can often speed this up.

         find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}

       This will run 1.5 jobs per CPU and give 1000 arguments to grep.

EXAMPLE: Grepping n lines for m regular expressions
       The simplest solution to grep a big file for a lot of regexps is:

         grep -f regexps.txt bigfile

       Or if the regexps are fixed strings:

         grep -F -f regexps.txt bigfile

       There are 3 limiting factors: CPU, RAM, and disk I/O.

       RAM is easy to measure: If the grep process takes up most of your
       free memory (e.g. when running top), then RAM is a limiting factor.

       CPU is also easy to measure: If the grep takes >90% CPU in top,
       then the CPU is a limiting factor, and parallelization will speed
       this up.

       It is harder to see if disk I/O is the limiting factor, and
       depending on the disk system it may be faster or slower to
       parallelize. The only way to know for certain is to test and
       measure.

   Limiting factor: RAM
       The normal grep -f regexps.txt bigfile works no matter the size of
       bigfile, but if regexps.txt is so big it cannot fit into memory,
       then you need to split this.

       grep -F takes around 100 bytes of RAM and grep takes about 500
       bytes of RAM per 1 byte of regexp. So if regexps.txt is 1% of your
       RAM, then it may be too big.

       If you can convert your regexps into fixed strings do that. E.g. if
       the lines you are looking for in bigfile all look like:

         ID1 foo bar baz Identifier1 quux
         fubar ID2 foo bar baz Identifier2

       then your regexps.txt can be converted from:

         ID1.*Identifier1
         ID2.*Identifier2

       into:

         ID1 foo bar baz Identifier1
         ID2 foo bar baz Identifier2

       This way you can use grep -F which takes around 80% less memory and
       is much faster.

       If it still does not fit in memory you can do this:

         parallel --pipepart -a regexps.txt --block 1M grep -Ff - -n bigfile | \
           sort -un | perl -pe 's/^\d+://'

       The 1M should be your free memory divided by the number of CPU
       threads and divided by 200 for grep -F and by 1000 for normal grep.
       On GNU/Linux you can do:

         free=$(awk '/^((Swap)?Cached|MemFree|Buffers):/ { sum += $2 }
                     END { print sum }' /proc/meminfo)
         percpu=$((free / 200 / $(parallel --number-of-threads)))k

         parallel --pipepart -a regexps.txt --block $percpu --compress \
           grep -F -f - -n bigfile | \
           sort -un | perl -pe 's/^\d+://'

       If you can live with duplicated lines and wrong order, it is faster
       to do:

         parallel --pipepart -a regexps.txt --block $percpu --compress \
           grep -F -f - bigfile

   Limiting factor: CPU
       If the CPU is the limiting factor, parallelization should be done
       on the regexps:

         cat regexps.txt | parallel --pipe -L1000 --roundrobin --compress \
           grep -f - -n bigfile | \
           sort -un | perl -pe 's/^\d+://'

       The command will start one grep per CPU and read bigfile one time
       per CPU, but as that is done in parallel, all reads except the
       first will be cached in RAM. Depending on the size of regexps.txt
       it may be faster to use --block 10m instead of -L1000.

       Some storage systems perform better when reading multiple chunks in
       parallel. This is true for some RAID systems and for some network
       file systems. To parallelize the reading of bigfile:

         parallel --pipepart --block 100M -a bigfile -k --compress \
           grep -f regexps.txt

       This will split bigfile into 100MB chunks and run grep on each of
       these chunks. To parallelize both the reading of bigfile and of
       regexps.txt combine the two using --cat:

         parallel --pipepart --block 100M -a bigfile --cat cat regexps.txt \
           \| parallel --pipe -L1000 --roundrobin grep -f - {}

       If a line matches multiple regexps, the line may be duplicated.

   Bigger problem
       If the problem is too big to be solved by this, you are probably
       ready for Lucene.

EXAMPLE: Using remote computers
       To run commands on a remote computer SSH needs to be set up and you
       must be able to log in without entering a password (the commands
       ssh-copy-id, ssh-agent, and sshpass may help you do that).

       If you need to log in to a whole cluster, you typically do not want
       to accept the host key for every host. You want to accept them the
       first time and be warned if they are ever changed. To do that:

         # Add the servers to the sshloginfile
         (echo servera; echo serverb) > .parallel/my_cluster
         # Make sure .ssh/config exists
         touch .ssh/config
         cp .ssh/config .ssh/config.backup
         # Disable StrictHostKeyChecking temporarily
         (echo 'Host *'; echo StrictHostKeyChecking no) >> .ssh/config
         parallel --slf my_cluster --nonall true
         # Remove the disabling of StrictHostKeyChecking
         mv .ssh/config.backup .ssh/config

       The servers in .parallel/my_cluster are now added in
       .ssh/known_hosts.

       To run echo on server.example.com:

         seq 10 | parallel --sshlogin server.example.com echo

       To run commands on more than one remote computer run:

         seq 10 | parallel --sshlogin s1.example.com,s2.example.net echo

       Or:

         seq 10 | parallel --sshlogin server.example.com \
           --sshlogin server2.example.net echo

       If the login username is foo on server2.example.net use:

         seq 10 | parallel --sshlogin server.example.com \
           --sshlogin foo@server2.example.net echo

       If your list of hosts is server1-88.example.net with login foo:

         seq 10 | parallel -Sfoo@server{1..88}.example.net echo

       To distribute the commands to a list of computers, make a file
       mycomputers with all the computers:

         server.example.com
         foo@server2.example.com
         server3.example.com

       Then run:

         seq 10 | parallel --sshloginfile mycomputers echo

       To include the local computer add the special sshlogin ':' to the
       list:

         server.example.com
         foo@server2.example.com
         server3.example.com
         :

       GNU parallel will try to determine the number of CPUs on each of
       the remote computers, and run one job per CPU - even if the remote
       computers do not have the same number of CPUs.

       If the number of CPUs on the remote computers is not identified
       correctly, the number of CPUs can be added in front. Here the
       computer has 8 CPUs:

         seq 10 | parallel --sshlogin 8/server.example.com echo

EXAMPLE: Transferring of files
       To recompress gzipped files with bzip2 using a remote computer run:

         find logs/ -name '*.gz' | \
           parallel --sshlogin server.example.com \
             --transfer "zcat {} | bzip2 -9 >{.}.bz2"

       This will list the .gz-files in the logs directory and all
       directories below. Then it will transfer the files to
       server.example.com to the corresponding directory in $HOME/logs. On
       server.example.com the file will be recompressed using zcat and
       bzip2, resulting in the corresponding file with .gz replaced with
       .bz2.

       If you want the resulting bz2-file to be transferred back to the
       local computer add --return {.}.bz2:

         find logs/ -name '*.gz' | \
           parallel --sshlogin server.example.com \
             --transfer --return {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"

       After the recompressing is done the .bz2-file is transferred back
       to the local computer and put next to the original .gz-file.

       If you want to delete the transferred files on the remote computer
       add --cleanup. This will remove both the file transferred to the
       remote computer and the files transferred from the remote computer:

         find logs/ -name '*.gz' | \
           parallel --sshlogin server.example.com \
             --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"

       If you want to run on several computers add the computers to
       --sshlogin either using ',' or multiple --sshlogin:

         find logs/ -name '*.gz' | \
           parallel --sshlogin server.example.com,server2.example.com \
             --sshlogin server3.example.com \
             --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"

       You can add the local computer using --sshlogin :. This will
       disable the removing and transferring for the local computer only:

         find logs/ -name '*.gz' | \
           parallel --sshlogin server.example.com,server2.example.com \
             --sshlogin server3.example.com \
             --sshlogin : \
             --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"

       Often --transfer, --return and --cleanup are used together. They
       can be shortened to --trc:

         find logs/ -name '*.gz' | \
           parallel --sshlogin server.example.com,server2.example.com \
             --sshlogin server3.example.com \
             --sshlogin : \
             --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"

       With the file mycomputers containing the list of computers it
       becomes:

         find logs/ -name '*.gz' | parallel --sshloginfile mycomputers \
           --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"

       If the file ~/.parallel/sshloginfile contains the list of computers
       the special shorthand -S .. can be used:

         find logs/ -name '*.gz' | parallel -S .. \
           --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"

EXAMPLE: Distributing work to local and remote computers
       Convert *.mp3 to *.ogg running one process per CPU on the local
       computer and server2:

         parallel --trc {.}.ogg -S server2,: \
           'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg' ::: *.mp3

EXAMPLE: Running the same command on remote computers
       To run the command uptime on remote computers you can do:

         parallel --tag --nonall -S server1,server2 uptime

       --nonall reads no arguments. If you have a list of jobs you want to
       run on each computer you can do:

         parallel --tag --onall -S server1,server2 echo ::: 1 2 3

       Remove --tag if you do not want the sshlogin added before the
       output.

       If you have a lot of hosts, use '-j0' to access more hosts in
       parallel.

EXAMPLE: Running sudo on remote computers
       Put the password into passwordfile then run:

         parallel --ssh 'cat passwordfile | ssh' --nonall \
           -S user@server1,user@server2 sudo -S ls -l /root

EXAMPLE: Using remote computers behind NAT wall
       If the workers are behind a NAT wall, you need some trickery to get
       to them.

       If you can ssh to a jumphost, and reach the workers from there,
       then the obvious solution would be this, but it does not work:

         parallel --ssh 'ssh jumphost ssh' -S host1 echo ::: DOES NOT WORK

       It does not work because the command is dequoted by ssh twice,
       whereas GNU parallel only expects it to be dequoted once.

       You can use a bash function and have GNU parallel quote the
       command:

         jumpssh() { ssh -A jumphost ssh $(parallel --shellquote ::: "$@"); }
         export -f jumpssh
         parallel --ssh jumpssh -S host1 echo ::: this works

       Or you can instead put this in ~/.ssh/config:

         Host host1 host2 host3
           ProxyCommand ssh jumphost.domain nc -w 1 %h 22

       It requires nc (netcat) to be installed on jumphost. With this you
       can simply:

         parallel -S host1,host2,host3 echo ::: This does work

   No jumphost, but port forwards
       If there is no jumphost but each server has port 22 forwarded from
       the firewall (e.g. the firewall's port 22001 = port 22 on host1,
       22002 = host2, 22003 = host3) then you can use ~/.ssh/config:

         Host host1.v
           Port 22001
         Host host2.v
           Port 22002
         Host host3.v
           Port 22003
         Host *.v
           Hostname firewall

       And then use host{1..3}.v as normal hosts:

         parallel -S host1.v,host2.v,host3.v echo ::: a b c

   No jumphost, no port forwards
       If ports cannot be forwarded, you need some sort of VPN to traverse
       the NAT-wall. TOR is one option for that, as it is very easy to get
       working.

       You need to install TOR and set up a hidden service. In torrc put:

         HiddenServiceDir /var/lib/tor/hidden_service/
         HiddenServicePort 22 127.0.0.1:22

       Then start TOR: /etc/init.d/tor restart

       The TOR hostname is now in /var/lib/tor/hidden_service/hostname and
       is something similar to izjafdceobowklhz.onion. Now you simply
       prepend torsocks to ssh:

         parallel --ssh 'torsocks ssh' -S izjafdceobowklhz.onion \
           -S zfcdaeiojoklbwhz.onion,auclucjzobowklhi.onion echo ::: a b c

       If not all hosts are accessible through TOR:

         parallel -S 'torsocks ssh izjafdceobowklhz.onion,host2,host3' \
           echo ::: a b c

       See more ssh tricks on
       https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Proxies_and_Jump_Hosts

EXAMPLE: Parallelizing rsync
       rsync is a great tool, but sometimes it will not fill up the
       available bandwidth. Running multiple rsync in parallel can fix
       this.

         cd src-dir
         find . -type f |
           parallel -j10 -X rsync -zR -Ha ./{} fooserver:/dest-dir/

       Adjust -j10 until you find the optimal number.

       rsync -R will create the needed subdirectories, so all files are
       not put into a single dir. The ./ is needed so the resulting
       command looks similar to:

         rsync -zR ././sub/dir/file fooserver:/dest-dir/

       The /./ is what rsync -R works on.

       If you are unable to push data, but need to pull them, and the
       files are called digits.png (e.g. 000000.png), you might be able to
       do:

         seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/

EXAMPLE: Use multiple inputs in one command
       Copy files like foo.es.ext to foo.ext:

         ls *.es.* | perl -pe 'print; s/\.es//' | parallel -N2 cp {1} {2}

       The perl command spits out 2 lines for each input. GNU parallel
       takes 2 inputs (using -N2) and replaces {1} and {2} with the
       inputs.

       Count in binary:

         parallel -k echo ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1

       Print the number on the opposing sides of a six sided die:

         parallel --link -a <(seq 6) -a <(seq 6 -1 1) echo
         parallel --link echo :::: <(seq 6) <(seq 6 -1 1)

       Convert files from all subdirs to PNG-files with consecutive
       numbers (useful for making input PNGs for ffmpeg):

         parallel --link -a <(find . -type f | sort) \
           -a <(seq $(find . -type f|wc -l)) convert {1} {2}.png

       Alternative version:

         find . -type f | sort | parallel convert {} {#}.png

EXAMPLE: Use a table as input
       Content of table_file.tsv:

         foo<TAB>bar
         baz <TAB> quux

       To run:

         cmd -o bar -i foo
         cmd -o quux -i baz

       you can run:

         parallel -a table_file.tsv --colsep '\t' cmd -o {2} -i {1}

       Note: The default for GNU parallel is to remove the spaces around
       the columns. To keep the spaces:

         parallel -a table_file.tsv --trim n --colsep '\t' cmd -o {2} -i {1}

3605 GNU parallel can output to a database table and a CSV-file:
3606
3607 dburl=csv:///%2Ftmp%2Fmydir
3608 dbtableurl=$dburl/mytable.csv
3609 parallel --sqlandworker $dbtableurl seq ::: {1..10}
3610
3611 It is rather slow and takes up a lot of CPU time because GNU parallel
3612 parses the whole CSV file for each update.
3613
3614 A better approach is to use an SQLite database and then convert that
3615 to CSV:
3616
3617 dburl=sqlite3:///%2Ftmp%2Fmy.sqlite
3618 dbtableurl=$dburl/mytable
3619 parallel --sqlandworker $dbtableurl seq ::: {1..10}
3620 sql $dburl '.headers on' '.mode csv' 'SELECT * FROM mytable;'
3621
3622 This takes around a second per job.
3623
3624 If you have access to a real database system, such as PostgreSQL, it is
3625 even faster:
3626
3627 dburl=pg://user:pass@host/mydb
3628 dbtableurl=$dburl/mytable
3629 parallel --sqlandworker $dbtableurl seq ::: {1..10}
3630 sql $dburl \
3631 "COPY (SELECT * FROM mytable) TO stdout DELIMITER ',' CSV HEADER;"
3632
3633 Or MySQL:
3634
3635 dburl=mysql://user:pass@host/mydb
3636 dbtableurl=$dburl/mytable
3637 parallel --sqlandworker $dbtableurl seq ::: {1..10}
3638 sql -p -B $dburl "SELECT * FROM mytable;" > mytable.tsv
3639 perl -pe 's/"/""/g; s/\t/","/g; s/^/"/; s/$/"/;
3640 %s=("\\" => "\\", "t" => "\t", "n" => "\n");
3641 s/\\([\\tn])/$s{$1}/g;' mytable.tsv
3642
3644 If you have no need for the advanced job distribution control that a
3645 database provides, but you simply want output into a CSV file that you
3646 can read into R or LibreCalc, then you can use --results:
3647
3648 parallel --results my.csv seq ::: 10 20 30
3649 R
3650 > mydf <- read.csv("my.csv");
3651 > print(mydf[2,])
3652 > write(as.character(mydf[2,c("Stdout")]),'')
3653
3655 The show Aflyttet on Radio 24syv publishes an RSS feed with their audio
3656 podcasts on: http://arkiv.radio24syv.dk/audiopodcast/channel/4466232
3657
3658 Using xpath you can extract the URLs for 2019 and download them using
3659 GNU parallel:
3660
3661 wget -O - http://arkiv.radio24syv.dk/audiopodcast/channel/4466232 | \
3662 xpath -e "//pubDate[contains(text(),'2019')]/../enclosure/@url" | \
3663 parallel -u wget '{= s/ url="//; s/"//; =}'
3664
3666 If you want to run the same command with the same arguments 10 times in
3667 parallel you can do:
3668
3669 seq 10 | parallel -n0 my_command my_args
3670
3672 GNU parallel can work similarly to cat | sh.
3673
3674 A resource inexpensive job is a job that takes very little CPU, disk
3675 I/O and network I/O. Ping is an example of a resource inexpensive job.
3676 wget is too - if the webpages are small.
3677
3678 The content of the file jobs_to_run:
3679
3680 ping -c 1 10.0.0.1
3681 wget http://example.com/status.cgi?ip=10.0.0.1
3682 ping -c 1 10.0.0.2
3683 wget http://example.com/status.cgi?ip=10.0.0.2
3684 ...
3685 ping -c 1 10.0.0.255
3686 wget http://example.com/status.cgi?ip=10.0.0.255
3687
3688 To run 100 processes simultaneously do:
3689
3690 parallel -j 100 < jobs_to_run
3691
3692 As no command is given, each line will be evaluated by the shell.
3693
3695 FASTA files have the format:
3696
3697 >Sequence name1
3698 sequence
3699 sequence continued
3700 >Sequence name2
3701 sequence
3702 sequence continued
3703 more sequence
3704
3705 To call myprog with the sequence as argument run:
3706
3707 cat file.fasta |
3708 parallel --pipe -N1 --recstart '>' --rrs \
3709 'read a; echo Name: "$a"; myprog $(tr -d "\n")'
3710
3712 To process a big file or some output you can use --pipe to split up the
3713 data into blocks and pipe the blocks into the processing program.
3714
3715 If the program is gzip -9 you can do:
3716
3717 cat bigfile | parallel --pipe --recend '' -k gzip -9 > bigfile.gz
3718
3719 This will split bigfile into blocks of 1 MB and pass that to gzip -9 in
3720 parallel. One gzip will be run per CPU. The output of gzip -9 will be
3721 kept in order and saved to bigfile.gz
3722
3723 gzip works fine if the output is appended, but some processing does not
3724 work like that - for example sorting. For this GNU parallel can put the
3725 output of each command into a file. This will sort a big file in
3726 parallel:
3727
3728 cat bigfile | parallel --pipe --files sort |\
3729 parallel -Xj1 sort -m {} ';' rm {} >bigfile.sort
3730
3731 Here bigfile is split into blocks of around 1MB, each block ending in
3732 '\n' (which is the default for --recend). Each block is passed to sort
3733 and the output from sort is saved into files. These files are passed to
3734 the second parallel that runs sort -m on the files before it removes
3735 the files. The output is saved to bigfile.sort.
3736
3737 GNU parallel's --pipe maxes out at around 100 MB/s because every byte
3738 has to be copied through GNU parallel. But if bigfile is a real
3739 (seekable) file GNU parallel can bypass the copying and send the
3740 parts directly to the program:
3741
3742 parallel --pipepart --block 100m -a bigfile --files sort |\
3743 parallel -Xj1 sort -m {} ';' rm {} >bigfile.sort
3744
3746 When processing with --pipe you may have lines grouped by a value. Here
3747 is my.csv:
3748
3749 Transaction Customer Item
3750 1 a 53
3751 2 b 65
3752 3 b 82
3753 4 c 96
3754 5 c 67
3755 6 c 13
3756 7 d 90
3757 8 d 43
3758 9 d 91
3759 10 d 84
3760 11 e 72
3761 12 e 102
3762 13 e 63
3763 14 e 56
3764 15 e 74
3765
3766 Let us assume you want GNU parallel to process each customer. In other
3767 words: You want all the transactions for a single customer to be
3768 treated as a single record.
3769
3770 To do this we preprocess the data with a program that inserts a record
3771 separator before each customer (column 2 = $F[1]). Here we first make a
3772 50 character random string, which we then use as the separator:
3773
3774 sep=`perl -e 'print map { ("a".."z","A".."Z")[rand(52)] } (1..50);'`
3775 cat my.csv | \
3776 perl -ape '$F[1] ne $l and print "'$sep'"; $l = $F[1]' | \
3777 parallel --recend $sep --rrs --pipe -N1 wc
3778
3779 If your program can process multiple customers replace -N1 with a
3780 reasonable --blocksize.
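As a sketch, the multi-customer variant changes only the last command: drop -N1 and pick a block size (10M here is an arbitrary choice; wc stands in for the real program):

```shell
# Same preprocessing as before: a random 50-char record separator is
# inserted before each new customer, then blocks of whole customer
# records are piped to the processing program.
sep=$(perl -e 'print map { ("a".."z","A".."Z")[rand(52)] } (1..50);')
cat my.csv | \
  perl -ape '$F[1] ne $l and print "'$sep'"; $l = $F[1]' | \
  parallel --recend "$sep" --rrs --pipe --block 10M wc
```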
3781
3783 If you need to run a massive amount of jobs in parallel, then you will
3784 likely hit the filehandle limit which is often around 250 jobs. If you
3785 are super user you can raise the limit in /etc/security/limits.conf but
3786 you can also use this workaround. The filehandle limit is per process.
3787 That means that if you just spawn more GNU parallels then each of them
3788 can run 250 jobs. This will spawn up to 2500 jobs:
3789
3790 cat myinput |\
3791 parallel --pipe -N 50 --roundrobin -j50 parallel -j50 your_prg
3792
3793 This will spawn up to 62500 jobs (use with caution - you need 64 GB RAM
3794 to do this, and you may need to increase /proc/sys/kernel/pid_max):
3795
3796 cat myinput |\
3797 parallel --pipe -N 250 --roundrobin -j250 parallel -j250 your_prg
3798
3800 The command sem is an alias for parallel --semaphore.
3801
3802 A counting semaphore will allow a given number of jobs to be started
3803 in the background. When that number of jobs is running, GNU sem will
3804 wait for one of these to complete before starting another command.
3805 sem --wait will wait for all jobs to complete.
3806
3807 Run 10 jobs concurrently in the background:
3808
3809 for i in *.log ; do
3810 echo $i
3811 sem -j10 gzip $i ";" echo done
3812 done
3813 sem --wait
3814
3815 A mutex is a counting semaphore allowing only one job to run. This
3816 will edit the file myfile and prepend lines with the numbers 1 to 3
3817 to the file.
3818
3819 seq 3 | parallel sem sed -i -e '1i{}' myfile
3820
3821 As myfile can be very big it is important that only one process edits
3822 the file at a time.
3823
3824 Name the semaphore to have multiple different semaphores active at the
3825 same time:
3826
3827 seq 3 | parallel sem --id mymutex sed -i -e '1i{}' myfile
3828
3830 Assume a script is called from cron or from a web service, but only one
3831 instance can be run at a time. With sem and --shebang-wrap the script
3832 can be made to wait for other instances to finish. Here in bash:
3833
3834 #!/usr/bin/sem --shebang-wrap -u --id $0 --fg /bin/bash
3835
3836 echo This will run
3837 sleep 5
3838 echo exclusively
3839
3840 Here perl:
3841
3842 #!/usr/bin/sem --shebang-wrap -u --id $0 --fg /usr/bin/perl
3843
3844 print "This will run ";
3845 sleep 5;
3846 print "exclusively\n";
3847
3848 Here python:
3849
3850 #!/usr/local/bin/sem --shebang-wrap -u --id $0 --fg /usr/bin/python
3851
3852 import time
3853 print "This will run ";
3854 time.sleep(5)
3855 print "exclusively";
3856
3858 You can use GNU parallel to start interactive programs like emacs or
3859 vi:
3860
3861 cat filelist | parallel --tty -X emacs
3862 cat filelist | parallel --tty -X vi
3863
3864 If there are more files than will fit on a single command line, the
3865 editor will be started again with the remaining files.
3866
3868 sudo requires a password to run a command as root. It caches the
3869 access, so you only need to enter the password again if you have not
3870 used sudo for a while.
3871
3872 The command:
3873
3874 parallel sudo echo ::: This is a bad idea
3875
3876 is no good, as you would be prompted for the sudo password for each of
3877 the jobs. You can either do:
3878
3879 sudo echo This
3880 parallel sudo echo ::: is a good idea
3881
3882 or:
3883
3884 sudo parallel echo ::: This is a good idea
3885
3886 This way you only have to enter the sudo password once.
3887
3889 GNU parallel can work as a simple job queue system or batch manager.
3890 The idea is to put the jobs into a file and have GNU parallel read from
3891 that continuously. As GNU parallel will stop at end of file we use tail
3892 to continue reading:
3893
3894 true >jobqueue; tail -n+0 -f jobqueue | parallel
3895
3896 To submit your jobs to the queue:
3897
3898 echo my_command my_arg >> jobqueue
3899
3900 You can of course use -S to distribute the jobs to remote computers:
3901
3902 true >jobqueue; tail -n+0 -f jobqueue | parallel -S ..
3903
3904 If you keep this running for a long time, jobqueue will grow. A way of
3905 removing the jobs already run is by making GNU parallel stop when it
3906 hits a special value and then restart. To use --eof to make GNU
3907 parallel exit, tail also needs to be forced to exit:
3908
3909 true >jobqueue;
3910 while true; do
3911 tail -n+0 -f jobqueue |
3912 (parallel -E StOpHeRe -S ..; echo GNU Parallel is now done;
3913 perl -e 'while(<>){/StOpHeRe/ and last};print <>' jobqueue > j2;
3914 (seq 1000 >> jobqueue &);
3915 echo Done appending dummy data forcing tail to exit)
3916 echo tail exited;
3917 mv j2 jobqueue
3918 done
3919
3920 In some cases you can run on more CPUs and computers during the night:
3921
3922 # Day time
3923 echo 50% > jobfile
3924 cp day_server_list ~/.parallel/sshloginfile
3925 # Night time
3926 echo 100% > jobfile
3927 cp night_server_list ~/.parallel/sshloginfile
3928 tail -n+0 -f jobqueue | parallel --jobs jobfile -S ..
3929
3930 GNU parallel discovers if jobfile or ~/.parallel/sshloginfile changes.
3931
3932 There is a small issue when using GNU parallel as queue system/batch
3933 manager: You have to submit JobSlot number of jobs before they will
3934 start, and after that you can submit one at a time, and the job will
3935 start immediately if a free slot is available. Output from the
3936 running or completed jobs is held back and will only be printed when
3937 JobSlots more jobs have been started (unless you use --ungroup or
3938 --line-buffer, in which case the output from the jobs is printed
3939 immediately). E.g. if you have 10 jobslots then the output from the
3940 first completed job will only be printed when job 11 has started, and
3941 the output of the second will only be printed when job 12 has started.
3942
3944 If you have a dir in which users drop files that need to be processed
3945 you can do this on GNU/Linux (if you know what inotifywait is called
3946 on other platforms, file a bug report):
3947
3948 inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f my_dir |\
3949 parallel -u echo
3950
3951 This will run the command echo on each file put into my_dir or subdirs
3952 of my_dir.
3953
3954 You can of course use -S to distribute the jobs to remote computers:
3955
3956 inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f my_dir |\
3957 parallel -S .. -u echo
3958
3959 If the files to be processed are in a tar file then unpacking one file
3960 and processing it immediately may be faster than first unpacking all
3961 files. Set up the dir processor as above and unpack into the dir.
3962
3963 Using GNU parallel as dir processor has the same limitations as using
3964 GNU parallel as queue system/batch manager.
3965
3967 If you have downloaded source and tried compiling it, you may have
3968 seen:
3969
3970 $ ./configure
3971 [...]
3972 checking for something.h... no
3973 configure: error: "libsomething not found"
3974
3975 Often it is not obvious which package you should install to get that
3976 file. Debian has apt-file to search for a file. tracefile from
3977 https://gitlab.com/ole.tange/tangetools can tell which files a program
3978 tried to access. In this case we are interested in one of the last
3979 files:
3980
3981 $ tracefile -un ./configure | tail | parallel -j0 apt-file search
3982
3984 --round-robin, --pipe-part, --shard, --bin and --group-by are all
3985 specialized versions of --pipe.
3986
3987 In the following n is the number of jobslots given by --jobs. A record
3988 starts with --recstart and ends with --recend. It is typically a full
3989 line. A chunk is a number of full records that is approximately the
3990 size of a block. A block can contain half records, a chunk cannot.
3991
3992 --pipe starts one job per chunk. It reads blocks from stdin (standard
3993 input). It finds a record end near a block border and passes a chunk to
3994 the program.
3995
3996 --pipe-part starts one job per chunk - just like normal --pipe. It
3997 first finds record endings near all block borders in the file and then
3998 starts the jobs. By using --block -1 it will set the block size to 1/n
3999 * size-of-file. Used this way it will start n jobs in total.
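As a sketch, with 4 jobslots and --block -1 each jobslot gets roughly a quarter of the file in one chunk:

```shell
seq 1000 > numbers.txt
# --block -1 sets the block size to size-of-file / number-of-jobslots,
# so -j4 starts 4 jobs, each reading its part of the file directly.
parallel -j4 --pipe-part --block -1 -a numbers.txt wc -l
```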
4000
4001 --round-robin starts n jobs in total. It reads a block and passes a
4002 chunk to whichever job is ready to read. It does not parse the content
4003 except for identifying where a record ends to make sure it only passes
4004 full records.
4005
4006 --shard starts n jobs in total. It parses each line to read the value
4007 in the given column. Based on this value the line is passed to one of
4008 the n jobs. All lines having this value will be given to the same
4009 jobslot.
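As a sketch (assuming a space-separated input where column 1 holds the key), --shard 1 sends all lines with the same key to the same jobslot; wc -l stands in for a real per-shard program:

```shell
# All lines with the same value in column 1 go to the same jobslot.
printf 'a 1\nb 2\na 3\nb 4\n' |
  parallel --pipe --colsep ' ' --shard 1 -j2 wc -l
```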
4010
4011 --bin works like --shard but the value of the column is the jobslot
4012 number it will be passed to. If the value is bigger than n, then n
4013 will be subtracted from the value until the value is smaller than or
4014 equal to n.
4015
4016 --group-by starts one job per chunk. Record borders are not given by
4017 --recend/--recstart. Instead a record is defined by a number of lines
4018 having the same value in a given column. So the value of a given column
4019 changes at a chunk border. With --pipe every line is parsed, with
4020 --pipe-part only a few lines are parsed to find the chunk border.
4021
4022 --group-by can be combined with --round-robin or --pipe-part.
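As a sketch (assuming a whitespace-separated table like my.csv above, with the grouping value in column 2), each job gets all consecutive lines sharing the column-2 value; wc -l again stands in for the real program:

```shell
# Skip the header line, then hand each group of lines that share the
# value in column 2 to one wc -l call; -k keeps the output in order.
tail -n +2 my.csv |
  parallel --pipe --colsep '\s+' --group-by 2 -kN1 wc -l
```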
4023
4025 GNU parallel is very liberal in quoting. You only need to quote
4026 characters that have special meaning in shell:
4027
4028 ( ) $ ` ' " < > ; | \
4029
4030 and depending on context these need to be quoted, too:
4031
4032 ~ & # ! ? space * {
4033
4034 Therefore most people will never need more quoting than putting '\' in
4035 front of the special characters.
4036
4037 Often you can simply put \' around every ':
4038
4039 perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' file
4040
4041 can be quoted:
4042
4043 parallel perl -ne \''/^\S+\s+\S+$/ and print $ARGV,"\n"'\' ::: file
4044
4045 However, when you want to use a shell variable you need to quote the
4046 $-sign. Here is an example using $PARALLEL_SEQ. This variable is set by
4047 GNU parallel itself, so the evaluation of the $ must be done by the sub
4048 shell started by GNU parallel:
4049
4050 seq 10 | parallel -N2 echo seq:\$PARALLEL_SEQ arg1:{1} arg2:{2}
4051
4052 If the variable is set before GNU parallel starts you can do this:
4053
4054 VAR=this_is_set_before_starting
4055 echo test | parallel echo {} $VAR
4056
4057 Prints: test this_is_set_before_starting
4058
4059 It is a little more tricky if the variable contains more than one space
4060 in a row:
4061
4062 VAR="two spaces between each word"
4063 echo test | parallel echo {} \'"$VAR"\'
4064
4065 Prints: test two spaces between each word
4066
4067 If the variable should not be evaluated by the shell starting GNU
4068 parallel but be evaluated by the sub shell started by GNU parallel,
4069 then you need to quote it:
4070
4071 echo test | parallel VAR=this_is_set_after_starting \; echo {} \$VAR
4072
4073 Prints: test this_is_set_after_starting
4074
4075 It is a little more tricky if the variable contains space:
4076
4077 echo test |\
4078 parallel VAR='"two spaces between each word"' echo {} \'"$VAR"\'
4079
4080 Prints: test two spaces between each word
4081
4082 $$ is the shell variable containing the process id of the shell. This
4083 will print the process id of the shell running GNU parallel:
4084
4085 seq 10 | parallel echo $$
4086
4087 And this will print the process ids of the sub shells started by GNU
4088 parallel.
4089
4090 seq 10 | parallel echo \$\$
4091
4092 If the special characters should not be evaluated by the sub shell then
4093 you need to protect it against evaluation from both the shell starting
4094 GNU parallel and the sub shell:
4095
4096 echo test | parallel echo {} \\\$VAR
4097
4098 Prints: test $VAR
4099
4100 GNU parallel can protect against evaluation by the sub shell by using
4101 -q:
4102
4103 echo test | parallel -q echo {} \$VAR
4104
4105 Prints: test $VAR
4106
4107 This is particularly useful if you have lots of quoting. If you want to
4108 run a perl script like this:
4109
4110 perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' file
4111
4112 It needs to be quoted like one of these:
4113
4114 ls | parallel perl -ne '/^\\S+\\s+\\S+\$/\ and\ print\ \$ARGV,\"\\n\"'
4115 ls | parallel perl -ne \''/^\S+\s+\S+$/ and print $ARGV,"\n"'\'
4116
4117 Notice how spaces, \'s, "'s, and $'s need to be quoted. GNU parallel
4118 can do the quoting by using option -q:
4119
4120 ls | parallel -q perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"'
4121
4122 However, this means you cannot make the sub shell interpret special
4123 characters. For example because of -q this WILL NOT WORK:
4124
4125 ls *.gz | parallel -q "zcat {} >{.}"
4126 ls *.gz | parallel -q "zcat {} | bzip2 >{.}.bz2"
4127
4128 because > and | need to be interpreted by the sub shell.
4129
4130 If you get errors like:
4131
4132 sh: -c: line 0: syntax error near unexpected token
4133 sh: Syntax error: Unterminated quoted string
4134 sh: -c: line 0: unexpected EOF while looking for matching `''
4135 sh: -c: line 1: syntax error: unexpected end of file
4136 zsh:1: no matches found:
4137
4138 then you might try using -q.
4139
4140 If you are using bash process substitution like <(cat foo) then you may
4141 try -q and prepending command with bash -c:
4142
4143 ls | parallel -q bash -c 'wc -c <(echo {})'
4144
4145 Or for substituting output:
4146
4147 ls | parallel -q bash -c \
4148 'tar c {} | tee >(gzip >{}.tar.gz) | bzip2 >{}.tar.bz2'
4149
4150 Conclusion: To avoid dealing with the quoting problems it may be easier
4151 just to write a small script or a function (remember to export -f the
4152 function) and have GNU parallel call that.
4153
4155 If you want a list of the jobs currently running you can run:
4156
4157 killall -USR1 parallel
4158
4159 GNU parallel will then print the currently running jobs on stderr
4160 (standard error).
4161
4163 If you regret starting a lot of jobs you can simply break GNU parallel,
4164 but if you want to make sure you do not have half-completed jobs you
4165 should send the signal SIGHUP to GNU parallel:
4166
4167 killall -HUP parallel
4168
4169 This will tell GNU parallel to not start any new jobs, but wait until
4170 the currently running jobs are finished before exiting.
4171
4173 $PARALLEL_HOME
4174 Dir where GNU parallel stores config files, semaphores, and
4175 caches information between invocations. Default:
4176 $HOME/.parallel.
4177
4178 $PARALLEL_HOSTGROUPS
4179 When using --hostgroups GNU parallel sets this to the
4180 intersection of the hostgroups of the job and the sshlogin
4181 that the job is run on.
4182
4183 Remember to quote the $, so it gets evaluated by the correct
4184 shell. Or use --plus and {hgrp}.
4185
4186 $PARALLEL_JOBSLOT
4187 Set by GNU parallel and can be used in jobs run by GNU
4188 parallel. Remember to quote the $, so it gets evaluated by
4189 the correct shell. Or use --plus and {slot}.
4190
4191 $PARALLEL_JOBSLOT is the jobslot of the job. It is equal to
4192 {%} unless the job is being retried. See {%} for details.
4193
4194 $PARALLEL_PID
4195 Set by GNU parallel and can be used in jobs run by GNU
4196 parallel. Remember to quote the $, so it gets evaluated by
4197 the correct shell.
4198
4199 This makes it possible for the jobs to communicate directly to
4200 GNU parallel.
4201
4202 Example: If each of the jobs tests a solution and one of jobs
4203 finds the solution the job can tell GNU parallel not to start
4204 more jobs by: kill -HUP $PARALLEL_PID. This only works on the
4205 local computer.
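A sketch of that idea: each job checks one candidate, and the job that finds the answer (here: 3) tells GNU parallel to stop starting new jobs. The single quotes make $PARALLEL_PID be evaluated by the sub shell:

```shell
# Jobs already running are allowed to finish; no new jobs start
# after kill -HUP has been sent to GNU parallel.
seq 10 | parallel -j2 'test {} = 3 && kill -HUP $PARALLEL_PID; echo {}'
```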
4206
4207 $PARALLEL_RSYNC_OPTS
4208 Options to pass on to rsync. Defaults to: -rlDzR.
4209
4210 $PARALLEL_SHELL
4211 Use this shell for the commands run by GNU parallel:
4212
4213 • $PARALLEL_SHELL. If undefined use:
4214
4215 • The shell that started GNU parallel. If that cannot be
4216 determined:
4217
4218 • $SHELL. If undefined use:
4219
4220 • /bin/sh
4221
4222 $PARALLEL_SSH
4223 GNU parallel defaults to using the ssh command for remote
4224 access. This can be overridden with $PARALLEL_SSH, which in
4225 turn can be overridden with --ssh. It can also be set on a
4226 per server basis (see --sshlogin).
4227
4228 $PARALLEL_SSHHOST
4229 Set by GNU parallel and can be used in jobs run by GNU
4230 parallel. Remember to quote the $, so it gets evaluated by
4231 the correct shell. Or use --plus and {host}.
4232
4233 $PARALLEL_SSHHOST is the host part of an sshlogin line. E.g.
4234
4235 4//usr/bin/specialssh user@host
4236
4237 becomes:
4238
4239 host
4240
4241 $PARALLEL_SSHLOGIN
4242 Set by GNU parallel and can be used in jobs run by GNU
4243 parallel. Remember to quote the $, so it gets evaluated by
4244 the correct shell. Or use --plus and {sshlogin}.
4245
4246 The value is the sshlogin line with number of cores removed.
4247 E.g.
4248
4249 4//usr/bin/specialssh user@host
4250
4251 becomes:
4252
4253 /usr/bin/specialssh user@host
4254
4255 $PARALLEL_SEQ
4256 Set by GNU parallel and can be used in jobs run by GNU
4257 parallel. Remember to quote the $, so it gets evaluated by
4258 the correct shell.
4259
4260 $PARALLEL_SEQ is the sequence number of the job running.
4261
4262 Example:
4263
4264 seq 10 | parallel -N2 \
4265 echo seq:'$'PARALLEL_SEQ arg1:{1} arg2:{2}
4266
4267 {#} is a shorthand for $PARALLEL_SEQ.
4268
4269 $PARALLEL_TMUX
4270 Path to tmux. If unset the tmux in $PATH is used.
4271
4272 $TMPDIR Directory for temporary files. See: --tmpdir.
4273
4274 $PARALLEL
4275 The environment variable $PARALLEL will be used as default
4276 options for GNU parallel. If the variable contains special
4277 shell characters (e.g. $, *, or space) then these need to
4278 be escaped with \.
4279
4280 Example:
4281
4282 cat list | parallel -j1 -k -v ls
4283 cat list | parallel -j1 -k -v -S"myssh user@server" ls
4284
4285 can be written as:
4286
4287 cat list | PARALLEL="-kvj1" parallel ls
4288 cat list | PARALLEL='-kvj1 -S myssh\ user@server' \
4289 parallel echo
4290
4291 Notice the \ after 'myssh' is needed because 'myssh' and
4292 'user@server' must be one argument.
4293
4295 The global configuration file /etc/parallel/config, followed by user
4296 configuration file ~/.parallel/config (formerly known as .parallelrc)
4297 will be read in turn if they exist. Lines starting with '#' will be
4298 ignored. The format can follow that of the environment variable
4299 $PARALLEL, but it is often easier to simply put each option on its own
4300 line.
4301
4302 Options on the command line take precedence, followed by the
4303 environment variable $PARALLEL, user configuration file
4304 ~/.parallel/config, and finally the global configuration file
4305 /etc/parallel/config.
4306
4307 Note that no file that is read for options, nor the environment
4308 variable $PARALLEL, may contain retired options such as --tollef.
4309
4311 If --profile is set, GNU parallel will read the profile from that
4312 file rather than the global or user configuration files. You can have
4313 multiple --profiles.
4314
4315 Profiles are searched for in ~/.parallel. If the name starts with / it
4316 is seen as an absolute path. If the name starts with ./ it is seen as a
4317 relative path from current dir.
4318
4319 Example: Profile for running a command on every sshlogin in
4320 ~/.ssh/sshlogins and prepend the output with the sshlogin:
4321
4322 echo --tag -S .. --nonall > ~/.parallel/n
4323 parallel -Jn uptime
4324
4325 Example: Profile for running every command with -j-1 and nice
4326
4327 echo -j-1 nice > ~/.parallel/nice_profile
4328 parallel -J nice_profile bzip2 -9 ::: *
4329
4330 Example: Profile for running a perl script before every command:
4331
4332 echo "perl -e '\$a=\$\$; print \$a,\" \",'\$PARALLEL_SEQ',\" \";';" \
4333 > ~/.parallel/pre_perl
4334 parallel -J pre_perl echo ::: *
4335
4336 Note how the $ and " need to be quoted using \.
4337
4338 Example: Profile for running distributed jobs with nice on the remote
4339 computers:
4340
4341 echo -S .. nice > ~/.parallel/dist
4342 parallel -J dist --trc {.}.bz2 bzip2 -9 ::: *
4343
4345 Exit status depends on --halt-on-error if one of these is used:
4346 success=X, success=Y%, fail=Y%.
4347
4348 0 All jobs ran without error. If success=X is used: X jobs ran
4349 without error. If success=Y% is used: Y% of the jobs ran without
4350 error.
4351
4352 1-100 Some of the jobs failed. The exit status gives the number of
4353 failed jobs. If Y% is used the exit status is the percentage of
4354 jobs that failed.
4355
4356 101 More than 100 jobs failed.
4357
4358 255 Other error.
4359
4360 -1 (In joblog and SQL table)
4361 Killed by Ctrl-C, timeout, not enough memory or similar.
4362
4363 -2 (In joblog and SQL table)
4364 skip() was called in {= =}.
4365
4366 -1000 (In SQL table)
4367 Job is ready to run (set by --sqlmaster).
4368
4369 -1220 (In SQL table)
4370 Job is taken by worker (set by --sqlworker).
4371
4372 If fail=1 is used, the exit status will be the exit status of the
4373 failing job.
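A quick way to see the rules above in action: two of the four jobs below fail, so GNU parallel exits with status 2:

```shell
parallel exit ::: 0 0 1 2
echo $?    # two jobs failed (exit 1 and exit 2), so this prints 2
```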
4374
4376 See: man parallel_alternatives
4377
4379 Quoting of newline
4380 Because of the way newline is quoted this will not work:
4381
4382 echo 1,2,3 | parallel -vkd, "echo 'a{}b'"
4383
4384 However, these will all work:
4385
4386 echo 1,2,3 | parallel -vkd, echo a{}b
4387 echo 1,2,3 | parallel -vkd, "echo 'a'{}'b'"
4388 echo 1,2,3 | parallel -vkd, "echo 'a'"{}"'b'"
4389
4390 Speed
4391 Startup
4392
4393 GNU parallel is slow at starting up - around 250 ms the first time and
4394 150 ms after that.
4395
4396 Job startup
4397
4398 Starting a job on the local machine takes around 10 ms. This can be a
4399 big overhead if the job takes very few ms to run. Often you can group
4400 small jobs together using -X which will make the overhead less
4401 significant. Or you can run multiple GNU parallels as described in
4402 EXAMPLE: Speeding up fast jobs.
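As a sketch: -X packs as many arguments onto each command line as will fit, so 1000 arguments result in only a handful of echo invocations instead of 1000 job startups:

```shell
# Compare the number of echo invocations with and without -X.
seq 1000 | parallel -X echo | wc -l
seq 1000 | parallel echo | wc -l    # 1000 separate jobs
```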
4403
4404 SSH
4405
4406 When using multiple computers GNU parallel opens ssh connections to
4407 them to figure out how many connections can be used reliably
4408 simultaneously (Namely SSHD's MaxStartups). This test is done for each
4409 host in serial, so if your --sshloginfile contains many hosts it may be
4410 slow.
4411
4412 If your jobs are short you may see that there are fewer jobs running
4413 on the remote systems than expected. This is due to time spent
4414 logging in and out. -M (--controlmaster) may help here.
4415
4416 Disk access
4417
4418 A single disk can normally read data faster if it reads one file at a
4419 time instead of reading a lot of files in parallel, as this will avoid
4420 disk seeks. However, newer disk systems with multiple drives can read
4421 faster if reading from multiple files in parallel.
4422
4423 If the jobs are of the form read-all-compute-all-write-all, so
4424 everything is read before anything is written, it may be faster to
4425 force only one disk access at the time:
4426
4427 sem --id diskio cat file | compute | sem --id diskio cat > file
4428
4429 If the jobs are of the form read-compute-write, so writing starts
4430 before all reading is done, it may be faster to force only one reader
4431 and writer at the time:
4432
4433 sem --id read cat file | compute | sem --id write cat > file
4434
4435 If the jobs are of the form read-compute-read-compute, it may be faster
4436 to run more jobs in parallel than the system has CPUs, as some of the
4437 jobs will be stuck waiting for disk access.
4438
4439 --nice limits command length
4440 The current implementation of --nice is too pessimistic in the max
4441 allowed command length. It only uses a little more than half of what it
4442 could. This affects -X and -m. If this becomes a real problem for you,
4443 file a bug-report.
4444
4445 Aliases and functions do not work
4446 If you get:
4447
4448 Can't exec "command": No such file or directory
4449
4450 or:
4451
4452 open3: exec of by command failed
4453
4454 or:
4455
4456 /bin/bash: command: command not found
4457
4458 it may be because command is not known, but it could also be because
4459 command is an alias or a function. If it is a function you need to
4460 export -f the function first or use env_parallel. An alias will only
4461 work if you use env_parallel.
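A sketch of both workarounds, using a made-up function name (doubleit) and assuming a bash shell:

```shell
doubleit() { echo $(( $1 * 2 )); }

# Workaround 1: export the function so bash sub shells can see it.
export -f doubleit
parallel -k doubleit ::: 1 2 3

# Workaround 2: env_parallel copies functions (and aliases) for you.
# . "$(which env_parallel.bash)"   # load once in your shell
# env_parallel -k doubleit ::: 1 2 3
```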
4462
4463 Database with MySQL fails randomly
4464 The --sql* options may fail randomly with MySQL. This problem does not
4465 exist with PostgreSQL.
4466
4468 Report bugs to <bug-parallel@gnu.org> or
4469 https://savannah.gnu.org/bugs/?func=additem&group=parallel
4470
4471 See a perfect bug report on
4472 https://lists.gnu.org/archive/html/bug-parallel/2015-01/msg00000.html
4473
4474 Your bug report should always include:
4475
4476 • The error message you get (if any). If the error message is not from
4477 GNU parallel you need to show why you think GNU parallel caused this.
4478
4479 • The complete output of parallel --version. If you are not running the
4480 latest released version (see http://ftp.gnu.org/gnu/parallel/) you
4481 should specify why you believe the problem is not fixed in that
4482 version.
4483
4484 • A minimal, complete, and verifiable example (See description on
4485 https://stackoverflow.com/help/mcve).
4486
4487 It should be a complete example that others can run which shows the
4488 problem including all files needed to run the example. This should
4489 preferably be small and simple, so try to remove as many options as
4490 possible. A combination of yes, seq, cat, echo, wc, and sleep can
4491 reproduce most errors. If your example requires large files, see if
4492 you can make them with something like seq 100000000 > bigfile or yes
4493 | head -n 1000000000 > file. If you need multiple columns: paste
4494 <(seq 1000) <(seq 1000 1999)
4495
4496 If your example requires remote execution, see if you can use
4497 localhost - maybe using another login.
4498
4499 If you have access to a different system (maybe a VirtualBox on your
4500 own machine), test if the MCVE shows the problem on that system.
4501
       • The output of your example. If your problem is not easily
         reproduced by others, the output might help them figure out
         the problem.

       • Whether you have watched the intro videos
         (http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1),
         walked through the tutorial (man parallel_tutorial), and read
         the EXAMPLE section in the man page (man parallel - search
         for EXAMPLE:).

       If you suspect the error depends on your environment or
       distribution, please see if you can reproduce it on one of
       these VirtualBox images:
       http://sourceforge.net/projects/virtualboximage/files/
       http://www.osboxes.org/virtualbox-images/

       Specifying the name of your distribution is not enough, as you
       may have installed software that is not in the VirtualBox
       images.

       If you cannot reproduce the error on any of the VirtualBox
       images above, see if you can build a VirtualBox image on which
       you can reproduce it. If not, you should assume the debugging
       will be done through you. That will put more burden on you, and
       it is extra important that you give any information that helps.
       In general the problem will be fixed faster and with less work
       for you if you can reproduce the error on a VirtualBox.

AUTHOR
       When using GNU parallel for a publication please cite:

       O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
       ;login: The USENIX Magazine, February 2011:42-47.

       This helps fund further development, and it won't cost you a
       cent. If you pay 10000 EUR you should feel free to use GNU
       Parallel without citing.

       Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk

       Copyright (C) 2008-2010 Ole Tange, http://ole.tange.dk

       Copyright (C) 2010-2020 Ole Tange, http://ole.tange.dk and Free
       Software Foundation, Inc.

       Parts of the manual concerning xargs compatibility are inspired
       by the manual of xargs from GNU findutils 4.4.2.

LICENSE
       This program is free software; you can redistribute it and/or
       modify it under the terms of the GNU General Public License as
       published by the Free Software Foundation; either version 3 of
       the License, or (at your option) any later version.

       This program is distributed in the hope that it will be useful,
       but WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
       GNU General Public License for more details.

       You should have received a copy of the GNU General Public
       License along with this program. If not, see
       <http://www.gnu.org/licenses/>.

   Documentation license I
       Permission is granted to copy, distribute and/or modify this
       documentation under the terms of the GNU Free Documentation
       License, Version 1.3 or any later version published by the Free
       Software Foundation; with no Invariant Sections, with no
       Front-Cover Texts, and with no Back-Cover Texts. A copy of the
       license is included in the file fdl.txt.

   Documentation license II
       You are free:

       to Share to copy, distribute and transmit the work

       to Remix to adapt the work

       Under the following conditions:

       Attribution
              You must attribute the work in the manner specified by
              the author or licensor (but not in any way that suggests
              that they endorse you or your use of the work).

       Share Alike
              If you alter, transform, or build upon this work, you
              may distribute the resulting work only under the same,
              similar or a compatible license.

       With the understanding that:

       Waiver Any of the above conditions can be waived if you get
              permission from the copyright holder.

       Public Domain
              Where the work or any of its elements is in the public
              domain under applicable law, that status is in no way
              affected by the license.

       Other Rights
              In no way are any of the following rights affected by
              the license:

              • Your fair dealing or fair use rights, or other
                applicable copyright exceptions and limitations;

              • The author's moral rights;

              • Rights other persons may have either in the work
                itself or in how the work is used, such as publicity
                or privacy rights.

       Notice For any reuse or distribution, you must make clear to
              others the license terms of this work.

       A copy of the full license is included in the file
       cc-by-sa.txt.

DEPENDENCIES
       GNU parallel uses Perl, and the Perl modules Getopt::Long,
       IPC::Open3, Symbol, IO::File, POSIX, and File::Temp.

       For --csv it uses the Perl module Text::CSV.

       For remote usage it uses rsync with ssh.

SEE ALSO
       parallel_tutorial(1), env_parallel(1), parset(1), parsort(1),
       parallel_alternatives(1), parallel_design(7), niceload(1),
       sql(1), ssh(1), ssh-agent(1), sshpass(1), ssh-copy-id(1),
       rsync(1)



20201122                         2020-12-21                     PARALLEL(1)