MILLER(1)                                                            MILLER(1)

NAME

       miller - like awk, sed, cut, join, and sort for name-indexed data such
       as CSV and tabular JSON.

SYNOPSIS

       Usage: mlr [I/O options] {verb} [verb-dependent options ...] {zero or
       more file names}

DESCRIPTION

       Miller operates on key-value-pair data while the familiar Unix tools
       operate on integer-indexed fields: if the natural data structure for
       the latter is the array, then Miller's natural data structure is the
       insertion-ordered hash map.  This encompasses a variety of data
       formats, including but not limited to the familiar CSV, TSV, and JSON.
       (Miller can handle positionally-indexed data as a special case.) This
       manpage documents Miller v5.6.2.

EXAMPLES

   COMMAND-LINE SYNTAX
       mlr --csv cut -f hostname,uptime mydata.csv
       mlr --tsv --rs lf filter '$status != "down" && $upsec >= 10000' *.tsv
       mlr --nidx put '$sum = $7 < 0.0 ? 3.5 : $7 + 2.1*$8' *.dat
       grep -v '^#' /etc/group | mlr --ifs : --nidx --opprint label group,pass,gid,member then sort -f group
       mlr join -j account_id -f accounts.dat then group-by account_name balances.dat
       mlr --json put '$attr = sub($attr, "([0-9]+)_([0-9]+)_.*", "\1:\2")' data/*.json
       mlr stats1 -a min,mean,max,p10,p50,p90 -f flag,u,v data/*
       mlr stats2 -a linreg-pca -f u,v -g shape data/*
       mlr put -q '@sum[$a][$b] += $x; end {emit @sum, "a", "b"}' data/*
       mlr --from estimates.tbl put '
         for (k,v in $*) {
           if (is_numeric(v) && k =~ "^[t-z].*$") {
             $sum += v; $count += 1
           }
         }
         $mean = $sum / $count # no assignment if count unset'
       mlr --from infile.dat put -f analyze.mlr
       mlr --from infile.dat put 'tee > "./taps/data-".$a."-".$b, $*'
       mlr --from infile.dat put 'tee | "gzip > ./taps/data-".$a."-".$b.".gz", $*'
       mlr --from infile.dat put -q '@v=$*; dump | "jq .[]"'
       mlr --from infile.dat put '(NR % 1000 == 0) { print > stderr, "Checkpoint ".NR}'

   DATA FORMATS
         DKVP: delimited key-value pairs (Miller default format)
         +---------------------+
         | apple=1,bat=2,cog=3 | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
         | dish=7,egg=8,flint  | Record 2: "dish" => "7", "egg" => "8", "3" => "flint"
         +---------------------+

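       To make the DKVP diagram above concrete, here is a small Python sketch
       (an illustration only, not Miller's implementation) that parses one
       DKVP line into an insertion-ordered map. The name parse_dkvp_line and
       its ifs/ips parameters are hypothetical; keying a field that lacks a
       pair separator by its 1-up position follows Record 2 above.

```python
def parse_dkvp_line(line, ifs=",", ips="="):
    """Split one DKVP line into an insertion-ordered dict of strings."""
    record = {}
    for position, field in enumerate(line.split(ifs), start=1):
        if ips in field:
            key, value = field.split(ips, 1)
        else:
            # No pair separator: key the value by its 1-up field position.
            key, value = str(position), field
        record[key] = value
    return record


print(parse_dkvp_line("apple=1,bat=2,cog=3"))
print(parse_dkvp_line("dish=7,egg=8,flint"))
```

       Python dicts preserve insertion order, which is why a plain dict is
       enough to model a record here.
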
         NIDX: implicitly numerically indexed (Unix-toolkit style)
         +---------------------+
         | the quick brown     | Record 1: "1" => "the", "2" => "quick", "3" => "brown"
         | fox jumped          | Record 2: "1" => "fox", "2" => "jumped"
         +---------------------+

         CSV/CSV-lite: comma-separated values with separate header line
         +---------------------+
         | apple,bat,cog       |
         | 1,2,3               | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
         | 4,5,6               | Record 2: "apple" => "4", "bat" => "5", "cog" => "6"
         +---------------------+

         Tabular JSON: nested objects are supported, although arrays within them are not:
         +---------------------+
         | {                   |
         |  "apple": 1,        | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
         |  "bat": 2,          |
         |  "cog": 3           |
         | }                   |
         | {                   |
         |   "dish": {         | Record 2: "dish:egg" => "7", "dish:flint" => "8", "garlic" => ""
         |     "egg": 7,       |
         |     "flint": 8      |
         |   },                |
         |   "garlic": ""      |
         | }                   |
         +---------------------+

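       The flattening of nested JSON keys shown in Record 2 above can be
       sketched in Python as follows. This is an illustration under
       assumptions, not Miller's code: the function name flatten is
       hypothetical, the separator defaults to ":" as in the diagram, and
       values keep their JSON types here whereas the diagram shows them as
       strings.

```python
import json


def flatten(obj, sep=":", prefix=""):
    """Flatten nested maps into one level, joining key paths with sep."""
    flat = {}
    for key, value in obj.items():
        name = prefix + sep + key if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the joined key path.
            flat.update(flatten(value, sep, name))
        else:
            flat[name] = value
    return flat


record = json.loads('{"dish": {"egg": 7, "flint": 8}, "garlic": ""}')
print(flatten(record))  # {'dish:egg': 7, 'dish:flint': 8, 'garlic': ''}
```
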
         PPRINT: pretty-printed tabular
         +---------------------+
         | apple bat cog       |
         | 1     2   3         | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
         | 4     5   6         | Record 2: "apple" => "4", "bat" => "5", "cog" => "6"
         +---------------------+

         XTAB: pretty-printed transposed tabular
         +---------------------+
         | apple 1             | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
         | bat   2             |
         | cog   3             |
         |                     |
         | dish 7              | Record 2: "dish" => "7", "egg" => "8"
         | egg  8              |
         +---------------------+

         Markdown tabular (supported for output only):
         +-----------------------+
         | | apple | bat | cog | |
         | | ---   | --- | --- | |
         | | 1     | 2   | 3   | | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
         | | 4     | 5   | 6   | | Record 2: "apple" => "4", "bat" => "5", "cog" => "6"
         +-----------------------+

OPTIONS

       In the following option flags, the version with "i" designates the
       input stream, "o" the output stream, and the version without prefix
       sets the option for both input and output stream. For example: --irs
       sets the input record separator, --ors the output record separator, and
       --rs sets both the input and output separator to the given value.

   HELP OPTIONS
         -h or --help                 Show this message.
         --version                    Show the software version.
         {verb name} --help           Show verb-specific help.
         --help-all-verbs             Show help on all verbs.
         -l or --list-all-verbs       List only verb names.
         -L                           List only verb names, one per line.
         -f or --help-all-functions   Show help on all built-in functions.
         -F                           Show a bare listing of built-in functions by name.
         -k or --help-all-keywords    Show help on all keywords.
         -K                           Show a bare listing of keywords by name.

   VERB LIST
        altkv bar bootstrap cat check clean-whitespace count-distinct count-similar
        cut decimate fill-down filter format-values fraction grep group-by group-like
        having-fields head histogram join label least-frequent merge-fields
        most-frequent nest nothing put regularize remove-empty-columns rename reorder
        repeat reshape sample sec2gmt sec2gmtdate seqgen shuffle skip-trivial-records
        sort stats1 stats2 step tac tail tee top uniq unsparsify

   FUNCTION LIST
        + + - - * / // .+ .+ .- .- .* ./ .// % ** | ^ & ~ << >> bitcount == != =~ !=~
        > >= < <= && || ^^ ! ? : . gsub regextract regextract_or_else strlen sub ssub
        substr tolower toupper capitalize lstrip rstrip strip collapse_whitespace
        clean_whitespace system abs acos acosh asin asinh atan atan2 atanh cbrt ceil
        cos cosh erf erfc exp expm1 floor invqnorm log log10 log1p logifit madd max
        mexp min mmul msub pow qnorm round roundm sgn sin sinh sqrt tan tanh urand
        urandrange urand32 urandint dhms2fsec dhms2sec fsec2dhms fsec2hms gmt2sec
        localtime2sec hms2fsec hms2sec sec2dhms sec2gmt sec2gmt sec2gmtdate
        sec2localtime sec2localtime sec2localdate sec2hms strftime strftime_local
        strptime strptime_local systime is_absent is_bool is_boolean is_empty
        is_empty_map is_float is_int is_map is_nonempty_map is_not_empty is_not_map
        is_not_null is_null is_numeric is_present is_string asserting_absent
        asserting_bool asserting_boolean asserting_empty asserting_empty_map
        asserting_float asserting_int asserting_map asserting_nonempty_map
        asserting_not_empty asserting_not_map asserting_not_null asserting_null
        asserting_numeric asserting_present asserting_string boolean float fmtnum
        hexfmt int string typeof depth haskey joink joinkv joinv leafcount length
        mapdiff mapexcept mapselect mapsum splitkv splitkvx splitnv splitnvx

       Please use "mlr --help-function {function name}" for function-specific help.

   I/O FORMATTING
         --idkvp   --odkvp   --dkvp      Delimited key-value pairs, e.g. "a=1,b=2"
                                         (this is Miller's default format).

         --inidx   --onidx   --nidx      Implicitly-integer-indexed fields
                                         (Unix-toolkit style).
         -T                              Synonymous with "--nidx --fs tab".

         --icsv    --ocsv    --csv       Comma-separated value (or tab-separated
                                         with --fs tab, etc.)

         --itsv    --otsv    --tsv       Keystroke-savers for "--icsv --ifs tab",
                                         "--ocsv --ofs tab", "--csv --fs tab".
         --iasv    --oasv    --asv       Similar but using ASCII FS 0x1f and RS 0x1e
         --iusv    --ousv    --usv       Similar but using Unicode FS U+241F (UTF-8 0xe2909f)
                                         and RS U+241E (UTF-8 0xe2909e)

         --icsvlite --ocsvlite --csvlite Comma-separated value (or tab-separated
                                         with --fs tab, etc.). The 'lite' CSV does not handle
                                         RFC-CSV double-quoting rules; is slightly faster;
                                         and handles heterogeneity in the input stream via
                                         empty newline followed by new header line. See also
                                         http://johnkerl.org/miller/doc/file-formats.html#CSV/TSV/etc.

         --itsvlite --otsvlite --tsvlite Keystroke-savers for "--icsvlite --ifs tab",
                                         "--ocsvlite --ofs tab", "--csvlite --fs tab".
         -t                              Synonymous with --tsvlite.
         --iasvlite --oasvlite --asvlite Similar to --itsvlite et al. but using ASCII FS 0x1f and RS 0x1e
         --iusvlite --ousvlite --usvlite Similar to --itsvlite et al. but using Unicode FS U+241F (UTF-8 0xe2909f)
                                         and RS U+241E (UTF-8 0xe2909e)

         --ipprint --opprint --pprint    Pretty-printed tabular (produces no
                                         output until all input is in).
                             --right     Right-justifies all fields for PPRINT output.
                             --barred    Prints a border around PPRINT output
                                         (only available for output).

                   --omd                 Markdown-tabular (only available for output).

         --ixtab   --oxtab   --xtab      Pretty-printed vertical-tabular.
                             --xvright   Right-justifies values for XTAB format.

         --ijson   --ojson   --json      JSON tabular: sequence or list of one-level
                                         maps: {...}{...} or [{...},{...}].
           --json-map-arrays-on-input    JSON arrays are unmillerable. --json-map-arrays-on-input
           --json-skip-arrays-on-input   is the default: arrays are converted to integer-indexed
           --json-fatal-arrays-on-input  maps. The other two options cause them to be skipped, or
                                         to be treated as errors.  Please use the jq tool for full
                                         JSON (pre)processing.
                             --jvstack   Put one key-value pair per line for JSON
                                         output.
                             --jlistwrap Wrap JSON output in outermost [ ].
                           --jknquoteint Do not quote non-string map keys in JSON output.
                            --jvquoteall Quote map values in JSON output, even if they're
                                         numeric.
                     --jflatsep {string} Separator for flattening multi-level JSON keys,
                                         e.g. '{"a":{"b":3}}' becomes a:b => 3 for
                                         non-JSON formats. Defaults to :.

         -p is a keystroke-saver for --nidx --fs space --repifs

         --mmap --no-mmap --mmap-below {n} Use mmap for files whenever possible, never, or
                                         for files less than n bytes in size. Default is for
                                         files less than 4294967296 bytes in size.
                                         'Whenever possible' means always except for when reading
                                         standard input which is not mmappable. If you don't know
                                         what this means, don't worry about it -- it's a minor
                                         performance optimization.

         Examples: --csv for CSV-formatted input and output; --idkvp --opprint for
         DKVP-formatted input and pretty-printed output.

         Please use --iformat1 --oformat2 rather than --format1 --oformat2.
         The latter sets up input and output flags for format1, not all of which
         are overridden in all cases by setting output format to format2.

   COMMENTS IN DATA
         --skip-comments                 Ignore commented lines (prefixed by "#")
                                         within the input.
         --skip-comments-with {string}   Ignore commented lines within input, with
                                         specified prefix.
         --pass-comments                 Immediately print commented lines (prefixed by "#")
                                         within the input.
         --pass-comments-with {string}   Immediately print commented lines within input, with
                                         specified prefix.
       Notes:
       * Comments are only honored at the start of a line.
       * In the absence of any of the above four options, comments are data like
         any other text.
       * When pass-comments is used, comment lines are written to standard output
         immediately upon being read; they are not part of the record stream.
         Results may be counterintuitive. A suggestion is to place comments at the
         start of data files.

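       The three behaviors described in the notes above (comments as data,
       skipped, or passed straight through) can be pictured with this Python
       sketch. It is a simplified model, not Miller's reader: the function
       name split_comments, the mode values, and the passed list standing in
       for standard output are all hypothetical.

```python
def split_comments(lines, prefix="#", mode=None, passed=None):
    """Yield data lines. mode=None keeps comments as data, "skip" drops
    them, and "pass" appends them to `passed` as soon as they are read."""
    for line in lines:
        # Comments are only honored at the start of a line.
        if mode is not None and line.startswith(prefix):
            if mode == "pass":
                passed.append(line)
            continue
        yield line


lines = ["# units: seconds", "a=1,b=2", "a=3,b=4 # not a comment"]
print(list(split_comments(lines)))               # all three lines are data
print(list(split_comments(lines, mode="skip")))  # the first line is dropped
```

       Because passed-through comments bypass the record stream, they can
       appear ahead of records that preceded them in the input once a verb
       such as sort holds records back; this is the counterintuitive ordering
       the note above warns about.
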
   FORMAT-CONVERSION KEYSTROKE-SAVERS
       As keystroke-savers for format-conversion you may use the following:
               --c2t --c2d --c2n --c2j --c2x --c2p --c2m
         --t2c       --t2d --t2n --t2j --t2x --t2p --t2m
         --d2c --d2t       --d2n --d2j --d2x --d2p --d2m
         --n2c --n2t --n2d       --n2j --n2x --n2p --n2m
         --j2c --j2t --j2d --j2n       --j2x --j2p --j2m
         --x2c --x2t --x2d --x2n --x2j       --x2p --x2m
         --p2c --p2t --p2d --p2n --p2j --p2x       --p2m
       The letters c t d n j x p m refer to formats CSV, TSV, DKVP, NIDX, JSON, XTAB,
       PPRINT, and markdown, respectively. Note that markdown format is available for
       output only.

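       One way to read the table above is that each flag names an input
       format and an output format. The Python sketch below spells that
       reading out by expanding a flag into a pair of --i.../--o... flags
       from the I/O FORMATTING section. The function name
       expand_keystroke_saver is hypothetical, and the exact equivalence is
       an assumption of this sketch: Miller may set additional options for
       some combinations.

```python
# Letter-to-format mapping from the paragraph above. The long flag spellings
# (--icsv, --opprint, --omd, ...) are taken from the I/O FORMATTING section.
FORMAT_NAMES = {"c": "csv", "t": "tsv", "d": "dkvp", "n": "nidx",
                "j": "json", "x": "xtab", "p": "pprint", "m": "md"}


def expand_keystroke_saver(flag):
    """Expand e.g. '--c2p' into ['--icsv', '--opprint']."""
    if len(flag) != 5 or not flag.startswith("--") or flag[3] != "2":
        raise ValueError("not of the form --X2Y: " + flag)
    src, dst = flag[2], flag[4]
    if src not in FORMAT_NAMES or dst not in FORMAT_NAMES:
        raise ValueError("unknown format letter in " + flag)
    if src == "m":
        raise ValueError("markdown is available for output only")
    if src == dst:
        raise ValueError("same format on both sides: " + flag)
    return ["--i" + FORMAT_NAMES[src], "--o" + FORMAT_NAMES[dst]]


print(expand_keystroke_saver("--c2p"))  # ['--icsv', '--opprint']
print(expand_keystroke_saver("--j2m"))  # ['--ijson', '--omd']
```
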
   COMPRESSED I/O
         --prepipe {command} This allows Miller to handle compressed inputs. You can do
         without this for single input files, e.g. "gunzip < myfile.csv.gz | mlr ...".
         However, when multiple input files are present, between-file separations are
         lost; also, the FILENAME variable doesn't iterate. Using --prepipe you can
         specify an action to be taken on each input file. This pre-pipe command must
         be able to read from standard input; it will be invoked with
           {command} < {filename}.
         Examples:
           mlr --prepipe 'gunzip'
           mlr --prepipe 'zcat -cf'
           mlr --prepipe 'xz -cd'
           mlr --prepipe cat
         Note that this feature is quite general and is not limited to decompression
         utilities. You can use it to apply per-file filters of your choice.
         For output compression (or other) utilities, simply pipe the output:
           mlr ... | {your compression command}

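       The per-file invocation "{command} < {filename}" described above can
       be sketched in Python as follows. This only models the idea (one
       command run per input file, with the file on standard input, so file
       boundaries and names are preserved); the function name
       read_with_prepipe is hypothetical and this is not how Miller itself
       is implemented.

```python
import subprocess


def read_with_prepipe(command, filenames):
    """Run `command < filename` once per file, yielding (filename, bytes)."""
    for filename in filenames:
        with open(filename, "rb") as handle:
            # The file is attached to the command's standard input, which is
            # what the shell redirection `< filename` does.
            result = subprocess.run(command, shell=True, stdin=handle,
                                    stdout=subprocess.PIPE, check=True)
        yield filename, result.stdout


# Usage sketch (the file names are placeholders):
#   for name, data in read_with_prepipe("gunzip", ["a.csv.gz", "b.csv.gz"]):
#       print(name, len(data))
```

       Keeping the filename next to each decompressed payload is what lets a
       FILENAME-style variable change from file to file, which a single
       shell pipeline feeding standard input cannot provide.
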
   SEPARATORS
         --rs     --irs     --ors              Record separators, e.g. 'lf' or '\r\n'
         --fs     --ifs     --ofs  --repifs    Field separators, e.g. comma
         --ps     --ips     --ops              Pair separators, e.g. equals sign

         Notes about line endings:
         * Default line endings (--irs and --ors) are "auto" which means autodetect from
           the input file format, as long as the input file(s) have lines ending in either
           LF (also known as linefeed, '\n', 0x0a, Unix-style) or CRLF (also known as
           carriage-return/linefeed pairs, '\r\n', 0x0d 0x0a, Windows style).
         * If both irs and ors are auto (which is the default) then LF input will lead to LF
           output and CRLF input will lead to CRLF output, regardless of the platform you're
           running on.
         * The line-ending autodetector triggers on the first line ending detected in the input
           stream. E.g. if you specify a CRLF-terminated file on the command line followed by an
           LF-terminated file then autodetected line endings will be CRLF.
         * If you use --ors {something else} with (default or explicitly specified) --irs auto
           then line endings are autodetected on input and set to what you specify on output.
         * If you use --irs {something else} with (default or explicitly specified) --ors auto
           then the output line endings used are LF on Unix/Linux/BSD/MacOSX, and CRLF on Windows.

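       The "first line ending wins" rule in the notes above can be sketched
       in Python as below. The function name detect_line_ending and its
       default argument are hypothetical, and the sketch only models the
       documented rule, not Miller's reader.

```python
def detect_line_ending(data, default="\n"):
    """Return CRLF or LF according to the first line ending found in data."""
    index = data.find("\n")
    if index < 0:
        # No line ending seen at all: fall back to the caller's default.
        return default
    if index > 0 and data[index - 1] == "\r":
        return "\r\n"
    return "\n"


# A CRLF-terminated file followed by an LF-terminated one: CRLF is detected.
stream = "a=1\r\n" + "a=2\n"
print(repr(detect_line_ending(stream)))  # '\r\n'
```
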
         Notes about all other separators:
         * IPS/OPS are only used for DKVP and XTAB formats, since only in these formats
           do key-value pairs appear juxtaposed.
         * IRS/ORS are ignored for XTAB format. Nominally IFS and OFS are newlines;
           XTAB records are separated by two or more consecutive IFS/OFS -- i.e.
           a blank line. Everything above about --irs/--ors/--rs auto becomes --ifs/--ofs/--fs
           auto for XTAB format. (XTAB's default IFS/OFS are "auto".)
         * OFS must be single-character for PPRINT format. This is because it is used
           with repetition for alignment; multi-character separators would make
           alignment impossible.
         * OPS may be multi-character for XTAB format, in which case alignment is
           disabled.
         * TSV is simply CSV using tab as field separator ("--fs tab").
         * FS/PS are ignored for markdown format; RS is used.
         * All FS and PS options are ignored for JSON format, since they are not relevant
           to the JSON format.
         * You can specify separators in any of the following ways, shown by example:
           - Type them out, quoting as necessary for shell escapes, e.g.
             "--fs '|' --ips :"
           - C-style escape sequences, e.g. "--rs '\r\n' --fs '\t'".
           - To avoid backslashing, you can use any of the following names:
             cr crcr newline lf lflf crlf crlfcrlf tab space comma pipe slash colon semicolon equals
         * Default separators by format:
             File format  RS       FS       PS
             gen          N/A      (N/A)    (N/A)
             dkvp         auto     ,        =
             json         auto     (N/A)    (N/A)
             nidx         auto     space    (N/A)
             csv          auto     ,        (N/A)
             csvlite      auto     ,        (N/A)
             markdown     auto     (N/A)    (N/A)
             pprint       auto     space    (N/A)
             xtab         (N/A)    auto     space

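       The three ways of writing a separator listed in the notes above
       (literal text, C-style escapes, or a name) can be modeled with a
       small lookup, sketched here in Python. The function name
       resolve_separator is hypothetical; the name-to-character table is
       this sketch's reading of the listed names, and only the three escapes
       shown in this section are handled.

```python
SEPARATOR_NAMES = {
    "cr": "\r", "crcr": "\r\r", "newline": "\n", "lf": "\n", "lflf": "\n\n",
    "crlf": "\r\n", "crlfcrlf": "\r\n\r\n", "tab": "\t", "space": " ",
    "comma": ",", "pipe": "|", "slash": "/", "colon": ":",
    "semicolon": ";", "equals": "=",
}


def resolve_separator(text):
    """Map a separator argument to the characters it stands for."""
    if text in SEPARATOR_NAMES:
        return SEPARATOR_NAMES[text]
    # Otherwise interpret the C-style escapes used in this section and
    # leave everything else as typed.
    return text.replace("\\r", "\r").replace("\\n", "\n").replace("\\t", "\t")


print(repr(resolve_separator("tab")))     # '\t'
print(repr(resolve_separator("\\r\\n")))  # '\r\n'
print(repr(resolve_separator("|")))       # '|'
```
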
   CSV-SPECIFIC OPTIONS
         --implicit-csv-header Use 1,2,3,... as field labels, rather than from line 1
                            of input files. Tip: combine with "label" to recreate
                            missing headers.
         --allow-ragged-csv-input|--ragged If a data line has fewer fields than the header line,
                            fill remaining keys with empty string. If a data line has more
                            fields than the header line, use integer field labels as in
                            the implicit-header case.
         --headerless-csv-output   Print only CSV data lines.
         -N                 Keystroke-saver for --implicit-csv-header --headerless-csv-output.

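       The ragged-input rule above has two cases, short lines and long
       lines. The Python sketch below models both; the function name
       ragged_record is hypothetical, and labeling surplus fields by their
       1-up position is this sketch's reading of "integer field labels as in
       the implicit-header case".

```python
def ragged_record(header, fields):
    """Build a record from a data line that may be shorter or longer
    than the header line."""
    record = {}
    for i, key in enumerate(header):
        # Short line: keys beyond the data get the empty string.
        record[key] = fields[i] if i < len(fields) else ""
    for i in range(len(header), len(fields)):
        # Long line: surplus fields are labeled by 1-up position.
        record[str(i + 1)] = fields[i]
    return record


header = ["a", "b", "c"]
print(ragged_record(header, ["1", "2"]))
print(ragged_record(header, ["1", "2", "3", "4"]))
```
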
   DOUBLE-QUOTING FOR CSV/CSVLITE OUTPUT
         --quote-all        Wrap all fields in double quotes
         --quote-none       Do not wrap any fields in double quotes, even if they have
                            OFS or ORS in them
         --quote-minimal    Wrap fields in double quotes only if they have OFS or ORS
                            in them (default)
         --quote-numeric    Wrap fields in double quotes only if they have numbers
                            in them
         --quote-original   Wrap fields in double quotes if and only if they were
                            quoted on input. This isn't sticky for computed fields:
                            e.g. if fields a and b were quoted on input and you do
                            "put '$c = $a . $b'" then field c won't inherit a or b's
                            was-quoted-on-input flag.

   NUMERICAL FORMATTING
         --ofmt {format}    E.g. %.18lf, %.0lf. Please use sprintf-style codes for
                            double-precision. Applies to verbs which compute new
                            values, e.g. put, stats1, stats2. See also the fmtnum
                            function within mlr put (mlr --help-all-functions).
                            Defaults to %lf.

   OTHER OPTIONS
         --seed {n} with n of the form 12345678 or 0xcafefeed. For put/filter
                            urand()/urandint()/urand32().
         --nr-progress-mod {m}, with m a positive integer: print filename and record
                            count to stderr every m input records.
         --from {filename}  Use this to specify an input file before the verb(s),
                            rather than after. May be used more than once. Example:
                            "mlr --from a.dat --from b.dat cat" is the same as
                            "mlr cat a.dat b.dat".
         -n                 Process no input files, nor standard input either. Useful
                            for mlr put with begin/end statements only. (Same as --from
                            /dev/null.) Also useful in "mlr -n put -v '...'" for
                            analyzing abstract syntax trees (if that's your thing).
         -I                 Process files in-place. For each file name on the command
                            line, output is written to a temp file in the same
                            directory, which is then renamed over the original. Each
                            file is processed in isolation: if the output format is
                            CSV, CSV headers will be present in each output file;
                            statistics are only over each file's own records; and so on.

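       The in-place scheme described for -I above (write a temp file in the
       same directory, then rename it over the original) is a common pattern
       and can be sketched in Python as follows. The function name
       rewrite_in_place and the transform callback are hypothetical; this
       shows the general technique, not Miller's own code.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Replace the contents of path with transform(old contents)."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path) as source:
        text = source.read()
    # Same directory as the original, so the final rename stays on one
    # filesystem.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as temp:
            temp.write(transform(text))
        os.replace(temp_path, path)
    except BaseException:
        # On failure the original is untouched; discard the temp file.
        if os.path.exists(temp_path):
            os.unlink(temp_path)
        raise


# Usage sketch, one file at a time, each processed in isolation:
#   for name in ["a.dat", "b.dat"]:
#       rewrite_in_place(name, str.upper)
```
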
   THEN-CHAINING
       Output of one verb may be chained as input to another using "then", e.g.
         mlr stats1 -a min,mean,max -f flag,u,v -g color then sort -f color

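       Conceptually, then-chaining composes record-stream transformations:
       each verb consumes the records the previous one produced. The Python
       sketch below models that idea with two toy verbs. The names chain,
       sort_by, and head are hypothetical stand-ins written for this
       example, not Miller's implementations of those verbs.

```python
def sort_by(field):
    """Toy verb: emit records sorted by one field (needs all input first)."""
    def verb(records):
        return iter(sorted(records, key=lambda record: record[field]))
    return verb


def head(n):
    """Toy verb: pass through only the first n records."""
    def verb(records):
        for index, record in enumerate(records):
            if index >= n:
                break
            yield record
    return verb


def chain(*verbs):
    """Feed the output of each verb into the next, like "then"."""
    def run(records):
        for verb in verbs:
            records = verb(records)
        return records
    return run


records = [{"color": "red"}, {"color": "blue"}, {"color": "green"}]
pipeline = chain(sort_by("color"), head(2))
print(list(pipeline(records)))  # [{'color': 'blue'}, {'color': 'green'}]
```
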
   AUXILIARY COMMANDS
       Miller has a few otherwise-standalone executables packaged within it.
       They do not participate in any other parts of Miller.
       Available subcommands:
         aux-list
         lecat
         termcvt
         hex
         unhex
         netbsd-strptime
       For more information, please invoke mlr {subcommand} --help

VERBS

   altkv
       Usage: mlr altkv [no options]
       Given fields with values of the form a,b,c,d,e,f emits a=b,c=d,e=f pairs.

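       The pairing rule for altkv above can be sketched in Python as below.
       The function name altkv_pairs is hypothetical. The description covers
       only an even number of values, so this sketch simply drops a trailing
       unpaired value; how Miller treats that case is not stated here.

```python
def altkv_pairs(values):
    """Pair alternating values as key, value, key, value, ..."""
    pairs = {}
    # Step through the values two at a time; an unpaired final value is
    # ignored in this sketch.
    for i in range(0, len(values) - 1, 2):
        pairs[values[i]] = values[i + 1]
    return pairs


print(altkv_pairs(["a", "b", "c", "d", "e", "f"]))
# {'a': 'b', 'c': 'd', 'e': 'f'}
```
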
   bar
       Usage: mlr bar [options]
       Replaces a numeric field with a number of asterisks, allowing for cheesy
       bar plots. These align best with --opprint or --oxtab output format.
       Options:
       -f   {a,b,c}      Field names to convert to bars.
       -c   {character}  Fill character: default '*'.
       -x   {character}  Out-of-bounds character: default '#'.
       -b   {character}  Blank character: default '.'.
       --lo {lo}         Lower-limit value for min-width bar: default '0.000000'.
       --hi {hi}         Upper-limit value for max-width bar: default '100.000000'.
       -w   {n}          Bar-field width: default '40'.
       --auto            Automatically computes limits, ignoring --lo and --hi.
                         Holds all records in memory before producing any output.

   bootstrap
       Usage: mlr bootstrap [options]
       Emits an n-sample, with replacement, of the input records.
       Options:
       -n {number} Number of samples to output. Defaults to number of input records.
                   Must be non-negative.
       See also mlr sample and mlr shuffle.

   cat
       Usage: mlr cat [options]
       Passes input records directly to output. Most useful for format conversion.
       Options:
       -n        Prepend field "n" to each record with record-counter starting at 1
       -g {comma-separated field name(s)} When used with -n/-N, writes record-counters
                 keyed by specified field name(s).
       -v        Write a low-level record-structure dump to stderr.
       -N {name} Prepend field {name} to each record with record-counter starting at 1

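       The interaction of -n/-N with -g above (one counter per distinct
       combination of group-by values) can be sketched in Python as follows.
       The function name cat_numbered is hypothetical, and the sketch
       assumes every record contains all of the group-by fields.

```python
from collections import defaultdict


def cat_numbered(records, name="n", group_by=None):
    """Prepend a 1-up counter field, optionally counted per group."""
    counters = defaultdict(int)
    for record in records:
        # Without group-by fields, every record shares the single key ().
        key = tuple(record[field] for field in group_by) if group_by else ()
        counters[key] += 1
        # Putting the counter first in the literal prepends it.
        yield {name: counters[key], **record}


records = [{"a": "x"}, {"a": "y"}, {"a": "x"}]
print(list(cat_numbered(records)))
# [{'n': 1, 'a': 'x'}, {'n': 2, 'a': 'y'}, {'n': 3, 'a': 'x'}]
print(list(cat_numbered(records, group_by=["a"])))
# [{'n': 1, 'a': 'x'}, {'n': 1, 'a': 'y'}, {'n': 2, 'a': 'x'}]
```
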
   check
       Usage: mlr check
       Consumes records without printing any output.
       Useful for doing a well-formatted check on input data.

   clean-whitespace
       Usage: mlr clean-whitespace [options]
       For each record, for each field in the record, whitespace-cleans the keys and
       values. Whitespace-cleaning entails stripping leading and trailing whitespace,
       and replacing multiple whitespace with singles. For finer-grained control,
       please see the DSL functions lstrip, rstrip, strip, collapse_whitespace,
       and clean_whitespace.

       Options:
       -k|--keys-only    Do not touch values.
       -v|--values-only  Do not touch keys.
       It is an error to specify -k as well as -v.

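       The whitespace-cleaning rule above (strip both ends, then collapse
       runs of whitespace) can be sketched in Python as below. The names
       clean and clean_record are hypothetical, and treating every run of
       whitespace characters as collapsing to a single space is an
       assumption of this sketch.

```python
import re


def clean(text):
    """Strip leading/trailing whitespace and collapse inner runs."""
    return re.sub(r"\s+", " ", text.strip())


def clean_record(record, keys=True, values=True):
    """Clean keys and/or values; keys=False leaves keys untouched (like
    -v), values=False leaves values untouched (like -k)."""
    return {
        (clean(key) if keys else key): (clean(value) if values else value)
        for key, value in record.items()
    }


print(clean_record({" host  name ": "  web   01 "}))
# {'host name': 'web 01'}
```
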
   count-distinct
       Usage: mlr count-distinct [options]
       Prints number of records having distinct values for specified field names.
       Same as uniq -c.

       Options:
       -f {a,b,c}    Field names for distinct count.
       -n            Show only the number of distinct values. Not compatible with -u.
       -o {name}     Field name for output count. Default "count".
                     Ignored with -u.
       -u            Do unlashed counts for multiple field names. With -f a,b and
                     without -u, computes counts for distinct combinations of a
                     and b field values. With -f a,b and with -u, computes counts
                     for distinct a field values and counts for distinct b field
                     values separately.

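       The difference between the default (lashed) counts and the -u
       (unlashed) counts described above is easiest to see side by side.
       This Python sketch uses hypothetical function names count_lashed and
       count_unlashed and assumes each record contains every named field.

```python
from collections import Counter


def count_lashed(records, fields):
    """Count distinct combinations of the named fields' values."""
    return Counter(tuple(record[f] for f in fields) for record in records)


def count_unlashed(records, fields):
    """Count distinct values of each named field separately."""
    return {f: Counter(record[f] for record in records) for f in fields}


records = [
    {"a": "1", "b": "x"},
    {"a": "1", "b": "y"},
    {"a": "2", "b": "x"},
]
print(count_lashed(records, ["a", "b"]))    # three combinations, one each
print(count_unlashed(records, ["a", "b"]))  # a: 1 twice, 2 once; b: x twice, y once
```
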
   count-similar
       Usage: mlr count-similar [options]
       Ingests all records, then emits each record augmented by a count of
       the number of other records having the same group-by field values.
       Options:
       -g {d,e,f} Group-by-field names for counts.
       -o {name}  Field name for output count. Default "count".

   cut
       Usage: mlr cut [options]
       Passes through input records with specified fields included/excluded.
       -f {a,b,c}       Field names to include for cut.
       -o               Retain fields in the order specified here in the argument list.
                        Default is to retain them in the order found in the input data.
       -x|--complement  Exclude, rather than include, field names specified by -f.
       -r               Treat field names as regular expressions. "ab", "a.*b" will
                        match any field name containing the substring "ab" or matching
                        "a.*b", respectively; anchors of the form "^ab$", "^a.*b$" may
                        be used. The -o flag is ignored when -r is present.
       Examples:
         mlr cut -f hostname,status
         mlr cut -x -f hostname,status
         mlr cut -r -f '^status$,sda[0-9]'
         mlr cut -r -f '^status$,"sda[0-9]"'
         mlr cut -r -f '^status$,"sda[0-9]"i' (this is case-insensitive)

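       The effect of -f, -o, and -x on field selection and ordering can be
       sketched in Python as below. The function name cut_fields and its
       keyword arguments are hypothetical, and the -r regular-expression
       mode is left out of this sketch.

```python
def cut_fields(record, fields, complement=False, ordered=False):
    """Keep (or, with complement, drop) the named fields of one record."""
    if complement:
        # Like -x: everything except the named fields, in input order.
        return {k: v for k, v in record.items() if k not in fields}
    if ordered:
        # Like -o: order follows the argument list, not the input data.
        return {k: record[k] for k in fields if k in record}
    # Default: named fields in the order found in the input data.
    return {k: v for k, v in record.items() if k in fields}


record = {"hostname": "web01", "uptime": "99", "status": "up"}
print(cut_fields(record, ["status", "hostname"]))
# {'hostname': 'web01', 'status': 'up'}
print(cut_fields(record, ["status", "hostname"], ordered=True))
# {'status': 'up', 'hostname': 'web01'}
print(cut_fields(record, ["status", "hostname"], complement=True))
# {'uptime': '99'}
```
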
   decimate
       Usage: mlr decimate [options]
       -n {count}    Decimation factor; default 10
       -b            Decimate by printing first of every n.
       -e            Decimate by printing last of every n (default).
       -g {a,b,c}    Optional group-by-field names for decimate counts
       Passes through one of every n records, optionally by category.

   fill-down
       Usage: mlr fill-down [options]
       -f {a,b,c}          Field names for fill-down
       -a|--only-if-absent Fill down only if the field is absent, not if it is
                           present with the empty-string value
       If a given record has a missing value for a given field, fill that from
       the corresponding value from a previous record, if any.
       By default, a 'missing' field either is absent, or has the empty-string value.
       With -a, a field is 'missing' only if it is absent.

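       The two meanings of 'missing' for fill-down above can be sketched in
       Python as follows. The function name fill_down and its only_if_absent
       parameter are hypothetical. In this sketch a filled-in absent field
       is appended at the end of the record; where Miller places it is not
       stated here.

```python
def fill_down(records, fields, only_if_absent=False):
    """Fill missing values of the named fields from the previous record."""
    last_seen = {}
    for record in records:
        out = dict(record)
        for field in fields:
            if only_if_absent:
                missing = field not in out
            else:
                missing = out.get(field, "") == ""
            if missing:
                if field in last_seen:
                    out[field] = last_seen[field]
            else:
                # Remember the most recent non-missing value.
                last_seen[field] = out[field]
        yield out


records = [{"a": "1", "b": "x"}, {"a": "", "b": "y"}, {"b": "z"}]
print(list(fill_down(records, ["a"])))
# [{'a': '1', 'b': 'x'}, {'a': '1', 'b': 'y'}, {'b': 'z', 'a': '1'}]
```

       With only_if_absent=True the empty string in the second record counts
       as a real value, so it is kept and is also what the third record
       inherits.
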
520   filter
521       Usage: mlr filter [options] {expression}
522       Prints records for which {expression} evaluates to true.
523       If there are multiple semicolon-delimited expressions, all of them are
524       evaluated and the last one is used as the filter criterion.
525
526       Conversion options:
527       -S: Keeps field values as strings with no type inference to int or float.
528       -F: Keeps field values as strings or floats with no inference to int.
529       All field values are type-inferred to int/float/string unless this behavior is
530       suppressed with -S or -F.
531
532       Output/formatting options:
533       --oflatsep {string}: Separator to use when flattening multi-level @-variables
534           to output records for emit. Default ":".
535       --jknquoteint: For dump output (JSON-formatted), do not quote map keys if non-string.
536       --jvquoteall: For dump output (JSON-formatted), quote map values even if non-string.
537       Any of the output-format command-line flags (see mlr -h). Example: using
538         mlr --icsv --opprint ... then put --ojson 'tee > "mytap-".$a.".dat", $*' then ...
539       the input is CSV, the output is pretty-print tabular, but the tee-file output
540       is written in JSON format.
541       --no-fflush: for emit, tee, print, and dump, don't call fflush() after every
542           record.
543
544       Expression-specification options:
545       -f {filename}: the DSL expression is taken from the specified file rather
546           than from the command line. Outer single quotes wrapping the expression
547           should not be placed in the file. If -f is specified more than once,
548           all input files specified using -f are concatenated to produce the expression.
549           (For example, you can define functions in one file and call them from another.)
550       -e {expression}: You can use this after -f to add an expression. Example use
551           case: define functions/subroutines in a file you specify with -f, then call
552           them with an expression you specify with -e.
553       (If you mix -e and -f then the expressions are evaluated in the order encountered.
554       Since the expression pieces are simply concatenated, please be sure to use intervening
555       semicolons to separate expressions.)
556
557       Tracing options:
558       -v: Prints the expression's AST (abstract syntax tree), which gives
559           full transparency on the precedence and associativity rules of
560           Miller's grammar, to stdout.
561       -a: Prints a low-level stack-allocation trace to stdout.
562       -t: Prints a low-level parser trace to stderr.
563       -T: Prints every statement to stderr as it is executed.
564
565       Other options:
566       -x: Prints records for which {expression} evaluates to false.
567
568       Please use a dollar sign for field names and double-quotes for string
569       literals. If field names have special characters such as "." then you might
570       use braces, e.g. '${field.name}'. Miller built-in variables are
571       NF NR FNR FILENUM FILENAME M_PI M_E, and ENV["namegoeshere"] to access environment
572       variables. The environment-variable name may be an expression, e.g. a field
573       value.
574
575       Use # to comment to end of line.
576
577       Examples:
578         mlr filter 'log10($count) > 4.0'
579         mlr filter 'FNR == 2'         (second record in each file)
580         mlr filter 'urand() < 0.001'  (subsampling)
581         mlr filter '$color != "blue" && $value > 4.2'
582         mlr filter '($x<.5 && $y<.5) || ($x>.5 && $y>.5)'
583         mlr filter '($name =~ "^sys.*east$") || ($name =~ "^dev.[0-9]+"i)'
584         mlr filter '$ab = $a+$b; $cd = $c+$d; $ab != $cd'
585         mlr filter '
586           NR == 1 ||
587          #NR == 2 ||
588           NR == 3
589         '
590
591       Please see http://johnkerl.org/miller/doc/reference.html for more information
592       including function list. Or "mlr -f". Please see also "mlr grep" which is
593       useful when you don't yet know which field name(s) you're looking for.
594
595   format-values
596       Usage: mlr format-values [options]
597       Applies format strings to all field values, depending on autodetected type.
598       * If a field value is detected to be integer, applies integer format.
599       * Else, if a field value is detected to be float, applies float format.
600       * Else, applies string format.
601
602       Note: this is a low-keystroke way to apply formatting to many fields. To get
603       finer control, please see the fmtnum function within the mlr put DSL.
604
605       Note: this verb lets you apply arbitrary format strings, which can produce
606       undefined behavior and/or program crashes.  See your system's "man printf".
607
608       Options:
609       -i {integer format} Defaults to "%lld".
610                           Examples: "%06lld", "%08llx".
611                           Note that Miller integers are long long so you must use
612                           formats which apply to long long, e.g. with ll in them.
613                           Undefined behavior results otherwise.
614       -f {float format}   Defaults to "%lf".
615                           Examples: "%8.3lf", "%.6le".
616                           Note that Miller floats are double-precision so you must
617                           use formats which apply to double, e.g. with l[efg] in them.
618                           Undefined behavior results otherwise.
619       -s {string format}  Defaults to "%s".
620                           Examples: "_%s", "%08s".
621                           Note that you must use formats which apply to string, e.g.
622                           with s in them. Undefined behavior results otherwise.
623       -n                  Coerce field values autodetected as int to float, and then
624                           apply the float format.
625
626   fraction
627       Usage: mlr fraction [options]
628       For each record's value in specified fields, computes the ratio of that
629       value to the sum of values in that field over all input records.
630       E.g. with input records  x=1  x=2  x=3  and  x=4, emits output records
631       x=1,x_fraction=0.1  x=2,x_fraction=0.2  x=3,x_fraction=0.3  and  x=4,x_fraction=0.4
632
633       Note: this is internally a two-pass algorithm: on the first pass it retains
634       input records and accumulates sums; on the second pass it computes quotients
635       and emits output records. This means it produces no output until all input is read.
636
637       Options:
638       -f {a,b,c}    Field name(s) for fraction calculation
639       -g {d,e,f}    Optional group-by-field name(s) for fraction counts
640       -p            Produce percents [0..100], not fractions [0..1]. Output field names
641                     end with "_percent" rather than "_fraction"
642       -c            Produce cumulative distributions, i.e. running sums: each output
643                     value folds in the sum of the previous values for the specified group.
644                     E.g. with input records  x=1  x=2  x=3  and  x=4, emits output records
645                     x=1,x_cumulative_fraction=0.1  x=2,x_cumulative_fraction=0.3
646                     x=3,x_cumulative_fraction=0.6  and  x=4,x_cumulative_fraction=1.0
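
       The two-pass behavior described above can be sketched in Python. This
       is an illustration of the documented semantics only, not Miller's
       implementation: the function name and its parameters are hypothetical,
       the -p option is omitted, and each group's sum is assumed to be nonzero.

```python
from collections import defaultdict

def fraction(records, field, group_by=(), cumulative=False):
    """Hypothetical sketch of 'mlr fraction -f field [-g ...] [-c]'."""
    records = list(records)              # pass 1 retains every input record
    sums = defaultdict(float)
    for rec in records:
        if field in rec:
            key = tuple(rec.get(g) for g in group_by)
            sums[key] += float(rec[field])

    suffix = "_cumulative_fraction" if cumulative else "_fraction"
    running = defaultdict(float)         # per-group running sums, used for -c
    out = []
    for rec in records:                  # pass 2 computes quotients
        new = dict(rec)
        if field in rec:
            key = tuple(rec.get(g) for g in group_by)
            value = float(rec[field])
            if cumulative:
                running[key] += value
                value = running[key]
            new[field + suffix] = value / sums[key]   # assumes nonzero sum
        out.append(new)
    return out

if __name__ == "__main__":
    for rec in fraction([{"x": 1}, {"x": 2}, {"x": 3}, {"x": 4}], "x"):
        print(rec)
```

       Because the sums are not known until the last record has been read,
       nothing can be emitted during the first pass, which is why the verb is
       non-streaming.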
647
648   grep
649       Usage: mlr grep [options] {regular expression}
650       Passes through records which match {regex}.
651       Options:
652       -i    Use case-insensitive search.
653       -v    Invert: pass through records which do not match the regex.
654       Note that "mlr filter" is more powerful, but requires you to know field names.
655       By contrast, "mlr grep" allows you to regex-match the entire record. It does
656       this by formatting each record in memory as DKVP, using command-line-specified
657       ORS/OFS/OPS, and matching the resulting line against the regex specified
658       here. In particular, the regex is not applied to the input stream: if you
659       have CSV with header line "x,y,z" and data line "1,2,3" then the regex will
660       be matched, not against either of these lines, but against the DKVP line
661       "x=1,y=2,z=3".  Furthermore, not all the options to system grep are supported,
662       and this command is intended to be merely a keystroke-saver. To get all the
663       features of system grep, you can do
664         "mlr --odkvp ... | grep ... | mlr --idkvp ..."
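
       As a rough Python illustration of the matching described above (not
       Miller's code): each record is rendered as a DKVP line and the regex is
       applied to that line rather than to the original input text. The
       function names are hypothetical, and Python's regex dialect stands in
       for the one Miller uses.

```python
import re

def to_dkvp(record, ofs=",", ops="="):
    """Format one record as a DKVP line such as 'x=1,y=2,z=3'."""
    return ofs.join(f"{key}{ops}{value}" for key, value in record.items())

def grep(records, pattern, ignore_case=False, invert=False):
    """Hypothetical sketch of 'mlr grep [-i] [-v] pattern'."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    for rec in records:
        matched = regex.search(to_dkvp(rec)) is not None
        if matched != invert:            # -v flips the decision
            yield rec

if __name__ == "__main__":
    rows = [{"x": 1, "y": 2, "z": 3}]
    print(list(grep(rows, "y=2")))       # matches the DKVP text
    print(list(grep(rows, "^1,2,3$")))   # the CSV data line is never seen
```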
665
666   group-by
667       Usage: mlr group-by {comma-separated field names}
668       Outputs records in batches having identical values at specified field names.
669
670   group-like
671       Usage: mlr group-like
672       Outputs records in batches having identical field names.
673
674   having-fields
675       Usage: mlr having-fields [options]
676       Conditionally passes through records depending on each record's field names.
677       Options:
678         --at-least      {comma-separated names}
679         --which-are     {comma-separated names}
680         --at-most       {comma-separated names}
681         --all-matching  {regular expression}
682         --any-matching  {regular expression}
683         --none-matching {regular expression}
684       Examples:
685         mlr having-fields --which-are amount,status,owner
686         mlr having-fields --any-matching 'sda[0-9]'
687         mlr having-fields --any-matching '"sda[0-9]"'
688         mlr having-fields --any-matching '"sda[0-9]"i' (this is case-insensitive)
689
690   head
691       Usage: mlr head [options]
692       -n {count}    Head count to print; default 10
693       -g {a,b,c}    Optional group-by-field names for head counts
694       Passes through the first n records, optionally by category.
695       Without -g, ceases consuming more input (i.e. is fast) when n
696       records have been read.
697
698   histogram
699       Usage: mlr histogram [options]
700       -f {a,b,c}    Value-field names for histogram counts
701       --lo {lo}     Histogram low value
702       --hi {hi}     Histogram high value
703       --nbins {n}   Number of histogram bins
704       --auto        Automatically computes limits, ignoring --lo and --hi.
705                     Holds all values in memory before producing any output.
706       -o {prefix}   Prefix for output field name. Default: no prefix.
707       Just a histogram. Input values < lo or > hi are not counted.
708
709   join
710       Usage: mlr join [options]
711       Joins records from specified left file name with records from all file names
712       at the end of the Miller argument list.
713       Functionality is essentially the same as the system "join" command, but for
714       record streams.
715       Options:
716         -f {left file name}
717         -j {a,b,c}   Comma-separated join-field names for output
718         -l {a,b,c}   Comma-separated join-field names for left input file;
719                      defaults to -j values if omitted.
720         -r {a,b,c}   Comma-separated join-field names for right input file(s);
721                      defaults to -j values if omitted.
722         --lp {text}  Additional prefix for non-join output field names from
723                      the left file
724         --rp {text}  Additional prefix for non-join output field names from
725                      the right file(s)
726         --np         Do not emit paired records
727         --ul         Emit unpaired records from the left file
728         --ur         Emit unpaired records from the right file(s)
729         -s|--sorted-input  Require sorted input: records must be sorted
730                      lexically by their join-field names, else not all records will
731                      be paired. The only likely use case for this is with a left
732                      file which is too big to fit into system memory otherwise.
733         -u           Enable unsorted input. (This is the default even without -u.)
734                      In this case, the entire left file will be loaded into memory.
735         --prepipe {command} As in main input options; see mlr --help for details.
736                      If you wish to use a prepipe command for the main input as well
737                      as here, it must be specified there as well as here.
738       File-format options default to those for the right file names on the Miller
739       argument list, but may be overridden for the left file as follows. Please see
740       the main "mlr --help" for more information on syntax for these arguments.
741         -i {one of csv,dkvp,nidx,pprint,xtab}
742         --irs {record-separator character}
743         --ifs {field-separator character}
744         --ips {pair-separator character}
745         --repifs
746         --repips
747         --mmap
748         --no-mmap
749       Please use "mlr --usage-separator-options" for information on specifying separators.
750       Please see http://johnkerl.org/miller/doc/reference.html for more information
751       including examples.
752
753   label
754       Usage: mlr label {new1,new2,new3,...}
755       Given n comma-separated names, renames the first n fields of each record to
756       have the respective name. (Fields past the nth are left with their original
757       names.) Particularly useful with --inidx or --implicit-csv-header, to give
758       useful names to otherwise integer-indexed fields.
759       Examples:
760         "echo 'a b c d' | mlr --inidx --odkvp cat"       gives "1=a,2=b,3=c,4=d"
761         "echo 'a b c d' | mlr --inidx --odkvp label s,t" gives "s=a,t=b,3=c,4=d"
762
763   least-frequent
764       Usage: mlr least-frequent [options]
765       Shows the least frequently occurring distinct values for specified field names.
766       The first entry is the statistical anti-mode; the remaining are runners-up.
767       Options:
768       -f {one or more comma-separated field names}. Required flag.
769       -n {count}. Optional flag defaulting to 10.
770       -b          Suppress counts; show only field values.
771       -o {name}   Field name for output count. Default "count".
772       See also "mlr most-frequent".
773
774   merge-fields
775       Usage: mlr merge-fields [options]
776       Computes univariate statistics for each input record, accumulated across
777       specified fields.
778       Options:
779       -a {sum,count,...}  Names of accumulators. One or more of:
780         count     Count instances of fields
781         mode      Find most-frequently-occurring values for fields; first-found wins tie
782         antimode  Find least-frequently-occurring values for fields; first-found wins tie
783         sum       Compute sums of specified fields
784         mean      Compute averages (sample means) of specified fields
785         stddev    Compute sample standard deviation of specified fields
786         var       Compute sample variance of specified fields
787         meaneb    Estimate error bars for averages (assuming no sample autocorrelation)
788         skewness  Compute sample skewness of specified fields
789         kurtosis  Compute sample kurtosis of specified fields
790         min       Compute minimum values of specified fields
791         max       Compute maximum values of specified fields
792       -f {a,b,c}  Value-field names on which to compute statistics. Requires -o.
793       -r {a,b,c}  Regular expressions for value-field names on which to compute
794                   statistics. Requires -o.
795       -c {a,b,c}  Substrings for collapse mode. All fields which have the same names
796                   after removing substrings will be accumulated together. Please see
797                   examples below.
798       -i          Use interpolated percentiles, like R's type=7; default like type=1.
799                   Not meaningful for string-valued fields.
800       -o {name}   Output field basename for -f/-r.
801       -k          Keep the input fields which contributed to the output statistics;
802                   the default is to omit them.
803       -F          Computes integerable things (e.g. count) in floating point.
804
805       String-valued data make sense unless arithmetic on them is required,
806       e.g. for sum, mean, interpolated percentiles, etc. In case of mixed data,
807       numbers are less than strings.
808
809       Example input data: "a_in_x=1,a_out_x=2,b_in_y=4,b_out_x=8".
810       Example: mlr merge-fields -a sum,count -f a_in_x,a_out_x -o foo
811         produces "b_in_y=4,b_out_x=8,foo_sum=3,foo_count=2" since "a_in_x,a_out_x" are
812         summed over.
813       Example: mlr merge-fields -a sum,count -r in_,out_ -o bar
814         produces "bar_sum=15,bar_count=4" since all four fields are summed over.
815       Example: mlr merge-fields -a sum,count -c in_,out_
816         produces "a_x_sum=3,a_x_count=2,b_y_sum=4,b_y_count=1,b_x_sum=8,b_x_count=1"
817         since "a_in_x" and "a_out_x" both collapse to "a_x", "b_in_y" collapses to
818         "b_y", and "b_out_x" collapses to "b_x".
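
       The collapse mode (-c) in the last example can be sketched in Python as
       follows. This is a simplified illustration under stated assumptions,
       not Miller's implementation: the function name is hypothetical, only
       the sum and count accumulators are shown, values are assumed numeric,
       and the first matching substring is removed once from each field name.

```python
def merge_fields_collapse(record, substrings):
    """Hypothetical sketch of 'mlr merge-fields -a sum,count -c substrings'."""
    sums, counts = {}, {}
    out = {}
    for name, value in record.items():
        short = None
        for sub in substrings:
            if sub in name:
                short = name.replace(sub, "", 1)   # e.g. a_in_x -> a_x
                break
        if short is None:
            out[name] = value                      # field does not participate
            continue
        sums[short] = sums.get(short, 0) + value
        counts[short] = counts.get(short, 0) + 1
    for short in sums:                             # first-seen order
        out[short + "_sum"] = sums[short]
        out[short + "_count"] = counts[short]
    return out

if __name__ == "__main__":
    rec = {"a_in_x": 1, "a_out_x": 2, "b_in_y": 4, "b_out_x": 8}
    print(merge_fields_collapse(rec, ["in_", "out_"]))
```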
819
820   most-frequent
821       Usage: mlr most-frequent [options]
822       Shows the most frequently occurring distinct values for specified field names.
823       The first entry is the statistical mode; the remaining are runners-up.
824       Options:
825       -f {one or more comma-separated field names}. Required flag.
826       -n {count}. Optional flag defaulting to 10.
827       -b          Suppress counts; show only field values.
828       -o {name}   Field name for output count. Default "count".
829       See also "mlr least-frequent".
830
831   nest
832       Usage: mlr nest [options]
833       Explodes specified field values into separate fields/records, or reverses this.
834       Options:
835         --explode,--implode   One is required.
836         --values,--pairs      One is required.
837         --across-records,--across-fields One is required.
838         -f {field name}       Required.
839         --nested-fs {string}  Defaults to ";". Field separator for nested values.
840         --nested-ps {string}  Defaults to ":". Pair separator for nested key-value pairs.
841         --evar {string}       Shorthand for --explode --values --across-records --nested-fs {string}
842         --ivar {string}       Shorthand for --implode --values --across-records --nested-fs {string}
843       Please use "mlr --usage-separator-options" for information on specifying separators.
844
845       Examples:
846
847         mlr nest --explode --values --across-records -f x
848         with input record "x=a;b;c,y=d" produces output records
849           "x=a,y=d"
850           "x=b,y=d"
851           "x=c,y=d"
852         Use --implode to do the reverse.
853
854         mlr nest --explode --values --across-fields -f x
855         with input record "x=a;b;c,y=d" produces output record
856           "x_1=a,x_2=b,x_3=c,y=d"
857         Use --implode to do the reverse.
858
859         mlr nest --explode --pairs --across-records -f x
860         with input record "x=a:1;b:2;c:3,y=d" produces output records
861           "a=1,y=d"
862           "b=2,y=d"
863           "c=3,y=d"
864
865         mlr nest --explode --pairs --across-fields -f x
866         with input record "x=a:1;b:2;c:3,y=d" produces output record
867           "a=1,b=2,c=3,y=d"
868
869       Notes:
870       * With --pairs, --implode doesn't make sense since the original field name has
871         been lost.
872       * The combination "--implode --values --across-records" is non-streaming:
873         no output records are produced until all input records have been read. In
874         particular, this means it won't work in tail -f contexts. But all other flag
875         combinations result in streaming (tail -f friendly) data processing.
876       * It's up to you to ensure that the nested-fs is distinct from your data's IFS:
877         e.g. by default the former is semicolon and the latter is comma.
878       See also mlr reshape.
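
       The explode examples above can be mimicked with a few lines of Python.
       This is an illustrative sketch only, not Miller's code: the function
       names are hypothetical, records are modeled as insertion-ordered dicts,
       and every nested piece is assumed to be well-formed.

```python
def explode_values_across_records(record, field, fs=";"):
    """One output record per nested value; other fields are copied."""
    if field not in record:
        return [dict(record)]
    return [{**record, field: piece} for piece in record[field].split(fs)]

def explode_values_across_fields(record, field, fs=";"):
    """One output record, with field_1, field_2, ... replacing the field."""
    out = {}
    for key, value in record.items():
        if key == field:
            for i, piece in enumerate(value.split(fs), start=1):
                out[f"{field}_{i}"] = piece
        else:
            out[key] = value
    return out

def explode_pairs_across_records(record, field, fs=";", ps=":"):
    """One output record per nested key:value pair (assumes ps is present)."""
    results = []
    for piece in record[field].split(fs):
        new_key, _, new_value = piece.partition(ps)
        out = {}
        for key, value in record.items():
            if key == field:
                out[new_key] = new_value   # the original field name is lost
            else:
                out[key] = value
        results.append(out)
    return results

if __name__ == "__main__":
    print(explode_values_across_records({"x": "a;b;c", "y": "d"}, "x"))
    print(explode_values_across_fields({"x": "a;b;c", "y": "d"}, "x"))
    print(explode_pairs_across_records({"x": "a:1;b:2;c:3", "y": "d"}, "x"))
```

       The last function also shows why --implode is not meaningful with
       --pairs: once the pairs are exploded, nothing in the output records
       remembers that they came from a field named x.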
879
880   nothing
881       Usage: mlr nothing [options]
882       Drops all input records. Useful for testing, or after tee/print/etc. have
883       produced other output.
884
885   put
886       Usage: mlr put [options] {expression}
887       Adds/updates specified field(s). Expressions are semicolon-separated and must
888       either be assignments, or evaluate to boolean.  Booleans with following
889       statements in curly braces control whether those statements are executed;
890       booleans without following curly braces do nothing except side effects (e.g.
891       regex-captures into \1, \2, etc.).
892
893       Conversion options:
894       -S: Keeps field values as strings with no type inference to int or float.
895       -F: Keeps field values as strings or floats with no inference to int.
896       All field values are type-inferred to int/float/string unless this behavior is
897       suppressed with -S or -F.
898
899       Output/formatting options:
900       --oflatsep {string}: Separator to use when flattening multi-level @-variables
901           to output records for emit. Default ":".
902       --jknquoteint: For dump output (JSON-formatted), do not quote map keys if non-string.
903       --jvquoteall: For dump output (JSON-formatted), quote map values even if non-string.
904       Any of the output-format command-line flags (see mlr -h). Example: using
905         mlr --icsv --opprint ... then put --ojson 'tee > "mytap-".$a.".dat", $*' then ...
906       the input is CSV, the output is pretty-print tabular, but the tee-file output
907       is written in JSON format.
908       --no-fflush: for emit, tee, print, and dump, don't call fflush() after every
909           record.
910
911       Expression-specification options:
912       -f {filename}: the DSL expression is taken from the specified file rather
913           than from the command line. Outer single quotes wrapping the expression
914           should not be placed in the file. If -f is specified more than once,
915           all input files specified using -f are concatenated to produce the expression.
916           (For example, you can define functions in one file and call them from another.)
917       -e {expression}: You can use this after -f to add an expression. Example use
918           case: define functions/subroutines in a file you specify with -f, then call
919           them with an expression you specify with -e.
920       (If you mix -e and -f then the expressions are evaluated in the order encountered.
921       Since the expression pieces are simply concatenated, please be sure to use intervening
922       semicolons to separate expressions.)
923
924       Tracing options:
925       -v: Prints the expression's AST (abstract syntax tree), which gives
926           full transparency on the precedence and associativity rules of
927           Miller's grammar, to stdout.
928       -a: Prints a low-level stack-allocation trace to stdout.
929       -t: Prints a low-level parser trace to stderr.
930       -T: Prints every statement to stderr as it is executed.
931
932       Other options:
933       -q: Does not include the modified record in the output stream. Useful for when
934           all desired output is in begin and/or end blocks.
935
936       Please use a dollar sign for field names and double-quotes for string
937       literals. If field names have special characters such as "." then you might
938       use braces, e.g. '${field.name}'. Miller built-in variables are
939       NF NR FNR FILENUM FILENAME M_PI M_E, and ENV["namegoeshere"] to access environment
940       variables. The environment-variable name may be an expression, e.g. a field
941       value.
942
943       Use # to comment to end of line.
944
945       Examples:
946         mlr put '$y = log10($x); $z = sqrt($y)'
947         mlr put '$x>0.0 { $y=log10($x); $z=sqrt($y) }' # does {...} only if $x > 0.0
948         mlr put '$x>0.0;  $y=log10($x); $z=sqrt($y)'   # does all three statements
949         mlr put '$a =~ "([a-z]+)_([0-9]+)";  $b = "left_\1"; $c = "right_\2"'
950         mlr put '$a =~ "([a-z]+)_([0-9]+)" { $b = "left_\1"; $c = "right_\2" }'
951         mlr put '$filename = FILENAME'
952         mlr put '$colored_shape = $color . "_" . $shape'
953         mlr put '$y = cos($theta); $z = atan2($y, $x)'
954         mlr put '$name = sub($name, "http.*com"i, "")'
955         mlr put -q '@sum += $x; end {emit @sum}'
956         mlr put -q '@sum[$a] += $x; end {emit @sum, "a"}'
957         mlr put -q '@sum[$a][$b] += $x; end {emit @sum, "a", "b"}'
958         mlr put -q '@min=min(@min,$x);@max=max(@max,$x); end{emitf @min, @max}'
959         mlr put -q 'is_null(@xmax) || $x > @xmax {@xmax=$x; @recmax=$*}; end {emit @recmax}'
960         mlr put '
961           $x = 1;
962          #$y = 2;
963           $z = 3
964         '
965
966       Please see also 'mlr -k' for examples using redirected output.
967
968       Please see http://johnkerl.org/miller/doc/reference.html for more information
969       including function list. Or "mlr -f".
970       Please see in particular:
971         http://www.johnkerl.org/miller/doc/reference.html#put
972
973   regularize
974       Usage: mlr regularize
975       For records seen earlier in the data stream with same field names in
976       a different order, outputs them with field names in the previously
977       encountered order.
978       Example: input records a=1,c=2,b=3, then e=4,d=5, then c=7,a=6,b=8
979       output as              a=1,c=2,b=3, then e=4,d=5, then a=6,c=7,b=8
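
       One way to picture this behavior is the Python sketch below. It is an
       illustration, not Miller's implementation, and the function name is
       hypothetical: the first field ordering seen for each set of field names
       is remembered and reused for later records having the same names.

```python
def regularize(records):
    """Hypothetical sketch of 'mlr regularize' on insertion-ordered dicts."""
    first_order = {}                       # sorted names -> first-seen order
    for rec in records:
        signature = tuple(sorted(rec))
        order = first_order.setdefault(signature, list(rec))
        yield {name: rec[name] for name in order}

if __name__ == "__main__":
    stream = [
        {"a": 1, "c": 2, "b": 3},
        {"e": 4, "d": 5},
        {"c": 7, "a": 6, "b": 8},
    ]
    for rec in regularize(stream):
        print(rec)
```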
980
981   remove-empty-columns
982       Usage: mlr remove-empty-columns
983       Omits fields which are empty on every input row. Non-streaming.
984
985   rename
986       Usage: mlr rename [options] {old1,new1,old2,new2,...}
987       Renames specified fields.
988       Options:
989       -r         Treat old field names as regular expressions. "ab", "a.*b"
990                  will match any field name containing the substring "ab" or
991                  matching "a.*b", respectively; anchors of the form "^ab$",
992                  "^a.*b$" may be used. New field names may be plain strings,
993                  or may contain capture groups of the form "\1" through
994                  "\9". Wrapping the regex in double quotes is optional, but
995                  is required if you wish to follow it with 'i' to indicate
996                  case-insensitivity.
997       -g         Do global replacement within each field name rather than
998                  first-match replacement.
999       Examples:
1000       mlr rename old_name,new_name
1001       mlr rename old_name_1,new_name_1,old_name_2,new_name_2
1002       mlr rename -r 'Date_[0-9]+,Date'   Rename all such fields to be "Date"
1003       mlr rename -r '"Date_[0-9]+",Date' Same
1004       mlr rename -r 'Date_([0-9]+).*,\1' Rename all such fields to be of the form 20151015
1005       mlr rename -r '"name"i,Name'       Rename "name", "Name", "NAME", etc. to "Name"
1006
1007   reorder
1008       Usage: mlr reorder [options]
1009       -f {a,b,c}   Field names to reorder.
1010       -e           Put specified field names at record end: default is to put
1011                    them at record start.
1012       Examples:
1013       mlr reorder    -f a,b sends input record "d=4,b=2,a=1,c=3" to "a=1,b=2,d=4,c=3".
1014       mlr reorder -e -f a,b sends input record "d=4,b=2,a=1,c=3" to "d=4,c=3,a=1,b=2".
1015
1016   repeat
1017       Usage: mlr repeat [options]
1018       Copies input records to output records multiple times.
1019       Options must be exactly one of the following:
1020         -n {repeat count}  Repeat each input record this many times.
1021         -f {field name}    Same, but take the repeat count from the specified
1022                            field name of each input record.
1023       Example:
1024         echo x=0 | mlr repeat -n 4 then put '$x=urand()'
1025       produces:
1026         x=0.488189
1027         x=0.484973
1028         x=0.704983
1029         x=0.147311
1030       Example:
1031         echo a=1,b=2,c=3 | mlr repeat -f b
1032       produces:
1033         a=1,b=2,c=3
1034         a=1,b=2,c=3
1035       Example:
1036         echo a=1,b=2,c=3 | mlr repeat -f c
1037       produces:
1038         a=1,b=2,c=3
1039         a=1,b=2,c=3
1040         a=1,b=2,c=3
1041
1042   reshape
1043       Usage: mlr reshape [options]
1044       Wide-to-long options:
1045         -i {input field names}   -o {key-field name,value-field name}
1046         -r {input field regexes} -o {key-field name,value-field name}
1047         These pivot/reshape the input data such that the input fields are removed
1048         and separate records are emitted for each key/value pair.
1049         Note: this works with tail -f and produces output records for each input
1050         record seen.
1051       Long-to-wide options:
1052         -s {key-field name,value-field name}
1053         These pivot/reshape the input data to undo the wide-to-long operation.
1054         Note: this does not work with tail -f; it produces output records only after
1055         all input records have been read.
1056
1057       Examples:
1058
1059         Input file "wide.txt":
1060           time       X           Y
1061           2009-01-01 0.65473572  2.4520609
1062           2009-01-02 -0.89248112 0.2154713
1063           2009-01-03 0.98012375  1.3179287
1064
1065         mlr --pprint reshape -i X,Y -o item,value wide.txt
1066           time       item value
1067           2009-01-01 X    0.65473572
1068           2009-01-01 Y    2.4520609
1069           2009-01-02 X    -0.89248112
1070           2009-01-02 Y    0.2154713
1071           2009-01-03 X    0.98012375
1072           2009-01-03 Y    1.3179287
1073
1074         mlr --pprint reshape -r '[A-Z]' -o item,value wide.txt
1075           time       item value
1076           2009-01-01 X    0.65473572
1077           2009-01-01 Y    2.4520609
1078           2009-01-02 X    -0.89248112
1079           2009-01-02 Y    0.2154713
1080           2009-01-03 X    0.98012375
1081           2009-01-03 Y    1.3179287
1082
1083         Input file "long.txt":
1084           time       item value
1085           2009-01-01 X    0.65473572
1086           2009-01-01 Y    2.4520609
1087           2009-01-02 X    -0.89248112
1088           2009-01-02 Y    0.2154713
1089           2009-01-03 X    0.98012375
1090           2009-01-03 Y    1.3179287
1091
1092         mlr --pprint reshape -s item,value long.txt
1093           time       X           Y
1094           2009-01-01 0.65473572  2.4520609
1095           2009-01-02 -0.89248112 0.2154713
1096           2009-01-03 0.98012375  1.3179287
1097       See also mlr nest.
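
       The two directions can be sketched in Python as follows. This is an
       illustration of the documented behavior under simplifying assumptions,
       not Miller's code: the function names are hypothetical, and
       long-to-wide assumes every record carries both the key field and the
       value field.

```python
def wide_to_long(records, input_fields, key_name, value_name):
    """Streaming: emits one record per named input field per input record."""
    for rec in records:
        others = {k: v for k, v in rec.items() if k not in input_fields}
        for name in input_fields:
            if name in rec:
                yield {**others, key_name: name, value_name: rec[name]}

def long_to_wide(records, key_name, value_name):
    """Non-streaming: output exists only after all input has been read."""
    buckets = {}                       # other fields -> record being built
    for rec in records:
        others = tuple((k, v) for k, v in rec.items()
                       if k not in (key_name, value_name))
        wide = buckets.setdefault(others, dict(others))
        wide[rec[key_name]] = rec[value_name]
    return list(buckets.values())

if __name__ == "__main__":
    wide = [
        {"time": "2009-01-01", "X": "0.65473572", "Y": "2.4520609"},
        {"time": "2009-01-02", "X": "-0.89248112", "Y": "0.2154713"},
    ]
    long_rows = list(wide_to_long(wide, ["X", "Y"], "item", "value"))
    print(long_rows)
    print(long_to_wide(long_rows, "item", "value"))
```

       The sketch also shows why only wide-to-long works with tail -f:
       long-to-wide cannot know a wide record is complete until the input
       ends.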
1098
1099   sample
1100       Usage: mlr sample [options]
1101       Reservoir sampling (subsampling without replacement), optionally by category.
1102       -k {count}    Required: number of records to output, total, or by group if using -g.
1103       -g {a,b,c}    Optional: group-by-field names for samples.
1104       See also mlr bootstrap and mlr shuffle.
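
       For readers unfamiliar with reservoir sampling, the classic form of the
       technique (often called Algorithm R) is sketched below in Python. This
       is a generic illustration, not Miller's implementation; the function
       name and the rng parameter are hypothetical, and the -g grouping is
       omitted.

```python
import random

def reservoir_sample(records, k, rng=None):
    """Keep a uniform random sample of k records from a stream of any length."""
    rng = rng or random.Random()
    reservoir = []
    for n, rec in enumerate(records):
        if n < k:
            reservoir.append(rec)          # fill the reservoir first
        else:
            j = rng.randint(0, n)          # inclusive on both ends
            if j < k:
                reservoir[j] = rec         # replace with probability k/(n+1)
    return reservoir

if __name__ == "__main__":
    print(reservoir_sample(range(1000), 5))
```

       Only k records are held in memory at any time, and an input with fewer
       than k records is passed through unchanged in this sketch.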
1105
1106   sec2gmt
1107       Usage: mlr sec2gmt [options] {comma-separated list of field names}
1108       Replaces a numeric field representing seconds since the epoch with the
1109       corresponding GMT timestamp; leaves non-numbers as-is. This is nothing
1110       more than a keystroke-saver for the sec2gmt function:
1111         mlr sec2gmt time1,time2
1112       is the same as
1113         mlr put '$time1=sec2gmt($time1);$time2=sec2gmt($time2)'
1114       Options:
1115       -1 through -9: format the seconds using 1..9 decimal places, respectively.
1116
1117   sec2gmtdate
1118       Usage: mlr sec2gmtdate {comma-separated list of field names}
1119       Replaces a numeric field representing seconds since the epoch with the
1120       corresponding GMT year-month-day timestamp; leaves non-numbers as-is.
1121       This is nothing more than a keystroke-saver for the sec2gmtdate function:
1122         mlr sec2gmtdate time1,time2
1123       is the same as
1124         mlr put '$time1=sec2gmtdate($time1);$time2=sec2gmtdate($time2)'
1125
1126   seqgen
1127       Usage: mlr seqgen [options]
1128       Produces a sequence of counters.  Discards the input record stream. Produces
1129       output as specified by the following options:
1130       -f {name} Field name for counters; default "i".
1131       --start {number} Inclusive start value; default "1".
1132       --stop  {number} Inclusive stop value; default "100".
1133       --step  {number} Step value; default "1".
1134       Start, stop, and/or step may be floating-point. Output is integer if start,
1135       stop, and step are all integers. Step may be negative. It may not be zero
1136       unless start == stop.
1137
1138   shuffle
1139       Usage: mlr shuffle {no options}
1140       Outputs records randomly permuted. No output records are produced until
1141       all input records are read.
1142       See also mlr bootstrap and mlr sample.
1143
1144   skip-trivial-records
1145       Usage: mlr skip-trivial-records [options]
1146       Passes through all records except:
1147       * those with zero fields;
1148       * those for which all fields have empty value.
1149
1150   sort
1151       Usage: mlr sort {flags}
1152       Flags:
1153         -f  {comma-separated field names}  Lexical ascending
1154         -n  {comma-separated field names}  Numerical ascending; nulls sort last
1155         -nf {comma-separated field names}  Same as -n
1156         -r  {comma-separated field names}  Lexical descending
1157         -nr {comma-separated field names}  Numerical descending; nulls sort first
1158       Sorts records primarily by the first specified field, secondarily by the second
1159       field, and so on.  (Any records not having all specified sort keys will appear
1160       at the end of the output, in the order they were encountered, regardless of the
1161       specified sort order.) The sort is stable: records that compare equal will sort
1162       in the order they were encountered in the input record stream.
1163
1164       Example:
1165         mlr sort -f a,b -nr x,y,z
1166       which is the same as:
1167         mlr sort -f a -f b -nr x -nr y -nr z
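
The ordering rules above (multiple keys, stability, and records lacking a sort
key going last) can be sketched in Python. The function name and the (field,
mode) pairs are hypothetical stand-ins for the flags, and the sketch assumes
every numeric key parses as a number, so null handling is not modeled.

```python
def mlr_sort(records, keys):
    """keys is a list of (field, mode) pairs; mode is "f", "r", "n", "nf", or "nr"."""
    fields = [field for field, _ in keys]
    complete = [r for r in records if all(f in r for f in fields)]
    incomplete = [r for r in records if not all(f in r for f in fields)]
    # Apply stable sorts from the least significant key to the most significant.
    for field, mode in reversed(keys):
        numeric = mode in ("n", "nf", "nr")
        descending = mode in ("r", "nr")
        complete.sort(
            key=lambda r: float(r[field]) if numeric else str(r[field]),
            reverse=descending,
        )
    # Records missing any sort key keep their input order at the end.
    return complete + incomplete


records = [
    {"a": "pan", "x": "3"},
    {"a": "eks", "x": "10"},
    {"b": "only"},
    {"a": "pan", "x": "20"},
    {"a": "eks", "x": "2"},
]
for record in mlr_sort(records, [("a", "f"), ("x", "nr")]):
    print(record)
```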
1168
1169   stats1
1170       Usage: mlr stats1 [options]
1171       Computes univariate statistics for one or more given fields, accumulated across
1172       the input record stream.
1173       Options:
1174       -a {sum,count,...}  Names of accumulators: p10 p25.2 p50 p98 p100 etc. and/or
1175                           one or more of:
1176          count     Count instances of fields
1177          mode      Find most-frequently-occurring values for fields; first-found wins tie
1178          antimode  Find least-frequently-occurring values for fields; first-found wins tie
1179          sum       Compute sums of specified fields
1180          mean      Compute averages (sample means) of specified fields
1181          stddev    Compute sample standard deviation of specified fields
1182          var       Compute sample variance of specified fields
1183          meaneb    Estimate error bars for averages (assuming no sample autocorrelation)
1184          skewness  Compute sample skewness of specified fields
1185          kurtosis  Compute sample kurtosis of specified fields
1186          min       Compute minimum values of specified fields
1187          max       Compute maximum values of specified fields
1188       -f {a,b,c}   Value-field names on which to compute statistics
1189       --fr {regex} Regex for value-field names on which to compute statistics
1190                    (compute statistics on values in all field names matching regex)
1191       --fx {regex} Inverted regex for value-field names on which to compute statistics
1192                    (compute statistics on values in all field names not matching regex)
1193       -g {d,e,f}   Optional group-by-field names
1194       --gr {regex} Regex for optional group-by-field names
1195                    (group by values in field names matching regex)
1196       --gx {regex} Inverted regex for optional group-by-field names
1197                    (group by values in field names not matching regex)
1198       --grfx {regex} Shorthand for --gr {regex} --fx {that same regex}
1199       -i           Use interpolated percentiles, like R's type=7; default like type=1.
1200                    Not meaningful for string-valued fields.
1201       -s           Print iterative stats. Useful in tail -f contexts (in which
1202                    case please avoid pprint-format output since end of input
1203                    stream will never be seen).
1204       -F           Computes integerable things (e.g. count) in floating point.
1205       Example: mlr stats1 -a min,p10,p50,p90,max -f value -g size,shape
1206       Example: mlr stats1 -a count,mode -f size
1207       Example: mlr stats1 -a count,mode -f size -g shape
1208       Example: mlr stats1 -a count,mode --fr '^[a-h].*$' --gr '^k.*$'
1209                This computes count and mode statistics on all field names beginning
1210                with a through h, grouped by all field names starting with k.
1211       Notes:
1212       * p50 and median are synonymous.
1213       * min and max output the same results as p0 and p100, respectively, but use
1214         less memory.
1215       * String-valued data make sense unless arithmetic on them is required,
1216         e.g. for sum, mean, interpolated percentiles, etc. In case of mixed data,
1217         numbers are less than strings.
1218       * count and mode allow text input; the rest require numeric input.
1219         In particular, 1 and 1.0 are distinct text for count and mode.
1220       * When there are mode ties, the first-encountered datum wins.
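
Two of the notes above can be illustrated in Python: mode treats its input as
text with first-found tie-breaking, and -i selects interpolated percentiles in
the style of R's type=7. The function names are hypothetical, and whether
Miller's arithmetic agrees with this sketch to the last digit is not confirmed
here.

```python
from collections import Counter


def mode(values):
    """Most frequent value; the first-encountered value wins ties."""
    counts = Counter(values)
    # max() returns the first maximal key, and Counter keeps insertion order.
    return max(counts, key=counts.get)


def percentile_interpolated(values, p):
    """Interpolated percentile (R's type=7 convention) for p in 0..100."""
    ordered = sorted(values)
    position = (p / 100.0) * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


print(mode(["1", "1.0", "1", "2"]))              # "1" and "1.0" are distinct text
print(percentile_interpolated([1, 2, 3, 4], 50))
```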
1221
1222   stats2
1223       Usage: mlr stats2 [options]
1224       Computes bivariate statistics for one or more given field-name pairs,
1225       accumulated across the input record stream.
1226       -a {linreg-ols,corr,...}  Names of accumulators: one or more of:
1227         linreg-pca   Linear regression using principal component analysis
1228         linreg-ols   Linear regression using ordinary least squares
1229         r2           Quality metric for linreg-ols (linreg-pca emits its own)
1230         logireg      Logistic regression
1231         corr         Sample correlation
1232         cov          Sample covariance
1233         covx         Sample-covariance matrix
1234       -f {a,b,c,d}   Value-field name-pairs on which to compute statistics.
1235                      There must be an even number of names.
1236       -g {e,f,g}     Optional group-by-field names.
1237       -v             Print additional output for linreg-pca.
1238       -s             Print iterative stats. Useful in tail -f contexts (in which
1239                      case please avoid pprint-format output since end of input
1240                      stream will never be seen).
1241       --fit          Rather than printing regression parameters, applies them to
1242                      the input data to compute new fit fields. All input records are
1243                      held in memory until end of input stream. Has effect only for
1244                      linreg-ols, linreg-pca, and logireg.
1245       Only one of -s or --fit may be used.
1246       Example: mlr stats2 -a linreg-pca -f x,y
1247       Example: mlr stats2 -a linreg-ols,r2 -f x,y -g size,shape
1248       Example: mlr stats2 -a corr -f x,y
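
For orientation, the quantities named by linreg-ols and r2 can be sketched with
the textbook formulas in Python. The function names are hypothetical, and this
is not a statement about how Miller computes them internally or how it names
its output fields.

```python
def linreg_ols(xs, ys):
    """Return slope m and intercept b minimizing squared error of y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def r_squared(xs, ys, m, b):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs, ys = [1, 2, 3, 4], [3.1, 4.9, 7.2, 8.8]
m, b = linreg_ols(xs, ys)
print(m, b, r_squared(xs, ys, m, b))
```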
1249
1250   step
1251       Usage: mlr step [options]
1252       Computes values dependent on the previous record, optionally grouped
1253       by category.
1254
1255       Options:
1256       -a {delta,rsum,...}   Names of steppers: comma-separated, one or more of:
1257         delta    Compute differences in field(s) between successive records
1258         shift    Include value(s) in field(s) from previous record, if any
1259         from-first Compute differences in field(s) from first record
1260         ratio    Compute ratios in field(s) between successive records
1261         rsum     Compute running sums of field(s) between successive records
1262         counter  Count instances of field(s) between successive records
1263         ewma     Exponentially weighted moving average over successive records
1264       -f {a,b,c} Value-field names on which to compute statistics
1265       -g {d,e,f} Optional group-by-field names
1266       -F         Computes integerable things (e.g. counter) in floating point.
1267       -d {x,y,z} Weights for ewma. 1 means current sample gets all weight (no
1268                  smoothing), just under 1 is light smoothing, just over 0 is
1269                  heavy smoothing. Multiple weights may be specified, e.g.
1270                  "mlr step -a ewma -f sys_load -d 0.01,0.1,0.9". Default if omitted
1271                  is "-d 0.5".
1272       -o {a,b,c} Custom suffixes for EWMA output fields. If omitted, these default to
1273                  the -d values. If supplied, the number of -o values must be the same
1274                  as the number of -d values.
1275
1276       Examples:
1277         mlr step -a rsum -f request_size
1278         mlr step -a delta -f request_size -g hostname
1279         mlr step -a ewma -d 0.1,0.9 -f x,y
1280         mlr step -a ewma -d 0.1,0.9 -o smooth,rough -f x,y
1281         mlr step -a ewma -d 0.1,0.9 -o smooth,rough -f x,y -g group_name
1282
1283       Please see http://johnkerl.org/miller/doc/reference.html#filter or
1284       https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average
1285       for more information on EWMA.
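
The effect of the -d weight can be sketched in Python. The function name is
hypothetical, and seeding the average with the first record's own value is an
assumption rather than something this manpage states.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average with weight alpha on the current sample."""
    smoothed = []
    previous = None
    for x in values:
        # Assumption: the first record seeds the average with its own value.
        current = x if previous is None else alpha * x + (1 - alpha) * previous
        smoothed.append(current)
        previous = current
    return smoothed


print(ewma([10, 20, 30], 0.5))   # [10, 15.0, 22.5]
print(ewma([10, 20, 30], 1))     # [10, 20, 30]: no smoothing
```

With alpha equal to 1 the output reproduces the input, matching the
description of -d above.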
1286
1287   tac
1288       Usage: mlr tac
1289       Prints records in reverse order from the order in which they were encountered.
1290
1291   tail
1292       Usage: mlr tail [options]
1293       -n {count}    Tail count to print; default 10
1294       -g {a,b,c}    Optional group-by-field names for tail counts
1295       Passes through the last n records, optionally by category.
1296
1297   tee
1298       Usage: mlr tee [options] {filename}
1299       Passes through input records (like mlr cat) but also writes to specified output
1300       file, using output-format flags from the command line (e.g. --ocsv). See also
1301       the "tee" keyword within mlr put, which allows data-dependent filenames.
1302       Options:
1303       -a:          append to existing file, if any, rather than overwriting.
1304       --no-fflush: don't call fflush() after every record.
1305       Any of the output-format command-line flags (see mlr -h). Example: using
1306         mlr --icsv --opprint put '...' then tee --ojson ./mytap.dat then stats1 ...
1307       the input is CSV, the output is pretty-print tabular, but the tee-file output
1308       is written in JSON format.
1309
1310   top
1311       Usage: mlr top [options]
1312       -f {a,b,c}    Value-field names for top counts.
1313       -g {d,e,f}    Optional group-by-field names for top counts.
1314       -n {count}    How many records to print per category; default 1.
1315       -a            Print all fields for top-value records; default is
1316                     to print only value and group-by fields. Requires a single
1317                     value-field name only.
1318       --min         Print top smallest values; default is top largest values.
1319       -F            Keep top values as floats even if they look like integers.
1320       -o {name}     Field name for output indices. Default "top_idx".
1321       Prints the n records with smallest/largest values at specified fields,
1322       optionally by category.
1323
1324   uniq
1325       Usage: mlr uniq [options]
1326       Prints distinct values for specified field names. With -c, same as
1327       count-distinct. For uniq, -f is a synonym for -g.
1328
1329       Options:
1330       -g {d,e,f}    Group-by-field names for uniq counts.
1331       -c            Show repeat counts in addition to unique values.
1332       -n            Show only the number of distinct values.
1333       -o {name}     Field name for output count. Default "count".
1334       -a            Output each unique record only once. Incompatible with -g.
1335                     With -c, produces unique records, with repeat counts for each.
1336                     With -n, produces only one record which is the unique-record count.
1337                     With neither -c nor -n, produces unique records.
1338
1339   unsparsify
1340       Usage: mlr unsparsify [options]
1341       Prints records with the union of field names over all input records.
1342       For field names absent in a given record but present in others, fills in
1343       a value. This verb retains all input before producing any output.
1344
1345       Options:
1346       --fill-with {filler string}  What to fill absent fields with. Defaults to
1347                                    the empty string.
1348
1349       Example: if the input is two records, one being 'a=1,b=2' and the other
1350       being 'b=3,c=4', then the output is the two records 'a=1,b=2,c=' and
1351       'a=,b=3,c=4'.
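
The same example can be traced with a short Python sketch. The function name
and the fill_with parameter are hypothetical; the field order of the output
follows the example above.

```python
def unsparsify(records, fill_with=""):
    """Give every record the union of all field names, filling gaps with fill_with."""
    all_keys = []
    for record in records:
        for key in record:
            if key not in all_keys:
                all_keys.append(key)
    # Every output record lists the fields in first-seen order across the input.
    return [{key: record.get(key, fill_with) for key in all_keys} for record in records]


for record in unsparsify([{"a": "1", "b": "2"}, {"b": "3", "c": "4"}]):
    print(",".join(f"{k}={v}" for k, v in record.items()))
```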
1352

FUNCTIONS FOR FILTER/PUT

1354   +
1355       (class=arithmetic #args=2): Addition.
1356
1357       + (class=arithmetic #args=1): Unary plus.
1358
1359   -
1360       (class=arithmetic #args=2): Subtraction.
1361
1362       - (class=arithmetic #args=1): Unary minus.
1363
1364   *
1365       (class=arithmetic #args=2): Multiplication.
1366
1367   /
1368       (class=arithmetic #args=2): Division.
1369
1370   //
1371       (class=arithmetic #args=2): Integer division: rounds toward negative infinity (pythonic).
1372
1373   .+
1374       (class=arithmetic #args=2): Addition, with integer-to-integer overflow.
1375
1376       .+ (class=arithmetic #args=1): Unary plus, with integer-to-integer overflow.
1377
1378   .-
1379       (class=arithmetic #args=2): Subtraction, with integer-to-integer overflow.
1380
1381       .- (class=arithmetic #args=1): Unary minus, with integer-to-integer overflow.
1382
1383   .*
1384       (class=arithmetic #args=2): Multiplication, with integer-to-integer overflow.
1385
1386   ./
1387       (class=arithmetic #args=2): Division, with integer-to-integer overflow.
1388
1389   .//
1390       (class=arithmetic #args=2): Integer division: rounds toward negative infinity (pythonic), with integer-to-integer overflow.
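
One way to picture "integer-to-integer overflow" for the dotted operators is
64-bit wraparound: the result stays an integer even when it no longer fits.
The Python sketch below assumes signed 64-bit two's-complement wrapping, which
this manpage does not spell out; the helper names are hypothetical.

```python
def wrap_int64(n):
    """Reduce an arbitrary Python int to the signed 64-bit range by wrapping."""
    return (n + 2**63) % 2**64 - 2**63


def dot_plus(a, b):
    # Assumed model of '.+': integer sum, wrapped rather than promoted to float.
    return wrap_int64(a + b)


def dot_times(a, b):
    # Assumed model of '.*': integer product, wrapped rather than promoted to float.
    return wrap_int64(a * b)


print(dot_plus(2, 3))
print(dot_plus(2**63 - 1, 1))
```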
1391
1392   %
1393       (class=arithmetic #args=2): Remainder; never negative-valued (pythonic).
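
Since // and % are described as pythonic, Python itself shows the convention.
The sketch covers positive divisors only; behavior with a negative divisor is
not addressed here.

```python
# Floor division rounds toward negative infinity, not toward zero.
print(7 // 2, -7 // 2)   # 3 -4
# With a positive divisor the remainder is never negative.
print(7 % 5, -7 % 5)     # 2 3
# The two operators are tied together by this identity.
a, b = -7, 5
assert a == (a // b) * b + (a % b)
```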
1394
1395   **
1396       (class=arithmetic #args=2): Exponentiation; same as pow, but as an infix
1397       operator.
1398
1399   |
1400       (class=arithmetic #args=2): Bitwise OR.
1401
1402   ^
1403       (class=arithmetic #args=2): Bitwise XOR.
1404
1405   &
1406       (class=arithmetic #args=2): Bitwise AND.
1407
1408   ~
1409       (class=arithmetic #args=1): Bitwise NOT. Beware '$y=~$x' since =~ is the
1410       regex-match operator: try '$y = ~$x'.
1411
1412   <<
1413       (class=arithmetic #args=2): Bitwise left-shift.
1414
1415   >>
1416       (class=arithmetic #args=2): Bitwise right-shift.
1417
1418   bitcount
1419       (class=arithmetic #args=1): Count of 1-bits.
1420
1421   ==
1422       (class=boolean #args=2): String/numeric equality. Mixing number and string
1423       results in string compare.
1424
1425   !=
1426       (class=boolean #args=2): String/numeric inequality. Mixing number and string
1427       results in string compare.
1428
1429   =~
1430       (class=boolean #args=2): String (left-hand side) matches regex (right-hand
1431       side), e.g. '$name =~ "^a.*b$"'.
1432
1433   !=~
1434       (class=boolean #args=2): String (left-hand side) does not match regex
1435       (right-hand side), e.g. '$name !=~ "^a.*b$"'.
1436
1437   >
1438       (class=boolean #args=2): String/numeric greater-than. Mixing number and string
1439       results in string compare.
1440
1441   >=
1442       (class=boolean #args=2): String/numeric greater-than-or-equals. Mixing number
1443       and string results in string compare.
1444
1445   <
1446       (class=boolean #args=2): String/numeric less-than. Mixing number and string
1447       results in string compare.
1448
1449   <=
1450       (class=boolean #args=2): String/numeric less-than-or-equals. Mixing number
1451       and string results in string compare.
1452
1453   &&
1454       (class=boolean #args=2): Logical AND.
1455
1456   ||
1457       (class=boolean #args=2): Logical OR.
1458
1459   ^^
1460       (class=boolean #args=2): Logical XOR.
1461
1462   !
1463       (class=boolean #args=1): Logical negation.
1464
1465   ? :
1466       (class=boolean #args=3): Ternary operator.
1467
1468   .
1469       (class=string #args=2): String concatenation.
1470
1471   gsub
1472       (class=string #args=3): Example: '$name=gsub($name, "old", "new")'
1473       (replace all).
1474
1475   regextract
1476       (class=string #args=2): Example: '$name=regextract($name, "[A-Z]{3}[0-9]{2}")'.
1478
1479   regextract_or_else
1480       (class=string #args=3): Example: '$name=regextract_or_else($name, "[A-Z]{3}[0-9]{2}", "default")'.
1482
1483   strlen
1484       (class=string #args=1): String length.
1485
1486   sub
1487       (class=string #args=3): Example: '$name=sub($name, "old", "new")'
1488       (replace once).
1489
1490   ssub
1491       (class=string #args=3): Like sub but does no regexing. No characters are special.
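
The difference between sub (replace once), gsub (replace all), and ssub (no
regexing) can be sketched in Python. Python's re module stands in for Miller's
regex library here, so pattern syntax may differ in detail; the function names
simply mirror the Miller ones.

```python
import re


def sub(s, pattern, replacement):
    """Replace the first regex match only."""
    return re.sub(pattern, replacement, s, count=1)


def gsub(s, pattern, replacement):
    """Replace every regex match."""
    return re.sub(pattern, replacement, s)


def ssub(s, old, new):
    """Replace the first literal occurrence; no characters are special."""
    return s.replace(old, new, 1)


print(sub("banana", "an", "AN"))    # bANana
print(gsub("banana", "an", "AN"))   # bANANa
print(sub("a.b.c", ".", "-"))       # -.b.c  ('.' matches any character)
print(ssub("a.b.c", ".", "-"))      # a-b.c  ('.' is taken literally)
```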
1492
1493   substr
1494       (class=string #args=3): substr(s,m,n) gives substring of s from 0-up position m to n
1495       inclusive. Negative indices -len .. -1 alias to 0 .. len-1.
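
The indexing convention for substr (0-up, inclusive at both ends, with negative
aliases) can be sketched in Python. Returning an empty string for out-of-range
or reversed bounds is an assumption of this sketch, not documented behavior.

```python
def substr(s, m, n):
    """Substring of s from 0-up position m through n, inclusive."""
    length = len(s)
    # Alias negative indices -len .. -1 onto 0 .. len-1.
    if m < 0:
        m += length
    if n < 0:
        n += length
    # Assumption: out-of-range or reversed bounds give an empty string here.
    if m < 0 or n >= length or m > n:
        return ""
    return s[m:n + 1]


print(substr("hello", 1, 3))     # ell
print(substr("hello", -4, -2))   # ell
```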
1496
1497   tolower
1498       (class=string #args=1): Convert string to lowercase.
1499
1500   toupper
1501       (class=string #args=1): Convert string to uppercase.
1502
1503   capitalize
1504       (class=string #args=1): Convert string's first character to uppercase.
1505
1506   lstrip
1507       (class=string #args=1): Strip leading whitespace from string.
1508
1509   rstrip
1510       (class=string #args=1): Strip trailing whitespace from string.
1511
1512   strip
1513       (class=string #args=1): Strip leading and trailing whitespace from string.
1514
1515   collapse_whitespace
1516       (class=string #args=1): Strip repeated whitespace from string.
1517
1518   clean_whitespace
1519       (class=string #args=1): Same as collapse_whitespace and strip.
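
How the whitespace functions relate to one another can be sketched in Python.
The sketch assumes that collapsing means replacing each run of whitespace with
a single space; the exact set of characters Miller treats as whitespace is not
stated here.

```python
import re


def collapse_whitespace(s):
    """Replace each run of whitespace with a single space (assumed meaning)."""
    return re.sub(r"\s+", " ", s)


def clean_whitespace(s):
    """Collapse repeated whitespace, then strip both ends."""
    return collapse_whitespace(s).strip()


print(repr(collapse_whitespace("  a   b  ")))   # ' a b '
print(repr(clean_whitespace("  a   b  ")))      # 'a b'
```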
1520
1521   system
1522       (class=string #args=1): Run command string, yielding its stdout minus final carriage return.
1523
1524   abs
1525       (class=math #args=1): Absolute value.
1526
1527   acos
1528       (class=math #args=1): Inverse trigonometric cosine.
1529
1530   acosh
1531       (class=math #args=1): Inverse hyperbolic cosine.
1532
1533   asin
1534       (class=math #args=1): Inverse trigonometric sine.
1535
1536   asinh
1537       (class=math #args=1): Inverse hyperbolic sine.
1538
1539   atan
1540       (class=math #args=1): One-argument arctangent.
1541
1542   atan2
1543       (class=math #args=2): Two-argument arctangent.
1544
1545   atanh
1546       (class=math #args=1): Inverse hyperbolic tangent.
1547
1548   cbrt
1549       (class=math #args=1): Cube root.
1550
1551   ceil
1552       (class=math #args=1): Ceiling: nearest integer at or above.
1553
1554   cos
1555       (class=math #args=1): Trigonometric cosine.
1556
1557   cosh
1558       (class=math #args=1): Hyperbolic cosine.
1559
1560   erf
1561       (class=math #args=1): Error function.
1562
1563   erfc
1564       (class=math #args=1): Complementary error function.
1565
1566   exp
1567       (class=math #args=1): Exponential function e**x.
1568
1569   expm1
1570       (class=math #args=1): e**x - 1.
1571
1572   floor
1573       (class=math #args=1): Floor: nearest integer at or below.
1574
1575   invqnorm
1576       (class=math #args=1): Inverse of normal cumulative distribution
1577       function. Note that invqnorm(urand()) is normally distributed.
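
The note about invqnorm(urand()) is inverse-transform sampling. A Python sketch
using the standard library's NormalDist as a stand-in for qnorm and invqnorm
follows; the sample size and seed are arbitrary choices.

```python
import random
from statistics import NormalDist, mean, stdev

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def qnorm(x):
    """Normal cumulative distribution function."""
    return standard_normal.cdf(x)


def invqnorm(p):
    """Inverse of qnorm, defined for 0 < p < 1."""
    return standard_normal.inv_cdf(p)


random.seed(1)
samples = []
while len(samples) < 10000:
    u = random.random()
    if u > 0.0:  # inv_cdf is undefined at exactly 0
        samples.append(invqnorm(u))

# The sample mean should be near 0 and the sample standard deviation near 1.
print(round(mean(samples), 2), round(stdev(samples), 2))
```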
1578
1579   log
1580       (class=math #args=1): Natural (base-e) logarithm.
1581
1582   log10
1583       (class=math #args=1): Base-10 logarithm.
1584
1585   log1p
1586       (class=math #args=1): log(1+x).
1587
1588   logifit
1589       (class=math #args=3): Given m and b from logistic regression, compute
1590       fit: $yhat=logifit($x,$m,$b).
1591
1592   madd
1593       (class=math #args=3): a + b mod m (integers)
1594
1595   max
1596       (class=math variadic): Max of n numbers; null loses.
1597
1598   mexp
1599       (class=math #args=3): a ** b mod m (integers)
1600
1601   min
1602       (class=math variadic): Min of n numbers; null loses
1603
1604   mmul
1605       (class=math #args=3): a * b mod m (integers)
1606
1607   msub
1608       (class=math #args=3): a - b mod m (integers)
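
The four modular functions (madd, mexp, mmul, msub) can be sketched in Python.
The sketch reads "a + b mod m" as (a + b) mod m and assumes a positive modulus
with a non-negative result; both readings are assumptions.

```python
def madd(a, b, m):
    return (a + b) % m


def msub(a, b, m):
    return (a - b) % m


def mmul(a, b, m):
    return (a * b) % m


def mexp(a, b, m):
    # Three-argument pow performs modular exponentiation without huge intermediates.
    return pow(a, b, m)


print(madd(5, 3, 7), msub(5, 6, 7), mmul(3, 4, 7), mexp(2, 10, 7))   # 1 6 5 2
```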
1609
1610   pow
1611       (class=math #args=2): Exponentiation; same as **.
1612
1613   qnorm
1614       (class=math #args=1): Normal cumulative distribution function.
1615
1616   round
1617       (class=math #args=1): Round to nearest integer.
1618
1619   roundm
1620       (class=math #args=2): Round to nearest multiple of m: roundm($x,$m) is
1621       the same as round($x/$m)*$m
1622
1623   sgn
1624       (class=math #args=1): +1 for positive input, 0 for zero input, -1 for
1625       negative input.
1626
1627   sin
1628       (class=math #args=1): Trigonometric sine.
1629
1630   sinh
1631       (class=math #args=1): Hyperbolic sine.
1632
1633   sqrt
1634       (class=math #args=1): Square root.
1635
1636   tan
1637       (class=math #args=1): Trigonometric tangent.
1638
1639   tanh
1640       (class=math #args=1): Hyperbolic tangent.
1641
1642   urand
1643       (class=math #args=0): Floating-point numbers uniformly distributed on the unit interval.
1644       Int-valued example: '$n=floor(20+urand()*11)'.
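
The int-valued recipe above maps the unit interval onto the integers 20
through 30. A Python sketch, assuming urand() is uniform on [0, 1) like
Python's random.random(); the seed and sample count are arbitrary.

```python
import math
import random

random.seed(0)


def urand():
    """Stand-in for Miller's urand(): uniform on [0, 1)."""
    return random.random()


# floor(20 + u*11) lands on one of the 11 integers 20, 21, ..., 30.
samples = [math.floor(20 + urand() * 11) for _ in range(10000)]
print(min(samples), max(samples))
```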
1645
1646   urandrange
1647       (class=math #args=2): Floating-point numbers uniformly distributed on the interval [a, b).
1648
1649   urand32
1650       (class=math #args=0): Integer uniformly distributed between 0 and
1651       2**32-1 inclusive.
1652
1653   urandint
1654       (class=math #args=2): Integer uniformly distributed between inclusive
1655       integer endpoints.
1656
1657   dhms2fsec
1658       (class=time #args=1): Recovers floating-point seconds as in
1659       dhms2fsec("5d18h53m20.250000s") = 500000.250000
1660
1661   dhms2sec
1662       (class=time #args=1): Recovers integer seconds as in
1663       dhms2sec("5d18h53m20s") = 500000
1664
1665   fsec2dhms
1666       (class=time #args=1): Formats floating-point seconds as in
1667       fsec2dhms(500000.25) = "5d18h53m20.250000s"
1668
1669   fsec2hms
1670       (class=time #args=1): Formats floating-point seconds as in
1671       fsec2hms(5000.25) = "01:23:20.250000"
1672
1673   gmt2sec
1674       (class=time #args=1): Parses GMT timestamp as integer seconds since
1675       the epoch.
1676
1677   localtime2sec
1678       (class=time #args=1): Parses local timestamp as integer seconds since
1679       the epoch. Consults $TZ environment variable.
1680
1681   hms2fsec
1682       (class=time #args=1): Recovers floating-point seconds as in
1683       hms2fsec("01:23:20.250000") = 5000.250000
1684
1685   hms2sec
1686       (class=time #args=1): Recovers integer seconds as in
1687       hms2sec("01:23:20") = 5000
1688
1689   sec2dhms
1690       (class=time #args=1): Formats integer seconds as in sec2dhms(500000)
1691       = "5d18h53m20s"
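
The d/h/m/s arithmetic behind sec2dhms and dhms2sec can be sketched in Python.
The zero-padding of inner fields and the omission of leading zero-valued units
are assumptions; only the documented example value is relied on. Negative
inputs are not modeled.

```python
import re


def sec2dhms(seconds):
    """Format non-negative integer seconds as days, hours, minutes, seconds."""
    d, rem = divmod(int(seconds), 86400)
    h, rem = divmod(rem, 3600)
    m, s = divmod(rem, 60)
    # Assumption: leading zero-valued units are dropped, inner fields are padded.
    if d > 0:
        return f"{d}d{h:02d}h{m:02d}m{s:02d}s"
    if h > 0:
        return f"{h}h{m:02d}m{s:02d}s"
    if m > 0:
        return f"{m}m{s:02d}s"
    return f"{s}s"


def dhms2sec(text):
    """Recover integer seconds from a string such as '5d18h53m20s'."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)([dhms])", text))


print(sec2dhms(500000))          # 5d18h53m20s
print(dhms2sec("5d18h53m20s"))   # 500000
```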
1692
1693   sec2gmt
1694       (class=time #args=1): Formats seconds since epoch (integer part)
1695       as GMT timestamp, e.g. sec2gmt(1440768801.7) = "2015-08-28T13:33:21Z".
1696       Leaves non-numbers as-is.
1697
1698       sec2gmt (class=time #args=2): Formats seconds since epoch as GMT timestamp with n
1699       decimal places for seconds, e.g. sec2gmt(1440768801.7,1) = "2015-08-28T13:33:21.7Z".
1700       Leaves non-numbers as-is.
1701
1702   sec2gmtdate
1703       (class=time #args=1): Formats seconds since epoch (integer part)
1704       as GMT timestamp with year-month-date, e.g. sec2gmtdate(1440768801.7) = "2015-08-28".
1705       Leaves non-numbers as-is.
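
The one-argument forms of sec2gmt and sec2gmtdate can be sketched in Python
with the standard datetime module. The sketch is an equivalent computation
under the documented rules (integer part only, non-numbers left as-is), not
Miller's code.

```python
from datetime import datetime, timezone


def _format_gmt(value, fmt):
    try:
        seconds = int(float(value))  # keep only the integer part
    except (TypeError, ValueError, OverflowError):
        return value  # leave non-numbers as-is
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(fmt)


def sec2gmt(value):
    return _format_gmt(value, "%Y-%m-%dT%H:%M:%SZ")


def sec2gmtdate(value):
    return _format_gmt(value, "%Y-%m-%d")


print(sec2gmt(1440768801.7))       # 2015-08-28T13:33:21Z
print(sec2gmtdate(1440768801.7))   # 2015-08-28
print(sec2gmt("abc"))              # abc
```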
1706
1707   sec2localtime
1708       (class=time #args=1): Formats seconds since epoch (integer part)
1709       as local timestamp, e.g. sec2localtime(1440768801.7) = "2015-08-28T13:33:21Z".
1710       Consults $TZ environment variable. Leaves non-numbers as-is.
1711
1712       sec2localtime (class=time #args=2): Formats seconds since epoch as local timestamp with n
1713       decimal places for seconds, e.g. sec2localtime(1440768801.7,1) = "2015-08-28T13:33:21.7Z".
1714       Consults $TZ environment variable. Leaves non-numbers as-is.
1715
1716   sec2localdate
1717       (class=time #args=1): Formats seconds since epoch (integer part)
1718       as local timestamp with year-month-date, e.g. sec2localdate(1440768801.7) = "2015-08-28".
1719       Consults $TZ environment variable. Leaves non-numbers as-is.
1720
1721   sec2hms
1722       (class=time #args=1): Formats integer seconds as in
1723       sec2hms(5000) = "01:23:20"
1724
1725   strftime
1726       (class=time #args=2): Formats seconds since the epoch as timestamp, e.g.
1727       strftime(1440768801.7,"%Y-%m-%dT%H:%M:%SZ") = "2015-08-28T13:33:21Z", and
1728       strftime(1440768801.7,"%Y-%m-%dT%H:%M:%3SZ") = "2015-08-28T13:33:21.700Z".
1729       Format strings are as in the C library (please see "man strftime" on your system),
1730       with the Miller-specific addition of "%1S" through "%9S" which format the seconds
1731       with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.)
1732       See also strftime_local.
1733
1734   strftime_local
1735       (class=time #args=2): Like strftime but consults the $TZ environment variable to get local time zone.
1736
1737   strptime
1738       (class=time #args=2): Parses timestamp as floating-point seconds since the epoch,
1739       e.g. strptime("2015-08-28T13:33:21Z","%Y-%m-%dT%H:%M:%SZ") = 1440768801.000000,
1740       and  strptime("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440768801.345000.
1741       See also strptime_local.
1742
1743   strptime_local
1744       (class=time #args=2): Like strptime, but consults $TZ environment variable to find and use local timezone.
1745
1746   systime
1747       (class=time #args=0): Floating-point seconds since the epoch,
1748       e.g. 1440768801.748936.
1749
1750   is_absent
1751       (class=typing #args=1): False if field is present in input, true otherwise.
1752
1753   is_bool
1754       (class=typing #args=1): True if field is present with boolean value. Synonymous with is_boolean.
1755
1756   is_boolean
1757       (class=typing #args=1): True if field is present with boolean value. Synonymous with is_bool.
1758
1759   is_empty
1760       (class=typing #args=1): True if field is present in input with empty string value, false otherwise.
1761
1762   is_empty_map
1763       (class=typing #args=1): True if argument is a map which is empty.
1764
1765   is_float
1766       (class=typing #args=1): True if field is present with value inferred to be float
1767
1768   is_int
1769       (class=typing #args=1): True if field is present with value inferred to be int
1770
1771   is_map
1772       (class=typing #args=1): True if argument is a map.
1773
1774   is_nonempty_map
1775       (class=typing #args=1): True if argument is a map which is non-empty.
1776
1777   is_not_empty
1778       (class=typing #args=1): False if field is present in input with empty value, true otherwise
1779
1780   is_not_map
1781       (class=typing #args=1): True if argument is not a map.
1782
1783   is_not_null
1784       (class=typing #args=1): False if argument is null (empty or absent), true otherwise.
1785
1786   is_null
1787       (class=typing #args=1): True if argument is null (empty or absent), false otherwise.
1788
1789   is_numeric
1790       (class=typing #args=1): True if field is present with value inferred to be int or float
1791
1792   is_present
1793       (class=typing #args=1): True if field is present in input, false otherwise.
1794
1795   is_string
1796       (class=typing #args=1): True if field is present with string (including empty-string) value
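
The distinction between absent, empty, and null can be sketched in Python by
modeling a record as a dict. The two-argument function signatures are
hypothetical; Miller's functions take the field expression itself.

```python
def is_present(record, field):
    return field in record


def is_absent(record, field):
    return field not in record


def is_empty(record, field):
    # Present in the record, but with an empty-string value.
    return field in record and record[field] == ""


def is_null(record, field):
    # Null means empty or absent.
    return is_absent(record, field) or is_empty(record, field)


def is_not_null(record, field):
    return not is_null(record, field)


record = {"x": "", "y": "3"}
for name in ("x", "y", "z"):
    print(name, is_present(record, name), is_empty(record, name), is_null(record, name))
```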
1797
1798   asserting_absent
1799       (class=typing #args=1): Returns argument if it is absent in the input data, else
1800       throws an error.
1801
1802   asserting_bool
1803       (class=typing #args=1): Returns argument if it is present with boolean value, else
1804       throws an error.
1805
1806   asserting_boolean
1807       (class=typing #args=1): Returns argument if it is present with boolean value, else
1808       throws an error.
1809
1810   asserting_empty
1811       (class=typing #args=1): Returns argument if it is present in input with empty value,
1812       else throws an error.
1813
1814   asserting_empty_map
1815       (class=typing #args=1): Returns argument if it is a map with empty value, else
1816       throws an error.
1817
1818   asserting_float
1819       (class=typing #args=1): Returns argument if it is present with float value, else
1820       throws an error.
1821
1822   asserting_int
1823       (class=typing #args=1): Returns argument if it is present with int value, else
1824       throws an error.
1825
1826   asserting_map
1827       (class=typing #args=1): Returns argument if it is a map, else throws an error.
1828
1829   asserting_nonempty_map
1830       (class=typing #args=1): Returns argument if it is a non-empty map, else throws
1831       an error.
1832
1833   asserting_not_empty
1834       (class=typing #args=1): Returns argument if it is present in input with non-empty
1835       value, else throws an error.
1836
1837   asserting_not_map
1838       (class=typing #args=1): Returns argument if it is not a map, else throws an error.
1839
1840   asserting_not_null
1841       (class=typing #args=1): Returns argument if it is non-null (non-empty and non-absent),
1842       else throws an error.
1843
1844   asserting_null
1845       (class=typing #args=1): Returns argument if it is null (empty or absent), else throws
1846       an error.
1847
1848   asserting_numeric
1849       (class=typing #args=1): Returns argument if it is present with int or float value,
1850       else throws an error.
1851
1852   asserting_present
1853       (class=typing #args=1): Returns argument if it is present in input, else throws
1854       an error.
1855
1856   asserting_string
1857       (class=typing #args=1): Returns argument if it is present with string (including
1858       empty-string) value, else throws an error.
1859
1860   boolean
1861       (class=conversion #args=1): Convert int/float/bool/string to boolean.
1862
1863   float
1864       (class=conversion #args=1): Convert int/float/bool/string to float.
1865
1866   fmtnum
1867       (class=conversion #args=2): Convert int/float/bool to string using
1868       printf-style format string, e.g. '$s = fmtnum($n, "%06lld")'. WARNING: Miller numbers
1869       are all long long or double. If you use formats like %d or %f, behavior is undefined.
1870
1871   hexfmt
1872       (class=conversion #args=1): Convert int to string, e.g. 255 to "0xff".
1873
1874   int
1875       (class=conversion #args=1): Convert int/float/bool/string to int.
1876
1877   string
1878       (class=conversion #args=1): Convert int/float/bool/string to string.
1879
1880   typeof
1881       (class=conversion #args=1): Convert argument to type of argument (e.g.
1882       MT_STRING). For debug.
1883
1884   depth
1885       (class=maps #args=1): Prints maximum depth of hashmap: ''. Scalars have depth 0.
1886
1887   haskey
1888       (class=maps #args=2): True/false if map has/hasn't key, e.g. 'haskey($*, "a")' or
1889       'haskey(mymap, mykey)'. Error if 1st argument is not a map.
1890
1891   joink
1892       (class=maps #args=2): Makes string from map keys. E.g. 'joink($*, ",")'.
1893
1894   joinkv
1895       (class=maps #args=3): Makes string from map key-value pairs. E.g. 'joinkv(@v[2], "=", ",")'
1896
1897   joinv
1898       (class=maps #args=2): Makes string from map values. E.g. 'joinv(mymap, ",")'.
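
The three join functions differ only in which parts of the map they
concatenate: joink uses the keys, joinv the values, and joinkv both. A Python
sketch over an insertion-ordered dict, with argument order taken from the
examples above:

```python
def joink(m, separator):
    """Join the map's keys."""
    return separator.join(str(k) for k in m)


def joinv(m, separator):
    """Join the map's values."""
    return separator.join(str(v) for v in m.values())


def joinkv(m, kv_separator, pair_separator):
    """Join key-value pairs, e.g. a=1,b=2."""
    return pair_separator.join(f"{k}{kv_separator}{v}" for k, v in m.items())


m = {"a": 1, "b": 2}
print(joink(m, ","), joinv(m, ","), joinkv(m, "=", ","))   # a,b 1,2 a=1,b=2
```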
1899
1900   leafcount
1901       (class=maps #args=1): Counts total number of terminal values in hashmap. For single-level maps,
1902       same as length.
1903
1904   length
1905       (class=maps #args=1): Counts number of top-level entries in hashmap. Scalars have length 1.
1906
1907   mapdiff
1908       (class=maps variadic): With 0 args, returns empty map. With 1 arg, returns copy of arg.
1909       With 2 or more, returns copy of arg 1 with all keys from any of remaining argument maps removed.
1910
1911   mapexcept
1912       (class=maps variadic): Returns a map with keys from remaining arguments, if any, unset.
1913       E.g. 'mapexcept({1:2,3:4,5:6}, 1, 5, 7)' is '{3:4}'.
1914
1915   mapselect
1916       (class=maps variadic): Returns a map with only keys from remaining arguments set.
1917       E.g. 'mapselect({1:2,3:4,5:6}, 1, 5, 7)' is '{1:2,5:6}'.
1918
1919   mapsum
1920       (class=maps variadic): With 0 args, returns empty map. With >= 1 arg, returns a map with
1921       key-value pairs from all arguments. Rightmost collisions win, e.g. 'mapsum({1:2,3:4},{1:5})' is '{1:5,3:4}'.
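
The map functions above (mapdiff, mapexcept, mapselect, mapsum) behave like
simple dict operations. A Python sketch that reproduces the documented
examples follows; it illustrates the semantics and is not Miller's
implementation.

```python
def mapdiff(*maps):
    """Copy of the first map with every key of the remaining maps removed."""
    if not maps:
        return {}
    result = dict(maps[0])
    for other in maps[1:]:
        for key in other:
            result.pop(key, None)
    return result


def mapexcept(m, *keys):
    return {k: v for k, v in m.items() if k not in keys}


def mapselect(m, *keys):
    return {k: v for k, v in m.items() if k in keys}


def mapsum(*maps):
    """Union of all maps; the rightmost map wins on key collisions."""
    result = {}
    for m in maps:
        result.update(m)
    return result


print(mapexcept({1: 2, 3: 4, 5: 6}, 1, 5, 7))   # {3: 4}
print(mapselect({1: 2, 3: 4, 5: 6}, 1, 5, 7))   # {1: 2, 5: 6}
print(mapsum({1: 2, 3: 4}, {1: 5}))             # {1: 5, 3: 4}
```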
1922
1923   splitkv
1924       (class=maps #args=3): Splits string by separators into map with type inference.
1925       E.g. 'splitkv("a=1,b=2,c=3", "=", ",")' gives '{"a" : 1, "b" : 2, "c" : 3}'.
1926
1927   splitkvx
1928       (class=maps #args=3): Splits string by separators into map without type inference (keys and
1929       values are strings). E.g. 'splitkvx("a=1,b=2,c=3", "=", ",")' gives
1930       '{"a" : "1", "b" : "2", "c" : "3"}'.
1931
1932   splitnv
1933       (class=maps #args=2): Splits string by separator into integer-indexed map with type inference.
1934       E.g. 'splitnv("a,b,c" , ",")' gives '{1 : "a", 2 : "b", 3 : "c"}'.
1935
1936   splitnvx
1937       (class=maps #args=2): Splits string by separator into integer-indexed map without type
1938       inference (values are strings). E.g. 'splitnvx("4,5,6" , ",")' gives '{1 : "4", 2 : "5", 3 : "6"}'.
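
       The difference between the inferring and non-inferring split functions
       (splitkv and splitnv versus splitkvx and splitnvx) is sketched below in
       Python. The helper infer() is hypothetical and much simpler than
       Miller's own type inference: it only tries int, then float, and
       otherwise keeps the string. Well-formed input is assumed.

```python
# Python model of splitkv / splitkvx / splitnv / splitnvx (illustrative).

def infer(s):
    """Hypothetical stand-in for type inference: int, then float, else string."""
    for convert in (int, float):
        try:
            return convert(s)
        except ValueError:
            pass
    return s

def splitkvx(s, pair_sep, field_sep):
    """Split 'k=v,k=v' text into a map; keys and values stay strings."""
    out = {}
    for field in s.split(field_sep):
        key, _, value = field.partition(pair_sep)
        out[key] = value
    return out

def splitkv(s, pair_sep, field_sep):
    """As splitkvx, but values go through type inference."""
    return {k: infer(v) for k, v in splitkvx(s, pair_sep, field_sep).items()}

def splitnvx(s, sep):
    """Split text into a map keyed 1, 2, 3, ...; values stay strings."""
    return {i: piece for i, piece in enumerate(s.split(sep), start=1)}

def splitnv(s, sep):
    """As splitnvx, but values go through type inference."""
    return {i: infer(v) for i, v in splitnvx(s, sep).items()}

print(splitkv("a=1,b=2,c=3", "=", ","))   # {'a': 1, 'b': 2, 'c': 3}
print(splitkvx("a=1,b=2,c=3", "=", ","))  # {'a': '1', 'b': '2', 'c': '3'}
print(splitnv("4,5,6", ","))              # {1: 4, 2: 5, 3: 6}
print(splitnvx("4,5,6", ","))             # {1: '4', 2: '5', 3: '6'}
```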
1939

KEYWORDS FOR PUT AND FILTER

1941   all
1942       all: used in "emit", "emitp", and "unset" as a synonym for @*
1943
1944   begin
1945       begin: defines a block of statements to be executed before input records
1946       are ingested. The body statements must be wrapped in curly braces.
1947       Example: 'begin { @count = 0 }'
1948
1949   bool
1950       bool: declares a boolean local variable in the current curly-braced scope.
1951       Type-checking happens at assignment: 'bool b = 1' is an error.
1952
1953   break
1954       break: causes execution to continue after the body of the current
1955       for/while/do-while loop.
1956
1957   call
1958       call: used for invoking a user-defined subroutine.
1959       Example: 'subr s(k,v) { print k . " is " . v} call s("a", $a)'
1960
1961   continue
1962       continue: causes execution to skip the remaining statements in the body of
1963       the current for/while/do-while loop. For-loop increments are still applied.
1964
1965   do
1966       do: with "while", introduces a do-while loop. The body statements must be wrapped
1967       in curly braces.
1968
1969   dump
1970       dump: prints all currently defined out-of-stream variables immediately
1971         to stdout as JSON.
1972
1973         With >, >>, or |, the data do not become part of the output record stream but
1974         are instead redirected.
1975
1976         The > and >> are for write and append, as in the shell, but (as with awk) the
1977         file-overwrite for > is on first write, not per record. The | is for piping to
1978         a process which will process the data. There will be one open file for each
1979         distinct file name (for > and >>) or one subordinate process for each distinct
1980         value of the piped-to command (for |). Output-formatting flags are taken from
1981         the main command line.
1982
1983         Example: mlr --from f.dat put -q '@v[NR]=$*; end { dump }'
1984         Example: mlr --from f.dat put -q '@v[NR]=$*; end { dump >  "mytap.dat"}'
1985         Example: mlr --from f.dat put -q '@v[NR]=$*; end { dump >> "mytap.dat"}'
1986         Example: mlr --from f.dat put -q '@v[NR]=$*; end { dump | "jq .[]"}'
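
         The redirect rules above (overwrite on first write rather than per
         record, and one open file per distinct file name) can be modelled as
         follows. This Python sketch uses a hypothetical Redirects class and
         is not Miller's implementation. It covers > and >> only, not pipes.

```python
# Python model of ">" and ">>" redirects (illustrative, not Miller source).
import os
import tempfile

class Redirects:
    """Keeps one open handle per distinct file name for the life of the run."""

    def __init__(self):
        self._handles = {}

    def write(self, mode, name, text):
        # The file is opened only on the first write to that name:
        # ">" truncates it at that moment, ">>" appends to what is there.
        if name not in self._handles:
            self._handles[name] = open(name, "w" if mode == ">" else "a")
        self._handles[name].write(text + "\n")

    def close(self):
        for handle in self._handles.values():
            handle.close()
        self._handles.clear()

with tempfile.TemporaryDirectory() as workdir:
    tap = os.path.join(workdir, "mytap.dat")
    with open(tap, "w") as fh:
        fh.write("stale\n")          # pre-existing content

    redirects = Redirects()
    for record in ("a=1", "a=2", "a=3"):
        redirects.write(">", tap, record)
    redirects.close()

    with open(tap) as fh:
        overwritten = fh.read()

# The stale line is gone, but all three records survive.
print(overwritten, end="")
```

         Had the file been truncated on every record, only the last record
         would remain.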
1987
1988   edump
1989       edump: prints all currently defined out-of-stream variables immediately
1990         to stderr as JSON.
1991
1992         Example: mlr --from f.dat put -q '@v[NR]=$*; end { edump }'
1993
1994   elif
1995       elif: the way Miller spells "else if". The body statements must be wrapped
1996       in curly braces.
1997
1998   else
1999       else: terminates an if/elif/elif chain. The body statements must be wrapped
2000       in curly braces.
2001
2002   emit
2003       emit: inserts an out-of-stream variable into the output record stream. Hashmap
2004         indices present in the data but not slotted by emit arguments are not output.
2005
2006         With >, >>, or |, the data do not become part of the output record stream but
2007         are instead redirected.
2008
2009         The > and >> are for write and append, as in the shell, but (as with awk) the
2010         file-overwrite for > is on first write, not per record. The | is for piping to
2011         a process which will process the data. There will be one open file for each
2012         distinct file name (for > and >>) or one subordinate process for each distinct
2013         value of the piped-to command (for |). Output-formatting flags are taken from
2014         the main command line.
2015
2016         You can use any of the output-format command-line flags, e.g. --ocsv, --ofs,
2017         etc., to control the format of the output if the output is redirected. See also mlr -h.
2018
2019         Example: mlr --from f.dat put 'emit >  "/tmp/data-".$a, $*'
2020         Example: mlr --from f.dat put 'emit >  "/tmp/data-".$a, mapexcept($*, "a")'
2021         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emit @sums'
2022         Example: mlr --from f.dat put --ojson '@sums[$a][$b]+=$x; emit > "tap-".$a.$b.".dat", @sums'
2023         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emit @sums, "index1", "index2"'
2024         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emit @*, "index1", "index2"'
2025         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emit >  "mytap.dat", @*, "index1", "index2"'
2026         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emit >> "mytap.dat", @*, "index1", "index2"'
2027         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emit | "gzip > mytap.dat.gz", @*, "index1", "index2"'
2028         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emit > stderr, @*, "index1", "index2"'
2029         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emit | "grep somepattern", @*, "index1", "index2"'
2030
2031         Please see http://johnkerl.org/miller/doc for more information.
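
         How emit arguments such as "index1" and "index2" turn a nested
         variable into flat records can be sketched in Python. The function
         emit_slotted is hypothetical and models only the case where every map
         level is named by an emit argument. Other shapes are rejected rather
         than guessed at.

```python
# Python model of 'emit @sums, "index1", "index2"' (illustrative only).

def emit_slotted(name, value, index_names):
    """Yield one flat record per leaf of a nested map.

    Each map level is labelled by the corresponding entry of index_names;
    the leaf value is stored under the variable's own name.
    """
    def walk(node, names, prefix):
        if not names:
            if isinstance(node, dict):
                raise ValueError("more map levels than index names: not modelled")
            yield {**prefix, name: node}
            return
        if not isinstance(node, dict):
            raise ValueError("more index names than map levels: not modelled")
        for key, child in node.items():
            yield from walk(child, names[1:], {**prefix, names[0]: key})

    yield from walk(value, list(index_names), {})

# Stand-in for @sums after '@sums[$a][$b] += $x' over a few records.
sums = {"pan": {"x": 1, "y": 2}, "eks": {"x": 3}}

for record in emit_slotted("sums", sums, ["index1", "index2"]):
    print(record)
# {'index1': 'pan', 'index2': 'x', 'sums': 1}
# {'index1': 'pan', 'index2': 'y', 'sums': 2}
# {'index1': 'eks', 'index2': 'x', 'sums': 3}
```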
2032
2033   emitf
2034       emitf: inserts non-indexed out-of-stream variable(s) side-by-side into the
2035         output record stream.
2036
2037         With >, >>, or |, the data do not become part of the output record stream but
2038         are instead redirected.
2039
2040         The > and >> are for write and append, as in the shell, but (as with awk) the
2041         file-overwrite for > is on first write, not per record. The | is for piping to
2042         a process which will process the data. There will be one open file for each
2043         distinct file name (for > and >>) or one subordinate process for each distinct
2044         value of the piped-to command (for |). Output-formatting flags are taken from
2045         the main command line.
2046
2047         You can use any of the output-format command-line flags, e.g. --ocsv, --ofs,
2048         etc., to control the format of the output if the output is redirected. See also mlr -h.
2049
2050         Example: mlr --from f.dat put '@a=$i;@b+=$x;@c+=$y; emitf @a'
2051         Example: mlr --from f.dat put --oxtab '@a=$i;@b+=$x;@c+=$y; emitf > "tap-".$i.".dat", @a'
2052         Example: mlr --from f.dat put '@a=$i;@b+=$x;@c+=$y; emitf @a, @b, @c'
2053         Example: mlr --from f.dat put '@a=$i;@b+=$x;@c+=$y; emitf > "mytap.dat", @a, @b, @c'
2054         Example: mlr --from f.dat put '@a=$i;@b+=$x;@c+=$y; emitf >> "mytap.dat", @a, @b, @c'
2055         Example: mlr --from f.dat put '@a=$i;@b+=$x;@c+=$y; emitf > stderr, @a, @b, @c'
2056         Example: mlr --from f.dat put '@a=$i;@b+=$x;@c+=$y; emitf | "grep somepattern", @a, @b, @c'
2057         Example: mlr --from f.dat put '@a=$i;@b+=$x;@c+=$y; emitf | "grep somepattern > mytap.dat", @a, @b, @c'
2058
2059         Please see http://johnkerl.org/miller/doc for more information.
2060
2061   emitp
2062       emitp: inserts an out-of-stream variable into the output record stream.
2063         Hashmap indices present in the data but not slotted by emitp arguments are
2064         output concatenated with ":".
2065
2066         With >, >>, or |, the data do not become part of the output record stream but
2067         are instead redirected.
2068
2069         The > and >> are for write and append, as in the shell, but (as with awk) the
2070         file-overwrite for > is on first write, not per record. The | is for piping to
2071         a process which will process the data. There will be one open file for each
2072         distinct file name (for > and >>) or one subordinate process for each distinct
2073         value of the piped-to command (for |). Output-formatting flags are taken from
2074         the main command line.
2075
2076         You can use any of the output-format command-line flags, e.g. --ocsv, --ofs,
2077         etc., to control the format of the output if the output is redirected. See also mlr -h.
2078
2079         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emitp @sums'
2080         Example: mlr --from f.dat put --opprint '@sums[$a][$b]+=$x; emitp > "tap-".$a.$b.".dat", @sums'
2081         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emitp @sums, "index1", "index2"'
2082         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emitp @*, "index1", "index2"'
2083         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emitp >  "mytap.dat", @*, "index1", "index2"'
2084         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emitp >> "mytap.dat", @*, "index1", "index2"'
2085         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emitp | "gzip > mytap.dat.gz", @*, "index1", "index2"'
2086         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emitp > stderr, @*, "index1", "index2"'
2087         Example: mlr --from f.dat put '@sums[$a][$b]+=$x; emitp | "grep somepattern", @*, "index1", "index2"'
2088
2089         Please see http://johnkerl.org/miller/doc for more information.
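
         The ":" concatenation described above for emitp can be sketched in
         Python for the case with no index-name arguments. The function
         emitp_flat is hypothetical and is not Miller's implementation. It
         joins the variable name and every map key on the path to each leaf.

```python
# Python model of 'emitp @sums' with no index names (illustrative only).

def emitp_flat(name, value, sep=":"):
    """Flatten a nested map into one record with prefixed, joined keys."""
    record = {}

    def walk(prefix, node):
        if isinstance(node, dict):
            for key, child in node.items():
                walk(f"{prefix}{sep}{key}", child)
        else:
            record[prefix] = node

    walk(name, value)
    return record

# Stand-in for @sums after '@sums[$a][$b] += $x' over a few records.
sums = {"pan": {"x": 1, "y": 2}, "eks": {"x": 3}}

print(emitp_flat("sums", sums))
# {'sums:pan:x': 1, 'sums:pan:y': 2, 'sums:eks:x': 3}
```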
2090
2091   end
2092       end: defines a block of statements to be executed after input records
2093       are ingested. The body statements must be wrapped in curly braces.
2094       Example: 'end { emit @count }'
2095       Example: 'end { eprint "Final count is " . @count }'
2096
2097   eprint
2098       eprint: prints expression immediately to stderr.
2099         Example: mlr --from f.dat put -q 'eprint "The sum of x and y is ".($x+$y)'
2100         Example: mlr --from f.dat put -q 'for (k, v in $*) { eprint k . " => " . v }'
2101         Example: mlr --from f.dat put  '(NR % 1000 == 0) { eprint "Checkpoint ".NR}'
2102
2103   eprintn
2104       eprintn: prints expression immediately to stderr, without trailing newline.
2105         Example: mlr --from f.dat put -q 'eprintn "The sum of x and y is ".($x+$y); eprint ""'
2106
2107   false
2108       false: the boolean literal value.
2109
2110   filter
2111       filter: includes/excludes the record in the output record stream.
2112
2113         Example: mlr --from f.dat put 'filter (NR == 2 || $x > 5.4)'
2114
2115         Instead of put with 'filter false' you can simply use put -q.  The following
2116         uses the input record to accumulate data but only prints the running sum
2117         without printing the input record:
2118
2119         Example: mlr --from f.dat put -q '@running_sum += $x * $y; emit @running_sum'
2120
2121   float
2122       float: declares a floating-point local variable in the current curly-braced scope.
2123       Type-checking happens at assignment: 'float x = 0' is an error.
2124
2125   for
2126       for: defines a for-loop using one of three styles. The body statements must
2127       be wrapped in curly braces.
2128       For-loop over stream record:
2129         Example:  'for (k, v in $*) { ... }'
2130       For-loop over out-of-stream variables:
2131         Example: 'for (k, v in @counts) { ... }'
2132         Example: 'for ((k1, k2), v in @counts) { ... }'
2133         Example: 'for ((k1, k2, k3), v in @*) { ... }'
2134       C-style for-loop:
2135         Example:  'for (var i = 0, var b = 1; i < 10; i += 1, b *= 2) { ... }'
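
       The multi-key form 'for ((k1, k2), v in @counts)' descends one map level
       per key name. The Python sketch below models that traversal with a
       hypothetical helper walk(). It is an illustration, not Miller's
       implementation.

```python
# Python model of single-key and multi-key for-loops over a nested map.

def walk(m, nkeys):
    """Yield ((k1, ..., kn), v) pairs, descending nkeys levels into m."""
    if nkeys == 1:
        for key, value in m.items():
            yield (key,), value
        return
    for key, submap in m.items():
        for keys, value in walk(submap, nkeys - 1):
            yield (key,) + keys, value

# Stand-in for @counts with two levels of keys.
counts = {"pan": {"x": 1, "y": 2}, "eks": {"x": 3}}

# Like 'for (k, v in @counts)': v is bound to each submap.
for (k,), v in walk(counts, 1):
    print(k, v)
# pan {'x': 1, 'y': 2}
# eks {'x': 3}

# Like 'for ((k1, k2), v in @counts)': v is bound to each terminal value.
for (k1, k2), v in walk(counts, 2):
    print(k1, k2, v)
# pan x 1
# pan y 2
# eks x 3
```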
2136
2137   func
2138       func: used for defining a user-defined function.
2139       Example: 'func f(a,b) { return sqrt(a**2+b**2)} $d = f($x, $y)'
2140
2141   if
2142       if: starts an if/elif/elif chain. The body statements must be wrapped
2143       in curly braces.
2144
2145   in
2146       in: used in for-loops over stream records or out-of-stream variables.
2147
2148   int
2149       int: declares an integer local variable in the current curly-braced scope.
2150       Type-checking happens at assignment: 'int x = 0.0' is an error.
2151
2152   map
2153       map: declares a map-valued local variable in the current curly-braced scope.
2154       Type-checking happens at assignment: 'map b = 0' is an error. 'map b = {}' is
2155       always OK. 'map b = a' is OK or not depending on whether a is a map.
2156
2157   num
2158       num: declares an int/float local variable in the current curly-braced scope.
2159       Type-checking happens at assignment: 'num b = true' is an error.
2160
2161   print
2162       print: prints expression immediately to stdout.
2163         Example: mlr --from f.dat put -q 'print "The sum of x and y is ".($x+$y)'
2164         Example: mlr --from f.dat put -q 'for (k, v in $*) { print k . " => " . v }'
2165         Example: mlr --from f.dat put  '(NR % 1000 == 0) { print > stderr, "Checkpoint ".NR}'
2166
2167   printn
2168       printn: prints expression immediately to stdout, without trailing newline.
2169         Example: mlr --from f.dat put -q 'printn "."; end { print "" }'
2170
2171   return
2172       return: specifies the return value from a user-defined function.
2173       Omitted return statements (including via if-branches) result in an absent-null
2174       return value, which in turn results in a skipped assignment to an LHS.
2175
2176   stderr
2177       stderr: Used for tee, emit, emitf, emitp, print, and dump in place of filename
2178         to print to standard error.
2179
2180   stdout
2181       stdout: Used for tee, emit, emitf, emitp, print, and dump in place of filename
2182         to print to standard output.
2183
2184   str
2185       str: declares a string local variable in the current curly-braced scope.
2186       Type-checking happens at assignment.
2187
2188   subr
2189       subr: used for defining a subroutine.
2190       Example: 'subr s(k,v) { print k . " is " . v} call s("a", $a)'
2191
2192   tee
2193       tee: prints the current record to specified file.
2194         This is an immediate print to the specified file (except for pprint format
2195         which of course waits until the end of the input stream to format all output).
2196
2197         The > and >> are for write and append, as in the shell, but (as with awk) the
2198         file-overwrite for > is on first write, not per record. The | is for piping to
2199         a process which will process the data. There will be one open file for each
2200         distinct file name (for > and >>) or one subordinate process for each distinct
2201         value of the piped-to command (for |). Output-formatting flags are taken from
2202         the main command line.
2203
2204         You can use any of the output-format command-line flags, e.g. --ocsv, --ofs,
2205         etc., to control the format of the output. See also mlr -h.
2206
2207         emit with redirect and tee with redirect are identical, except tee can only
2208         output $*.
2209
2210         Example: mlr --from f.dat put 'tee >  "/tmp/data-".$a, $*'
2211         Example: mlr --from f.dat put 'tee >> "/tmp/data-".$a.$b, $*'
2212         Example: mlr --from f.dat put 'tee >  stderr, $*'
2213         Example: mlr --from f.dat put -q 'tee | "tr [a-z] [A-Z]", $*'
2214         Example: mlr --from f.dat put -q 'tee | "tr [a-z] [A-Z] > /tmp/data-".$a, $*'
2215         Example: mlr --from f.dat put -q 'tee | "gzip > /tmp/data-".$a.".gz", $*'
2216         Example: mlr --from f.dat put -q --ojson 'tee | "gzip > /tmp/data-".$a.".gz", $*'
2217
2218   true
2219       true: the boolean literal value.
2220
2221   unset
2222       unset: clears field(s) from the current record, or an out-of-stream or local variable.
2223
2224         Example: mlr --from f.dat put 'unset $x'
2225         Example: mlr --from f.dat put 'unset $*'
2226         Example: mlr --from f.dat put 'for (k, v in $*) { if (k =~ "a.*") { unset $[k] } }'
2227         Example: mlr --from f.dat put '...; unset @sums'
2228         Example: mlr --from f.dat put '...; unset @sums["green"]'
2229         Example: mlr --from f.dat put '...; unset @*'
2230
2231   var
2232       var: declares an untyped local variable in the current curly-braced scope.
2233       Examples: 'var a=1', 'var xyz=""'
2234
2235   while
2236       while: introduces a while loop, or with "do", introduces a do-while loop.
2237       The body statements must be wrapped in curly braces.
2238
2239   ENV
2240       ENV: access to environment variables by name, e.g. '$home = ENV["HOME"]'
2241
2242   FILENAME
2243       FILENAME: evaluates to the name of the current file being processed.
2244
2245   FILENUM
2246       FILENUM: evaluates to the number of the current file being processed,
2247       starting with 1.
2248
2249   FNR
2250       FNR: evaluates to the number of the current record within the current file
2251       being processed, starting with 1. Resets at the start of each file.
2252
2253   IFS
2254       IFS: evaluates to the input field separator from the command line.
2255
2256   IPS
2257       IPS: evaluates to the input pair separator from the command line.
2258
2259   IRS
2260       IRS: evaluates to the input record separator from the command line,
2261       or to LF or CRLF from the input data if in autodetect mode (which is
2262       the default).
2263
2264   M_E
2265       M_E: the mathematical constant e.
2266
2267   M_PI
2268       M_PI: the mathematical constant pi.
2269
2270   NF
2271       NF: evaluates to the number of fields in the current record.
2272
2273   NR
2274       NR: evaluates to the number of the current record over all files
2275       being processed, starting with 1. Does not reset at the start of each file.
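
       The relationship between FILENAME, FILENUM, FNR, and NR can be sketched
       in Python. The helper number_records and the file names a.dat and b.dat
       are hypothetical. The sketch models only the counting rules stated in
       those entries.

```python
# Python model of FILENAME / FILENUM / FNR / NR counting (illustrative).

def number_records(files):
    """files is a list of (filename, records) pairs.

    Yields (FILENAME, FILENUM, FNR, NR, record) for every record.
    """
    nr = 0                                    # NR never resets
    for filenum, (filename, records) in enumerate(files, start=1):
        for fnr, record in enumerate(records, start=1):  # FNR restarts per file
            nr += 1
            yield filename, filenum, fnr, nr, record

files = [("a.dat", ["r1", "r2"]), ("b.dat", ["r3"])]
for row in number_records(files):
    print(row)
# ('a.dat', 1, 1, 1, 'r1')
# ('a.dat', 1, 2, 2, 'r2')
# ('b.dat', 2, 1, 3, 'r3')
```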
2276
2277   OFS
2278       OFS: evaluates to the output field separator from the command line.
2279
2280   OPS
2281       OPS: evaluates to the output pair separator from the command line.
2282
2283   ORS
2284       ORS: evaluates to the output record separator from the command line,
2285       or to LF or CRLF from the input data if in autodetect mode (which is
2286       the default).
2287

AUTHOR

2289       Miller is written by John Kerl <kerl.john.r@gmail.com>.
2290
2291       This manual page has been composed from Miller's help output by Eric
2292       MSP Veith <eveith@veith-m.de>.
2293

SEE ALSO

2295       awk(1), sed(1), cut(1), join(1), sort(1), RFC 4180: Common Format and
2296       MIME Type for Comma-Separated Values (CSV) Files, the Miller website
2297       http://johnkerl.org/miller/doc
2298
2299
2300
2301                                  2019-09-22                         MILLER(1)