NAWK(1)                     General Commands Manual                    NAWK(1)

NAME
nawk - pattern-directed scanning and processing language

SYNOPSIS
nawk [ -F fs ] [ -v var=value ] [ 'prog' | -f progfile ] [ file ... ]

DESCRIPTION
Nawk scans each input file for lines that match any of a set of
patterns specified literally in prog or in one or more files specified
as -f progfile. With each pattern there can be an associated action
that will be performed when a line of a file matches the pattern. Each
line is matched against the pattern portion of every pattern-action
statement; the associated action is performed for each matched pattern.
The file name - means the standard input. Any file of the form
var=value is treated as an assignment, not a filename, and is executed
at the time it would have been opened if it were a filename. The option
-v followed by var=value is an assignment to be done before prog is
executed; any number of -v options may be present. The -F fs option
defines the input field separator to be the regular expression fs.

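As a minimal sketch of the assignment mechanisms described above, the
following command line combines -F, -v, a var=value operand, and the
file name - for standard input. The command name awk is an assumption
(nawk is installed under that name on many systems; substitute nawk
otherwise), and the variable names prefix and n are hypothetical.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
# -F sets the field separator, -v assigns prefix before the program
# starts, and the operand n=5 is assigned just before "-" (standard
# input) would be opened.
out=$(printf 'root:x:0\ndaemon:x:1\n' |
    awk -F: -v prefix='user=' '{ print prefix $1, n }' n=5 -)
printf '%s\n' "$out"
```
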
An input line is normally made up of fields separated by white space,
or by the regular expression FS. The fields are denoted $1, $2, ...,
while $0 refers to the entire line. If FS is null, the input line is
split into one field per character.

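A short sketch of field splitting, first with the default white-space
rule and then with FS set to a regular expression. The command name awk
is an assumption (substitute nawk if that is how it is installed), and
the sample input is made up for illustration.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
# Default splitting: a run of blanks counts as a single separator.
out1=$(printf 'alpha  beta gamma\n' | awk '{ print NF ": " $2 }')

# FS as a regular expression: split on a comma or a semicolon.
out2=$(printf 'a,b;c\n' | awk 'BEGIN { FS = "[,;]" } { print $3 "-" $1 }')

printf '%s\n%s\n' "$out1" "$out2"
```
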
A pattern-action statement has the form:

    pattern { action }

A missing { action } means print the line; a missing pattern always
matches. Pattern-action statements are separated by newlines or
semicolons.

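The two defaults can be seen side by side in the sketch below: one
statement has only a pattern, one has only an action. The command name
awk is an assumption (substitute nawk where needed), and the counter
name n is hypothetical.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(printf 'one\ntwo\nthree\n' | awk '
/t/                      # pattern only: matching lines are printed
{ n++ }                  # action only: runs for every input line
END { print n, "lines" }
')
printf '%s\n' "$out"
```
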
An action is a sequence of statements. A statement can be one of the
following:

    if( expression ) statement [ else statement ]
    while( expression ) statement
    for( expression ; expression ; expression ) statement
    for( var in array ) statement
    do statement while( expression )
    break
    continue
    { [ statement ... ] }
    expression                  # commonly var = expression
    print [ expression-list ] [ > expression ]
    printf format [ , expression-list ] [ > expression ]
    return [ expression ]
    next                        # skip remaining patterns on this input line
    nextfile                    # skip rest of this file, open next, start at top
    delete array[ expression ]  # delete an array element
    delete array                # delete all elements of array
    exit [ expression ]         # exit immediately; status is expression

Statements are terminated by semicolons, newlines or right braces. An
empty expression-list stands for $0. String constants are quoted " ",
with the usual C escapes recognized within. Expressions take on string
or numeric values as appropriate, and are built using the operators
+ - * / % ^ (exponentiation), and concatenation (indicated by white
space). The operators ! ++ -- += -= *= /= %= ^= > >= < <= == != ?: are
also available in expressions. Variables may be scalars, array elements
(denoted x[i]) or fields. Variables are initialized to the null string.
Array subscripts may be any string, not necessarily numeric; this
allows for a form of associative memory. Multiple subscripts such as
[i,j,k] are permitted; the constituents are concatenated, separated by
the value of SUBSEP.

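The sketch below illustrates string subscripts and multiple subscripts.
It relies only on the rules stated above: an uninitialized element
starts out null (zero in arithmetic), and [i,j] is the single string
i SUBSEP j. The command name awk is an assumption (substitute nawk
where needed); the array names sum and seen are hypothetical.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(printf 'x 1\ny 2\nx 3\n' | awk '
{ sum[$1] += $2; seen[$1, NR] = 1 }
END {
    print "x=" sum["x"], "y=" sum["y"]
    # A multiple subscript is one string joined by SUBSEP.
    if (("x" SUBSEP 3) in seen) print "x seen on line 3"
}')
printf '%s\n' "$out"
```
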
The print statement prints its arguments on the standard output (or on
a file if > file or >> file is present or on a pipe if | cmd is
present), separated by the current output field separator, and
terminated by the output record separator. file and cmd may be literal
names or parenthesized expressions; identical string values in
different statements denote the same open file. The printf statement
formats its expression list according to the format (see printf(3)).
The built-in function close(expr) closes the file or pipe expr. The
built-in function fflush(expr) flushes any buffered output for the file
or pipe expr.

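A small sketch of printing to a pipe: every print with the same command
string writes to the same open pipe, and close with that string ends
it. The command name awk is an assumption (substitute nawk where
needed), and sort(1) is assumed to be available.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(printf '3\n1\n2\n' | awk '
{ print $1 | "sort -n" }     # all records go to the same open pipe
END {
    close("sort -n")         # wait for sort to finish writing
    print "done"
}')
printf '%s\n' "$out"
```

Closing the pipe in END before printing is a deliberate choice here: it
lets sort emit its output before the final line is written.
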
The mathematical functions atan2, cos, exp, log, sin, and sqrt are
built in. Other built-in functions:

length
    the length of its argument taken as a string, number of elements
    in an array for an array argument, or length of $0 if no argument.

rand
    random number on [0,1)

srand
    sets seed for rand and returns the previous seed.

int
    truncates to an integer value

substr(s, m [, n])
    the n-character substring of s that begins at position m counted
    from 1. If n is omitted, the rest of the string is used.

index(s, t)
    the position in s where the string t occurs, or 0 if it does not.

match(s, r)
    the position in s where the regular expression r occurs, or 0 if
    it does not. The variables RSTART and RLENGTH are set to the
    position and length of the matched string.

split(s, a [, fs])
    splits the string s into array elements a[1], a[2], ..., a[n],
    and returns n. The separation is done with the regular expression
    fs or with the field separator FS if fs is not given. An empty
    string as field separator splits the string into one array element
    per character.

sub(r, t [, s])
    substitutes t for the first occurrence of the regular expression
    r in the string s. If s is not given, $0 is used.

gsub(r, t [, s])
    same as sub except that all occurrences of the regular expression
    are replaced; sub and gsub return the number of replacements.

sprintf(fmt, expr, ...)
    the string resulting from formatting expr ... according to the
    printf(3) format fmt.

system(cmd)
    executes cmd and returns its exit status. This will be -1 upon
    error, cmd's exit status upon a normal exit, 256 + sig upon
    death-by-signal, where sig is the number of the murdering signal,
    or 512 + sig if there was a core dump.

tolower(str)
    returns a copy of str with all upper-case characters translated
    to their corresponding lower-case equivalents.

toupper(str)
    returns a copy of str with all lower-case characters translated
    to their corresponding upper-case equivalents.

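Several of the string functions listed above are exercised in the
sketch below. The command name awk is an assumption (substitute nawk
where needed); the variable names s, t, k, n and parts are hypothetical.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(awk 'BEGIN {
    s = "hello, world"
    print substr(s, 8)            # no n: rest of the string from position 8
    print substr(s, 1, 5)
    print index(s, "world")
    if (match(s, /o+/)) print RSTART, RLENGTH
    n = split("a:b:c", parts, ":")
    print n, parts[3]
    t = s
    k = gsub(/o/, "0", t)         # k is the number of replacements made
    print k, t
    print toupper(substr(s, 1, 1)) substr(s, 2)
    print length(s), int(-3.7)
}')
printf '%s\n' "$out"
```
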
The "function" getline sets $0 to the next input record from the
current input file; getline < file sets $0 to the next record from
file. getline x sets variable x instead. Finally, cmd | getline pipes
the output of cmd into getline; each call of getline returns the next
line of output from cmd. In all cases, getline returns 1 for a
successful input, 0 for end of file, and -1 for an error.

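A minimal sketch of the cmd | getline form, using the return value to
stop at end of file. The command name awk is an assumption (substitute
nawk where needed); the names cmd, line and n are hypothetical.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(awk 'BEGIN {
    cmd = "echo a; echo b"
    # The parentheses matter: compare the return value of getline to 0.
    while ((cmd | getline line) > 0) n++
    close(cmd)
    print n, line
}')
printf '%s\n' "$out"
```

Testing for > 0 rather than for a non-zero value keeps the loop from
spinning forever when getline returns -1 on error.
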
Patterns are arbitrary Boolean combinations (with ! || &&) of regular
expressions and relational expressions. Regular expressions are as in
egrep; see grep(1). Isolated regular expressions in a pattern apply to
the entire line. Regular expressions may also occur in relational
expressions, using the operators ~ and !~. /re/ is a constant regular
expression; any string (constant or variable) may be used as a regular
expression, except in the position of an isolated regular expression in
a pattern.

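The sketch below shows an isolated regular expression combined with a
relational expression, a string variable used as a regular expression,
and the !~ operator. The command name awk is an assumption (substitute
nawk where needed); the variable name re and the sample data are
hypothetical.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(printf 'apple 10\nbanana 20\ncherry 30\n' | awk -v re='^b' '
/an/ && $2 > 15 { print "both:", $1 }     # isolated regex tests the whole line
$1 ~ re         { print "dynamic:", $1 }  # a string variable used as a regex
$1 !~ /e/       { print "no e:", $1 }
')
printf '%s\n' "$out"
```
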
A pattern may consist of two patterns separated by a comma; in this
case, the action is performed for all lines from an occurrence of the
first pattern through an occurrence of the second.

A relational expression is one of the following:

    expression matchop regular-expression
    expression relop expression
    expression in array-name
    (expr,expr,...) in array-name

where a relop is any of the six relational operators in C, and a
matchop is either ~ (matches) or !~ (does not match). A conditional is
an arithmetic expression, a relational expression, or a Boolean
combination of these.

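Each of the four forms above yields 1 or 0, as this sketch shows by
storing the results before printing them. The command name awk is an
assumption (substitute nawk where needed); the names a, b and r1
through r5 are hypothetical.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(awk 'BEGIN {
    a["x"] = 1; b[1, 2] = 1
    r1 = ("x" in a)          # expression in array-name
    r2 = ("y" in a)          # the test does not create a["y"]
    r3 = ((1, 2) in b)       # (expr,expr) in array-name
    r4 = (10 < 9)            # expression relop expression
    r5 = ("abc" ~ /^a/)      # expression matchop regular-expression
    print r1, r2, r3, r4, r5
}')
printf '%s\n' "$out"
```
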
The special patterns BEGIN and END may be used to capture control
before the first input line is read and after the last. BEGIN and END
do not combine with other patterns. They may appear multiple times in
a program and execute in the order they are read by awk.

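A brief sketch of the ordering rule: both BEGIN actions run before any
input is read, in the order written, even though an END statement sits
between them. The command name awk is an assumption (substitute nawk
where needed).

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(printf 'x\ny\n' | awk '
BEGIN { print "first BEGIN" }
END   { print "records:", NR }
BEGIN { print "second BEGIN" }
')
printf '%s\n' "$out"
```
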
Variable names with special meanings:

ARGC
    argument count, assignable.

ARGV
    argument array, assignable; non-null members are taken as
    filenames.

CONVFMT
    conversion format used when converting numbers (default %.6g).

ENVIRON
    array of environment variables; subscripts are names.

FILENAME
    the name of the current input file.

FNR
    ordinal number of the current record in the current file.

FS
    regular expression used to separate fields; also settable by
    option -Ffs.

NF
    number of fields in the current record.

NR
    ordinal number of the current record.

OFMT
    output format for numbers (default %.6g).

OFS
    output field separator (default space).

ORS
    output record separator (default newline).

RLENGTH
    the length of a string matched by match.

RS
    input record separator (default newline).

RSTART
    the start position of a string matched by match.

SUBSEP
    separates multiple subscripts (default 034).

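A short sketch using a few of the variables above: NR, NF and $NF on
each record, OFS between printed fields, and OFMT for a non-integer
number. The command name awk is an assumption (substitute nawk where
needed), and the sample input is made up.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(printf 'a b c\nd e\n' | awk '
BEGIN { OFS = ":"; OFMT = "%.2f" }
{ print NR, NF, $NF }        # record number, field count, last field
END { print NR / 3 }         # a non-integer value is printed using OFMT
')
printf '%s\n' "$out"
```
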
Functions may be defined (at the position of a pattern-action
statement) thus:

    function foo(a, b, c) { ...; return x }

Parameters are passed by value if scalar and by reference if array
name; functions may be called recursively. Parameters are local to the
function; all other variables are global. Thus local variables may be
created by providing excess parameters in the function definition.

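The sketch below touches each point in the paragraph above: recursion,
an array passed by reference, and extra parameters used as locals. The
command name awk is an assumption (substitute nawk where needed); the
function names fact and fill and all variable names are hypothetical.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(awk '
# fact is recursive; n is a scalar and so is passed by value.
function fact(n) { return (n <= 1) ? 1 : n * fact(n - 1) }

# arr is passed by reference; i and total are extra parameters that
# act as local variables (callers simply omit them).
function fill(arr, n,    i, total) {
    for (i = 1; i <= n; i++) { arr[i] = i * i; total += arr[i] }
    return total
}
BEGIN {
    i = 99                   # global i is untouched by the local i in fill
    s = fill(sq, 3)
    print fact(5), s, sq[3], i
}')
printf '%s\n' "$out"
```

Separating the extra parameters from the real ones with additional
white space is only a common convention for readability; it has no
meaning to the language.
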
EXAMPLES
length($0) > 72
    Print lines longer than 72 characters.

{ print $2, $1 }
    Print first two fields in opposite order.

BEGIN { FS = ",[ \t]*|[ \t]+" }
      { print $2, $1 }
    Same, with input fields separated by comma and/or spaces and tabs.

    { s += $1 }
END { print "sum is", s, " average is", s/NR }
    Add up first column, print sum and average.

/start/, /stop/
    Print all lines between start/stop pairs.

BEGIN { # Simulate echo(1)
        for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
        printf "\n"
        exit }

SEE ALSO
grep(1), lex(1), sed(1)

A. V. Aho, B. W. Kernighan, P. J. Weinberger, The AWK Programming
Language, Addison-Wesley, 1988. ISBN 0-201-07981-X.

BUGS
There are no explicit conversions between numbers and strings. To
force an expression to be treated as a number add 0 to it; to force it
to be treated as a string concatenate "" to it.

The scope rules for variables in functions are a botch; the syntax is
worse.

POSIX-standard interval expressions in regular expressions are not
supported.

Only eight-bit character sets are handled correctly.

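The two conversion idioms from the first note above can be sketched as
follows. The command name awk is an assumption (substitute nawk where
needed); the variable names a, b, n and r1 through r3 are hypothetical.

```shell
# Assumption: nawk is installed as "awk"; substitute nawk otherwise.
out=$(awk 'BEGIN {
    a = "10"; b = "9"
    r1 = (a < b)             # two strings: compared as text, "10" sorts first
    r2 = (a + 0 < b + 0)     # adding 0 forces a numeric comparison
    n = 3
    r3 = ((n "") == "3")     # concatenating "" forces a string
    print r1, r2, r3
}')
printf '%s\n' "$out"
```
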