NAWK(1)                  General Commands Manual                 NAWK(1)

NAME
nawk - pattern-directed scanning and processing language

SYNOPSIS
nawk [ -F fs ] [ -v var=value ] [ 'prog' | -f progfile ] [ file ... ]

DESCRIPTION
Nawk scans each input file for lines that match any of a set of
patterns specified literally in prog or in one or more files specified
as -f progfile. With each pattern there can be an associated action
that will be performed when a line of a file matches the pattern. Each
line is matched against the pattern portion of every pattern-action
statement; the associated action is performed for each matched pattern.
The file name - means the standard input. Any file of the form
var=value is treated as an assignment, not a filename, and is executed
at the time it would have been opened if it were a filename. The option
-v followed by var=value is an assignment to be done before prog is
executed; any number of -v options may be present. The -F fs option
defines the input field separator to be the regular expression fs.
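
The three ways of passing settings described above can be sketched as
follows. This is a minimal illustration, not part of the original page:
the sample data and the names prefix and bonus are made up, and the
interpreter is invoked as awk on the assumption that it is
nawk-compatible (substitute nawk where it is installed).

```shell
# Hypothetical colon-separated input; names and values are illustrative only.
out=$(printf 'alice:30\nbob:25\n' |
    awk -F: -v prefix='user-' '{ print prefix $1, $2 + bonus }' bonus=5 -)
# -F:        sets the field separator to a colon
# -v prefix  is assigned before the program starts
# bonus=5    is an operand assignment, executed just before "-" (stdin) is read
printf '%s\n' "$out"
```

With the sample input this should print "user-alice 35" and
"user-bob 30".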

An input line is normally made up of fields separated by white space,
or by regular expression FS. The fields are denoted $1, $2, ..., while
$0 refers to the entire line. If FS is null, the input line is split
into one field per character.

A pattern-action statement has the form

        pattern { action }

A missing { action } means print the line; a missing pattern always
matches. Pattern-action statements are separated by newlines or
semicolons.

An action is a sequence of statements. A statement can be one of the
following:

        if( expression ) statement [ else statement ]
        while( expression ) statement
        for( expression ; expression ; expression ) statement
        for( var in array ) statement
        do statement while( expression )
        break
        continue
        { [ statement ... ] }
        expression                      # commonly var = expression
        print [ expression-list ] [ > expression ]
        printf format [ , expression-list ] [ > expression ]
        return [ expression ]
        next                            # skip remaining patterns on this input line
        nextfile                        # skip rest of this file, open next, start at top
        delete array[ expression ]      # delete an array element
        delete array                    # delete all elements of array
        exit [ expression ]             # exit immediately; status is expression

Statements are terminated by semicolons, newlines or right braces. An
empty expression-list stands for $0. String constants are quoted " ",
with the usual C escapes recognized within. Expressions take on string
or numeric values as appropriate, and are built using the operators
+ - * / % ^ (exponentiation), and concatenation (indicated by white
space). The operators ! ++ -- += -= *= /= %= ^= > >= < <= == != ?: are
also available in expressions. Variables may be scalars, array elements
(denoted x[i]) or fields. Variables are initialized to the null string.
Array subscripts may be any string, not necessarily numeric; this
allows for a form of associative memory. Multiple subscripts such as
[i,j,k] are permitted; the constituents are concatenated, separated by
the value of SUBSEP.
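
As an illustration of multiple subscripts, the sketch below stores
values under two-part keys and then shows that such a key is just the
parts joined by SUBSEP. The input data and the array names cell and
part are hypothetical; awk is assumed to be nawk-compatible.

```shell
out=$(printf 'x 1 10\nx 2 20\ny 1 30\n' | awk '
    { cell[$1, $2] = $3 }                     # key is $1 SUBSEP $2
    END {
        if (("x", 2) in cell)                 # membership test on a two-part key
            print "x,2 ->", cell["x", 2]
        split("y" SUBSEP "1", part, SUBSEP)   # take a joined key apart again
        print part[1], part[2], cell["y" SUBSEP 1]
    }')
printf '%s\n' "$out"
```

With the sample input this should print "x,2 -> 20" followed by
"y 1 30".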

The print statement prints its arguments on the standard output (or on
a file if >file or >>file is present or on a pipe if |cmd is present),
separated by the current output field separator, and terminated by the
output record separator. file and cmd may be literal names or
parenthesized expressions; identical string values in different
statements denote the same open file. The printf statement formats its
expression list according to the format (see printf(3)). The built-in
function close(expr) closes the file or pipe expr. The built-in
function fflush(expr) flushes any buffered output for the file or pipe
expr.
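
A small sketch of printing to a pipe and closing it, under the
assumption that sort(1) is available and that awk is nawk-compatible;
the input words are made up. Because the string "sort" is identical in
every print, all records go to the same open pipe.

```shell
out=$(printf 'pear\napple\nfig\n' | awk '
    { print $1 | "sort" }     # every record goes to the one open pipe
    END {
        close("sort")         # wait for sort to finish before printing more
        print "lines:", NR
    }')
printf '%s\n' "$out"
```

Calling close before the final print is what keeps the summary line
after the sorted output; without it the ordering would depend on
buffering.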

The mathematical functions exp, log, sqrt, sin, cos, and atan2 are
built in. Other built-in functions:

length  the length of its argument taken as a string, or of $0 if no
        argument.

rand    random number on [0,1)

srand   sets seed for rand and returns the previous seed.

int     truncates to an integer value

substr(s, m, n)
        the n-character substring of s that begins at position m counted
        from 1.

index(s, t)
        the position in s where the string t occurs, or 0 if it does
        not.

match(s, r)
        the position in s where the regular expression r occurs, or 0 if
        it does not. The variables RSTART and RLENGTH are set to the
        position and length of the matched string.

split(s, a, fs)
        splits the string s into array elements a[1], a[2], ..., a[n],
        and returns n. The separation is done with the regular
        expression fs or with the field separator FS if fs is not given.
        An empty string as field separator splits the string into one
        array element per character.

sub(r, t, s)
        substitutes t for the first occurrence of the regular expression
        r in the string s. If s is not given, $0 is used.

gsub    same as sub except that all occurrences of the regular
        expression are replaced; sub and gsub return the number of
        replacements.

sprintf(fmt, expr, ... )
        the string resulting from formatting expr ... according to the
        printf(3) format fmt

system(cmd)
        executes cmd and returns its exit status

tolower(str)
        returns a copy of str with all upper-case characters translated
        to their corresponding lower-case equivalents.

toupper(str)
        returns a copy of str with all lower-case characters translated
        to their corresponding upper-case equivalents.
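
Several of the string functions above can be exercised together as in
the following sketch. The string and variable names are illustrative
only, and awk is assumed to be nawk-compatible.

```shell
out=$(awk 'BEGIN {
    s = "hello, world"
    print substr(s, 1, 5)                        # first five characters
    print index(s, "world")                      # position of a fixed string
    if (match(s, /o+/)) print RSTART, RLENGTH    # position and length of match
    n = split("a:b:c", parts, ":")
    print n, parts[3]                            # element count and last element
    t = s
    count = gsub(/o/, "0", t)                    # modifies t, returns replacements
    print count, t
    print toupper(substr(s, 8, 5))
}')
printf '%s\n' "$out"
```

The six lines printed should be "hello", "8", "5 1", "3 c",
"2 hell0, w0rld" and "WORLD".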

The "function" getline sets $0 to the next input record from the
current input file; getline <file sets $0 to the next record from file.
getline x sets variable x instead. Finally, cmd | getline pipes the
output of cmd into getline; each call of getline returns the next line
of output from cmd. In all cases, getline returns 1 for a successful
input, 0 for end of file, and -1 for an error.
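
The return values matter when looping over a command's output: testing
for > 0 rather than for a non-zero value keeps an error (-1) from
turning into an endless loop. A minimal sketch, with a made-up command
string and variable names, assuming awk is nawk-compatible:

```shell
out=$(awk 'BEGIN {
    cmd = "echo one; echo two"             # hypothetical command for illustration
    while ((cmd | getline line) > 0) {     # 1 = record read, 0 = EOF, -1 = error
        n++
        print n ": " line
    }
    close(cmd)                             # release the pipe once drained
}')
printf '%s\n' "$out"
```

This should print "1: one" and "2: two".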

Patterns are arbitrary Boolean combinations (with ! || &&) of regular
expressions and relational expressions. Regular expressions are as in
egrep; see grep(1). Isolated regular expressions in a pattern apply to
the entire line. Regular expressions may also occur in relational
expressions, using the operators ~ and !~. /re/ is a constant regular
expression; any string (constant or variable) may be used as a regular
expression, except in the position of an isolated regular expression in
a pattern.

A pattern may consist of two patterns separated by a comma; in this
case, the action is performed for all lines from an occurrence of the
first pattern through an occurrence of the second.

A relational expression is one of the following:

        expression matchop regular-expression
        expression relop expression
        expression in array-name
        (expr,expr,...) in array-name

where a relop is any of the six relational operators in C, and a
matchop is either ~ (matches) or !~ (does not match). A conditional is
an arithmetic expression, a relational expression, or a Boolean
combination of these.
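
The pattern forms above can be combined as in this sketch, which uses a
matchop, a relop and an array membership test. The input lines and the
array name want are hypothetical; awk is assumed to be nawk-compatible.

```shell
out=$(printf 'alpha 5\nbeta 12\nalto 20\n' | awk '
    BEGIN { want["beta"] = 1 }
    $1 ~ /^al/ && $2 > 10 { print "regex+relop:", $1 }   # Boolean combination
    ($1 in want)          { print "in array:", $1 }      # membership as a pattern
')
printf '%s\n' "$out"
```

Only "beta" is a subscript of want, and only "alto" both starts with
"al" and has a second field above 10, so the output should be
"in array: beta" followed by "regex+relop: alto".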

The special patterns BEGIN and END may be used to capture control
before the first input line is read and after the last. BEGIN and END
do not combine with other patterns.

Variable names with special meanings:

CONVFMT
        conversion format used when converting numbers (default %.6g)

FS      regular expression used to separate fields; also settable by
        option -Ffs.

NF      number of fields in the current record

NR      ordinal number of the current record

FNR     ordinal number of the current record in the current file

FILENAME
        the name of the current input file

RS      input record separator (default newline)

OFS     output field separator (default blank)

ORS     output record separator (default newline)

OFMT    output format for numbers (default %.6g)

SUBSEP  separates multiple subscripts (default 034)

ARGC    argument count, assignable

ARGV    argument array, assignable; non-null members are taken as
        filenames

ENVIRON
        array of environment variables; subscripts are names.
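
A brief sketch of a few of these variables in use, with made-up input
and awk assumed to be nawk-compatible. OFS is set in BEGIN so that it
takes effect before the first record is printed.

```shell
out=$(printf 'a b c\nd e\n' | awk '
    BEGIN { OFS = "-" }          # separator placed between print arguments
    { print NR, NF, $1 }         # record number, field count, first field
')
printf '%s\n' "$out"
```

This should print "1-3-a" and "2-2-d".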

Functions may be defined (at the position of a pattern-action
statement) thus:

        function foo(a, b, c) { ...; return x }

Parameters are passed by value if scalar and by reference if array
name; functions may be called recursively. Parameters are local to the
function; all other variables are global. Thus local variables may be
created by providing excess parameters in the function definition.
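
The parameter rules can be seen in the following sketch. The function
names sum_to, fact and fill are invented for illustration, and the
extra spaces in the parameter list are only a common convention for
marking the excess parameters used as locals; awk is assumed to be
nawk-compatible.

```shell
out=$(awk '
    # n is the real parameter; i and total are excess parameters used as locals
    function sum_to(n,    i, total) {
        for (i = 1; i <= n; i++) total += i
        return total
    }
    function fact(n) { return n <= 1 ? 1 : n * fact(n - 1) }   # recursion
    function fill(arr) { arr["k"] = "set" }     # arrays are passed by reference
    BEGIN {
        i = "global"                 # must survive the call to sum_to
        print sum_to(4), fact(5), i
        a["k"] = "unset"
        fill(a)
        print a["k"]                 # changed by fill through the reference
    }')
printf '%s\n' "$out"
```

This should print "10 120 global" and then "set": the global i is
untouched by the local i inside sum_to, while the array element is
changed by fill.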

EXAMPLES
length($0) > 72
        Print lines longer than 72 characters.

{ print $2, $1 }
        Print first two fields in opposite order.

BEGIN { FS = ",[ \t]*|[ \t]+" }
      { print $2, $1 }
        Same, with input fields separated by comma and/or blanks and
        tabs.

      { s += $1 }
END   { print "sum is", s, " average is", s/NR }
        Add up first column, print sum and average.

/start/, /stop/
        Print all lines between start/stop pairs.

BEGIN {     # Simulate echo(1)
        for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
        printf "\n"
        exit }

SEE ALSO
lex(1), sed(1)

A. V. Aho, B. W. Kernighan, P. J. Weinberger, The AWK Programming
Language, Addison-Wesley, 1988. ISBN 0-201-07981-X

BUGS
There are no explicit conversions between numbers and strings. To
force an expression to be treated as a number add 0 to it; to force it
to be treated as a string concatenate "" to it.

The scope rules for variables in functions are a botch; the syntax is
worse.
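
The conversion idiom noted at the start of this section changes how a
comparison is carried out, as the sketch below shows. The variable
names are illustrative, and awk is assumed to be nawk-compatible.

```shell
out=$(awk 'BEGIN {
    a = "10"; b = "9"            # string constants, so compared as strings
    print (a < b)                # "10" sorts before "9" as text
    print (a + 0 < b + 0)        # adding 0 forces a numeric comparison
}')
printf '%s\n' "$out"
```

This should print 1 for the string comparison and 0 for the numeric
one.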