awk(1)                          User Commands                          awk(1)

NAME
awk - pattern scanning and processing language

SYNOPSIS
/usr/bin/awk [-f progfile] [-Fc] [' prog '] [parameters]
    [filename]...

/usr/xpg4/bin/awk [-FcERE] [-v assignment]... 'program' -f progfile...
    [argument]...

DESCRIPTION
The /usr/xpg4/bin/awk utility is described on the nawk(1) manual page.

The /usr/bin/awk utility scans each input filename for lines that match
any of a set of patterns specified in prog. The prog string must be
enclosed in single quotes (') to protect it from the shell. For each
pattern in prog there can be an associated action performed when a line
of a filename matches the pattern. The set of pattern-action statements
can appear literally as prog or in a file specified with the -f progfile
option. Input files are read in order; if there are no files, the
standard input is read. The file name '-' means the standard input.
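
As an illustration only, the two ways of supplying a program described
above might look as follows. The sketch assumes a POSIX shell and a
POSIX-conforming awk on PATH (the legacy /usr/bin/awk may differ in
detail); the sample input, the temporary program file, and the shell
variable names are made up for the example.

```shell
# Program given literally, enclosed in single quotes; input comes from
# the standard input because no filename operand is given.
inline=$(printf 'alpha\nbeta\n' | awk '/beta/')

# The same program read from a file with -f; "-" names the standard input.
progfile=$(mktemp)            # hypothetical temporary program file
printf '/beta/\n' > "$progfile"
fromfile=$(printf 'alpha\nbeta\n' | awk -f "$progfile" -)
rm -f "$progfile"

printf '%s\n%s\n' "$inline" "$fromfile"
```

Both invocations select the same line, since the program text is
identical in each case.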

OPTIONS
The following options are supported:

-f progfile    awk uses the set of patterns it reads from progfile.

-Fc            Uses the character c as the field separator (FS) character.
               See the discussion of FS below.
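
A minimal sketch of the -Fc option, assuming a POSIX shell and a
POSIX-conforming awk; the colon-separated sample record and the shell
variable name are invented for the example.

```shell
# -F: makes the colon the field separator, so $1 and $3 are the first
# and third colon-separated fields of the sample record.
fields=$(printf 'root:x:0:0\n' | awk -F: '{ print $1, $3 }')
printf '%s\n' "$fields"
```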

USAGE
Input Lines
Each input line is matched against the pattern portion of every
pattern-action statement; the associated action is performed for each
matched pattern. Any filename of the form var=value is treated as an
assignment, not a filename, and is executed at the time it would have
been opened if it were a filename. Variables assigned in this manner
are not available inside a BEGIN rule, and are assigned after previously
specified files have been read.
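
The timing of var=value operands can be sketched as follows. This is an
illustration under assumptions, not a definitive test: it presumes a
POSIX shell and a POSIX-conforming awk, and the variable name tag, the
temporary directory, and the file names one and two are hypothetical.

```shell
# Two small input files in a scratch directory (names are illustrative).
dir=$(mktemp -d)
printf 'a\n' > "$dir/one"
printf 'b\n' > "$dir/two"

# tag is still unset while BEGIN runs; each assignment takes effect just
# before the file operand that follows it is opened.
out=$(awk 'BEGIN { print "BEGIN sees [" tag "]" } { print tag, $0 }' \
    tag=first "$dir/one" tag=second "$dir/two")
rm -rf "$dir"

printf '%s\n' "$out"
```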

An input line is normally made up of fields separated by white space.
(This default can be changed by using the FS built-in variable or the
-Fc option.) The default is to ignore leading blanks and to separate
fields by blanks and/or tab characters. However, if FS is assigned a
value that does not include any white space character, then leading
blanks are not ignored. The fields are denoted $1, $2, ...; $0 refers
to the entire line.
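
The treatment of leading blanks can be checked with a sketch like the
one below, assuming a POSIX shell and a POSIX-conforming awk. The sample
lines and shell variable names are invented; the brackets are printed
only to make any leading blank in $1 visible.

```shell
# Default splitting: leading blanks are ignored, so $1 is "a".
default_split=$(printf '  a b\n' | awk '{ print NF ":[" $1 "]" }')

# FS set to a colon: the leading blank stays part of the first field.
colon_split=$(printf ' a:b\n' | awk -F: '{ print NF ":[" $1 "]" }')

printf '%s\n%s\n' "$default_split" "$colon_split"
```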

Pattern-action Statements
A pattern-action statement has the form:

    pattern { action }

Either pattern or action can be omitted. If there is no action, the
matching line is printed. If there is no pattern, the action is performed
on every input line. Pattern-action statements are separated by
newlines or semicolons.
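
A short sketch of the two omissions, assuming a POSIX shell and a
POSIX-conforming awk; the input words and shell variable names are
illustrative only.

```shell
# Pattern without an action: matching lines are printed unchanged.
no_action=$(printf 'keep\ndrop\n' | awk '/keep/')

# Action without a pattern: the action runs for every input line.
no_pattern=$(printf 'a\nb\n' | awk '{ print "line:", $0 }')

printf '%s\n%s\n' "$no_action" "$no_pattern"
```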

Patterns are arbitrary Boolean combinations (!, ||, &&, and parentheses)
of relational expressions and regular expressions. A relational
expression is one of the following:

    expression relop expression
    expression matchop regular_expression

where a relop is any of the six relational operators in C, and a
matchop is either ~ (contains) or !~ (does not contain). An expression
is an arithmetic expression, a relational expression, the special
expression

    var in array

or a Boolean combination of these.
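
As a hedged illustration of relational patterns, the sketch below
combines a matchop test with a numeric comparison. It assumes a POSIX
shell and a POSIX-conforming awk; the fruit data and shell variable
names are made up.

```shell
# Sample records: a name in $1 and a count in $2.
data='apple 5
avocado 12
banana 20'

# $1 must contain a match for /^a/ AND $2 must exceed 10.
matched=$(printf '%s\n' "$data" | awk '$1 ~ /^a/ && $2 > 10')

# !~ selects the records whose first field does not match.
unmatched=$(printf '%s\n' "$data" | awk '$1 !~ /^a/ { print $1 }')

printf '%s\n%s\n' "$matched" "$unmatched"
```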

Regular expressions are as in egrep(1). In patterns they must be
surrounded by slashes. Isolated regular expressions in a pattern apply to
the entire line. Regular expressions can also occur in relational
expressions. A pattern can consist of two patterns separated by a
comma; in this case, the action is performed for all lines from an
occurrence of the first pattern through the next occurrence of the
second pattern.

The special patterns BEGIN and END can be used to capture control
before the first input line has been read and after the last input line
has been read, respectively. These keywords do not combine with any
other patterns.
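
A minimal sketch of BEGIN and END, assuming a POSIX shell and a
POSIX-conforming awk; the three-line input and the shell variable name
are invented for the example.

```shell
# BEGIN runs before any input is read; END runs after the last line,
# at which point NR holds the number of records read.
report=$(printf 'a\nb\nc\n' |
    awk 'BEGIN { print "start" } END { print NR, "lines" }')
printf '%s\n' "$report"
```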

Built-in Variables
Built-in variables include:

FILENAME    name of the current input file

FS          input field separator regular expression (default blank
            and tab)

NF          number of fields in the current record

NR          ordinal number of the current record

OFMT        output format for numbers (default %.6g)

OFS         output field separator (default blank)

ORS         output record separator (default new-line)

RS          input record separator (default new-line)
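
The effect of several of these variables can be seen in a sketch such as
the following, which assumes a POSIX shell and a POSIX-conforming awk.
The input records and the shell variable name are illustrative; the
separators "-" and "|" are arbitrary choices made to keep the output
easy to read.

```shell
# OFS is placed between the arguments of print and ORS after each
# record; NR and NF report the record number and its field count.
summary=$(printf 'a b c\nd e\n' |
    awk 'BEGIN { OFS = "-"; ORS = "|" } { print NR, NF, $1 }')
printf '%s\n' "$summary"
```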

An action is a sequence of statements. A statement can be one of the
following:

    if ( expression ) statement [ else statement ]
    while ( expression ) statement
    do statement while ( expression )
    for ( expression ; expression ; expression ) statement
    for ( var in array ) statement
    break
    continue
    { [ statement ] ... }
    expression      # commonly variable = expression
    print [ expression-list ] [ >expression ]
    printf format [ ,expression-list ] [ >expression ]
    next            # skip remaining patterns on this input line
    exit [expr]     # skip the rest of the input; exit status is expr

Statements are terminated by semicolons, newlines, or right braces. An
empty expression-list stands for the whole input line. Expressions take
on string or numeric values as appropriate, and are built using the
operators +, -, *, /, %, ^ and concatenation (indicated by a blank).
The operators ++, --, +=, -=, *=, /=, %=, ^=, >, >=, <, <=, ==, !=, and
?: are also available in expressions. Variables can be scalars, array
elements (denoted x[i]), or fields. Variables are initialized to the
null string or zero. Array subscripts can be any string, not necessarily
numeric; this allows for a form of associative memory. String constants
are quoted (""), with the usual C escapes recognized within.
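
The associative arrays and operators described above might be exercised
as in the sketch below. It assumes a POSIX shell and a POSIX-conforming
awk; the array name count, the colour words, and the shell variable
names are hypothetical.

```shell
# String subscripts: count occurrences of each distinct first field.
counts=$(printf 'red\nblue\nred\n' |
    awk '{ count[$1]++ }
         END { print "red=" count["red"], "blue=" count["blue"] }')

# Exponentiation, compound assignment, concatenation, and remainder.
arith=$(awk 'BEGIN { x = 2 ^ 10; x += 1; print "x is " x, 7 % 3 }')

printf '%s\n%s\n' "$counts" "$arith"
```

Uninitialized array elements start out as zero here, which is why the
counters need no explicit setup.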

The print statement prints its arguments on the standard output, or on
a file if >expression is present, or on a pipe if '|cmd' is present.
The output of the print statement is terminated by the output record
separator, with each argument separated by the current output field
separator. The printf statement formats its expression list according
to the format (see printf(3C)).
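
The three output destinations and printf can be sketched as follows,
assuming a POSIX shell, a POSIX-conforming awk, and the sort utility on
PATH. The awk variable out, the temporary file, the sample data, and
the shell variable names are invented for the example.

```shell
# printf formats its arguments and adds no separators of its own.
formatted=$(printf 'pi 3.14159\n' | awk '{ printf "%s=%.2f\n", $1, $2 }')

# print to a pipe: the records are handed to sort, which writes them out.
sorted=$(printf 'b\na\n' | awk '{ print $1 | "sort" }')

# print to a file named by an expression (here the awk variable out,
# set by a var=value operand; the file is a hypothetical scratch file).
outfile=$(mktemp)
printf 'x y\n' | awk '{ print $2, $1 > out }' out="$outfile" -
saved=$(cat "$outfile")
rm -f "$outfile"

printf '%s\n%s\n%s\n' "$formatted" "$sorted" "$saved"
```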

Built-in Functions
The arithmetic functions are as follows:

cos(x)     Return cosine of x, where x is in radians. (In
           /usr/xpg4/bin/awk only. See nawk(1).)

sin(x)     Return sine of x, where x is in radians. (In
           /usr/xpg4/bin/awk only. See nawk(1).)

exp(x)     Return the exponential function of x.

log(x)     Return the natural logarithm of x.

sqrt(x)    Return the square root of x.

int(x)     Truncate its argument to an integer. It is truncated toward
           0 when x > 0.
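
A small sketch of the arithmetic functions available in both versions,
assuming a POSIX shell and a POSIX-conforming awk; the argument values
and the shell variable name are chosen only for illustration.

```shell
# int() drops the fractional part of a positive value; the other
# arguments are picked so that the results are exact whole numbers.
values=$(awk 'BEGIN { print int(3.9), sqrt(16), exp(0), log(1) }')
printf '%s\n' "$values"
```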

The string functions are as follows:

index(s, t)

    Return the position in string s where string t first occurs, or 0
    if it does not occur at all.

int(s)

    Truncate s to an integer value. If s is not specified, $0 is used.

length(s)

    Return the length of its argument taken as a string, or of the
    whole line if there is no argument.

split(s, a, fs)

    Split the string s into array elements a[1], a[2], ... a[n], and
    return n. The separation is done with the regular expression fs or
    with the field separator FS if fs is not given.

sprintf(fmt, expr, expr,...)

    Format the expressions according to the printf(3C) format given by
    fmt and return the resulting string.

substr(s, m, n)

    Return the n-character substring of s that begins at position m.
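
The string functions might be combined as in the sketch below, which
assumes a POSIX shell and a POSIX-conforming awk. The strings, the
array name parts, and the shell variable name are made up for the
example.

```shell
result=$(awk 'BEGIN {
    s = "hello world"
    n = split("a:b:c", parts, ":")   # n is the number of pieces
    # position of "wor", length of s, first five characters,
    # piece count, second piece, and a zero-padded number
    print index(s, "wor"), length(s), substr(s, 1, 5), n, parts[2], sprintf("%03d", 7)
}')
printf '%s\n' "$result"
```

Positions count from 1, so index() reports 7 for the "w" of "world".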

The input/output function is as follows:

getline    Set $0 to the next input record from the current input file.
           getline returns 1 for successful input, 0 for end of file,
           and -1 for an error.
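
One possible use of getline is to consume input lines in pairs, as in
this sketch. It assumes a POSIX shell and a POSIX-conforming awk; the
awk variables first and rc, the numeric input, and the shell variable
name are hypothetical.

```shell
# Remember the current line, pull the next one into $0 with getline,
# and print the pair only when getline reported success (1).
pairs=$(printf '1\n2\n3\n4\n' |
    awk '{ first = $0; rc = getline; if (rc > 0) print first, $0 }')
printf '%s\n' "$pairs"
```

Storing the return value in a variable before testing it avoids any
doubt about how a comparison written directly after getline is parsed.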

Large File Behavior
See largefile(5) for the description of the behavior of awk when
encountering files greater than or equal to 2 Gbyte (2^31 bytes).

EXAMPLES
Example 1 Printing Lines Longer Than 72 Characters

The following example is an awk script that can be executed by an awk
-f examplescript style command. It prints lines longer than 72
characters:

    length > 72

Example 2 Printing Fields in Opposite Order

The following example is an awk script that can be executed by an awk
-f examplescript style command. It prints the first two fields in
opposite order:

    { print $2, $1 }

Example 3 Printing Fields in Opposite Order with the Input Fields
Separated

The following example is an awk script that can be executed by an awk
-f examplescript style command. It prints the first two input fields in
opposite order, separated by a comma, blanks or tabs:

    BEGIN { FS = ",[ \t]*|[ \t]+" }
    { print $2, $1 }

Example 4 Adding Up the First Column, Printing the Sum and Average

The following example is an awk script that can be executed by an awk
-f examplescript style command. It adds up the first column, and
prints the sum and average:

    { s += $1 }
    END { print "sum is", s, " average is", s/NR }

Example 5 Printing Fields in Reverse Order

The following example is an awk script that can be executed by an awk
-f examplescript style command. It prints fields in reverse order:

    { for (i = NF; i > 0; --i) print $i }

Example 6 Printing All Lines Between start/stop Pairs

The following example is an awk script that can be executed by an awk
-f examplescript style command. It prints all lines between start/stop
pairs.

    /start/, /stop/

Example 7 Printing All Lines Whose First Field is Different from the
Previous One

The following example is an awk script that can be executed by an awk
-f examplescript style command. It prints all lines whose first field
is different from the previous one.

    $1 != prev { print; prev = $1 }

Example 8 Printing a File and Filling in Page Numbers

The following example is an awk script that can be executed by an awk
-f examplescript style command. It prints a file and fills in page
numbers, starting at 5 when n is set to 5 on the command line (see
Example 9):

    /Page/ { $2 = n++; }
    { print }

Example 9 Printing a File and Numbering Its Pages

Assuming this program is in a file named prog, the following example
prints the file input numbering its pages starting at 5:

    example% awk -f prog n=5 input
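
For illustration, the program from Example 8 can be tried on a small
made-up input without creating any files. The sketch assumes a POSIX
shell and a POSIX-conforming awk; the three input lines and the shell
variable name are invented, and the program is passed literally rather
than with -f.

```shell
# n=5 is an assignment operand; "-" then names the standard input.
# Each line containing "Page" gets its second field replaced by the
# current value of n, which is incremented afterwards.
numbered=$(printf 'Page x\ntext\nPage x\n' |
    awk '/Page/ { $2 = n++; } { print }' n=5 -)
printf '%s\n' "$numbered"
```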

ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables
that affect the execution of awk: LANG, LC_ALL, LC_COLLATE, LC_CTYPE,
LC_MESSAGES, NLSPATH, and PATH.

LC_NUMERIC    Determine the radix character used when interpreting
              numeric input, performing conversions between numeric and
              string values and formatting numeric output. Regardless
              of locale, the period character (the decimal-point
              character of the POSIX locale) is the decimal-point
              character recognized in processing awk programs (including
              assignments in command-line arguments).

ATTRIBUTES
See attributes(5) for descriptions of the following attributes:

/usr/bin/awk
┌─────────────────────────────┬─────────────────────────────┐
│       ATTRIBUTE TYPE        │       ATTRIBUTE VALUE       │
├─────────────────────────────┼─────────────────────────────┤
│Availability                 │SUNWesu                      │
├─────────────────────────────┼─────────────────────────────┤
│CSI                          │Not Enabled                  │
└─────────────────────────────┴─────────────────────────────┘

/usr/xpg4/bin/awk
┌─────────────────────────────┬─────────────────────────────┐
│       ATTRIBUTE TYPE        │       ATTRIBUTE VALUE       │
├─────────────────────────────┼─────────────────────────────┤
│Availability                 │SUNWxcu4                     │
├─────────────────────────────┼─────────────────────────────┤
│CSI                          │Enabled                      │
├─────────────────────────────┼─────────────────────────────┤
│Interface Stability          │Standard                     │
└─────────────────────────────┴─────────────────────────────┘

SEE ALSO
egrep(1), grep(1), nawk(1), sed(1), printf(3C), attributes(5),
environ(5), largefile(5), standards(5)

NOTES
Input white space is not preserved on output if fields are involved.

There are no explicit conversions between numbers and strings. To force
an expression to be treated as a number, add 0 to it. To force an
expression to be treated as a string, concatenate the null string ("")
to it.
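
The two conversion idioms can be sketched as follows, assuming a POSIX
shell and a POSIX-conforming awk. The awk variable names and the values
10 and 9 are chosen for illustration because they order differently as
strings and as numbers.

```shell
conv=$(awk 'BEGIN {
    a = "10"; b = "9"                    # string constants
    as_str = (a < b)                     # compared as strings: "10" < "9"
    as_num = ((a + 0) < (b + 0))         # adding 0 forces numbers: 10 < 9
    x = 10; y = 9                        # numeric constants
    forced = ((x "") < (y ""))           # appending "" forces strings
    print as_str, as_num, forced
}')
printf '%s\n' "$conv"
```

A result of 1 means the comparison was true and 0 that it was false.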

SunOS 5.11                       22 Jun 2005                           awk(1)