sort(1)                          User Commands                         sort(1)

NAME
     sort - sort, merge, or sequence check text files

SYNOPSIS
     /usr/bin/sort [-bcdfimMnru] [-k keydef] [-o output]
          [-S kmem] [-t char] [-T directory] [-y [kmem]]
          [-z recsz] [+pos1 [-pos2]] [file]...

     /usr/xpg4/bin/sort [-bcdfimMnru] [-k keydef] [-o output]
          [-S kmem] [-t char] [-T directory] [-y [kmem]]
          [-z recsz] [+pos1 [-pos2]] [file]...

DESCRIPTION
     The sort command sorts lines of all the named files together and writes
     the result on the standard output.

     Comparisons are based on one or more sort keys extracted from each line
     of input. By default, there is one sort key, the entire input line.
     Lines are ordered according to the collating sequence of the current
     locale.

OPTIONS
     The following options alter the default behavior:

   /usr/bin/sort
     -c              Checks that the single input file is ordered as specified
                     by the arguments and the collating sequence of the
                     current locale. The exit code is set and no output is
                     produced unless the file is not in sorted order.

   /usr/xpg4/bin/sort
     -c              Same as /usr/bin/sort except no output is produced
                     under any circumstances.

     -m              Merges only. The input files are assumed to be already
                     sorted.

     -o output       Specifies the name of an output file to be used instead
                     of the standard output. This file can be the same as
                     one of the input files.

     -S kmem         Specifies the maximum amount of swap-based memory used
                     for sorting, in kilobytes (the default unit). kmem can
                     also be specified directly as a number of bytes (b),
                     kilobytes (k), megabytes (m), gigabytes (g), or
                     terabytes (t); or as a percentage (%) of the installed
                     physical memory.

     -T directory    Specifies the directory in which to place temporary
                     files.

     -u              Unique: suppresses all but one in each set of lines
                     having equal keys. If used with the -c option, checks
                     that there are no lines with duplicate keys in addition
                     to checking that the input file is sorted.

     -y kmem         (obsolete). This option was used to specify the amount
                     of main memory initially used by sort. Its
                     functionality is not appropriate for a virtual memory
                     system; memory usage for sort is now specified using
                     the -S option.

     -z recsz        (obsolete). This option was used to prevent abnormal
                     termination when lines longer than the system-dependent
                     default buffer size are encountered. Because sort
                     automatically allocates buffers large enough to hold the
                     longest line, this option has no effect.
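
     As a rough illustration of how -c, -u, -o, and -m fit together, the
     following sketch assumes a POSIX-style shell and mktemp(1). The scratch
     files a.txt and b.txt are hypothetical, and LC_ALL=C is set only so that
     the byte-wise ordering is predictable.

```shell
# Work in a scratch directory so nothing outside it is touched.
tmp=$(mktemp -d)
printf 'pear\napple\npear\nfig\n' > "$tmp/a.txt"
printf 'cherry\nbanana\n' > "$tmp/b.txt"

# -c only checks; the exit status reports whether the file is in order.
if LC_ALL=C sort -c "$tmp/a.txt" 2>/dev/null; then
    a_state=sorted
else
    a_state=unsorted
fi

# -u keeps one line per set of equal keys; -o may name an input file.
LC_ALL=C sort -u -o "$tmp/a.txt" "$tmp/a.txt"
LC_ALL=C sort -o "$tmp/b.txt" "$tmp/b.txt"

# -m merges inputs that are already sorted, without re-sorting them.
merged=$(LC_ALL=C sort -m "$tmp/a.txt" "$tmp/b.txt")
printf '%s\n' "$a_state" "$merged"

rm -rf "$tmp"
```

     Sorting each file in place first matters here, because -m assumes its
     inputs are already ordered.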

   Ordering Options
     The default sort order depends on the value of LC_COLLATE. If LC_COLLATE
     is set to C, sorting is in ASCII order. If LC_COLLATE is set to en_US,
     sorting is case insensitive except when the two strings are otherwise
     equal and one has an uppercase letter earlier than the other. Other
     locales have other sort orders.
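
     The effect of the C locale can be sketched as follows. The word list is
     made up for illustration, and LC_ALL is used instead of LC_COLLATE on the
     assumption that it overrides the other locale variables, as described in
     environ(5). Results in other locales depend on what is installed, so
     only the C-locale run is shown.

```shell
# Mixed-case input; the order produced below holds for the C locale only.
words='banana
Apple
apple
Banana'

# Byte-wise (ASCII) ordering: every uppercase letter sorts before any
# lowercase letter.
c_order=$(printf '%s\n' "$words" | LC_ALL=C sort)
printf '%s\n' "$c_order"
```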

     The following options override the default ordering rules. When ordering
     options appear independent of any key field specifications, the
     requested field ordering rules are applied globally to all sort keys.
     When attached to a specific key (see Sort Key Options), the specified
     ordering options override all global ordering options for that key. In
     the obsolescent forms, if one or more of these options follows a +pos1
     option, it affects only the key field specified by that preceding
     option.

     -d    Dictionary order: only letters, digits, and blanks (spaces and
           tabs) are significant in comparisons.

     -f    Folds lower-case letters into upper case.

     -i    Ignores non-printable characters.

     -M    Compares as months. The first three non-blank characters of the
           field are folded to upper case and compared. For example, in
           English the sorting order is "JAN" < "FEB" < ... < "DEC". Invalid
           fields compare low to "JAN". The -M option implies the -b option
           (see below).

     -n    Restricts the sort key to an initial numeric string, consisting
           of optional blank characters, optional minus sign, and zero or
           more digits with an optional radix character and thousands
           separators (as defined in the current locale), which is sorted by
           arithmetic value. An empty digit string is treated as zero.
           Leading zeros and signs on zeros do not affect ordering.

     -r    Reverses the sense of comparisons.
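
     A minimal sketch of the ordering options applied to the whole line
     follows. The sample values are invented, and the C locale is assumed so
     that the month abbreviations and byte order are the English/ASCII ones.

```shell
nums='10
9
2'

# Default comparison is by character, so "10" sorts before "2" and "9".
lexical=$(printf '%s\n' "$nums" | LC_ALL=C sort)

# -n compares by arithmetic value; adding -r reverses the result.
numeric=$(printf '%s\n' "$nums" | LC_ALL=C sort -n)
reversed=$(printf '%s\n' "$nums" | LC_ALL=C sort -nr)

# -f folds lower case into upper case, so "b" falls between "A" and "C".
folded=$(printf 'b\nA\nC\n' | LC_ALL=C sort -f)

# -M orders by month abbreviation instead of alphabetically.
months=$(printf 'MAR\nJAN\nFEB\n' | LC_ALL=C sort -M)

printf '%s\n' "$lexical" "$numeric" "$reversed" "$folded" "$months"
```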

   Field Separator Options
     The treatment of field separators can be altered using the following
     options:

     -b         Ignores leading blank characters when determining the
                starting and ending positions of a restricted sort key. If
                the -b option is specified before the first sort key option,
                it is applied to all sort key options. Otherwise, the -b
                option can be attached independently to each -k field_start,
                field_end, or +pos1 or -pos2 option-argument (see below).

     -t char    Use char as the field separator character. char is not
                considered to be part of a field (although it can be included
                in a sort key). Each occurrence of char is significant (for
                example, <char><char> delimits an empty field). If -t is not
                specified, blank characters are used as default field
                separators; each maximal non-empty sequence of blank
                characters that follows a non-blank character is a field
                separator.
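
     The point that every occurrence of char is significant can be seen in a
     small sketch. The colon-separated records are hypothetical, and the key
     option -k 2,2 is taken from Sort Key Options.

```shell
records='c::3
a:x:1
b:y:2'

# With -t :, the record "c::3" has an empty second field. An empty key
# sorts before "x" and "y", so that record comes first.
by_second=$(printf '%s\n' "$records" | LC_ALL=C sort -t : -k 2,2)
printf '%s\n' "$by_second"
```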

   Sort Key Options
     Sort keys can be specified using the options:

     -k keydef          The keydef argument is a restricted sort key field
                        definition. The format of this definition is:

                          -k field_start [type] [,field_end [type] ]

                        where:

                        field_start and field_end
                            define a key field restricted to a portion of
                            the line.

                        type
                            is a modifier from the list of characters
                            bdfiMnr. The b modifier behaves like the -b
                            option, but applies only to the field_start or
                            field_end to which it is attached and characters
                            within a field are counted from the first
                            non-blank character in the field. (This applies
                            separately to first_character and
                            last_character.) The other modifiers behave like
                            the corresponding options, but apply only to the
                            key field to which they are attached. They have
                            this effect if specified with field_start,
                            field_end or both. If any modifier is attached
                            to a field_start or to a field_end, no option
                            applies to either.

                        When there are multiple key fields, later keys are
                        compared only after all earlier keys compare equal.
                        Except when the -u option is specified, lines that
                        otherwise compare equal are ordered as if none of
                        the options -d, -f, -i, -n or -k were present (but
                        with -r still in effect, if it was specified) and
                        with all bytes in the lines significant to the
                        comparison.

                        The notation:

                          -k field_start[type][,field_end[type]]

                        defines a key field that begins at field_start and
                        ends at field_end inclusive, unless field_start
                        falls beyond the end of the line or after field_end,
                        in which case the key field is empty. A missing
                        field_end means the last character of the line.

                        A field comprises a maximal sequence of
                        non-separating characters and, in the absence of
                        option -t, any preceding field separator.

                        The field_start portion of the keydef
                        option-argument has the form:

                          field_number[.first_character]

                        Fields and characters within fields are numbered
                        starting with 1. field_number and first_character,
                        interpreted as positive decimal integers, specify
                        the first character to be used as part of a sort
                        key. If .first_character is omitted, it refers to
                        the first character of the field.

                        The field_end portion of the keydef option-argument
                        has the form:

                          field_number[.last_character]

                        The field_number is as described above for
                        field_start. last_character, interpreted as a
                        non-negative decimal integer, specifies the last
                        character to be used as part of the sort key. If
                        last_character evaluates to zero or .last_character
                        is omitted, it refers to the last character of the
                        field specified by field_number.

                        If the -b option or b type modifier is in effect,
                        characters within a field are counted from the first
                        non-blank character in the field. (This applies
                        separately to first_character and last_character.)
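
     Two uses of -k are sketched below under stated assumptions: the data is
     invented, fields in the first input are separated by single blanks, and
     the second input uses "-" as an explicit separator. The first command
     chains two keys, the second restricts a key to one character.

```shell
data='x zeta 30
y alpha 4
z alpha 25'

# Key 1 is the second field; lines that tie on it are ordered by the
# third field, compared numerically because of the n modifier.
by_name_then_num=$(printf '%s\n' "$data" | LC_ALL=C sort -k 2,2 -k 3,3n)

ids='id-b7
id-a9
id-c1'

# With -t -, field 2 of "id-b7" is "b7". The key 2.2,2.2 is only its
# second character (the digit), compared numerically.
by_digit=$(printf '%s\n' "$ids" | LC_ALL=C sort -t - -k 2.2,2.2n)

printf '%s\n' "$by_name_then_num" "$by_digit"
```

     Giving both field_start and field_end keeps each key from running on to
     the end of the line, which is what a missing field_end would mean.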

     [+pos1 [-pos2]]    (obsolete). Provide functionality equivalent to the
                        -k keydef option.

                        pos1 and pos2 each have the form m.n optionally
                        followed by one or more of the flags bdfiMnr. A
                        starting position specified by +m.n is interpreted
                        to mean the (n+1)st character in the (m+1)st field.
                        A missing .n means .0, indicating the first
                        character of the (m+1)st field. If the b flag is in
                        effect n is counted from the first non-blank in the
                        (m+1)st field; +m.0b refers to the first non-blank
                        character in the (m+1)st field.

                        A last position specified by -m.n is interpreted to
                        mean the nth character (including separators) after
                        the last character of the mth field. A missing .n
                        means .0, indicating the last character of the mth
                        field. If the b flag is in effect n is counted from
                        the last leading blank in the (m+1)st field; -m.1b
                        refers to the first non-blank in the (m+1)st field.

                        The fully specified +pos1 -pos2 form with type
                        modifiers T and U:

                          +w.xT -y.zU

                        is equivalent to:

                          undefined            (z==0 & U contains b & -t is present)
                          -k w+1.x+1T,y.0U     (z==0 otherwise)
                          -k w+1.x+1T,y+1.zU   (z > 0)

                        Implementations support at least nine occurrences of
                        the sort keys (the -k option and obsolescent +pos1
                        and -pos2) which are significant in command line
                        order. If no sort key is specified, a default sort
                        key of the entire line is used.
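
     The equivalence rules for the obsolete form amount to simple arithmetic,
     sketched here as a shell function. The name pos_to_k is hypothetical and
     not part of sort; the sketch ignores the type modifiers T and U, and it
     does not detect the undefined case (z==0 with a b modifier and -t).

```shell
# pos_to_k W X Y Z
# Prints the -k form corresponding to the obsolete "+W.X -Y.Z".
pos_to_k() {
    w=$1 x=$2 y=$3 z=$4
    if [ "$z" -eq 0 ]; then
        # z == 0: the key ends at the last character of field y.
        printf '%s %d.%d,%d.0\n' -k "$((w + 1))" "$((x + 1))" "$y"
    else
        # z > 0: the key ends at character z of field y+1.
        printf '%s %d.%d,%d.%d\n' -k "$((w + 1))" "$((x + 1))" "$((y + 1))" "$z"
    fi
}

pos_to_k 1 0 2 0    # +1 -2      becomes  -k 2.1,2.0
pos_to_k 1 1 1 2    # +1.1 -1.2  becomes  -k 2.2,2.2
```

     The first result, -k 2.1,2.0, selects the same key as the shorter
     -k 2,2, since an omitted first_character means 1 and a last_character
     of zero means the end of the field.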

OPERANDS
     The following operand is supported:

     file    A path name of a file to be sorted, merged or checked. If no
             file operands are specified, or if a file operand is -, the
             standard input is used.

USAGE
     See largefile(5) for the description of the behavior of sort when
     encountering files greater than or equal to 2 Gbyte (2^31 bytes).

EXAMPLES
     In the following examples, first the preferred and then the obsolete
     way of specifying sort keys are given as an aid to understanding the
     relationship between the two forms.

     Example 1 Sorting with the Second Field as a sort Key

     Either of the following commands sorts the contents of infile with the
     second field as the sort key:

       example% sort -k 2,2 infile
       example% sort +1 -2 infile

     Example 2 Sorting in Reverse Order

     Either of the following commands sorts, in reverse order, the contents
     of infile1 and infile2, placing the output in outfile and using the
     second character of the second field as the sort key (assuming that the
     first character of the second field is the field separator):

       example% sort -r -o outfile -k 2.2,2.2 infile1 infile2
       example% sort -r -o outfile +1.1 -1.2 infile1 infile2

     Example 3 Sorting Using a Specified Character in One of the Files

     Either of the following commands sorts the contents of infile1 and
     infile2 using the second non-blank character of the second field as the
     sort key:

       example% sort -k 2.2b,2.2b infile1 infile2
       example% sort +1.1b -1.2b infile1 infile2

     Example 4 Sorting by Numeric User ID

     Either of the following commands prints the passwd(4) file (user
     database) sorted by the numeric user ID (the third colon-separated
     field):

       example% sort -t : -k 3,3n /etc/passwd
       example% sort -t : +2 -3n /etc/passwd

     Example 5 Printing Sorted Lines Excluding Lines that Duplicate a Field

     Either of the following commands prints the lines of the already sorted
     file infile, suppressing all but one occurrence of lines having the
     same third field:

       example% sort -um -k 3.1,3.0 infile
       example% sort -um +2.0 -3.0 infile

     Example 6 Sorting by Host IP Address

     Either of the following commands prints the hosts(4) file (IPv4 hosts
     database), sorted by the numeric IP address (the first four numeric
     fields):

       example% sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n /etc/hosts
       example% sort -t . +0 -1n +1 -2n +2 -3n +3 -4n /etc/hosts

     Since '.' is both the field delimiter and, in many locales, the decimal
     separator, failure to specify both ends of the field leads to results
     where the second field is interpreted as a fractional portion of the
     first, and so forth.
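
     The remark about specifying both ends of each field can be tried on a
     few made-up addresses. This is only a sketch: the address list is
     hypothetical, and the result of the second, open-ended command varies
     with the locale and the implementation, so no particular output is
     claimed for it.

```shell
addrs='10.0.0.9
10.0.0.10
9.255.0.1'

# Each octet is a separate numeric key with both ends given, so 9 sorts
# before 10 in every position.
by_octet=$(printf '%s\n' "$addrs" |
    LC_ALL=C sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n)

# Open-ended key: "10.0" may be read as one decimal number, so lines can
# come out in a different order than the per-octet sort above.
open_ended=$(printf '%s\n' "$addrs" | LC_ALL=C sort -t . -k 1n)

printf '%s\n' "$by_octet"
```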

ENVIRONMENT VARIABLES
     See environ(5) for descriptions of the following environment variables
     that affect the execution of sort: LANG, LC_ALL, LC_COLLATE,
     LC_MESSAGES, and NLSPATH.

     LC_CTYPE      Determine the locale for the interpretation of sequences
                   of bytes of text data as characters (for example, single-
                   versus multi-byte characters in arguments and input
                   files) and the behavior of character classification for
                   the -b, -d, -f, -i and -n options.

     LC_NUMERIC    Determine the locale for the definition of the radix
                   character and thousands separator for the -n option.

EXIT STATUS
     The following exit values are returned:

     0     All input files were output successfully, or -c was specified and
           the input file was correctly sorted.

     1     Under the -c option, the file was not ordered as specified, or if
           the -c and -u options were both specified, two input lines were
           found with equal keys.

     >1    An error occurred.
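
     A script can branch on these three outcomes as in the sketch below.
     The function name check_sorted is hypothetical, and the C locale is
     assumed so that the check does not depend on the caller's environment.

```shell
# check_sorted FILE
# Reports how "sort -c" classified FILE, based only on the exit value.
check_sorted() {
    status=0
    LC_ALL=C sort -c "$1" 2>/dev/null || status=$?
    case $status in
        0) echo sorted ;;       # file is in order
        1) echo disordered ;;   # -c found a line out of order
        *) echo error ;;        # for example, the file cannot be read
    esac
}
```

     For example, check_sorted /etc/passwd prints one of the three words
     depending on the state of that file.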

FILES
     /var/tmp/stm???    Temporary files

ATTRIBUTES
     See attributes(5) for descriptions of the following attributes:

   /usr/bin/sort
     ┌─────────────────────────────┬─────────────────────────────┐
     │       ATTRIBUTE TYPE        │       ATTRIBUTE VALUE       │
     ├─────────────────────────────┼─────────────────────────────┤
     │Availability                 │SUNWesu                      │
     ├─────────────────────────────┼─────────────────────────────┤
     │CSI                          │Enabled                      │
     └─────────────────────────────┴─────────────────────────────┘

   /usr/xpg4/bin/sort
     ┌─────────────────────────────┬─────────────────────────────┐
     │       ATTRIBUTE TYPE        │       ATTRIBUTE VALUE       │
     ├─────────────────────────────┼─────────────────────────────┤
     │Availability                 │SUNWxcu4                     │
     ├─────────────────────────────┼─────────────────────────────┤
     │CSI                          │Enabled                      │
     ├─────────────────────────────┼─────────────────────────────┤
     │Interface Stability          │Standard                     │
     └─────────────────────────────┴─────────────────────────────┘

SEE ALSO
     comm(1), join(1), uniq(1), nl_langinfo(3C), strftime(3C), hosts(4),
     passwd(4), attributes(5), environ(5), largefile(5), standards(5)

DIAGNOSTICS
     sort prints a diagnostic message and exits with non-zero status for
     various trouble conditions (for example, when input lines are too
     long), and for disorders discovered under the -c option.

NOTES
     When the last line of an input file is missing a new-line character,
     sort appends one, prints a warning message, and continues.

     sort does not guarantee preservation of relative line ordering on equal
     keys.
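
     Where the input order of lines with equal keys must be kept, one common
     workaround is to add the input line number as a final key and remove it
     afterwards. The sketch below assumes awk(1) and cut(1) are available
     and that the data contains no tab characters; the sample lines are
     invented, and the key is the first character of each line.

```shell
input='b 2
a 1
b 1
a 2'

tab=$(printf '\t')

# 1. awk prefixes each line with its line number and a tab.
# 2. sort orders by the first character of the original line (field 2),
#    then by the line number (field 1) to break ties in input order.
# 3. cut removes the line-number prefix again.
stable=$(printf '%s\n' "$input" |
    awk '{ print NR "\t" $0 }' |
    LC_ALL=C sort -t "$tab" -k 2.1,2.1 -k 1,1n |
    cut -f 2-)

printf '%s\n' "$stable"
```

     Because every line number is distinct, no two lines compare equal on
     all keys, so the result no longer depends on how sort treats ties.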

     sort performance can be tuned for a specific scenario using the -S
     option. Note, however, that sort has greater knowledge of how to use a
     finite amount of memory for sorting than the virtual memory system.
     Thus, a sort invoked with a request for an extremely large amount of
     memory via the -S option could perform extremely poorly.

     As noted, certain of the field modifiers (such as -M and -d) cause the
     interpretation of input data to be done with reference to
     locale-specific settings. The results of this interpretation can be
     unexpected if one's expectations are not aligned with the conventions
     established by the locale. In the case of the month keys, sort does not
     attempt to compensate for approximate month abbreviations. The precise
     month abbreviations from nl_langinfo(3C) or strftime(3C) are the only
     ones recognized. For printable or dictionary order, if these concepts
     are not well-defined by the locale, an empty sort key might be the
     result, leading to the next key being the significant one for
     determining the appropriate ordering.

SunOS 5.11                        19 Nov 2001                          sort(1)