UNIQ(1P)                  POSIX Programmer's Manual                 UNIQ(1P)

PROLOG
This manual page is part of the POSIX Programmer's Manual. The Linux
implementation of this interface may differ (consult the corresponding
Linux manual page for details of Linux behavior), or the interface may
not be implemented on Linux.

NAME
uniq - report or filter out repeated lines in a file

SYNOPSIS
uniq [-c|-d|-u][-f fields][-s chars][input_file [output_file]]

DESCRIPTION
The uniq utility shall read an input file comparing adjacent lines, and
write one copy of each input line on the output. The second and
succeeding copies of repeated adjacent input lines shall not be written.

Repeated lines in the input shall not be detected if they are not
adjacent.

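The following sketch is informative only. It illustrates the adjacency rule on a short stream built with printf; the sample lines are made up for this illustration.

```shell
# Only adjacent repeats are collapsed: the run "a a" is written once,
# but the later "a" that follows "b" is written again.
printf '%s\n' a a b a | uniq
# Writes:
# a
# b
# a
```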
OPTIONS
The uniq utility shall conform to the Base Definitions volume of
IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines.

The following options shall be supported:

-c      Precede each output line with a count of the number of times the
        line occurred in the input.

-d      Suppress the writing of lines that are not repeated in the
        input.

-f fields
        Ignore the first fields fields on each input line when doing
        comparisons, where fields is a positive decimal integer. A field
        is the maximal string matched by the basic regular expression:

            [[:blank:]]*[^[:blank:]]*

        If the fields option-argument specifies more fields than appear
        on an input line, a null string shall be used for comparison.

-s chars
        Ignore the first chars characters when doing comparisons, where
        chars shall be a positive decimal integer. If specified in
        conjunction with the -f option, the first chars characters after
        the first fields fields shall be ignored. If the chars
        option-argument specifies more characters than remain on an
        input line, a null string shall be used for comparison.

-u      Suppress the writing of lines that are repeated in the input.

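The options above can be tried on a small sample. This is an informative sketch: the variable name input and its contents are hypothetical, chosen so that the first field differs on every line while the rest of the line repeats.

```shell
# Hypothetical sample: a distinct key in field 1, a repeating value after it.
input='1 apple
2 apple
3 pear
4 plum
5 plum'

# -d with -f 1: ignore the first field, write one copy of each repeated line.
printf '%s\n' "$input" | uniq -d -f 1
# Writes:
# 1 apple
# 4 plum

# -u with -f 1: write only the lines that are not repeated.
printf '%s\n' "$input" | uniq -u -f 1
# Writes:
# 3 pear

# -s 2: ignore the first two characters (the digit and the blank) instead.
printf '%s\n' "$input" | uniq -d -s 2
# Writes:
# 1 apple
# 4 plum
```

In each repeated group the copy that is written is the first one, so the key shown is that of the first line in the group.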
OPERANDS
The following operands shall be supported:

input_file
        A pathname of the input file. If the input_file operand is not
        specified, or if the input_file is '-', the standard input shall
        be used.

output_file
        A pathname of the output file. If the output_file operand is not
        specified, the standard output shall be used. The results are
        unspecified if the file named by output_file is the file named
        by input_file.

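As an informative sketch of the two operands, the commands below read from standard input through '-' and write to a named output file. The file name deduped.txt is hypothetical, and mktemp is assumed to be available for creating a scratch directory.

```shell
# Scratch directory for the output file (mktemp is an assumption here).
tmpdir=$(mktemp -d) || exit 1
out="$tmpdir/deduped.txt"

# '-' selects standard input as input_file; the second operand is output_file.
printf '%s\n' x x y | uniq - "$out"

# Keep the result before cleaning up, then show it.
result=$(cat "$out")
rm -r "$tmpdir"
printf '%s\n' "$result"
# Writes:
# x
# y
```

The output file is deliberately different from the input, since results are unspecified when both operands name the same file.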
STDIN
The standard input shall be used only if no input_file operand is
specified or if input_file is '-'. See the INPUT FILES section.

INPUT FILES
The input file shall be a text file.

ENVIRONMENT VARIABLES
The following environment variables shall affect the execution of uniq:

LANG    Provide a default value for the internationalization variables
        that are unset or null. (See the Base Definitions volume of
        IEEE Std 1003.1-2001, Section 8.2, Internationalization
        Variables for the precedence of internationalization variables
        used to determine the values of locale categories.)

LC_ALL  If set to a non-empty string value, override the values of all
        the other internationalization variables.

LC_COLLATE
        Determine the locale for ordering rules.

LC_CTYPE
        Determine the locale for the interpretation of sequences of
        bytes of text data as characters (for example, single-byte as
        opposed to multi-byte characters in arguments and input files)
        and which characters constitute a <blank> in the current locale.

LC_MESSAGES
        Determine the locale that should be used to affect the format
        and contents of diagnostic messages written to standard error.

NLSPATH Determine the location of message catalogs for the processing of
        LC_MESSAGES.

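A script that needs the same behavior regardless of the caller's locale can set LC_ALL for a single invocation. The sketch below is informative only; the sample data is hypothetical, and whether the locale changes the result at all depends on the implementation and on the input.

```shell
# LC_ALL=C applies to this one uniq invocation only. It overrides LANG and
# the other LC_* variables, so <blank> classification and comparison follow
# the POSIX locale.
printf '%s\n' 'k1 v' 'k2 v' 'k3 w' | LC_ALL=C uniq -f 1
# Writes:
# k1 v
# k3 w
```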
ASYNCHRONOUS EVENTS
Default.

STDOUT
The standard output shall be used only if no output_file operand is
specified. See the OUTPUT FILES section.

STDERR
The standard error shall be used only for diagnostic messages.

OUTPUT FILES
If the -c option is specified, the output file shall be empty or each
line shall be of the form:

    "%d %s", <number of duplicates>, <line>

otherwise, the output file shall be empty or each line shall be of the
form:

    "%s", <line>

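The -c output format can be consumed as sketched below (informative only, with made-up sample input). Some implementations (GNU coreutils is one) right-align the count with leading blanks, so the sketch tolerates padding instead of assuming the count starts in the first column.

```shell
# Extract only the counts. awk splits on runs of blanks, so leading padding
# before the count does not matter.
printf '%s\n' a a a b | uniq -c | awk '{ print $1 }'
# Writes:
# 3
# 1

# Strip any leading blanks to obtain "<count> <line>" with no padding.
printf '%s\n' a a a b | uniq -c | sed 's/^[[:blank:]]*//'
# Writes:
# 3 a
# 1 b
```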
EXTENDED DESCRIPTION
None.

EXIT STATUS
The following exit values shall be returned:

 0      The utility executed successfully.

>0      An error occurred.

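A caller can branch on the exit status as sketched here (informative only). The path is hypothetical and is assumed not to exist, so that uniq fails to open its input.

```shell
# Hypothetical input path, assumed not to exist.
missing=/nonexistent/uniq-demo-input

# A zero exit status means success; any value greater than zero is an error.
if uniq "$missing" >/dev/null 2>&1; then
    status=ok
else
    status=failed
fi
printf 'uniq %s\n' "$status"
```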
CONSEQUENCES OF ERRORS
Default.

The following sections are informative.

APPLICATION USAGE
The sort utility can be used to cause repeated lines to be adjacent in
the input file.

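A minimal sketch of that combination, using made-up sample lines: sorting first makes every repeat adjacent, so uniq then removes all of them, not only the ones that happened to be neighbors.

```shell
# Without sort, "pear" and "apple" are never adjacent to their repeats and
# all four lines would be written. With sort, the repeats become adjacent.
printf '%s\n' pear apple pear apple | sort | uniq
# Writes:
# apple
# pear
```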
EXAMPLES
The following input file data (but flushed left) was used for a test
series on uniq:

    #01 foo0 bar0 foo1 bar1
    #02 bar0 foo1 bar1 foo1
    #03 foo0 bar0 foo1 bar1
    #04
    #05 foo0 bar0 foo1 bar1
    #06 foo0 bar0 foo1 bar1
    #07 bar0 foo1 bar1 foo0

What follows is a series of test invocations of the uniq utility that
use a mixture of uniq options against the input file data. These tests
verify the meaning of adjacent. The uniq utility views the input data
as a sequence of strings delimited by '\n'. Accordingly, for the nth
member of the sequence, uniq interprets unique or repeated adjacent
lines strictly relative to the (n+1)th member.

1. This first example tests the line counting option, comparing each
   line of the input file data starting from the second field:

    uniq -c -f 1 uniq_0I.t
    1 #01 foo0 bar0 foo1 bar1
    1 #02 bar0 foo1 bar1 foo1
    1 #03 foo0 bar0 foo1 bar1
    1 #04
    2 #05 foo0 bar0 foo1 bar1
    1 #07 bar0 foo1 bar1 foo0

The number '2', prefixing the fifth line of output, signifies that the
uniq utility detected a pair of repeated lines. Given the input data,
this can only be true when uniq is run using the -f 1 option (which
shall cause uniq to ignore the first field on each input line).

2. The second example tests the option to suppress unique lines,
   comparing each line of the input file data starting from the second
   field:

    uniq -d -f 1 uniq_0I.t
    #05 foo0 bar0 foo1 bar1

3. This test suppresses repeated lines, comparing each line of the
   input file data starting from the second field:

    uniq -u -f 1 uniq_0I.t
    #01 foo0 bar0 foo1 bar1
    #02 bar0 foo1 bar1 foo1
    #03 foo0 bar0 foo1 bar1
    #04
    #07 bar0 foo1 bar1 foo0

4. This suppresses unique lines, comparing each line of the input file
   data starting from the third character:

    uniq -d -s 2 uniq_0I.t

In the last example, the uniq utility found no input matching the above
criteria.

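The test series can be reproduced with the informative sketch below. It recreates the input data under the file name used in the examples, inside a scratch directory; mktemp is assumed to be available, and the variable names are hypothetical. The -c invocation is left out because some implementations pad the count.

```shell
# Scratch directory holding the test data (mktemp is an assumption here).
tmpdir=$(mktemp -d) || exit 1
data="$tmpdir/uniq_0I.t"

cat > "$data" <<'EOF'
#01 foo0 bar0 foo1 bar1
#02 bar0 foo1 bar1 foo1
#03 foo0 bar0 foo1 bar1
#04
#05 foo0 bar0 foo1 bar1
#06 foo0 bar0 foo1 bar1
#07 bar0 foo1 bar1 foo0
EOF

repeated=$(uniq -d -f 1 "$data")   # example 2: the #05/#06 pair
unique=$(uniq -u -f 1 "$data")     # example 3: every line except #05 and #06
by_char=$(uniq -d -s 2 "$data")    # example 4: no match, so this is empty

rm -r "$tmpdir"

printf 'repeated:\n%s\n' "$repeated"
printf 'unique:\n%s\n' "$unique"
printf 'by_char is %s\n' "${by_char:-empty}"
```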
RATIONALE
Some historical implementations have limited lines to be 1080 bytes in
length, which does not meet the implied {LINE_MAX} limit.

FUTURE DIRECTIONS
None.

SEE ALSO
comm, sort

COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.opengroup.org/unix/online.html.

IEEE/The Open Group                  2003                         UNIQ(1P)