UNIQ(P)                    POSIX Programmer's Manual                   UNIQ(P)

NAME
       uniq - report or filter out repeated lines in a file

SYNOPSIS
       uniq [-c|-d|-u][-f fields][-s chars][input_file [output_file]]

DESCRIPTION
       The uniq utility shall read an input file comparing adjacent lines, and
       write one copy of each input line on the output. The second and
       succeeding copies of repeated adjacent input lines shall not be written.

       Repeated lines in the input shall not be detected if they are not
       adjacent.
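
       The adjacency rule can be illustrated with a short sketch. The sample
       input below is hypothetical and is not part of the standard:

```shell
# Hypothetical input: "a" occurs three times, but only the last two
# occurrences are adjacent.
adjacent_only=$(printf 'a\nb\na\na\n' | uniq)

# Prints three lines: a, b, a. The first "a" survives because a "b"
# separates it from the later pair.
printf '%s\n' "$adjacent_only"
```

       Only the adjacent pair is collapsed; the first "a" is written
       unchanged.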

OPTIONS
       The uniq utility shall conform to the Base Definitions volume of
       IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines.

       The following options shall be supported:

       -c     Precede each output line with a count of the number of times the
              line occurred in the input.

       -d     Suppress the writing of lines that are not repeated in the
              input.

       -f  fields
              Ignore the first fields fields on each input line when doing
              comparisons, where fields is a positive decimal integer. A field
              is the maximal string matched by the basic regular expression:

                     [[:blank:]]*[^[:blank:]]*

              If the fields option-argument specifies more fields than appear
              on an input line, a null string shall be used for comparison.

       -s  chars
              Ignore the first chars characters when doing comparisons, where
              chars shall be a positive decimal integer. If specified in
              conjunction with the -f option, the first chars characters after
              the first fields fields shall be ignored. If the chars
              option-argument specifies more characters than remain on an
              input line, a null string shall be used for comparison.

       -u     Suppress the writing of lines that are repeated in the input.
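
       As a minimal sketch of how -d, -u, -f, and -s interact, consider the
       following. The sample data (a two-character tag followed by a value)
       and the variable names are assumptions made for illustration:

```shell
# Hypothetical sample data: a tag field followed by a value field.
sample=$(printf 'x1 apple\nx2 apple\nx3 pear\nx4 plum\nx5 plum\n')

# -d -f 1: write only lines that are repeated, comparing from the
# second field onward (the tag field is ignored).
dups=$(printf '%s\n' "$sample" | uniq -d -f 1)

# -u -f 1: write only lines that are not repeated, same comparison.
singles=$(printf '%s\n' "$sample" | uniq -u -f 1)

# -s 2: ignore the first two characters (the tag) instead of a field.
collapsed=$(printf '%s\n' "$sample" | uniq -s 2)

printf 'repeated:\n%s\n' "$dups"
printf 'not repeated:\n%s\n' "$singles"
printf 'collapsed:\n%s\n' "$collapsed"
```

       In each case the ignored prefix still appears in the output; -f and
       -s affect only which part of the line is compared.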

OPERANDS
       The following operands shall be supported:

       input_file
              A pathname of the input file. If the input_file operand is not
              specified, or if the input_file is '-', the standard input
              shall be used.

       output_file
              A pathname of the output file. If the output_file operand is not
              specified, the standard output shall be used. The results are
              unspecified if the file named by output_file is the file named
              by input_file.
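
       The operand forms can be sketched as follows. The file names are
       hypothetical, and the mktemp utility used to create a scratch
       directory is assumed to be available (it is not required by this
       volume of IEEE Std 1003.1-2001):

```shell
# Scratch directory; mktemp -d is an assumption about the host system.
workdir=$(mktemp -d)
printf 'one\none\ntwo\n' > "$workdir/in.txt"

# Both operands given: read in.txt, write out.txt.
uniq "$workdir/in.txt" "$workdir/out.txt"
from_file=$(cat "$workdir/out.txt")

# '-' as input_file: read standard input, still write to a named file.
printf 'one\none\ntwo\n' | uniq - "$workdir/out2.txt"
from_stdin=$(cat "$workdir/out2.txt")

rm -rf "$workdir"
```

       The output file is deliberately distinct from the input file, since
       the results of naming the same file for both are unspecified.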

STDIN
       The standard input shall be used only if no input_file operand is
       specified or if input_file is '-'. See the INPUT FILES section.

INPUT FILES
       The input file shall be a text file.

ENVIRONMENT VARIABLES
       The following environment variables shall affect the execution of uniq:

       LANG   Provide a default value for the internationalization variables
              that are unset or null. (See the Base Definitions volume of
              IEEE Std 1003.1-2001, Section 8.2, Internationalization
              Variables for the precedence of internationalization variables
              used to determine the values of locale categories.)

       LC_ALL If set to a non-empty string value, override the values of all
              the other internationalization variables.

       LC_COLLATE
              Determine the locale for ordering rules.

       LC_CTYPE
              Determine the locale for the interpretation of sequences of
              bytes of text data as characters (for example, single-byte as
              opposed to multi-byte characters in arguments and input files)
              and which characters constitute a <blank> in the current locale.

       LC_MESSAGES
              Determine the locale that should be used to affect the format
              and contents of diagnostic messages written to standard error.

       NLSPATH
              Determine the location of message catalogs for the processing of
              LC_MESSAGES.

ASYNCHRONOUS EVENTS
       Default.

STDOUT
       The standard output shall be used only if no output_file operand is
       specified. See the OUTPUT FILES section.

STDERR
       The standard error shall be used only for diagnostic messages.

OUTPUT FILES
       If the -c option is specified, the output file shall be empty or each
       line shall be of the form:

              "%d %s", <number of duplicates>, <line>

       otherwise, the output file shall be empty or each line shall be of the
       form:

              "%s", <line>
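
       Some implementations are known to right-align the count with leading
       blanks rather than emit the bare "%d %s" form. A script that parses
       -c output may therefore wish to normalize it first. The following is
       a sketch under that assumption, with hypothetical sample input:

```shell
# Count adjacent repeats; the sample input is hypothetical.
counted=$(printf 'a\na\nb\n' | uniq -c)

# Strip any leading blanks so each line has the form "<count> <line>".
normalized=$(printf '%s\n' "$counted" | sed 's/^[[:blank:]]*//')

printf '%s\n' "$normalized"
```

       After normalization the count and the line are separated by a
       single <space>, whatever padding the implementation applied.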

EXTENDED DESCRIPTION
       None.

EXIT STATUS
       The following exit values shall be returned:

        0     The utility executed successfully.

       >0     An error occurred.
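
       A calling script can branch on the exit status. In this sketch the
       pathname is hypothetical and is assumed not to exist, so that uniq
       reports an error:

```shell
# A path that is assumed not to exist (hypothetical name).
missing=/nonexistent/uniq-demo-input

# Capture the status without letting a failure abort the script.
status_missing=0
uniq "$missing" >/dev/null 2>&1 || status_missing=$?

# A successful run on readable input.
status_ok=0
printf 'a\n' | uniq >/dev/null || status_ok=$?

printf 'missing input: %s, readable input: %s\n' \
    "$status_missing" "$status_ok"
```

       Only the distinction between zero and non-zero is guaranteed; the
       particular non-zero value is not specified here.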

CONSEQUENCES OF ERRORS
       Default.

       The following sections are informative.

APPLICATION USAGE
       The sort utility can be used to cause repeated lines to be adjacent in
       the input file.
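
       For example, with hypothetical input in which the repeated lines are
       interleaved, uniq alone removes nothing, while sorting first brings
       the repeats together:

```shell
# Hypothetical input: each value occurs twice, never adjacently.
unsorted=$(printf 'pear\napple\npear\napple\n')

# No adjacent repeats, so uniq writes every line unchanged.
without_sort=$(printf '%s\n' "$unsorted" | uniq)

# Sorting first makes the repeats adjacent, so uniq can collapse them.
with_sort=$(printf '%s\n' "$unsorted" | sort | uniq)

printf '%s\n' "$with_sort"
```

       Note that sorting changes the order of the lines, which may not be
       acceptable for every application.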

EXAMPLES
       The following input file data (but flushed left) was used for a test
       series on uniq:

              #01 foo0 bar0 foo1 bar1
              #02 bar0 foo1 bar1 foo1
              #03 foo0 bar0 foo1 bar1
              #04
              #05 foo0 bar0 foo1 bar1
              #06 foo0 bar0 foo1 bar1
              #07 bar0 foo1 bar1 foo0

       What follows is a series of test invocations of the uniq utility that
       use a mixture of uniq options against the input file data. These tests
       verify the meaning of adjacent. The uniq utility views the input data
       as a sequence of strings delimited by '\n'. Accordingly, for the
       fieldsth member of the sequence, uniq interprets unique or repeated
       adjacent lines strictly relative to the fields+1th member.

        1. This first example tests the line counting option, comparing each
           line of the input file data starting from the second field:

                  uniq -c -f 1 uniq_0I.t
                  1 #01 foo0 bar0 foo1 bar1
                  1 #02 bar0 foo1 bar1 foo1
                  1 #03 foo0 bar0 foo1 bar1
                  1 #04
                  2 #05 foo0 bar0 foo1 bar1
                  1 #07 bar0 foo1 bar1 foo0

           The number '2', prefixing the fifth line of output, signifies that
           the uniq utility detected a pair of repeated lines. Given the input
           data, this can only be true when uniq is run using the -f 1 option
           (which shall cause uniq to ignore the first field on each input
           line).

        2. The second example tests the option to suppress unique lines,
           comparing each line of the input file data starting from the second
           field:

                  uniq -d -f 1 uniq_0I.t
                  #05 foo0 bar0 foo1 bar1

        3. This test suppresses repeated lines, comparing each line of the
           input file data starting from the second field:

                  uniq -u -f 1 uniq_0I.t
                  #01 foo0 bar0 foo1 bar1
                  #02 bar0 foo1 bar1 foo1
                  #03 foo0 bar0 foo1 bar1
                  #04
                  #07 bar0 foo1 bar1 foo0

        4. This suppresses unique lines, comparing each line of the input file
           data starting from the third character:

                  uniq -d -s 2 uniq_0I.t

           In the last example, the uniq utility found no input matching the
           above criteria.
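
       The test series can be reproduced with a script along the following
       lines. This is a sketch, not part of the original test suite: the
       scratch directory, the use of mktemp, and the variable names are
       assumptions, and leading blanks are stripped from the -c output
       because some implementations pad the count.

```shell
# Scratch directory; mktemp -d is an assumption about the host system.
workdir=$(mktemp -d)

# Recreate the input file data, flushed left.
cat > "$workdir/uniq_0I.t" <<'EOF'
#01 foo0 bar0 foo1 bar1
#02 bar0 foo1 bar1 foo1
#03 foo0 bar0 foo1 bar1
#04
#05 foo0 bar0 foo1 bar1
#06 foo0 bar0 foo1 bar1
#07 bar0 foo1 bar1 foo0
EOF

# Example 1: counts, comparing from the second field (padding stripped).
ex1=$(uniq -c -f 1 "$workdir/uniq_0I.t" | sed 's/^[[:blank:]]*//')

# Example 2: repeated lines only, comparing from the second field.
ex2=$(uniq -d -f 1 "$workdir/uniq_0I.t")

# Example 3: non-repeated lines only, comparing from the second field.
ex3=$(uniq -u -f 1 "$workdir/uniq_0I.t")

# Example 4: repeated lines only, comparing from the third character.
ex4=$(uniq -d -s 2 "$workdir/uniq_0I.t")

rm -rf "$workdir"

printf '%s\n' "$ex1"
```

       Line #04 consists of a single field, so with -f 1 it is compared as
       a null string and matches neither of its neighbors.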

RATIONALE
       Some historical implementations have limited lines to be 1080 bytes in
       length, which does not meet the implied {LINE_MAX} limit.

FUTURE DIRECTIONS
       None.

SEE ALSO
       comm, sort

COPYRIGHT
       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group. In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.opengroup.org/unix/online.html .



IEEE/The Open Group                  2003                              UNIQ(P)