UNIQ(1P)                   POSIX Programmer's Manual                  UNIQ(1P)

PROLOG

       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       uniq - report or filter out repeated lines in a file

SYNOPSIS

       uniq [-c|-d|-u] [-f fields] [-s chars] [input_file [output_file]]

DESCRIPTION

       The uniq utility shall read an input file comparing adjacent lines, and
       write one copy of each input line on the output. The second and
       succeeding copies of repeated adjacent input lines shall not be written.
       The trailing <newline> of each line in the input shall be ignored when
       doing comparisons.

       Repeated lines in the input shall not be detected if they are not
       adjacent.
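
       As an informal illustration of the adjacency rule, the following sketch
       feeds four lines to uniq on standard input. The sample data and the
       variable name out are assumptions made for this example only.

```shell
# Only adjacent duplicates collapse: the final "a" is written again
# because a "b" line separates it from the earlier run of "a" lines.
out=$(printf 'a\na\nb\na\n' | uniq)
printf '%s\n' "$out"
```

       Three lines are written (a, b, a), not two.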

OPTIONS

       The uniq utility shall conform to the Base Definitions volume of
       POSIX.1-2008, Section 12.2, Utility Syntax Guidelines, except that '+'
       may be recognized as an option delimiter as well as '-'.

       The following options shall be supported:

       -c        Precede each output line with a count of the number of times
                 the line occurred in the input.

       -d        Suppress the writing of lines that are not repeated in the
                 input.

       -f fields Ignore the first fields fields on each input line when doing
                 comparisons, where fields is a positive decimal integer. A
                 field is the maximal string matched by the basic regular
                 expression:

                     [[:blank:]]*[^[:blank:]]*

                 If the fields option-argument specifies more fields than
                 appear on an input line, a null string shall be used for
                 comparison.

       -s chars  Ignore the first chars characters when doing comparisons,
                 where chars shall be a positive decimal integer. If specified
                 in conjunction with the -f option, the first chars characters
                 after the first fields fields shall be ignored. If the chars
                 option-argument specifies more characters than remain on an
                 input line, a null string shall be used for comparison.

       -u        Suppress the writing of lines that are repeated in the input.
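
       A minimal sketch of how -f and -s narrow the comparison, and how -d and
       -u select from the result. The log lines and all variable names are
       hypothetical; only the options themselves come from the text above.

```shell
# Hypothetical log lines: a timestamp field followed by a message.
log=$(printf '%s\n' '10:01 disk full' '10:02 disk full' '10:03 disk ok')

# -f 1 skips the first field (the timestamp), so the first two lines
# compare equal and only the first line of that run is written.
by_field=$(printf '%s\n' "$log" | uniq -f 1)

# -s 6 skips the first six characters ("10:01 "); same effect for this data.
by_chars=$(printf '%s\n' "$log" | uniq -s 6)

# -d writes only lines that are repeated, -u only lines that are not.
repeated=$(printf '%s\n' "$log" | uniq -d -f 1)
unrepeated=$(printf '%s\n' "$log" | uniq -u -f 1)

printf 'by field:\n%s\n' "$by_field"
printf 'repeated: %s\nunrepeated: %s\n' "$repeated" "$unrepeated"
```

       The skipped prefix affects only the comparison; each line is still
       written in full.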

OPERANDS

       The following operands shall be supported:

       input_file
                 A pathname of the input file. If the input_file operand is
                 not specified, or if the input_file is '-', the standard
                 input shall be used.

       output_file
                 A pathname of the output file. If the output_file operand is
                 not specified, the standard output shall be used. The results
                 are unspecified if the file named by output_file is the file
                 named by input_file.
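
       The operand forms can be sketched as follows. The scratch directory and
       file names are hypothetical, and mktemp is assumed to be available (it
       is not specified by POSIX).

```shell
# Work in a scratch directory; mktemp -d is an assumption, not POSIX.
dir=$(mktemp -d)
printf 'x\nx\ny\n' > "$dir/in.txt"

# Both operands given: read in.txt, write out.txt. The two operands
# must name different files, since the result is otherwise unspecified.
uniq "$dir/in.txt" "$dir/out.txt"

# '-' as input_file selects standard input; output goes to the named file.
printf 'x\nx\ny\n' | uniq - "$dir/out2.txt"

from_file=$(cat "$dir/out.txt")
from_stdin=$(cat "$dir/out2.txt")
rm -r "$dir"

printf '%s\n' "$from_file"
```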

STDIN

       The standard input shall be used only if no input_file operand is
       specified or if input_file is '-'. See the INPUT FILES section.

INPUT FILES

       The input file shall be a text file.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of uniq:

       LANG      Provide a default value for the internationalization
                 variables that are unset or null. (See the Base Definitions
                 volume of POSIX.1-2008, Section 8.2, Internationalization
                 Variables for the precedence of internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If set to a non-empty string value, override the values of
                 all the other internationalization variables.

       LC_COLLATE
                 Determine the locale for ordering rules.

       LC_CTYPE  Determine the locale for the interpretation of sequences of
                 bytes of text data as characters (for example, single-byte as
                 opposed to multi-byte characters in arguments and input
                 files) and which characters constitute a <blank> in the
                 current locale.

       LC_MESSAGES
                 Determine the locale that should be used to affect the format
                 and contents of diagnostic messages written to standard
                 error.

       NLSPATH   Determine the location of message catalogs for the processing
                 of LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       The standard output shall be used if no output_file operand is
       specified, and shall be used if the output_file operand is '-' and the
       implementation treats the '-' as meaning standard output. Otherwise,
       the standard output shall not be used. See the OUTPUT FILES section.

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       If the -c option is specified, the output file shall be empty or each
       line shall be of the form:

           "%d %s", <number of duplicates>, <line>

       otherwise, the output file shall be empty or each line shall be of the
       form:

           "%s", <line>
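
       A sketch of consuming the -c format from a script. Some implementations
       are known to pad the count with leading blanks, so the loop below uses
       read, which discards leading blanks, instead of relying on an exact
       column layout. The sample data and the names count, line, and summary
       are assumptions for this example.

```shell
# Count adjacent repeats, then split each output line into count and text.
# read strips leading blanks, so a padded count is handled as well.
summary=$(printf 'a\na\nb\n' | uniq -c | while read -r count line; do
    printf '%s=%s\n' "$line" "$count"
done)
printf '%s\n' "$summary"
```

       Because read also drops blanks between the count and the text, this
       approach is unsuitable when leading blanks in the line are significant.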

EXTENDED DESCRIPTION

       None.

EXIT STATUS

       The following exit values shall be returned:

        0    The utility executed successfully.

       >0    An error occurred.
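
       A minimal sketch of testing the exit status from a script. The pathname
       is hypothetical and is assumed not to exist, so that uniq reports an
       error.

```shell
# A pathname assumed not to exist, used here to force a failure.
missing=/nonexistent/uniq-input.txt

# A zero exit status means success; any status greater than zero is an error.
if uniq "$missing" >/dev/null 2>&1; then
    status=ok
else
    status=failed
fi
printf 'uniq %s\n' "$status"
```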

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       The sort utility can be used to cause repeated lines to be adjacent in
       the input file.
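
       For example, the following sketch contrasts uniq alone with sort piped
       into uniq. The sample data and variable names are assumptions.

```shell
# Duplicates that are not adjacent survive plain uniq ...
data=$(printf 'b\na\nb\na\n')
plain=$(printf '%s\n' "$data" | uniq)

# ... but sorting first makes them adjacent, so uniq can remove them.
deduped=$(printf '%s\n' "$data" | sort | uniq)

printf 'plain:\n%s\n' "$plain"
printf 'sorted first:\n%s\n' "$deduped"
```

       Sorting changes the order of the lines, so this is appropriate only
       when the original order does not matter.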

EXAMPLES

       The following input file data (but flushed left) was used for a test
       series on uniq:

           #01 foo0 bar0 foo1 bar1
           #02 bar0 foo1 bar1 foo1
           #03 foo0 bar0 foo1 bar1
           #04
           #05 foo0 bar0 foo1 bar1
           #06 foo0 bar0 foo1 bar1
           #07 bar0 foo1 bar1 foo0

       What follows is a series of test invocations of the uniq utility that
       use a mixture of uniq options against the input file data. These tests
       verify the meaning of adjacent. The uniq utility views the input data
       as a sequence of strings delimited by '\n'. Accordingly, for the
       fieldsth member of the sequence, uniq interprets unique or repeated
       adjacent lines strictly relative to the fields+1th member.

        1. This first example tests the line counting option, comparing each
           line of the input file data starting from the second field:

               uniq -c -f 1 uniq_0I.t
                   1 #01 foo0 bar0 foo1 bar1
                   1 #02 bar0 foo1 bar1 foo1
                   1 #03 foo0 bar0 foo1 bar1
                   1 #04
                   2 #05 foo0 bar0 foo1 bar1
                   1 #07 bar0 foo1 bar1 foo0

           The number '2', prefixing the fifth line of output, signifies that
           the uniq utility detected a pair of repeated lines. Given the input
           data, this can only be true when uniq is run using the -f 1 option
           (which shall cause uniq to ignore the first field on each input
           line).

        2. The second example tests the option to suppress unique lines,
           comparing each line of the input file data starting from the second
           field:

               uniq -d -f 1 uniq_0I.t
               #05 foo0 bar0 foo1 bar1

        3. This test suppresses repeated lines, comparing each line of the
           input file data starting from the second field:

               uniq -u -f 1 uniq_0I.t
               #01 foo0 bar0 foo1 bar1
               #02 bar0 foo1 bar1 foo1
               #03 foo0 bar0 foo1 bar1
               #04
               #07 bar0 foo1 bar1 foo0

        4. This suppresses unique lines, comparing each line of the input file
           data starting from the third character:

               uniq -d -s 2 uniq_0I.t

           In the last example, the uniq utility found no input matching the
           above criteria.
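
       The test series can be reproduced with a sketch along the following
       lines. The scratch directory and the variable names t1 to t4 are
       assumptions, and mktemp is assumed to be available (it is not specified
       by POSIX).

```shell
# Recreate the input file data, flushed left, in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/uniq_0I.t" <<'EOF'
#01 foo0 bar0 foo1 bar1
#02 bar0 foo1 bar1 foo1
#03 foo0 bar0 foo1 bar1
#04
#05 foo0 bar0 foo1 bar1
#06 foo0 bar0 foo1 bar1
#07 bar0 foo1 bar1 foo0
EOF

t1=$(uniq -c -f 1 "$dir/uniq_0I.t")   # counts, comparing from field 2
t2=$(uniq -d -f 1 "$dir/uniq_0I.t")   # repeated lines only
t3=$(uniq -u -f 1 "$dir/uniq_0I.t")   # non-repeated lines only
t4=$(uniq -d -s 2 "$dir/uniq_0I.t")   # comparing from the third character
rm -r "$dir"

printf '%s\n' "$t1"
```

       The exact spacing in front of each count in t1 may differ between
       implementations.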

RATIONALE

       Some historical implementations have limited lines to be 1080 bytes in
       length, which does not meet the implied {LINE_MAX} limit.

       Earlier versions of this standard allowed the -number and +number
       options. These options are no longer specified by POSIX.1-2008 but may
       be present in some implementations.

FUTURE DIRECTIONS

       None.

SEE ALSO

       comm, sort

       The Base Definitions volume of POSIX.1-2008, Chapter 8, Environment
       Variables, Section 12.2, Utility Syntax Guidelines

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, Copyright (C) 2013 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group. (This is
       POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.unix.org/online.html .

       Any typographical or formatting errors that appear in this page are
       most likely to have been introduced during the conversion of the source
       files to man page format. To report such errors, see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .

IEEE/The Open Group                  2013                             UNIQ(1P)