COMM(1P)                   POSIX Programmer's Manual                  COMM(1P)

PROLOG

       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       comm - select or reject lines common to two files

SYNOPSIS

       comm [-123] file1 file2

DESCRIPTION

       The comm utility shall read file1 and file2, which should be ordered in
       the current collating sequence, and produce three text columns as
       output: lines only in file1, lines only in file2, and lines in both
       files.

       If the lines in both files are not ordered according to the collating
       sequence of the current locale, the results are unspecified.

       If the collating sequence of the current locale does not have a total
       ordering of all characters (see the Base Definitions volume of
       POSIX.1-2017, Section 7.3.2, LC_COLLATE) and any lines from the input
       files collate equally but are not identical, comm should treat them as
       different lines but may treat them as being the same. If it treats them
       as different, comm should expect them to be ordered according to a
       further byte-by-byte comparison using the collating sequence for the
       POSIX locale and if they are not ordered in this way, the output of
       comm can identify such lines as being both unique to file1 and unique
       to file2 instead of being in both files.
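
       As an illustration of the three-column output, the following sketch
       builds two small sorted files and runs comm on them in the POSIX
       locale. The sample data, the tmpdir and three_columns variables, and
       the use of mktemp -d (widely available, but not required by this
       volume of POSIX.1-2017) are assumptions made for the example.

```shell
# Hypothetical sample data; the directory and file names are illustrative.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmpdir/file1"
printf '%s\n' banana cherry date > "$tmpdir/file2"

# Run in the POSIX locale so the collation matches the byte order of the data.
three_columns=$(LC_ALL=POSIX comm "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$three_columns"

rm -rf "$tmpdir"
```

       With this data, "apple" appears in the first column, "date" in the
       second (after one <tab>), and "banana" and "cherry" in the third
       (after two <tab>s).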

OPTIONS

       The comm utility shall conform to the Base Definitions volume of
       POSIX.1-2017, Section 12.2, Utility Syntax Guidelines.

       The following options shall be supported:

       -1        Suppress the output column of lines unique to file1.

       -2        Suppress the output column of lines unique to file2.

       -3        Suppress the output column of lines duplicated in file1 and
                 file2.
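
       Combining two of the options leaves a single column, which makes comm
       usable for simple set operations on sorted files. The sketch below
       assumes hypothetical sample files and variable names, and uses
       mktemp -d for scratch space, which is an assumption about the host
       system.

```shell
# Hypothetical sample data; the directory and file names are illustrative.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmpdir/file1"
printf '%s\n' banana cherry date > "$tmpdir/file2"

# -23 suppresses columns 2 and 3: lines unique to file1 remain.
only_in_file1=$(LC_ALL=POSIX comm -23 "$tmpdir/file1" "$tmpdir/file2")
# -13 suppresses columns 1 and 3: lines unique to file2 remain.
only_in_file2=$(LC_ALL=POSIX comm -13 "$tmpdir/file1" "$tmpdir/file2")
# -12 suppresses columns 1 and 2: lines present in both files remain.
in_both=$(LC_ALL=POSIX comm -12 "$tmpdir/file1" "$tmpdir/file2")

printf 'only in file1: %s\n' "$only_in_file1"
printf 'only in file2: %s\n' "$only_in_file2"
printf 'in both:\n%s\n' "$in_both"

rm -rf "$tmpdir"
```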

OPERANDS

       The following operands shall be supported:

       file1     A pathname of the first file to be compared. If file1 is '-',
                 the standard input shall be used.

       file2     A pathname of the second file to be compared. If file2 is
                 '-', the standard input shall be used.

       If both file1 and file2 refer to standard input or to the same FIFO
       special, block special, or character special file, the results are
       undefined.
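
       A typical use of the '-' operand is to sort unsorted data on the fly
       and feed it to comm through a pipe. The following is a minimal sketch
       with hypothetical data and variable names; mktemp -d is assumed to be
       available.

```shell
# Hypothetical sample data; the directory and file names are illustrative.
tmpdir=$(mktemp -d)
printf '%s\n' banana cherry date > "$tmpdir/file2"

# file1 is given as '-', so comm reads the sorted stream from standard input.
# sort and comm run in the same locale so that they agree on the ordering.
common=$(printf '%s\n' cherry apple banana |
    LC_ALL=POSIX sort |
    LC_ALL=POSIX comm -12 - "$tmpdir/file2")
printf '%s\n' "$common"

rm -rf "$tmpdir"
```

       Only one of the two operands can be '-' in such a pipeline, since the
       results are undefined when both refer to standard input.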

STDIN

       The standard input shall be used only if one of the file1 or file2
       operands refers to standard input. See the INPUT FILES section.

INPUT FILES

       The input files shall be text files.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of comm:

       LANG      Provide a default value for the internationalization
                 variables that are unset or null. (See the Base Definitions
                 volume of POSIX.1-2017, Section 8.2, Internationalization
                 Variables for the precedence of internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If set to a non-empty string value, override the values of
                 all the other internationalization variables.

       LC_COLLATE
                 Determine the locale for the collating sequence comm expects
                 to have been used when the input files were sorted.

       LC_CTYPE  Determine the locale for the interpretation of sequences of
                 bytes of text data as characters (for example, single-byte as
                 opposed to multi-byte characters in arguments and input
                 files).

       LC_MESSAGES
                 Determine the locale that should be used to affect the format
                 and contents of diagnostic messages written to standard
                 error.

       NLSPATH   Determine the location of message catalogs for the processing
                 of LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       The comm utility shall produce output depending on the options
       selected. If the -1, -2, and -3 options are all selected, comm shall
       write nothing to standard output.

       If the -1 option is not selected, lines contained only in file1 shall
       be written using the format:

           "%s\n", <line in file1>

       If the -2 option is not selected, lines contained only in file2 are
       written using the format:

           "%s%s\n", <lead>, <line in file2>

       where the string <lead> is as follows:

       <tab>     The -1 option is not selected.

       null string
                 The -1 option is selected.

       If the -3 option is not selected, lines contained in both files shall
       be written using the format:

           "%s%s\n", <lead>, <line in both>

       where the string <lead> is as follows:

       <tab><tab>
                 Neither the -1 nor the -2 option is selected.

       <tab>     Exactly one of the -1 and -2 options is selected.

       null string
                 Both the -1 and -2 options are selected.

       If the input files were ordered according to the collating sequence of
       the current locale, the lines written shall be in the collating
       sequence of the current locale. If the input files contained any lines
       that collated equally but were not identical and within each file those
       lines were ordered according to a further byte-by-byte comparison using
       the collating sequence for the POSIX locale, and comm treated them as
       different lines, then lines written that collate equally but are not
       identical should be ordered according to a further byte-by-byte
       comparison using the collating sequence for the POSIX locale.
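
       The effect of the options on <lead> can be made visible by translating
       each <tab> into a printable character. The sketch below does this with
       tr; the sample data, the variable names, the choice of '>' as the
       stand-in for <tab>, and the use of mktemp -d are all assumptions of
       the example.

```shell
# Hypothetical sample data; the directory and file names are illustrative.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmpdir/file1"
printf '%s\n' banana cherry date > "$tmpdir/file2"

# With -1, lines only in file2 get a null <lead> and lines in both files
# get a single <tab>, shown here as '>'.
lead_with_1=$(LC_ALL=POSIX comm -1 "$tmpdir/file1" "$tmpdir/file2" | tr '\t' '>')
# With -3, lines only in file2 keep their single <tab> because -1 is not set.
lead_with_3=$(LC_ALL=POSIX comm -3 "$tmpdir/file1" "$tmpdir/file2" | tr '\t' '>')

printf -- '-1:\n%s\n' "$lead_with_1"
printf -- '-3:\n%s\n' "$lead_with_3"

rm -rf "$tmpdir"
```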

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       None.

EXIT STATUS

       The following exit values shall be returned:

        0    All input files were successfully output as specified.

       >0    An error occurred.
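
       A script can branch on the exit status in the usual way. The following
       sketch assumes that naming a nonexistent file counts as an error, which
       is the expected behavior but is not spelled out above; the file names
       and the status_ok variable are hypothetical, and mktemp -d is assumed
       to be available.

```shell
# Hypothetical sample data; "missing" is deliberately never created.
tmpdir=$(mktemp -d)
printf '%s\n' apple > "$tmpdir/file1"

# Diagnostics go to standard error, so both streams are discarded here.
if LC_ALL=POSIX comm "$tmpdir/file1" "$tmpdir/missing" >/dev/null 2>&1; then
    status_ok=yes
else
    status_ok=no
fi
printf 'comm succeeded: %s\n' "$status_ok"

rm -rf "$tmpdir"
```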

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       If the input files are not properly presorted, the output of comm might
       not be useful.

       When using comm to process pathnames, it is recommended that LC_ALL, or
       at least LC_CTYPE and LC_COLLATE, are set to POSIX or C in the
       environment, since pathnames can contain byte sequences that do not
       form valid characters in some locales, in which case the utility's
       behavior would be undefined. In the POSIX locale each byte is a valid
       single-byte character, and therefore this problem is avoided.
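
       The pathname recommendation might be applied as in the following
       sketch, which compares the regular files found under two directories.
       The directory layout, the list_files helper, and the variable names are
       hypothetical, mktemp -d is assumed to be available, and pathnames
       containing <newline> characters are not handled.

```shell
# Hypothetical directory layout used only for this illustration.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/old" "$tmpdir/new"
touch "$tmpdir/old/a.txt" "$tmpdir/old/b.txt"
touch "$tmpdir/new/b.txt" "$tmpdir/new/c.txt"

# Hypothetical helper: list regular files relative to a directory, sorted
# in the C locale so that every byte sequence is a valid character.
list_files() { (cd "$1" && find . -type f | LC_ALL=C sort); }

list_files "$tmpdir/old" > "$tmpdir/old.list"
list_files "$tmpdir/new" > "$tmpdir/new.list"

# comm runs in the same locale that was used for sorting.
removed=$(LC_ALL=C comm -23 "$tmpdir/old.list" "$tmpdir/new.list")
added=$(LC_ALL=C comm -13 "$tmpdir/old.list" "$tmpdir/new.list")
printf 'removed: %s\nadded: %s\n' "$removed" "$added"

rm -rf "$tmpdir"
```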

       If the collating sequence of the current locale does not have a total
       ordering of all characters, this can affect the behavior of comm in the
       following ways:

        *  If comm treats lines as being the same only if they are identical,
           some lines can be misleadingly identified as being both unique to
           file1 and unique to file2.

        *  If comm treats lines as being the same if they collate equally and
           a line from file1 collates equally with a line from file2 but is
           not identical to it, one of the lines is misleadingly identified as
           being in both files and the other is not written to the output at
           all.

       Such problems can be avoided by forcing the use of the POSIX locale;
       for example, the following identifies lines in both file1 and file2:

           LC_ALL=POSIX sort file1 > file1.posix
           LC_ALL=POSIX sort file2 > file2.posix
           LC_ALL=POSIX comm -12 file1.posix file2.posix | sort

       The final sort re-sorts the output of comm according to the collating
       sequence of the original locale. Doing this might be difficult if more
       than one column is output and leading <blank>s cannot be ignored.

EXAMPLES

       If a file named xcu contains a sorted list of the utilities in this
       volume of POSIX.1-2017, a file named xpg3 contains a sorted list of the
       utilities specified in the X/Open Portability Guide, Issue 3, and a
       file named svid89 contains a sorted list of the utilities in the System
       V Interface Definition Third Edition:

           comm -23 xcu xpg3 | comm -23 - svid89

       would print a list of utilities in this volume of POSIX.1-2017 not
       specified by either of the other documents:

           comm -12 xcu xpg3 | comm -12 - svid89

       would print a list of utilities specified by all three documents, and:

           comm -12 xpg3 svid89 | comm -23 - xcu

       would print a list of utilities specified by both XPG3 and the SVID,
       but not specified in this volume of POSIX.1-2017.

RATIONALE

       None.

FUTURE DIRECTIONS

       A future version of this standard may require that if any lines from
       the input files collate equally but are not identical, then comm treats
       them as different lines and expects them to be ordered according to a
       further byte-by-byte comparison using the collating sequence for the
       POSIX locale.

       A future version of this standard may require that if the input files
       contained any lines that collated equally but were not identical and
       within each file those lines were ordered according to a further
       byte-by-byte comparison using the collating sequence for the POSIX
       locale, then lines written that collate equally but are not identical
       are ordered according to a further byte-by-byte comparison using the
       collating sequence for the POSIX locale.

SEE ALSO

       cmp, diff, sort, uniq

       The Base Definitions volume of POSIX.1-2017, Section 7.3.2, LC_COLLATE,
       Chapter 8, Environment Variables, Section 12.2, Utility Syntax
       Guidelines

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1-2017, Standard for Information Technology --
       Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, 2018 Edition, Copyright (C) 2018 by the
       Institute of Electrical and Electronics Engineers, Inc and The Open
       Group. In the event of any discrepancy between this version and the
       original IEEE and The Open Group Standard, the original IEEE and The
       Open Group Standard is the referee document. The original Standard can
       be obtained online at http://www.opengroup.org/unix/online.html .

       Any typographical or formatting errors that appear in this page are
       most likely to have been introduced during the conversion of the source
       files to man page format. To report such errors, see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .

IEEE/The Open Group                  2017                             COMM(1P)