DIFF(P)                 POSIX Programmer's Manual                DIFF(P)

NAME
diff - compare two files

SYNOPSIS
diff [-c| -e| -f| -C n][-br] file1 file2

DESCRIPTION
The diff utility shall compare the contents of file1 and file2 and
write to standard output a list of changes necessary to convert file1
into file2. This list should be minimal. No output shall be produced if
the files are identical.

OPTIONS
The diff utility shall conform to the Base Definitions volume of
IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines.

The following options shall be supported:

-b     Cause any amount of white space at the end of a line to be
       treated as a single <newline> (that is, the white-space
       characters preceding the <newline> are ignored) and other
       strings of white-space characters, not including <newline>s, to
       compare equal.

-c     Produce output in a form that provides three lines of context.

-C n   Produce output in a form that provides n lines of context (where
       n shall be interpreted as a positive decimal integer).

-e     Produce output in a form suitable as input for the ed utility,
       which can then be used to convert file1 into file2.

-f     Produce output in an alternative form, similar in format to -e,
       but not intended to be suitable as input for the ed utility, and
       in the opposite order.

-r     Apply diff recursively to files and directories of the same name
       when file1 and file2 are both directories.

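The comparison rule for -b can be illustrated with a minimal Python
sketch. The names normalize_b and equal_under_b are hypothetical, and
the set of white-space characters is an assumption (the POSIX locale
space class without <newline>); an implementation is free to compare
lines in some other way that yields the same result.

```python
import re

# Assumption: "white space" is the POSIX-locale space class minus <newline>.
_WS_RUN = re.compile(r"[ \t\v\f\r]+")


def normalize_b(line: str) -> str:
    """Return a key; two lines compare equal under -b when their keys match."""
    body = line.rstrip("\n")
    # Every string of white-space characters compares equal to any other,
    # so each run is collapsed to a single space.
    body = _WS_RUN.sub(" ", body)
    # White space preceding the <newline> is ignored entirely.
    return body.rstrip(" ")


def equal_under_b(a: str, b: str) -> bool:
    """Compare two lines the way the -b description suggests."""
    return normalize_b(a) == normalize_b(b)
```

Note that a run of white space is never treated as equal to no white
space at all, so "ab" and "a b" still differ under this sketch.
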
OPERANDS
The following operands shall be supported:

file1, file2
       A pathname of a file to be compared. If either the file1 or
       file2 operand is '-', the standard input shall be used in its
       place.

If both file1 and file2 are directories, diff shall not compare block
special files, character special files, or FIFO special files to any
files and shall not compare regular files to directories. Further
details are as specified in Diff Directory Comparison Format. The
behavior of diff on other file types is implementation-defined when
found in directories.

If only one of file1 and file2 is a directory, diff shall be applied to
the non-directory file and the file contained in the directory file
with a filename that is the same as the last component of the
non-directory file.

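The rule for a single directory operand can be sketched in Python as
follows. The function name resolve_operands is hypothetical, and the
sketch deliberately ignores the '-' operand and the case where both
operands are directories.

```python
from __future__ import annotations

import os.path


def resolve_operands(file1: str, file2: str) -> tuple[str, str]:
    """Return the pair of pathnames that would actually be compared.

    Assumption: neither operand is '-' (standard input is not handled).
    """
    dir1, dir2 = os.path.isdir(file1), os.path.isdir(file2)
    if dir1 and not dir2:
        # Look inside file1 for the last component of file2.
        return os.path.join(file1, os.path.basename(file2)), file2
    if dir2 and not dir1:
        # Look inside file2 for the last component of file1.
        return file1, os.path.join(file2, os.path.basename(file1))
    # Two non-directories, or two directories: operands are used as given.
    return file1, file2
```

For example, with an existing directory backup, the operands
src/notes.txt and backup would resolve to src/notes.txt and
backup/notes.txt.
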
STDIN
The standard input shall be used only if one of the file1 or file2
operands references standard input. See the INPUT FILES section.

INPUT FILES
The input files may be of any type.

ENVIRONMENT VARIABLES
The following environment variables shall affect the execution of diff:

LANG   Provide a default value for the internationalization variables
       that are unset or null. (See the Base Definitions volume of
       IEEE Std 1003.1-2001, Section 8.2, Internationalization
       Variables for the precedence of internationalization variables
       used to determine the values of locale categories.)

LC_ALL If set to a non-empty string value, override the values of all
       the other internationalization variables.

LC_CTYPE
       Determine the locale for the interpretation of sequences of
       bytes of text data as characters (for example, single-byte as
       opposed to multi-byte characters in arguments and input files).

LC_MESSAGES
       Determine the locale that should be used to affect the format
       and contents of diagnostic messages written to standard error
       and informative messages written to standard output.

LC_TIME
       Determine the locale for affecting the format of file timestamps
       written with the -C and -c options.

NLSPATH
       Determine the location of message catalogs for the processing of
       LC_MESSAGES.

TZ     Determine the timezone used for calculating file timestamps
       written with the -C and -c options. If TZ is unset or null, an
       unspecified default timezone shall be used.

ASYNCHRONOUS EVENTS
Default.

STDOUT
Diff Directory Comparison Format
If both file1 and file2 are directories, the following output formats
shall be used.

In the POSIX locale, each file that is present in only one directory
shall be reported using the following format:

    "Only in %s: %s\n", <directory pathname>, <filename>

In the POSIX locale, subdirectories that are common to the two
directories may be reported with the following format:

    "Common subdirectories: %s and %s\n", <directory1 pathname>,
        <directory2 pathname>

For each file common to the two directories, if the two files are not
to be compared, the following format shall be used in the POSIX locale:

    "File %s is a %s while file %s is a %s\n", <directory1 pathname>,
        <file type of directory1 pathname>, <directory2 pathname>,
        <file type of directory2 pathname>

For each file common to the two directories, if the files are compared
and are identical, no output shall be written. If the two files differ,
the following format is written:

    "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

where <diff_options> are the options as specified on the command line.

All directory pathnames listed in this section shall be relative to the
original command line arguments. All other names of files listed in
this section shall be filenames (pathname components).

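As an illustration of the first two message formats, the following
Python sketch reports one directory level. The function name
directory_report is hypothetical, the sorted traversal order is an
assumption (no search order is specified for diff), and the "File %s
is a %s" and "diff %s %s %s" cases are deliberately left out.

```python
from __future__ import annotations

import os


def directory_report(dir1: str, dir2: str) -> list[str]:
    """Return the 'Only in' and 'Common subdirectories' lines for one level."""
    names1, names2 = set(os.listdir(dir1)), set(os.listdir(dir2))
    lines = []
    # Assumption: entries are visited in sorted order.
    for name in sorted(names1 | names2):
        if name not in names2:
            lines.append(f"Only in {dir1}: {name}")
        elif name not in names1:
            lines.append(f"Only in {dir2}: {name}")
        else:
            path1 = os.path.join(dir1, name)
            path2 = os.path.join(dir2, name)
            if os.path.isdir(path1) and os.path.isdir(path2):
                # This message is permitted but not required.
                lines.append(f"Common subdirectories: {path1} and {path2}")
    return lines
```

The directory pathnames in the messages are built from the arguments as
given, which keeps them relative to the original command line
arguments.
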
Diff Binary Output Format
In the POSIX locale, if one or both of the files being compared are not
text files, an unspecified format shall be used that contains the
pathnames of two files being compared and the string "differ".

If both files being compared are text files, depending on the options
specified, one of the following formats shall be used to write the
differences.

Diff Default Output Format
The default (without -e, -f, -c, or -C options) diff utility output
shall contain lines of these forms:

    "%da%d\n", <num1>, <num2>

    "%da%d,%d\n", <num1>, <num2>, <num3>

    "%dd%d\n", <num1>, <num2>

    "%d,%dd%d\n", <num1>, <num2>, <num3>

    "%dc%d\n", <num1>, <num2>

    "%d,%dc%d\n", <num1>, <num2>, <num3>

    "%dc%d,%d\n", <num1>, <num2>, <num3>

    "%d,%dc%d,%d\n", <num1>, <num2>, <num3>, <num4>

These lines resemble ed subcommands to convert file1 into file2. The
line numbers before the action letters shall pertain to file1; those
after shall pertain to file2. Thus, by exchanging a for d and reading
the line in reverse order, one can also determine how to convert file2
into file1. As in ed, identical pairs (where num1 = num2) are
abbreviated as a single number.

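The reversal described above can be sketched in Python. The function
name invert_command is hypothetical; the regular expression accepts
only the eight line forms listed in this section.

```python
import re

# <file1 range> <action letter> <file2 range>, each range "n" or "n,m".
_CMD = re.compile(r"(\d+(?:,\d+)?)([acd])(\d+(?:,\d+)?)")
_SWAP = {"a": "d", "d": "a", "c": "c"}


def invert_command(cmd: str) -> str:
    """Turn a file1-to-file2 change line into the file2-to-file1 one."""
    m = _CMD.fullmatch(cmd.strip())
    if m is None:
        raise ValueError(f"not a change line: {cmd!r}")
    file1_range, action, file2_range = m.groups()
    # Exchange a for d and read the line in reverse order.
    return f"{file2_range}{_SWAP[action]}{file1_range}"
```

For example, 5a6,8 (append file2 lines 6 to 8 after file1 line 5)
becomes 6,8d5 (delete lines 6 to 8 of file2).
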
Following each of these lines, diff shall write to standard output all
lines affected in the first file using the format:

    "< %s", <line>

and all lines affected in the second file using the format:

    "> %s", <line>

If there are lines affected in both file1 and file2 (as with the c
subcommand), the changes are separated with a line consisting of three
hyphens:

    "---\n"

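To show how the pieces of the default format fit together, the
following Python sketch applies such output to the lines of file1 and
reconstructs file2. The function name apply_normal_diff is
hypothetical, and the sketch assumes that every line has already had
its trailing <newline> removed and that the input is well-formed.

```python
import re

_CMD = re.compile(r"(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?")


def apply_normal_diff(file1_lines, diff_lines):
    """Rebuild file2 from file1 and default-format diff output."""
    out = []
    pos = 0  # number of file1 lines already copied or skipped
    i = 0
    while i < len(diff_lines):
        m = _CMD.fullmatch(diff_lines[i])
        if m is None:
            raise ValueError(f"expected a change line: {diff_lines[i]!r}")
        start = int(m.group(1))
        end = int(m.group(2) or start)
        action = m.group(3)
        i += 1
        added = []
        # Body lines start with "<", ">" or "---", never with a digit.
        while i < len(diff_lines) and not _CMD.fullmatch(diff_lines[i]):
            if diff_lines[i].startswith("> "):
                added.append(diff_lines[i][2:])
            i += 1
        if action == "a":
            out.extend(file1_lines[pos:start])      # keep through line start
            pos = start
        else:
            out.extend(file1_lines[pos:start - 1])  # keep up to the range
            pos = end                               # drop lines start..end
        out.extend(added)
    out.extend(file1_lines[pos:])
    return out
```

Only the numbers before the action letter are needed here, because they
pertain to file1; the "<" lines and the "---" separator are skipped.
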
Diff -e Output Format
With the -e option, a script shall be produced that shall, when
provided as input to ed, along with an appended w (write) command,
convert file1 into file2. Only the a (append), c (change), d (delete),
i (insert), and s (substitute) commands of ed shall be used in this
script. Text lines, except those consisting of the single character
period ('.'), shall be output as they appear in the file.

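The effect of such a script can be sketched in Python without running
ed. The function name apply_ed_script is hypothetical, and the sketch
is restricted to the a, c, and d commands; the i and s commands and the
final w are assumptions left out of scope. Lines are assumed to have
had their trailing <newline> removed.

```python
import re

_ED_CMD = re.compile(r"(\d+)(?:,(\d+))?([acd])")


def apply_ed_script(lines, script):
    """Apply a restricted ed script (a, c, d only) to a list of lines."""
    buf = list(lines)
    i = 0
    while i < len(script):
        m = _ED_CMD.fullmatch(script[i])
        if m is None:
            raise ValueError(f"unsupported command: {script[i]!r}")
        start = int(m.group(1))
        end = int(m.group(2) or start)
        action = m.group(3)
        i += 1
        text = []
        if action in "ac":
            # Input text runs up to a line holding a single period.
            while script[i] != ".":
                text.append(script[i])
                i += 1
            i += 1  # skip the terminating period
        if action == "a":
            buf[start:start] = text        # append after line start
        elif action == "c":
            buf[start - 1:end] = text      # replace lines start..end
        else:
            del buf[start - 1:end]         # delete lines start..end
    return buf
```

Because each command in the sketch addresses lines by their original
numbers, it gives the intended result when the commands are ordered
from the end of the file toward the beginning.
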
Diff -f Output Format
With the -f option, an alternative format of script shall be produced.
It is similar to that produced by -e, with the following differences:

 1. It is expressed in reverse sequence; the output of -e orders
    changes from the end of the file to the beginning; the -f from
    beginning to end.

 2. The command form <lines> <command-letter> used by -e is reversed.
    For example, 10c with -e would be c10 with -f.

 3. The form used for ranges of line numbers is <space>-separated,
    rather than comma-separated.

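Differences 2 and 3 in the list above can be illustrated with a short
Python sketch that rewrites a single -e command line into its -f form.
The function name e_to_f_command is hypothetical, only the a, c, and d
commands are handled, and the reordering of whole commands (difference
1) is not attempted.

```python
import re

_E_CMD = re.compile(r"(\d+)(?:,(\d+))?([acd])")


def e_to_f_command(cmd: str) -> str:
    """Rewrite one -e style command line in the -f style."""
    m = _E_CMD.fullmatch(cmd)
    if m is None:
        raise ValueError(f"not an a, c, or d command: {cmd!r}")
    start, end, letter = m.groups()
    # Ranges are <space>-separated, and the command letter comes first.
    lines = start if end is None else f"{start} {end}"
    return f"{letter}{lines}"
```

For example, 10c becomes c10 and 3,5d becomes d3 5.
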
Diff -c or -C Output Format
With the -c or -C option, the output format shall consist of affected
lines along with surrounding lines of context. The affected lines shall
show which ones need to be deleted or changed in file1, and those added
from file2. With the -c option, three lines of context, if available,
shall be written before and after the affected lines. With the -C
option, the user can specify how many lines of context are written. The
exact format follows.

The name and last modification time of each file shall be output in the
following format:

    "*** %s %s\n", file1, <file1 timestamp>
    "--- %s %s\n", file2, <file2 timestamp>

Each <file> field shall be the pathname of the corresponding file being
compared. The pathname written for standard input is unspecified.

In the POSIX locale, each <timestamp> field shall be equivalent to the
output from the following command:

    date "+%a %b %e %T %Y"

without the trailing <newline>, executed at the time of last
modification of the corresponding file (or the current time, if the
file is standard input).

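The timestamp layout for the POSIX locale can be sketched in Python as
follows. The function name posix_diff_timestamp and its to_struct
parameter are hypothetical. The fields are assembled by hand rather
than with strftime, since support for %e and %T in strftime varies by
platform.

```python
import time

_DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def posix_diff_timestamp(mtime: float, to_struct=time.localtime) -> str:
    """Format a modification time like date "+%a %b %e %T %Y".

    Assumption: POSIX locale names; to_struct chooses the timezone
    (time.localtime honors TZ, time.gmtime gives UTC).
    """
    t = to_struct(mtime)
    return (
        f"{_DAYS[t.tm_wday]} {_MONTHS[t.tm_mon - 1]} "
        f"{t.tm_mday:2d} "                                   # %e: space-padded
        f"{t.tm_hour:02d}:{t.tm_min:02d}:{t.tm_sec:02d} "    # %T
        f"{t.tm_year}"
    )
```

A caller would typically pass os.stat(path).st_mtime as the first
argument.
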
Then, the following output formats shall be applied for every set of
changes.

First, a line shall be written in the following format:

    "***************\n"

Next, the range of lines in file1 shall be written in the following
format if the range contains two or more lines:

    "*** %d,%d ****\n", <beginning line number>, <ending line number>

and the following format otherwise:

    "*** %d ****\n", <ending line number>

The ending line number of an empty range shall be the number of the
preceding line, or 0 if the range is at the start of the file.

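The choice between the two range forms can be sketched in Python. The
function name context_range_header is hypothetical, as is the
convention of passing an empty range as end = begin - 1. The marker
parameter is an added convenience so that the same helper also covers
the file2 range lines, which use hyphens in place of asterisks.

```python
def context_range_header(begin: int, end: int, marker: str = "*") -> str:
    """Format the range line for lines begin..end (1-based, inclusive).

    Assumption: an empty range is passed as end == begin - 1, so an
    empty range at the start of the file is (1, 0).
    """
    lead, tail = marker * 3, marker * 4
    if end - begin + 1 >= 2:
        # Two or more lines: write both line numbers.
        return f"{lead} {begin},{end} {tail}"
    # One line or an empty range: write only the ending line number.
    return f"{lead} {end} {tail}"
```

For example, lines 3 through 7 give "*** 3,7 ****", a single line 5
gives "*** 5 ****", and an empty range at the start of the file gives
"*** 0 ****".
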
Next, the affected lines along with lines of context (unaffected lines)
shall be written. Unaffected lines shall be written in the following
format:

    "  %s", <unaffected_line>

Deleted lines shall be written as:

    "- %s", <deleted_line>

Changed lines shall be written as:

    "! %s", <changed_line>

Next, the range of lines in file2 shall be written in the following
format if the range contains two or more lines:

    "--- %d,%d ----\n", <beginning line number>, <ending line number>

and the following format otherwise:

    "--- %d ----\n", <ending line number>

Then, lines of context and changed lines shall be written as described
in the previous formats. Lines added from file2 shall be written in the
following format:

    "+ %s", <added_line>

STDERR
The standard error shall be used only for diagnostic messages.

OUTPUT FILES
None.

EXTENDED DESCRIPTION
None.

EXIT STATUS
The following exit values shall be returned:

0      No differences were found.

1      Differences were found.

>1     An error occurred.

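A caller that needs to tell "files differ" apart from "diff failed" has
to check all three cases. The following Python sketch shows one way to
do so; the names describe_exit_status and files_differ are
hypothetical, and files_differ assumes that a conforming diff utility
is available on the search path.

```python
import subprocess


def describe_exit_status(status: int) -> str:
    """Map a diff exit value to its documented meaning."""
    if status == 0:
        return "no differences"
    if status == 1:
        return "differences found"
    if status > 1:
        return "error"
    raise ValueError(f"not an exit value: {status}")


def files_differ(file1: str, file2: str) -> bool:
    """Run diff and report whether the two files differ.

    Assumption: 'diff' is on PATH. Output is discarded; only the exit
    value is inspected.
    """
    result = subprocess.run(["diff", file1, file2],
                            stdout=subprocess.DEVNULL)
    if result.returncode not in (0, 1):
        # Greater than 1 is an error; subprocess reports a negative
        # value if the process was terminated by a signal.
        raise RuntimeError(f"diff failed with status {result.returncode}")
    return result.returncode == 1
```

Treating every non-zero exit value as "files differ" would silently
hide errors such as an unreadable operand.
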
CONSEQUENCES OF ERRORS
Default.

The following sections are informative.

APPLICATION USAGE
If lines at the end of a file are changed and other lines are added,
diff output may show this as a delete and add, as a change, or as a
change and add; diff is not expected to know which happened and users
should not care about the difference in output as long as it clearly
shows the differences between the files.

EXAMPLES
If dir1 is a directory containing a directory named x, dir2 is a
directory containing a directory named x, dir1/x and dir2/x both
contain files named date.out, and dir2/x contains a file named y, the
command:

    diff -r dir1 dir2

could produce output similar to:

    Common subdirectories: dir1/x and dir2/x
    Only in dir2/x: y
    diff -r dir1/x/date.out dir2/x/date.out
    1c1
    < Mon Jul 2 13:12:16 PDT 1990
    ---
    > Tue Jun 19 21:41:39 PDT 1990

RATIONALE
The -h option was omitted because it was insufficiently specified and
does not add to applications portability.

Historical implementations employ algorithms that do not always produce
a minimum list of differences; the current language about making every
effort is the best this volume of IEEE Std 1003.1-2001 can do, as there
is no metric that could be employed to judge the quality of
implementations against any and all file contents. The statement "This
list should be minimal" clearly implies that implementations are not
expected to provide the following output when comparing two 100-line
files that differ in only one character on a single line:

    1,100c1,100
    all 100 lines from file1 preceded with "< "
    ---
    all 100 lines from file2 preceded with "> "

The "Only in" messages required when the -r option is specified are not
used by most historical implementations if the -e option is also
specified. It is required here because it provides useful information
that must be provided to update a target directory hierarchy to match a
source hierarchy. The "Common subdirectories" messages are written by
System V and 4.3 BSD when the -r option is specified. They are allowed
here but are not required because they are reporting on something that
is the same, not reporting a difference, and are not needed to update a
target hierarchy.

The -c option, which writes output in a format using lines of context,
has been included. The format is useful for a variety of reasons, among
them being much improved readability and the ability to understand
difference changes when the target file has line numbers that differ
from another similar, but slightly different, copy. The patch utility
is most valuable when working with difference listings using the
context format. The BSD version of -c takes an optional argument
specifying the amount of context. Rather than overloading -c and
breaking the Utility Syntax Guidelines for diff, the standard
developers decided to add a separate option for specifying a context
diff with a specified amount of context (-C). Also, the format for
context diffs was extended slightly in 4.3 BSD to allow multiple
changes that are within context lines from each other to be merged
together. The output format contains an additional four asterisks after
the range of affected lines in the first filename. This was to provide
a flag for old programs (like old versions of patch) that only
understand the old context format. The version of context described
here does not require that multiple changes within context lines be
merged, but it does not prohibit it either. The extension is
upwards-compatible, so any vendors that wish to retain the old version
of diff can do so by adding the extra four asterisks (that is,
utilities that currently use diff and understand the new merged format
will also understand the old unmerged format, but not vice versa).

The substitute command was added as an additional format for the -e
option. This was added to provide implementations with a way to fix the
classic "dot alone on a line" bug present in many versions of diff.
Since many implementations have fixed this bug, the standard developers
decided not to standardize broken behavior, but rather to provide the
necessary tool for fixing the bug. One way to fix this bug is to output
two periods whenever a lone period is needed, then terminate the append
command with a period, and then use the substitute command to convert
the two periods into one period.

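The workaround described in the preceding paragraph can be sketched in
Python as a generator of ed commands for one append. The function name
ed_append_command is hypothetical, and the particular substitute
command used (s/.//, which deletes the first character of the current
line) is an assumption; any substitution that turns two periods into
one would serve.

```python
def ed_append_command(after_line, text_lines):
    """Build an ed append script that survives text lines equal to '.'."""
    script = [f"{after_line}a"]
    appending = True
    for line in text_lines:
        if not appending:
            # Resume input mode; a bare "a" appends after the current line.
            script.append("a")
            appending = True
        if line == ".":
            script.append("..")     # stand-in for the lone period
            script.append(".")      # terminate the append command
            script.append("s/.//")  # current line ".." becomes "."
            appending = False
        else:
            script.append(line)
    if appending:
        script.append(".")          # normal terminator
    return script
```

The sketch relies on ed leaving the current line at the last line
entered, so that both the substitute command and the following bare a
command act at the right place.
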
The BSD-derived -r option was added to provide a mechanism for using
diff to compare two file system trees. This behavior is useful, is
standard practice on all BSD-derived systems, and is not easily
reproducible with the find utility.

The requirement that diff not compare files in some circumstances, even
though they have the same name, is based on the actual output of
historical implementations. The message specified here is already in
use when a directory is being compared to a non-directory. It is
extended here to preclude the problems arising from running into FIFOs
and other files that would cause diff to hang waiting for input with no
indication to the user that diff was hung. In most common usage,
diff -r should indicate differences in the file hierarchies, not the
difference of contents of devices pointed to by the hierarchies.

Many early implementations of diff require seekable files. Since the
System Interfaces volume of IEEE Std 1003.1-2001 supports named pipes,
the standard developers decided that such a restriction was
unreasonable. Note also that the allowed filename - almost always
refers to a pipe.

No directory search order is specified for diff. The historical
ordering is, in fact, not optimal, in that it prints out all of the
differences at the current level, including the statements about all
common subdirectories before recursing into those subdirectories.

The message:

    "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

does not vary by locale because it is the representation of a command,
not an English sentence.

FUTURE DIRECTIONS
None.

SEE ALSO
cmp, comm, ed, find

COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.opengroup.org/unix/online.html .

IEEE/The Open Group                  2003                       DIFF(P)