DIFF(1P)                   POSIX Programmer's Manual                  DIFF(1P)

PROLOG

       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       diff - compare two files

SYNOPSIS

       diff [-c|-e|-f|-u|-C n|-U n] [-br] file1 file2

DESCRIPTION

       The diff utility shall compare the contents of file1 and file2 and
       write to standard output a list of changes necessary to convert file1
       into file2. This list should be minimal. No output shall be produced
       if the files are identical.

OPTIONS

       The diff utility shall conform to the Base Definitions volume of
       POSIX.1-2017, Section 12.2, Utility Syntax Guidelines.

       The following options shall be supported:

       -b        Cause any amount of white space at the end of a line to be
                 treated as a single <newline> (that is, the white-space
                 characters preceding the <newline> are ignored) and other
                 strings of white-space characters, not including <newline>
                 characters, to compare equal.

       -c        Produce output in a form that provides three lines of copied
                 context.

       -C n      Produce output in a form that provides n lines of copied
                 context (where n shall be interpreted as a positive decimal
                 integer).

       -e        Produce output in a form suitable as input for the ed
                 utility, which can then be used to convert file1 into file2.

       -f        Produce output in an alternative form, similar in format to
                 -e, but not intended to be suitable as input for the ed
                 utility, and in the opposite order.

       -r        Apply diff recursively to files and directories of the same
                 name when file1 and file2 are both directories.

                 The diff utility shall detect infinite loops; that is,
                 entering a previously visited directory that is an ancestor
                 of the last file encountered. When it detects an infinite
                 loop, diff shall write a diagnostic message to standard error
                 and shall either recover its position in the hierarchy or
                 terminate.

       -u        Produce output in a form that provides three lines of unified
                 context.

       -U n      Produce output in a form that provides n lines of unified
                 context (where n shall be interpreted as a non-negative
                 decimal integer).

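       The -b comparison rule can be illustrated with a short Python sketch.
       It is not part of POSIX: the helper names normalize_b and equal_b are
       hypothetical, and white space is assumed to mean only <space> and <tab>
       characters, which is a simplification of the locale-dependent
       definition.

```python
import re


def normalize_b(line: str) -> str:
    """Return a comparison key for one line under the -b rules (sketch)."""
    body = line.rstrip("\n")
    # White space at the end of the line is ignored entirely.
    body = re.sub(r"[ \t]+$", "", body)
    # Any other run of white space compares equal to any other run,
    # so every run is collapsed to a single space (assumption: only
    # <space> and <tab> count as white space here).
    return re.sub(r"[ \t]+", " ", body)


def equal_b(line1: str, line2: str) -> bool:
    """True if the two lines compare equal under -b."""
    return normalize_b(line1) == normalize_b(line2)
```

       Note that a run of white space never compares equal to no white space
       at all, so "ab" and "a b" still differ under this sketch.
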

OPERANDS

       The following operands shall be supported:

       file1, file2
                 A pathname of a file to be compared. If either the file1 or
                 file2 operand is '-', the standard input shall be used in its
                 place.

       If both file1 and file2 are directories, diff shall not compare block
       special files, character special files, or FIFO special files to any
       files and shall not compare regular files to directories. Further
       details are as specified in Diff Directory Comparison Format. The
       behavior of diff on other file types is implementation-defined when
       found in directories.

       If only one of file1 and file2 is a directory, diff shall be applied to
       the non-directory file and the file contained in the directory file
       with a filename that is the same as the last component of the
       non-directory file.

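       The rule for a single directory operand can be sketched in Python as
       follows. The function name resolve_operands and its is_dir parameter
       are hypothetical and exist only for illustration; the '-' operand and
       pathnames with a trailing <slash> are not handled.

```python
import os


def resolve_operands(file1, file2, is_dir=os.path.isdir):
    """Return the pair of pathnames that would actually be compared (sketch).

    is_dir is injectable so the logic can be exercised without touching
    the file system; it defaults to os.path.isdir.
    """
    dir1, dir2 = is_dir(file1), is_dir(file2)
    if dir1 and not dir2:
        # Compare file2 with the entry of the same name inside file1.
        return os.path.join(file1, os.path.basename(file2)), file2
    if dir2 and not dir1:
        # Compare file1 with the entry of the same name inside file2.
        return file1, os.path.join(file2, os.path.basename(file1))
    # Two non-directories or two directories: operands are used as given.
    return file1, file2
```

       Under these assumptions, the operands "dir" and "src/a.txt" resolve to
       a comparison of "dir/a.txt" with "src/a.txt".
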

STDIN

       The standard input shall be used only if one of the file1 or file2
       operands references standard input. See the INPUT FILES section.

INPUT FILES

       The input files may be of any type.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of diff:

       LANG      Provide a default value for the internationalization
                 variables that are unset or null. (See the Base Definitions
                 volume of POSIX.1-2017, Section 8.2, Internationalization
                 Variables for the precedence of internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If set to a non-empty string value, override the values of
                 all the other internationalization variables.

       LC_CTYPE  Determine the locale for the interpretation of sequences of
                 bytes of text data as characters (for example, single-byte as
                 opposed to multi-byte characters in arguments and input
                 files).

       LC_MESSAGES
                 Determine the locale that should be used to affect the format
                 and contents of diagnostic messages written to standard error
                 and informative messages written to standard output.

       LC_TIME   Determine the locale for affecting the format of file
                 timestamps written with the -C and -c options.

       NLSPATH   Determine the location of message catalogs for the processing
                 of LC_MESSAGES.

       TZ        Determine the timezone used for calculating file timestamps
                 written with a context format. If TZ is unset or null, an
                 unspecified default timezone shall be used.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

   Diff Directory Comparison Format
       If both file1 and file2 are directories, the following output formats
       shall be used.

       In the POSIX locale, each file that is present in only one directory
       shall be reported using the following format:

           "Only in %s: %s\n", <directory pathname>, <filename>

       In the POSIX locale, subdirectories that are common to the two
       directories may be reported with the following format:

           "Common subdirectories: %s and %s\n", <directory1 pathname>,
               <directory2 pathname>

       For each file common to the two directories, if the two files are not
       to be compared: if the two files have the same device ID and file
       serial number, or are both block special files that refer to the same
       device, or are both character special files that refer to the same
       device, in the POSIX locale the output format is unspecified.
       Otherwise, in the POSIX locale an unspecified format shall be used that
       contains the pathnames of the two files.

       For each file common to the two directories, if the files are compared
       and are identical, no output shall be written. If the two files differ,
       the following format is written:

           "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

       where <diff_options> are the options as specified on the command line.

       All directory pathnames listed in this section shall be relative to the
       original command line arguments. All other names of files listed in
       this section shall be filenames (pathname components).

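       As an illustration of the "Only in" format, the following Python
       sketch builds the messages from two sets of filenames. The function
       name only_in_lines is hypothetical, and the sorted output order is an
       assumption: this manual page does not specify a directory search
       order.

```python
def only_in_lines(dir1, names1, dir2, names2):
    """Return "Only in" messages for names present in just one directory.

    names1 and names2 are sets of filenames (pathname components).
    Sorting is an assumption made for reproducible output.
    """
    lines = []
    for name in sorted(set(names1) | set(names2)):
        if name not in names2:
            lines.append(f"Only in {dir1}: {name}\n")
        elif name not in names1:
            lines.append(f"Only in {dir2}: {name}\n")
    return lines
```

       Names present in both directories produce no "Only in" message; they
       are handled by the other formats in this section.
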
   Diff Binary Output Format
       In the POSIX locale, if one or both of the files being compared are not
       text files, it is implementation-defined whether diff uses the binary
       file output format or the other formats as specified below. The binary
       file output format shall contain the pathnames of two files being
       compared and the string "differ".

       If both files being compared are text files, depending on the options
       specified, one of the following formats shall be used to write the
       differences.

   Diff Default Output Format
       The default (without -e, -f, -c, -C, -u, or -U options) diff utility
       output shall contain lines of these forms:

           "%da%d\n", <num1>, <num2>

           "%da%d,%d\n", <num1>, <num2>, <num3>

           "%dd%d\n", <num1>, <num2>

           "%d,%dd%d\n", <num1>, <num2>, <num3>

           "%dc%d\n", <num1>, <num2>

           "%d,%dc%d\n", <num1>, <num2>, <num3>

           "%dc%d,%d\n", <num1>, <num2>, <num3>

           "%d,%dc%d,%d\n", <num1>, <num2>, <num3>, <num4>

       These lines resemble ed subcommands to convert file1 into file2. The
       line numbers before the action letters shall pertain to file1; those
       after shall pertain to file2. Thus, by exchanging a for d and reading
       the line in reverse order, one can also determine how to convert file2
       into file1. As in ed, identical pairs (where num1=num2) are
       abbreviated as a single number.

       Following each of these lines, diff shall write to standard output all
       lines affected in the first file using the format:

           "< %s", <line>

       and all lines affected in the second file using the format:

           "> %s", <line>

       If there are lines affected in both file1 and file2 (as with the c
       subcommand), the changes are separated with a line consisting of three
       <hyphen-minus> characters:

           "---\n"

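       The change-command lines of the default format, and the reversal
       described above, can be sketched in Python. The names parse_change,
       format_range, and reverse_change are hypothetical. The pattern is
       slightly more permissive than the eight forms listed (for example, it
       would accept a range before the letter a).

```python
import re

# <range><action><range>, where a range is "n" or "n,m".
_CHANGE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def parse_change(line):
    """Parse a line such as "5,7c8,10"; return None for any other line."""
    m = _CHANGE.match(line.rstrip("\n"))
    if m is None:
        return None
    start1, end1, action, start2, end2 = m.groups()
    # An abbreviated single number stands for an identical pair.
    range1 = (int(start1), int(end1 or start1))
    range2 = (int(start2), int(end2 or start2))
    return range1, action, range2


def format_range(rng):
    """Write a range, abbreviating identical pairs as a single number."""
    first, last = rng
    return str(first) if first == last else f"{first},{last}"


def reverse_change(line):
    """Return the command that converts file2 back into file1."""
    parsed = parse_change(line)
    if parsed is None:
        raise ValueError(f"not a change command: {line!r}")
    range1, action, range2 = parsed
    swapped = {"a": "d", "d": "a", "c": "c"}[action]
    return f"{format_range(range2)}{swapped}{format_range(range1)}"
```

       For example, reversing "3a4,5" in this sketch yields "4,5d3".
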
   Diff -e Output Format
       With the -e option, a script shall be produced that shall, when
       provided as input to ed, along with an appended w (write) command,
       convert file1 into file2. Only the a (append), c (change), d (delete),
       i (insert), and s (substitute) commands of ed shall be used in this
       script. Text lines, except those consisting of the single character
       <period> ('.'), shall be output as they appear in the file.

   Diff -f Output Format
       With the -f option, an alternative format of script shall be produced.
       It is similar to that produced by -e, with the following differences:

        1. It is expressed in reverse sequence; the output of -e orders
           changes from the end of the file to the beginning; the -f from
           beginning to end.

        2. The command form <lines> <command-letter> used by -e is reversed.
           For example, 10c with -e would be c10 with -f.

        3. The form used for ranges of line numbers is <space>-separated,
           rather than <comma>-separated.

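       Differences 2 and 3 in the list above concern the form of a single
       command and can be sketched in Python. The function name
       e_to_f_command is hypothetical, only the a, c, and d commands are
       handled, and the reordering of the script as a whole (difference 1)
       is not attempted.

```python
import re

# An -e style command: <lines><command-letter>, with an optional range.
_E_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])$")


def e_to_f_command(command):
    """Rewrite one -e command line, such as "3,5d", in the -f form."""
    m = _E_COMMAND.match(command)
    if m is None:
        raise ValueError(f"unsupported command: {command!r}")
    start, end, letter = m.groups()
    # Ranges become <space>-separated instead of <comma>-separated.
    lines = start if end is None else f"{start} {end}"
    # The command letter moves in front of the line numbers.
    return f"{letter}{lines}"
```

       With this sketch, "10c" becomes "c10" and "3,5d" becomes "d3 5".
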
   Diff -c or -C Output Format
       With the -c or -C option, the output format shall consist of affected
       lines along with surrounding lines of context. The affected lines shall
       show which ones need to be deleted or changed in file1, and those added
       from file2. With the -c option, three lines of context, if available,
       shall be written before and after the affected lines. With the -C
       option, the user can specify how many lines of context are written.
       The exact format follows.

       The name and last modification time of each file shall be output in the
       following format:

           "*** %s %s\n", file1, <file1 timestamp>
           "--- %s %s\n", file2, <file2 timestamp>

       Each <file> field shall be the pathname of the corresponding file being
       compared. The pathname written for standard input is unspecified.

       In the POSIX locale, each <timestamp> field shall be equivalent to the
       output from the following command:

           date "+%a %b %e %T %Y"

       without the trailing <newline>, executed at the time of last
       modification of the corresponding file (or the current time, if the
       file is standard input).

       Then, the following output formats shall be applied for every set of
       changes.

       First, a line shall be written in the following format:

           "***************\n"

       Next, the range of lines in file1 shall be written in the following
       format if the range contains two or more lines:

           "*** %d,%d ****\n", <beginning line number>, <ending line number>

       and the following format otherwise:

           "*** %d ****\n", <ending line number>

       The ending line number of an empty range shall be the number of the
       preceding line, or 0 if the range is at the start of the file.

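       The choice between the two range forms can be sketched in Python. The
       function name context_range_header is hypothetical, and representing
       an empty range as first = last + 1 is an assumption of this sketch,
       not something the format requires. The same logic is reused for the
       file2 range line, which is written with <hyphen-minus> characters.

```python
def context_range_header(first, last, marker="*"):
    """Format the range line of one context-format hunk (sketch).

    marker "*" gives the file1 form, marker "-" the file2 form.
    An empty range is passed as first == last + 1 (assumption), so
    that last is the preceding line, or 0 at the start of the file.
    """
    if marker not in ("*", "-"):
        raise ValueError("marker must be '*' or '-'")
    if last - first + 1 >= 2:
        numbers = f"{first},{last}"  # two or more lines in the range
    else:
        numbers = f"{last}"          # one line, or an empty range
    return f"{marker * 3} {numbers} {marker * 4}\n"
```

       For example, a range covering lines 3 through 9 of file1 is written
       as "*** 3,9 ****", and an empty range at the start of file2 as
       "--- 0 ----".
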
       Next, the affected lines along with lines of context (unaffected lines)
       shall be written. Unaffected lines shall be written in the following
       format:

           "  %s", <unaffected_line>

       Deleted lines shall be written as:

           "- %s", <deleted_line>

       Changed lines shall be written as:

           "! %s", <changed_line>

       Next, the range of lines in file2 shall be written in the following
       format if the range contains two or more lines:

           "--- %d,%d ----\n", <beginning line number>, <ending line number>

       and the following format otherwise:

           "--- %d ----\n", <ending line number>

       Then, lines of context and changed lines shall be written as described
       in the previous formats. Lines added from file2 shall be written in the
       following format:

           "+ %s", <added_line>

   Diff -u or -U Output Format
       The -u or -U options behave like the -c or -C options, except that the
       context lines are not repeated; instead, the context, deleted, and
       added lines are shown together, interleaved. The exact format follows.

       The name and last modification time of each file shall be output in the
       following format:

           "--- %s\t%s%s %s\n", file1, <file1 timestamp>, <file1 frac>, <file1 zone>
           "+++ %s\t%s%s %s\n", file2, <file2 timestamp>, <file2 frac>, <file2 zone>

       Each <file> field shall be the pathname of the corresponding file being
       compared, or the single character '-' if standard input is being
       compared. However, if the pathname contains a <tab> or a <newline>, or
       if it does not consist entirely of characters taken from the portable
       character set, the behavior is implementation-defined.

       Each <timestamp> field shall be equivalent to the output from the
       following command:

           date '+%Y-%m-%d %H:%M:%S'

       without the trailing <newline>, executed at the time of last
       modification of the corresponding file (or the current time, if the
       file is standard input).

       Each <frac> field shall be either empty, or a decimal point followed by
       at least one decimal digit, indicating the fractional-seconds part (if
       any) of the file timestamp. The number of fractional digits shall be at
       least the number needed to represent the file's timestamp without loss
       of information.

       Each <zone> field shall be of the form "shhmm", where "shh" is a signed
       two-digit decimal number in the range -24 through +25, and "mm" is an
       unsigned two-digit decimal number in the range 00 through 59. It
       represents the timezone of the timestamp as the number of hours (hh)
       and minutes (mm) east (+) or west (-) of UTC for the timestamp. If the
       hours and minutes are both zero, the sign shall be '+'. However, if
       the timezone is not an integral number of minutes away from UTC, the
       <zone> field is implementation-defined.

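       The header fields described above can be assembled as in the following
       Python sketch. The names zone_field and unified_header_line are
       hypothetical. The sketch assumes a timezone-aware datetime whose UTC
       offset is a whole number of minutes, and it writes six fractional
       digits whenever the microsecond part is non-zero, which is one
       possible choice for <frac>.

```python
from datetime import datetime, timedelta, timezone


def zone_field(offset_minutes):
    """Format a UTC offset in minutes as "shhmm"; zero is written "+0000"."""
    sign = "-" if offset_minutes < 0 else "+"
    hours, minutes = divmod(abs(offset_minutes), 60)
    return f"{sign}{hours:02d}{minutes:02d}"


def unified_header_line(prefix, path, when):
    """Build one "--- ..." or "+++ ..." header line (sketch).

    when must be a timezone-aware datetime (assumption of this sketch).
    """
    offset = when.utcoffset()
    if offset is None:
        raise ValueError("a timezone-aware datetime is required")
    stamp = when.strftime("%Y-%m-%d %H:%M:%S")
    # <frac> is left empty when there is no fractional part to represent.
    frac = f".{when.microsecond:06d}" if when.microsecond else ""
    zone = zone_field(int(offset.total_seconds() // 60))
    return f"{prefix} {path}\t{stamp}{frac} {zone}\n"
```

       A file last modified at 03:04:05 on 2024-01-02 in a zone five and a
       half hours east of UTC would get the zone field "+0530" here.
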
       Then, the following output formats shall be applied for every set of
       changes.

       First, the range of lines in each file shall be written in the
       following format:

           "@@ -%s +%s @@", <file1 range>, <file2 range>

       Each <range> field shall be of the form:

           "%1d", <beginning line number>

       or:

           "%1d,1", <beginning line number>

       if the range contains exactly one line, and:

           "%1d,%1d", <beginning line number>, <number of lines>

       otherwise. If a range is empty, its beginning line number shall be the
       number of the line just before the range, or 0 if the empty range
       starts the file.

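       The <range> rules can be sketched in Python as follows. The names
       unified_range and hunk_header are hypothetical. For a one-line range
       the sketch picks the shorter of the two permitted forms.

```python
def unified_range(start, count):
    """Format one <range> field of a unified hunk header (sketch).

    For an empty range (count == 0), start is the number of the line
    just before the range, or 0 if the empty range starts the file.
    """
    if count == 1:
        return f"{start}"        # f"{start},1" is equally permitted
    return f"{start},{count}"    # covers empty and multi-line ranges


def hunk_header(start1, count1, start2, count2):
    """Format the "@@ -... +... @@" text for one set of changes."""
    return f"@@ -{unified_range(start1, count1)} +{unified_range(start2, count2)} @@"
```

       Unlike the context format, the second number is a line count rather
       than an ending line number.
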
       Next, the affected lines along with lines of context shall be written.
       Each non-empty unaffected line shall be written in the following
       format:

           " %s", <unaffected_line>

       where the contents of the unaffected line shall be taken from file1.
       It is implementation-defined whether an empty unaffected line is
       written as an empty line or a line containing a single <space>
       character. This line also represents the same line of file2, even
       though file2's line may contain different contents due to the -b
       option. Deleted lines shall be written as:

           "-%s", <deleted_line>

       Added lines shall be written as:

           "+%s", <added_line>

       The order of lines written shall be the same as that of the
       corresponding file. A deleted line shall never be written immediately
       after an added line.

       If -U n is specified, the output shall contain no more than 2n
       consecutive unaffected lines; and if the output contains an affected
       line and this line is adjacent to up to n consecutive unaffected lines
       in the corresponding file, the output shall contain these unaffected
       lines. -u shall act like -U3.

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       None.

EXIT STATUS

       The following exit values shall be returned:

        0    No differences were found.

        1    Differences were found.

       >1    An error occurred.

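       Because exit status 1 reports a normal result rather than a failure,
       callers need to distinguish it from an error. The following Python
       sketch shows one way to do so. The function names are hypothetical,
       and files_differ assumes that a diff utility is available on PATH.

```python
import subprocess


def classify_exit_status(status):
    """Map a diff exit status to "identical", "different", or "error"."""
    if status == 0:
        return "identical"
    if status == 1:
        return "different"
    if status > 1:
        return "error"
    raise ValueError(f"unexpected exit status: {status}")


def files_differ(file1, file2):
    """Run diff and report whether the two files differ (sketch)."""
    # check=True is deliberately not used: exit status 1 is not an error.
    # "--" marks the end of options, per the Utility Syntax Guidelines.
    result = subprocess.run(
        ["diff", "--", file1, file2], stdout=subprocess.DEVNULL
    )
    outcome = classify_exit_status(result.returncode)
    if outcome == "error":
        raise RuntimeError("diff reported an error")
    return outcome == "different"
```

       Treating any non-zero status as a failure would misreport every pair
       of files that differ.
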

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       If lines at the end of a file are changed and other lines are added,
       diff output may show this as a delete and add, as a change, or as a
       change and add; diff is not expected to know which happened and users
       should not care about the difference in output as long as it clearly
       shows the differences between the files.

EXAMPLES

       If dir1 is a directory containing a directory named x, dir2 is a
       directory containing a directory named x, dir1/x and dir2/x both
       contain files named date.out, and dir2/x contains a file named y, the
       command:

           diff -r dir1 dir2

       could produce output similar to:

           Common subdirectories: dir1/x and dir2/x
           Only in dir2/x: y
           diff -r dir1/x/date.out dir2/x/date.out
           1c1
           < Mon Jul  2 13:12:16 PDT 1990
           ---
           > Tue Jun 19 21:41:39 PDT 1990

RATIONALE

       The -h option was omitted because it was insufficiently specified and
       does not add to applications portability.

       Historical implementations employ algorithms that do not always produce
       a minimum list of differences; the current language about making every
       effort is the best this volume of POSIX.1-2017 can do, as there is no
       metric that could be employed to judge the quality of implementations
       against any and all file contents. The statement "This list should be
       minimal" clearly implies that implementations are not expected to
       provide the following output when comparing two 100-line files that
       differ in only one character on a single line:

           1,100c1,100
           all 100 lines from file1 preceded with "< "
           ---
           all 100 lines from file2 preceded with "> "

       The "Only in" messages required when the -r option is specified are
       not used by most historical implementations if the -e option is also
       specified. They are required here because they provide useful
       information that must be provided to update a target directory
       hierarchy to match a source hierarchy. The "Common subdirectories"
       messages are written by System V and 4.3 BSD when the -r option is
       specified. They are allowed here but are not required because they are
       reporting on something that is the same, not reporting a difference,
       and are not needed to update a target hierarchy.

       The -c option, which writes output in a format using lines of context,
       has been included. The format is useful for a variety of reasons, among
       them being much improved readability and the ability to understand
       difference changes when the target file has line numbers that differ
       from another similar, but slightly different, copy. The patch utility
       is most valuable when working with difference listings using a context
       format. The BSD version of -c takes an optional argument specifying the
       amount of context. Rather than overloading -c and breaking the Utility
       Syntax Guidelines for diff, the standard developers decided to add a
       separate option for specifying a context diff with a specified amount
       of context (-C). Also, the format for context diffs was extended
       slightly in 4.3 BSD to allow multiple changes that are within context
       lines from each other to be merged together. The output format contains
       an additional four <asterisk> characters after the range of affected
       lines in the first filename. This was to provide a flag for old
       programs (like old versions of patch) that only understand the old
       context format. The version of context described here does not require
       that multiple changes within context lines be merged, but it does not
       prohibit it either. The extension is upwards-compatible, so any vendors
       that wish to retain the old version of diff can do so by adding the
       extra four <asterisk> characters (that is, utilities that currently use
       diff and understand the new merged format will also understand the old
       unmerged format, but not vice versa).

       The -u and -U options of GNU diff have been included. Their output
       format, designed by Wayne Davison, takes up less space than -c and -C
       format, and in many cases is easier to read. The format's timestamps do
       not vary by locale, so LC_TIME does not affect it. The format's line
       numbers are rendered with the %1d format, not %d, because the file
       format notation rules would allow extra <blank> characters to appear
       around the numbers.

       The substitute command was added as an additional format for the -e
       option. This was added to provide implementations with a way to fix the
       classic "dot alone on a line" bug present in many versions of diff.
       Since many implementations have fixed this bug, the standard developers
       decided not to standardize broken behavior, but rather to provide the
       necessary tool for fixing the bug. One way to fix this bug is to output
       two periods whenever a lone period is needed, then terminate the append
       command with a period, and then use the substitute command to convert
       the two periods into one period.

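       The fix outlined in the preceding paragraph can be sketched in Python.
       The function name ed_append_script is hypothetical, and the exact
       command sequence (".." followed by ".", then "s/.//", then a new "a"
       command to resume appending) is one possible realization, not a form
       required by this manual page.

```python
def ed_append_script(after_line, text_lines):
    """Build an ed append script that is safe for lone-period lines.

    text_lines are the lines to append, without trailing newlines.
    """
    out = [f"{after_line}a"]
    appending = True
    for text in text_lines:
        if not appending:
            # Resume appending after the line that was just repaired.
            out.append("a")
            appending = True
        if text == ".":
            # A lone period would end input mode, so write two periods,
            # terminate the append, and delete one of them again.
            out += ["..", ".", "s/.//"]
            appending = False
        else:
            out.append(text)
    if appending:
        out.append(".")  # terminate the final append command
    return "\n".join(out) + "\n"
```

       The substitute command acts on the current line, which after the
       append terminates is the line holding the two periods.
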
       The BSD-derived -r option was added to provide a mechanism for using
       diff to compare two file system trees. This behavior is useful, is
       standard practice on all BSD-derived systems, and is not easily
       reproducible with the find utility.

       The requirement that diff not compare files in some circumstances, even
       though they have the same name, is based on the actual output of
       historical implementations. The specified behavior precludes the
       problems arising from running into FIFOs and other files that would
       cause diff to hang waiting for input with no indication to the user
       that diff was hung. An earlier version of this standard specified the
       output format more precisely, but in practice this requirement was
       widely ignored and the benefit of standardization seemed small, so it
       is now unspecified. In most common usage, diff -r should indicate
       differences in the file hierarchies, not the difference of contents of
       devices pointed to by the hierarchies.

       Many early implementations of diff require seekable files. Since the
       System Interfaces volume of POSIX.1-2017 supports named pipes, the
       standard developers decided that such a restriction was unreasonable.
       Note also that the allowed filename - almost always refers to a pipe.

       No directory search order is specified for diff. The historical
       ordering is, in fact, not optimal, in that it prints out all of the
       differences at the current level, including the statements about all
       common subdirectories before recursing into those subdirectories.

       The message:

           "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

       does not vary by locale because it is the representation of a command,
       not an English sentence.

FUTURE DIRECTIONS

       None.

SEE ALSO

       cmp, comm, ed, find

       The Base Definitions volume of POSIX.1-2017, Chapter 8, Environment
       Variables, Section 12.2, Utility Syntax Guidelines

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1-2017, Standard for Information Technology --
       Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, 2018 Edition, Copyright (C) 2018 by the
       Institute of Electrical and Electronics Engineers, Inc and The Open
       Group. In the event of any discrepancy between this version and the
       original IEEE and The Open Group Standard, the original IEEE and The
       Open Group Standard is the referee document. The original Standard can
       be obtained online at http://www.opengroup.org/unix/online.html .

       Any typographical or formatting errors that appear in this page are
       most likely to have been introduced during the conversion of the source
       files to man page format. To report such errors, see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .



IEEE/The Open Group                  2017                             DIFF(1P)