DIFF(1P)                   POSIX Programmer's Manual                  DIFF(1P)

PROLOG

       This manual page is part of the POSIX Programmer's Manual. The Linux
       implementation of this interface may differ (consult the corresponding
       Linux manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME

       diff - compare two files

SYNOPSIS

       diff [-c|-e|-f|-u|-C n|-U n] [-br] file1 file2

DESCRIPTION

       The diff utility shall compare the contents of file1 and file2 and
       write to standard output a list of changes necessary to convert file1
       into file2. This list should be minimal. No output shall be produced
       if the files are identical.

OPTIONS

       The diff utility shall conform to the Base Definitions volume of
       POSIX.1-2008, Section 12.2, Utility Syntax Guidelines.

       The following options shall be supported:

       -b        Cause any amount of white space at the end of a line to be
                 treated as a single <newline> (that is, the white-space
                 characters preceding the <newline> are ignored) and other
                 strings of white-space characters, not including <newline>
                 characters, to compare equal.

       -c        Produce output in a form that provides three lines of copied
                 context.

       -C n      Produce output in a form that provides n lines of copied
                 context (where n shall be interpreted as a positive decimal
                 integer).

       -e        Produce output in a form suitable as input for the ed
                 utility, which can then be used to convert file1 into file2.

       -f        Produce output in an alternative form, similar in format to
                 -e, but not intended to be suitable as input for the ed
                 utility, and in the opposite order.

       -r        Apply diff recursively to files and directories of the same
                 name when file1 and file2 are both directories.

                 The diff utility shall detect infinite loops; that is,
                 entering a previously visited directory that is an ancestor
                 of the last file encountered. When it detects an infinite
                 loop, diff shall write a diagnostic message to standard
                 error and shall either recover its position in the hierarchy
                 or terminate.

       -u        Produce output in a form that provides three lines of unified
                 context.

       -U n      Produce output in a form that provides n lines of unified
                 context (where n shall be interpreted as a non-negative
                 decimal integer).
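
       The white-space rule of the -b option can be illustrated with a
       small sketch. It is not part of the standard; the function names are
       hypothetical, and it assumes the POSIX locale, so that only the ASCII
       <space>, <tab>, <carriage-return>, <form-feed>, and <vertical-tab>
       characters count as white space.

```python
import re

# Assumption: in the POSIX locale these are the white-space characters
# other than <newline>.
_BLANKS = r"[ \t\r\f\v]+"


def normalize_b(line: str) -> str:
    """Return a comparison key for one line under the -b rules."""
    body = line.rstrip("\n")
    # White space immediately before the <newline> is ignored.
    body = re.sub(_BLANKS + r"$", "", body)
    # Every other run of white space compares equal to any other run,
    # so each run is collapsed to a single <space>.
    return re.sub(_BLANKS, " ", body)


def equal_under_b(line1: str, line2: str) -> bool:
    """True if the two lines compare equal when -b is in effect."""
    return normalize_b(line1) == normalize_b(line2)
```

       Note that a run of white space still differs from no white space at
       all: under this sketch "a b" and "ab" do not compare equal.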

OPERANDS

       The following operands shall be supported:

       file1, file2
                 A pathname of a file to be compared. If either the file1 or
                 file2 operand is '-', the standard input shall be used in its
                 place.

       If both file1 and file2 are directories, diff shall not compare block
       special files, character special files, or FIFO special files to any
       files and shall not compare regular files to directories. Further
       details are as specified in Diff Directory Comparison Format. The
       behavior of diff on other file types is implementation-defined when
       found in directories.

       If only one of file1 and file2 is a directory, diff shall be applied to
       the non-directory file and the file contained in the directory file
       with a filename that is the same as the last component of the
       non-directory file.
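
       The rule for a single directory operand can be sketched as follows.
       This is an illustration only; resolve_operands is a hypothetical
       name, and the sketch does not handle the '-' operand.

```python
import os


def resolve_operands(file1: str, file2: str) -> tuple:
    """Return the pair of pathnames that would actually be compared."""
    is_dir1 = os.path.isdir(file1)
    is_dir2 = os.path.isdir(file2)
    if is_dir1 and not is_dir2:
        # Compare file2 with the entry of the same name inside file1.
        return os.path.join(file1, os.path.basename(file2)), file2
    if is_dir2 and not is_dir1:
        # Compare file1 with the entry of the same name inside file2.
        return file1, os.path.join(file2, os.path.basename(file1))
    # Both directories or both non-directories: operands are used as given.
    return file1, file2
```

       For example, with a directory backup and a file src/notes.txt, the
       sketch pairs src/notes.txt with backup/notes.txt.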

STDIN

       The standard input shall be used only if one of the file1 or file2
       operands references standard input. See the INPUT FILES section.

INPUT FILES

       The input files may be of any type.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of diff:

       LANG      Provide a default value for the internationalization
                 variables that are unset or null. (See the Base Definitions
                 volume of POSIX.1-2008, Section 8.2, Internationalization
                 Variables for the precedence of internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If set to a non-empty string value, override the values of
                 all the other internationalization variables.

       LC_CTYPE  Determine the locale for the interpretation of sequences of
                 bytes of text data as characters (for example, single-byte as
                 opposed to multi-byte characters in arguments and input
                 files).

       LC_MESSAGES
                 Determine the locale that should be used to affect the format
                 and contents of diagnostic messages written to standard error
                 and informative messages written to standard output.

       LC_TIME   Determine the locale for affecting the format of file
                 timestamps written with the -C and -c options.

       NLSPATH   Determine the location of message catalogs for the processing
                 of LC_MESSAGES.

       TZ        Determine the timezone used for calculating file timestamps
                 written with a context format. If TZ is unset or null, an
                 unspecified default timezone shall be used.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

   Diff Directory Comparison Format
       If both file1 and file2 are directories, the following output formats
       shall be used.

       In the POSIX locale, each file that is present in only one directory
       shall be reported using the following format:

           "Only in %s: %s\n", <directory pathname>, <filename>

       In the POSIX locale, subdirectories that are common to the two
       directories may be reported with the following format:

           "Common subdirectories: %s and %s\n", <directory1 pathname>,
               <directory2 pathname>

       For each file common to the two directories, if the two files are not
       to be compared: if the two files have the same device ID and file
       serial number, or are both block special files that refer to the same
       device, or are both character special files that refer to the same
       device, in the POSIX locale the output format is unspecified.
       Otherwise, in the POSIX locale an unspecified format shall be used that
       contains the pathnames of the two files.

       For each file common to the two directories, if the files are compared
       and are identical, no output shall be written. If the two files differ,
       the following format is written:

           "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

       where <diff_options> are the options as specified on the command line.

       All directory pathnames listed in this section shall be relative to the
       original command line arguments. All other names of files listed in
       this section shall be filenames (pathname components).

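
       As an illustration of the "Only in" format, the following sketch
       builds the messages for one pair of directories from their entry
       names. The function name is hypothetical, and the sorted reporting
       order is an assumption; this page does not mandate any order.

```python
def only_in_lines(dir1, names1, dir2, names2):
    """Return "Only in" messages for entries present in one directory."""
    set1, set2 = set(names1), set(names2)
    lines = []
    # Assumption: report in sorted filename order.
    for name in sorted(set1 | set2):
        if name not in set2:
            lines.append(f"Only in {dir1}: {name}\n")
        elif name not in set1:
            lines.append(f"Only in {dir2}: {name}\n")
    return lines
```

       Entries present in both directories produce no "Only in" message;
       they are either compared or reported in one of the other formats
       described in this section.
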
   Diff Binary Output Format
       In the POSIX locale, if one or both of the files being compared are not
       text files, it is implementation-defined whether diff uses the binary
       file output format or the other formats as specified below. The binary
       file output format shall contain the pathnames of two files being
       compared and the string "differ".

       If both files being compared are text files, depending on the options
       specified, one of the following formats shall be used to write the
       differences.

   Diff Default Output Format
       The default (without -e, -f, -c, -C, -u, or -U options) diff utility
       output shall contain lines of these forms:

           "%da%d\n", <num1>, <num2>

           "%da%d,%d\n", <num1>, <num2>, <num3>

           "%dd%d\n", <num1>, <num2>

           "%d,%dd%d\n", <num1>, <num2>, <num3>

           "%dc%d\n", <num1>, <num2>

           "%d,%dc%d\n", <num1>, <num2>, <num3>

           "%dc%d,%d\n", <num1>, <num2>, <num3>

           "%d,%dc%d,%d\n", <num1>, <num2>, <num3>, <num4>

       These lines resemble ed subcommands to convert file1 into file2. The
       line numbers before the action letters shall pertain to file1; those
       after shall pertain to file2. Thus, by exchanging a for d and reading
       the line in reverse order, one can also determine how to convert file2
       into file1. As in ed, identical pairs (where num1 = num2) are
       abbreviated as a single number.

       Following each of these lines, diff shall write to standard output all
       lines affected in the first file using the format:

           "< %s", <line>

       and all lines affected in the second file using the format:

           "> %s", <line>

       If there are lines affected in both file1 and file2 (as with the c
       subcommand), the changes are separated with a line consisting of three
       <hyphen> characters:

           "---\n"

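
       The eight command-line forms above differ only in whether each side
       is a single line number or a range. A minimal sketch, using a
       hypothetical helper that takes each side as a (first, last) pair,
       shows the abbreviation rule:

```python
def normal_command(action, range1, range2):
    """Format one default-format command line such as "3a4,6\\n".

    range1 and range2 are (first, last) line-number pairs for file1 and
    file2; a pair with first == last is written as a single number.
    """
    if action not in ("a", "c", "d"):
        raise ValueError("action must be 'a', 'c', or 'd'")

    def span(pair):
        first, last = pair
        return str(first) if first == last else f"{first},{last}"

    return f"{span(range1)}{action}{span(range2)}\n"
```

       For an a command the file1 side, and for a d command the file2 side,
       is always a single number, so callers of this sketch would pass a
       pair with equal elements there.
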
   Diff -e Output Format
       With the -e option, a script shall be produced that shall, when
       provided as input to ed, along with an appended w (write) command,
       convert file1 into file2. Only the a (append), c (change), d (delete),
       i (insert), and s (substitute) commands of ed shall be used in this
       script. Text lines, except those consisting of the single character
       <period> ('.'), shall be output as they appear in the file.

   Diff -f Output Format
       With the -f option, an alternative format of script shall be produced.
       It is similar to that produced by -e, with the following differences:

        1. It is expressed in reverse sequence; the output of -e orders
           changes from the end of the file to the beginning; the -f from
           beginning to end.

        2. The command form <lines> <command-letter> used by -e is reversed.
           For example, 10c with -e would be c10 with -f.

        3. The form used for ranges of line numbers is <space>-separated,
           rather than <comma>-separated.

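
       Differences 2 and 3 in the list above can be sketched as a rewrite
       of a single -e command header. The helper name is hypothetical, and
       the sketch covers only the a, c, and d headers; it does not reorder
       the script as difference 1 requires.

```python
import re


def e_header_to_f(header: str) -> str:
    """Rewrite an -e command header such as "3,5d" in -f form ("d3 5")."""
    match = re.fullmatch(r"(\d+)(?:,(\d+))?([acd])", header)
    if match is None:
        raise ValueError(f"not an a/c/d command header: {header!r}")
    first, last, letter = match.groups()
    # Ranges are <space>-separated in -f output instead of <comma>-separated.
    lines = first if last is None else f"{first} {last}"
    # The command letter moves in front of the line numbers.
    return f"{letter}{lines}"
```
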
   Diff -c or -C Output Format
       With the -c or -C option, the output format shall consist of affected
       lines along with surrounding lines of context. The affected lines shall
       show which ones need to be deleted or changed in file1, and those added
       from file2. With the -c option, three lines of context, if available,
       shall be written before and after the affected lines. With the -C
       option, the user can specify how many lines of context are written.
       The exact format follows.

       The name and last modification time of each file shall be output in the
       following format:

           "*** %s %s\n", file1, <file1 timestamp>
           "--- %s %s\n", file2, <file2 timestamp>

       Each <file> field shall be the pathname of the corresponding file being
       compared. The pathname written for standard input is unspecified.

       In the POSIX locale, each <timestamp> field shall be equivalent to the
       output from the following command:

           date "+%a %b %e %T %Y"

       without the trailing <newline>, executed at the time of last
       modification of the corresponding file (or the current time, if the
       file is standard input).

       Then, the following output formats shall be applied for every set of
       changes.

       First, a line shall be written in the following format:

           "***************\n"

       Next, the range of lines in file1 shall be written in the following
       format if the range contains two or more lines:

           "*** %d,%d ****\n", <beginning line number>, <ending line number>

       and the following format otherwise:

           "*** %d ****\n", <ending line number>

       The ending line number of an empty range shall be the number of the
       preceding line, or 0 if the range is at the start of the file.

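
       The choice between the two range forms can be sketched as follows.
       The helper is hypothetical; it takes the marker character so that
       the same logic can also serve the file2 range line, which uses
       <hyphen> characters in place of <asterisk> characters.

```python
def context_range_line(marker: str, first: int, last: int) -> str:
    """Format a context-format range line for lines first..last.

    marker is "*" for file1 or "-" for file2. An empty range is passed
    with last set to the preceding line number (0 at the start of the
    file) and first set to last + 1.
    """
    lead, tail = marker * 3, marker * 4
    if last - first + 1 >= 2:
        # Two or more lines: beginning and ending line numbers.
        return f"{lead} {first},{last} {tail}\n"
    # One line or an empty range: only the ending line number.
    return f"{lead} {last} {tail}\n"
```
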
       Next, the affected lines along with lines of context (unaffected lines)
       shall be written. Unaffected lines shall be written in the following
       format:

           "  %s", <unaffected_line>

       Deleted lines shall be written as:

           "- %s", <deleted_line>

       Changed lines shall be written as:

           "! %s", <changed_line>

       Next, the range of lines in file2 shall be written in the following
       format if the range contains two or more lines:

           "--- %d,%d ----\n", <beginning line number>, <ending line number>

       and the following format otherwise:

           "--- %d ----\n", <ending line number>

       Then, lines of context and changed lines shall be written as described
       in the previous formats. Lines added from file2 shall be written in the
       following format:

           "+ %s", <added_line>

   Diff -u or -U Output Format
       The -u or -U options behave like the -c or -C options, except that the
       context lines are not repeated; instead, the context, deleted, and
       added lines are shown together, interleaved. The exact format follows.

       The name and last modification time of each file shall be output in the
       following format:

           "--- %s\t%s%s %s\n", file1, <file1 timestamp>, <file1 frac>, <file1 zone>
           "+++ %s\t%s%s %s\n", file2, <file2 timestamp>, <file2 frac>, <file2 zone>

       Each <file> field shall be the pathname of the corresponding file being
       compared, or the single character '-' if standard input is being
       compared. However, if the pathname contains a <tab> or a <newline>, or
       if it does not consist entirely of characters taken from the portable
       character set, the behavior is implementation-defined.

       Each <timestamp> field shall be equivalent to the output from the
       following command:

           date '+%Y-%m-%d %H:%M:%S'

       without the trailing <newline>, executed at the time of last
       modification of the corresponding file (or the current time, if the
       file is standard input).

       Each <frac> field shall be either empty, or a decimal point followed by
       at least one decimal digit, indicating the fractional-seconds part (if
       any) of the file timestamp. The number of fractional digits shall be at
       least the number needed to represent the file's timestamp without loss
       of information.

       Each <zone> field shall be of the form "shhmm", where "shh" is a signed
       two-digit decimal number in the range -24 through +25, and "mm" is an
       unsigned two-digit decimal number in the range 00 through 59. It
       represents the timezone of the timestamp as the number of hours (hh)
       and minutes (mm) east (+) or west (-) of UTC for the timestamp. If the
       hours and minutes are both zero, the sign shall be '+'. However, if
       the timezone is not an integral number of minutes away from UTC, the
       <zone> field is implementation-defined.

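
       The <timestamp>, <frac>, and <zone> fields can be derived from a
       timezone-aware modification time as in the sketch below. The function
       name is hypothetical. The sketch assumes microsecond resolution and
       an offset that is a whole number of minutes; other offsets are
       implementation-defined and are not handled.

```python
from datetime import datetime, timedelta, timezone


def unified_timestamp_fields(mtime: datetime) -> tuple:
    """Return (<timestamp>, <frac>, <zone>) for a timezone-aware mtime."""
    offset = mtime.utcoffset()
    if offset is None:
        raise ValueError("mtime must be timezone-aware")

    timestamp = mtime.strftime("%Y-%m-%d %H:%M:%S")

    # <frac> is empty when there is no fractional part; otherwise keep
    # just enough digits to represent the microseconds exactly.
    frac = ""
    if mtime.microsecond:
        frac = "." + f"{mtime.microsecond:06d}".rstrip("0")

    # <zone> is "shhmm"; a zero offset gets the '+' sign.
    total_minutes = int(offset.total_seconds() // 60)
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    return timestamp, frac, f"{sign}{hours:02d}{minutes:02d}"
```
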
       Then, the following output formats shall be applied for every set of
       changes.

       First, the range of lines in each file shall be written in the
       following format:

           "@@ -%s +%s @@", <file1 range>, <file2 range>

       Each <range> field shall be of the form:

           "%1d", <beginning line number>

       if the range contains exactly one line, and:

           "%1d,%1d", <beginning line number>, <number of lines>

       otherwise. If a range is empty, its beginning line number shall be the
       number of the line just before the range, or 0 if the empty range
       starts the file.

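
       A minimal sketch of the <range> rule and the resulting "@@" line
       follows. The function names are hypothetical. Unlike the -c format,
       the second number here is a line count, not an ending line number.

```python
def unified_range(first: int, count: int) -> str:
    """Format one <range> field from a beginning line and a line count."""
    # Exactly one line: only the beginning line number is written.
    if count == 1:
        return str(first)
    # Otherwise (including an empty range, count == 0): "first,count".
    return f"{first},{count}"


def hunk_header(first1: int, count1: int, first2: int, count2: int) -> str:
    """Format the "@@ -%s +%s @@" line for one set of changes."""
    return f"@@ -{unified_range(first1, count1)} +{unified_range(first2, count2)} @@"
```

       For an empty range the caller passes the number of the line just
       before the range (or 0) as the beginning line number, with a count
       of 0.
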
       Next, the affected lines along with lines of context shall be written.
       Each non-empty unaffected line shall be written in the following
       format:

           " %s", <unaffected_line>

       where the contents of the unaffected line shall be taken from file1.
       It is implementation-defined whether an empty unaffected line is
       written as an empty line or a line containing a single <space>
       character. This line also represents the same line of file2, even
       though file2's line may contain different contents due to the -b
       option. Deleted lines shall be written as:

           "-%s", <deleted_line>

       Added lines shall be written as:

           "+%s", <added_line>

       The order of lines written shall be the same as that of the
       corresponding file. A deleted line shall never be written immediately
       after an added line.

       If -U n is specified, the output shall contain no more than n
       consecutive unaffected lines; and if the output contains an affected
       line and this line is adjacent to up to n consecutive unaffected lines
       in the corresponding file, the output shall contain these unaffected
       lines. -u shall act like -U3.

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       None.

EXIT STATUS

       The following exit values shall be returned:

        0    No differences were found.

        1    Differences were found.

       >1    An error occurred.
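
       Because an exit status of 1 means "differences were found" rather
       than failure, callers need to distinguish three cases. A small
       sketch, with a hypothetical function name:

```python
def describe_exit_status(status: int) -> str:
    """Map a diff exit status to "identical", "different", or "error"."""
    if status == 0:
        return "identical"   # no differences were found
    if status == 1:
        return "different"   # differences were found
    if status > 1:
        return "error"       # an error occurred
    raise ValueError(f"not a valid exit status: {status}")
```

       A caller could pass it the returncode attribute of the object
       returned by subprocess.run(["diff", file1, file2]); a negative
       returncode there indicates termination by a signal, which this
       sketch rejects.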

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       If lines at the end of a file are changed and other lines are added,
       diff output may show this as a delete and add, as a change, or as a
       change and add; diff is not expected to know which happened and users
       should not care about the difference in output as long as it clearly
       shows the differences between the files.

EXAMPLES

       If dir1 is a directory containing a directory named x, dir2 is a
       directory containing a directory named x, dir1/x and dir2/x both
       contain files named date.out, and dir2/x contains a file named y, the
       command:

           diff -r dir1 dir2

       could produce output similar to:

           Common subdirectories: dir1/x and dir2/x
           Only in dir2/x: y
           diff -r dir1/x/date.out dir2/x/date.out
           1c1
           < Mon Jul  2 13:12:16 PDT 1990
           ---
           > Tue Jun 19 21:41:39 PDT 1990

RATIONALE

       The -h option was omitted because it was insufficiently specified and
       does not add to applications portability.

       Historical implementations employ algorithms that do not always produce
       a minimum list of differences; the current language about making every
       effort is the best this volume of POSIX.1-2008 can do, as there is no
       metric that could be employed to judge the quality of implementations
       against any and all file contents. The statement "This list should be
       minimal" clearly implies that implementations are not expected to
       provide the following output when comparing two 100-line files that
       differ in only one character on a single line:

           1,100c1,100
           all 100 lines from file1 preceded with "< "
           ---
           all 100 lines from file2 preceded with "> "

       The "Only in" messages required when the -r option is specified are
       not used by most historical implementations if the -e option is also
       specified. It is required here because it provides useful information
       that must be provided to update a target directory hierarchy to match a
       source hierarchy. The "Common subdirectories" messages are written by
       System V and 4.3 BSD when the -r option is specified. They are allowed
       here but are not required because they are reporting on something that
       is the same, not reporting a difference, and are not needed to update a
       target hierarchy.

       The -c option, which writes output in a format using lines of context,
       has been included. The format is useful for a variety of reasons, among
       them being much improved readability and the ability to understand
       difference changes when the target file has line numbers that differ
       from another similar, but slightly different, copy. The patch utility
       is most valuable when working with difference listings using a context
       format. The BSD version of -c takes an optional argument specifying the
       amount of context. Rather than overloading -c and breaking the Utility
       Syntax Guidelines for diff, the standard developers decided to add a
       separate option for specifying a context diff with a specified amount
       of context (-C). Also, the format for context diffs was extended
       slightly in 4.3 BSD to allow multiple changes that are within context
       lines from each other to be merged together. The output format contains
       an additional four <asterisk> characters after the range of affected
       lines in the first filename. This was to provide a flag for old
       programs (like old versions of patch) that only understand the old
       context format. The version of context described here does not require
       that multiple changes within context lines be merged, but it does not
       prohibit it either. The extension is upwards-compatible, so any vendors
       that wish to retain the old version of diff can do so by adding the
       extra four <asterisk> characters (that is, utilities that currently use
       diff and understand the new merged format will also understand the old
       unmerged format, but not vice versa).

       The -u and -U options of GNU diff have been included. Their output
       format, designed by Wayne Davison, takes up less space than -c and -C
       format, and in many cases is easier to read. The format's timestamps do
       not vary by locale, so LC_TIME does not affect it. The format's line
       numbers are rendered with the %1d format, not %d, because the file
       format notation rules would allow extra <blank> characters to appear
       around the numbers.

       The substitute command was added as an additional format for the -e
       option. This was added to provide implementations with a way to fix the
       classic "dot alone on a line" bug present in many versions of diff.
       Since many implementations have fixed this bug, the standard developers
       decided not to standardize broken behavior, but rather to provide the
       necessary tool for fixing the bug. One way to fix this bug is to output
       two periods whenever a lone period is needed, then terminate the append
       command with a period, and then use the substitute command to convert
       the two periods into one period.

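
       The fix described in the preceding paragraph can be sketched as a
       generator of ed commands for one append. The function name is
       hypothetical, and the particular substitute command, s/.//, is an
       assumption: it deletes the first character of the current line,
       which turns ".." into ".". Other substitute commands would serve
       equally well.

```python
def ed_append(after_line: int, lines: list) -> list:
    """Return ed commands that append lines after line after_line.

    A text line consisting of a lone "." would end the append early, so
    it is written as "..", the append is terminated, the extra period is
    removed with a substitute command, and a new append is started.
    """
    script = [f"{after_line}a"]
    for text in lines:
        if text == ".":
            # "..": escaped text; ".": end append; "s/.//": ".." -> ".";
            # "a": resume appending after the line just repaired.
            script += ["..", ".", "s/.//", "a"]
        else:
            script.append(text)
    script.append(".")  # terminate the final append
    return script
```

       If the last text line is a lone period, this sketch ends with an
       append that adds no lines, which ed accepts.
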
       The BSD-derived -r option was added to provide a mechanism for using
       diff to compare two file system trees. This behavior is useful, is
       standard practice on all BSD-derived systems, and is not easily
       reproducible with the find utility.

       The requirement that diff not compare files in some circumstances, even
       though they have the same name, is based on the actual output of
       historical implementations. The specified behavior precludes the
       problems arising from running into FIFOs and other files that would
       cause diff to hang waiting for input with no indication to the user
       that diff was hung. An earlier version of this standard specified the
       output format more precisely, but in practice this requirement was
       widely ignored and the benefit of standardization seemed small, so it
       is now unspecified. In most common usage, diff -r should indicate
       differences in the file hierarchies, not the difference of contents of
       devices pointed to by the hierarchies.

       Many early implementations of diff require seekable files. Since the
       System Interfaces volume of POSIX.1-2008 supports named pipes, the
       standard developers decided that such a restriction was unreasonable.
       Note also that the allowed filename '-' almost always refers to a pipe.

       No directory search order is specified for diff. The historical
       ordering is, in fact, not optimal, in that it prints out all of the
       differences at the current level, including the statements about all
       common subdirectories before recursing into those subdirectories.

       The message:

           "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

       does not vary by locale because it is the representation of a command,
       not an English sentence.

FUTURE DIRECTIONS

       None.

SEE ALSO

       cmp, comm, ed, find

       The Base Definitions volume of POSIX.1-2008, Chapter 8, Environment
       Variables, Section 12.2, Utility Syntax Guidelines

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form
       from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, Copyright (C) 2013 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open Group. (This is
       POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group Standard
       is the referee document. The original Standard can be obtained online
       at http://www.unix.org/online.html .

       Any typographical or formatting errors that appear in this page are
       most likely to have been introduced during the conversion of the source
       files to man page format. To report such errors, see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .

IEEE/The Open Group                  2013                             DIFF(1P)