Text::Diff(3)         User Contributed Perl Documentation        Text::Diff(3)

NAME

       Text::Diff - Perform diffs on files and record sets

SYNOPSIS

           use Text::Diff;

           ## Mix and match filenames, strings, file handles, producer subs,
           ## or arrays of records; returns diff in a string.
           ## WARNING: can return large diffs for large files.
           my $diff = diff "file1.txt", "file2.txt", { STYLE => "Context" };
           my $diff = diff \$string1,   \$string2,   \%options;
           my $diff = diff \*FH1,       \*FH2;
           my $diff = diff \&reader1,   \&reader2;
           my $diff = diff \@records1,  \@records2;

           ## May also mix input types:
           my $diff = diff \@records1,  "file_B.txt";

DESCRIPTION

       "diff()" provides a basic set of services akin to the GNU "diff"
       utility.  It is not anywhere near as feature complete as GNU "diff",
       but it is better integrated with Perl and available on all platforms.
       It is often faster than shelling out to a system's "diff" executable
       for small files, and generally slower on larger files.

       Relies on Algorithm::Diff for, well, the algorithm.  This may not
       produce the same exact diff as a system's local "diff" executable, but
       it will be a valid diff and comprehensible by "patch".  We haven't
       seen any differences between Algorithm::Diff's logic and GNU diff's,
       but we have not examined them to make sure they are indeed identical.

       Note: If you don't want to import the "diff" function, do one of the
       following:

          use Text::Diff ();

          require Text::Diff;

       That's a pretty rare occurrence, so "diff()" is exported by default.

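       A minimal sketch of the non-importing form, assuming Text::Diff is
       installed.  The sample strings are made up for illustration; the
       function is simply called by its fully qualified name:

```perl
use strict;
use warnings;
use Text::Diff ();    # empty import list: diff() is not exported

# Hypothetical sample data, not taken from the module's documentation.
my $old = "apple\nbanana\ncherry\n";
my $new = "apple\nblueberry\ncherry\n";

# Without the import, call the function by its full package name.
my $diff = Text::Diff::diff( \$old, \$new );

print $diff;
```

       No STYLE is given, so this should produce a Unified diff, and no
       file header is expected because no filenames are involved.
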

OPTIONS

       diff() takes two parameters from which to draw input and a set of
       options to control its output.  The options are:

       FILENAME_A, MTIME_A, FILENAME_B, MTIME_B
           The name and modification time of each input "file", as shown in
           the diff header.

           These are filled in automatically for each file when diff() is
           passed a filename, unless a defined value is passed in.

           If a filename is not passed in and FILENAME_A and FILENAME_B are
           not provided or "undef", the header will not be printed.

           Unused on "OldStyle" diffs.

       OFFSET_A, OFFSET_B
           The index of the first line / element.  These default to 1 for all
           parameter types except ARRAY references, for which the default is
           0.  This is because ARRAY references are presumed to be data
           structures, while the others are line oriented text.

       STYLE
           "Unified", "Context", "OldStyle", or an object or class reference
           for a class providing "file_header()", "hunk_header()", "hunk()",
           "hunk_footer()" and "file_footer()" methods.  The two footer()
           methods are provided for overloading only; none of the formats
           provide them.

           Defaults to "Unified" (unlike standard "diff", but Unified is
           what's most often used in submitting patches and is the most human
           readable of the three).

           If the package indicated by the STYLE has no hunk() method, diff()
           will load it automatically (lazy loading).  Since all such
           packages should inherit from Text::Diff::Base, this should be
           marvy.

           Styles may be specified as class names (STYLE => "Foo"), in which
           case they will be "new()"ed with no parameters, or as objects
           (STYLE => Foo->new).

       CONTEXT
           How many lines before and after each diff to display.  Ignored on
           old-style diffs.  Defaults to 3.

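           To see the effect of CONTEXT, the following sketch diffs the same
           pair of strings twice, once with the default and once with
           CONTEXT => 1.  The ten-line input is invented for the example:

```perl
use strict;
use warnings;
use Text::Diff;

# Hypothetical input: ten numbered lines, with line 5 changed in the copy.
my $old = join "", map { "line$_\n" } 1 .. 10;
( my $new = $old ) =~ s/^line5$/line5 changed/m;

# A fresh options hash is used for each call.
my $default = diff \$old, \$new;                      # CONTEXT defaults to 3
my $narrow  = diff \$old, \$new, { CONTEXT => 1 };    # one line either side

print "default context:\n$default\n";
print "CONTEXT => 1:\n$narrow";
```

           The second diff should carry fewer unchanged lines around the
           change than the first.
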
       OUTPUT
           Examples and their equivalent subroutines:

               OUTPUT   => \*FOOHANDLE,   # like: sub { print FOOHANDLE shift() }
               OUTPUT   => \$output,      # like: sub { $output .= shift }
               OUTPUT   => \@output,      # like: sub { push @output, shift }
               OUTPUT   => sub { $output .= shift },

           If no "OUTPUT" is supplied, returns the diffs in a string.  If
           "OUTPUT" is a "CODE" ref, it will be called once with the
           (optional) file header, and once for each hunk body with the text
           to emit.  If "OUTPUT" is an IO::Handle, output will be emitted to
           that handle.

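           As a sketch of the OUTPUT variants above, the same diff can be
           collected into an array or through a CODE ref.  The variable
           names and sample strings here are illustrative only, and the
           exact number of pieces delivered to the sink is not relied on:

```perl
use strict;
use warnings;
use Text::Diff;

# Hypothetical sample data.
my $old = "one\ntwo\nthree\n";
my $new = "one\n2\nthree\n";

# Reference result: no OUTPUT option, so the diff comes back as a string.
my $as_string = diff \$old, \$new;

# Collect the emitted pieces in an array.
my @pieces;
diff \$old, \$new, { OUTPUT => \@pieces };

# Accumulate the emitted pieces through a CODE ref.
my $buffer = "";
diff \$old, \$new, { OUTPUT => sub { $buffer .= shift } };

print "array sink matches\n" if join( "", @pieces ) eq $as_string;
print "code sink matches\n"  if $buffer eq $as_string;
```
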
       FILENAME_PREFIX_A, FILENAME_PREFIX_B
           The string to print before the filename in the header. Unused on
           "OldStyle" diffs.  Defaults are "---", "+++" for Unified and "***",
           "---" for Context.

       KEYGEN, KEYGEN_ARGS
           These are passed to "traverse_sequences" in Algorithm::Diff.

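           One possible use of KEYGEN, sketched under the assumption that
           the key generation function receives each element as its first
           argument (as Algorithm::Diff documents for its key generators),
           is to compare lines while ignoring whitespace.  The sample data
           is invented:

```perl
use strict;
use warnings;
use Text::Diff;

# Hypothetical sample data: the two strings differ only in spacing.
my $old = "foo  bar\nbaz\n";
my $new = "foo bar\nbaz\n";

# Default comparison: the spacing difference is reported.
my $strict = diff \$old, \$new;

# Assumed KEYGEN usage: lines are compared by a key with whitespace removed.
my $loose = diff \$old, \$new, {
    KEYGEN => sub {
        my $line = shift;
        $line =~ s/\s+//g;
        return $line;
    },
};

print "strict diff is ", ( length $strict ? "non-empty" : "empty" ), "\n";
print "loose diff is ",  ( length $loose  ? "non-empty" : "empty" ), "\n";
```
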
       Note: if neither "FILENAME_" option is defined, the header will not be
       printed.  If one is present, the other and both MTIME_ options must
       be present or "Use of uninitialized value" warnings will be generated
       (except on "OldStyle" diffs, which ignore these options).

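       When diffing in-memory data, all four header options can be supplied
       by hand.  The sketch below assumes the MTIME_ values are epoch
       seconds, as returned by time() or stat(); the file names are
       placeholders and do not refer to real files:

```perl
use strict;
use warnings;
use Text::Diff;

# Hypothetical sample data.
my $old = "red\ngreen\nblue\n";
my $new = "red\nyellow\nblue\n";

my $now = time;    # assumption: MTIME_* takes epoch seconds

my $diff = diff \$old, \$new, {
    FILENAME_A => "old.txt",    # placeholder names for the header
    MTIME_A    => $now,
    FILENAME_B => "new.txt",
    MTIME_B    => $now,
};

print $diff;
```
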

Formatting Classes

       These functions implement the output formats.  They are grouped into
       classes so diff() can use class names to call the correct set of output
       routines and so that you may inherit from them easily.  There are no
       constructors or instance methods for these classes, though subclasses
       may provide them if need be.

       Each class has file_header(), hunk_header(), hunk(), and footer()
       methods identical to those documented in the Text::Diff::Unified
       section.  header() is called before the hunk() is first called,
       footer() afterwards.  The default footer function is an empty method
       provided for overloading:

           sub footer { return "End of patch\n" }

       Some output formats are provided by external modules (which are loaded
       automatically), such as Text::Diff::Table.  These are documented
       here to keep the documentation simple.

       Text::Diff::Base

           Returns "" for all methods (other than "new()").

           Text::Diff::Unified

               --- A   Mon Nov 12 23:49:30 2001
               +++ B   Mon Nov 12 23:49:30 2001
               @@ -2,13 +2,13 @@
                2
                3
                4
               -5d
               +5a
                6
                7
                8
                9
               +9a
                10
                11
               -11d
                12
                13

           file_header
                   $s = Text::Diff::Unified->file_header( $options );

               Returns a string containing a unified header.  The sole
               parameter is the options hash passed in to diff(), containing
               at least:

                   FILENAME_A  => $fn1,
                   MTIME_A     => $mtime1,
                   FILENAME_B  => $fn2,
                   MTIME_B     => $mtime2

               May also contain

                   FILENAME_PREFIX_A    => "---",
                   FILENAME_PREFIX_B    => "+++",

               to override the default prefixes (default values shown).

           hunk_header
                   Text::Diff::Unified->hunk_header( \@ops, $options );

               Returns a string containing the header of one hunk of unified
               diff (the "@@ ... @@" line).

           Text::Diff::Unified::hunk
                   Text::Diff::Unified->hunk( \@seq_a, \@seq_b, \@ops, $options );

               Returns a string containing the output of one hunk of unified
               diff.

           Text::Diff::Table

            +--+----------------------------------+--+------------------------------+
            |  |../Test-Differences-0.2/MANIFEST  |  |../Test-Differences/MANIFEST  |
            |  |Thu Dec 13 15:38:49 2001          |  |Sat Dec 15 02:09:44 2001      |
            +--+----------------------------------+--+------------------------------+
            |  |                                  * 1|Changes                       *
            | 1|Differences.pm                    | 2|Differences.pm                |
            | 2|MANIFEST                          | 3|MANIFEST                      |
            |  |                                  * 4|MANIFEST.SKIP                 *
            | 3|Makefile.PL                       | 5|Makefile.PL                   |
            |  |                                  * 6|t/00escape.t                  *
            | 4|t/00flatten.t                     | 7|t/00flatten.t                 |
            | 5|t/01text_vs_data.t                | 8|t/01text_vs_data.t            |
            | 6|t/10test.t                        | 9|t/10test.t                    |
            +--+----------------------------------+--+------------------------------+

           This format also goes to some pains to highlight "invisible"
           characters on differing elements by selectively escaping whitespace:

            +--+--------------------------+--------------------------+
            |  |demo_ws_A.txt             |demo_ws_B.txt             |
            |  |Fri Dec 21 08:36:32 2001  |Fri Dec 21 08:36:50 2001  |
            +--+--------------------------+--------------------------+
            | 1|identical                 |identical                 |
            * 2|        spaced in         |        also spaced in    *
            * 3|embedded space            |embedded        tab       *
            | 4|identical                 |identical                 |
            * 5|        spaced in         |\ttabbed in               *
            * 6|trailing spaces\s\s\n     |trailing tabs\t\t\n       *
            | 7|identical                 |identical                 |
            * 8|lf line\n                 |crlf line\r\n             *
            * 9|embedded ws               |embedded\tws              *
            +--+--------------------------+--------------------------+

           See Text::Diff::Table for more details, including how the
           whitespace escaping works.

           Text::Diff::Context

               *** A   Mon Nov 12 23:49:30 2001
               --- B   Mon Nov 12 23:49:30 2001
               ***************
               *** 2,14 ****
                 2
                 3
                 4
               ! 5d
                 6
                 7
                 8
                 9
                 10
                 11
               - 11d
                 12
                 13
               --- 2,14 ----
                 2
                 3
                 4
               ! 5a
                 6
                 7
                 8
                 9
               + 9a
                 10
                 11
                 12
                 13

           Note: hunk_header() returns only "***************\n".

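       A custom format can be sketched by inheriting from one of the classes
       above and overriding a single method.  The class name My::FooterStyle
       is hypothetical, and the example assumes the footer hook is named
       file_footer(), as listed under the STYLE option:

```perl
use strict;
use warnings;
use Text::Diff;

# Hypothetical style class: Unified output plus a trailing footer line.
package My::FooterStyle;
our @ISA = ("Text::Diff::Unified");

# Assumed hook name, taken from the method list under the STYLE option.
sub file_footer { return "End of patch\n" }

package main;

# Hypothetical sample data.
my $old = "alpha\nbeta\n";
my $new = "alpha\ngamma\n";

# Pass the class name; an object (My::FooterStyle->new) should also work.
my $diff = diff \$old, \$new, { STYLE => "My::FooterStyle" };

print $diff;
```

       Because the class already inherits a hunk() method, diff() has no
       reason to try loading it from a separate file.
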

LIMITATIONS

       Must suck both input files entirely into memory and store them with a
       normal amount of Perlish overhead (one array location) per record.
       This is implied by the implementation of Algorithm::Diff, which takes
       two arrays.  If Algorithm::Diff ever offers an incremental mode, this
       can be changed (contact the maintainers of Algorithm::Diff and
       Text::Diff if you need this; it shouldn't be too terribly hard to tie
       arrays in this fashion).

       Does not provide most of the more refined GNU diff options: recursive
       directory tree scanning, ignoring blank lines / whitespace, etc., etc.
       These can all be added as time permits and need arises; many are
       rather easy, and patches are quite welcome.

       Uses closures internally; this may lead to leaks on "perl" versions
       5.6.1 and prior if used many times over a process's lifetime.

AUTHOR

       Barrie Slaymaker <barries@slaysys.com>.

       Copyright 2001, Barrie Slaymaker.  All Rights Reserved.

       You may use this under the terms of either the Artistic License or GNU
       Public License v 2.0 or greater.

perl v5.8.8                       2002-08-27                     Text::Diff(3)