Text::Diff(3) User Contributed Perl Documentation Text::Diff(3)
2
3
4
Text::Diff - Perform diffs on files and record sets
7
    use Text::Diff;

    ## Mix and match filenames, strings, file handles, producer subs,
    ## or arrays of records; returns diff in a string.
    ## WARNING: can return large diffs for large files.
    my $diff = diff "file1.txt", "file2.txt", { STYLE => "Context" };
    my $diff = diff \$string1, \$string2, \%options;
    my $diff = diff \*FH1, \*FH2;
    my $diff = diff \&reader1, \&reader2;
    my $diff = diff \@records1, \@records2;

    ## May also mix input types:
    my $diff = diff \@records1, "file_B.txt";
22
"diff()" provides a basic set of services akin to the GNU "diff"
utility. It is not anywhere near as feature complete as GNU "diff",
but it is better integrated with Perl and available on all platforms.
It is often faster than shelling out to a system's "diff" executable
for small files, and generally slower on larger files.
29
Relies on Algorithm::Diff for, well, the algorithm. This may not
produce exactly the same diff as a system's local "diff" executable,
but it will be a valid diff and comprehensible by "patch". We haven't
seen any differences between Algorithm::Diff's logic and GNU diff's,
but we have not examined them to make sure they are indeed identical.
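As a minimal sketch of the behaviour described above, the following
compares two in-memory strings and prints the result in the default
Unified style. The variable names and sample text are illustrative
only; no file header is printed because no filenames are involved.

```perl
use strict;
use warnings;
use Text::Diff;

# Two small inputs held in memory; the contents are made up for illustration.
my $old = "alpha\nbeta\ngamma\n";
my $new = "alpha\nBETA\ngamma\n";

# References to scalars are treated as strings of line-oriented text.
my $diff = diff \$old, \$new;

# The changed line shows up as a "-" line followed by a "+" line.
print $diff;
```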
35
Note: If you don't want to import the "diff" function, do one of the
following:

    use Text::Diff ();

    require Text::Diff;

That's a pretty rare occurrence, so "diff()" is exported by default.
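When the import is suppressed, the function can still be reached by
its fully qualified name. A short sketch, with made-up sample data:

```perl
use strict;
use warnings;
use Text::Diff ();   # empty import list: nothing is exported into this package

my $old = "one\ntwo\n";
my $new = "one\n2\n";

# Without the import, call the function by its fully qualified name.
my $diff = Text::Diff::diff( \$old, \$new );
print $diff;
```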
44
diff() takes two parameters from which to draw input and a set of
options to control its output. The options are:
48
FILENAME_A, MTIME_A, FILENAME_B, MTIME_B
The name and modification time of each input "file".

These are filled in automatically for each file when diff() is
passed a filename, unless a defined value is passed in.

If a filename is not passed in and FILENAME_A and FILENAME_B are
not provided or are "undef", the header will not be printed.

Unused on "OldStyle" diffs.
59
OFFSET_A, OFFSET_B
The index of the first line / element. These default to 1 for all
parameter types except ARRAY references, for which the default is
0. This is because ARRAY references are presumed to be data
structures, while the others are line oriented text.
65
STYLE
"Unified", "Context", "OldStyle", or an object or class reference
for a class providing "file_header()", "hunk_header()", "hunk()",
"hunk_footer()" and "file_footer()" methods. The two footer()
methods are provided for overloading only; none of the formats
provide them.

Defaults to "Unified" (unlike standard "diff"), because Unified is
what's most often used in submitting patches and is the most human
readable of the three.

If the package indicated by the STYLE has no hunk() method,
diff() will load it automatically (lazy loading). Since all
such packages should inherit from Text::Diff::Base, this should be
marvy.

Styles may be specified as class names (STYLE => "Foo"), in which
case they will be "new()"ed with no parameters, or as objects
(STYLE => Foo->new).
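The two ways of naming a style can be sketched as follows. The class
My::FooterStyle is hypothetical and exists only for this example; it
assumes that Text::Diff::Unified can be subclassed and that it
inherits "new()" from Text::Diff::Base, as the text above suggests.
Only file_footer() is overridden, so the rest of the output stays in
Unified format.

```perl
use strict;
use warnings;
use Text::Diff;

# Hypothetical style class (the name is an assumption for this sketch).
# It inherits the Unified formatting and overrides only file_footer().
package My::FooterStyle;
our @ISA = ('Text::Diff::Unified');

sub file_footer { return "End of patch\n" }

package main;

my $old = "a\nb\n";
my $new = "a\nc\n";

# Pass the style as a class name; diff() instantiates it via new().
my $by_class = diff \$old, \$new, { STYLE => "My::FooterStyle" };

# Or pass an already constructed object.
my $by_object = diff \$old, \$new, { STYLE => My::FooterStyle->new };

print $by_class;
```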
85
86 CONTEXT
87 How many lines before and after each diff to display. Ignored on
88 old-style diffs. Defaults to 3.
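To see the effect of CONTEXT, the sketch below diffs the same pair of
inputs twice and counts the unchanged lines that surround the single
change. The sample data and the counting helper are illustrative
assumptions, not part of the module.

```perl
use strict;
use warnings;
use Text::Diff;

# Ten numbered lines, with line 5 changed in the second version.
my @lines = map { "line $_\n" } 1 .. 10;
my $old   = join "", @lines;
( my $new = $old ) =~ s/^line 5$/line five/m;

my $narrow  = diff \$old, \$new, { CONTEXT => 1 };
my $default = diff \$old, \$new;    # CONTEXT defaults to 3

# Count the unchanged (context) lines, which start with a space in Unified output.
my $count = sub { scalar( () = $_[0] =~ /^ /mg ) };
printf "CONTEXT => 1: %d context lines; default: %d\n",
    $count->($narrow), $count->($default);
```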
89
OUTPUT
Examples and their equivalent subroutines:

    OUTPUT => \*FOOHANDLE,   # like: sub { print FOOHANDLE shift() }
    OUTPUT => \$output,      # like: sub { $output .= shift }
    OUTPUT => \@output,      # like: sub { push @output, shift }
    OUTPUT => sub { $output .= shift },

If no "OUTPUT" is supplied, returns the diffs in a string. If
"OUTPUT" is a "CODE" ref, it will be called once with the
(optional) file header, and once for each hunk body with the text
to emit. If "OUTPUT" is an IO::Handle, output will be emitted to
that handle.
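A short sketch of two of the OUTPUT forms listed above, using made-up
input. Both collect the same text, once as a list of emitted pieces
and once through a callback:

```perl
use strict;
use warnings;
use Text::Diff;

my $old = "red\ngreen\n";
my $new = "red\nblue\n";

# Collect each emitted piece in an array instead of one returned string.
my @pieces;
diff \$old, \$new, { OUTPUT => \@pieces };

# Equivalent collection through a callback that appends to a scalar.
my $text = "";
diff \$old, \$new, { OUTPUT => sub { $text .= shift } };

print $text;
```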
103
FILENAME_PREFIX_A, FILENAME_PREFIX_B
The string to print before the filename in the header. Unused on
"OldStyle" diffs. Defaults are "---", "+++" for Unified and "***",
"---" for Context.
108
109 KEYGEN, KEYGEN_ARGS
110 These are passed to "traverse_sequences" in Algorithm::Diff.
111
Note: if neither "FILENAME_" option is defined, the header will not be
printed. If one is present, the other and both MTIME_ options must
be present or "Use of uninitialized value" warnings will be generated
(except on "OldStyle" diffs, which ignore these options).
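When the inputs are not files, a header can still be produced by
supplying all four options explicitly. In this sketch the file names
and timestamps are placeholders chosen for illustration:

```perl
use strict;
use warnings;
use Text::Diff;

my $old = "x\ny\n";
my $new = "x\nz\n";

# Names and timestamps below are placeholders chosen for this sketch.
my $now  = time;
my $diff = diff \$old, \$new, {
    FILENAME_A => "old.txt",
    MTIME_A    => $now - 60,
    FILENAME_B => "new.txt",
    MTIME_B    => $now,
};

# The header lines begin with the default Unified prefixes "---" and "+++".
print $diff;
```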
116
These functions implement the output formats. They are grouped into
classes so diff() can use class names to call the correct set of output
routines and so that you may inherit from them easily. There are no
constructors or instance methods for these classes, though subclasses
may provide them if need be.
123
Each class has file_header(), hunk_header(), hunk(), and footer()
methods identical to those documented in the Text::Diff::Unified
section. header() is called before the hunk() is first called,
footer() afterwards. The default footer function is an empty method
provided for overloading:

    sub footer { return "End of patch\n" }
131
Some output formats are provided by external modules (which are loaded
automatically), such as Text::Diff::Table. These are documented
here to keep the documentation simple.
135
136 Text::Diff::Base
137
138 Returns "" for all methods (other than "new()").
139
140 Text::Diff::Unified
141
    --- A Mon Nov 12 23:49:30 2001
    +++ B Mon Nov 12 23:49:30 2001
    @@ -2,13 +2,13 @@
     2
     3
     4
    -5d
    +5a
     6
     7
     8
     9
    +9a
     10
     11
    -11d
     12
     13
160
file_header
    $s = Text::Diff::Unified->file_header( $options );

Returns a string containing a unified header. The sole parameter
is the options hash passed in to diff(), containing at least:

    FILENAME_A => $fn1,
    MTIME_A    => $mtime1,
    FILENAME_B => $fn2,
    MTIME_B    => $mtime2

May also contain

    FILENAME_PREFIX_A => "---",
    FILENAME_PREFIX_B => "+++",

to override the default prefixes (default values shown).

hunk_header
    Text::Diff::Unified->hunk_header( \@ops, $options );

Returns a string containing the header line of one hunk of unified
diff (the "@@ -2,13 +2,13 @@" line in the example above).

Text::Diff::Unified::hunk
    Text::Diff::Unified->hunk( \@seq_a, \@seq_b, \@ops, $options );

Returns a string containing the output of one hunk of unified
diff.
191
192 Text::Diff::Table
193
    +--+----------------------------------+--+------------------------------+
    |  |../Test-Differences-0.2/MANIFEST  |  |../Test-Differences/MANIFEST  |
    |  |Thu Dec 13 15:38:49 2001          |  |Sat Dec 15 02:09:44 2001      |
    +--+----------------------------------+--+------------------------------+
    |  |                                  * 1|Changes                       *
    | 1|Differences.pm                    | 2|Differences.pm                |
    | 2|MANIFEST                          | 3|MANIFEST                      |
    |  |                                  * 4|MANIFEST.SKIP                 *
    | 3|Makefile.PL                       | 5|Makefile.PL                   |
    |  |                                  * 6|t/00escape.t                  *
    | 4|t/00flatten.t                     | 7|t/00flatten.t                 |
    | 5|t/01text_vs_data.t                | 8|t/01text_vs_data.t            |
    | 6|t/10test.t                        | 9|t/10test.t                    |
    +--+----------------------------------+--+------------------------------+
208
This format also goes to some pains to highlight "invisible"
characters on differing elements by selectively escaping whitespace:
211
    +--+--------------------------+--------------------------+
    |  |demo_ws_A.txt             |demo_ws_B.txt             |
    |  |Fri Dec 21 08:36:32 2001  |Fri Dec 21 08:36:50 2001  |
    +--+--------------------------+--------------------------+
    | 1|identical                 |identical                 |
    * 2| spaced in                | also spaced in           *
    * 3|embedded space            |embedded tab              *
    | 4|identical                 |identical                 |
    * 5| spaced in                |\ttabbed in               *
    * 6|trailing spaces\s\s\n     |trailing tabs\t\t\n       *
    | 7|identical                 |identical                 |
    * 8|lf line\n                 |crlf line\r\n             *
    * 9|embedded ws               |embedded\tws              *
    +--+--------------------------+--------------------------+
226
See "Text::Diff::Table" for more details, including how the
whitespace escaping works.
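Selecting the table format is a matter of passing its style name. The
sketch below assumes Text::Diff::Table is installed alongside
Text::Diff; the sample strings are made up and differ only in their
line endings.

```perl
use strict;
use warnings;
use Text::Diff;

my $old = "same\nlf line\n";
my $new = "same\ncrlf line\r\n";

# "Table" is resolved to Text::Diff::Table, which diff() loads on demand.
my $table = diff \$old, \$new, { STYLE => "Table" };
print $table;
```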
229
230 Text::Diff::Context
231
    *** A Mon Nov 12 23:49:30 2001
    --- B Mon Nov 12 23:49:30 2001
    ***************
    *** 2,14 ****
      2
      3
      4
    ! 5d
      6
      7
      8
      9
      10
      11
    - 11d
      12
      13
    --- 2,14 ----
      2
      3
      4
    ! 5a
      6
      7
      8
      9
    + 9a
      10
      11
      12
      13
263
264 Note: hunk_header() returns only "***************\n".
265
Must suck both input files entirely into memory and store them with a
normal amount of Perlish overhead (one array location) per record.
This is implied by the implementation of Algorithm::Diff, which takes
two arrays. If Algorithm::Diff ever offers an incremental mode, this
can be changed (contact the maintainers of Algorithm::Diff and
Text::Diff if you need this; it shouldn't be too terribly hard to tie
arrays in this fashion).
274
Does not provide most of the more refined GNU diff options: recursive
directory tree scanning, ignoring blank lines / whitespace, etc.
These can all be added as time permits and need arises; many are rather
easy, and patches are quite welcome.
279
Uses closures internally; this may lead to leaks on "perl" versions
5.6.1 and prior if used many times over a process's lifetime.
282
284 Barrie Slaymaker <barries@slaysys.com>.
285
287 Copyright 2001, Barrie Slaymaker. All Rights Reserved.
288
289 You may use this under the terms of either the Artistic License or GNU
290 Public License v 2.0 or greater.
291
292
293
perl v5.8.8 2002-08-27 Text::Diff(3)