1Text::Diff(3) User Contributed Perl Documentation Text::Diff(3)
2
3
4
6 Text::Diff - Perform diffs on files and record sets
7
    use Text::Diff;

    ## Mix and match filenames, strings, file handles, producer subs,
    ## or arrays of records; returns diff in a string.
    ## WARNING: can return large diffs for large files.
    my $diff = diff "file1.txt", "file2.txt", { STYLE => "Context" };
    my $diff = diff \$string1, \$string2, \%options;
    my $diff = diff \*FH1, \*FH2;
    my $diff = diff \&reader1, \&reader2;
    my $diff = diff \@records1, \@records2;

    ## May also mix input types:
    my $diff = diff \@records1, "file_B.txt";
22
24 "diff()" provides a basic set of services akin to the GNU "diff"
25 utility. It is not anywhere near as feature complete as GNU "diff",
26 but it is better integrated with Perl and available on all platforms.
27 It is often faster than shelling out to a system's "diff" executable
28 for small files, and generally slower on larger files.
29
30 Relies on Algorithm::Diff for, well, the algorithm. This may not
31 produce the same exact diff as a system's local "diff" executable, but
32 it will be a valid diff and comprehensible by "patch". We haven't seen
33 any differences between Algorithm::Diff's logic and GNU diff's, but we
34 have not examined them to make sure they are indeed identical.
35
36 Note: If you don't want to import the "diff" function, do one of the
37 following:
38
39 use Text::Diff ();
40
41 require Text::Diff;
42
That's a pretty rare occurrence, so "diff()" is exported by default.
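As a minimal sketch of the non-importing form, the function can still be reached by its fully qualified name. The sample strings here are made up for illustration.

```perl
use strict;
use warnings;
use Text::Diff ();    # empty import list: diff() is not exported into this package

# Hypothetical sample data.
my $old = "a\nb\nc\n";
my $new = "a\nB\nc\n";

# Call the function through its package name instead of the imported alias.
my $diff = Text::Diff::diff( \$old, \$new );
print $diff;
```

The "require Text::Diff;" form is called the same way; it only defers loading to run time.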

OPTIONS
45
diff() takes two parameters from which to draw input and a set of
options to control its output. The options are:
48
49 FILENAME_A, MTIME_A, FILENAME_B, MTIME_B
The name and modification time of each input file.
51
52 These are filled in automatically for each file when diff() is
53 passed a filename, unless a defined value is passed in.
54
55 If a filename is not passed in and FILENAME_A and FILENAME_B are
56 not provided or "undef", the header will not be printed.
57
58 Unused on "OldStyle" diffs.
59
60 OFFSET_A, OFFSET_B
61 The index of the first line / element. These default to 1 for all
62 parameter types except ARRAY references, for which the default is
63 0. This is because ARRAY references are presumed to be data
64 structures, while the others are line oriented text.
65
66 STYLE
67 "Unified", "Context", "OldStyle", or an object or class reference
68 for a class providing "file_header()", "hunk_header()", "hunk()",
69 "hunk_footer()" and "file_footer()" methods. The two footer()
70 methods are provided for overloading only; none of the formats
71 provide them.
72
Defaults to "Unified" (unlike standard "diff", but Unified is
what's most often used in submitting patches and is the most human
readable of the three).
76
If the package indicated by the STYLE has no hunk() method,
"diff()" will load it automatically (lazy loading). Since all
such packages should inherit from Text::Diff::Base, this should be
marvy.
81
Styles may be specified as class names ("STYLE => "Foo""), in which
case they will be "new()"ed with no parameters, or as objects
("STYLE => Foo->new").
85
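The following is a sketch of a custom style passed by class name, not a definitive recipe. The class name My::Diff::Summary is hypothetical, and the layout of each op as [ index_in_A, index_in_B, opcode ] is an assumption about how diff() builds the list it hands to hunk().

```perl
use strict;
use warnings;
use Text::Diff;

package My::Diff::Summary;    # hypothetical style class
use parent -norequire, 'Text::Diff::Base';

# Emit one summary line per hunk instead of the changed lines themselves.
# Assumption: each op is [ index_in_A, index_in_B, opcode ], where the
# opcode is "-", "+" or " ".
sub hunk {
    my ( $self, $seq_a, $seq_b, $ops, $options ) = @_;
    my $removed = grep { $_->[2] eq '-' } @$ops;
    my $added   = grep { $_->[2] eq '+' } @$ops;
    return "hunk: -$removed +$added\n";
}

package main;

my @old = map "$_\n", qw( a b c );
my @new = map "$_\n", qw( a B c d );

# The class inherits new() and the empty header/footer methods from
# Text::Diff::Base, so only hunk() contributes output.
my $summary = diff \@old, \@new, { STYLE => 'My::Diff::Summary' };
print $summary;
```

Passing "STYLE => My::Diff::Summary->new" instead should behave the same way, since the class keeps no instance data.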
86 CONTEXT
87 How many lines before and after each diff to display. Ignored on
88 old-style diffs. Defaults to 3.
89
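A small sketch of the effect of CONTEXT, using made-up ten-line inputs that differ in a single line:

```perl
use strict;
use warnings;
use Text::Diff;

# Ten numbered lines, with line 5 changed in the second copy.
my $old = join "", map "$_\n", 1 .. 10;
( my $new = $old ) =~ s/^5$/five/m;

my $wide   = diff \$old, \$new;                       # default CONTEXT of 3
my $narrow = diff \$old, \$new, { CONTEXT => 1 };     # one line either side

print "default context:\n$wide\n";
print "CONTEXT => 1:\n$narrow";
```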
90 OUTPUT
91 Examples and their equivalent subroutines:
92
93 OUTPUT => \*FOOHANDLE, # like: sub { print FOOHANDLE shift() }
94 OUTPUT => \$output, # like: sub { $output .= shift }
95 OUTPUT => \@output, # like: sub { push @output, shift }
96 OUTPUT => sub { $output .= shift },
97
98 If no "OUTPUT" is supplied, returns the diffs in a string. If
99 "OUTPUT" is a "CODE" ref, it will be called once with the
100 (optional) file header, and once for each hunk body with the text
101 to emit. If "OUTPUT" is an IO::Handle, output will be emitted to
102 that handle.
103
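As an illustration, the three reference forms of OUTPUT can be compared against the plain string result. The input strings are made up, and the assumption checked here is only that every form receives the same text overall.

```perl
use strict;
use warnings;
use Text::Diff;

my $old = "a\nb\nc\n";
my $new = "a\nB\nc\n";

# No OUTPUT: the diff comes back as a string.
my $as_string = diff \$old, \$new;

# ARRAY ref: each emitted piece is pushed onto the array.
my @chunks;
diff \$old, \$new, { OUTPUT => \@chunks };

# SCALAR ref: pieces are appended to the scalar.
my $buffer = "";
diff \$old, \$new, { OUTPUT => \$buffer };

# CODE ref: called once per emitted piece.
my $via_sub = "";
diff \$old, \$new, { OUTPUT => sub { $via_sub .= shift } };

print "all forms agree\n"
    if join( "", @chunks ) eq $as_string
    && $buffer eq $as_string
    && $via_sub eq $as_string;
```

A CODE ref is the most flexible choice when the diff should be streamed somewhere rather than held in memory as one string.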
104 FILENAME_PREFIX_A, FILENAME_PREFIX_B
The string to print before the filename in the header. Unused on
"OldStyle" diffs. Defaults are "---", "+++" for Unified and "***",
"---" for Context.
108
109 KEYGEN, KEYGEN_ARGS
110 These are passed to "traverse_sequences" in Algorithm::Diff.
111
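One possible use of KEYGEN, sketched under the assumption that the key generator receives each record as its first argument (as Algorithm::Diff's key generation functions do), is a case-insensitive comparison:

```perl
use strict;
use warnings;
use Text::Diff;

# Made-up records that differ only in letter case.
my @old = ( "Hello\n", "World\n" );
my @new = ( "hello\n", "WORLD\n" );

# Default comparison: the records are treated as different.
my $strict = diff \@old, \@new;

# With a key generator that lowercases each record, they compare equal,
# so no hunks are produced.
my $relaxed = diff \@old, \@new, { KEYGEN => sub { lc $_[0] } };

print "strict:\n$strict";
print "relaxed is empty\n" if $relaxed eq "";
```

The same idea could cover other normalizations, such as collapsing whitespace before comparing.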
Note: if neither "FILENAME_" option is defined, the header will not be
printed. If one is present, the other and both MTIME_ options must
be present or "Use of uninitialized value" warnings will be generated
(except on "OldStyle" diffs, which ignore these options).
116
These functions implement the output formats. They are grouped into
classes so diff() can use class names to call the correct set of output
routines and so that you may inherit from them easily. There are no
constructors or instance methods for these classes, though subclasses
may provide them if need be.
123
Each class has file_header(), hunk_header(), hunk(), hunk_footer()
and file_footer() methods identical to those documented in the
Text::Diff::Unified section. file_header() is called before hunk() is
first called, file_footer() after the last hunk. The default footer
methods are empty methods provided for overloading:

    sub file_footer { return "End of patch\n" }
131
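A sketch of overriding a footer by inheriting from one of the built-in classes. The class name My::Diff::WithFooter is hypothetical, and the hook is assumed to be named file_footer(), as in the method list under the STYLE option.

```perl
use strict;
use warnings;
use Text::Diff;

package My::Diff::WithFooter;    # hypothetical subclass
use parent -norequire, 'Text::Diff::Unified';

# Everything else is inherited from Text::Diff::Unified; only the
# end-of-diff hook is replaced.
sub file_footer { return "End of patch\n" }

package main;

my $old = "a\nb\nc\n";
my $new = "a\nB\nc\n";

my $patch = diff \$old, \$new, { STYLE => 'My::Diff::WithFooter' };
print $patch;
```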
Some output formats are provided by external modules (which are loaded
automatically), such as Text::Diff::Table. These are documented
here to keep the documentation simple.
135
136 Text::Diff::Base
137 Returns "" for all methods (other than "new()").
138
139 Text::Diff::Unified
140 --- A Mon Nov 12 23:49:30 2001
141 +++ B Mon Nov 12 23:49:30 2001
142 @@ -2,13 +2,13 @@
143 2
144 3
145 4
146 -5d
147 +5a
148 6
149 7
150 8
151 9
152 +9a
153 10
154 11
155 -11d
156 12
157 13
158
159 file_header
160 $s = Text::Diff::Unified->file_header( $options );
161
162 Returns a string containing a unified header. The sole parameter
163 is the options hash passed in to diff(), containing at least:
164
165 FILENAME_A => $fn1,
166 MTIME_A => $mtime1,
167 FILENAME_B => $fn2,
168 MTIME_B => $mtime2
169
170 May also contain
171
172 FILENAME_PREFIX_A => "---",
173 FILENAME_PREFIX_B => "+++",
174
175 to override the default prefixes (default values shown).
176
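For illustration, file_header() can be called directly with a hand-built options hash. The filenames and the ">>>"-style prefixes below are made up, and the MTIME values are assumed to be epoch seconds.

```perl
use strict;
use warnings;
use Text::Diff;

# Hypothetical filenames; MTIME values assumed to be epoch seconds.
my %options = (
    FILENAME_A => "a.txt",
    MTIME_A    => 0,
    FILENAME_B => "b.txt",
    MTIME_B    => 0,
);

# Default prefixes.
my $default = Text::Diff::Unified->file_header( \%options );

# Overridden prefixes (made-up values).
my $custom = Text::Diff::Unified->file_header(
    { %options, FILENAME_PREFIX_A => "<<<", FILENAME_PREFIX_B => ">>>" } );

print $default, $custom;
```

The timestamp text depends on the local time zone, so only the prefix and filename parts are stable across machines.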
177 hunk_header
178 Text::Diff::Unified->hunk_header( \@ops, $options );
179
Returns a string containing the header of one hunk of unified diff
(the "@@ -2,13 +2,13 @@" line in the example above).
181
hunk
183 Text::Diff::Unified->hunk( \@seq_a, \@seq_b, \@ops, $options );
184
185 Returns a string containing the output of one hunk of unified diff.
186
187 Text::Diff::Table
+--+----------------------------------+--+------------------------------+
|  |../Test-Differences-0.2/MANIFEST  |  |../Test-Differences/MANIFEST  |
|  |Thu Dec 13 15:38:49 2001          |  |Sat Dec 15 02:09:44 2001      |
+--+----------------------------------+--+------------------------------+
|  |                                  * 1|Changes                       *
| 1|Differences.pm                    | 2|Differences.pm                |
| 2|MANIFEST                          | 3|MANIFEST                      |
|  |                                  * 4|MANIFEST.SKIP                 *
| 3|Makefile.PL                       | 5|Makefile.PL                   |
|  |                                  * 6|t/00escape.t                  *
| 4|t/00flatten.t                     | 7|t/00flatten.t                 |
| 5|t/01text_vs_data.t                | 8|t/01text_vs_data.t            |
| 6|t/10test.t                        | 9|t/10test.t                    |
+--+----------------------------------+--+------------------------------+
202
203 This format also goes to some pains to highlight "invisible" characters
204 on differing elements by selectively escaping whitespace:
205
+--+--------------------------+--------------------------+
|  |demo_ws_A.txt             |demo_ws_B.txt             |
|  |Fri Dec 21 08:36:32 2001  |Fri Dec 21 08:36:50 2001  |
+--+--------------------------+--------------------------+
| 1|identical                 |identical                 |
* 2| spaced in                | also spaced in           *
* 3|embedded space            |embedded tab              *
| 4|identical                 |identical                 |
* 5| spaced in                |\ttabbed in               *
* 6|trailing spaces\s\s\n     |trailing tabs\t\t\n       *
| 7|identical                 |identical                 |
* 8|lf line\n                 |crlf line\r\n             *
* 9|embedded ws               |embedded\tws              *
+--+--------------------------+--------------------------+
220
221 See "Text::Diff::Table" for more details, including how the whitespace
222 escaping works.
223
224 Text::Diff::Context
225 *** A Mon Nov 12 23:49:30 2001
226 --- B Mon Nov 12 23:49:30 2001
227 ***************
228 *** 2,14 ****
229 2
230 3
231 4
232 ! 5d
233 6
234 7
235 8
236 9
237 10
238 11
239 - 11d
240 12
241 13
242 --- 2,14 ----
243 2
244 3
245 4
246 ! 5a
247 6
248 7
249 8
250 9
251 + 9a
252 10
253 11
254 12
255 13
256
257 Note: hunk_header() returns only "***************\n".
258
259 Text::Diff::OldStyle
260 5c5
261 < 5d
262 ---
263 > 5a
264 9a10
265 > 9a
266 12d12
267 < 11d
268
269 Note: no file_header().
270
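To compare the three built-in formats side by side, a sketch along these lines runs the same made-up input through each style name:

```perl
use strict;
use warnings;
use Text::Diff;

my $old = "a\nb\nc\n";
my $new = "a\nB\nc\n";

# Run the same comparison once per built-in style and keep the results.
my %by_style;
for my $style (qw( Unified Context OldStyle )) {
    $by_style{$style} = diff \$old, \$new, { STYLE => $style };
    print "== $style ==\n", $by_style{$style}, "\n";
}
```

Because the inputs are strings rather than filenames, none of the three results carries a file header.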
Must suck both input files entirely into memory and store them with a
normal amount of Perlish overhead (one array location) per record.
This is implied by the implementation of Algorithm::Diff, which takes
two arrays. If Algorithm::Diff ever offers an incremental mode, this
can be changed (contact the maintainers of Algorithm::Diff and
Text::Diff if you need this; it shouldn't be too terribly hard to tie
arrays in this fashion).
279
Does not provide most of the more refined GNU diff options: recursive
directory tree scanning, ignoring blank lines / whitespace, etc., etc.
These can all be added as time permits and need arises. Many are rather
easy; patches quite welcome.
284
Uses closures internally; this may lead to leaks on "perl" versions
5.6.1 and prior if used many times over a process's lifetime.
287
289 Adam Kennedy <adamk@cpan.org>
290
291 Barrie Slaymaker <barries@slaysys.com>
292
294 Some parts copyright 2009 Adam Kennedy.
295
296 Copyright 2001 Barrie Slaymaker. All Rights Reserved.
297
298 You may use this under the terms of either the Artistic License or GNU
299 Public License v 2.0 or greater.
300
301
302
303perl v5.12.0 2009-07-16 Text::Diff(3)