XML::DifferenceMarkup(3)  User Contributed Perl Documentation  XML::DifferenceMarkup(3)

NAME

       XML::DifferenceMarkup - XML diff and merge

SYNOPSIS

        use XML::DifferenceMarkup qw(make_diff);
        use XML::LibXML;

        $parser = XML::LibXML->new(keep_blanks => 0, load_ext_dtd => 0);
        $d1 = $parser->parse_file($fname1);
        $d2 = $parser->parse_file($fname2);

        $dom = make_diff($d1, $d2);
        print $dom->toString(1);

DESCRIPTION

       This module implements an XML diff producing XML output. Both input and
       output are DOM documents, as implemented by XML::LibXML.

       The diff format used by XML::DifferenceMarkup is meant to be
       human-readable (i.e. simple, as opposed to short) - basically the diff
       is a subset of the input trees, annotated with instruction element
       nodes specifying how to convert the source tree to the target by
       inserting and deleting nodes. To prevent name collisions with input
       trees, all added elements are in a namespace
       "http://www.locus.cz/diffmark" (the diff will fail on input trees which
       already use that namespace).

       The top-level node of the diff is always <diff/> (or rather <dm:diff
       xmlns:dm="http://www.locus.cz/diffmark"> ... </dm:diff> - this
       description omits the namespace specification from now on); under it
       are fragments of the input trees and instruction nodes: <insert/>,
       <delete/> and <copy/>. <copy/> is used in places where the input
       subtrees are the same - in the limit, the diff of 2 identical documents
       is

        <?xml version="1.0"?>
        <dm:diff xmlns:dm="http://www.locus.cz/diffmark">
          <dm:copy count="1"/>
        </dm:diff>

       (copy always has the count attribute and no other content). <insert/>
       and <delete/> have the obvious meaning - in the limit a diff of 2
       documents which have nothing in common is something like

        <?xml version="1.0"?>
        <dm:diff xmlns:dm="http://www.locus.cz/diffmark">
          <dm:delete>
            <old/>
          </dm:delete>
          <dm:insert>
            <new>
              <tree>with the whole subtree, of course</tree>
            </new>
          </dm:insert>
        </dm:diff>

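       The first limiting case above (two identical documents) can be
       reproduced with a short script. This is a minimal sketch: the sample
       document is made up for illustration, and the expectation of a single
       copy instruction with count="1" is taken from the description above
       rather than from any further guarantee.

```perl
use strict;
use warnings;
use XML::DifferenceMarkup qw(make_diff);
use XML::LibXML;

# Hypothetical sample input; any well-formed document should do.
my $xml = '<root><item>same</item></root>';

my $parser = XML::LibXML->new(keep_blanks => 0, load_ext_dtd => 0);
my $d1 = $parser->parse_string($xml);
my $d2 = $parser->parse_string($xml);

my $diff = make_diff($d1, $d2);
print $diff->toString(1);

# Element children of the top-level diff node, in any namespace.
my @children = $diff->documentElement->findnodes('*');

for my $node (@children) {
    printf "instruction: %s, count: %s\n",
        $node->localname, $node->getAttribute('count') // '(none)';
}
```

       Parsing the same string twice keeps the two DOM trees independent, so
       the script does not rely on how make_diff treats a document compared
       with itself.
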
       A combination of <insert/>, <delete/> and <copy/> can capture any
       difference, but it's sub-optimal for the case where (for example) the
       top-level elements in the two input documents differ while their
       subtrees are exactly the same. This case is handled by putting the
       element from the second document into the diff, adding to it a special
       attribute dm:update (whose value is the element name from the first
       document) marking the element change:

        <?xml version="1.0"?>
        <dm:diff xmlns:dm="http://www.locus.cz/diffmark">
          <top-of-second dm:update="top-of-first">
            <dm:copy count="42"/>
          </top-of-second>
        </dm:diff>

       <delete/> contains just one level of nested nodes - their subtrees are
       not included in the diff (but the element nodes which are included
       always come with all their attributes). <insert/> and <delete/> don't
       have any attributes and always contain some subtree.

       Instruction nodes are never nested; all nodes above an instruction node
       (except the top-level <diff/>) come from the input trees. A node from
       the second input tree might be included in the output diff to provide
       context for instruction nodes when it's an element node whose subtree
       is not the same in the two input documents. When such an element has
       the same name, attributes (names and values) and namespace declarations
       in both input documents, it's always included in the diff (its
       different output trees guarantee that it will have some children
       there). If the corresponding elements are different, the one from the
       second document might still be included, with an added dm:update
       attribute, provided that both corresponding elements have non-empty
       subtrees, and these subtrees are so similar that deleting the first
       corresponding element and inserting the second would lead to a larger
       diff. And if this paragraph seems too complicated, don't despair - just
       ignore it and look at some examples.

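       One way to explore the structure described above is to diff two small
       documents and list the instruction nodes together with the context
       element they sit under. The following is only a sketch: the input
       documents are invented, and the namespace URI is read from the
       generated diff instead of being hard-coded.

```perl
use strict;
use warnings;
use XML::DifferenceMarkup qw(make_diff);
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical inputs that differ in one child element.
my $parser = XML::LibXML->new(keep_blanks => 0, load_ext_dtd => 0);
my $d1 = $parser->parse_string('<list><a/><b/><c/></list>');
my $d2 = $parser->parse_string('<list><a/><x/><c/></list>');

my $diff = make_diff($d1, $d2);
print $diff->toString(1);

# Take the diff namespace from the root element of the result.
my $ns = $diff->documentElement->namespaceURI;

my $xpc = XML::LibXML::XPathContext->new($diff);
$xpc->registerNs(dm => $ns);

# Every insert, delete and copy instruction anywhere in the diff.
my @instructions =
    $xpc->findnodes('//dm:insert | //dm:delete | //dm:copy');

for my $node (@instructions) {
    printf "%-6s under <%s>\n",
        $node->localname, $node->parentNode->nodeName;
}
```

       The exact set of instructions printed depends on the diff algorithm,
       so the script reports what it finds rather than assuming a particular
       layout.
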

FUNCTIONS

       Note that XML::DifferenceMarkup functions must be explicitly imported
       (i.e. with "use XML::DifferenceMarkup qw(make_diff merge_diff);")
       before they can be called.

   make_diff
       "make_diff" takes 2 parameters (the input documents) and produces their
       diff. Note that the diff is asymmetric - "make_diff($a, $b)" is
       different from "make_diff($b, $a)".

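       The asymmetry can be seen by diffing the same pair of documents in
       both directions. This is an illustrative sketch with made-up inputs;
       the element names "old" and "new" are not part of the module.

```perl
use strict;
use warnings;
use XML::DifferenceMarkup qw(make_diff);
use XML::LibXML;

# Hypothetical inputs with one differing child element.
my $parser = XML::LibXML->new(keep_blanks => 0, load_ext_dtd => 0);
my $d1 = $parser->parse_string('<r><old/></r>');
my $d2 = $parser->parse_string('<r><new/></r>');

# Diff in both directions and serialize the results.
my $forward = make_diff($d1, $d2)->toString(1);
my $reverse = make_diff($d2, $d1)->toString(1);

print "d1 -> d2:\n$forward\n";
print "d2 -> d1:\n$reverse\n";
print $forward eq $reverse ? "diffs are equal\n" : "diffs differ\n";
```
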
   merge_diff
       "merge_diff" takes the first document passed to "make_diff" and its
       return value and produces the second document. (More-or-less - the
       document isn't canonicalized, so opinions on its "equality" may
       differ.)

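       A round trip through both functions might look like the sketch below.
       The input documents are hypothetical, and comparing canonical (C14N)
       serializations is just one possible notion of equality, chosen here
       because the merged document is not canonicalized.

```perl
use strict;
use warnings;
use XML::DifferenceMarkup qw(make_diff merge_diff);
use XML::LibXML;

# Hypothetical inputs for the round trip.
my $parser = XML::LibXML->new(keep_blanks => 0, load_ext_dtd => 0);
my $d1 = $parser->parse_string('<doc><p>one</p><p>two</p></doc>');
my $d2 = $parser->parse_string('<doc><p>one</p><p>three</p></doc>');

# Diff the documents, then apply the diff to the first one.
my $diff   = make_diff($d1, $d2);
my $merged = merge_diff($d1, $diff);

print $merged->toString(1);

# Compare canonical forms instead of raw serializations.
my $same = $merged->toStringC14N() eq $d2->toStringC14N();
print $same ? "round trip matches\n" : "round trip differs\n";
```
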
   Error Handling
       Both "make_diff" and "merge_diff" throw exceptions on invalid input -
       their own exceptions as well as exceptions thrown by XML::LibXML. These
       exceptions can usually (probably not always, though - it used to be
       possible to construct an input which would crash the calling process)
       be caught by calling the functions from an eval block.

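       A guarded call could be sketched as follows. The second input is a
       hypothetical example of bad input: it already uses the diff namespace,
       which the DESCRIPTION says the diff rejects. The script handles both
       outcomes and does not assume that an exception is actually thrown.

```perl
use strict;
use warnings;
use XML::DifferenceMarkup qw(make_diff);
use XML::LibXML;

my $parser = XML::LibXML->new(keep_blanks => 0, load_ext_dtd => 0);
my $d1 = $parser->parse_string('<r><a/></r>');

# Hypothetical problematic input: it uses the diffmark namespace itself.
my $d2 = $parser->parse_string(
    '<r xmlns:dm="http://www.locus.cz/diffmark"><dm:a/></r>');

my $diff;
my $ok = eval { $diff = make_diff($d1, $d2); 1 };

my $status;
if ($ok) {
    $status = 'ok';
    print $diff->toString(1);
}
else {
    # $@ may hold a string or an exception object; stringify it.
    my $err = $@ || 'unknown error';
    $status = 'failed';
    print "make_diff failed: $err\n";
}
```

       Setting a flag as the last statement of the eval block avoids
       confusing a false return value with a thrown exception.
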

BUGS

       •   information outside the document element is not processed

AUTHOR

       Vaclav Barta <vbar@comp.cz>

SEE ALSO

       XML::LibXML

perl v5.32.1                      2021-01-27          XML::DifferenceMarkup(3)