XML::LibXML::PrettyPrint(3)  User Contributed Perl Documentation  XML::LibXML::PrettyPrint(3)

NAME

       XML::LibXML::PrettyPrint - add pleasant whitespace to a DOM tree

SYNOPSIS

        use XML::LibXML;
        use XML::LibXML::PrettyPrint;

        my $document = XML::LibXML->new->parse_file('in.xml');
        my $pp = XML::LibXML::PrettyPrint->new(indent_string => "  ");
        $pp->pretty_print($document); # modified in-place
        print $document->toString;

DESCRIPTION

       Long XML files can be daunting for humans to read. Of course, XML is
       really designed for computers to read - not people - but there are
       times when mere mortals do need to read and edit XML by hand: for
       example, when your application stores its configuration in XML, or
       when you need to dump some XML to STDOUT for debugging purposes.

       Syntax highlighting helps, but to really make sense of some XML, proper
       indentation can be vital. Hence "XML::LibXML::PrettyPrint" - it can be
       applied to an XML::LibXML DOM tree to reformat it into a more readable
       result.

       Pretty-printing XML is not as CPU-efficient as dumping it out sloppily,
       so unless you're pretty sure that a human is going to need to make
       sense of your XML, you should probably not use this module.

   Constructors
       new(%options)
           Constructs a pretty-printer object.

           Options:

           •   indent_string - The string to use to indent each line. Defaults
               to a single tab character. Setting it to a non-whitespace
               character is allowed, but will carp a warning.

           •   new_line - The string to use to begin a new line. Defaults to
               "\n".

           •   element - A hashref of element categorisations. Each
               categorisation is a reference to an array of element names or
               callback functions. Element names may use Clark notation.

                 # Callback: claim any node carrying an is_block attribute
                 # for the category it is listed under.
                 my $callback = sub {
                   my $node = shift;
                   return 1 if $node->hasAttribute('is_block');
                   return undef;  # no opinion about this node
                 };
                 my $pp = XML::LibXML::PrettyPrint->new(
                     element => {
                         inline   => [qw/span strong em b i a/],
                         block    => [qw/p div body html head/, $callback],
                         compact  => [qw/title caption li dd dt th td/],
                         preserves_whitespace => [qw/pre script style/],
                     },
                 );

               Callbacks should return 1 (true), 0 (false) or undef (dunno).

       new_for_html(%options)
           Constructs a pretty-printer object pre-configured to be suitable
           for HTML and XHTML. The indent_string and new_line options are
           supported.
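
       As a rough illustration of the two constructors, the sketch below
       builds one generic printer and one HTML-aware printer and runs the
       same input through both. The input string and the four-space indent
       are made up for the example, and the exact output is not shown
       because it depends on the element categories in effect.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::PrettyPrint;

# Hypothetical input; any well-formed markup would do.
my $xhtml = '<div><p>Hello <em>world</em></p><pre>  keep   this  </pre></div>';

# Generic printer: four-space indent instead of the default tab.
my $generic = XML::LibXML::PrettyPrint->new(indent_string => '    ');

# HTML-aware printer: same indent, element categories pre-configured.
my $for_html = XML::LibXML::PrettyPrint->new_for_html(indent_string => '    ');

for my $pp ($generic, $for_html) {
    # Parse a fresh copy each time, because pretty_print edits in place.
    my $doc = XML::LibXML->load_xml(string => $xhtml);
    $pp->pretty_print($doc);
    print $doc->toString, "\n";
}
```

       Comparing the two outputs is a quick way to see what the HTML
       pre-configuration changes for your own documents.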

   Methods
       If you just need to use a default configuration (no options passed to
       the constructor), then you can call these as class methods, unless
       otherwise stated.

       strip_whitespace($node)
           Strips superfluous whitespace from an "XML::LibXML::Document" or
           "XML::LibXML::Element".

           Whitespace just before, just after or leading/trailing within an
           inline element is not considered superfluous. Runs of multiple
           whitespace characters are replaced with a single space. Whitespace
           is not changed within an element that preserves whitespace.

           The node is modified in place.

       "indent($node, $level)"
           Indents the node to a certain indentation level, and its direct
           children to "$level + 1", grandchildren to "$level + 2", etc.
           Typically you'd just want to indent the root node to level 0.

           The node is modified in place.

           Elements that preserve whitespace are not changed.
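
           The two methods above can also be called one after the other,
           which is handy if you want to inspect the tree between the
           stripping and indenting steps. A minimal sketch, with a made-up
           input string and element names:

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::PrettyPrint;

# Hypothetical, deliberately untidy input.
my $doc = XML::LibXML->load_xml(
    string => '<root>  <item>  one  </item>  <item>two</item>  </root>',
);
my $root = $doc->documentElement;

my $pp = XML::LibXML::PrettyPrint->new(indent_string => '  ');

$pp->strip_whitespace($root);   # step 1: drop superfluous whitespace
$pp->indent($root, 0);          # step 2: indent, root element at level 0

print $doc->toString;
```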

       "pretty_print($node, $level)"
           Strip whitespace and indent. The node is modified in place and
           returned.

           Example use as a class method:

            print XML::LibXML::PrettyPrint
              ->pretty_print(XML::LibXML->new->parse_string($XML))
              ->toString;

       indent_string($level)
           Returns the string that would be used to indent something to a
           particular level. Descendant classes could override this method to
           do funky indentation, such as having varying levels of indentation.
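
           A minimal sketch of such an override follows. The package name
           My::SteppedPrettyPrint and its indentation scheme (two spaces per
           level for the first two levels, one space per level after that)
           are invented for illustration. It assumes, as the paragraph above
           suggests, that the indenting code asks this method for its
           strings.

```perl
use strict;
use warnings;

package My::SteppedPrettyPrint;   # hypothetical subclass name
use parent 'XML::LibXML::PrettyPrint';

# Two spaces per level for levels 1 and 2, then one space per deeper level.
sub indent_string {
    my ($self, $level) = @_;
    $level = 0 unless defined $level;
    my $shallow = $level < 2 ? $level : 2;
    my $deep    = $level - $shallow;
    return ('  ' x $shallow) . (' ' x $deep);
}

package main;
use XML::LibXML;

my $pp  = My::SteppedPrettyPrint->new;
my $doc = XML::LibXML->load_xml(string => '<a><b><c><d>text</d></c></b></a>');
print $pp->pretty_print($doc)->toString;
```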

       "new_line"
           Returns the string that would be used to begin a new line.

       element_category($node)
           Returns EL_INLINE, EL_BLOCK, EL_COMPACT or undef.

       element_preserves_whitespace($node)
           Boolean indicating whether the contents of the element have
           significant whitespace that needs preserving.

           Returns undef if $node is not an "XML::LibXML::Element".
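
           These two methods can be used to check how a given configuration
           classifies the elements of a document before printing it. The
           sketch below is only an illustration: the element names and the
           %label lookup table are assumptions made for the example, and the
           constants are imported with the :constants tag.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::PrettyPrint qw(:constants);

# Map each category constant to a printable name (illustrative helper).
my %label = (
    EL_BLOCK()   => 'block',
    EL_COMPACT() => 'compact',
    EL_INLINE()  => 'inline',
);

my $pp = XML::LibXML::PrettyPrint->new(
    element => {
        inline               => [qw/em/],
        compact              => [qw/li/],
        preserves_whitespace => [qw/pre/],
    },
);

my $doc = XML::LibXML->load_xml(
    string => '<ul><li>one <em>two</em></li><pre> x </pre></ul>',
);

# Report the category and whitespace handling of every element.
for my $el ($doc->findnodes('//*')) {
    my $category = $pp->element_category($el);
    my $name  = defined $category ? ($label{$category} // 'unknown') : 'undef';
    my $keeps = $pp->element_preserves_whitespace($el) ? 'yes' : 'no';
    printf "%-4s category=%s preserves_whitespace=%s\n",
        $el->nodeName, $name, $keeps;
}
```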

   Functions
       "print_xml $xml"
           Given an XML string or an XML::LibXML::Node object, prints it
           nicely.

           This function is not exported by default, but can be requested:

            use XML::LibXML::PrettyPrint 0.001 qw(print_xml);

           Use like this:

            print_xml '<foo> <bar> </bar> </foo>';

       "IO::Handle::print_xml($handle, $xml)"
           Partly experimental, partly mental. You can enable this feature
           like this:

            use XML::LibXML::PrettyPrint 0.001 qw(-io);

           And that will allow stuff like this to work:

            open LOG, '>', 'mylog.xml' or die "Cannot open mylog.xml: $!";
            print_xml LOG '<foo> <bar> </bar> </foo>';
            close LOG;

            open my $log, '>', 'otherlog.xml' or die "Cannot open otherlog.xml: $!";
            print_xml $log '<foo> <bar> </bar> </foo>';
            close $log;

            print_xml STDERR '<foo> <bar> </bar> </foo>';

   Constants
       These can be exported:

        use XML::LibXML::PrettyPrint 0.001 qw(:constants);

       "EL_BLOCK"
       "EL_COMPACT"
       "EL_INLINE"

ELEMENT CATEGORIES

       There are three categories of element: inline, block and compact.

       For inline elements the presence of whitespace (though not the amount
       of whitespace) is considered significant just before the element, just
       after the element, or just within the element.

       In XHTML, consider the difference between the block element "<div>":

        <div>Will</div><div>Carlton</div> <div>Ashley</div>

       and the inline element "<span>":

        <span>Spider</span>-<span>Man</span> <span>lives</span>

       The space or lack thereof between "<div>" elements does not matter one
       whit. The lack of spaces between the first two "<span>" elements allows
       them to be read as a single (in this case, hyphenated) word, whereas
       the space before the third "<span>" separates out the word "lives".

       In terms of indentation, inline elements do not start a new indented
       line, unless they are the first element within their block, or are
       preceded by a block or compact element.

       Block elements always start a new line, and cause their child nodes to
       be indented to the next level.

       Compact elements are somewhere in-between. When it comes to whitespace
       stripping, they're treated as block elements. In terms of indentation,
       they always start a new line, but they only cause their child nodes to
       be indented to the next level if they have block descendants. If we
       imagine that in HTML, "<ul>" is a block element, "<i>" is an inline
       element, and "<li>" is a compact element:

        <ul>
          <li>Will Smith - Will Smith</li>
          <li>Carlton Banks - Alfonso Ribeiro</li>
          <li>
            Vivian Banks:
            <ul>
              <li>Janet Hubert-Whitten <i>(seasons 1-3)</i></li>
              <li>Daphne Maxwell Reid <i>(seasons 3-6)</i></li>
            </ul>
          </li>
        </ul>

       The third "<li>" element is indented like a block element because it
       contains a block "<ul>" element. The other "<li>" elements do not have
       their contents indented, because they contain only inline content.
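
       A configuration along the lines described above could be sketched as
       follows. The input string is a shortened, unindented version of the
       listing, and the two-space indent is an arbitrary choice. The result
       should resemble the listing, though the exact output is not
       reproduced here.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::PrettyPrint;

# One element per category, as imagined in the text above.
my $pp = XML::LibXML::PrettyPrint->new(
    indent_string => '  ',
    element       => {
        block   => [qw/ul/],
        compact => [qw/li/],
        inline  => [qw/i/],
    },
);

# Shortened version of the cast list, all on one line.
my $doc = XML::LibXML->load_xml(
    string => '<ul><li>Will Smith - Will Smith</li>'
            . '<li>Vivian Banks:<ul>'
            . '<li>Janet Hubert-Whitten <i>(seasons 1-3)</i></li>'
            . '</ul></li></ul>',
);

print $pp->pretty_print($doc)->toString;
```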

       Elements default to being block, but you can specify particular
       elements as inline or compact by passing node names or callbacks to the
       constructor. Elements default to not preserving whitespace unless they
       have an "xml:space="preserve"" attribute, but again you can use the
       constructor to change this.
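
       The effect of "xml:space" can be checked with a small experiment. In
       this sketch the element names "code" and "note" and their contents
       are invented, and a default printer is used, so only the attribute
       distinguishes the two elements.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::PrettyPrint;

# Two elements with identical text; only the first asks for preservation.
my $doc = XML::LibXML->load_xml(
    string => '<doc><code xml:space="preserve">  a   b  </code>'
            . '<note>  a   b  </note></doc>',
);
my ($code) = $doc->findnodes('//code');
my ($note) = $doc->findnodes('//note');

my $pp = XML::LibXML::PrettyPrint->new;

# Ask the printer how it regards each element.
for my $el ($code, $note) {
    printf "%s preserves whitespace: %s\n", $el->nodeName,
        $pp->element_preserves_whitespace($el) ? 'yes' : 'no';
}

$pp->pretty_print($doc);
print $doc->toString;
```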

       Comments and processing instructions default to being compact, but you
       can make particular comments or PIs inline or block by passing
       appropriate callbacks to the constructor. Whitespace within comments
       and PIs is always preserved. (There is rarely any reason to make
       comments and processing instructions block, but making them inline can
       occasionally be useful, as it will mean that the presence of whitespace
       just before or just after the comment is treated as significant.)

       Text nodes are always inline.

BUGS

       Please report any bugs to
       <http://rt.cpan.org/Dist/Display.html?Queue=XML-LibXML-PrettyPrint>.

SEE ALSO

       Related: XML::LibXML, HTML::HTML5::Writer.

       XML::Tidy - similar, but based on XML::XPath. Doesn't differentiate
       between inline and block elements.

       XML::Filter::Reindent - similar again, based on XML::Parser. Doesn't
       differentiate between inline and block elements.

       Sermon: <http://www.derkarl.org/why_to_tabs.html>. Read it.

AUTHOR

       Toby Inkster <tobyink@cpan.org>.

COPYRIGHT AND LICENCE

       This software is copyright (c) 2011-2014 by Toby Inkster.

       This is free software; you can redistribute it and/or modify it under
       the same terms as the Perl 5 programming language system itself.

DISCLAIMER OF WARRANTIES

       THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
       WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
       MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

perl v5.36.0                      2023-01-20       XML::LibXML::PrettyPrint(3)