HTML::HTML5::Writer(3)    User Contributed Perl Documentation    HTML::HTML5::Writer(3)

NAME
       HTML::HTML5::Writer - output a DOM as HTML5

SYNOPSIS
        use HTML::HTML5::Writer;

        my $writer = HTML::HTML5::Writer->new;
        print $writer->document($dom);

DESCRIPTION
       This module outputs XML::LibXML::Node objects as HTML5 strings. It
       works well on DOM trees that represent valid HTML/XHTML documents; less
       well on other DOM trees.

   Constructor
       "$writer = HTML::HTML5::Writer->new(%opts)"
           Create a new writer object. Options include:

           ·   markup

               Choose which serialisation of HTML5 to use: 'html' or 'xhtml'.

           ·   polyglot

               Set to true in order to attempt to produce output which works
               as both XML and HTML. Set to false to produce content that
               might not.

               If you don't explicitly set it, then it defaults to false for
               HTML, and true for XHTML.

           ·   doctype

               Set this to a string to choose which <!DOCTYPE> declaration to
               output. This only sets the <!DOCTYPE>; it does not change how
               the rest of the document is output. The value really is just
               a plain string literal:

                 # Yes, this works...
                 my $w = HTML::HTML5::Writer->new(doctype => '<!doctype html>');

               The following constants are provided for convenience:
               DOCTYPE_HTML2, DOCTYPE_HTML32, DOCTYPE_HTML4 (latest stable
               strict HTML 4.x), DOCTYPE_HTML4_RDFA (latest stable HTML
               4.x+RDFa), DOCTYPE_HTML40 (strict), DOCTYPE_HTML40_FRAMESET,
               DOCTYPE_HTML40_LOOSE, DOCTYPE_HTML40_STRICT, DOCTYPE_HTML401
               (strict), DOCTYPE_HTML401_FRAMESET, DOCTYPE_HTML401_LOOSE,
               DOCTYPE_HTML401_RDFA10, DOCTYPE_HTML401_RDFA11,
               DOCTYPE_HTML401_STRICT, DOCTYPE_HTML5, DOCTYPE_LEGACY
               (about:legacy-compat), DOCTYPE_NIL (empty string),
               DOCTYPE_XHTML1 (strict), DOCTYPE_XHTML1_FRAMESET,
               DOCTYPE_XHTML1_LOOSE, DOCTYPE_XHTML1_STRICT, DOCTYPE_XHTML11,
               DOCTYPE_XHTML_BASIC, DOCTYPE_XHTML_BASIC_10,
               DOCTYPE_XHTML_BASIC_11, DOCTYPE_XHTML_MATHML_SVG,
               DOCTYPE_XHTML_RDFA (latest stable strict XHTML+RDFa),
               DOCTYPE_XHTML_RDFA10, DOCTYPE_XHTML_RDFA11.

               Defaults to DOCTYPE_HTML5 for HTML and DOCTYPE_LEGACY for
               XHTML.

           ·   charset

               This module always returns strings in Perl's internal utf8
               encoding, but you can set the 'charset' option to 'ascii' to
               create output that would be suitable for re-encoding to ASCII
               (e.g. it will entity-encode characters which do not exist in
               ASCII).

           ·   quote_attributes

               Set this to true to force attributes to be quoted. If not
               explicitly set, the writer will automatically detect when
               attributes need quoting.

           ·   voids

               Set this to true to force void elements to always be terminated
               with '/>'. If not explicitly set, they'll only be terminated
               that way in polyglot or XHTML documents.

           ·   start_tags and end_tags

               Except in polyglot and XHTML documents, some elements allow
               their start and/or end tags to be omitted in certain
               circumstances. By setting these to true, you can prevent them
               from being omitted.

           ·   refs

               Special characters that can't be encoded as named entities need
               to be encoded as numeric character references instead. These
               can be expressed in decimal or hexadecimal. Setting this option
               to 'dec' or 'hex' allows you to choose. The default is 'hex'.
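
The options above can be combined freely in one constructor call. The following is a minimal sketch, assuming HTML::HTML5::Writer is installed; the variable names are illustrative, and the option names and values are the ones documented in the list above. No serialised output is shown because the exact markup depends on the input DOM.

```perl
use strict;
use warnings;
use HTML::HTML5::Writer;

# HTML serialisation with every attribute quoted and decimal
# numeric character references.
my $html_writer = HTML::HTML5::Writer->new(
    markup           => 'html',
    quote_attributes => 1,
    refs             => 'dec',
);

# XHTML serialisation; 'polyglot' is left unset, so per the
# documentation it defaults to true.
my $xhtml_writer = HTML::HTML5::Writer->new(markup => 'xhtml');

# ASCII-safe output with a literal DOCTYPE string.
my $ascii_writer = HTML::HTML5::Writer->new(
    charset => 'ascii',
    doctype => '<!DOCTYPE html>',
);

# The doctype method returns the writer's DOCTYPE string.
print $ascii_writer->doctype, "\n";
```

Leaving an option unset is meaningful here: 'polyglot', 'doctype', 'quote_attributes' and 'voids' all have defaults that depend on the chosen markup or on the content being written.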

   Public Methods
       "$writer->document($node)"
           Returns a string serialising an XML::LibXML::Document as HTML.

       "$writer->element($node)"
           Outputs an XML::LibXML::Element as HTML.

       "$writer->attribute($node)"
           Outputs an XML::LibXML::Attr as HTML.

       "$writer->text($node)"
           Outputs an XML::LibXML::Text as HTML.

       "$writer->cdata($node)"
           Outputs an XML::LibXML::CDATASection as HTML.

       "$writer->comment($node)"
           Outputs an XML::LibXML::Comment as HTML.

       "$writer->pi($node)"
           Outputs an XML::LibXML::PI as HTML.

       "$writer->doctype"
           Outputs the writer's DOCTYPE.
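
The node-level methods make it possible to serialise a fragment rather than a whole document. The sketch below assumes XML::LibXML and HTML::HTML5::Writer are installed; the sample markup and variable names are invented for illustration, and the printed strings are not reproduced here because tag omission and quoting depend on the writer's settings.

```perl
use strict;
use warnings;
use XML::LibXML;
use HTML::HTML5::Writer;

my $dom = XML::LibXML->new->parse_string(<<'XHTML');
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Fragment demo</title></head>
<body><p class="intro">Hello, <em>world</em>!</p><!-- a comment --></body>
</html>
XHTML

my $writer = HTML::HTML5::Writer->new(markup => 'html');

# Pick out one element and one comment node from the tree.
my ($p)       = $dom->getElementsByTagName('p');
my ($comment) = grep { $_->isa('XML::LibXML::Comment') }
                $p->parentNode->childNodes;

# Each method returns a string for just the node it is given.
my $p_html       = $writer->element($p);
my $comment_html = $writer->comment($comment);

print "$p_html\n";
print "$comment_html\n";
print $writer->doctype, "\n";
```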

       "$writer->encode_entities($string, characters=>$more)"
           Takes a string and returns the same string with some special
           characters replaced. These special characters do not include any of
           '&', '<', '>' or '"', but you can provide a string of additional
           characters to treat as special:

             $encoded = $writer->encode_entities($raw, characters=>'&<>"');
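
To see the difference the 'characters' argument makes, the same string can be encoded twice. This is a small sketch assuming HTML::HTML5::Writer is installed; the sample string and variable names are illustrative, and the exact entities chosen by the module are not shown.

```perl
use strict;
use warnings;
use HTML::HTML5::Writer;

my $writer = HTML::HTML5::Writer->new;
my $raw    = q{Fish & <chips> say "hi"};

# Default behaviour: '&', '<', '>' and '"' are not treated as special.
my $loose = $writer->encode_entities($raw);

# Ask for the markup-significant characters to be encoded as well.
my $strict = $writer->encode_entities($raw, characters => '&<>"');

print "default:     $loose\n";
print "markup-safe: $strict\n";
```

The second form is the one to reach for when the string is going to be placed inside element content or a quoted attribute value.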

       "$writer->encode_entity($char)"
           Returns $char entity-encoded. Encoding is done regardless of
           whether $char is "special" or not.
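
For a character that has no named entity, entity-encoding means a numeric character reference, in decimal or hexadecimal as selected by the 'refs' option. The following plain-Perl sketch illustrates the two forms and the idea behind ASCII-safe output. The helpers numeric_ref and ascii_safe are hypothetical and are not part of HTML::HTML5::Writer; the module prefers named entities where they exist, so its actual output may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: build a numeric character reference for one
# character, in decimal ('dec') or hexadecimal (anything else).
sub numeric_ref {
    my ($char, $style) = @_;
    return $style eq 'dec'
        ? sprintf('&#%d;',  ord($char))
        : sprintf('&#x%X;', ord($char));
}

# Hypothetical helper: replace every non-ASCII character in a string
# with a numeric character reference.
sub ascii_safe {
    my ($string, $style) = @_;
    $string =~ s/([^\x00-\x7F])/numeric_ref($1, $style)/ge;
    return $string;
}

my $snowman = "\x{2603}";
print numeric_ref($snowman, 'dec'), "\n";      # &#9731;
print numeric_ref($snowman, 'hex'), "\n";      # &#x2603;
print ascii_safe("caf\x{E9}", 'hex'), "\n";    # caf&#xE9;
```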

       "$writer->is_xhtml"
           Boolean indicating if $writer is configured to output XHTML.

       "$writer->is_polyglot"
           Boolean indicating if $writer is configured to output polyglot
           HTML.

       "$writer->should_force_start_tags"
       "$writer->should_force_end_tags"
           Booleans indicating whether optional start and end tags should be
           forced.

       "$writer->should_quote_attributes"
           Boolean indicating whether attributes need to be quoted.

       "$writer->should_slash_voids"
           Boolean indicating whether void elements should be closed in the
           XHTML style.
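
These accessors are handy for checking what a writer will do before using it. A brief sketch, assuming HTML::HTML5::Writer is installed; the report format and the yes_no helper are illustrative only.

```perl
use strict;
use warnings;
use HTML::HTML5::Writer;

my %writers = (
    html  => HTML::HTML5::Writer->new(markup => 'html'),
    xhtml => HTML::HTML5::Writer->new(markup => 'xhtml'),
);

# Illustrative helper: turn any Perl truth value into a label.
sub yes_no { $_[0] ? 'yes' : 'no' }

for my $name (sort keys %writers) {
    my $w = $writers{$name};
    printf "%-5s  xhtml=%-3s  polyglot=%-3s  slash_voids=%-3s  force_end_tags=%-3s\n",
        $name,
        yes_no($w->is_xhtml),
        yes_no($w->is_polyglot),
        yes_no($w->should_slash_voids),
        yes_no($w->should_force_end_tags);
}
```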

BUGS AND LIMITATIONS
       Certain DOM constructs cannot be output in non-XML HTML, e.g.

         use XML::LibXML;
         use HTML::HTML5::Writer;

         my $xhtml = <<XHTML;
         <html xmlns="http://www.w3.org/1999/xhtml">
         <head><title>Test</title></head>
         <body><hr>This text is within the HR element</hr></body>
         </html>
         XHTML
         my $dom    = XML::LibXML->new->parse_string($xhtml);
         my $writer = HTML::HTML5::Writer->new(markup=>'html');
         print $writer->document($dom);

       There is no way to serialise that properly in HTML. Right now this
       module just outputs that HR element with text contained within it,
       a la XHTML. In future versions, it may emit a warning or throw an
       error.

       In these cases, the HTML::HTML5::{Parser,Writer} combination is not
       round-trippable.
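
Since the writer does not currently warn about such constructs, a caller who cares about round-tripping can check the DOM first. The sketch below assumes XML::LibXML is installed. The function void_elements_with_children is hypothetical, and the void element list is taken from the HTML specification rather than from this module, whose own list is not documented here.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Void elements as listed in the HTML specification (an assumption;
# HTML::HTML5::Writer may use a slightly different list internally).
my @VOID = qw(area base br col embed hr img input link meta source track wbr);

# Hypothetical helper: return XHTML void elements that have child nodes.
sub void_elements_with_children {
    my ($dom) = @_;
    my $xpc = XML::LibXML::XPathContext->new($dom);
    $xpc->registerNs(x => 'http://www.w3.org/1999/xhtml');
    my $is_void = join ' or ', map { "self::x:$_" } @VOID;
    return $xpc->findnodes("//*[($is_void) and node()]");
}

my $dom = XML::LibXML->new->parse_string(<<'XHTML');
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Test</title></head>
<body><hr>This text is within the HR element</hr></body>
</html>
XHTML

my @problems = void_elements_with_children($dom);
warn sprintf("<%s> is a void element but has content\n", $_->nodeName)
    for @problems;
```

A caller could then decide whether to strip the offending children, switch to markup => 'xhtml', or abort.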

       Outputting elements and attributes in foreign (non-XHTML) namespaces is
       implemented pretty naively and not thoroughly tested. I'd be interested
       in any feedback people have, especially on round-trippability of SVG,
       MathML and RDFa content in HTML.

       Please report any bugs to <http://rt.cpan.org/>.

SEE ALSO
       HTML::HTML5::Parser, HTML::HTML5::Builder, HTML::HTML5::ToText,
       XML::LibXML.

AUTHOR
       Toby Inkster <tobyink@cpan.org>.

COPYRIGHT AND LICENCE
       Copyright (C) 2010-2012 by Toby Inkster.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.32.0                      2020-07-28           HTML::HTML5::Writer(3)