HTML::HTML5::Writer(3)  User Contributed Perl Documentation  HTML::HTML5::Writer(3)

NAME

       HTML::HTML5::Writer - output a DOM as HTML5

SYNOPSIS

        use HTML::HTML5::Writer;

        # $dom is assumed to be an XML::LibXML::Document built elsewhere
        my $writer = HTML::HTML5::Writer->new;
        print $writer->document($dom);

DESCRIPTION

       This module outputs XML::LibXML::Node objects as HTML5 strings.  It
       works well on DOM trees that represent valid HTML/XHTML documents; less
       well on other DOM trees.

   Constructor
       "$writer = HTML::HTML5::Writer->new(%opts)"
           Create a new writer object. Options include:

           ·   markup

               Choose which serialisation of HTML5 to use: 'html' or 'xhtml'.

           ·   polyglot

               Set to true to attempt to produce output that works as both
               XML and HTML. Set to false to produce content that might not.

               If you don't explicitly set it, it defaults to false for HTML
               and true for XHTML.

           ·   doctype

               Set this to a string to choose which <!DOCTYPE> tag to output.
               Note that this only sets the <!DOCTYPE> tag and does not change
               how the rest of the document is output. This really is just a
               plain string literal...

                # Yes, this works...
                my $w = HTML::HTML5::Writer->new(doctype => '<!doctype html>');

               The following constants are provided for convenience:
               DOCTYPE_HTML2, DOCTYPE_HTML32, DOCTYPE_HTML4 (latest stable
               strict HTML 4.x), DOCTYPE_HTML4_RDFA (latest stable HTML
               4.x+RDFa), DOCTYPE_HTML40 (strict), DOCTYPE_HTML40_FRAMESET,
               DOCTYPE_HTML40_LOOSE, DOCTYPE_HTML40_STRICT, DOCTYPE_HTML401
               (strict), DOCTYPE_HTML401_FRAMESET, DOCTYPE_HTML401_LOOSE,
               DOCTYPE_HTML401_RDFA10, DOCTYPE_HTML401_RDFA11,
               DOCTYPE_HTML401_STRICT, DOCTYPE_HTML5, DOCTYPE_LEGACY
               (about:legacy-compat), DOCTYPE_NIL (empty string),
               DOCTYPE_XHTML1 (strict), DOCTYPE_XHTML1_FRAMESET,
               DOCTYPE_XHTML1_LOOSE, DOCTYPE_XHTML1_STRICT, DOCTYPE_XHTML11,
               DOCTYPE_XHTML_BASIC, DOCTYPE_XHTML_BASIC_10,
               DOCTYPE_XHTML_BASIC_11, DOCTYPE_XHTML_MATHML_SVG,
               DOCTYPE_XHTML_RDFA (latest stable strict XHTML+RDFa),
               DOCTYPE_XHTML_RDFA10, DOCTYPE_XHTML_RDFA11.

               Defaults to DOCTYPE_HTML5 for HTML and DOCTYPE_LEGACY for
               XHTML.

           ·   charset

               This module always returns strings in Perl's internal utf8
               encoding, but you can set the 'charset' option to 'ascii' to
               create output that is suitable for re-encoding to ASCII (e.g.
               it will entity-encode characters which do not exist in ASCII).

           ·   quote_attributes

               Set this to true to force attributes to be quoted. If not
               explicitly set, the writer will automatically detect when
               attributes need quoting.

           ·   voids

               Set this to true to force void elements to always be terminated
               with '/>'.  If not explicitly set, they'll only be terminated
               that way in polyglot or XHTML documents.

           ·   start_tags and end_tags

               Except in polyglot and XHTML documents, some elements allow
               their start and/or end tags to be omitted in certain
               circumstances. By setting these to true, you can prevent them
               from being omitted.

           ·   refs

               Special characters that can't be encoded as named entities need
               to be encoded as numeric character references instead. These
               can be expressed in decimal or hexadecimal. Setting this option
               to 'dec' or 'hex' allows you to choose. The default is 'hex'.

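       The constructor options above can be combined freely. The following
       is a minimal sketch, not taken from the module's own documentation;
       the sample document and the variable names are illustrative
       assumptions, and the exact markup produced may vary between versions.

```perl
use strict;
use warnings;
use XML::LibXML;
use HTML::HTML5::Writer;

# A tiny XHTML document to serialise (illustrative input only).
my $dom = XML::LibXML->new->parse_string(<<'XHTML');
<html xmlns="http://www.w3.org/1999/xhtml">
 <head><title>Caf&#233;</title></head>
 <body><p class="intro">Hello</p><hr/></body>
</html>
XHTML

# HTML serialisation: ASCII-safe output, decimal references where a
# numeric reference is needed, attributes always quoted, end tags kept.
my $html_writer = HTML::HTML5::Writer->new(
    markup           => 'html',
    charset          => 'ascii',
    refs             => 'dec',
    quote_attributes => 1,
    end_tags         => 1,
);

# XHTML serialisation with every other option left at its default.
my $xhtml_writer = HTML::HTML5::Writer->new(markup => 'xhtml');

my $html  = $html_writer->document($dom);
my $xhtml = $xhtml_writer->document($dom);

print $html, "\n\n", $xhtml, "\n";
```

       Comparing the two printed strings is a quick way to see which
       options affect a given document.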
   Public Methods
       "$writer->document($node)"
           Outputs an XML::LibXML::Document as HTML. Here, as in the methods
           below, "outputs" means "returns as a string"; nothing is printed.

       "$writer->element($node)"
           Outputs an XML::LibXML::Element as HTML.

       "$writer->attribute($node)"
           Outputs an XML::LibXML::Attr as HTML.

       "$writer->text($node)"
           Outputs an XML::LibXML::Text as HTML.

       "$writer->cdata($node)"
           Outputs an XML::LibXML::CDATASection as HTML.

       "$writer->comment($node)"
           Outputs an XML::LibXML::Comment as HTML.

       "$writer->pi($node)"
           Outputs an XML::LibXML::PI as HTML.

       "$writer->doctype"
           Outputs the writer's DOCTYPE.

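       The node-level methods can be called directly when only a fragment
       is needed. This is a hedged sketch: the sample markup and variable
       names are assumptions made for illustration, and the node lookups
       use ordinary XML::LibXML calls rather than anything provided by this
       module.

```perl
use strict;
use warnings;
use XML::LibXML;
use HTML::HTML5::Writer;

my $dom = XML::LibXML->new->parse_string(
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>T</title></head>'
  . '<body><p class="intro">Hello</p></body></html>'
);
my $writer = HTML::HTML5::Writer->new(markup => 'html');

# Locate individual nodes with plain XML::LibXML DOM calls.
my ($p)  = $dom->getElementsByTagName('p');
my $attr = $p->getAttributeNode('class');
my $text = $p->firstChild;

# Each method returns a string; nothing is printed until we do so.
my $doctype        = $writer->doctype;
my $element_html   = $writer->element($p);
my $attribute_html = $writer->attribute($attr);
my $text_html      = $writer->text($text);

print "$_\n" for $doctype, $element_html, $attribute_html, $text_html;
```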
       "$writer->encode_entities($string, characters=>$more)"
           Takes a string and returns the same string with some special
           characters replaced. These special characters do not include any of
           '&', '<', '>' or '"', but you can provide a string of additional
           characters to treat as special:

            $encoded = $writer->encode_entities($raw, characters=>'&<>"');

       "$writer->encode_entity($char)"
           Returns $char entity-encoded. Encoding is done regardless of
           whether $char is "special" or not.

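       A short sketch of the two encoding methods follows. The input string
       is an arbitrary example, and the exact entity names or numeric
       references chosen by the writer are not asserted here.

```perl
use strict;
use warnings;
use HTML::HTML5::Writer;

my $writer = HTML::HTML5::Writer->new(markup => 'html', charset => 'ascii');

my $raw = 'AT&T says 1 < 2 "sometimes"';

# Ask for the markup-significant characters to be treated as special too.
my $encoded = $writer->encode_entities($raw, characters => '&<>"');

# encode_entity always encodes, whatever the character is.
my $single = $writer->encode_entity('<');

print "$encoded\n$single\n";
```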
       "$writer->is_xhtml"
           Boolean indicating if $writer is configured to output XHTML.

       "$writer->is_polyglot"
           Boolean indicating if $writer is configured to output polyglot
           HTML.

       "$writer->should_force_start_tags"
       "$writer->should_force_end_tags"
           Booleans indicating whether optional start and end tags should be
           forced.

       "$writer->should_quote_attributes"
           Boolean indicating whether attributes need to be quoted.

       "$writer->should_slash_voids"
           Boolean indicating whether void elements should be closed in the
           XHTML style.

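       These predicates make it easy to check what a writer will do before
       serialising anything. The sketch below simply prints each flag for
       two writers; the hash and its keys are illustrative names, and the
       values are normalised to 0 or 1 for display.

```perl
use strict;
use warnings;
use HTML::HTML5::Writer;

my %writers = (
    html  => HTML::HTML5::Writer->new(markup => 'html'),
    xhtml => HTML::HTML5::Writer->new(markup => 'xhtml'),
);

for my $name (sort keys %writers) {
    my $w = $writers{$name};
    printf "%-5s xhtml=%d polyglot=%d slash_voids=%d quote=%d start=%d end=%d\n",
        $name,
        ($w->is_xhtml                ? 1 : 0),
        ($w->is_polyglot             ? 1 : 0),
        ($w->should_slash_voids      ? 1 : 0),
        ($w->should_quote_attributes ? 1 : 0),
        ($w->should_force_start_tags ? 1 : 0),
        ($w->should_force_end_tags   ? 1 : 0);
}
```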

BUGS AND LIMITATIONS

       Certain DOM constructs cannot be output in non-XML HTML. For example:

        use XML::LibXML;
        use HTML::HTML5::Writer;

        my $xhtml = <<XHTML;
        <html xmlns="http://www.w3.org/1999/xhtml">
         <head><title>Test</title></head>
         <body><hr>This text is within the HR element</hr></body>
        </html>
        XHTML
        my $dom    = XML::LibXML->new->parse_string($xhtml);
        my $writer = HTML::HTML5::Writer->new(markup=>'html');
        print $writer->document($dom);

       There's no way to serialise that properly in HTML. Right now this
       module just outputs that HR element with text contained within it,
       a la XHTML. In future versions, it may emit a warning or throw an
       error.

       In these cases, the HTML::HTML5::{Parser,Writer} combination is not
       round-trippable.

       Outputting elements and attributes in foreign (non-XHTML) namespaces is
       implemented pretty naively and not thoroughly tested. I'd be interested
       in any feedback people have, especially on round-trippability of SVG,
       MathML and RDFa content in HTML.

       Please report any bugs to <http://rt.cpan.org/>.

SEE ALSO

       HTML::HTML5::Parser, HTML::HTML5::Builder, HTML::HTML5::ToText,
       XML::LibXML.

AUTHOR

       Toby Inkster <tobyink@cpan.org>.

COPYRIGHT AND LICENCE

       Copyright (C) 2010-2012 by Toby Inkster.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.32.0                      2020-07-28            HTML::HTML5::Writer(3)