HTML::HTML5::Sanity(3)  User Contributed Perl Documentation  HTML::HTML5::Sanity(3)

NAME

       HTML::HTML5::Sanity - make HTML5 DOM trees less insane

SYNOPSIS

         use HTML::HTML5::Parser;
         use HTML::HTML5::Sanity;

         # Parse a document into an HTML5-style DOM, then build a fixed copy.
         my $parser    = HTML::HTML5::Parser->new;
         my $html5_dom = $parser->parse_file('http://example.com/');
         my $sane_dom  = fix_document($html5_dom);

DESCRIPTION

       The Document Object Model (DOM) generated by HTML::HTML5::Parser meets
       the requirements of the HTML5 spec, but will probably catch a lot of
       people by surprise.

       The main oddity is that elements and attributes which appear to be
       namespaced are not really. For example, the following element:

         <div xml:lang="fr">...</div>

       looks like it should be parsed so that it has an attribute "lang" in
       the XML namespace. Not so. It is really parsed as having the
       attribute "xml:lang" in the null namespace.

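       The difference can be observed by listing the attributes of such an
       element before and after calling fix_document. The following is a
       minimal sketch, not part of this module: describe_div_attributes is a
       hypothetical helper, the markup is invented for illustration, and the
       DOM is assumed to be made of XML::LibXML nodes supporting
       getElementsByLocalName, attributes, nodeName, localname and
       namespaceURI.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;
use HTML::HTML5::Sanity;

# Hypothetical helper: describe every attribute of the first <div> in a
# document by node name, local name and namespace URI.
sub describe_div_attributes {
    my ($dom) = @_;
    my ($div) = $dom->getElementsByLocalName('div');
    my @descriptions;
    for my $node ($div->attributes) {
        # Skip namespace declarations; only real attributes are of interest.
        next unless $node->isa('XML::LibXML::Attr');
        my $ns = $node->namespaceURI;
        push @descriptions, sprintf(
            '%s (local name "%s", namespace %s)',
            $node->nodeName, $node->localname, defined $ns ? $ns : 'none',
        );
    }
    return @descriptions;
}

my $parser    = HTML::HTML5::Parser->new;
my $html5_dom = $parser->parse_string(
    '<!DOCTYPE html><title>t</title><div xml:lang="fr">Bonjour</div>'
);
my $sane_dom  = fix_document($html5_dom);

print "before: $_\n" for describe_div_attributes($html5_dom);
print "after:  $_\n" for describe_div_attributes($sane_dom);
```

       Comparing the two listings shows how the attribute is reported in
       each tree. Because fix_document works on a copy, both trees remain
       available for inspection.
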
       "fix_document($document)"
             $sane_dom = fix_document($html5_dom);

           Returns a modified copy of the DOM, leaving the original DOM
           unmodified.

       "fix_element($element_node, $new_document_node, \%namespaces)"
           Don't use this. Not exported.

       "fix_attribute($attribute_node, $new_element_node, \%namespaces)"
           Don't use this. Not exported.

       $HTML::HTML5::Sanity::FIX_LANG_ATTRIBUTES
             $HTML::HTML5::Sanity::FIX_LANG_ATTRIBUTES = 2;
             $sane_dom = fix_document($html5_dom);

           If set to 1 (the default), the package will detect invalid values
           in @lang and @xml:lang, and remove the attribute if it is invalid.
           If set to 2, it will also attempt to canonicalise the value (e.g.
           'EN_GB' will be converted to 'en-GB'). If set to 0, then the
           value of language attributes is not checked.

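       To give a feel for what canonicalising a language tag involves, the
       following is a minimal sketch in plain Perl. It is an illustration
       only and is not the code this module uses: canonicalise_lang_sketch
       is a hypothetical helper that handles only simple tags of the form
       language, optional script, optional region, and the module's real
       rules may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::HTML5::Sanity: normalise the
# separators and letter case of a simple "language[-Script][-REGION]" tag.
# Returns undef (in scalar context) for input that does not look like a tag.
sub canonicalise_lang_sketch {
    my ($tag) = @_;
    my @subtags = split /[-_]/, $tag;
    return unless @subtags && $subtags[0] =~ /\A[A-Za-z]{2,8}\z/;

    my @out = (lc shift @subtags);    # language subtag is lower case
    for my $subtag (@subtags) {
        return unless $subtag =~ /\A[A-Za-z0-9]{1,8}\z/;
        if ($subtag =~ /\A[A-Za-z]{4}\z/) {
            push @out, ucfirst lc $subtag;    # script, e.g. "Hant"
        }
        elsif ($subtag =~ /\A[A-Za-z]{2}\z/) {
            push @out, uc $subtag;            # region, e.g. "GB"
        }
        else {
            push @out, lc $subtag;            # anything else
        }
    }
    return join '-', @out;
}

print canonicalise_lang_sketch('EN_GB'), "\n";    # prints "en-GB"
```

       The sketch accepts the underscore as a separator only because the
       example above does; it makes no attempt to check subtags against a
       registry of known languages or regions.
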

BUGS

       Please report any bugs to <http://rt.cpan.org/>.

SEE ALSO

       HTML::HTML5::Parser, XML::LibXML, Task::HTML5.

AUTHOR

       Toby Inkster <tobyink@cpan.org>.

       Copyright (C) 2009-2014 by Toby Inkster

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

DISCLAIMER OF WARRANTIES

       THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
       WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
       MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

perl v5.36.0                      2022-07-22            HTML::HTML5::Sanity(3)