HTML::Parse(3) User Contributed Perl Documentation HTML::Parse(3)

HTML::Parse - Deprecated, a wrapper around HTML::TreeBuilder

This document describes version 5.07 of HTML::Parse, released August
31, 2017 as part of HTML-Tree.

See the documentation for HTML::TreeBuilder
14
16 Disclaimer: This module is provided only for backwards compatibility
17 with earlier versions of this library. New code should not use this
18 module, and should really use the HTML::Parser and HTML::TreeBuilder
19 modules directly, instead.
20
21 The "HTML::Parse" module provides functions to parse HTML documents.
22 There are two functions exported by this module:

parse_html($html) or parse_html($html, $obj)
    This function is just a synonym for $obj->parse($html), where
    $obj is assumed to be an instance of a subclass of "HTML::Parser".
    Refer to HTML::Parser for more documentation.

    If $obj is not specified, it defaults to an internally created
    new "HTML::TreeBuilder" object configured with strict_comment()
    turned on. That class implements a parser that builds (and is) an
    HTML syntax tree with HTML::Element objects as nodes.

    The return value from parse_html() is $obj.

parse_htmlfile($file, [$obj])
    Same as parse_html(), but reads the HTML to parse from the named
    file.

    Returns "undef" if the file could not be opened, or $obj otherwise.
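
A minimal sketch of both functions, assuming HTML-Tree is installed. The
sample markup and the temporary file are invented for illustration. The
explicit eof() call is a precaution to signal end of input, since
parse_html() is described above only as a synonym for $obj->parse($html).

```perl
use strict;
use warnings;
use HTML::Parse;                 # exports parse_html and parse_htmlfile
use File::Temp qw(tempfile);

# Illustrative input; any HTML string would do.
my $html = '<html><head><title>Demo</title></head>'
         . '<body><p>Hello, <b>world</b>!</p></body></html>';

# With no $obj, a new HTML::TreeBuilder is created and returned.
my $tree = parse_html($html);
$tree->eof;                      # signal end of input to the parser

my $title      = $tree->find_by_tag_name('title');
my $title_text = defined $title ? $title->as_text : '';
print "Title: $title_text\n";

# parse_htmlfile() reads from a named file and returns undef
# if the file cannot be opened.
my ($fh, $path) = tempfile(SUFFIX => '.html', UNLINK => 1);
print {$fh} $html;
close $fh or die "close failed: $!";

my $file_tree = parse_htmlfile($path);
die "could not open $path" unless defined $file_tree;
print "Text: ", $file_tree->as_text, "\n";

my $missing = parse_htmlfile('/no/such/file.html');
print "Missing file gives undef\n" unless defined $missing;

# Free both trees once the needed data has been extracted.
$_->delete for $tree, $file_tree;
```

New code would normally call HTML::TreeBuilder directly, as the
disclaimer recommends; the sketch only shows what the legacy functions
return.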

When a "HTML::TreeBuilder" object is created, the following variables
control how parsing takes place:

$HTML::Parse::IMPLICIT_TAGS
    Setting this variable to true instructs the parser to try to
    deduce implicit elements and implicit end tags. If this variable
    is false, you get a parse tree that just reflects the text as it
    stands. This may be useful for quick-and-dirty parsing. Default is
    true.

    Implicit elements have the implicit() attribute set.

$HTML::Parse::IGNORE_UNKNOWN
    This variable controls whether unknown tags are ignored or
    represented as elements in the parse tree. Default is true
    (unknown tags are ignored).

$HTML::Parse::IGNORE_TEXT
    Do not represent the text content of elements. This saves space if
    all you want is to examine the structure of the document. Default
    is false.

$HTML::Parse::WARN
    Call warn() with an appropriate message for syntax errors. Default
    is false.
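
The variables above are package globals read when the internal
HTML::TreeBuilder object is created, so one cautious way to change them
is with local, which restores the defaults at the end of the block. The
following is a sketch under that assumption; the HTML fragment is made
up for illustration.

```perl
use strict;
use warnings;
use HTML::Parse;

my $html = '<p>Some <b>bold</b> text</p>';   # illustrative fragment

# Default settings: implicit elements such as <body> are deduced.
my $full = parse_html($html);
$full->eof;

# Localize the overrides so they apply only to trees created in this block.
my $bare;
{
    local $HTML::Parse::IMPLICIT_TAGS = 0;   # keep the tree as written
    local $HTML::Parse::IGNORE_TEXT   = 1;   # structure only, no text nodes
    $bare = parse_html($html);
    $bare->eof;
}

# Capture what we need before the trees are destroyed.
my ($full_text, $bare_text) = ($full->as_text, $bare->as_text);
print "default settings: '$full_text'\n";
print "IGNORE_TEXT set:  '$bare_text'\n";

$_->delete for $full, $bare;
```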

HTML::TreeBuilder objects should be explicitly destroyed when you're
finished with them. See HTML::TreeBuilder.
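
As a small illustration of that advice, the sketch below copies the data
it needs out of the tree and then calls delete(), the HTML::Element
method for tearing a tree down. The markup is invented, and the eof()
call is a precaution rather than something this page requires.

```perl
use strict;
use warnings;
use HTML::Parse;

# Illustrative markup with two links.
my $tree = parse_html('<p><a href="a.html">A</a> <a href="b.html">B</a></p>');
$tree->eof;

# Copy plain values out of the tree first ...
my @hrefs = grep { defined }
            map  { $_->attr('href') }
            $tree->look_down(_tag => 'a');

# ... then destroy the tree; @hrefs holds ordinary strings and survives.
$tree->delete;

print "$_\n" for @hrefs;
```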

HTML::Parser, HTML::TreeBuilder, HTML::Element

Current maintainers:

· Christopher J. Madsen "<perl AT cjmweb.net>"

· Jeff Fearn "<jfearn AT cpan.org>"

Original HTML-Tree author:

· Gisle Aas

Former maintainers:

· Sean M. Burke

· Andy Lester

· Pete Krawczyk "<petek AT cpan.org>"

You can follow or contribute to HTML-Tree's development at
<https://github.com/kentfredric/HTML-Tree>.

Copyright 1995-1998 Gisle Aas, 1999-2004 Sean M. Burke, 2005 Andy
Lester, 2006 Pete Krawczyk, 2010 Jeff Fearn, 2012 Christopher J.
Madsen.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

The programs in this library are distributed in the hope that they will
be useful, but without any warranty; without even the implied warranty
of merchantability or fitness for a particular purpose.

perl v5.28.0 2018-07-14 HTML::Parse(3)