HTML::TreeBuilder::XPath(3)  User Contributed Perl Documentation  HTML::TreeBuilder::XPath(3)

NAME

       HTML::TreeBuilder::XPath - add XPath support to HTML::TreeBuilder

SYNOPSIS

         use HTML::TreeBuilder::XPath;
         my $tree= HTML::TreeBuilder::XPath->new;
         $tree->parse_file( "mypage.html");
         my $nb=$tree->findvalue( '/html/body//p[@class="section_title"]/span[@class="nb"]');
         my $id=$tree->findvalue( '/html/body//p[@class="section_title"]/@id');

         my $p= $tree->findnodes( '//p[@id="toto"]')->[0];
         my $link_texts= $p->findvalue( './a'); # the texts of all a elements in $p
         $tree->delete; # to avoid memory leaks, if you parse many HTML documents

DESCRIPTION

       This module adds typical XPath methods to HTML::TreeBuilder, to make it
       easy to query a document.

METHODS

       Extra methods added both to the tree object and to each element:

   findnodes ($path)
       Returns a list of nodes found by $path.  In scalar context returns an
       "XML::XPathEngine::NodeSet" object.

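       The difference between the two calling contexts can be sketched as
       follows. This is a minimal illustration, not part of the module: the
       sample markup and variable names are made up, and the document is
       parsed from a string with HTML::TreeBuilder's parse_content rather
       than from a file.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample document, parsed from a string instead of a file.
my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content(
    '<html><body><p class="a">one</p><p class="a">two</p><p>three</p></body></html>'
);

# List context: an ordinary Perl list of element objects.
my @marked = $tree->findnodes('//p[@class="a"]');
my @texts  = map { $_->as_text } @marked;

# Scalar context: a node-set object, which can report its own size.
my $nodeset = $tree->findnodes('//p');
my $count   = $nodeset->size;

print "marked paragraphs: @texts\n";
print "paragraphs in total: $count\n";

$tree->delete;    # free the tree once the values have been extracted
```

       The text is copied out with as_text before the tree is deleted,
       because the elements are no longer usable afterwards.
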
   findnodes_as_string ($path)
       Returns the text values of the nodes, as one string.

   findnodes_as_strings ($path)
       Returns a list of the values of the result nodes.

   findvalue ($path)
       Returns either an "XML::XPathEngine::Literal", an
       "XML::XPathEngine::Boolean" or an "XML::XPathEngine::Number" object.
       If the path returns a NodeSet, $nodeset->xpath_to_literal is called
       automatically for you (and thus an "XML::XPathEngine::Literal" is
       returned). Note that stringification is overloaded for each of these
       objects, so you can just print the value found, or manipulate it in
       the ways you would a normal Perl value (e.g. using regular
       expressions).

   findvalues ($path)
       Returns the values of the matching nodes as a list. This is mostly the
       same as findnodes_as_strings, except that the elements of the list are
       objects (with overloaded stringification) instead of plain strings.

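       A rough side-by-side of the value-returning methods, assuming a small
       made-up list as input. The exact form of each returned string is not
       asserted here, so the sketch only prints what it gets; the markup and
       variable names are illustrative.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample document with two list items.
my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content(
    '<html><body><ul><li>alpha</li><li>beta</li></ul></body></html>'
);

my $joined  = $tree->findvalue('//li');              # one value for the whole node set
my @values  = $tree->findvalues('//li');             # one value per matching node
my @strings = $tree->findnodes_as_strings('//li');   # one string per matching node

# Results stringify, so they can be printed or matched directly.
print "findvalue:            $joined\n";
print "findvalues:           ", join(', ', @values),  "\n";
print "findnodes_as_strings: ", join(', ', @strings), "\n";
print "the list mentions beta\n" if $joined =~ /beta/;

$tree->delete;
```
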
   exists ($path)
       Returns true if the given path exists.

   matches ($path)
       Returns true if the element matches the path.

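       The two predicates might be used like this. The sketch assumes a
       made-up document and uses absolute paths for matches, so the result
       does not depend on how a relative path would be resolved against the
       element itself.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample document with one div containing one link.
my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content(
    '<html><body><div id="main"><a href="/x">x</a></div></body></html>'
);

# exists: is there at least one node for this path?
my $has_link  = $tree->exists('//a[@href]') ? 1 : 0;
my $has_table = $tree->exists('//table')    ? 1 : 0;

# matches: is this particular element among the nodes selected by the path?
my ($div) = $tree->findnodes('//div[@id="main"]');
my $div_is_main = $div->matches('//div[@id="main"]') ? 1 : 0;
my $div_is_para = $div->matches('//p')               ? 1 : 0;

print "has link: $has_link, has table: $has_table\n";
print "div is main: $div_is_main, div is a paragraph: $div_is_para\n";

$tree->delete;
```
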
   find ($path)
       The find method takes an XPath expression (a string) and returns
       either an XML::XPathEngine::NodeSet object containing the nodes it
       found (or empty if no nodes matched the path), or one of
       XML::XPathEngine::Literal (a string), XML::XPathEngine::Number, or
       XML::XPathEngine::Boolean. It should always return something - and you
       can use ->isa() to find out what it returned. If you need to check how
       many nodes it found, check $nodeset->size.  See
       XML::XPathEngine::NodeSet.

   as_XML_compact
       HTML::TreeBuilder's "as_XML" output is not really nice to look at, so I
       added a new method that can be used as a simple replacement for it.
       It escapes only the '<', '>' and '&' (plus '"' in attribute values),
       and wraps CDATA elements in CDATA sections.

       Note that the XML is actually not guaranteed to be valid at this point.
       Nothing is done about the encoding of the string. Patches or just ideas
       of how it could work are welcome.

   as_XML_indented
       Same as as_XML, except that the output is indented.

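       A small sketch of the two serialisation methods, assuming a made-up
       one-paragraph document. The exact layout of the output is not
       specified above, so the example only prints it.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample document containing a character that needs escaping.
my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content(
    '<html><body><p class="x">a &amp; b</p></body></html>'
);

my $compact  = $tree->as_XML_compact;     # escapes only <, > and & in text
my $indented = $tree->as_XML_indented;    # indented output

print "compact:\n$compact\n";
print "indented:\n$indented\n";

$tree->delete;
```
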

SEE ALSO

       HTML::TreeBuilder

       XML::XPathEngine

REPOSITORY

       <https://github.com/mirod/HTML--TreeBuilder--XPath>

AUTHOR

       Michel Rodriguez, <mirod@cpan.org>

COPYRIGHT AND LICENSE

       Copyright (C) 2006-2011 by Michel Rodriguez

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself, either Perl version 5.8.4 or, at
       your option, any later version of Perl 5 you may have available.

perl v5.28.0                      2011-09-20       HTML::TreeBuilder::XPath(3)