XML::Grove(3) User Contributed Perl Documentation XML::Grove(3)

NAME
XML::Grove - Perl-style XML objects

SYNOPSIS
    use XML::Grove;

    # Basic parsing and grove building
    use XML::Grove::Builder;
    use XML::Parser::PerlSAX;
    $grove_builder = XML::Grove::Builder->new;
    $parser = XML::Parser::PerlSAX->new ( Handler => $grove_builder );
    $document = $parser->parse ( Source => { SystemId => 'filename' } );

    # Creating new objects
    $document = XML::Grove::Document->new ( Contents => [ ] );
    $element = XML::Grove::Element->new ( Name => 'tag',
                                          Attributes => { },
                                          Contents => [ ] );

    # Accessing XML objects
    $tag_name = $element->{Name};
    $contents = $element->{Contents};
    $parent = $element->{Parent};
    $characters->{Data} = 'XML is fun!';

DESCRIPTION
XML::Grove is a tree-based object model for accessing the information
set of parsed or stored XML, HTML, or SGML instances. XML::Grove
objects are Perl hashes and arrays where you access the properties of
the objects using normal Perl syntax:

    $text = $characters->{Data};

How To Create a Grove

There are several ways for groves to come into being: they can be read
from a file or string using a parser and a grove builder, they can be
created by your Perl code using the "new()" methods of XML::Grove
objects, or databases or other sources can act as groves.

The most common way to build groves is using a parser and a grove
builder. The parser is the package that reads the characters of an XML
file, recognizes the XML syntax, and produces "events" reporting when
elements (tags), text (characters), processing instructions, and other
sequences occur. A grove builder receives ("consumes" or "handles")
these events and builds XML::Grove objects. The last thing the parser
does is return the XML::Grove::Document object that the grove builder
created, with all of its elements and character data.

The most common parser and grove builder are XML::Parser::PerlSAX (in
libxml-perl) and XML::Grove::Builder. To build a grove, create the
grove builder first:

    $grove_builder = XML::Grove::Builder->new;

Then create the parser, passing it the grove builder as its handler:

    $parser = XML::Parser::PerlSAX->new ( Handler => $grove_builder );

This associates the grove builder with the parser so that every time
you parse a document with this parser it will return an
XML::Grove::Document object. To parse a file, pass the "parse()" method
a "Source" parameter containing a "SystemId" parameter (the URL or path
of the file you want to parse):

    $document = $parser->parse ( Source => { SystemId => 'kjv.xml' } );

To parse a string held in a Perl variable, pass a "Source" parameter
containing a "String" parameter:

    $document = $parser->parse ( Source => { String => $xml_text } );

The following are all parsers that work with XML::Grove::Builder:

    XML::Parser::PerlSAX (in libxml-perl, uses XML::Parser)
    XML::ESISParser      (in libxml-perl, uses James Clark's `nsgmls')
    XML::SAX2Perl        (in libxml-perl, translates SAX 1.0 to PerlSAX)

Most parsers supply more properties than the standard information set
below, and XML::Grove makes available all the properties given by the
parser. Refer to the parser documentation to find out what additional
properties it may provide.

Although none were available as of August 1999, PerlSAX filters can be
used to process the output of a parser before it is passed to
XML::Grove::Builder. XML::Grove::PerlSAX can be used to provide input
to PerlSAX filters or other PerlSAX handlers.

Using Groves

The properties provided by parsers are available directly using Perl's
normal syntax for accessing hashes and arrays. For example, to get the
name of an element:

    $element_name = $element->{Name};

By convention, all properties provided by parsers are in mixed case.
"Parent" properties are available using the "Data::Grove::Parent"
module.

The following is the minimal set of objects and their properties that
you are likely to get from all parsers:

XML::Grove::Document

The Document object is the parent of the root element of the parsed XML
document.

    Contents    An array containing the root element.

A document's "Contents" may also contain processing instructions,
comments, and whitespace.

Some parsers provide information about the document type, the XML
declaration, or notations and entities. Check the parser documentation
for property names.

XML::Grove::Element

The Element object represents elements from the XML source.

    Parent      The parent object of this element.

    Name        A string, the element type name of this element.

    Attributes  A hash of strings or arrays.

    Contents    An array of elements, characters, processing
                instructions, etc.

In a purely minimal grove, the attributes of an element will be plain
text (Perl scalars). Some parsers provide access to notations and
entities in attributes, in which case the attribute may contain an
array.

XML::Grove::Characters

The Characters object represents text from the XML source.

    Parent      The parent object of this characters object.

    Data        A string, the characters.

XML::Grove::PI

The PI object represents processing instructions from the XML source.

    Parent      The parent object of this PI object.

    Target      A string, the processing instruction target.

    Data        A string, the processing instruction data, or undef if
                none was supplied.

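Because the minimal objects above are ordinary hashes and arrays, a
grove can be walked with plain Perl. The following is a minimal sketch
under stated assumptions: the grove is built by hand from hashes
blessed into the XML::Grove class names (so it runs without XML::Grove
installed), and "collect_text" is a hypothetical helper, not part of
XML::Grove.

```perl
use strict;
use warnings;

# Hand-built stand-in for a parsed grove (assumption: a real grove from
# XML::Grove::Builder has the same Name/Contents/Data shape).
my $grove = bless {
    Contents => [
        bless({
            Name       => 'greeting',
            Attributes => { lang => 'en' },
            Contents   => [
                bless({ Data => 'XML is ' }, 'XML::Grove::Characters'),
                bless({
                    Name       => 'em',
                    Attributes => {},
                    Contents   => [
                        bless({ Data => 'fun' }, 'XML::Grove::Characters'),
                    ],
                }, 'XML::Grove::Element'),
                bless({ Target => 'pause', Data => undef }, 'XML::Grove::PI'),
                bless({ Data => '!' }, 'XML::Grove::Characters'),
            ],
        }, 'XML::Grove::Element'),
    ],
}, 'XML::Grove::Document';

# Hypothetical helper: concatenate the character data below a node.
# Only Characters objects contribute; PI data is skipped.
sub collect_text {
    my ($node) = @_;
    return $node->{Data} if ref($node) eq 'XML::Grove::Characters';
    return '' unless ref($node->{Contents}) eq 'ARRAY';
    return join '', map { collect_text($_) } @{ $node->{Contents} };
}

my $root = $grove->{Contents}[0];    # the root element
my $text = collect_text($grove);
print "$root->{Name}: $text\n";      # prints "greeting: XML is fun!"
```

With a real parser, "$grove" would instead be the document returned by
"parse()", and the same hash and array access would apply.
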
In addition to the minimal set of objects above, XML::Grove knows about
and parsers may provide the following objects. Refer to the parser
documentation for descriptions of the properties of these objects.

    XML::Grove::
      ::Entity::External  External entity reference
      ::Entity::SubDoc    External SubDoc reference (SGML)
      ::Entity::SGML      External SGML reference (SGML)
      ::Entity            Entity reference
      ::Notation          Notation declaration
      ::Comment           <!-- A Comment -->
      ::SubDoc            A parsed subdocument (SGML)
      ::CData             A CDATA marked section
      ::ElementDecl       An element declaration from the DTD
      ::AttListDecl       An element's attribute declaration, from the DTD

METHODS
XML::Grove by itself provides only one method, new(), for creating new
XML::Grove objects. There are Data::Grove and XML::Grove extension
modules that give additional methods for working with XML::Grove
objects, and new extensions can be created as needed.

$obj = XML::Grove::OBJECT->new( [PROPERTIES] )
    "new" creates a new XML::Grove object with the type OBJECT and
    with the initial PROPERTIES. PROPERTIES may be given as a list of
    key-value pairs, a hash, or an XML::Grove object to copy.
    OBJECT may be any of the objects listed above.

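The three ways of passing PROPERTIES might look like the following
sketch. It assumes XML::Grove is installed, that "a hash" means a hash
reference, and that copying an object is a shallow copy of its
top-level properties; none of these details are spelled out above, so
check them against your installed version.

```perl
use strict;
use warnings;
use XML::Grove;    # assumption: XML::Grove and Data::Grove are installed

# 1. PROPERTIES as a list of key-value pairs.
my $text = XML::Grove::Characters->new( Data => 'XML is fun!' );

my $para = XML::Grove::Element->new(
    Name       => 'para',
    Attributes => { class => 'intro' },
    Contents   => [ $text ],
);

# 2. PROPERTIES as a hash (assumed here to be a hash reference).
my %props    = ( Contents => [ $para ] );
my $document = XML::Grove::Document->new( \%props );

# 3. PROPERTIES as an existing object to copy. The copy is assumed to be
#    shallow, so only a top-level property is changed on it here.
my $copy = XML::Grove::Element->new( $para );
$copy->{Name} = 'note';

print "$para->{Name} / $copy->{Name}\n";
print $document->{Contents}[0]{Contents}[0]{Data}, "\n";
```

If the copy is shallow as assumed, "$copy" and "$para" share the same
Contents array, so modifying that array through one is visible through
the other.
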
This is a list of available extensions and the methods they provide (as
of Feb 1999). Refer to their module documentation for more information
on how to use them.

XML::Grove::AsString
    as_string             return portions of groves as a string
    attr_as_string        return an element's attribute as a string

XML::Grove::AsCanonXML
    as_canon_xml          return XML text in canonical XML format

XML::Grove::PerlSAX
    parse                 emulate a PerlSAX parser using the grove objects

Data::Grove::Parent
    root                  return the root element of a grove
    rootpath              return an array of all objects between the root
                          element and this object, inclusive

    Data::Grove::Parent also adds "Parent" and "Raw" properties
    to grove objects.

Data::Grove::Visitor
    accept                call back a subroutine using an object type name
    accept_name           call back using an element or tag name
    children_accept       for each child in Contents, call back a sub
    children_accept_name  same, but using tag names
    attr_accept           call back for the objects in attributes

XML::Grove::IDs
    get_ids               return a list of all ID attributes in grove

XML::Grove::Path
    at_path               $el->at_path('/html/body/ul/li[4]')

XML::Grove::Sub
    filter                run a sub against all the objects in the grove

WRITING EXTENSIONS
The class "XML::Grove" is the superclass of all classes in the
XML::Grove module. "XML::Grove" is a subclass of "Data::Grove".

If you create an extension and you want to add a method to all
XML::Grove objects, then create that method in the XML::Grove package.
Many extensions only need to add methods to XML::Grove::Document and/or
XML::Grove::Element.

When you create an extension, you should definitely also provide a way
to invoke your module using objects from your own package. For example,
XML::Grove::AsString's "as_string()" method can also be called using
an XML::Grove::AsString object:

    $writer = XML::Grove::AsString->new;
    $string = $writer->as_string ( $xml_object );

AUTHOR
Ken MacLeod, ken@bitsko.slc.ut.us

SEE ALSO
perl(1), XML::Grove(3)

Extensible Markup Language (XML) <http://www.w3c.org/XML>

perl v5.8.8 1999-08-25 XML::Grove(3)