XML::Grove(3)         User Contributed Perl Documentation        XML::Grove(3)

NAME
XML::Grove - Perl-style XML objects

SYNOPSIS
    use XML::Grove;

    # Basic parsing and grove building
    use XML::Grove::Builder;
    use XML::Parser::PerlSAX;
    $grove_builder = XML::Grove::Builder->new;
    $parser = XML::Parser::PerlSAX->new ( Handler => $grove_builder );
    $document = $parser->parse ( Source => { SystemId => 'filename' } );

    # Creating new objects
    $document = XML::Grove::Document->new ( Contents => [ ] );
    $element = XML::Grove::Element->new ( Name => 'tag',
                                          Attributes => { },
                                          Contents => [ ] );

    # Accessing XML objects
    $tag_name = $element->{Name};
    $contents = $element->{Contents};
    $parent = $element->{Parent};
    $characters->{Data} = 'XML is fun!';

DESCRIPTION
XML::Grove is a tree-based object model for accessing the information
set of parsed or stored XML, HTML, or SGML instances. XML::Grove
objects are Perl hashes and arrays, so you access the properties of
the objects using normal Perl syntax:

    $text = $characters->{Data};

How To Create a Grove
There are several ways for groves to come into being: they can be read
from a file or string using a parser and a grove builder, they can be
created by your Perl code using the new() methods of the XML::Grove
object classes, or databases or other sources can act as groves.

The most common way to build groves is using a parser and a grove
builder. The parser is the package that reads the characters of an XML
file, recognizes the XML syntax, and produces "events" reporting when
elements (tags), text (characters), processing instructions, and other
sequences occur. A grove builder receives ("consumes" or "handles")
these events and builds XML::Grove objects. The last thing the parser
does is return the XML::Grove::Document object that the grove builder
created, with all of its elements and character data.

The most common parser and grove builder are XML::Parser::PerlSAX (in
libxml-perl) and XML::Grove::Builder. To build a grove, create the
grove builder first:

    $grove_builder = XML::Grove::Builder->new;

Then create the parser, passing it the grove builder as its handler:

    $parser = XML::Parser::PerlSAX->new ( Handler => $grove_builder );

This associates the grove builder with the parser so that every time
you parse a document with this parser it will return an
XML::Grove::Document object. To parse a file, pass the parse() method
a "Source" parameter containing a "SystemId" parameter (the URL or
path of the file you want to parse):

    $document = $parser->parse ( Source => { SystemId => 'kjv.xml' } );

To parse a string held in a Perl variable, pass a "Source" parameter
containing a "String" parameter:

    $document = $parser->parse ( Source => { String => $xml_text } );

The following are all parsers that work with XML::Grove::Builder:

    XML::Parser::PerlSAX (in libxml-perl, uses XML::Parser)
    XML::ESISParser      (in libxml-perl, uses James Clark's `nsgmls')
    XML::SAX2Perl        (in libxml-perl, translates SAX 1.0 to PerlSAX)

Most parsers supply more properties than the standard information set
below, and XML::Grove makes available all the properties given by the
parser. Refer to the parser documentation to find out what additional
properties it may provide.

Although none were available as of August 1999, PerlSAX filters can be
used to process the output of a parser before it is passed to
XML::Grove::Builder. XML::Grove::PerlSAX can be used to provide input
to PerlSAX filters or other PerlSAX handlers.

Using Groves
The properties provided by parsers are available directly using Perl's
normal syntax for accessing hashes and arrays. For example, to get the
name of an element:

    $element_name = $element->{Name};

By convention, all properties provided by parsers are in mixed case.
"Parent" properties are available using the Data::Grove::Parent
module.
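
Because grove objects are ordinary hashes and arrays, reading and
writing them needs nothing beyond core Perl. The following is a minimal
sketch, not taken from the module: it blesses a small structure by hand
into the XML::Grove class names (so it runs without XML::Grove
installed) and walks "Contents" to collect text. The helper names
chars() and text_of() are hypothetical.

```perl
use strict;
use warnings;

# Hand-built stand-in for the parsed fragment
#   <p class="intro">XML is <b>fun</b>!</p>
# A real grove would come from a parser and XML::Grove::Builder.
sub chars { return bless( { Data => $_[0] }, 'XML::Grove::Characters' ) }

my $bold = bless( {
    Name       => 'b',
    Attributes => {},
    Contents   => [ chars('fun') ],
}, 'XML::Grove::Element' );

my $para = bless( {
    Name       => 'p',
    Attributes => { class => 'intro' },
    Contents   => [ chars('XML is '), $bold, chars('!') ],
}, 'XML::Grove::Element' );

# Concatenate the character data found under a node, in document order.
sub text_of {
    my ($node) = @_;
    return $node->{Data} if $node->isa('XML::Grove::Characters');
    return join '', map { text_of($_) } @{ $node->{Contents} || [] };
}

my $name   = $para->{Name};                  # ordinary hash access
my $class  = $para->{Attributes}{class};
my $before = text_of($para);

$para->{Contents}[0]{Data} = 'Groves are ';  # properties are writable too
my $after = text_of($para);

print "$name ($class): $before -> $after\n";
```

The same accessors should apply unchanged to a document returned by a
parser, since the builder produces objects of these classes.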

The following is the minimal set of objects and their properties that
you are likely to get from all parsers:

XML::Grove::Document
    The Document object is the parent of the root element of the parsed
    XML document.

    Contents    An array containing the root element.

    A document's "Contents" may also contain processing instructions,
    comments, and whitespace.

    Some parsers provide information about the document type, the XML
    declaration, or notations and entities. Check the parser
    documentation for property names.

XML::Grove::Element
    The Element object represents elements from the XML source.

    Parent      The parent object of this element.

    Name        A string, the element type name of this element.

    Attributes  A hash of strings or arrays.

    Contents    An array of elements, characters, processing
                instructions, etc.

    In a purely minimal grove, the attributes of an element will be
    plain text (Perl scalars). Some parsers provide access to notations
    and entities in attributes, in which case the attribute may contain
    an array.

XML::Grove::Characters
    The Characters object represents text from the XML source.

    Parent      The parent object of this characters object.

    Data        A string, the characters.

XML::Grove::PI
    The PI object represents processing instructions from the XML
    source.

    Parent      The parent object of this PI object.

    Target      A string, the processing instruction target.

    Data        A string, the processing instruction data, or undef if
                none was supplied.
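
To show how the four minimal object types fit together, here is a
hedged sketch of a serializer that dispatches on ref() and uses only
the properties listed above. It is not the XML::Grove::AsString
implementation; escape() and to_xml() are hypothetical helpers, the
objects are blessed by hand so the code runs without XML::Grove
installed, and attribute values are assumed to be plain scalars as in
a purely minimal grove.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Turn a grove object back into XML text, one case per object type.
sub to_xml {
    my ($obj) = @_;
    my $type = ref $obj;

    if ( $type eq 'XML::Grove::Document' ) {
        return join '', map { to_xml($_) } @{ $obj->{Contents} || [] };
    }
    if ( $type eq 'XML::Grove::Element' ) {
        my $attrs = $obj->{Attributes} || {};
        my $start = '';
        for my $key ( sort keys %$attrs ) {    # sorted for stable output
            $start .= ' ' . $key . '="' . escape( $attrs->{$key} ) . '"';
        }
        my $body = join '', map { to_xml($_) } @{ $obj->{Contents} || [] };
        return '<' . $obj->{Name} . $start . '>' . $body
             . '</' . $obj->{Name} . '>';
    }
    if ( $type eq 'XML::Grove::Characters' ) {
        return escape( $obj->{Data} );
    }
    if ( $type eq 'XML::Grove::PI' ) {
        # Data is undef when the instruction carried no data.
        my $data = defined $obj->{Data} ? ' ' . $obj->{Data} : '';
        return '<?' . $obj->{Target} . $data . '?>';
    }
    return '';    # other object types are ignored in this sketch
}

my $doc = bless( {
    Contents => [
        bless( { Target => 'app', Data => 'mode=test' }, 'XML::Grove::PI' ),
        bless( {
            Name       => 'note',
            Attributes => { id => 'n1' },
            Contents   => [
                bless( { Data => 'R&D <draft>' }, 'XML::Grove::Characters' ),
            ],
        }, 'XML::Grove::Element' ),
    ],
}, 'XML::Grove::Document' );

my $xml = to_xml($doc);
print "$xml\n";
```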

In addition to the minimal set of objects above, XML::Grove knows about
the following objects, which parsers may provide. Refer to the parser
documentation for descriptions of the properties of these objects.

    XML::Grove::
        ::Entity::External  External entity reference
        ::Entity::SubDoc    External SubDoc reference (SGML)
        ::Entity::SGML      External SGML reference (SGML)
        ::Entity            Entity reference
        ::Notation          Notation declaration
        ::Comment           <!-- A Comment -->
        ::SubDoc            A parsed subdocument (SGML)
        ::CData             A CDATA marked section
        ::ElementDecl       An element declaration from the DTD
        ::AttListDecl       An element's attribute declaration, from the DTD

METHODS
XML::Grove by itself provides only one method, new(), for creating new
XML::Grove objects. There are Data::Grove and XML::Grove extension
modules that give additional methods for working with XML::Grove
objects, and new extensions can be created as needed.

$obj = XML::Grove::OBJECT->new( [PROPERTIES] )
    new() creates a new XML::Grove object of type OBJECT with the
    initial PROPERTIES. PROPERTIES may be given as a list of key-value
    pairs, a hash, or an XML::Grove object to copy. OBJECT may be any
    of the objects listed above.
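
The three argument forms accepted by new() can be illustrated with a
small stand-alone constructor. This is a sketch under assumptions, not
the module's source: My::Grove::Node is a hypothetical class, the
"hash" form is assumed to mean a hash reference, and the copy is
assumed to be shallow.

```perl
use strict;
use warnings;

package My::Grove::Node;    # hypothetical class, not part of XML::Grove

sub new {
    my $class = shift;
    # A single argument is taken to be a hash reference or an existing
    # object to copy; anything else is a list of key-value pairs.
    my %props = ( @_ == 1 ) ? %{ $_[0] } : @_;
    my $self  = {%props};
    return bless $self, $class;
}

package main;

my $from_list = My::Grove::Node->new( Name => 'tag', Contents => [] );
my $from_hash = My::Grove::Node->new( { Name => 'tag' } );
my $copy      = My::Grove::Node->new($from_list);

# The copy has its own top-level hash, so renaming it leaves the
# original alone, but nested values such as Contents are shared.
$copy->{Name} = 'other';
print "$from_list->{Name} $from_hash->{Name} $copy->{Name}\n";
```

With a shallow copy like this one, pushing onto the copy's "Contents"
would also change the original; whether the real new() behaves the same
way should be checked against the Data::Grove source.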

This is a list of available extensions and the methods they provide (as
of Feb 1999). Refer to their module documentation for more information
on how to use them.

XML::Grove::AsString
    as_string             return portions of groves as a string
    attr_as_string        return an element's attribute as a string

XML::Grove::AsCanonXML
    as_canon_xml          return XML text in canonical XML format

XML::Grove::PerlSAX
    parse                 emulate a PerlSAX parser using the grove objects

Data::Grove::Parent
    root                  return the root element of a grove
    rootpath              return an array of all objects between the root
                          element and this object, inclusive

    Data::Grove::Parent also adds "Parent" and "Raw" properties to
    grove objects.

Data::Grove::Visitor
    accept                call back a subroutine using an object type name
    accept_name           call back using an element or tag name
    children_accept       for each child in Contents, call back a sub
    children_accept_name  same, but using tag names
    attr_accept           call back for the objects in attributes

XML::Grove::IDs
    get_ids               return a list of all ID attributes in grove

XML::Grove::Path
    at_path               $el->at_path('/html/body/ul/li[4]')

XML::Grove::Sub
    filter                run a sub against all the objects in the grove
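
The idea behind an extension that runs a subroutine against every
object in a grove can be sketched with a short recursive walk over
"Contents". This is an illustration only and does not reproduce the
interface of XML::Grove::Sub or Data::Grove::Visitor; el(), txt(), and
select_objects() are hypothetical helpers, and the grove is built by
hand so the code runs without the modules installed.

```perl
use strict;
use warnings;

# Small constructors for a hand-built grove.
sub el {
    my ( $name, @contents ) = @_;
    return bless(
        { Name => $name, Attributes => {}, Contents => [@contents] },
        'XML::Grove::Element' );
}
sub txt { return bless( { Data => $_[0] }, 'XML::Grove::Characters' ) }

# Call $callback on $obj and on everything reachable through Contents,
# returning (in document order) the objects for which it returned true.
sub select_objects {
    my ( $obj, $callback ) = @_;
    my @hits = $callback->($obj) ? ($obj) : ();
    for my $child ( @{ $obj->{Contents} || [] } ) {
        push @hits, select_objects( $child, $callback );
    }
    return @hits;
}

my $list = el( 'ul',
    el( 'li', txt('one') ),
    el( 'li', txt('two') ),
    el( 'li', el( 'b', txt('three') ) ),
);

my @items = select_objects( $list,
    sub { ref( $_[0] ) eq 'XML::Grove::Element' && $_[0]{Name} eq 'li' } );

my @names = map { $_->{Name} }
    select_objects( $list, sub { ref( $_[0] ) eq 'XML::Grove::Element' } );

print scalar(@items), " list items; elements: @names\n";
```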

WRITING EXTENSIONS
The class XML::Grove is the superclass of all classes in the XML::Grove
module. XML::Grove is a subclass of Data::Grove.

If you create an extension and you want to add a method to all
XML::Grove objects, then create that method in the XML::Grove package.
Many extensions only need to add methods to XML::Grove::Document and/or
XML::Grove::Element.

When you create an extension you should definitely also provide a way
to invoke your module using objects from your own package. For example,
XML::Grove::AsString's as_string() method can also be called using an
XML::Grove::AsString object:

    $writer = XML::Grove::AsString->new;
    $string = $writer->as_string ( $xml_object );
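
The two calling styles described above can be sketched as follows. The
extension name My::Grove::CountElements and its count_elements() method
are hypothetical, and the grove is blessed by hand so the example runs
without XML::Grove installed; with the real module loaded, the same
package statements would add the method to the existing classes.

```perl
use strict;
use warnings;

package My::Grove::CountElements;    # hypothetical extension

sub new { my $class = shift; return bless( {}, $class ) }

# Object-style entry point: $counter->count_elements($grove_object)
sub count_elements {
    my ( $self, $obj ) = @_;
    my $count = ref($obj) eq 'XML::Grove::Element' ? 1 : 0;
    $count += $self->count_elements($_) for @{ $obj->{Contents} || [] };
    return $count;
}

# Method-style entry points, added to the grove classes that need them.
package XML::Grove::Document;
sub count_elements { My::Grove::CountElements->new->count_elements( $_[0] ) }

package XML::Grove::Element;
sub count_elements { My::Grove::CountElements->new->count_elements( $_[0] ) }

package main;

my $doc = bless( {
    Contents => [
        bless( {
            Name       => 'a',
            Attributes => {},
            Contents   => [
                bless( { Name => 'b', Attributes => {}, Contents => [] },
                       'XML::Grove::Element' ),
                bless( { Data => 'text' }, 'XML::Grove::Characters' ),
                bless( { Name => 'c', Attributes => {}, Contents => [] },
                       'XML::Grove::Element' ),
            ],
        }, 'XML::Grove::Element' ),
    ],
}, 'XML::Grove::Document' );

my $via_method = $doc->count_elements;
my $via_object = My::Grove::CountElements->new->count_elements($doc);
print "$via_method $via_object\n";
```

Keeping the logic in the extension's own package and making the grove
methods thin wrappers means both entry points share one implementation.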

AUTHOR
Ken MacLeod, ken@bitsko.slc.ut.us

SEE ALSO
perl(1), XML::Grove(3)

Extensible Markup Language (XML) <http://www.w3c.org/XML>

perl v5.16.3                      1999-09-09                     XML::Grove(3)