XML::Parser::PerlSAX(3)  User Contributed Perl Documentation  XML::Parser::PerlSAX(3)

NAME
    XML::Parser::PerlSAX - Perl SAX parser using XML::Parser

SYNOPSIS
    use XML::Parser::PerlSAX;

    $parser = XML::Parser::PerlSAX->new( [OPTIONS] );
    $result = $parser->parse( [OPTIONS] );

    $result = $parser->parse($string);

DESCRIPTION
    "XML::Parser::PerlSAX" is a PerlSAX parser using the XML::Parser
    module. This man page summarizes the specific options, handlers, and
    properties supported by "XML::Parser::PerlSAX"; please refer to the
    PerlSAX standard in "PerlSAX.pod" for general usage information.

METHODS
    new
        Creates a new parser object. Default options for parsing,
        described below, are passed as key-value pairs or as a single
        hash. Options may be changed directly in the parser object unless
        stated otherwise. Options passed to "parse()" override the
        default options in the parser object for the duration of the
        parse.

    parse
        Parses a document. Options, described below, are passed as
        key-value pairs or as a single hash. Options passed to "parse()"
        override default options in the parser object.
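
As a rough illustration of how "new()" and "parse()" fit together, the sketch below gives the parser a default handler when it is created and supplies the document at parse time. The ElementCounter package and the count_elements() helper are hypothetical names, and the sketch assumes, following the PerlSAX convention, that "parse()" returns whatever the handler's "end_document()" returns.

```perl
use strict;
use warnings;

# Hypothetical handler: counts start_element events.
package ElementCounter;

sub new { my ($class) = @_; return bless { Count => 0 }, $class }

# Each event arrives as a single hash reference.
sub start_element { my ( $self, $element ) = @_; $self->{Count}++; return }

# Assumption: the value returned here becomes the result of parse().
sub end_document { my ( $self, $document ) = @_; return $self->{Count} }

package main;

# Hypothetical helper: parse $xml and return the element count.
sub count_elements {
    my ($xml) = @_;
    require XML::Parser::PerlSAX;    # loaded on first use

    # A default option is given to new() ...
    my $parser = XML::Parser::PerlSAX->new( Handler => ElementCounter->new );

    # ... and a per-parse option is given to parse().
    return $parser->parse( Source => { String => $xml } );
}
```

A call such as count_elements('<a><b/></a>') would hand the string to the parser. The handler itself is plain Perl and can also be exercised by calling its methods directly with hash references.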

    location
        Returns the location as a hash:

            ColumnNumber    The column number of the parse.
            LineNumber      The line number of the parse.
            BytePosition    The current byte position of the parse.
            PublicId        A string containing the public identifier, or
                            undef if none is available.
            SystemId        A string containing the system identifier, or
                            undef if none is available.
            Base            The current value of the base for resolving
                            relative URIs.

        ALPHA WARNING: The "SystemId" and "PublicId" properties returned
        are the system and public identifiers of the document passed to
        "parse()", not the identifiers of the external entity currently
        being parsed. The column, line, and byte positions are those of
        the entity currently being parsed.
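
The location hash lends itself to building diagnostic messages. The following is a minimal sketch that formats such a hash; format_location() is a hypothetical helper, and the sample hash is built by hand to stand in for the value returned by "location()".

```perl
use strict;
use warnings;

# Hypothetical helper: render a location hash as a one-line string.
sub format_location {
    my ($loc) = @_;

    # SystemId is undef when no system identifier is available.
    my $where = defined $loc->{SystemId} ? $loc->{SystemId} : '(unknown source)';

    return sprintf '%s: line %d, column %d (byte %d)',
        $where, $loc->{LineNumber}, $loc->{ColumnNumber}, $loc->{BytePosition};
}

# A hand-built hash standing in for the result of $parser->location().
my $sample = {
    LineNumber   => 12,
    ColumnNumber => 4,
    BytePosition => 310,
    SystemId     => 'doc.xml',
    PublicId     => undef,
    Base         => undef,
};
print format_location($sample), "\n";    # doc.xml: line 12, column 4 (byte 310)
```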

OPTIONS
    The following options are supported by "XML::Parser::PerlSAX":

        Handler             default handler to receive events
        DocumentHandler     handler to receive document events
        DTDHandler          handler to receive DTD events
        ErrorHandler        handler to receive error events
        EntityResolver      handler to resolve entities
        Locale              locale to provide localisation for errors
        Source              hash containing the input source for parsing
        UseAttributeOrder   set to true to provide AttributeOrder and
                            Defaulted properties in "start_element()"

    If no handlers are provided, all events are silently ignored, except
    for "fatal_error()", which causes "die()" to be called after
    "end_document()" has been called.

    If a single string argument is passed to the "parse()" method, it is
    treated as if a "Source" option had been given with a "String"
    parameter.

    The "Source" hash may contain the following parameters:

        ByteStream   The raw byte stream (file handle) containing the
                     document.
        String       A string containing the document.
        SystemId     The system identifier (URI) of the document.
        PublicId     The public identifier.
        Encoding     A string describing the character encoding.

    If more than one of "ByteStream", "String", or "SystemId" is given,
    preference is given first to "ByteStream", then "String", then
    "SystemId".
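
The single-string shorthand and the preference order among input sources can be pictured with two small functions. This is only an illustration of the rules stated above, not the module's own code; normalize_parse_args() and preferred_source_key() are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the documented shorthand: a lone string
# argument is treated as Source => { String => ... }.
sub normalize_parse_args {
    my @args = @_;
    if ( @args == 1 && !ref $args[0] ) {
        my %options = ( Source => { String => $args[0] } );
        return \%options;
    }
    # Otherwise accept a single hash reference or key-value pairs.
    return @args == 1 ? $args[0] : +{ @args };
}

# Hypothetical helper mirroring the documented preference order.
sub preferred_source_key {
    my ($source) = @_;
    for my $key (qw(ByteStream String SystemId)) {
        return $key if defined $source->{$key};
    }
    return;    # no usable input was given
}

my $options = normalize_parse_args('<doc/>');
print preferred_source_key( $options->{Source} ), "\n";    # String
```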

HANDLERS
    The following handlers and properties are supported by
    "XML::Parser::PerlSAX":

  DocumentHandler methods

    start_document
        Receive notification of the beginning of a document.

        No properties defined.

    end_document
        Receive notification of the end of a document.

        No properties defined.

    start_element
        Receive notification of the beginning of an element.

            Name          The element type name.
            Attributes    A hash containing the attributes attached to
                          the element, if any.

        The "Attributes" hash contains only string values.

        If the "UseAttributeOrder" parser option is true, the following
        properties are also passed to "start_element":

            AttributeOrder  An array of attribute names in the order they
                            were specified, followed by the defaulted
                            attribute names.
            Defaulted       The index number of the first defaulted
                            attribute in "AttributeOrder". If this index
                            is equal to the length of "AttributeOrder",
                            there were no defaulted values.

        Note to "XML::Parser" users: "Defaulted" will be half the value
        of "XML::Parser::Expat"'s "specified_attr()" function because
        only attribute names are provided, not their values.
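
To make the relationship between "AttributeOrder" and "Defaulted" concrete, here is a minimal sketch that splits the attribute names of one event into specified and defaulted names. split_attributes() is a hypothetical helper, and the event is built by hand in the shape described above.

```perl
use strict;
use warnings;

# Hypothetical helper: split the attribute names of a start_element
# event into those written in the document and those defaulted.
sub split_attributes {
    my ($element) = @_;
    my @order           = @{ $element->{AttributeOrder} };
    my $first_defaulted = $element->{Defaulted};

    # Names before the Defaulted index were specified; the rest were defaulted.
    my @specified = @order[ 0 .. $first_defaulted - 1 ];
    my @defaulted = @order[ $first_defaulted .. $#order ];
    return ( \@specified, \@defaulted );
}

# Hand-built event, shaped like one delivered when UseAttributeOrder is true.
my $event = {
    Name           => 'para',
    Attributes     => { id => 'p1', class => 'note', lang => 'en' },
    AttributeOrder => [ 'id', 'class', 'lang' ],
    Defaulted      => 2,
};
my ( $specified, $defaulted ) = split_attributes($event);
print "specified: @$specified\n";    # specified: id class
print "defaulted: @$defaulted\n";    # defaulted: lang
```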

    end_element
        Receive notification of the end of an element.

            Name          The element type name.

    characters
        Receive notification of character data.

            Data          The characters from the XML document.
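
The element and character events are typically combined to recover the text of each element. The sketch below assumes that one run of text may arrive as several "characters()" calls, which is common for SAX-style parsers but is not stated in this page. TextCollector is a hypothetical handler, driven here by hand.

```perl
use strict;
use warnings;

# Hypothetical handler: gathers the text content of each element.
package TextCollector;

sub new { my ($class) = @_; return bless { Stack => [], Text => {} }, $class }

sub start_element {
    my ( $self, $element ) = @_;
    push @{ $self->{Stack} }, { Name => $element->{Name}, Data => '' };
    return;
}

# Append each chunk to the innermost open element, if there is one.
sub characters {
    my ( $self, $characters ) = @_;
    my $open = $self->{Stack}[-1] or return;
    $open->{Data} .= $characters->{Data};
    return;
}

sub end_element {
    my ( $self, $element ) = @_;
    my $closed = pop @{ $self->{Stack} };
    $self->{Text}{ $closed->{Name} } = $closed->{Data};
    return;
}

package main;

# Drive the handler by hand with events shaped like those described above.
my $handler = TextCollector->new;
$handler->start_element( { Name => 'title', Attributes => {} } );
$handler->characters( { Data => 'Perl ' } );
$handler->characters( { Data => 'SAX' } );
$handler->end_element( { Name => 'title' } );
print $handler->{Text}{title}, "\n";    # Perl SAX
```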

    processing_instruction
        Receive notification of a processing instruction.

            Target        The processing instruction target.
            Data          The processing instruction data, if any.

    comment
        Receive notification of a comment.

            Data          The comment data, if any.

    start_cdata
        Receive notification of the start of a CDATA section.

        No properties defined.

    end_cdata
        Receive notification of the end of a CDATA section.

        No properties defined.

    entity_reference
        Receive notification of an internal entity reference. If this
        handler is defined, internal entities will be neither expanded
        nor passed to the "characters()" handler. If this handler is not
        defined, internal entities will be expanded if possible and
        passed to the "characters()" handler.

            Name          The entity reference name.
            Value         The entity reference value.
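
One use of "entity_reference()" is to keep references unexpanded when re-serializing a document. The following is a minimal sketch under that assumption; EntityPreserver is a hypothetical handler, and the events are fed to it by hand.

```perl
use strict;
use warnings;

# Hypothetical handler: keeps entity references unexpanded in its output.
package EntityPreserver;

sub new { my ($class) = @_; return bless { Output => '' }, $class }

sub characters {
    my ( $self, $characters ) = @_;
    $self->{Output} .= $characters->{Data};
    return;
}

# Because this method exists, internal entities are reported here instead
# of being expanded; the reference is written back out as "&name;".
sub entity_reference {
    my ( $self, $reference ) = @_;
    $self->{Output} .= '&' . $reference->{Name} . ';';
    return;
}

package main;

my $handler = EntityPreserver->new;
$handler->characters( { Data => 'Copyright ' } );
$handler->entity_reference( { Name => 'owner', Value => 'Example Corp' } );
print $handler->{Output}, "\n";    # Copyright &owner;
```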

  DTDHandler methods

    notation_decl
        Receive notification of a notation declaration event.

            Name          The notation name.
            PublicId      The notation's public identifier, if any.
            SystemId      The notation's system identifier, if any.
            Base          The base for resolving a relative URI, if any.

    unparsed_entity_decl
        Receive notification of an unparsed entity declaration event.

            Name          The unparsed entity's name.
            SystemId      The entity's system identifier.
            PublicId      The entity's public identifier, if any.
            Base          The base for resolving a relative URI, if any.

    entity_decl
        Receive notification of an entity declaration event.

            Name          The entity name.
            Value         The entity value, if any.
            PublicId      The entity's public identifier, if any.
            SystemId      The entity's system identifier, if any.
            Notation      The notation declared for this entity, if any.

        For internal entities, the "Value" parameter will contain the
        value, and "PublicId", "SystemId", and "Notation" will be
        undefined. For external entities, the "Value" parameter will be
        undefined, the "SystemId" parameter will have the system id, the
        "PublicId" parameter will have the public id if it was provided
        (it will be undefined otherwise), and the "Notation" parameter
        will contain the notation name for unparsed entities. If this is
        a parameter entity declaration, then a '%' will be prefixed to
        the entity name.

        Note that "entity_decl()" and "unparsed_entity_decl()" overlap.
        If both methods are implemented by a handler, then
        "entity_decl()" will not be called for unparsed entities.
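
The rules above for telling entity kinds apart can be sketched as a small classifier. classify_entity() is a hypothetical helper that inspects only the properties listed for "entity_decl()"; the sample declarations are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: classify an entity_decl event by its properties.
sub classify_entity {
    my ($decl) = @_;

    # Internal entities carry a Value; unparsed entities carry a Notation;
    # anything else is treated as an external parsed entity.
    my $kind =
        defined $decl->{Value}    ? 'internal'
      : defined $decl->{Notation} ? 'unparsed'
      :                             'external';

    # A leading '%' on the name marks a parameter entity.
    $kind = "parameter $kind" if $decl->{Name} =~ /^%/;
    return $kind;
}

print classify_entity( { Name => 'copy', Value => '(c)' } ), "\n";    # internal
print classify_entity(
    { Name => 'logo', SystemId => 'logo.png', Notation => 'png' } ), "\n";    # unparsed
print classify_entity( { Name => '%common', SystemId => 'common.ent' } ), "\n";    # parameter external
```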

    element_decl
        Receive notification of an element declaration event.

            Name          The element type name.
            Model         The content model as a string.

    attlist_decl
        Receive notification of an attribute list declaration event.

        This handler is called for each attribute in an ATTLIST
        declaration found in the internal subset. So an ATTLIST
        declaration that has multiple attributes will generate multiple
        calls to this handler.

            ElementName   The element type name.
            AttributeName The attribute name.
            Type          The attribute type.
            Fixed         True if this is a fixed attribute.

        The default for "Type" is the default value, which will either be
        "#REQUIRED", "#IMPLIED" or a quoted string (i.e. the returned
        string will begin and end with a quote character).
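
Because "attlist_decl()" fires once per attribute, a handler usually has to regroup the events by element. The following is a minimal sketch of that regrouping; AttlistIndex is a hypothetical handler that uses only the four properties listed above, and the declarations are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical handler: indexes attribute declarations by element name.
package AttlistIndex;

sub new { my ($class) = @_; return bless { Attributes => {} }, $class }

# Called once per declared attribute, so a single ATTLIST with several
# attributes arrives as several events.
sub attlist_decl {
    my ( $self, $decl ) = @_;
    $self->{Attributes}{ $decl->{ElementName} }{ $decl->{AttributeName} } = {
        Type  => $decl->{Type},
        Fixed => $decl->{Fixed} ? 1 : 0,
    };
    return;
}

package main;

my $index = AttlistIndex->new;
$index->attlist_decl(
    { ElementName => 'img', AttributeName => 'src', Type => 'CDATA', Fixed => 0 } );
$index->attlist_decl(
    { ElementName => 'img', AttributeName => 'alt', Type => 'CDATA', Fixed => 0 } );
print join( ', ', sort keys %{ $index->{Attributes}{img} } ), "\n";    # alt, src
```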

    doctype_decl
        Receive notification of a DOCTYPE declaration event.

            Name          The document type name.
            SystemId      The document's system identifier.
            PublicId      The document's public identifier, if any.
            Internal      The internal subset as a string, if any.

        "Internal" will contain all whitespace, comments, processing
        instructions, and declarations seen in the internal subset. The
        declarations will be there whether or not they have been
        processed by another handler (except for unparsed entities
        processed by the Unparsed handler). However, comments and
        processing instructions will not appear if they've been processed
        by their respective handlers.

    xml_decl
        Receive notification of an XML declaration event.

            Version       The version.
            Encoding      The encoding string, if any.
            Standalone    True, false, or undefined if not declared.
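
Since "Standalone" has three states, a handler needs to test definedness before truth. A minimal sketch of that check follows; describe_xml_decl() is a hypothetical helper, and the sample event is built by hand.

```perl
use strict;
use warnings;

# Hypothetical helper: describe an xml_decl event in one line.
sub describe_xml_decl {
    my ($decl) = @_;

    # Test for undef first: an undeclared value is not the same as "no".
    my $standalone =
        !defined $decl->{Standalone} ? 'not declared'
      : $decl->{Standalone}          ? 'yes'
      :                                'no';
    my $encoding = defined $decl->{Encoding} ? $decl->{Encoding} : 'not declared';

    return "version $decl->{Version}, encoding $encoding, standalone $standalone";
}

print describe_xml_decl( { Version => '1.0', Encoding => 'UTF-8' } ), "\n";
# version 1.0, encoding UTF-8, standalone not declared
```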

  EntityResolver

    resolve_entity
        Allow the handler to resolve external entities.

            Name          The entity name.
            SystemId      The entity's system identifier.
            PublicId      The entity's public identifier, if any.
            Base          The base for resolving a relative URI, if any.

        "resolve_entity()" should return either undef, to request that
        the parser open a regular URI connection to the system
        identifier, or a hash describing the new input source. This hash
        has the same properties as the "Source" parameter to "parse()":

            PublicId      The public identifier of the external entity
                          being referenced, or undef if none was
                          supplied.
            SystemId      The system identifier of the external entity
                          being referenced.
            String        String containing XML text.
            ByteStream    An open file handle.
            CharacterStream
                          An open file handle.
            Encoding      The character encoding, if known.
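
The two possible return values of "resolve_entity()" can be illustrated with a resolver that serves known system identifiers from memory and defers to the parser for everything else. CatalogResolver and its catalog contents are hypothetical, and the calls are made by hand rather than by the parser.

```perl
use strict;
use warnings;

# Hypothetical resolver: serves known system identifiers from memory.
package CatalogResolver;

sub new {
    my ( $class, %catalog ) = @_;
    return bless { Catalog => { %catalog } }, $class;
}

sub resolve_entity {
    my ( $self, $entity ) = @_;
    my $text = $self->{Catalog}{ $entity->{SystemId} };

    # undef asks the parser to open the system identifier itself.
    return undef unless defined $text;

    # Otherwise return a hash describing the replacement input source.
    return {
        SystemId => $entity->{SystemId},
        String   => $text,
    };
}

package main;

my $resolver = CatalogResolver->new( 'chapter1.xml' => '<chapter/>' );
my $source   = $resolver->resolve_entity( { SystemId => 'chapter1.xml' } );
print $source->{String}, "\n";    # <chapter/>
```

Such an object would be passed to the parser through the "EntityResolver" option.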

AUTHOR
    Ken MacLeod, ken@bitsko.slc.ut.us

SEE ALSO
    perl(1), PerlSAX.pod(3)

    Extensible Markup Language (XML) <http://www.w3c.org/XML/>
    SAX 1.0: The Simple API for XML <http://www.megginson.com/SAX/>

perl v5.8.8                       2003-10-21           XML::Parser::PerlSAX(3)