XML::Parser::PerlSAX(3)  User Contributed Perl Documentation  XML::Parser::PerlSAX(3)

NAME

       XML::Parser::PerlSAX - Perl SAX parser using XML::Parser

SYNOPSIS

        use XML::Parser::PerlSAX;

        $parser = XML::Parser::PerlSAX->new( [OPTIONS] );
        $result = $parser->parse( [OPTIONS] );

        $result = $parser->parse($string);

DESCRIPTION

       "XML::Parser::PerlSAX" is a PerlSAX parser built on the XML::Parser
       module.  This man page summarizes the specific options, handlers, and
       properties supported by "XML::Parser::PerlSAX"; please refer to the
       PerlSAX standard in "PerlSAX.pod" for general usage information.

METHODS

       new Creates a new parser object.  Default options for parsing,
           described below, are passed as key-value pairs or as a single hash.
           Options may be changed directly in the parser object unless stated
           otherwise.  Options passed to "parse()" override the default
           options in the parser object for the duration of the parse.

       parse
           Parses a document.  Options, described below, are passed as
           key-value pairs or as a single hash.  Options passed to "parse()"
           override default options in the parser object.

       location
           Returns the current parse location as a hash with the following
           properties:

             ColumnNumber    The current column number of the parse.
             LineNumber      The current line number of the parse.
             BytePosition    The current byte position of the parse.
             PublicId        A string containing the public identifier, or undef
                             if none is available.
             SystemId        A string containing the system identifier, or undef
                             if none is available.
             Base            The current value of the base for resolving relative
                             URIs.

           ALPHA WARNING: The "SystemId" and "PublicId" properties returned
           are the system and public identifiers of the document passed to
           "parse()", not the identifiers of the external entity currently
           being parsed.  The column, line, and byte positions are those of
           the entity currently being parsed.
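
       As a minimal sketch of "new()" and "parse()" as described above, the
       following passes a default handler to "new()" and then parses a
       string.  The "ExampleHandler" package and its fields are hypothetical
       names invented for this illustration; only the "Handler" option, the
       handler method names, and the "Data" property come from this page.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical handler class (not part of the module): counts elements
# and collects character data.
package ExampleHandler;

sub new { return bless { elements => 0, text => '' }, shift }

# Called once per start tag; the argument is a hash of properties.
sub start_element {
    my ($self, $element) = @_;
    $self->{elements}++;
}

# Called for character data; chunks are appended as they arrive.
sub characters {
    my ($self, $chars) = @_;
    $self->{text} .= $chars->{Data};
}

package main;

my $handler = ExampleHandler->new;

# Options given to new() become the defaults for every later parse().
my $parser = XML::Parser::PerlSAX->new( Handler => $handler );

# A single string argument is treated as the document itself.
$parser->parse('<doc><item>one</item><item>two</item></doc>');

print "elements: $handler->{elements}\n";
print "text: $handler->{text}\n";
```

       Keeping the results in the handler object, rather than relying on
       the value returned by "parse()", keeps the sketch independent of what
       the handler's "end_document()" returns.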

OPTIONS

       The following options are supported by "XML::Parser::PerlSAX":

        Handler            default handler to receive events
        DocumentHandler    handler to receive document events
        DTDHandler         handler to receive DTD events
        ErrorHandler       handler to receive error events
        EntityResolver     handler to resolve entities
        Locale             locale to provide localisation for errors
        Source             hash containing the input source for parsing
        UseAttributeOrder  set to true to provide AttributeOrder and Defaulted
                           properties in "start_element()"

       If no handlers are provided then all events will be silently ignored,
       except for "fatal_error()", which will cause "die()" to be called
       after calling "end_document()".

       If a single string argument is passed to the "parse()" method, it is
       treated as if a "Source" option was given with a "String" parameter.

       The "Source" hash may contain the following parameters:

        ByteStream       The raw byte stream (file handle) containing the
                         document.
        String           A string containing the document.
        SystemId         The system identifier (URI) of the document.
        PublicId         The public identifier.
        Encoding         A string describing the character encoding.

       If more than one of "ByteStream", "String", or "SystemId" is given,
       then preference is given first to "ByteStream", then "String", then
       "SystemId".
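
       The "Source" hash described above might be used as in the following
       sketch, which parses the same document once from a "String" and once
       from a "ByteStream".  The "NameCollector" package is a hypothetical
       helper written for this example, and the temporary file exists only
       to provide a real file handle.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Parser::PerlSAX;

# Hypothetical handler (not part of the module): records element names.
package NameCollector;

sub new { return bless { names => [] }, shift }

sub start_element {
    my ($self, $element) = @_;
    push @{ $self->{names} }, $element->{Name};
}

package main;

my $xml = '<doc><a/><b/></doc>';

# Source given as a string, with all options passed to parse().
my $from_string = NameCollector->new;
XML::Parser::PerlSAX->new->parse(
    Handler => $from_string,
    Source  => { String => $xml },
);

# Source given as a byte stream: write the document to a temporary
# file, then reopen it for reading and pass the file handle.
my ($out, $path) = tempfile( UNLINK => 1 );
print {$out} $xml;
close $out or die "close failed: $!";

open my $in, '<', $path or die "open failed: $!";
my $from_stream = NameCollector->new;
XML::Parser::PerlSAX->new( Handler => $from_stream )
    ->parse( Source => { ByteStream => $in } );
close $in;

print "string: @{ $from_string->{names} }\n";
print "stream: @{ $from_stream->{names} }\n";
```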

HANDLERS

       The following handlers and properties are supported by
       "XML::Parser::PerlSAX":

       DocumentHandler methods

           start_document
               Receive notification of the beginning of a document.

               No properties defined.

           end_document
               Receive notification of the end of a document.

               No properties defined.

           start_element
               Receive notification of the beginning of an element.

                Name             The element type name.
                Attributes       A hash containing the attributes attached to the
                                 element, if any.

               The "Attributes" hash contains only string values.

               If the "UseAttributeOrder" parser option is true, the
               following properties are also passed to "start_element()":

                AttributeOrder   An array of attribute names in the order they were
                                 specified, followed by the defaulted attribute
                                 names.
                Defaulted        The index number of the first defaulted attribute in
                                 "AttributeOrder".  If this index is equal to the
                                 length of "AttributeOrder", there were no defaulted
                                 values.

               Note to "XML::Parser" users:  "Defaulted" will be half the
               value of "XML::Parser::Expat"'s "specified_attr()" function
               because only attribute names are provided, not their values.

           end_element
               Receive notification of the end of an element.

                Name             The element type name.

           characters
               Receive notification of character data.

                Data             The characters from the XML document.

           processing_instruction
               Receive notification of a processing instruction.

                Target           The processing instruction target.
                Data             The processing instruction data, if any.

           comment
               Receive notification of a comment.

                Data             The comment data, if any.

           start_cdata
               Receive notification of the start of a CDATA section.

               No properties defined.

           end_cdata
               Receive notification of the end of a CDATA section.

               No properties defined.

           entity_reference
               Receive notification of an internal entity reference.  If this
               handler is defined, internal entities will not be expanded and
               will not be passed to the "characters()" handler.  If this
               handler is not defined, internal entities will be expanded if
               possible and passed to the "characters()" handler.

                Name             The entity reference name.
                Value            The entity reference value.
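
       The "UseAttributeOrder" behaviour of "start_element()" described
       above can be sketched as follows.  The "OrderHandler" package and its
       "seen" and "defaulted" fields are hypothetical names chosen for this
       illustration; the "Attributes", "AttributeOrder", and "Defaulted"
       properties are the ones listed on this page.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical handler (not part of the module): records attributes in
# the order reported by the parser.
package OrderHandler;

sub new { return bless { seen => [], defaulted => undef }, shift }

sub start_element {
    my ($self, $element) = @_;

    # AttributeOrder is only present when UseAttributeOrder is true,
    # so fall back to an empty list otherwise.
    my $order = $element->{AttributeOrder} || [];

    push @{ $self->{seen} },
        map { "$_=$element->{Attributes}{$_}" } @$order;

    # Index of the first defaulted attribute within AttributeOrder.
    $self->{defaulted} = $element->{Defaulted};
}

package main;

my $handler = OrderHandler->new;

XML::Parser::PerlSAX->new(
    Handler           => $handler,
    UseAttributeOrder => 1,
)->parse('<point z="3" y="2" x="1"/>');

print join(' ', @{ $handler->{seen} }), "\n";
```

       Because "Attributes" is a plain hash, its key order is not
       meaningful; "AttributeOrder" is what allows the document order to be
       recovered.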
       DTDHandler methods

           notation_decl
               Receive notification of a notation declaration event.

                Name             The notation name.
                PublicId         The notation's public identifier, if any.
                SystemId         The notation's system identifier, if any.
                Base             The base for resolving a relative URI, if any.

           unparsed_entity_decl
               Receive notification of an unparsed entity declaration event.

                Name             The unparsed entity's name.
                SystemId         The entity's system identifier.
                PublicId         The entity's public identifier, if any.
                Base             The base for resolving a relative URI, if any.

           entity_decl
               Receive notification of an entity declaration event.

                Name             The entity name.
                Value            The entity value, if any.
                PublicId         The entity's public identifier, if any.
                SystemId         The entity's system identifier, if any.
                Notation         The notation declared for this entity, if any.

               For internal entities, the "Value" parameter will contain the
               value, and "PublicId", "SystemId", and "Notation" will be
               undefined.  For external entities, the "Value" parameter will
               be undefined, the "SystemId" parameter will have the system
               id, the "PublicId" parameter will have the public id if it was
               provided (it will be undefined otherwise), and the "Notation"
               parameter will contain the notation name for unparsed
               entities.  If this is a parameter entity declaration, then a
               '%' will be prefixed to the entity name.

               Note that "entity_decl()" and "unparsed_entity_decl()"
               overlap.  If both methods are implemented by a handler, then
               "entity_decl()" will not be called for unparsed entities.

           element_decl
               Receive notification of an element declaration event.

                Name             The element type name.
                Model            The content model as a string.

           attlist_decl
               Receive notification of an attribute list declaration event.

               This handler is called for each attribute in an ATTLIST
               declaration found in the internal subset.  So an ATTLIST
               declaration that has multiple attributes will generate
               multiple calls to this handler.

                ElementName      The element type name.
                AttributeName    The attribute name.
                Type             The attribute type.
                Fixed            True if this is a fixed attribute.

               The default for "Type" is the default value, which will
               either be "#REQUIRED", "#IMPLIED" or a quoted string (i.e. the
               returned string will begin and end with a quote character).

           doctype_decl
               Receive notification of a DOCTYPE declaration event.

                Name             The document type name.
                SystemId         The document's system identifier.
                PublicId         The document's public identifier, if any.
                Internal         The internal subset as a string, if any.

               "Internal" will contain all whitespace, comments, processing
               instructions, and declarations seen in the internal subset.
               The declarations will be there whether or not they have been
               processed by another handler (except for unparsed entities
               processed by the Unparsed handler).  However, comments and
               processing instructions will not appear if they've been
               processed by their respective handlers.

           xml_decl
               Receive notification of an XML declaration event.

                Version          The version.
                Encoding         The encoding string, if any.
                Standalone       True, false, or undefined if not declared.
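
       A DTD handler along the lines described above might look like the
       following sketch, which records element and internal entity
       declarations from an internal subset.  The "DeclCollector" package
       and its fields are hypothetical names made up for this example, and
       the sketch assumes an underlying XML::Parser recent enough to report
       these declarations.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical DTD handler (not part of the module): collects the
# declarations it is notified about.
package DeclCollector;

sub new { return bless { elements => [], entities => {} }, shift }

# One call per <!ELEMENT ...> declaration.
sub element_decl {
    my ($self, $decl) = @_;
    push @{ $self->{elements} }, $decl->{Name};
}

# One call per <!ENTITY ...> declaration; Value is only expected to be
# set for internal entities.
sub entity_decl {
    my ($self, $decl) = @_;
    $self->{entities}{ $decl->{Name} } = $decl->{Value};
}

package main;

my $xml = <<'XML';
<!DOCTYPE doc [
  <!ELEMENT doc (#PCDATA)>
  <!ENTITY greeting "hello">
]>
<doc>&greeting;</doc>
XML

my $decls = DeclCollector->new;
XML::Parser::PerlSAX->new( DTDHandler => $decls )->parse($xml);

print "elements: @{ $decls->{elements} }\n";
print "entities: ", join(', ', sort keys %{ $decls->{entities} }), "\n";
```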
       EntityResolver methods

           resolve_entity
               Allow the handler to resolve external entities.

                Name             The entity name.
                SystemId         The entity's system identifier.
                PublicId         The entity's public identifier, if any.
                Base             The base for resolving a relative URI, if any.

               "resolve_entity()" should return either undef, to request that
               the parser open a regular URI connection to the system
               identifier, or a hash describing the new input source.  This
               hash has the same properties as the "Source" parameter to
               "parse()":

                 PublicId    The public identifier of the external entity being
                             referenced, or undef if none was supplied.
                 SystemId    The system identifier of the external entity being
                             referenced.
                 String      String containing XML text.
                 ByteStream  An open file handle.
                 CharacterStream
                             An open file handle.
                 Encoding    The character encoding, if known.
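
       As an illustration of "resolve_entity()" returning a hash, the
       following sketch supplies the replacement text of an external entity
       from a string instead of letting the parser fetch the system
       identifier.  The "StringResolver" and "TextCollector" packages, the
       "ext.xml" system identifier, and the replacement text are all
       hypothetical and chosen only for this example.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical resolver (not part of the module): answers every request
# with a fixed string and remembers which system ids were asked for.
package StringResolver;

sub new { return bless { requested => [] }, shift }

sub resolve_entity {
    my ($self, $entity) = @_;
    push @{ $self->{requested} }, $entity->{SystemId};

    # Returning a hash describes the new input source; returning undef
    # would instead leave the lookup to the parser.
    return { String => 'resolved text' };
}

# Hypothetical document handler: accumulates character data.
package TextCollector;

sub new { return bless { text => '' }, shift }

sub characters {
    my ($self, $chars) = @_;
    $self->{text} .= $chars->{Data};
}

package main;

my $xml = <<'XML';
<!DOCTYPE doc [
  <!ENTITY ext SYSTEM "ext.xml">
]>
<doc>&ext;</doc>
XML

my $resolver  = StringResolver->new;
my $collector = TextCollector->new;

XML::Parser::PerlSAX->new(
    DocumentHandler => $collector,
    EntityResolver  => $resolver,
)->parse($xml);

print "requested: @{ $resolver->{requested} }\n";
print "text: $collector->{text}\n";
```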

AUTHOR

       Ken MacLeod, ken@bitsko.slc.ut.us

SEE ALSO

       perl(1), PerlSAX.pod(3)

        Extensible Markup Language (XML) <http://www.w3c.org/XML/>
        SAX 1.0: The Simple API for XML <http://www.megginson.com/SAX/>

perl v5.8.8                       2003-10-21           XML::Parser::PerlSAX(3)