XML::SAX::Base(3)     User Contributed Perl Documentation     XML::SAX::Base(3)

XML::SAX::Base - Base class for SAX Drivers and Filters

    package MyFilter;
    use XML::SAX::Base;
    our @ISA = ('XML::SAX::Base');

This module has a very simple task: to be a base class for PerlSAX
drivers and filters. Its default behaviour is to pass the input
directly to the output unchanged. Using it as a base class means you
don't have to implement every callback yourself, for example the
characters() callback.

The main advantages it provides are correct dispatching of events (it
checks for you that the handler has implemented the method in question,
or has defined an AUTOLOAD), and the guarantee that filters will pass
along events they don't implement to handlers downstream that might
nevertheless be interested in them.

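As a minimal sketch of that pass-through behaviour, the script below
uses XML::SAX::Base itself as a do-nothing filter in front of a handler
that implements only characters(). The CollectText class and the event
data are made up for the illustration; the point is only that the other
events are quietly dropped instead of causing an error.

```perl
use strict;
use warnings;

# Hypothetical handler that implements characters() and nothing else.
package CollectText;

sub new {
    my $class = shift;
    return bless { text => '' }, $class;
}

sub characters {
    my ($self, $data) = @_;
    $self->{text} .= $data->{Data};
}

package main;

use XML::SAX::Base;

my $collector = CollectText->new;

# The base class on its own simply forwards every event it is given.
my $filter = XML::SAX::Base->new(Handler => $collector);

$filter->start_document({});
$filter->start_element({ Name => 'greeting', Attributes => {} });
$filter->characters({ Data => 'hello' });
$filter->end_element({ Name => 'greeting' });
$filter->end_document({});

# Only characters() reached the handler; the rest was skipped safely.
print $collector->{text}, "\n";
```
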
Writing SAX Filters is tremendously easy: all you need to do is inherit
from this module, and define the events you want to handle. A more
detailed explanation can be found at
http://www.xml.com/pub/a/2001/10/10/sax-filters.html.

Writing Drivers is equally simple. The one thing you need to pay
attention to is NOT to call events on the handler yourself (this
applies to Filters as well). For instance:

    package MyFilter;
    use base qw(XML::SAX::Base);

    sub start_element {
        my $self = shift;
        my $data = shift;
        # do something
        $self->{Handler}->start_element($data); # BAD
    }

The above example works well as precisely that: an example. But it has
several faults: 1) it doesn't test whether the handler defines
start_element. Perhaps the handler doesn't want to see that event, in
which case you shouldn't send it (otherwise the call will die). 2) it
doesn't check ContentHandler and then Handler (ie it doesn't look to
see whether the user has requested events on a specific handler, and
fall back to the default one if not). 3) if it did check all that, not
only would the code be cumbersome (see this module's source to get an
idea) but it would also probably have to check for a DocumentHandler
(in case this were SAX1) and for AUTOLOADs potentially defined in all
these packages. As you can tell, that would be fairly painful. Instead
of going through that, simply remember to use code similar to the
following:

    package MyFilter;
    use base qw(XML::SAX::Base);

    sub start_element {
        my $self = shift;
        my $data = shift;
        # do something to filter
        $self->SUPER::start_element($data); # GOOD (and easy) !
    }

This way, once you've done your job you hand the ball back to
XML::SAX::Base and it takes care of all those problems for you!

Note that the above example doesn't apply to filters only; drivers
benefit from the exact same feature.

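For the driver side, a sketch along the following lines may help. It
assumes a made-up driver, WordDriver, that turns a whitespace-separated
string into a series of word elements, and a made-up EventLog handler
that records what it receives. The driver never touches
$self->{Handler}; it calls the event methods it inherits and lets
XML::SAX::Base work out where they should go.

```perl
use strict;
use warnings;

# Hypothetical driver: emits one <word> element per word in the input.
package WordDriver;
use base qw(XML::SAX::Base);

sub _parse_string {
    my ($self, $string) = @_;
    $self->start_document({});
    for my $word (split ' ', $string) {
        my $element = { Name => 'word', Attributes => {} };
        # Inherited event methods do the dispatching for us.
        $self->start_element($element);
        $self->characters({ Data => $word });
        $self->end_element($element);
    }
    return $self->end_document({});
}

# Hypothetical handler that records the events it is sent.
package EventLog;

sub new {
    my $class = shift;
    return bless { events => [] }, $class;
}

sub start_element {
    my ($self, $el) = @_;
    push @{ $self->{events} }, "<$el->{Name}>";
}

sub characters {
    my ($self, $data) = @_;
    push @{ $self->{events} }, $data->{Data};
}

sub end_element {
    my ($self, $el) = @_;
    push @{ $self->{events} }, "</$el->{Name}>";
}

package main;

my $log    = EventLog->new;
my $driver = WordDriver->new(Handler => $log);
$driver->parse_string('hello world');

print join(' ', @{ $log->{events} }), "\n";
```

EventLog defines neither start_document nor end_document, so those two
events are simply not delivered to it.
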
A number of methods are defined within this class for the purpose of
inheritance. Some probably don't need to be overridden (eg parse_file)
but some clearly should be (eg parse). Options for these methods are
described in the PerlSAX2 specification available from
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/perl-xml/libxml-perl/doc/sax-2.0.html?rev=HEAD&content-type=text/html.

· parse

  The parse method is the main entry point to parsing documents.
  Internally the parse method will detect what type of "thing" you
  are parsing, and call the appropriate method in your implementation
  class. Here is the mapping table of what is in the Source options
  (see the Perl SAX 2.0 specification for the meaning of these
  values):

    Source Contains      parse() calls
    ===============      =============
    CharacterStream (*)  _parse_characterstream($stream, $options)
    ByteStream           _parse_bytestream($stream, $options)
    String               _parse_string($string, $options)
    SystemId             _parse_systemid($string, $options)

  Note, however, that these methods may not be sensible if your driver
  class is not for parsing XML. An example might be a DBI driver that
  generates XML/SAX from a database table. If that is the case, you
  likely want to write your own parse() method.

  Also note that the Source may contain both a PublicId entry and an
  Encoding entry. To get at these, examine $options->{Source} as
  passed to your method.

  (*) A CharacterStream is a filehandle that does not need any
  encoding translation done on it. This is implemented as a regular
  filehandle and only works under Perl 5.7.2 or higher using PerlIO.
  To get a single character, or a number of characters, from it, use
  the perl core read() function. To get a single byte (or a number of
  bytes) from it, you can use sysread(). The encoding of the stream
  should be in the Encoding entry for the Source.

· parse_file, parse_uri, parse_string

  These are all convenience variations on parse(), and in fact simply
  set up the options before calling it. You probably don't need to
  override these.

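To see the mapping table and the convenience methods at work, one can
write a toy driver whose _parse_* methods just report which one was
reached. WhichSource is a made-up class and does no real parsing; the
sketch only looks at the first argument each method receives, and
'doc.xml' is never opened.

```perl
use strict;
use warnings;

# Hypothetical driver that only reports which _parse_* method ran.
package WhichSource;
use base qw(XML::SAX::Base);

sub _parse_string {
    my ($self, $string) = @_;
    return "string: $string";
}

sub _parse_systemid {
    my ($self, $system_id) = @_;
    return "systemid: $system_id";
}

package main;

my $driver = WhichSource->new;

# parse() inspects the Source option and picks the matching method.
my $direct = $driver->parse(Source => { String => '<doc/>' });

# The convenience methods fill in Source and then call parse().
my $via_string = $driver->parse_string('<doc/>');
my $via_uri    = $driver->parse_uri('doc.xml');

print "$direct\n$via_string\n$via_uri\n";
```
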
· get_options

  This is a convenience method to get options in SAX2 style, or more
  generically either as hashes or as hashrefs (it returns a hashref).
  You will probably want to use this method in your own
  implementations of parse() and of new().

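A driver that does not parse XML at all might use get_options in its
own parse() roughly as sketched here. RowDriver, RowCollector and the
Rows option are all invented for the example; Rows is not part of
PerlSAX. The handler is given to new() so that the inherited event
methods can find it.

```perl
use strict;
use warnings;

# Hypothetical non-XML driver: turns a list of values into <row> events.
package RowDriver;
use base qw(XML::SAX::Base);

sub parse {
    my $self = shift;
    # Accepts either a hash or a hashref; always hands back a hashref.
    my $options = $self->get_options(@_);
    my $rows = $options->{Rows} || [];    # 'Rows' is a made-up option

    $self->start_document({});
    for my $row (@$rows) {
        my $element = { Name => 'row', Attributes => {} };
        $self->start_element($element);
        $self->characters({ Data => $row });
        $self->end_element($element);
    }
    $self->end_document({});
    return scalar @$rows;
}

# Hypothetical handler that keeps the character data it sees.
package RowCollector;

sub new {
    my $class = shift;
    return bless { rows => [] }, $class;
}

sub characters {
    my ($self, $data) = @_;
    push @{ $self->{rows} }, $data->{Data};
}

package main;

my $collector = RowCollector->new;
my $driver    = RowDriver->new(Handler => $collector);

my $from_hash    = $driver->parse(Rows => ['a', 'b']);    # hash style
my $from_hashref = $driver->parse({ Rows => ['c'] });     # hashref style

print "$from_hash $from_hashref @{ $collector->{rows} }\n";
```
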
· get_feature, set_feature

  These simply get and set features, and throw the appropriate
  exceptions defined in the specification if need be.

  If your subclass defines features not defined in this one, then you
  should override these methods in such a way that they check for
  your features first, and then call the base class's methods for
  features not defined by your class. An example would be:

    sub get_feature {
        my $self = shift;
        my $feat = shift;
        if (exists $MY_FEATURES{$feat}) {
            # one of our own features: handle it in various ways
        }
        else {
            # not one of ours: defer to the base class
            return $self->SUPER::get_feature($feat);
        }
    }

  Currently this part is unimplemented.

· set_handler

  This method takes a handler type (Handler, ContentHandler, etc.)
  and a handler object as arguments, and changes the current handler
  for that handler type, while taking care of resetting the internal
  state that needs to be reset. This allows one to change a handler
  during parse without running into problems (changing it on the
  parser object directly will most likely cause trouble).

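The effect of set_handler can be sketched with two instances of a
made-up Counter handler that counts characters() events. The events
here are fired by hand rather than from a real parse, which is a
simplification; the intent is to show later events going to the new
handler once it has been swapped in through set_handler.

```perl
use strict;
use warnings;

# Hypothetical handler that counts the characters() events it receives.
package Counter;

sub new {
    my $class = shift;
    return bless { count => 0 }, $class;
}

sub characters {
    my ($self, $data) = @_;
    $self->{count}++;
}

package main;

use XML::SAX::Base;

my $first  = Counter->new;
my $second = Counter->new;

my $filter = XML::SAX::Base->new(Handler => $first);
$filter->characters({ Data => 'a' });         # goes to $first

# Swap the handler through the API, not by poking at $filter->{Handler}.
$filter->set_handler(Handler => $second);
$filter->characters({ Data => 'b' });         # goes to $second
$filter->characters({ Data => 'c' });         # goes to $second

print "first: $first->{count}, second: $second->{count}\n";
```
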
· set_document_handler, set_content_handler, set_dtd_handler,
  set_lexical_handler, set_decl_handler, set_error_handler,
  set_entity_resolver

  These are just simple wrappers around set_handler, and take a
  handler object as their argument. Internally they simply call
  set_handler with the correct arguments.

· get_handler

  The inverse of set_handler, this method takes an optional string
  containing a handler type (DTDHandler, ContentHandler, etc.;
  'Handler' is used if no type is passed). It returns a reference to
  the object that implements that handler type, or undef if that
  handler type is not set for the current driver/filter.

· get_document_handler, get_content_handler, get_dtd_handler,
  get_lexical_handler, get_decl_handler, get_error_handler,
  get_entity_resolver

  These are just simple wrappers around the get_handler() method, and
  take no arguments. Internally they simply call get_handler with the
  correct handler type name.

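A short sketch of the getters and the typed setters together, assuming
two placeholder handler objects blessed into made-up class names
(MyContentHandler and MyDTDHandler, neither of which needs any
methods for this purpose):

```perl
use strict;
use warnings;

use XML::SAX::Base;

# Placeholder handler objects; the class names are hypothetical.
my $content = bless {}, 'MyContentHandler';
my $dtd     = bless {}, 'MyDTDHandler';

my $filter = XML::SAX::Base->new(ContentHandler => $content);

# The generic getter and its typed wrapper return the same object.
my $same = $filter->get_handler('ContentHandler') == $content
        && $filter->get_content_handler == $content;

# A handler type that was never set comes back as undef.
my $unset = !defined $filter->get_handler('DTDHandler');

# The typed setter is a thin wrapper around set_handler.
$filter->set_dtd_handler($dtd);
my $now_set = $filter->get_dtd_handler == $dtd;

print "same=", ($same ? 1 : 0),
      " unset=", ($unset ? 1 : 0),
      " now_set=", ($now_set ? 1 : 0), "\n";
```
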
It would be rather pointless to describe here all the event methods
that this module implements. They are all the methods supported in SAX1
and SAX2. In case your memory is a little short, here is a list.
Entries that look like duplicates of one another are there so that both
versions of SAX can be supported.

· start_document

· end_document

· start_element

· end_element

· characters

· processing_instruction

· ignorable_whitespace

· set_document_locator

· start_prefix_mapping

· end_prefix_mapping

· skipped_entity

· start_cdata

· end_cdata

· comment

· entity_reference

· notation_decl

· unparsed_entity_decl

· element_decl

· attlist_decl

· doctype_decl

· xml_decl

· entity_decl

· attribute_decl

· internal_entity_decl

· external_entity_decl

· resolve_entity

· start_dtd

· end_dtd

· start_entity

· end_entity

· warning

· error

· fatal_error

- more tests
- conform to the "SAX Filters" and "Java and DOM compatibility"
  sections of the SAX2 document.

Kip Hampton (khampton@totalcinema.com) did most of the work, after
porting it from XML::Filter::Base.

Robin Berjon (robin@knowscape.com) pitched in with patches to make it
usable as a base for drivers as well as filters, along with other
patches.

Matt Sergeant (matt@sergeant.org) wrote the original XML::Filter::Base,
and patched a few things here and there, and imported it into the
XML::SAX distribution.

XML::SAX

perl v5.12.0                      2010-05-07                XML::SAX::Base(3)