BuildSAXBase(3)       User Contributed Perl Documentation      BuildSAXBase(3)

NAME

       XML::SAX::Base - Base class for SAX Drivers and Filters

SYNOPSIS

         package MyFilter;
         use XML::SAX::Base;
         @ISA = ('XML::SAX::Base');

DESCRIPTION

       This module has a very simple task: to be a base class for PerlSAX
       drivers and filters. Its default behaviour is to pass the input
       directly to the output unchanged. Using this module as a base class
       means you don't have to implement every callback yourself, for
       example the characters() callback.

       The main advantages it provides are correct dispatching of events (it
       checks for you that the handler has implemented the method in
       question, or has defined an AUTOLOAD), and the guarantee that filters
       will pass along events they don't implement to handlers downstream
       that might nevertheless be interested in them.

WRITING SAX DRIVERS AND FILTERS

       The Perl SAX API Reference is at
       <http://perl-xml.sourceforge.net/perl-sax/>.

       Writing SAX Filters is tremendously easy: all you need to do is inherit
       from this module, and define the events you want to handle. A more
       detailed explanation can be found at
       <http://www.xml.com/pub/a/2001/10/10/sax-filters.html>.

       Writing Drivers is equally simple. The one thing you need to pay
       attention to is NOT to call events on the handler yourself (this
       applies to Filters as well). For instance:

         package MyFilter;
         use base qw(XML::SAX::Base);

         sub start_element {
           my $self = shift;
           my $data = shift;
           # do something
           $self->{Handler}->start_element($data); # BAD
         }

       The above example works well as precisely that: an example. But it has
       several faults:

       1) It doesn't test whether the handler defines start_element. Perhaps
          the handler doesn't want to see that event, in which case you
          shouldn't throw it (otherwise it'll die).

       2) It doesn't check ContentHandler and then Handler (ie it doesn't
          look to see whether the user has requested events on a specific
          handler, falling back to the default one if not).

       3) If it did check all that, not only would the code be cumbersome
          (see this module's source to get an idea) but it would also
          probably have to check for a DocumentHandler (in case this were
          SAX1) and for AUTOLOADs potentially defined in all these packages.

       As you can tell, that would be fairly painful. Instead of going
       through that, simply remember to use code similar to the following:

         package MyFilter;
         use base qw(XML::SAX::Base);

         sub start_element {
           my $self = shift;
           my $data = shift;
           # do something to filter
           $self->SUPER::start_element($data); # GOOD (and easy) !
         }

       This way, once you've done your job you hand the ball back to
       XML::SAX::Base and it takes care of all those problems for you!

       Note that the above example doesn't apply to filters only; drivers
       benefit from exactly the same feature.
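
       To see the SUPER:: pattern end to end, here is a minimal sketch of a
       filter wired to a handler that implements only one callback. The
       package names MyUpperFilter and MyCollector are hypothetical, and the
       events are fired by hand purely for illustration; a real driver would
       generate them.

```perl
use strict;
use warnings;

# Hypothetical filter: upper-cases character data and lets
# XML::SAX::Base pass every other event through untouched.
package MyUpperFilter;
use base qw(XML::SAX::Base);

sub characters {
    my ($self, $data) = @_;
    # Copy the event hash so the caller's data is left unmodified.
    my %copy = (%$data, Data => uc $data->{Data});
    # Hand the event back to the base class for dispatching.
    $self->SUPER::characters(\%copy);
}

# Hypothetical handler that implements characters() and nothing else.
package MyCollector;

sub new { return bless { text => '' }, shift }

sub characters {
    my ($self, $data) = @_;
    $self->{text} .= $data->{Data};
    return;
}

package main;

my $collector = MyCollector->new;
my $filter    = MyUpperFilter->new(Handler => $collector);

# Fire a few events by hand (illustration only).
$filter->start_document({});
$filter->start_element({ Name => 'greeting', Attributes => {} });
$filter->characters({ Data => 'hello' });
$filter->end_element({ Name => 'greeting' });
$filter->end_document({});

print $collector->{text}, "\n";
```

       The collector never defines start_element or end_document, yet nothing
       dies: the base class skips events the handler does not implement. The
       script should print HELLO.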

METHODS

       A number of methods are defined within this class for the purpose of
       inheritance. Some probably don't need to be overridden (eg parse_file)
       but some clearly should be (eg parse). Options for these methods are
       described in the PerlSAX2 specification available from
       <http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/perl-xml/libxml-perl/doc/sax-2.0.html?rev=HEAD&content-type=text/html>.

       •   parse

           The parse method is the main entry point to parsing documents.
           Internally the parse method will detect what type of "thing" you
           are parsing, and call the appropriate method in your implementation
           class. Here is the mapping from the contents of the Source option
           to the method called (see the Perl SAX 2.0 specification for the
           meaning of these values):

             Source Contains           parse() calls
             ===============           =============
             CharacterStream (*)       _parse_characterstream($stream, $options)
             ByteStream                _parse_bytestream($stream, $options)
             String                    _parse_string($string, $options)
             SystemId                  _parse_systemid($string, $options)

           Note, however, that these methods may not be sensible if your
           driver class is not for parsing XML. An example might be a DBI
           driver that generates XML/SAX from a database table. If that is the
           case, you likely want to write your own parse() method.

           Also note that the Source may contain a PublicId entry and an
           Encoding entry as well. To get at these, examine $options->{Source}
           as passed to your method.

           (*) A CharacterStream is a filehandle that does not need any
           encoding translation done on it. This is implemented as a regular
           filehandle and only works under Perl 5.7.2 or higher using PerlIO.
           To get a single character, or a number of characters, from it, use
           Perl's core read() function. To get a single byte (or a number of
           bytes) from it, you can use sysread(). The encoding of the stream
           should be in the Encoding entry for the Source.

       •   parse_file, parse_uri, parse_string

           These are all convenience variations on parse(), and in fact simply
           set up the options before calling it. You probably don't need to
           override these.

       •   get_options

           This is a convenience method for reading options passed in SAX2
           style or, more generally, as either a hash or a hashref (it returns
           a hashref). You will probably want to use this method in your own
           implementations of parse() and of new().

       •   get_feature, set_feature

           These simply get and set features, and throw the appropriate
           exceptions defined in the specification if need be.

           If your subclass defines features not defined in this one, then you
           should override these methods in such a way that they check for
           your features first, and then call the base class's methods for
           features not defined by your class. An example would be:

             sub get_feature {
                 my $self = shift;
                 my $feat = shift;
                 if (exists $MY_FEATURES{$feat}) {
                     # handle the feature in various ways
                 }
                 else {
                     return $self->SUPER::get_feature($feat);
                 }
             }

           Currently this part is unimplemented.

       •   set_handler

           This method takes a handler type (Handler, ContentHandler, etc.)
           and a handler object as arguments, and changes the current handler
           for that handler type, while taking care of resetting the internal
           state that needs to be reset. This allows one to change a handler
           during a parse without running into problems (changing it on the
           parser object directly will most likely cause trouble).

       •   set_document_handler, set_content_handler, set_dtd_handler,
           set_lexical_handler, set_decl_handler, set_error_handler,
           set_entity_resolver

           These are just simple wrappers around set_handler(), and take a
           handler object as their argument. Internally they simply call
           set_handler with the correct arguments.

       •   get_handler

           The inverse of set_handler, this method takes an optional string
           containing a handler type (DTDHandler, ContentHandler, etc.;
           'Handler' is used if no type is passed). It returns a reference to
           the object registered for that handler type, or undef if that
           handler type is not set for the current driver/filter.

       •   get_document_handler, get_content_handler, get_dtd_handler,
           get_lexical_handler, get_decl_handler, get_error_handler,
           get_entity_resolver

           These are just simple wrappers around the get_handler() method, and
           take no arguments. Internally they simply call get_handler with the
           correct handler type name.

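       As an illustration of the parse() dispatch described above, here is a
       minimal sketch of a driver that does not parse XML at all: it turns
       each line of a string into a <line> element. The names MyLineDriver,
       MyLogger and the _element helper are hypothetical. The sketch relies
       only on the string argument passed to _parse_string and ignores any
       options.

```perl
use strict;
use warnings;

# Hypothetical driver: emits one <line> element per line of input.
package MyLineDriver;
use base qw(XML::SAX::Base);

# parse() is expected to call this when Source contains a String entry.
sub _parse_string {
    my ($self, $string) = @_;
    $self->start_document({});
    $self->start_element(_element('lines'));
    for my $line (split /\n/, $string) {
        $self->start_element(_element('line'));
        $self->characters({ Data => $line });
        $self->end_element(_element('line'));
    }
    $self->end_element(_element('lines'));
    return $self->end_document({});
}

# Build a minimal element event hash (no namespace, no attributes).
sub _element {
    my ($name) = @_;
    return {
        Name         => $name,
        LocalName    => $name,
        Prefix       => '',
        NamespaceURI => '',
        Attributes   => {},
    };
}

# Hypothetical handler that records the events it receives.
package MyLogger;

sub new { return bless { log => [] }, shift }

sub start_element {
    my ($self, $el) = @_;
    push @{ $self->{log} }, "<$el->{Name}>";
    return;
}

sub characters {
    my ($self, $data) = @_;
    push @{ $self->{log} }, $data->{Data};
    return;
}

sub end_element {
    my ($self, $el) = @_;
    push @{ $self->{log} }, "</$el->{Name}>";
    return;
}

package main;

my $logger = MyLogger->new;
my $driver = MyLineDriver->new(Handler => $logger);

# parse_string() sets up Source => { String => ... } and calls parse().
$driver->parse_string("a\nb");

my $output = join '', @{ $logger->{log} };
print "$output\n";
```

       Because the driver fires events through the inherited methods rather
       than calling the handler directly, the handler can be swapped with
       set_handler() without touching the driver code. The script should
       print <lines><line>a</line><line>b</line></lines>.
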
       It would be rather useless to describe all the methods that this module
       implements here. They are all the methods supported in SAX1 and SAX2.
       In case your memory is a little short, here is a list. The apparent
       duplicates are there so that both versions of SAX can be supported.

       •   start_document

       •   end_document

       •   start_element

       •   start_document

       •   end_document

       •   start_element

       •   end_element

       •   characters

       •   processing_instruction

       •   ignorable_whitespace

       •   set_document_locator

       •   start_prefix_mapping

       •   end_prefix_mapping

       •   skipped_entity

       •   start_cdata

       •   end_cdata

       •   comment

       •   entity_reference

       •   notation_decl

       •   unparsed_entity_decl

       •   element_decl

       •   attlist_decl

       •   doctype_decl

       •   xml_decl

       •   entity_decl

       •   attribute_decl

       •   internal_entity_decl

       •   external_entity_decl

       •   resolve_entity

       •   start_dtd

       •   end_dtd

       •   start_entity

       •   end_entity

       •   warning

       •   error

       •   fatal_error

TODO

         - more tests
         - conform to the "SAX Filters" and "Java and DOM compatibility"
           sections of the SAX2 document.

AUTHOR

       Kip Hampton (khampton@totalcinema.com) did most of the work, after
       porting it from XML::Filter::Base.

       Robin Berjon (robin@knowscape.com) pitched in with patches to make it
       usable as a base for drivers as well as filters, along with other
       patches.

       Matt Sergeant (matt@sergeant.org) wrote the original XML::Filter::Base,
       and patched a few things here and there, and imported it into the
       XML::SAX distribution.

SEE ALSO

       XML::SAX

perl v5.34.0                      2021-07-23                   BuildSAXBase(3)