XML::SAX::Base(3)     User Contributed Perl Documentation    XML::SAX::Base(3)

NAME

       XML::SAX::Base - Base class for SAX Drivers and Filters

SYNOPSIS

         package MyFilter;
         use XML::SAX::Base;
         @ISA = ('XML::SAX::Base');

DESCRIPTION

       This module has a very simple task - to be a base class for PerlSAX
       drivers and filters. Its default behaviour is to pass the input
       directly to the output unchanged. It can be useful to use this module
       as a base class so you don't have to, for example, implement the
       characters() callback.

       The main advantages it provides are correct dispatching of events
       (i.e. it checks for you that the handler has implemented the method
       in question, or has defined an AUTOLOAD), and the guarantee that
       filters will pass along events they do not implement to handlers
       downstream that might nevertheless be interested in them.
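
       As a minimal sketch of that dispatch behaviour, the following sends
       events through the base class to a handler that implements only one
       callback. CharsOnly is a hypothetical handler class invented for this
       illustration; it is not part of the distribution.

```perl
use strict;
use warnings;

# Hypothetical handler that only cares about character data.
package CharsOnly;
sub new { return bless { text => '' }, shift }
sub characters {
    my ($self, $data) = @_;
    $self->{text} .= $data->{Data};
}

package main;
use XML::SAX::Base;

my $handler = CharsOnly->new;
my $base    = XML::SAX::Base->new(Handler => $handler);

# CharsOnly defines neither start_element nor end_element. The base class
# checks for the method before dispatching, so these calls are skipped
# instead of dying.
$base->start_element({ Name => 'p' });
$base->characters({ Data => 'hello' });
$base->end_element({ Name => 'p' });

print $handler->{text}, "\n";   # prints "hello"
```

       Only the characters() event reaches the handler; the other two are
       dropped without any checking code on the caller's side.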

WRITING SAX DRIVERS AND FILTERS

       Writing SAX Filters is tremendously easy: all you need to do is inherit
       from this module, and define the events you want to handle. A more
       detailed explanation can be found at
       http://www.xml.com/pub/a/2001/10/10/sax-filters.html.

       Writing Drivers is equally simple. The one thing you need to pay
       attention to is NOT to call the handler's event methods yourself (this
       applies to Filters as well). For instance:

         package MyFilter;
         use base qw(XML::SAX::Base);

         sub start_element {
           my $self = shift;
           my $data = shift;
           # do something
           $self->{Handler}->start_element($data); # BAD
         }

       The above example works well as precisely that: an example. But it has
       several faults:

       1) It doesn't test whether the handler defines start_element. Perhaps
          the handler doesn't want to see that event, in which case you
          shouldn't throw it (otherwise it'll die).

       2) It doesn't check ContentHandler and then Handler (i.e. it doesn't
          look to see whether the user has requested events on a specific
          handler, falling back to the default one if not).

       3) If it did check all that, not only would the code be cumbersome
          (see this module's source to get an idea) but it would also
          probably have to check for a DocumentHandler (in case this were
          SAX1) and for AUTOLOADs potentially defined in all these packages.

       As you can tell, that would be fairly painful. Instead of going through
       that, simply remember to use code similar to the following:

         package MyFilter;
         use base qw(XML::SAX::Base);

         sub start_element {
           my $self = shift;
           my $data = shift;
           # do something to filter
           $self->SUPER::start_element($data); # GOOD (and easy) !
         }

       This way, once you've done your job you hand the ball back to
       XML::SAX::Base and it takes care of all those problems for you!
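
       The SUPER:: pattern can be tried end to end with a small
       self-contained sketch. Recorder and UpcaseFilter are hypothetical
       names chosen for this illustration; only XML::SAX::Base itself comes
       from the distribution.

```perl
use strict;
use warnings;

# Hypothetical downstream handler that records the element names it receives.
package Recorder;
sub new { return bless { seen => [] }, shift }
sub start_element {
    my ($self, $el) = @_;
    push @{ $self->{seen} }, $el->{Name};
}

# Hypothetical filter that upper-cases element names.
package UpcaseFilter;
use base qw(XML::SAX::Base);

sub start_element {
    my ($self, $el) = @_;
    # Work on a copy so the caller's hash is left untouched.
    my %copy = (%$el, Name => uc($el->{Name}));
    # Hand the event back to the base class, which finds the right handler.
    $self->SUPER::start_element(\%copy);
}

package main;

my $recorder = Recorder->new;
my $filter   = UpcaseFilter->new(Handler => $recorder);

$filter->start_element({ Name => 'title' });
# end_element is not overridden by the filter, so the base class passes it
# along; Recorder does not implement it, so it is skipped.
$filter->end_element({ Name => 'title' });

print join(',', @{ $recorder->{seen} }), "\n";   # prints "TITLE"
```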

       Note that the above example doesn't apply to filters only: drivers
       benefit from the exact same feature.
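
       For the driver side, a hedged sketch follows. LineDriver and Collector
       are hypothetical classes, and the driver is assumed to receive its
       input through the _parse_string hook that parse() calls for String
       sources (see METHODS). Because LineDriver does not override the event
       methods, calling them on $self goes straight to the base class
       dispatch.

```perl
use strict;
use warnings;

# Hypothetical handler that serialises the events it sees into a string.
package Collector;
sub new { return bless { out => '' }, shift }
sub start_element { my ($self, $el) = @_; $self->{out} .= "<$el->{Name}>" }
sub characters    { my ($self, $ch) = @_; $self->{out} .= $ch->{Data} }
sub end_element   { my ($self, $el) = @_; $self->{out} .= "</$el->{Name}>" }

# Hypothetical driver: turns each line of a string into a <line> element.
package LineDriver;
use base qw(XML::SAX::Base);

sub _parse_string {
    my ($self, $string) = @_;
    # Collector has no start_document/end_document; the base class skips them.
    $self->start_document({});
    for my $line (split /\n/, $string) {
        $self->start_element({ Name => 'line' });
        $self->characters({ Data => $line });
        $self->end_element({ Name => 'line' });
    }
    return $self->end_document({});
}

package main;

my $collector = Collector->new;
my $driver    = LineDriver->new(Handler => $collector);
$driver->parse_string("a\nb");

print $collector->{out}, "\n";   # prints "<line>a</line><line>b</line>"
```

       The driver never touches $self->{Handler} directly, so it keeps
       working if the caller supplies a ContentHandler instead.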

METHODS

       A number of methods are defined within this class for the purpose of
       inheritance. Some probably don't need to be overridden (e.g.
       parse_file) but some clearly should be (e.g. parse). Options for these
       methods are described in the PerlSAX2 specification available from
       http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/perl-xml/libxml-perl/doc/sax-2.0.html?rev=HEAD&content-type=text/html.

       * parse
           The parse method is the main entry point to parsing documents.
           Internally the parse method will detect what type of "thing" you
           are parsing, and call the appropriate method in your implementation
           class. Here is the mapping table of what is in the Source options
           (see the Perl SAX 2.0 specification for the meaning of these
           values):

             Source Contains           parse() calls
             ===============           =============
             CharacterStream (*)       _parse_characterstream($stream, $options)
             ByteStream                _parse_bytestream($stream, $options)
             String                    _parse_string($string, $options)
             SystemId                  _parse_systemid($string, $options)

           However note that these methods may not be sensible if your driver
           class is not for parsing XML. An example might be a DBI driver that
           generates XML/SAX from a database table. If that is the case, you
           likely want to write your own parse() method.

           Also note that the Source may contain both a PublicId entry, and an
           Encoding entry. To get at these, examine $options->{Source} as
           passed to your method.

           (*) A CharacterStream is a filehandle that does not need any
           encoding translation done on it. This is implemented as a regular
           filehandle and only works under Perl 5.7.2 or higher using PerlIO.
           To get a single character, or number of characters from it, use the
           perl core read() function. To get a single byte from it (or number
           of bytes), you can use sysread(). The encoding of the stream should
           be in the Encoding entry for the Source.

       * parse_file, parse_uri, parse_string
           These are all convenience variations on parse(), and in fact simply
           set up the options before calling it. You probably don't need to
           override these.

       * get_options
           This is a convenience method to get options in SAX2 style, or more
           generically either as hashes or as hashrefs (it returns a hashref).
           You will probably want to use this method in your own
           implementations of parse() and of new().

       * get_feature, set_feature
           These simply get and set features, and throw the appropriate
           exceptions defined in the specification if need be.

           If your subclass defines features not defined in this one, then you
           should override these methods in such a way that they check for
           your features first, and then call the base class's methods for
           features not defined by your class. An example would be:

             sub get_feature {
                 my $self = shift;
                 my $feat = shift;
                 if (exists $MY_FEATURES{$feat}) {
                     # handle the feature in various ways
                 }
                 else {
                     return $self->SUPER::get_feature($feat);
                 }
             }

           Currently this part is unimplemented.

       * set_handler
           This method takes a handler type (Handler, ContentHandler, etc.)
           and a handler object as arguments, and changes the current handler
           for that handler type, while taking care of resetting the internal
           state that needs to be reset. This allows one to change a handler
           during parse without running into problems (changing it on the
           parser object directly will most likely cause trouble).

       * set_document_handler, set_content_handler, set_dtd_handler,
       set_lexical_handler, set_decl_handler, set_error_handler,
       set_entity_resolver
           These are just simple wrappers around set_handler(), and take a
           handler object as their argument. Internally they simply call
           set_handler with the correct arguments.

       * get_handler
           The inverse of set_handler, this method takes an optional string
           containing a handler type (DTDHandler, ContentHandler, etc.;
           'Handler' is used if no type is passed). It returns a reference to
           the object that implements that handler type, or undef if that
           handler type is not set for the current driver/filter.

       * get_document_handler, get_content_handler, get_dtd_handler,
       get_lexical_handler, get_decl_handler, get_error_handler,
       get_entity_resolver
           These are just simple wrappers around the get_handler() method, and
           take no arguments. Internally they simply call get_handler with the
           correct handler type name.

       It would be rather useless to describe all the methods that this module
       implements here. They are all the methods supported in SAX1 and SAX2.
       In case your memory is a little short, here is a list. The apparent
       duplicates are there so that both versions of SAX can be supported.

       * start_document
       * end_document
       * start_element
       * start_document
       * end_document
       * start_element
       * end_element
       * characters
       * processing_instruction
       * ignorable_whitespace
       * set_document_locator
       * start_prefix_mapping
       * end_prefix_mapping
       * skipped_entity
       * start_cdata
       * end_cdata
       * comment
       * entity_reference
       * notation_decl
       * unparsed_entity_decl
       * element_decl
       * attlist_decl
       * doctype_decl
       * xml_decl
       * entity_decl
       * attribute_decl
       * internal_entity_decl
       * external_entity_decl
       * resolve_entity
       * start_dtd
       * end_dtd
       * start_entity
       * end_entity
       * warning
       * error
       * fatal_error
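
       The typed handler accessors and get_options described in this section
       can be exercised with a short sketch. FirstHandler and SecondHandler
       are hypothetical empty classes that exist only so there are objects to
       install; the example assumes nothing beyond the one-argument setters
       and no-argument getters documented above.

```perl
use strict;
use warnings;
use XML::SAX::Base;

# Hypothetical no-op handler classes, used only to have objects to install.
package FirstHandler;
sub new { return bless {}, shift }

package SecondHandler;
sub new { return bless {}, shift }

package main;

my $driver = XML::SAX::Base->new(ContentHandler => FirstHandler->new);

# Swap the content handler through the accessor rather than by assigning
# to $driver->{ContentHandler} directly.
$driver->set_content_handler(SecondHandler->new);
print ref($driver->get_content_handler), "\n";   # prints "SecondHandler"

# get_options accepts either a list of pairs or a hashref, and returns a
# hashref in both cases.
my $from_list = $driver->get_options(Source => { String => '<a/>' });
my $from_ref  = $driver->get_options({ Source => { String => '<a/>' } });

print $from_list->{Source}{String} eq $from_ref->{Source}{String}
    ? "same\n"
    : "different\n";
```

       Going through the setter matters mostly when a handler is replaced
       while a parse is in progress, since that is when stale internal state
       would otherwise be used.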

TODO

         - more tests
         - conform to the "SAX Filters" and "Java and DOM compatibility"
           sections of the SAX2 document.

AUTHOR

       Kip Hampton (khampton@totalcinema.com) did most of the work, after
       porting it from XML::Filter::Base.

       Robin Berjon (robin@knowscape.com) pitched in with patches to make it
       usable as a base for drivers as well as filters, along with other
       patches.

       Matt Sergeant (matt@sergeant.org) wrote the original XML::Filter::Base,
       and patched a few things here and there, and imported it into the
       XML::SAX distribution.

SEE ALSO

       XML::SAX

perl v5.8.8                       2007-02-13                 XML::SAX::Base(3)