XML::SAX::Writer(3)    User Contributed Perl Documentation    XML::SAX::Writer(3)

NAME
XML::SAX::Writer - SAX2 Writer

SYNOPSIS
    use XML::SAX::Writer;
    use XML::SAX::SomeDriver;

    my $w = XML::SAX::Writer->new;
    my $d = XML::SAX::SomeDriver->new(Handler => $w);

    $d->parse('some options...');

DESCRIPTION
Why yet another XML Writer?

A new XML Writer was needed to match the SAX2 effort because, quite
naturally, no existing writer understood SAX2. My first intention had
been to start patching XML::Handler::YAWriter, as it had previously been
my favourite writer in the SAX1 world.

However, the more I patched it, the more I realised that what I thought
was going to be a simple patch (mostly adding a few event handlers and
changing the attribute syntax) was turning into a rewrite, due to
various ideas I'd been collecting along the way. Besides, I couldn't
find a way to elegantly make it work with SAX2 without breaking the
SAX1 compatibility that people are probably still relying on. There
are, of course, ways to do that, but most require user interaction,
which is something I wanted to avoid.

So in the end there was a new writer. I think it's in fact better this
way, as it helps keep SAX1 and SAX2 separated.

METHODS
· new(%hash)

    This is the constructor for this object. It takes a number of
    parameters, all of which are optional.

    · Output

        This parameter can be one of several things. If it is a simple
        scalar, it is interpreted as a filename which will be opened for
        writing. If it is a scalar reference, output will be appended to
        this scalar. If it is an array reference, output will be pushed
        onto this array as it is generated. If it is a filehandle, then
        output will be sent to this filehandle.

        Finally, it is possible to pass an object for this parameter, in
        which case it is assumed to be an object that implements the
        consumer interface described later in the documentation.

        If this parameter is not provided, then output is sent to STDOUT.

    · Escape

        This should be a hash reference where the keys are character
        sequences that should be escaped and the values are the escaped
        form of each sequence. By default, this module will escape the
        ampersand (&), less than (<), greater than (>), double quote ("),
        and apostrophe ('). Note that some browsers don't support the
        &apos; escape used for apostrophes, so you should be careful
        when outputting XHTML.

        If you only want to add entries to the Escape hash, you can first
        copy the contents of %XML::SAX::Writer::DEFAULT_ESCAPE.

    · CommentEscape

        Comment content often needs to be escaped differently from other
        content. This option works exactly as the previous one, except
        that by default it only escapes the double dash (--) and that the
        contents can be copied from %XML::SAX::Writer::COMMENT_ESCAPE.

    · EncodeFrom

        The character set encoding in which incoming data will be
        provided. This defaults to UTF-8, which works for US-ASCII as
        well.

    · EncodeTo

        The character set encoding in which output should be encoded.
        Again, this defaults to UTF-8.

    · QuoteCharacter

        Set the character used to quote attributes. This defaults to
        single quotes (') for backwards compatibility.
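
As a rough sketch of how these options fit together, the snippet below writes into a scalar reference and extends the default escape table before passing it as Escape. The SAX events are sent by hand instead of by a driver. The backtick mapping is purely illustrative (it is not a module default), and the event hashes carry only the minimal keys I assume the Writer needs.

```perl
use strict;
use warnings;
use XML::SAX::Writer;

# Copy the documented defaults, then add one extra mapping.
# The backtick entry is a made-up example, not something the module ships.
my %escape = (
    %XML::SAX::Writer::DEFAULT_ESCAPE,
    '`' => '&#96;',
);

# A scalar reference as Output: generated XML is appended to $xml.
my $xml = '';
my $w = XML::SAX::Writer->new(
    Output => \$xml,
    Escape => \%escape,
);

# Hand-rolled SAX2 events (assumption: minimal hashes are sufficient).
$w->start_document({});
$w->start_element({ Name => 'doc', Attributes => {} });
$w->characters({ Data => 'a < b' });
$w->end_element({ Name => 'doc' });
$w->end_document({});

print "$xml\n";
```

Passing an array reference or a filehandle as Output should work the same way from the caller's point of view; only the place where the text ends up changes.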

THE CONSUMER INTERFACE
XML::SAX::Writer can receive pluggable consumer objects that will be in
charge of writing out what is formatted by this module. Setting a
Consumer is done by setting the Output option to the object of your
choice instead of to an array, scalar, or file handle as is more
commonly done (internally those in fact map to Consumer classes and are
simply available as options for your convenience).

If you don't understand this, don't worry. You don't need it most of
the time.

That object can be from any class, but must have two methods in its
API. It is also strongly recommended that it inherits from
XML::SAX::Writer::ConsumerInterface so that it will not break if that
interface evolves over time. There are examples at the end of
XML::SAX::Writer's code.

The two methods that it needs to implement are:

· output STRING

    (Required)

    This is called whenever the Writer wants to output a string
    formatted in XML. Encoding conversion, character escaping, and
    formatting have already taken place. It's up to the consumer to do
    whatever it wants with the string.

· finalize()

    (Optional)

    This is called once the document has been output in its entirety,
    during the end_document event. end_document will in fact return
    whatever finalize() returns, and that in turn should be returned by
    parse() for whatever parser was invoked. It might be useful if you
    need to provide feedback of some sort.

Here's an example of a custom consumer. Note the extra "$" signs in
front of $self; the base class is optimized for the overwhelmingly
common case where only one data member is required and $self is a
reference to that data member.

    package MyConsumer;

    @ISA = qw( XML::SAX::Writer::ConsumerInterface );

    use strict;

    sub new {
        my $self = shift->SUPER::new( my $output );

        $$self = '';     # Note the extra '$'

        return $self;
    }

    # Append each chunk, upper-cased, to the scalar behind $self.
    sub output {
        my $self = shift;
        $$self .= uc shift;
    }

    # Return everything collected so far.
    sub get_output {
        my $self = shift;
        return $$self;
    }

And here's one way to use it:

    my $c = MyConsumer->new;
    my $w = XML::SAX::Writer->new( Output => $c );

    ## ... send events to $w ...

    print $c->get_output;

If you need to store more than one data member, pass in an array or
hash reference:

    my $self = shift->SUPER::new( {} );

and access it like:

    sub output {
        my $self = shift;
        $$self->{Output} .= uc shift;
    }
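
The finalize() half of the interface can be sketched with a plain class. This is only an illustration: CountingConsumer is a hypothetical name, it deliberately skips the recommended XML::SAX::Writer::ConsumerInterface base class so that it runs on its own, and the calls at the bottom stand in for what the Writer would do during a real document.

```perl
use strict;
use warnings;

package CountingConsumer;    # hypothetical class, for illustration only

sub new {
    my ($class) = @_;
    return bless { chunks => [], chars => 0 }, $class;
}

# Called for every formatted string; here we keep it and count characters.
sub output {
    my ($self, $string) = @_;
    push @{ $self->{chunks} }, $string;
    $self->{chars} += length $string;
    return;
}

# Called once at the end; whatever this returns is the feedback value.
sub finalize {
    my ($self) = @_;
    return {
        chars    => $self->{chars},
        document => join( '', @{ $self->{chunks} } ),
    };
}

package main;

# Stand-in for the Writer: emit three chunks, then finish the document.
my $c = CountingConsumer->new;
$c->output($_) for '<doc>', 'text', '</doc>';
my $result = $c->finalize;

print "$result->{chars} characters: $result->{document}\n";
```

Returning a small hash from finalize() is just one design choice; a consumer could equally return a filename, a status flag, or nothing at all.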

THE ENCODER INTERFACE
Encoders can be plugged in to allow one to use one's favourite encoder
object. Presently there are two encoders: Iconv and NullEncoder, and
one based on "Encode" ought to be out soon. They need to implement two
methods, and may inherit from XML::SAX::Writer::NullConverter if they
wish to:

new FROM_ENCODING, TO_ENCODING
    Creates a new Encoder. The arguments are the chosen encodings.

convert STRING
    Converts that string and returns it.
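
To make the two-method contract concrete, here is a minimal sketch of an encoder built on the core Encode module. MyEncoder is a hypothetical name, and the sketch assumes convert() receives bytes in the FROM encoding; it is not the encoder the module ships, and how the Writer picks an encoder class is not covered here.

```perl
use strict;
use warnings;

package MyEncoder;    # hypothetical class, for illustration only

use Encode ();

# new FROM_ENCODING, TO_ENCODING
sub new {
    my ($class, $from, $to) = @_;
    return bless { from => $from, to => $to }, $class;
}

# convert STRING: decode from the source encoding into Perl characters,
# then encode those characters into the target encoding.
sub convert {
    my ($self, $string) = @_;
    my $chars = Encode::decode( $self->{from}, $string );
    return Encode::encode( $self->{to}, $chars );
}

package main;

my $enc   = MyEncoder->new( 'iso-8859-1', 'utf-8' );
my $bytes = $enc->convert("caf\xE9");    # Latin-1 bytes for "cafe" with e-acute

printf "%d bytes after conversion\n", length $bytes;
```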

CUSTOM OUTPUT
This module is generally used to write XML -- which it does most of the
time -- but just like the rest of SAX it can be used as a generic
framework to output data, the opposite of a non-XML SAX parser.

Of course there's only so much that one can abstract, so depending on
your format this may or may not be useful. If it is, you'll need to
know the following API (and probably to have a look inside
"XML::SAX::Writer::XML", the default Writer).

init
    Called before the writing starts, it's a chance for the subclass to
    do some initialisation if it needs it.

setConverter
    This is used to set the proper converter for character encodings.
    The default implementation should suffice but you can override it.
    It must set "$self->{Encoder}" to an Encoder object. Subclasses
    *should* call it.

setConsumer
    Same as above, except that it is for the Consumer object, and that
    it must set "$self->{Consumer}".

setEscaperRegex
    Will initialise the escaping regex "$self->{EscaperRegex}" based on
    what is needed.

escape STRING
    Takes a string and escapes it properly.

setCommentEscaperRegex and escapeComment STRING
    These work exactly the same as the two above, except that they are
    meant to operate on comment contents, which often have different
    escaping rules than those that apply to regular content.
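
The relationship between an escape hash, an escaper regex, and escape() can be sketched in a few lines of plain Perl. This is not the module's actual code: build_escaper_regex and escape_with are hypothetical helpers, and the table simply mirrors the five default characters listed under the Escape option.

```perl
use strict;
use warnings;

# Assumed escape table, mirroring the defaults described for Escape.
my %escape = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&apos;',
);

# Build one alternation from the hash keys. Longer keys come first so
# that multi-character sequences win over their prefixes.
sub build_escaper_regex {
    my ($map) = @_;
    my $alt = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) or $a cmp $b }
        keys %$map;
    return qr/($alt)/;
}

# Replace every match with its escaped form from the table.
sub escape_with {
    my ($map, $re, $str) = @_;
    $str =~ s/$re/$map->{$1}/g;
    return $str;
}

my $re      = build_escaper_regex( \%escape );
my $escaped = escape_with( \%escape, $re, q{a < b & "c"} );

print "$escaped\n";    # prints: a &lt; b &amp; &quot;c&quot;
```

A comment escaper would follow the same pattern with a different table, for instance one whose only key is the double dash.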

TODO
- proper UTF-16 handling

- the formatting options need to be developed.

- test, test, test (and then some tests)

- doc, doc, doc (actually this part is in better shape)

- remove the xml_decl and replace it with intelligent logic, as
  discussed on perl-xml

- make the Consumer selecting code available in the API, to avoid
  duplicating it

- add an Apache output Consumer, triggered by passing $r as Output

CREDITS
Michael Koehne (XML::Handler::YAWriter) for much inspiration and Barrie
Slaymaker for the Consumer pattern idea, the coderef output option and
miscellaneous bugfixes and performance tweaks. Of course the usual
suspects (Kip Hampton and Matt Sergeant) helped in the usual ways.

AUTHOR
Robin Berjon, robin@knowscape.com

COPYRIGHT
Copyright (c) 2001-2006 Robin Berjon and Perl XML project. Some rights
reserved. This program is free software; you can redistribute it
and/or modify it under the same terms as Perl itself.

SEE ALSO
XML::SAX::*

perl v5.16.3                      2010-07-12                XML::SAX::Writer(3)