Writer(3)             User Contributed Perl Documentation            Writer(3)

NAME

       XML::SAX::Writer - SAX2 Writer

SYNOPSIS

         use XML::SAX::Writer;
         use XML::SAX::SomeDriver;

         my $w = XML::SAX::Writer->new;
         my $d = XML::SAX::SomeDriver->new(Handler => $w);

         $d->parse('some options...');

DESCRIPTION

       Why yet another XML Writer?

       A new XML Writer was needed to match the SAX2 effort because, quite
       naturally, no existing writer understood SAX2. My first intention had
       been to start patching XML::Handler::YAWriter, as it had previously
       been my favourite writer in the SAX1 world.

       However, the more I patched it, the more I realised that what I thought
       was going to be a simple patch (mostly adding a few event handlers and
       changing the attribute syntax) was turning into a rewrite, due to
       various ideas I'd been collecting along the way. Besides, I couldn't
       find a way to elegantly make it work with SAX2 without breaking the
       SAX1 compatibility that people are probably still relying on. There
       are of course ways to do that, but most require user interaction, which
       is something I wanted to avoid.

       So in the end there was a new writer. I think it's in fact better this
       way, as it helps keep SAX1 and SAX2 separated.

METHODS

       * new(%hash)
           This is the constructor for this object.  It takes a number of
           parameters, all of which are optional.

       -- Output
           This parameter can be one of several things.  If it is a simple
           scalar, it is interpreted as a filename which will be opened for
           writing.  If it is a scalar reference, output will be appended to
           this scalar.  If it is an array reference, output will be pushed
           onto this array as it is generated.  If it is a filehandle, then
           output will be sent to this filehandle.

           Finally, it is possible to pass an object for this parameter, in
           which case it is assumed to be an object that implements the
           consumer interface described later in the documentation.

           If this parameter is not provided, then output is sent to STDOUT.

       -- Escape
           This should be a hash reference where the keys are character
           sequences that should be escaped and the values are the escaped
           form of each sequence.  By default, this module will escape the
           ampersand (&), less than (<), greater than (>), double quote ("),
           and apostrophe ('). Note that some browsers don't support the
           &apos; escape used for apostrophes, so you should be careful
           when outputting XHTML.

           If you only want to add entries to the Escape hash, you can first
           copy the contents of %XML::SAX::Writer::DEFAULT_ESCAPE.

       -- CommentEscape
           Comment content often needs to be escaped differently from other
           content. This option works exactly like the previous one, except
           that by default it only escapes the double dash (--) and that the
           contents can be copied from %XML::SAX::Writer::COMMENT_ESCAPE.

       -- EncodeFrom
           The character set encoding in which incoming data will be provided.
           This defaults to UTF-8, which works for US-ASCII as well.

       -- EncodeTo
           The character set encoding in which output should be encoded.
           Again, this defaults to UTF-8.
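
       The constructor options above might be combined as in the following
       minimal sketch. It assumes that XML::SAX (which provides
       XML::SAX::ParserFactory and a default parser) is installed alongside
       this module; the input string and the variable names are illustrative
       only. The apostrophe is remapped to a numeric reference as one
       possible way around the &apos; caveat mentioned under Escape.

```perl
use strict;
use warnings;
use XML::SAX::Writer;
use XML::SAX::ParserFactory;   # assumption: XML::SAX is installed

# Start from the module's default escapes and override a single entry.
my %escape = (
    %XML::SAX::Writer::DEFAULT_ESCAPE,
    "'" => '&#39;',            # numeric reference instead of &apos;
);

my $xml = '';                  # output is appended to this scalar
my $w   = XML::SAX::Writer->new(
    Output     => \$xml,       # scalar reference
    Escape     => \%escape,
    EncodeFrom => 'utf-8',     # the documented default, spelled out
    EncodeTo   => 'utf-8',     # the documented default, spelled out
);

# Any SAX2 driver would do; the factory picks whichever parser is available.
my $p = XML::SAX::ParserFactory->parser( Handler => $w );
$p->parse_string(q{<doc>it's 1 &lt; 2</doc>});

print $xml, "\n";
```

       Passing an array reference or a filehandle as Output instead of \$xml
       should only change where the generated text ends up.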

THE CONSUMER INTERFACE

       XML::SAX::Writer can receive pluggable consumer objects that will be in
       charge of writing out what is formatted by this module. Setting a
       Consumer is done by setting the Output option to the object of your
       choice instead of to an array, scalar, or file handle as is more
       commonly done (internally those in fact map to Consumer classes and are
       simply available as options for your convenience).

       If you don't understand this, don't worry. You don't need it most of
       the time.

       That object can be from any class, but its API is expected to provide
       two methods. It is also strongly recommended that it inherits from
       XML::SAX::Writer::ConsumerInterface so that it will not break if that
       interface evolves over time. There are examples at the end of
       XML::SAX::Writer's code.

       The two methods are:

       * output STRING
           (Required)

           This is called whenever the Writer wants to output a string
           formatted in XML. Encoding conversion, character escaping, and
           formatting have already taken place. It's up to the consumer to do
           whatever it wants with the string.

       * finalize()
           (Optional)

           This is called once the document has been output in its entirety,
           during the end_document event. end_document will in fact return
           whatever finalize() returns, and that in turn should be returned by
           parse() for whatever parser was invoked. It might be useful if you
           need to provide feedback of some sort.

       Here's an example of a custom consumer.  Note the extra "$" signs in
       front of $self; the base class is optimized for the overwhelmingly
       common case where only one data member is required and $self is a
       reference to that data member.

           package MyConsumer;

           use strict;

           our @ISA = qw( XML::SAX::Writer::ConsumerInterface );

           sub new {
               my $self = shift->SUPER::new( my $output );

               $$self = '';      # Note the extra '$'

               return $self;
           }

           sub output {
               my $self = shift;
               $$self .= uc shift;
           }

           sub get_output {
               my $self = shift;
               return $$self;
           }

       And here's one way to use it:

           my $c = MyConsumer->new;
           my $w = XML::SAX::Writer->new( Output => $c );

           ## ... send events to $w ...

           print $c->get_output;

       If you need to store more than one data member, pass in an array or
       hash reference:

               my $self = shift->SUPER::new( {} );

       and access it like:

           sub output {
               my $self = shift;
               $$self->{Output} .= uc shift;
           }
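
       The optional finalize() hook described above could be used as in this
       minimal sketch. CountingConsumer and its Chunks and Length members are
       hypothetical names chosen for illustration, and the example assumes
       that XML::SAX is installed to supply a parser. Whether the summary
       actually reaches the caller depends on the parser passing along the
       return value of end_document, as the text above says it should.

```perl
use strict;
use warnings;
use XML::SAX::Writer;          # also defines the ConsumerInterface base class
use XML::SAX::ParserFactory;   # assumption: XML::SAX is installed

package CountingConsumer;      # hypothetical name, for illustration only
use parent -norequire, 'XML::SAX::Writer::ConsumerInterface';

sub new {
    my $class = shift;
    # More than one data member, so hand the base class a hash reference.
    return $class->SUPER::new( { Chunks => 0, Length => 0 } );
}

# Required: called with each already-formatted piece of output.
sub output {
    my ( $self, $string ) = @_;
    $$self->{Chunks}++;
    $$self->{Length} += length $string;
    return;
}

# Optional: called once, when the document has been written in full.
sub finalize {
    my $self = shift;
    return sprintf '%d chunks, %d characters',
        $$self->{Chunks}, $$self->{Length};
}

package main;

my $c = CountingConsumer->new;
my $w = XML::SAX::Writer->new( Output => $c );
my $p = XML::SAX::ParserFactory->parser( Handler => $w );

my $summary = $p->parse_string('<doc><item>one</item></doc>');
print defined $summary ? "$summary\n" : "the parser returned nothing\n";
```

       This consumer discards the text itself and only keeps counters, which
       is enough to show the calling sequence: output() any number of times,
       then finalize() once.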

THE ENCODER INTERFACE

       Encoders can be plugged in to allow one to use one's favourite encoder
       object. Presently there are two encoders: Iconv and NullEncoder, and
       one based on "Encode" ought to be out soon. They need to implement two
       methods, and may inherit from XML::SAX::Writer::NullConverter if they
       wish to:

       new FROM_ENCODING, TO_ENCODING
           Creates a new Encoder. The arguments are the chosen encodings.

       convert STRING
           Converts that string and returns it.
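
       As an illustration of the two-method interface above, here is a
       minimal sketch of an encoder built on Perl's core Encode module.
       MyEncodeEncoder is a hypothetical class written for this example; it
       is not the encoder shipped with this module, and it assumes that
       convert() receives octets in the source encoding.

```perl
use strict;
use warnings;
use Encode ();

package MyEncodeEncoder;       # hypothetical name, for illustration only

# new FROM_ENCODING, TO_ENCODING
sub new {
    my ( $class, $from, $to ) = @_;
    return bless { From => $from, To => $to }, $class;
}

# convert STRING: decode the octets from the source encoding into Perl
# characters, then encode those characters into the target encoding.
sub convert {
    my ( $self, $string ) = @_;
    my $chars = Encode::decode( $self->{From}, $string );
    return Encode::encode( $self->{To}, $chars );
}

package main;

my $enc    = MyEncodeEncoder->new( 'utf-8', 'iso-8859-1' );
my $latin1 = $enc->convert("caf\xC3\xA9");   # UTF-8 octets for "cafe" with e-acute

printf "%d octets after conversion\n", length $latin1;
```

       How such an object gets attached to a writer is covered by
       setConverter in the next section.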

CUSTOM OUTPUT

       This module is generally used to write XML -- which it does most of the
       time -- but just like the rest of SAX it can be used as a generic
       framework to output data, the opposite of a non-XML SAX parser.

       Of course there's only so much that one can abstract, so depending on
       your format this may or may not be useful. If it is, you'll need to
       know the following API (and probably to have a look inside
       "XML::SAX::Writer::XML", the default Writer).

       init
           Called before the writing starts, it's a chance for the subclass to
           do some initialisation if it needs it.

       setConverter
           This is used to set the proper converter for character encodings.
           The default implementation should suffice but you can override it.
           It must set $self->{Encoder} to an Encoder object. Subclasses
           *should* call it.

       setConsumer
           Same as above, except that it is for the Consumer object, and that
           it must set $self->{Consumer}.

       setEscaperRegex
           Will initialise the escaping regex $self->{EscaperRegex} based on
           what is needed.

       escape STRING
           Takes a string and escapes it properly.

       setCommentEscaperRegex and escapeComment STRING
           These work exactly the same as the two above, except that they are
           meant to operate on comment contents, which often have different
           escaping rules than those that apply to regular content.
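
       To make the relationship between an Escape hash, setEscaperRegex and
       escape more concrete, here is a minimal standalone sketch of how an
       escaping regex could be derived from such a hash and then applied. It
       is an assumption about the general technique, not a copy of this
       module's code; escape_text, %escape and $regex are names invented for
       the example.

```perl
use strict;
use warnings;

# Same shape as the Escape option: sequence => escaped form.
my %escape = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&apos;',
);

# Build one alternation out of the keys. Longer keys come first so that a
# multi-character sequence (such as '--' for comments) wins over a prefix.
my $alternation = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) } keys %escape;
my $regex = qr/($alternation)/;

# Replace every matched sequence with its escaped form in a single pass,
# so text that has already been substituted is never escaped twice.
sub escape_text {
    my ($string) = @_;
    $string =~ s/$regex/$escape{$1}/g;
    return $string;
}

my $escaped = escape_text(q{a < b && "c"});
print "$escaped\n";
```

       A comment escaper could follow the same pattern with a different hash,
       which is presumably why the two pairs of methods are kept separate.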

TODO

           - proper UTF-16 handling

           - make the quote character an option. By default it is ' here, but
           I know that a lot of people (for reasons I don't understand but
           won't question :-) prefer to use ". (On most keyboards " is more
           typing; on the rest it's often as much typing.)

           - the formatting options need to be developed.

           - test, test, test (and then some tests)

           - doc, doc, doc (actually this part is in better shape)

           - add support for Perl 5.7's Encode module so that we can use it
           instead of Text::Iconv. Encode is more complete and likely to be
           better supported overall. This will be done using a pluggable
           encoder (so that users can provide their own if they want to),
           detected both in the Makefile.PL requirements and in the module
           at runtime.

           - remove the xml_decl and replace it with intelligent logic, as
           discussed on perl-xml

           - make the Consumer-selecting code available in the API, to avoid
           duplicating it

           - add an Apache output Consumer, triggered by passing $r as Output

CREDITS

       Michael Koehne (XML::Handler::YAWriter) for much inspiration and Barrie
       Slaymaker for the Consumer pattern idea, the coderef output option and
       miscellaneous bugfixes and performance tweaks. Of course the usual
       suspects (Kip Hampton and Matt Sergeant) helped in the usual ways.

AUTHOR

       Robin Berjon, robin@knowscape.com

       Copyright (c) 2001-2006 Robin Berjon and Perl XML project. All rights
       reserved.  This program is free software; you can redistribute it
       and/or modify it under the same terms as Perl itself.

SEE ALSO

       XML::SAX::*



perl v5.8.8                       2007-03-22                         Writer(3)