XML::SAX::Writer(3)   User Contributed Perl Documentation  XML::SAX::Writer(3)

NAME

       XML::SAX::Writer - SAX2 XML Writer

VERSION

       version 0.57

SYNOPSIS

         use XML::SAX::Writer;
         use XML::SAX::SomeDriver;

         my $w = XML::SAX::Writer->new;
         my $d = XML::SAX::SomeDriver->new(Handler => $w);

         $d->parse('some options...');

DESCRIPTION

   Why yet another XML Writer?
       A new XML Writer was needed to match the SAX2 effort because, quite
       naturally, no existing writer understood SAX2. My first intention had
       been to start patching XML::Handler::YAWriter, as it had previously been
       my favourite writer in the SAX1 world.

       However, the more I patched it the more I realised that what I thought
       was going to be a simple patch (mostly adding a few event handlers and
       changing the attribute syntax) was turning into a rewrite due to
       various ideas I'd been collecting along the way. Besides, I couldn't
       find a way to elegantly make it work with SAX2 without breaking the
       SAX1 compatibility, which people are probably still relying on. There
       are of course ways to do that, but most require user interaction, which
       is something I wanted to avoid.

       So in the end there was a new writer. I think it's in fact better this
       way as it helps keep SAX1 and SAX2 separated.

METHODS

       •   new(%hash)

           This is the constructor for this object. It takes a number of
           parameters, all of which are optional.

       •   Output

           This parameter can be one of several things. If it is a simple
           scalar, it is interpreted as a filename which will be opened for
           writing. If it is a scalar reference, output will be appended to
           this scalar. If it is an array reference, output will be pushed
           onto this array as it is generated. If it is a filehandle, then
           output will be sent to this filehandle.

           Finally, it is possible to pass an object for this parameter, in
           which case it is assumed to be an object that implements the
           consumer interface described later in the documentation.

           If this parameter is not provided, then output is sent to STDOUT.

           Note that there is no means to set an encoding layer on filehandles
           created by this module; if this is necessary, the calling code
           should first open a filehandle with the appropriate encoding set,
           and pass that filehandle to this module.

       •   Escape

           This should be a hash reference where the keys are character
           sequences that should be escaped and the values are the escaped
           form of the sequence. By default, this module will escape the
           ampersand (&), less than (<), greater than (>), double quote ("),
           and apostrophe ('). Note that some browsers don't support the
           &apos; escape used for apostrophes, so you should be careful
           when outputting XHTML.

           If you only want to add entries to the Escape hash, you can first
           copy the contents of %XML::SAX::Writer::DEFAULT_ESCAPE.

       •   CommentEscape

           Comment content often needs to be escaped differently from other
           content. This option works exactly as the previous one except that
           by default it only escapes the double dash (--) and that the
           contents can be copied from %XML::SAX::Writer::COMMENT_ESCAPE.

       •   EncodeFrom

           The character set encoding in which incoming data will be provided.
           This defaults to UTF-8, which works for US-ASCII as well.

           Set this to "undef" if you do not wish to decode your data.

       •   EncodeTo

           The character set encoding in which output should be encoded.
           Again, this defaults to UTF-8.

           Set this to "undef" if you do not wish to encode your data.

       •   QuoteCharacter

           Set the character used to quote attributes. This defaults to single
           quotes (') for backwards compatibility.
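
       As a rough sketch of how these options combine, the following collects
       output in a scalar, extends the default escapes, and switches to
       double-quoted attributes. The events are sent by hand here instead of
       by a driver, and the element name, attribute name, and the extra tab
       escape are assumptions made up for illustration.

```perl
use strict;
use warnings;
use XML::SAX::Writer;

# Start from the documented defaults, then add one extra mapping.
# Escaping the tab character is an illustrative choice, not a default.
my %escape = ( %XML::SAX::Writer::DEFAULT_ESCAPE, "\t" => '&#9;' );

my $xml = '';
my $w = XML::SAX::Writer->new(
    Output         => \$xml,    # scalar reference: output is appended here
    Escape         => \%escape,
    QuoteCharacter => '"',      # double quotes instead of the default '
);

# Hand-rolled SAX2 events; a real program would normally let a driver
# (the "SomeDriver" of the SYNOPSIS) generate them.
my $el = {
    Name => 'note', LocalName => 'note', Prefix => '', NamespaceURI => '',
    Attributes => {
        '{}lang' => {
            Name => 'lang', LocalName => 'lang', Prefix => '',
            NamespaceURI => '', Value => 'en',
        },
    },
};

$w->start_document({});
$w->start_element($el);
$w->characters({ Data => "fish & chips\t< 5" });
$w->end_element($el);
$w->end_document({});

print $xml, "\n";
```

       Copying %XML::SAX::Writer::DEFAULT_ESCAPE into a fresh hash, as above,
       keeps the module's own defaults untouched for any other writer created
       in the same program.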

THE CONSUMER INTERFACE

       XML::SAX::Writer can receive pluggable consumer objects that will be in
       charge of writing out what is formatted by this module. Setting a
       Consumer is done by setting the Output option to the object of your
       choice instead of to an array, scalar, or file handle as is more
       commonly done (internally those in fact map to Consumer classes and are
       simply available as options for your convenience).

       If you don't understand this, don't worry. You don't need it most of
       the time.

       That object can be from any class, but is expected to provide the two
       methods described below. It is also strongly recommended that it
       inherits from XML::SAX::Writer::ConsumerInterface so that it will not
       break if that interface evolves over time. There are examples at the
       end of XML::SAX::Writer's code.

       The two methods are:

       •   output STRING

           (Required)

           This is called whenever the Writer wants to output a string
           formatted in XML. Encoding conversion, character escaping, and
           formatting have already taken place. It's up to the consumer to do
           whatever it wants with the string.

       •   finalize()

           (Optional)

           This is called once the document has been output in its entirety,
           during the end_document event. end_document will in fact return
           whatever finalize() returns, and that in turn should be returned by
           parse() for whatever parser was invoked. It might be useful if you
           need to provide feedback of some sort.

       Here's an example of a custom consumer.  Note the extra "$" signs in
       front of $self; the base class is optimized for the overwhelmingly
       common case where only one data member is required and $self is a
       reference to that data member.

           package MyConsumer;

           use strict;
           use XML::SAX::Writer;   # loads the module that provides the base class

           our @ISA = qw( XML::SAX::Writer::ConsumerInterface );

           sub new {
               my $self = shift->SUPER::new( my $output );

               $$self = '';      # Note the extra '$'

               return $self;
           }

           sub output {
               my $self = shift;
               $$self .= uc shift;   # upper-case each chunk before storing it
           }

           sub get_output {
               my $self = shift;
               return $$self;
           }

       And here is one way to use it:

           my $c = MyConsumer->new;
           my $w = XML::SAX::Writer->new( Output => $c );

           ## ... send events to $w ...

           print $c->get_output;

       If you need to store more than one data member, pass in an array or
       hash reference:

           my $self = shift->SUPER::new( {} );

       and access it like:

           sub output {
               my $self = shift;
               $$self->{Output} .= uc shift;
           }
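
       The optional finalize() method can be sketched along the same lines.
       The example below is only an illustration under stated assumptions:
       the class name CountingConsumer and its Calls member are hypothetical,
       the events are sent by hand, and the return value is relied upon only
       as far as the description of finalize() above promises.

```perl
use strict;
use warnings;
use XML::SAX::Writer;

package CountingConsumer;    # hypothetical class name

our @ISA = qw( XML::SAX::Writer::ConsumerInterface );

sub new {
    # A hash reference lets the consumer keep more than one data member.
    my $self = shift->SUPER::new( { Output => '', Calls => 0 } );
    return $self;
}

sub output {
    my $self = shift;
    $$self->{Output} .= shift;   # keep the text as it arrives
    $$self->{Calls}++;           # and count how often we were called
}

sub finalize {
    my $self = shift;
    return $$self->{Calls};      # handed back through end_document
}

package main;

my $c = CountingConsumer->new;
my $w = XML::SAX::Writer->new( Output => $c );

my $el = {
    Name => 'doc', LocalName => 'doc', Prefix => '',
    NamespaceURI => '', Attributes => {},
};

$w->start_document({});
$w->start_element($el);
$w->characters({ Data => 'hello' });
$w->end_element($el);
my $calls = $w->end_document({});

print "output() was called $calls times\n";
```

       Returning a small summary value from finalize() is one way to give
       feedback to the caller without exposing the consumer object itself.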

THE ENCODER INTERFACE

       Encoders can be plugged in to allow one to use one's favourite encoder
       object. Presently there are two encoders: Encode and NullEncoder. They
       need to implement two methods, and may inherit from
       XML::SAX::Writer::NullConverter if they wish to.

       new FROM_ENCODING, TO_ENCODING
           Creates a new Encoder. The arguments are the chosen encodings.

       convert STRING
           Converts that string and returns it.

       Note that the return value of the convert method is not checked. Output
       may be truncated if a character couldn't be converted correctly. To
       avoid problems, the encoder should take care of encoding errors itself,
       for example by raising an exception.
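
       A minimal sketch of an encoder that follows this advice is shown
       below. It is not one of the encoders shipped with this module: the
       class name My::StrictEncoder is hypothetical, it is built on the core
       Encode module, and it assumes that convert receives octets in the FROM
       encoding.

```perl
use strict;
use warnings;

package My::StrictEncoder;    # hypothetical class name

use Encode ();

# new FROM_ENCODING, TO_ENCODING
sub new {
    my ( $class, $from, $to ) = @_;
    return bless { From => $from, To => $to }, $class;
}

# convert STRING: dies instead of silently dropping or truncating data.
sub convert {
    my ( $self, $string ) = @_;

    # Work on a copy: with FB_CROAK, Encode may modify its input in place.
    my $octets = $string;
    my $chars  = Encode::decode( $self->{From}, $octets, Encode::FB_CROAK() );
    return Encode::encode( $self->{To}, $chars, Encode::FB_CROAK() );
}

package main;

my $enc = My::StrictEncoder->new( 'UTF-8', 'ISO-8859-1' );

# "caf" followed by the UTF-8 octets for e-acute converts cleanly.
my $latin1 = $enc->convert("caf\xC3\xA9");
print length($latin1), " octets after conversion\n";

# The euro sign has no ISO-8859-1 equivalent, so convert raises an exception.
my $ok = eval { $enc->convert("\xE2\x82\xAC"); 1 };
print "conversion failed: $@" unless $ok;
```

       Raising the exception inside convert means a failed conversion stops
       the writer at the offending chunk, which is usually easier to diagnose
       than a truncated document.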

CUSTOM OUTPUT

       This module is generally used to write XML -- which it does most of the
       time -- but just like the rest of SAX it can be used as a generic
       framework to output data, the opposite of a non-XML SAX parser.

       Of course there's only so much that one can abstract, so depending on
       your format this may or may not be useful. If it is, you'll need to
       know the following API (and probably to have a look inside
       "XML::SAX::Writer::XML", the default Writer).

       init
           Called before the writing starts, it's a chance for the subclass to
           do some initialisation if it needs it.

       setConverter
           This is used to set the proper converter for character encodings.
           The default implementation should suffice but you can override it.
           It must set "$self->{Encoder}" to an Encoder object. Subclasses
           *should* call it.

       setConsumer
           Same as above, except that it is for the Consumer object, and that
           it must set "$self->{Consumer}".

       setEscaperRegex
           Will initialise the escaping regex "$self->{EscaperRegex}" based on
           what is needed.

       escape STRING
           Takes a string and escapes it properly.

       setCommentEscaperRegex and escapeComment STRING
           These work exactly the same as the two above, except that they are
           meant to operate on comment contents, which often have different
           escaping rules than those that apply to regular content.
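
       To give an idea of what a regex-based escaper involves, here is a
       standalone sketch of the general technique: build one alternation from
       the keys of an escape hash, then substitute through the hash. It is
       not this module's implementation; the function names
       build_escaper_regex and escape_with, and the sample hash contents, are
       assumptions for illustration only.

```perl
use strict;
use warnings;

# Build a single regex that matches any key of the escape map.
# Longer sequences are tried first so that multi-character keys win.
sub build_escaper_regex {
    my ($map) = @_;
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) || $a cmp $b }
        keys %$map;
    return qr/($alternation)/;
}

# Replace every match with its escaped form from the map.
sub escape_with {
    my ( $regex, $map, $string ) = @_;
    $string =~ s/$regex/$map->{$1}/g;
    return $string;
}

# Illustrative maps: one for regular content, one for comment content.
my %content_escape = (
    '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
    '"' => '&quot;', "'" => '&apos;',
);
my %comment_escape = ( '--' => '&#45;&#45;' );

my $content_re = build_escaper_regex( \%content_escape );
my $comment_re = build_escaper_regex( \%comment_escape );

print escape_with( $content_re, \%content_escape, 'a < b & "c"' ), "\n";
print escape_with( $comment_re, \%comment_escape, 'a -- b' ),      "\n";
```

       Keeping content and comment escaping as two separate map/regex pairs
       mirrors the split between the escape and escapeComment methods above.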

TODO

           - proper UTF-16 handling

           - the formatting options need to be developed.

           - test, test, test (and then some tests)

           - doc, doc, doc (actually this part is in better shape)

           - remove the xml_decl and replace it with intelligent logic, as
           discussed on perl-xml

           - make the Consumer-selecting code available in the API, to avoid
           duplicating it

           - add an Apache output Consumer, triggered by passing $r as Output

CREDITS

       Michael Koehne (XML::Handler::YAWriter) for much inspiration and Barrie
       Slaymaker for the Consumer pattern idea, the coderef output option and
       miscellaneous bugfixes and performance tweaks. Of course the usual
       suspects (Kip Hampton and Matt Sergeant) helped in the usual ways.

SEE ALSO

       XML::SAX::*

AUTHORS

       •   Robin Berjon <robin@knowscape.com>

       •   Chris Prather <chris@prather.org>

COPYRIGHT AND LICENSE

       This software is copyright (c) 2014 by Robin Berjon.

       This is free software; you can redistribute it and/or modify it under
       the same terms as the Perl 5 programming language system itself.

perl v5.34.0                      2022-01-21               XML::SAX::Writer(3)