1PerlData(3) User Contributed Perl Documentation PerlData(3)
2
3
4
6 XML::Generator::PerlData - Perl extension for generating SAX2 events
7 from nested Perl data structures.
8
    use XML::Generator::PerlData;
    use SomeSAX2HandlerOrFilter;

    ## Simple style ##

    # get a deeply nested Perl data structure...
    my $hash_ref = $obj->getScaryNestedDataStructure();

    # create an instance of a handler class to forward events to...
    my $handler = SomeSAX2HandlerOrFilter->new();

    # create an instance of the PerlData driver...
    my $driver = XML::Generator::PerlData->new( Handler => $handler );

    # generate XML from the data structure...
    $driver->parse( $hash_ref );


    ## Or, Stream style ##

    use XML::Generator::PerlData;
    use SomeSAX2HandlerOrFilter;

    # create an instance of a handler class to forward events to...
    my $handler = SomeSAX2HandlerOrFilter->new();

    # create an instance of the PerlData driver...
    my $driver = XML::Generator::PerlData->new( Handler => $handler );

    # start the event stream...
    $driver->parse_start();

    # pass the data through in chunks
    # (from a database handle here)
    while ( my $array_ref = $dbd_sth->fetchrow_arrayref ) {
        $driver->parse_chunk( $array_ref );
    }

    # end the event stream...
    $driver->parse_end();
50
51 and you're done...
52
54 XML::Generator::PerlData provides a simple way to generate SAX2 events
55 from nested Perl data structures, while providing finer-grained control
56 over the resulting document streams.
57
58 Processing comes in two flavors: Simple Style and Stream Style:
59
In a nutshell, 'simple style' is best used for those cases where you
have a single Perl data structure that you want to convert to XML as
quickly and painlessly as possible. 'Stream style' is more useful for
cases where you are receiving chunks of data (for example, from a DBI
handle) and you want to process those chunks as they appear. See
PROCESSING METHODS for more info about how each style works.
66
68 new
69 (class constructor)
70
71 Accepts: An optional hash of configuration options.
72
73 Returns: A new instance of the XML::Generator::PerlData class.
74
75 Creates a new instance of XML::Generator::PerlData.
76
77 While basic usage of this module is designed to be simple and
78 straightforward, there is a small host of options available to help
79 ensure that the SAX event streams (and by extension the XML documents)
80 that are created from the data structures you pass are in just the
81 format that you want.
82
83 Options
84
85 • Handler (required)
86
87 XML::Generator::PerlData is a SAX Driver/Generator. As such, it
88 needs a SAX Handler or Filter class to forward its events to. The
89 value for this option must be an instance of a SAX2-aware Handler
90 or Filter.
91
92 • rootname (optional)
93
94 Sets the name of the top-level (root) element. The default is
95 'document'.
96
97 • defaultname (optional)
98
99 Sets the default name to be used for elements when no other logical
100 name is available (think lists-of-lists). The default is 'default'.
101
102 • keymap (optional)
103
Often, the names of the keys in a given hash do not map directly to
the XML element names that you want to appear in the resulting
document. This option contains a set of keyname->element name
mappings for the current process.
108
109 • skipelements (optional)
110
111 Passed in as an array reference, this option sets the internal list
112 of keynames that will be skipped over during processing. Note that
113 any descendant structures belonging to those keys will also be
114 skipped.
115
116 • attrmap (optional)
117
118 Used to determine which 'children' of a given hash key/element-name
119 will be forwarded as attributes of that element rather than as
120 child elements.
121
122 (see CAVEATS for a discussion of the limitations of this method.)
123
124 • namespaces (optional)
125
126 Sets the internal list of namespace/prefix pairs for the current
127 process. It takes the form of a hash, where the keys are the URIs
128 of the given namespace and the values are the associated prefix.
129
130 To set a default (unprefixed) namespace, set the prefix to
131 '#default'.
132
133 • namespacemap (optional)
134
135 Sets which elements in the result will be bound to which declared
136 namespaces. It takes the form of a hash of key/value pairs where
137 the keys are one of the declared namespace URIs that are relevant
138 to the current process and the values are either single key/element
139 names or an array reference of key/element names.
140
141 • skiproot (optional)
142
143 When set to a defined value, this option blocks the generator from
144 adding the top-level root element when parse() or parse_start() and
145 parse_end() are called.
146
Do not use this option unless you are absolutely sure you know what
you are doing and why, since the resulting event stream will most
likely produce non-well-formed XML.
150
151 • bindattrs (optional)
152
When set to a defined value, this option tells the generator to
bind attributes to the same namespace as the element that contains
them. By default, attributes will be unbound and unprefixed.
156
157 • processing_instructions (optional)
158
This option provides a way to include XML processing instruction
events in the generated stream before the root element is emitted.
The value of this key can be either a hash reference or an array
reference of hash references. For example, when connected to
XML::SAX::Writer:

    $pd->new( Handler => $writer_instance,
              rootname => 'document',
              processing_instructions => {
                  'xml-stylesheet' => {
                      href => '/path/to/stylesheet.xsl',
                      type => 'text/xsl',
                  },
              });
173
174 would generate
175
176 <?xml version="1.0"?>
177 <?xml-stylesheet href="/path/to/stylesheet.xsl" type="text/xsl" ?>
178 <document>
179 ...
180
Where multiple processing instructions will have the same target
and/or where the document order of those PIs matters, an array
reference should be used instead. For example:

    $pd->new( Handler => $writer_instance,
              rootname => 'document',
              processing_instructions => [
                  'xml-stylesheet' => {
                      href => '/path/to/stylesheet.xsl',
                      type => 'text/xsl',
                  },
                  'xml-stylesheet' => {
                      href => '/path/to/second/stylesheet.xsl',
                      type => 'text/xsl',
                  },
              ]);
198
199 would produce:
200
201 <?xml version="1.0"?>
202 <?xml-stylesheet href="/path/to/stylesheet.xsl" type="text/xsl" ?>
203 <?xml-stylesheet href="/path/to/second/stylesheet.xsl" type="text/xsl" ?>
204 <document>
205 ...
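
As a rough illustration of how several of the options above can be
combined, the sketch below passes rootname, keymap, skipelements and
attrmap to the constructor and serializes the resulting events with
XML::SAX::Writer. The data, the key names and the 'catalog' root name
are invented for this example, and the exact serialization (quoting,
element order within a hash) depends on the writer and on Perl's hash
ordering.

```perl
use strict;
use warnings;
use XML::Generator::PerlData;
use XML::SAX::Writer;

# Collect the serialized XML in a scalar (the writer accepts a scalar ref).
my $xml    = '';
my $writer = XML::SAX::Writer->new( Output => \$xml );

my $driver = XML::Generator::PerlData->new(
    Handler      => $writer,
    rootname     => 'catalog',                  # instead of 'document'
    keymap       => { book_title => 'title' },  # rename a hash key
    skipelements => ['internal_id'],            # drop this key entirely
    attrmap      => { book => ['isbn'] },       # 'isbn' becomes an attribute of book
);

# Hypothetical data, made up for this example.
my %data = (
    book => {
        isbn        => '0-000-00000-0',
        book_title  => 'Example Title',
        internal_id => 17,
    },
);

$driver->parse( \%data );
print $xml, "\n";
```

The same options could also be set after construction through the
accessor methods of the same names.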
206
PROCESSING METHODS
208 Simple style processing
209 parse
210 Accepts: A reference to a Perl data structure. Optionally, a hash
211 of config options.
212
213 Returns: [none]
214
215 The core method used during 'simple style' processing, this method
216 accepts a reference to a Perl data structure and, based on the
217 options passed, produces a stream of SAX events that can be used to
transform that structure into XML. The optional second argument is
a hash of config options identical to those detailed in the OPTIONS
section of the new() constructor description.
221
222 Examples:
223
224 $pd->parse( \%my_hash );
225
226 $pd->parse( \%my_hash, rootname => 'recordset' );
227
228 $pd->parse( \@my_list, %some_options );
229
230 $pd->parse( $my_hashref );
231
232 $pd->parse( $my_arrayref, keymap => { default => ['foo', 'bar', 'baz'] } );
233
234 Stream style processing
235 parse_start
236 Accepts: An optional hash of config options.
237
238 Returns: [none]
239
Starts the SAX event stream and (unless configured not to) fires
the event that opens the top-level root element. The optional
argument is a hash of config options identical to those detailed in
the OPTIONS section of the new() constructor description.
244
245 Example:
246
247 $pd->parse_start();
248
249 parse_end
250 Accepts: [none].
251
252 Returns: Varies. Returns what the final Handler returns.
253
254 Ends the SAX event stream and (unless configured not to) fires the
255 event to close the top-level root element.
256
257 Example:
258
259 $pd->parse_end();
260
261 parse_chunk
262 Accepts: A reference to a Perl data structure.
263
264 Returns: [none]
265
266 The core method used during 'stream style' processing, this method
267 accepts a reference to a Perl data structure and, based on the
268 options passed, produces a stream of SAX events that can be used to
269 transform that structure into XML.
270
271 Examples:
272
273 $pd->parse_chunk( \%my_hash );
274
275 $pd->parse_chunk( \@my_list );
276
277 $pd->parse_chunk( $my_hashref );
278
279 $pd->parse_chunk( $my_arrayref );
280
282 All config options can be passed to calls to the new() constructor
283 using the typical "hash of named properties" syntax. The methods below
284 offer direct access to the individual options (or ways to add/remove
285 the smaller definitions contained by those options).
286
287 init
288 Accepts: The same configuration options that can be passed to the
289 new() constructor.
290
291 Returns: [none]
292
293 See the list of OPTIONS above in the definition of new() for
294 details.
295
296 rootname
297 Accepts: A string or [none].
298
299 Returns: The current root name.
300
301 When called with an argument, this method sets the name of the top-
302 level (root) element. It always returns the name of the current (or
303 new) root name.
304
305 Examples:
306
307 $pd->rootname( $new_name );
308
309 my $current_root = $pd->rootname();
310
311 defaultname
312 Accepts: A string or [none]
313
314 Returns: The current default element name.
315
316 When called with an argument, this method sets the name of the
317 default element. It always returns the name of the current (or new)
318 default name.
319
320 Examples:
321
322 $pd->defaultname( $new_name );
323
324 my $current_default = $pd->defaultname();
325
326 keymap
327 Accepts: A hash (or hash reference) containing a series of
328 keyname->elementname mappings or [none].
329
330 Returns: The current keymap hash (as a plain hash, or hash
331 reference depending on caller context).
332
333 When called with a hash (hash reference) as its argument, this
334 method sets/resets the entire internal keyname->elementname
335 mappings definitions (where 'keyname' means the name of a given key
336 in the hash and 'elementname' is the name used when firing SAX
337 events for that key).
338
In addition to simple name->othername mappings, the value of a
keymap option can also be a reference to a subroutine (or an
anonymous sub). The keyname will be passed as the sole argument to
this subroutine and the sub is expected to return the new element
name. In the case of nested arrayrefs, no keyname will be passed,
but you can still generate the name from scratch.

Extending that idea, keymap will also accept a default mapping
using the key '*' that will be applied to all elements that do not
have an explicit mapping configured.
349
350 To add new mappings or remove existing ones without having to reset
351 the whole list of mappings, see add_keymap() and delete_keymap()
352 respectively.
353
If you are using "stream style" processing, this method should be
used with caution since altering this mapping during processing may
result in not-well-formed XML.
357
358 Examples:
359
360 $pd->keymap( keyname => 'othername',
361 anotherkey => 'someothername' );
362
363 $pd->keymap( \%mymap );
364
    # make all tags lower case
    $pd->keymap( '*' => sub { return lc( $_[0] ); } );

    # process keys named 'keyname' with a local sub
    $pd->keymap( keyname => \&my_namer );
370
371 my %kmap_hash = $pd->keymap();
372
373 my $kmap_hashref = $pd->keymap();
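
To make the interaction between an explicit mapping and the '*'
default more concrete, here is a minimal sketch. It assumes
XML::SAX::Writer as the handler; the keys 'ID' and 'Name' and their
values are invented for the example.

```perl
use strict;
use warnings;
use XML::Generator::PerlData;
use XML::SAX::Writer;

my $xml    = '';
my $driver = XML::Generator::PerlData->new(
    Handler => XML::SAX::Writer->new( Output => \$xml ),
);

# An explicit mapping for 'ID', plus a '*' fallback that lower-cases
# every key that has no mapping of its own.
$driver->keymap(
    ID  => 'identifier',
    '*' => sub { return lc $_[0] },
);

# Hypothetical data: 'ID' uses its explicit mapping, 'Name' falls
# through to the '*' sub.
$driver->parse( { ID => 7, Name => 'Widget' } );
print $xml, "\n";
```

With this setup the 'ID' key should come out as an 'identifier'
element and the 'Name' key as a 'name' element.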
374
375 add_keymap
376 Accepts: A hash (or hash reference) containing a series of
377 keyname->elementname mappings.
378
379 Returns: [none]
380
381 Adds a series of keyname->elementname mappings (where 'keyname'
382 means the name of a given key in the hash and 'elementname' is the
383 name used when firing SAX events for that key).
384
385 Examples:
386
387 $pd->add_keymap( keyname => 'othername' );
388
389 $pd->add_keymap( \%hash_of_mappings );
390
391 delete_keymap
392 Accepts: A list (or array reference) of element/keynames.
393
394 Returns: [none]
395
396 Deletes a list of keyname->elementname mappings (where 'keyname'
397 means the name of a given key in the hash and 'elementname' is the
398 name used when firing SAX events for that key).
399
400 This method should be used with caution since altering this mapping
401 during processing may result in not-well-formed XML.
402
403 Examples:
404
405 $pd->delete_keymap( 'some', 'key', 'names' );
406
407 $pd->delete_keymap( \@keynames );
408
409 skipelements
410 Accepts: A list (or array reference) containing a series of
411 key/element names or [none].
412
413 Returns: The current skipelements array (as a plain list, or array
414 reference depending on caller context).
415
416 When called with an array (array reference) as its argument, this
417 method sets/resets the entire internal skipelement definitions
418 (which determines which keys will not be 'parsed' during
419 processing).
420
421 To add new mappings or remove existing ones without having to reset
422 the whole list of mappings, see add_skipelements() and
423 delete_skipelements() respectively.
424
425 Examples:
426
427 $pd->skipelements( 'elname', 'othername', 'thirdname' );
428
429 $pd->skipelements( \@skip_names );
430
431 my @skiplist = $pd->skipelements();
432
433 my $skiplist_ref = $pd->skipelements();
434
435 add_skipelements
436 Accepts: A list (or array reference) containing a series of
437 key/element names.
438
439 Returns: [none]
440
441 Adds a list of key/element names to skip during processing.
442
443 Examples:
444
445 $pd->add_skipelements( 'some', 'key', 'names' );
446
447 $pd->add_skipelements( \@keynames );
448
449 delete_skipelements
450 Accepts: A list (or array reference) containing a series of
451 key/element names.
452
453 Returns: [none]
454
455 Deletes a list of key/element names to skip during processing.
456
457 Examples:
458
459 $pd->delete_skipelements( 'some', 'key', 'names' );
460
461 $pd->delete_skipelements( \@keynames );
462
463 charmap
464 Accepts: A hash (or hash reference) containing a series of
465 parent/child keyname pairs or [none].
466
467 Returns: The current charmap hash (as a plain hash, or hash
468 reference depending on caller context).
469
When called with a hash (hash reference) as its argument, this
method sets/resets the entire internal
keyname/elementname->characters children mapping definitions
(where 'keyname' means the name of a given key in the hash and
'characters children' is a list containing the nested keynames that
should be passed as the text children of the element named
'keyname', instead of being processed as child elements or
attributes).
478
479 To add new mappings or remove existing ones without having to reset
480 the whole list of mappings, see add_charmap() and delete_charmap()
481 respectively.
482
483 See CAVEATS for the limitations that relate to this method.
484
485 Examples:
486
    $pd->charmap( elname => ['list', 'of', 'nested', 'keynames'] );
488
489 $pd->charmap( \%mymap );
490
491 my %charmap_hash = $pd->charmap();
492
493 my $charmap_hashref = $pd->charmap();
494
495 add_charmap
496 Accepts: A hash or hash reference containing a series of
497 parent/child keyname pairs.
498
499 Returns: [none]
500
501 Adds a series of parent-key -> child-key relationships that define
502 which of the possible child keys will be processed as text children
503 of the created 'parent' element.
504
505 Examples:
506
507 $pd->add_charmap( parentname => ['list', 'of', 'child', 'keys'] );
508
509 $pd->add_charmap( parentname => 'childkey' );
510
511 $pd->add_charmap( \%parents_and_kids );
512
513 delete_charmap
514 Accepts: A list (or array reference) of element/keynames.
515
516 Returns: [none]
517
Deletes a list of parent-key -> child-key relationships from the
instance-wide hash of "parent->nested names to pass as text
children" definitions. If you need to alter the list of child names
(without deleting the parent key) use add_charmap() to reset the
parent-key's definition.
523
524 Examples:
525
526 $pd->delete_charmap( 'some', 'parent', 'keys' );
527
528 $pd->delete_charmap( \@parentkeynames );
529
530 attrmap
531 Accepts: A hash (or hash reference) containing a series of
532 parent/child keyname pairs or [none].
533
534 Returns: The current attrmap hash (as a plain hash, or hash
535 reference depending on caller context).
536
When called with a hash (hash reference) as its argument, this
method sets/resets the entire internal keyname/elementname->attr
children mapping definitions (where 'keyname' means the name of a
given key in the hash and 'attr children' is a list containing the
nested keynames that should be passed as attributes of the element
named 'keyname', instead of as child elements).
543
544 To add new mappings or remove existing ones without having to reset
545 the whole list of mappings, see add_attrmap() and delete_attrmap()
546 respectively.
547
548 See CAVEATS for the limitations that relate to this method.
549
550 Examples:
551
    $pd->attrmap( elname => ['list', 'of', 'nested', 'keynames'] );

    $pd->attrmap( \%mymap );
555
556 my %attrmap_hash = $pd->attrmap();
557
558 my $attrmap_hashref = $pd->attrmap();
559
560 add_attrmap
561 Accepts: A hash or hash reference containing a series of
562 parent/child keyname pairs.
563
564 Returns: [none]
565
566 Adds a series of parent-key -> child-key relationships that define
567 which of the possible child keys will be processed as attributes of
568 the created 'parent' element.
569
570 Examples:
571
572 $pd->add_attrmap( parentname => ['list', 'of', 'child', 'keys'] );
573
574 $pd->add_attrmap( parentname => 'childkey' );
575
576 $pd->add_attrmap( \%parents_and_kids );
577
578 delete_attrmap
579 Accepts: A list (or array reference) of element/keynames.
580
581 Returns: [none]
582
583 Deletes a list of parent-key -> child-key relationships from the
584 instance-wide hash of "parent->nested names to pass as attributes"
585 definitions. If you need to alter the list of child names (without
586 deleting the parent key) use add_attrmap() to reset the parent-
587 key's definition.
588
589 Examples:
590
591 $pd->delete_attrmap( 'some', 'parent', 'keys' );
592
593 $pd->delete_attrmap( \@parentkeynames );
594
595 bindattrs
596 Accepts: 1 or 0 or [none].
597
598 Returns: undef or 1 based on the current state of the bindattrs
599 option.
600
601 Consider:
602
603 <myns:foo bar="quux"/>
604
605 and
606
607 <myns:foo myns:bar="quux"/>
608
609 are not functionally equivalent.
610
611 By default, attributes will be forwarded as not being bound to the
612 namespace of the containing element (like the first example above).
613 Setting this option to a true value alters that behavior.
614
615 Examples:
616
617 $pd->bindattrs(1); # attributes now bound and prefixed.
618
619 $pd->bindattrs(0);
620
621 my $is_binding = $pd->bindattrs();
622
623 add_namespace
624 Accepts: A hash containing the defined keys 'uri' and 'prefix'.
625
626 Returns: [none]
627
628 Add a namespace URI/prefix pair to the instance-wide list of XML
629 namespaces that will be used while processing. The reserved prefix
630 '#default' can be used to set the default (unprefixed) namespace
631 declaration for elements.
632
633 Examples:
634
635 $pd->add_namespace( uri => 'http://myhost.tld/myns',
636 prefix => 'myns' );
637
638 $pd->add_namespace( uri => 'http://myhost.tld/default',
639 prefix => '#default' );
640
641 See namespacemap() or the namespacemap option detailed in new() for
642 details about how to associate key/element name with a given
643 namespace.
644
645 namespacemap
646 Accepts: A hash (or hash reference) containing a series of
647 uri->key/element name mappings or [none].
648
649 Returns: The current namespacemap hash (as a plain hash, or hash
650 reference depending on caller context).
651
652 When called with a hash (hash reference) as its argument, this
653 method sets/resets the entire internal namespace
654 URI->keyname/elementname mappings definitions (where 'keyname'
655 means the name of a given key in the hash and 'namespace URI' is a
656 declared namespace URI for the given process).
657
658 To add new mappings or remove existing ones without having to reset
659 the whole list of mappings, see add_namespacemap() and
660 delete_namespacemap() respectively.
661
If you are using "stream style" processing, this method should be
used with caution since altering this mapping during processing may
result in not-well-formed XML.
665
666 Examples:
667
668 $pd->add_namespace( uri => 'http://myhost.tld/myns',
669 prefix => 'myns' );
670
    $pd->namespacemap( 'http://myhost.tld/myns' => 'elname' );
672
673 $pd->namespacemap( 'http://myhost.tld/myns' => [ 'list', 'of', 'elnames' ] );
674
675 $pd->namespacemap( \%mymap );
676
677 my %nsmap_hash = $pd->namespacemap();
678
679 my $nsmap_hashref = $pd->namespacemap();
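
The following sketch shows one way add_namespace() and namespacemap()
might be used together, under the assumption that XML::SAX::Writer is
the handler. The 'item' and 'other' keys are invented for the example;
where the namespace declaration appears in the output depends on how
the events are serialized.

```perl
use strict;
use warnings;
use XML::Generator::PerlData;
use XML::SAX::Writer;

my $xml    = '';
my $driver = XML::Generator::PerlData->new(
    Handler => XML::SAX::Writer->new( Output => \$xml ),
);

my $ns_uri = 'http://myhost.tld/myns';

# Declare the namespace, then bind the 'item' key/element to it.
$driver->add_namespace( uri => $ns_uri, prefix => 'myns' );
$driver->namespacemap( $ns_uri => ['item'] );

# Hypothetical data: 'item' is mapped to the namespace, 'other' is not.
$driver->parse( { item => 'value', other => 'plain' } );
print $xml, "\n";
```

Only the names listed in the namespacemap are bound; the 'other' key
is left without a prefix.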
680
681 add_namespacemap
682 Accepts: A hash (or hash reference) containing a series of
683 uri->key/element name mappings
684
685 Returns: [none]
686
Adds one or more namespace->element/keyname rules to the
instance-wide list of mappings.

Examples:

    $pd->add_namespacemap( 'http://myhost.tld/foo' => ['some', 'list', 'of', 'keys'] );
693
694 $pd->add_namespacemap( %new_nsmappings );
695
696 remove_namespacemap
697 Accepts: A list (or array reference) of element/keynames.
698
699 Returns: [none]
700
Removes a list of namespace->element/keyname rules from the
instance-wide list of mappings.
703
704 Examples:
705
706 $pd->delete_namespacemap( 'foo', 'bar', 'baz' );
707
708 $pd->delete_namespacemap( \@list_of_keynames );
709
711 As a subclass of XML::SAX::Base, XML::Generator::PerlData allows you to
712 call all of the SAX event methods directly to insert arbitrary events
713 into the stream as needed. While its use in this way is probably a Bad
714 Thing (and only relevant to "stream style" processing) it is good to
715 know that such fine-grained access is there if you need it.
716
With that aside, there may be cases (again, using the "stream style")
where you'll want to insert single elements into the output (wrapping
each array in a series of arrays in single 'record' elements, for
example).
721
722 The following methods may be used to simplify this task by allowing you
723 to pass in simple element name strings and have the result 'just work'
724 without requiring an expert knowledge of the Perl SAX2 implementation
725 or forcing you to keep track of things like namespace context.
726
727 Take care to ensure that every call to start_tag() has a corresponding
728 call to end_tag() or your documents will not be well-formed.
729
730 start_tag
731 Accepts: A string containing an element name and an optional hash
732 of simple key/value attributes.
733
734 Returns: [none]
735
736 Examples:
737
738 $pd->start_tag( $element_name );
739
740 $pd->start_tag( $element_name, id => $generated_id );
741
742 $pd->start_tag( $element_name, %some_attrs );
743
744 end_tag
745 Accepts: A string containing an element name.
746
747 Returns: [none]
748
749 Examples:
750
751 $pd->end_tag( $element_name );
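
The 'record' wrapping case described above could be sketched roughly
as follows. The in-memory @rows list stands in for rows fetched from a
database handle, and the 'recordset' and 'record' names are
assumptions made for the example; XML::SAX::Writer is used only to
make the result visible.

```perl
use strict;
use warnings;
use XML::Generator::PerlData;
use XML::SAX::Writer;

my $xml    = '';
my $driver = XML::Generator::PerlData->new(
    Handler  => XML::SAX::Writer->new( Output => \$xml ),
    rootname => 'recordset',
);

# Stand-in for rows coming from a database handle.
my @rows = (
    { id => 1, name => 'alpha' },
    { id => 2, name => 'beta'  },
);

$driver->parse_start();
for my $row (@rows) {
    # Every start_tag() is paired with an end_tag() so the
    # document stays well-formed.
    $driver->start_tag( 'record' );
    $driver->parse_chunk( $row );
    $driver->end_tag( 'record' );
}
$driver->parse_end();

print $xml, "\n";
```

Each chunk ends up inside its own 'record' element, which keeps the
fields of different rows from running together under the root.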
752
CAVEATS
754 In general, XML is based on the idea that every bit of data is going to
755 have a corresponding name (Elements, Attributes, etc.). While this is
756 not at all a Bad Thing, it means that some Perl data structures do not
757 map cleanly onto an XML representation.
758
759 Consider:
760
761 my %hash = ( foo => ['one', 'two', 'three'] );
762
763 How do you represent that as XML? Is it three 'foo' elements, or is it
764 a 'foo' parent element with 3 mystery children?
765 XML::Generator::PerlData chooses the former. Or:
766
767 <foo>one</foo>
768 <foo>two</foo>
769 <foo>three</foo>
770
771 Now consider:
772
773 my @lol = ( ['one', 'two', 'three'], ['four', 'five', 'six'] );
774
775 In this case you wind up with a pile of elements named 'default'. You
776 can work around this by doing $pd->add_keymap( default => ['list',
777 'of', 'names'] ) but that only works if you know how many entries are
778 going to be in each nested list.
779
The practical implication here is that the current version of
XML::Generator::PerlData favors data structures that are based on
hashes of hashes for deeply nested structures (especially when using
Simple Style processing) and some options like "attrmap" do not work
for arrays at all. Future versions will address these issues if sanely
possible.
786
788 Kip Hampton, khampton@totalcinema.com
789
791 (c) Kip Hampton, 2002-2014, All Rights Reserved.
792
794 This module is released under the Perl Artistic Licence and may be
795 redistributed under the same terms as perl itself.
796
798 XML::SAX, XML::SAX::Writer.
799
800
801
802perl v5.36.0 2023-01-20 PerlData(3)