Twig(3)               User Contributed Perl Documentation              Twig(3)

NAME

       XML::Twig - A Perl module for processing huge XML documents in tree
       mode.

SYNOPSIS

       Note that this documentation is intended as a reference to the module.

       Complete docs, including a tutorial, examples, an easier to use HTML
       version, a quick reference card and a FAQ are available at
       http://www.xmltwig.com/xmltwig

       Small documents (loaded in memory as a tree):

         my $twig=XML::Twig->new();    # create the twig
         $twig->parsefile( 'doc.xml'); # build it
         my_process( $twig);           # use twig methods to process it
         $twig->print;                 # output the twig

       Huge documents (processed in combined stream/tree mode):

         # at most one div will be loaded in memory
         my $twig=XML::Twig->new(
           twig_handlers =>
             { title   => sub { $_->set_tag( 'h2') }, # change title tags to h2
               para    => sub { $_->set_tag( 'p')  }, # change para to p
               hidden  => sub { $_->delete;       },  # remove hidden elements
               list    => \&my_list_process,          # process list elements
               div     => sub { $_[0]->flush;     },  # output and free memory
             },
           pretty_print => 'indented',                # output will be nicely formatted
           empty_tags   => 'html',                    # outputs <empty_tag />
                                );
         $twig->parsefile( 'doc.xml');                # parse, calling the handlers
         $twig->flush;                                # flush the end of the document

       See XML::Twig 101 for other ways to use the module, as a filter for
       example.

DESCRIPTION

       This module provides a way to process XML documents. It is built on top
       of "XML::Parser".

       The module offers a tree interface to the document, while allowing you
       to output the parts of it that have been completely processed.

       It allows minimal resource (CPU and memory) usage by building the tree
       only for the parts of the documents that need actual processing,
       through the use of the "twig_roots" and "twig_print_outside_roots"
       options. The "finish" and "finish_print" methods also help to
       improve performance.

       XML::Twig tries to make simple things easy, so it tries its best to
       take care of a lot of the (usually) annoying (but sometimes necessary)
       features that come with XML and XML::Parser.

XML::Twig 101

       XML::Twig can be used either on "small" XML documents (that fit in
       memory) or on huge ones, by processing parts of the document and
       outputting or discarding them once they are processed.

       Loading an XML document and processing it

         my $t= XML::Twig->new();
         $t->parse( '<d><title>title</title><para>p 1</para><para>p 2</para></d>');
         my $root= $t->root;
         $root->set_tag( 'html');                 # change doc to html
         my $title= $root->first_child( 'title'); # get the title
         $title->set_tag( 'h1');                  # turn it into h1
         my @para= $root->children( 'para');      # get the para children
         foreach my $para (@para)
           { $para->set_tag( 'p'); }              # turn them into p
         $t->print;                               # output the document

       Other useful methods include:

       att: "$elt->att( 'foo')" returns the "foo" attribute for an element,

       set_att: "$elt->set_att( foo => "bar")" sets the "foo" attribute to
       the "bar" value,

       next_sibling: "$elt->next_sibling" returns the next sibling in the
       document (in the example "$title->next_sibling" is the first "para";
       you can also, and actually should, use "$elt->next_sibling( 'para')"
       to get it).

       The document can also be transformed through the use of the cut, copy,
       paste and move methods: "$title->cut; $title->paste( after => $p);" for
       example.

       And much, much more, see Elt.
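
       As a minimal sketch of the methods above, the following reads and sets
       attributes, finds a sibling and moves the title after the last "para".
       The sample document and the "nb" and "id" attribute names are made up
       for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document, not taken from the module's documentation
my $t = XML::Twig->new();
$t->parse('<d><title nb="1">title</title><para>p 1</para><para>p 2</para></d>');

my $root  = $t->root;
my $title = $root->first_child('title');

my $nb = $title->att('nb');            # read the "nb" attribute
$title->set_att( id => "t$nb");        # add an "id" attribute

# text of the first "para" element following the title
my $first_para_text = $title->next_sibling('para')->text;

# move the title after the last para
my $last_para = $root->last_child('para');
$title->cut;
$title->paste( after => $last_para);

my $result = $root->sprint;            # the modified document as a string
print "$result\n";
```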

       Processing an XML document chunk by chunk

       One of the strengths of XML::Twig is that it lets you work with files
       that do not fit in memory (BTW storing an XML document in memory as a
       tree is quite memory-expensive, the expansion factor often being around
       10).

       To do this you can define handlers that will be called once a specific
       element has been completely parsed. In these handlers you can access
       the element and process it as you see fit, using the navigation and the
       cut-n-paste methods, plus lots of convenient ones like "prefix". Once
       the element is completely processed you can then "flush" it, which
       will output it and free the memory. You can also "purge" it if you
       don't need to output it (if you are just extracting some data from the
       document for example). The handler will be called again once the next
       relevant element has been parsed.

         my $t= XML::Twig->new( twig_handlers =>
                                 { section => \&section,
                                   para   => sub { $_->set_tag( 'p'); }
                                 },
                              );
         $t->parsefile( 'doc.xml');
         $t->flush; # don't forget to flush one last time in the end or anything
                    # after the last </section> tag will not be output

         # the handler is called once a section is completely parsed, ie when
         # the end tag for section is found, it receives the twig itself and
         # the element (including all its sub-elements) as arguments
         sub section
           { my( $t, $section)= @_;      # arguments for all twig_handlers
             $section->set_tag( 'div');  # change the tag name, my favourite method...
             # let's use the attribute nb as a prefix to the title
             my $title= $section->first_child( 'title'); # find the title
             my $nb= $title->att( 'nb'); # get the attribute
             $title->prefix( "$nb - ");  # easy isn't it?
             $section->flush;            # outputs the section and frees memory
           }
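
       When nothing needs to be output, "purge" can replace "flush". The
       following is a small sketch of that data-extraction case. The inline
       sample document and the @titles array are assumptions made for the
       example, and parse is used instead of parsefile only to keep it
       self-contained.

```perl
use strict;
use warnings;
use XML::Twig;

my @titles;    # data collected from the document

my $t = XML::Twig->new(
    twig_handlers => {
        section => sub {
            my ($twig, $section) = @_;
            my $title = $section->first_child('title');
            push @titles, $title->text if $title;
            $twig->purge;    # free the memory used so far, output nothing
            return 1;
        },
    },
);

# hypothetical sample document
$t->parse('<doc>'
        . '<section><title>One</title><para>a</para></section>'
        . '<section><title>Two</title><para>b</para></section>'
        . '</doc>');

print join( ', ', @titles), "\n";
```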

       There is of course more to it: you can trigger handlers on more
       elaborate conditions than just the name of the element, "section/title"
       for example.

         my $t= XML::Twig->new( twig_handlers =>
                                  { 'section/title' => sub { $_->print } }
                              )
                         ->parsefile( 'doc.xml');

       Here "sub { $_->print }" simply prints the current element ($_ is
       aliased to the element in the handler).

       You can also trigger a handler on a test on an attribute:

         my $t= XML::Twig->new( twig_handlers =>
                             { 'section[@level="1"]' => sub { $_->print } }
                              )
                         ->parsefile( 'doc.xml');

       You can also use "start_tag_handlers" to process an element as soon as
       the start tag is found. Besides "prefix" you can also use "suffix".

       Processing just parts of an XML document

       The twig_roots mode builds only the required sub-trees from the
       document. Anything outside of the twig roots will just be ignored:

         my $t= XML::Twig->new(
              # the twig will include just the root and selected titles
                  twig_roots   => { 'section/title' => \&print_n_purge,
                                    'annex/title'   => \&print_n_purge
                  }
                             );
         $t->parsefile( 'doc.xml');

         sub print_n_purge
           { my( $t, $elt)= @_;
             print $elt->text;    # print the text (including sub-element texts)
             $t->purge;           # frees the memory
           }

       You can use that mode when you want to process parts of a document but
       are not interested in the rest and you don't want to pay the price,
       either in time or memory, to build the tree for it.

       Building an XML filter

       You can combine the "twig_roots" and the "twig_print_outside_roots"
       options to build filters, which let you modify selected elements and
       will output the rest of the document as is.

       This would convert prices in $ to prices in Euro in a document:

         my $t= XML::Twig->new(
                  twig_roots   => { 'price' => \&convert, },   # process prices
                  twig_print_outside_roots => 1,               # print the rest
                             );
         $t->parsefile( 'doc.xml');

         sub convert
           { my( $t, $price)= @_;
             my $currency=  $price->att( 'currency');          # get the currency
             if( $currency eq 'USD')
               { my $usd_price= $price->text;                  # get the price
                 # %rate is just a conversion table
                 my $euro_price= $usd_price * $rate{usd2euro};
                 $price->set_text( $euro_price);               # set the new price
                 $price->set_att( currency => 'EUR');          # don't forget this!
               }
             $price->print;                                    # output the price
           }

       XML::Twig and various versions of Perl, XML::Parser and expat:

       Before being uploaded to CPAN, XML::Twig 3.22 has been tested under the
       following environments:

       linux-x86
           perl 5.6.2, expat 1.95.8, XML::Parser 2.34
           perl 5.8.0, expat 1.95.8, XML::Parser 2.34
           perl 5.8.7, expat 1.95.8, XML::Parser 2.34

       Solaris
           perl 5.6.1, expat 1.95.2, XML::Parser 2.31

       XML::Twig is a lot more sensitive to variations in versions of perl,
       XML::Parser and expat than to the OS, so this should cover some
       reasonable configurations.

       The "recommended configuration" is perl 5.8.3+ (for good Unicode
       support), XML::Parser 2.31+ and expat 1.95.5+

       See <http://testers.cpan.org/search?request=dist&dist=XML-Twig> for the
       CPAN testers reports on XML::Twig, which list all tested
       configurations.

       An Atom feed of the CPAN Testers results is available at
       <http://xmltwig.com/rss/twig_testers.rss>

       Finally:

       XML::Twig does NOT work with expat 1.95.4
       XML::Twig only works with XML::Parser 2.27 in perl 5.6.*
           Note that I can't compile XML::Parser 2.27 anymore, so I can't
           guarantee that it still works

       XML::Parser 2.28 does not really work

       When in doubt, upgrade expat, XML::Parser and Scalar::Util

       Finally, for some optional features, XML::Twig depends on some
       additional modules. The complete list, which depends somewhat on the
       version of Perl that you are running, is given by running
       "t/zz_dump_config.t"

Simplifying XML processing

       Whitespaces
           Whitespaces that look non-significant are discarded. This behaviour
           can be controlled using the "keep_spaces", "keep_spaces_in" and
           "discard_spaces_in" options.

       Encoding
           You can specify that you want the output in the same encoding as
           the input (provided you have valid XML, which means you have to
           specify the encoding either in the document or when you create the
           Twig object) using the "keep_encoding" option.

           You can also use "output_encoding" to convert the internal UTF-8
           format to the required encoding.

       Comments and Processing Instructions (PI)
           Comments and PIs can be hidden from the processing, but still
           appear in the output (they are carried by the "real" element
           closest to them).

       Pretty Printing
           XML::Twig can output the document pretty printed so it is easier to
           read for us humans.

       Surviving an untimely death
           XML parsers are supposed to react violently when fed improper XML.
           XML::Parser just dies.

           XML::Twig provides the "safe_parse" and the "safe_parsefile"
           methods which wrap the parse in an eval and return either the
           parsed twig or 0 in case of failure.

       Private attributes
           Attributes with a name starting with # (illegal in XML) will not be
           output, so you can safely use them to store temporary values during
           processing. Note that you can store anything in a private
           attribute, not just text: it's just a regular Perl variable, so a
           reference to an object or a huge data structure is perfectly fine.
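
       Two of the features above, "safe_parse" and private attributes, can
       be sketched as follows. The sample documents and the "#seen" and
       "status" attribute names are invented for the example. Copying $@
       assumes that the eval inside "safe_parse" leaves the parser's error
       message there.

```perl
use strict;
use warnings;
use XML::Twig;

# well-formed input: safe_parse returns the twig
my $t      = XML::Twig->new();
my $parsed = $t->safe_parse('<doc><item>one</item></doc>');

# malformed input (missing </item>): safe_parse returns a false value
# instead of dying
my $failed = XML::Twig->new->safe_parse('<doc><item>one</doc>');
my $error  = $@;    # assumption: the message left by the eval in safe_parse

# a private attribute can hold any Perl value and is never output
my $item = $t->root->first_child('item');
$item->set_att( '#seen' => { count => 1 });
$item->set_att( status  => 'ok');

my $out = $t->root->sprint;
print "$out\n";
```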

CLASSES

       XML::Twig uses a very limited number of classes. The ones you are most
       likely to use are "XML::Twig" of course, which represents a complete
       XML document, including the document itself (the root of the document
       itself is "root"), its handlers, its input or output filters... The
       other main class is "XML::Twig::Elt", which models an XML element.
       Element here has a very wide definition: it can be a regular element,
       but also text, with an element "tag" of "#PCDATA" (or "#CDATA"), an
       entity (tag is "#ENT"), a Processing Instruction ("#PI") or a comment
       ("#COMMENT").

       Those are the 2 commonly used classes.

       You might want to look at the "elt_class" option if you want to
       subclass "XML::Twig::Elt".

       Attributes are just attached to their parent element, they are not
       objects per se. (Please use the provided methods "att" and "set_att" to
       access them: if you access them as a hash, your code becomes
       implementation dependent and might break in the future.)

       Other classes that are seldom used are "XML::Twig::Entity_list" and
       "XML::Twig::Entity".

       If you use "XML::Twig::XPath" instead of "XML::Twig", elements are then
       created as "XML::Twig::XPath::Elt".

METHODS

       XML::Twig

       A twig is a subclass of XML::Parser, so all XML::Parser methods can be
       called on a twig object, including parse and parsefile. "setHandlers"
       on the other hand cannot be used, see "BUGS".

       new This is a class method, the constructor for XML::Twig. Options are
           passed as keyword value pairs. Recognized options are the same as
           XML::Parser, plus some XML::Twig specifics.

           New Options:

           twig_handlers
               This argument consists of a hash "{ expression => \&handler }"
               where expression is an XPath-like expression (+ some others).

               XPath expressions are limited to using the child and descendant
               axis (indeed you can't specify an axis), and predicates cannot
               be nested.  You can use the "string", or "string(<tag>)"
               function (except in "twig_roots" triggers).

               Additionally you can use regexps (/ delimited) to match
               attribute and string values.

               Examples:

                 foo
                 foo/bar
                 foo//bar
                 /foo/bar
                 /foo//bar
                 /foo/bar[@att1 = "val1" and @att2 = "val2"]/baz[@a >= 1]
                 foo[string()=~ /^duh!+/]
                 /foo[string(bar)=~ /\d+/]/baz[@att != 3]

               #CDATA can be used to call a handler for a CDATA section.
               #COMMENT can be used to call a handler for comments.

               Some additional (non-XPath) expressions are also provided for
               convenience:

               processing instructions
                   '?' or '#PI' triggers the handler for any processing
                   instruction, and '?<target>' or '#PI <target>' triggers a
                   handler for processing instructions with the given target
                   (ex: '#PI xml-stylesheet').

               level(<level>)
                   Triggers the handler on any element at that level in the
                   tree (root is level 1)

               _all_
                   Triggers the handler for all elements in the tree

               _default_
                   Triggers the handler for each element that does NOT have
                   any other handler.

               Expressions are evaluated against the input document, which
               means that even if you have changed the tag of an element
               (changing the tag of a parent element from a handler for
               example) the change will not impact the expression evaluation.
               There is an exception to this: "private" attributes (whose
               names start with a '#', and which can only be created during
               the parsing, as they are not valid XML) are checked against the
               current twig.

               Handlers are triggered in fixed order, sorted by their type
               (xpath expressions first, then regexps, then level), then by
               whether they specify a full path (starting at the root element)
               or not, then by number of steps in the expression, then
               number of predicates, then number of tests in predicates.
               Handlers where the last step is a wildcard ("foo/bar/*")
               are triggered after other XPath handlers.  Finally "_all_"
               handlers are triggered last.

               Important: once a handler has been triggered, if it returns 0
               then no other handler is called, except an "_all_" handler which
               will be called anyway.

               If a handler returns a true value and other handlers apply,
               then the next applicable handler will be called. Repeat, rinse,
               lather... The exception to that rule is when the
               "do_not_chain_handlers" option is set, in which case only the
               first handler will be called.

               Note that it might be a good idea to explicitly return a short
               true value (like 1) from handlers: this ensures that other
               applicable handlers are called even if the last statement for
               the handler happens to evaluate to false. This might also
               speed up the code by avoiding the result of the last statement
               of the code being copied and passed to the code managing
               handlers.  It can really pay to have 1 instead of a long string
               returned.

               When an element is CLOSED the corresponding handler is called,
               with 2 arguments: the twig and the "Element". The twig
               includes the document tree that has been built so far, the
               element is the complete sub-tree for the element. This means
               that handlers for inner elements are called before handlers for
               outer elements.

               $_ is also set to the element, so it is easy to write inline
               handlers like

                 para => sub { $_->set_tag( 'p'); }

               Text is stored in elements whose tag is #PCDATA (due to mixed
               content, text and sub-element in an element there is no way to
               store the text as just an attribute of the enclosing element).

               Warning: if you have used purge or flush on the twig the
               element might not be complete, some of its children might have
               been entirely flushed or purged, and the start tag might even
               have been printed (by "flush") already, so changing its tag
               might not give the expected result.
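
               The calling order described above (inner elements before
               outer ones) can be observed with a small sketch. The sample
               document, the "id" attribute and the @log array are
               assumptions made for this example. Both handlers return 1 so
               that any other applicable handler would still be called.

```perl
use strict;
use warnings;
use XML::Twig;

my @log;    # records which handler fired, in order

my $t = XML::Twig->new(
    twig_handlers => {
        # path expression: titles that are children of a section
        'section/title' => sub {
            push @log, 'title:' . $_->text;    # $_ is the element
            return 1;
        },
        # attribute predicate: only sections with level="1"
        'section[@level="1"]' => sub {
            my ($twig, $section) = @_;         # twig and element
            push @log, 'section:' . $section->att('id');
            return 1;
        },
    },
);

# hypothetical sample document
$t->parse('<doc>'
        . '<section level="1" id="s1"><title>Intro</title><para>text</para></section>'
        . '<section level="2" id="s2"><title>Details</title></section>'
        . '</doc>');

print join( ', ', @log), "\n";
```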

           twig_roots
               This argument lets you build the tree only for those elements
               you are interested in.

                 Example: my $t= XML::Twig->new( twig_roots => { title => 1, subtitle => 1});
                          $t->parsefile( file);
                          my $t= XML::Twig->new( twig_roots => { 'section/title' => 1});
                          $t->parsefile( file);

               The first call returns a twig containing a document including
               only "title" and "subtitle" elements, as children of the root
               element.

               You can use generic_attribute_condition, attribute_condition,
               full_path, partial_path, tag, tag_regexp, _default_ and _all_
               to trigger the building of the twig.  string_condition and
               regexp_condition cannot be used, as the content of the element,
               and the string, have not yet been parsed when the condition is
               checked.

               WARNING: paths are checked for the document. Even if the
               "twig_roots" option is used they will be checked against the
               full document tree, not the virtual tree created by XML::Twig

               WARNING: twig_roots elements should NOT be nested, that would
               hopelessly confuse XML::Twig ;--(

               Note: you can set handlers (twig_handlers) using twig_roots

                 Example: my $t= XML::Twig->new( twig_roots =>
                                                  { title    => sub { $_[1]->print;},
                                                    subtitle => \&process_subtitle
                                                  }
                                              );
                          $t->parsefile( file);

           twig_print_outside_roots
               To be used in conjunction with the "twig_roots" argument. When
               set to a true value this will print the document outside of the
               "twig_roots" elements.

                Example: my $t= XML::Twig->new( twig_roots => { title => \&number_title },
                                               twig_print_outside_roots => 1,
                                              );
                          $t->parsefile( file);
                          { my $nb;
                          sub number_title
                            { my( $twig, $title)= @_;
                              $nb++;
                              $title->prefix( "$nb ");
                              $title->print;
                            }
                          }

               This example prints the document outside of the title element,
               calls "number_title" for each "title" element, prints it, and
               then resumes printing the document. The twig is built only for
               the "title" elements.

               If the value is a reference to a file handle then the document
               outside the "twig_roots" elements will be output to this file
               handle:

                 open( OUT, ">out_file") or die "cannot open out file out_file:$!";
                 my $t= XML::Twig->new( twig_roots => { title => \&number_title },
                                        # default output to OUT
                                        twig_print_outside_roots => \*OUT,
                                      );

                        { my $nb;
                          sub number_title
                            { my( $twig, $title)= @_;
                              $nb++;
                              $title->prefix( "$nb ");
                              $title->print( \*OUT);    # you have to print to \*OUT here
                            }
                          }

           start_tag_handlers
               A hash "{ expression => \&handler }". Sets element handlers that
               are called when the element is open (at the end of the
               XML::Parser "Start" handler). The handlers are called with 2
               params: the twig and the element. The element is empty at that
               point, its attributes are created though.

               You can use generic_attribute_condition, attribute_condition,
               full_path, partial_path, tag, tag_regexp, _default_  and _all_
               to trigger the handler.

               string_condition and regexp_condition cannot be used, as the
               content of the element, and the string, have not yet been
               parsed when the condition is checked.

               The main uses for those handlers are to change the tag name
               (you might have to do it as soon as you find the open tag if
               you plan to "flush" the twig at some point in the element), and
               to create temporary attributes that will be used when
               processing sub-elements with "twig_handlers".

               You should also use it to change tags if you use "flush". If
               you change the tag in a regular "twig_handler" then the start
               tag might already have been flushed.

               Note: "start_tag" handlers can be called outside of
               "twig_roots" if this argument is used. In this case handlers
               are called with the following arguments: $t (the twig), $tag
               (the tag of the element) and %att (a hash of the attributes of
               the element).

               If the "twig_print_outside_roots" argument is also used, if the
               last handler called returns a "true" value, then the start
               tag will be output as it appeared in the original document, if
               the handler returns a "false" value then the start tag will
               not be printed (so you can print a modified string yourself for
               example).

               Note that you can use the ignore method in "start_tag_handlers"
               (and only there).

           end_tag_handlers
               A hash "{ expression => \&handler }". Sets element handlers that
               are called when the element is closed (at the end of the
               XML::Parser "End" handler). The handlers are called with 2
               params: the twig and the tag of the element.

               twig_handlers are called when an element is completely parsed,
               so why have this redundant option? There is only one use for
               "end_tag_handlers": when using the "twig_roots" option, to
               trigger a handler for an element outside the roots.  It is for
               example very useful to number titles in a document using nested
               sections:

                 my @no= (0);
                 my $no;
                 my $t= XML::Twig->new(
                         start_tag_handlers =>
                          { section => sub { $no[$#no]++; $no= join '.', @no; push @no, 0; } },
                         twig_roots         =>
                          { title   => sub { $_[1]->prefix( $no); $_[1]->print; } },
                         end_tag_handlers   => { section => sub { pop @no;  } },
                         twig_print_outside_roots => 1
                                     );
                  $t->parsefile( $file);

               Using the "end_tag_handlers" argument without "twig_roots" will
               result in an error.

           do_not_chain_handlers
               If this option is set to a true value, then only one handler
               will be called for each element, even if several satisfy the
               condition.

               Note that the "_all_" handler will still be called regardless.
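
               A rough sketch of the difference, counting handler calls with
               and without the option. The count_calls helper and the sample
               document are hypothetical and exist only for this
               illustration. Both handlers match the same "para" element and
               return a true value.

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical helper: parses a tiny document with two handlers that
# both apply to the same element, and returns how many were called
sub count_calls {
    my (%options) = @_;
    my $calls = 0;
    XML::Twig->new(
        twig_handlers => {
            'para'         => sub { $calls++; return 1 },
            'section/para' => sub { $calls++; return 1 },
        },
        %options,
    )->parse('<doc><section><para>text</para></section></doc>');
    return $calls;
}

my $chained     = count_calls();                              # default behaviour
my $not_chained = count_calls( do_not_chain_handlers => 1);   # option set

print "chained: $chained, not chained: $not_chained\n";
```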

           ignore_elts
               This option lets you ignore elements when building the twig.
               This is useful in cases where you cannot use "twig_roots" to
               ignore elements, for example if the element to ignore is a
               sibling of elements you are interested in.

               Example:

                 my $twig= XML::Twig->new( ignore_elts => { elt => 1 });
                 $twig->parsefile( 'doc.xml');

               This will build the complete twig for the document, except that
               all "elt" elements (and their children) will be left out.
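
               The same idea on an inline document, as a self-contained
               sketch. The "secret" element and the sample content are
               assumptions chosen for the example.

```perl
use strict;
use warnings;
use XML::Twig;

# "secret" elements are siblings of the elements we want to keep
my $twig = XML::Twig->new( ignore_elts => { secret => 1 });
$twig->parse('<doc>'
           . '<para>keep</para>'
           . '<secret><para>drop</para></secret>'
           . '<para>also keep</para>'
           . '</doc>');

my $out = $twig->root->sprint;    # the twig without the ignored elements
print "$out\n";
```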

           char_handler
               A reference to a subroutine that will be called every time
               "PCDATA" is found.

               The subroutine receives the string as argument, and returns the
               modified string:

                 # we want all strings in upper case
                 sub my_char_handler
                   { my( $text)= @_;
                     $text= uc( $text);
                     return $text;
                   }
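
               A sketch of how such a subroutine might be passed to the
               constructor. The sample document is made up, and the handler
               is the one from the text above.

```perl
use strict;
use warnings;
use XML::Twig;

# we want all strings in upper case
sub my_char_handler {
    my ($text) = @_;
    return uc $text;
}

my $t = XML::Twig->new( char_handler => \&my_char_handler);
$t->parse('<doc><p>hello</p></doc>');    # hypothetical sample document

my $text = $t->root->text;               # text as stored in the twig
print "$text\n";
```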

           elt_class
               The name of a class used to store elements. This class should
               inherit from "XML::Twig::Elt" (and by default it is
               "XML::Twig::Elt"). This option is used to subclass the element
               class and extend it with new methods.

               This option is needed because during the parsing of the XML,
               elements are created by "XML::Twig", without any control from
               the user code.
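
               A minimal sketch of such a subclass. The class name My::Elt
               and its shout method are hypothetical and are not part of
               XML::Twig.

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical element class adding one method to XML::Twig::Elt
package My::Elt;
our @ISA = ('XML::Twig::Elt');

sub shout {
    my ($elt) = @_;
    return uc $elt->text;    # the element text, in upper case
}

package main;

my $t = XML::Twig->new( elt_class => 'My::Elt');
$t->parse('<doc><p>hi</p></doc>');

my $p       = $t->root->first_child('p');
my $class   = ref $p;        # class used for the elements of the twig
my $shouted = $p->shout;     # the new method is available on elements
print "$class: $shouted\n";
```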
624
625           keep_atts_order
626               Setting this option to a true value causes the attribute hash
627               to be tied to a "Tie::IxHash" object.  This means that
628               "Tie::IxHash" needs to be installed for this option to be
629               available. It also means that the hash keeps its order, so you
630               will get the attributes in order. This allows outputing the
631               attributes in the same order as they were in the original docu‐
632               ment.
633
634           keep_encoding
635               This is a (slightly?) evil option: if the XML document is not
636               UTF-8 encoded and you want to keep it that way, then setting
637               keep_encoding will use the"Expat" original_string method for
638               character, thus keeping the original encoding, as well as the
639               original entities in the strings.
640
641               See the "t/test6.t" test file to see what results you can
642               expect from the various encoding options.
643
644               WARNING: if the original encoding is multi-byte then attribute
645               parsing will be EXTREMELY unsafe under any Perl before 5.6, as
646               it uses regular expressions which do not deal properly with
647               multi-byte characters. You can specify an alternate function to
648               parse the start tags with the "parse_start_tag" option (see
649               below).
650
651               WARNING: this option is NOT used when parsing with the
652               non-blocking parser ("parse_start", "parse_more" and
653               "parse_done" methods), which you probably should not use with
654               XML::Twig anyway, as they are totally untested!
655
656           output_encoding
657               This option generates an output_filter using "Encode",
658               "Text::Iconv" or "Unicode::Map8" and "Unicode::String", and
659               sets the encoding in the XML declaration. This is the easiest
660               way to deal with encodings. If you need more sophisticated
661               features, look at "output_filter" below.
662
663           output_filter
664               This option is used to convert the character encoding of the
665               output document.  It is passed either a string corresponding to
666               a predefined filter or a subroutine reference. The filter will
667               be called every time a document or element is processed by the
668               "print" functions ("print", "sprint", "flush").
669
670               Pre-defined filters:
671
672               latin1
673                   uses either "Encode", "Text::Iconv" or "Unicode::Map8" and
674                   "Unicode::String" or a regexp (which works only with
675                   XML::Parser 2.27), in this order, to convert all characters
676                   to ISO-8859-1 (aka latin1)
677
678               html
679                   does the same conversion as "latin1", plus encodes entities
680                   using "HTML::Entities" (oddly enough you will need to have
681                   HTML::Entities installed for it to be available). This
682                   should only be used if the tags and attribute names
683                   themselves are in US-ASCII, or they will be converted and
684                   the output will not be valid XML any more.
685
686               safe
687                   converts the output to US-ASCII only, plus character
688                   entities ("&#nnn;"). This should be used only if the tags
689                   and attribute names themselves are in US-ASCII, or they
690                   will be converted and the output will not be valid XML any
691                   more.
692
693               safe_hex
694                   same as "safe" except that the character entities are in
695                   hexadecimal ("&#xnnn;")
696
697               encode_convert ($encoding)
698                   Return a subref that can be used to convert utf8 strings to
699                   $encoding.  Uses "Encode".
700
701                      my $conv = XML::Twig::encode_convert( 'latin1');
702                      my $t = XML::Twig->new(output_filter => $conv);
703
704               iconv_convert ($encoding)
705                   this function is used to create a filter subroutine that
706                   will be used to convert the characters to the target encod‐
707                   ing using "Text::Iconv" (which needs to be installed, look
708                   at the documentation for the module and for the "iconv"
709                   library to find out which encodings are available on your
710                   system)
711
712                      my $conv = XML::Twig::iconv_convert( 'latin1');
713                      my $t = XML::Twig->new(output_filter => $conv);
714
715               unicode_convert ($encoding)
716                   this function is used to create a filter subroutine that
717                   will be used to convert the characters to the target
718                   encoding using "Unicode::String" and "Unicode::Map8" (which
719                   need to be installed, look at the documentation for the
720                   modules to find out which encodings are available on your
721                   system)
722
723                      my $conv = XML::Twig::unicode_convert( 'latin1');
724                      my $t = XML::Twig->new(output_filter => $conv);
725
726               The "text" and "att" methods do not use the filter, so their
727               results are always in Unicode.
728
729               Those predeclared filters are based on subroutines that can be
730               used by themselves (as "XML::Twig::foo").
731
732               html_encode ($string)
733                   Use "HTML::Entities" to encode a utf8 string
734
735               safe_encode ($string)
736                   Use either a regexp (perl < 5.8) or "Encode" to encode non-
737                   ascii characters in the string in "&#<nnnn>;" format
738
739               safe_encode_hex ($string)
740                   Use either a regexp (perl < 5.8) or "Encode" to encode non-
741                   ascii characters in the string in "&#x<nnnn>;" format
742
743               regexp2latin1 ($string)
744                   Use a regexp to encode a utf8 string into latin 1
745                   (ISO-8859-1). Does not work with Perl 5.8.0!
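
                  As a rough illustration of calling these subroutines
                  directly, the sketch below passes a string containing two
                  non-ASCII characters to "safe_encode" and
                  "safe_encode_hex". It assumes Perl 5.8 or later, where
                  both rely on "Encode"; the sample string is made up:

```perl
use strict;
use warnings;
use XML::Twig;

# A character string with two non-ASCII characters (U+00E9 and U+263A).
my $text = "caf\x{e9} \x{263a}";

# Non-ASCII characters become decimal character references.
my $dec = XML::Twig::safe_encode( $text);

# Same idea, with hexadecimal character references.
my $hex = XML::Twig::safe_encode_hex( $text);

print "$dec\n$hex\n";
```

                  Both results contain only US-ASCII characters, so they can
                  be printed without setting an encoding on the output
                  handle.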
746
747           output_text_filter
748               same as output_filter, except it doesn't apply to the brackets
749               and quotes around attribute values. This is useful for all fil‐
750               ters that could change the tagging, basically anything that
751               does not just change the encoding of the output. "html", "safe"
752               and "safe_hex" are better used with this option.
753
754           input_filter
755               This option is similar to "output_filter" except the filter is
756               applied to the characters before they are stored in the twig,
757               at parsing time.
758
759           remove_cdata
760               Setting this option to a true value will force the twig to
761               output CDATA sections as regular (escaped) PCDATA.
762
763           parse_start_tag
764               If you use the "keep_encoding" option then this option can be
765               used to replace the default parsing function. You should
766               provide a coderef (a reference to a subroutine) as the
767               argument. This subroutine takes the original tag (given by the
768               XML::Parser::Expat "original_string()" method) and returns a
769               tag and the attributes in a hash (or in a list of
770               attribute_name/attribute_value pairs).
771
772           expand_external_ents
773               When this option is used external entities (that are defined)
774               are expanded when the document is output using "print"
775               functions such as "print", "sprint", "flush" and "xml_string".
776               Note that in the twig the entity will be stored as an element
777               with a tag '"#ENT"', the entity will not be expanded there, so
778               you might want to process the entities before outputting it.
779
780           load_DTD
781               If this argument is set to a true value, "parse" or "parsefile"
782               on the twig will load the DTD information. This information
783               can then be accessed through the twig, in a "DTD_handler" for
784               example. This will load even an external DTD.
785
786               Default and fixed values for attributes will also be filled,
787               based on the DTD.
788
789               Note that to do this the module will generate a temporary file
790               in the current directory. If this is a problem let me know and
791               I will add an option to specify an alternate directory.
792
793               See DTD Handling for more information
794
795           DTD_handler
796               Set a handler that will be called once the doctype (and the
797               DTD) have been loaded, with 2 arguments, the twig and the DTD.
798
799           no_prolog
800               Does not output a prolog (XML declaration and DTD)
801
802           id  This optional argument gives the name of an attribute that can
803               be used as an ID in the document. Elements whose ID is known
804               can be accessed through the elt_id method. id defaults to 'id'.
805               See "BUGS".
806
807           discard_spaces
808               If this optional argument is set to a true value then spaces
809               are discarded when they look non-significant: strings contain‐
810               ing only spaces are discarded.  This argument is set to true by
811               default.
812
813           keep_spaces
814               If this optional argument is set to a true value then all spa‐
815               ces in the document are kept, and stored as "PCDATA".
816
817               Warning: adding this option can result in changes in the twig
818               generated: space that was previously discarded might end up in
819               a new text element. See the difference by calling the following
820               code with 0 and 1 as arguments:
821
822                 perl -MXML::Twig -e'print XML::Twig->new( keep_spaces => shift)->parse( "<d> \n<e/></d>")->_dump'
823
824               "keep_spaces" and "discard_spaces" cannot both be set.
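
                  The effect described in the warning can also be observed
                  from a script. This sketch parses the same small document
                  with and without "keep_spaces" and records the tags of the
                  root's children; the variable names are illustrative only:

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = "<d> \n<e/></d>";

my %children;
for my $keep (0, 1) {
    my $twig = XML::Twig->new( keep_spaces => $keep)->parse( $xml);
    # list the tag of each child of the root, text elements included
    $children{$keep} = [ map { $_->tag } $twig->root->children ];
}

print "keep_spaces => $_: @{ $children{$_} }\n" for 0, 1;
```

                  With "keep_spaces" set, the whitespace before the "e"
                  element shows up as an extra "#PCDATA" child.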
825
826           discard_spaces_in
827               This argument sets "keep_spaces" to true but will cause the
828               twig builder to discard spaces in the elements listed.
829
830               The syntax for using this argument is:
831
832                 XML::Twig->new( discard_spaces_in => [ 'elt1', 'elt2']);
833
834           keep_spaces_in
835               This argument sets "discard_spaces" to true but will cause the
836               twig builder to keep spaces in the elements listed.
837
838               The syntax for using this argument is:
839
840                 XML::Twig->new( keep_spaces_in => [ 'elt1', 'elt2']);
841
842               Warning: adding this option can result in changes in the twig
843               generated: space that was previously discarded might end up in
844               a new text element.
845
846           pretty_print
847               Set the pretty print method, amongst '"none"' (default),
848               '"nsgmls"', '"nice"', '"indented"', '"indented_c"', "wrapped",
849               '"record"' and '"record_c"'
850
851               pretty_print formats:
852
853               none
854                   The document is output as one long string, with no line
855                   breaks except those found within text elements
856
857               nsgmls
858                   Line breaks are inserted in safe places: that is within
859                   tags, between a tag and an attribute, between attributes
860                   and before the > at the end of a tag.
861
862                   This is quite ugly but better than "none", and it is very
863                   safe, the document will still be valid (conforming to its
864                   DTD).
865
866                   This is how the SGML parser "sgmls" splits documents, hence
867                   the name.
868
869               nice
870                   This option inserts line breaks before any tag that does
871                   not contain text (so elements with textual content are not
872                   broken, as the \n is significant there).
873
874                   WARNING: this option leaves the document well-formed but
875                   might make it invalid (not conformant to its DTD). If you
876                   have elements declared as
877
878                     <!ELEMENT foo (#PCDATA|bar)>
879
880                   then a "foo" element including a "bar" one will be printed
881                   as
882
883                     <foo>
884                     <bar>bar is just pcdata</bar>
885                     </foo>
886
887                   This is invalid, as the parser will take the line break
888                   after the "foo" tag as a sign that the element contains
889                   PCDATA, it will then die when it finds the "bar" tag. This
890                   may or may not be important for you, but be aware of it!
891
892               indented
893                   Same as "nice" (and with the same warning) but indents ele‐
894                   ments according to their level
895
896               indented_c
897                   Same as "indented" but a little more compact: the closing
898                   tags are on the same line as the preceding text
899
900               wrapped
901                   Same as "indented_c" but lines are wrapped using
902                   Text::Wrap::wrap. The default length for lines is the
903                   default for $Text::Wrap::columns, and can be changed by
904                   changing that variable.
905
906               record
907                   This is a record-oriented pretty print, that displays data
908                   in records, one field per line (which looks a LOT like
909                   "indented")
910
911               record_c
912                   Stands for record compact, one record per line
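
                  To compare styles on the same document, one approach is
                  to switch style with the "set_pretty_print" method between
                  calls to "sprint". This is only a sketch; the sample
                  document is made up, and the exact indentation is not
                  shown here on purpose:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new->parse( '<doc><sect><p>text</p></sect></doc>');

my %out;
for my $style (qw( none indented)) {
    $twig->set_pretty_print( $style);       # change the output style
    $out{$style} = $twig->root->sprint;     # serialize with that style
}
$twig->set_pretty_print( 'none');           # back to the default

print "--- $_ ---\n$out{$_}\n" for qw( none indented);
```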
913
914           empty_tags
915               Set the empty tag display style ('"normal"', '"html"' or
916               '"expand"').
917
918               "normal" outputs an empty tag '"<tag/>"', "html" adds a space
919               '"<tag />"' for elements that can be empty in XHTML and
920               "expand" outputs '"<tag></tag>"'
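
                  The three styles can be compared on a single empty "br"
                  element, which is one of the elements that can be empty in
                  XHTML. The sketch uses the "set_empty_tag_style" method
                  rather than the constructor option so that one twig can be
                  reused:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new->parse( '<doc><br/></doc>');

my %out;
for my $style (qw( normal html expand)) {
    $twig->set_empty_tag_style( $style);
    $out{$style} = $twig->root->sprint;
}
$twig->set_empty_tag_style( 'normal');      # back to the default

printf "%-6s %s\n", $_, $out{$_} for qw( normal html expand);
```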
921
922           quote
923               Set the quote character for attributes ('"single"' or '"dou‐
924               ble"').
925
926           comments
927               Set the way comments are processed: '"drop"' (default),
928               '"keep"' or '"process"'
929
930               Comments processing options:
931
932               drop
933                   drops the comments, they are not read, nor printed to the
934                   output
935
936               keep
937                   comments are loaded and will appear on the output, they are
938                   not accessible within the twig and will not interfere with
939                   processing though
940
941                   Note: comments in the middle of a text element such as
942
943                     <p>text <!-- comment --> more text</p>
944
945                   are kept at their original position in the text. Using
946                   "print" methods like "print" or "sprint" will return the
947                   comments in the text. Using "text" or "field" on the other
948                   hand will not.
949
950                   Any use of "set_pcdata" on the "#PCDATA" element (directly
951                   or through other methods like "set_content") will delete
952                   the comment(s).
953
954               process
955                   comments are loaded in the twig and will be treated as reg‐
956                   ular elements (their "tag" is "#COMMENT") this can inter‐
957                   fere with processing if you expect "$elt->{first_child}" to
958                   be an element but find a comment there.  Validation will
959                   not protect you from this as comments can happen anywhere.
960                   You can use "$elt->first_child( 'tag')" (which is a good
961                   habit anyway) to get where you want.
962
963                   Consider using "process" if you are outputting SAX events
964                   from XML::Twig.
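
                  The pitfall mentioned under "process" can be sketched as
                  follows: with a comment as the first child, an unqualified
                  "first_child" returns the comment, while passing a tag
                  skips it. The sample document is made up:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new( comments => 'process')
                    ->parse( '<doc><!-- note --><p>text</p></doc>');

# Without a condition the first child is the comment element.
my $first_tag = $twig->root->first_child->tag;

# With a tag as condition the comment is skipped.
my $para_text = $twig->root->first_child( 'p')->text;

print "first child: $first_tag, first p: $para_text\n";
```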
965
966           pi  Set the way processing instructions are processed: '"drop"',
967               '"keep"' (default) or '"process"'
968
969               Note that you can also set PI handlers in the "twig_handlers"
970               option:
971
972                 '?'       => \&handler
973                 '?target' => \&handler2
974
975               The handlers will be called with 2 parameters, the twig and the
976               PI element if "pi" is set to "process", and with 3, the twig,
977               the target and the data if "pi" is set to "keep". Of course
978               they will not be called if "pi" is set to "drop".
979
980               If "pi" is set to "keep" the handler should return a string
981               that will be used as-is as the PI text (it should look like
982               "<?target data?>", or '' if you want to remove the PI).
983
984               Only one handler will be called, "?target" or "?" if no spe‐
985               cific handler for that target is available.
986
987           map_xmlns
988               This option is passed a hashref that maps URIs to prefixes.
989               The prefixes in the document will be replaced by the ones in
990               the map. The mapped prefixes can (actually have to) be used to
991               trigger handlers, navigate or query the document.
992
993               Here is an example:
994
995                 my $t= XML::Twig->new( map_xmlns => {'http://www.w3.org/2000/svg' => "svg"},
996                                        twig_handlers =>
997                                          { 'svg:circle' => sub { $_->set_att( r => 20) } },
998                                        pretty_print => 'indented',
999                                      )
1000                                 ->parse( '<doc xmlns:gr="http://www.w3.org/2000/svg">
1001                                             <gr:circle cx="10" cy="90" r="10"/>
1002                                          </doc>'
1003                                        )
1004                                 ->print;
1005
1006               This will output:
1007
1008                 <doc xmlns:svg="http://www.w3.org/2000/svg">
1009                    <svg:circle cx="10" cy="90" r="20"/>
1010                 </doc>
1011
1012           keep_original_prefix
1013               When used with "map_xmlns" this option will make "XML::Twig"
1014               use the original namespace prefixes when outputting a document.
1015               The mapped prefix will still be used for triggering handlers
1016               and in navigation and query methods.
1017
1018                 my $t= XML::Twig->new( map_xmlns => {'http://www.w3.org/2000/svg' => "svg"},
1019                                        twig_handlers =>
1020                                          { 'svg:circle' => sub { $_->set_att( r => 20) } },
1021                                        keep_original_prefix => 1,
1022                                        pretty_print => 'indented',
1023                                      )
1024                                 ->parse( '<doc xmlns:gr="http://www.w3.org/2000/svg">
1025                                             <gr:circle cx="10" cy="90" r="10"/>
1026                                          </doc>'
1027                                        )
1028                                 ->print;
1029
1030               This will output:
1031
1032                 <doc xmlns:gr="http://www.w3.org/2000/svg">
1033                    <gr:circle cx="10" cy="90" r="20"/>
1034                 </doc>
1035
1036           index ($arrayref or $hashref)
1037               This option creates lists of specific elements during the pars‐
1038               ing of the XML.  It takes a reference to either a list of trig‐
1039               gering expressions or to a hash name => expression, and for
1040               each one generates the list of elements that match the expres‐
1041               sion. The list can be accessed through the "index" method.
1042
1043               example:
1044
1045                 # using an array ref
1046                 my $t= XML::Twig->new( index => [ 'div', 'table' ])
1047                                 ->parsefile( "foo.xml");
1048                 my $divs= $t->index( 'div');
1049                 my $first_div= $divs->[0];
1050                 my $last_table= $t->index( table => -1);
1051
1052                 # using a hashref to name the indexes
1053                 $t= XML::Twig->new( index => { email => 'a[@href=~/^\s*mailto:/]' })
1054                              ->parsefile( "foo.xml");
1055                 my $last_emails= $t->index( email => -1);
1056
1057               Note that the index is not maintained after the parsing. If
1058               elements are deleted, renamed or otherwise hurt during process‐
1059               ing, the index is NOT updated.
1060
1061           Note: I _HATE_ the Java-like name of arguments used by most XML
1062           modules.  So in pure TIMTOWTDI fashion all arguments can be written
1063           either as "UglyJavaLikeName" or as "readable_perl_name":
1064           "twig_print_outside_roots" or "TwigPrintOutsideRoots" (or even
1065           "twigPrintOutsideRoots" {shudder}).  XML::Twig normalizes them
1066           before processing them.
1067
1068       parse ( $source)
1069           The $source parameter should either be a string containing the
1070           whole XML document, or it should be an open "IO::Handle". Construc‐
1071           tor options to "XML::Parser::Expat" given as keyword-value pairs
1072           may follow the $source parameter. These override, for this call, any
1073           options or attributes passed through from the XML::Parser instance.
1074
1075           A die call is thrown if a parse error occurs. Otherwise it will
1076           return the twig built by the parse. Use "safe_parse" if you want
1077           the parsing to return even when an error occurs.
1078
1079       parsestring
1080           This is just an alias for "parse" for backwards compatibility.
1081
1082       parsefile (FILE [, OPT => OPT_VALUE [...]])
1083           Open "FILE" for reading, then call "parse" with the open handle.
1084           The file is closed no matter how "parse" returns.
1085
1086           A "die" call is thrown if a parse error occurs. Otherwise it will
1087           return the twig built by the parse. Use "safe_parsefile" if you
1088           want the parsing to return even when an error occurs.
1089
1090       parsefile_inplace ( $file, $optional_extension)
1091           Parse and update a file "in place". It does this by creating a temp
1092           file, selecting it as the default for print() statements (and meth‐
1093           ods), then parsing the input file. If the parsing is successful,
1094           then the temp file is moved to replace the input file.
1095
1096           If an extension is given then the original file is backed-up (the
1097           rules for the extension are the same as the rule for the -i option
1098           in perl).
1099
1100       parsefile_html_inplace ( $file, $optional_extension)
1101           Same as parsefile_inplace, except that it parses HTML instead of
1102           XML
1103
1104       parseurl ($url $optional_user_agent)
1105           Gets the data from $url and parses it. The data is piped to the
1106           parser in chunks the size of the XML::Parser::Expat buffer, so mem‐
1107           ory consumption and hopefully speed are optimal.
1108
1109           For most (read "small") XML it is probably as efficient (and easier
1110           to debug) to just "get" the XML file and then parse it as a string.
1111
1112             use XML::Twig;
1113             use LWP::Simple;
1114             my $twig= XML::Twig->new();
1115             $twig->parse( LWP::Simple::get( $URL ));
1116
1117           or
1118
1119             use XML::Twig;
1120             my $twig= XML::Twig->nparse( $URL);
1121
1122           If the $optional_user_agent argument is given then that user agent
1123           is used, otherwise a new one is created.
1124
1125       safe_parse ( SOURCE [, OPT => OPT_VALUE [...]])
1126           This method is similar to "parse" except that it wraps the parsing
1127           in an "eval" block. It returns the twig on success and 0 on failure
1128           (the twig object also contains the parsed twig). $@ contains the
1129           error message on failure.
1130
1131           Note that the parsing still stops as soon as an error is detected,
1132           there is no way to keep going after an error.
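
               A minimal sketch of checking the return value of
               "safe_parse". The two sample strings are made up; the second
               one is deliberately not well-formed:

```perl
use strict;
use warnings;
use XML::Twig;

# Well-formed input: safe_parse returns the twig.
my $good = XML::Twig->new->safe_parse( '<doc><p>ok</p></doc>');

# Input that is not well-formed: safe_parse returns a false value
# instead of dying, and the error message is left in $@.
my $bad   = XML::Twig->new->safe_parse( '<doc><p>unclosed</doc>');
my $error = $bad ? '' : $@;

print "first parse succeeded\n" if $good;
print "second parse failed: $error\n" unless $bad;
```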
1133
1134       safe_parsefile (FILE [, OPT => OPT_VALUE [...]])
1135           This method is similar to "parsefile" except that it wraps the
1136           parsing in an "eval" block. It returns the twig on success and 0 on
1137           failure (the twig object also contains the parsed twig). $@
1138           contains the error message on failure.
1139
1140           Note that the parsing still stops as soon as an error is detected,
1141           there is no way to keep going after an error.
1142
1143       safe_parseurl ($url $optional_user_agent)
1144           Same as "parseurl" except that it wraps the parsing in an "eval"
1145           block. It returns the twig on success and 0 on failure (the twig
1146           object also contains the parsed twig). $@ contains the error
1147           message on failure.
1148
1149       parse_html
1150           parse an HTML string or file handle (by converting it to XML using
1151           HTML::TreeBuilder, which needs to be available).
1152
1153           This works nicely, but some information gets lost in the process:
1154           newlines are removed, and (at least on the version I use), comments
1155           get an extra CDATA section inside ( <!-- foo --> becomes <!--
1156           <![CDATA[ foo ]]> -->).
1157
1158       parsefile_html
1159           parse an HTML file (by converting it to XML using HTML::Tree‐
1160           Builder, which needs to be available). The file is loaded com‐
1161           pletely in memory and converted to XML before being parsed.
1162
1163           Alpha: implementation, and thus generated XML could change.
1164
1165       xparse ($thing_to_parse)
1166           parse the $thing_to_parse, whether it is a filehandle, a string, an
1167           HTML file, an HTML URL, a URL or a file.
1168
1169           Note that this is mostly a convenience method for one-off scripts.
1170           For example files that end in '.htm' or '.html' are parsed first as
1171           XML, and if this fails as HTML. This is certainly not the most
1172           efficient way to do this in general.
1173
1174       nparse ($optional_twig_options, $thing_to_parse)
1175           create a twig with the $optional_twig_options, and parse the
1176           $thing_to_parse, whether it is a filehandle, a string, an HTML
1177           file, an HTML URL, a URL or a file.
1178
1179           Examples:
1180
1181              XML::Twig->nparse( "file.xml");
1182              XML::Twig->nparse( error_context => 1, "file://file.xml");
1183
1184       nparse_pp ($optional_twig_options, $thing_to_parse)
1185           same as "nparse" but also sets the "pretty_print" option to
1186           "indented".
1187
1188       nparse_e ($optional_twig_options, $thing_to_parse)
1189           same as "nparse" but also sets the "error_context" option to 1.
1190
1191       nparse_ppe ($optional_twig_options, $thing_to_parse)
1192           same as "nparse" but also sets the "pretty_print" option to
1193           "indented" and the "error_context" option to 1.
1194
1195       parser
1196           This method returns the "expat" object (actually the
1197           XML::Parser::Expat object) used during parsing. It is useful for
1198           example to call XML::Parser::Expat methods on it. To get the line
1199           of a tag for example use "$t->parser->current_line".
1200
1201       setTwigHandlers ($handlers)
1202           Set the twig_handlers. $handlers is a reference to a hash similar
1203           to the one in the "twig_handlers" option of new. All previous han‐
1204           dlers are unset.  The method returns the reference to the previous
1205           handlers.
1206
1207       setTwigHandler ($exp $handler)
1208           Set a single twig_handler for elements matching $exp. $handler is a
1209           reference to a subroutine. If the handler was previously set then
1210           the reference to the previous handler is returned.
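
               Handlers do not have to be given to "new"; a sketch of adding
               one with "setTwigHandler" before parsing is shown below. The
               tag name and the sample document are made up:

```perl
use strict;
use warnings;
use XML::Twig;

my @titles;
my $twig = XML::Twig->new;

# Collect the text of every title element as it is parsed.
$twig->setTwigHandler( title => sub { push @titles, $_->text; 1 });

$twig->parse( '<doc><title>One</title><p>x</p><title>Two</title></doc>');

print "titles: @titles\n";
```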
1211
1212       setStartTagHandlers ($handlers)
1213           Set the start_tag handlers. $handlers is a reference to a hash sim‐
1214           ilar to the one in the "start_tag_handlers" option of new. All pre‐
1215           vious handlers are unset.  The method returns the reference to the
1216           previous handlers.
1217
1218       setStartTagHandler ($exp $handler)
1219           Set a single start_tag handler for elements matching $exp. $handler
1220           is a reference to a subroutine. If the handler was previously set
1221           then the reference to the previous handler is returned.
1222
1223       setEndTagHandlers ($handlers)
1224           Set the end_tag handlers. $handlers is a reference to a hash simi‐
1225           lar to the one in the "end_tag_handlers" option of new. All previ‐
1226           ous handlers are unset.  The method returns the reference to the
1227           previous handlers.
1228
1229       setEndTagHandler ($exp $handler)
1230           Set a single end_tag handler for elements matching $exp. $handler
1231           is a reference to a subroutine. If the handler was previously set
1232           then the reference to the previous handler is returned.
1233
1234       setTwigRoots ($handlers)
1235           Same as using the "twig_roots" option when creating the twig
1236
1237       setCharHandler ($exp $handler)
1238           Set a "char_handler"
1239
1240       setIgnoreEltsHandler ($exp)
1241           Set an "ignore_elt" handler (elements that match $exp will be
1242           ignored).
1243
1244       setIgnoreEltsHandlers ($exp)
1245           Set all "ignore_elt" handlers (previous handlers are replaced)
1246
1247       dtd Return the dtd (an XML::Twig::DTD object) of a twig
1248
1249       xmldecl
1250           Return the XML declaration for the document, or a default one if it
1251           doesn't have one
1252
1253       doctype
1254           Return the doctype for the document
1255
1256       dtd_text
1257           Return the DTD text
1258
1259       dtd_print
1260           Print the DTD
1261
1262       model ($tag)
1263           Return the model (in the DTD) for the element $tag
1264
1265       root
1266           Return the root element of a twig
1267
1268       set_root ($elt)
1269           Set the root of a twig
1270
1271       first_elt ($optional_condition)
1272           Return the first element matching $optional_condition of a twig, if
1273           no condition is given then the root is returned
1274
1275       last_elt ($optional_condition)
1276           Return the last element matching $optional_condition of a twig, if
1277           no condition is given then the last element of the twig is returned
1278
1279       elt_id        ($id)
1280           Return the element whose "id" attribute is $id
1281
1282       getEltById
1283           Same as "elt_id"
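
               The lookup methods above can be tried on a small document.
               This is a sketch with made-up content; it relies on the
               default "id" attribute name:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new->parse(
    '<doc><p id="intro">first</p><note><p id="end">last</p></note></doc>');

my $root_tag = $twig->first_elt->tag;          # no condition: the root
my $first_p  = $twig->first_elt( 'p')->text;   # first p in document order
my $last_p   = $twig->last_elt( 'p')->text;    # last p in document order
my $by_id    = $twig->elt_id( 'end')->text;    # lookup by id attribute
my $by_id2   = $twig->getEltById( 'intro')->text;

print "$root_tag / $first_p / $last_p / $by_id / $by_id2\n";
```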
1284
1285       index ($index_name, $optional_index)
1286           If the $optional_index argument is present, return the
1287           corresponding element in the index (created using the "index"
1288           option for "XML::Twig->new").
1289
1290           If the argument is not present, return an arrayref to the index
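
               A short sketch of both call forms, parsing from a string
               instead of a file. The sample document is made up:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new( index => [ 'p' ])
                    ->parse( '<doc><p>a</p><div><p>b</p></div><p>c</p></doc>');

my $all   = $twig->index( 'p');             # arrayref of all indexed p
my $count = @$all;
my $first = $all->[0]->text;
my $last  = $twig->index( p => -1)->text;   # a single element of the index

print "$count p elements, first: $first, last: $last\n";
```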
1291
1292       encoding
1293           This method returns the encoding of the XML document, as defined by
1294           the "encoding" attribute in the XML declaration (ie it is "undef"
1295           if the attribute is not defined)
1296
1297       set_encoding
1298           This method sets the value of the "encoding" attribute in the XML
1299           declaration.  Note that if the document did not have a declaration
1300           it is generated (with an XML version of 1.0)
1301
1302       xml_version
1303           This method returns the XML version, as defined by the "version"
1304           attribute in the XML declaration (ie it is "undef" if the attribute
1305           is not defined)
1306
1307       set_xml_version
1308           This method sets the value of the "version" attribute in the XML
1309           declaration.  If the declaration did not exist it is created.
1310
1311       standalone
1312           This method returns the value of the "standalone" declaration for
1313           the document
1314
1315       set_standalone
1316           This method sets the value of the "standalone" attribute in the XML
1317           declaration.  Note that if the document did not have a declaration
1318           it is generated (with an XML version of 1.0)
1319
1320       set_output_encoding
1321           Set the "encoding" "attribute" in the XML declaration
1322
1323       set_doctype ($name, $system, $public, $internal)
1324           Set the doctype of the document. If an argument is "undef" (or not
1325           present) then its former value is retained; if a false ('' or 0)
1326           value is passed then the former value is deleted.
1327
1328       entity_list
1329           Return the entity list of a twig
1330
1331       entity_names
1332           Return the list of all defined entities
1333
1334       entity ($entity_name)
1335           Return the entity
1336
1337       change_gi      ($old_gi, $new_gi)
1338           Performs a (very fast) global change. All elements $old_gi are now
1339           $new_gi. This is a bit dangerous though and should be avoided if
1340           possible, as the new tag might be ignored in subsequent processing.
1341
1342           See "BUGS "
1343
1344       flush            ($optional_filehandle, %options)
1345           Flushes a twig up to (and including) the current element, then
1346           deletes all unnecessary elements from the tree that's kept in mem‐
1347           ory.  "flush" keeps track of which elements need to be open/closed,
1348           so if you flush from handlers you don't have to worry about any‐
1349           thing. Just keep flushing the twig every time you're done with a
1350           sub-tree and it will come out well-formed. Once parsing is done,
1351           don't forget to "flush" one more time to print the end of the
1352           document.  The doctype and entity declarations are also printed.
1353
1354           flush takes an optional filehandle as an argument.
1355
1356           options: use the "update_DTD" option if you have updated the
1357           (internal) DTD and/or the entity list and you want the updated DTD
1358           to be output
1359
1360           The "pretty_print" option sets the pretty printing of the document.
1361
1362              Example: $t->flush( Update_DTD => 1);
1363                       $t->flush( $filehandle, pretty_print => 'indented');
1364                       $t->flush( \*FILE);
1365
1366       flush_up_to ($elt, $optional_filehandle, %options)
1367           Flushes up to the $elt element. This allows you to keep part of the
1368           tree in memory when you "flush".
1369
1370           options: see flush.
1371
1372       purge
1373           Does the same as a "flush" except it does not print the twig. It
1374           just deletes all elements that have been completely parsed so far.
1375
1376       purge_up_to ($elt)
1377           Purges up to the $elt element. This allows you to keep part of the
1378           tree in memory when you "purge".
1379
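               A minimal sketch of "purge" used from a handler, for the case
               where the document is only read and nothing needs to be
               printed. The sample data and the @seen array are made up for
               illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data, inlined so the sketch is self-contained
my $xml = '<log><entry>first</entry><entry>second</entry>'
        . '<entry>third</entry></log>';

my @seen;
my $twig = XML::Twig->new(
    twig_handlers => {
        entry => sub {
            my ($t, $entry) = @_;
            push @seen, $entry->text;   # use the element while it is in memory
            $t->purge;                  # then free what has been parsed, printing nothing
        },
    },
);
$twig->parse($xml);

print scalar(@seen), " entries: @seen\n";
```

               Unlike "flush", nothing is output here, so this pattern suits
               extraction or counting jobs rather than filters.
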
1380       print            ($optional_filehandle, %options)
1381           Prints the whole document associated with the twig. To be used only
1382           AFTER the parse.
1383
1384           options: see "flush".
1385
1386       print_to_file    ($filename, %options)
1387           Prints the whole document associated with the twig to file $file‐
1388           name.  To be used only AFTER the parse.
1389
1390           options: see "flush".
1391
1392       sprint
1393           Return the text of the whole document associated with the twig. To
1394           be used only AFTER the parse.
1395
1396           options: see "flush".
1397
1398       trim
1399           Trim the document: gets rid of initial and trailing spaces, and
1400           replaces multiple spaces by a single one.
1401
1402       toSAX1 ($handler)
1403           Send SAX events for the twig to the SAX1 handler $handler
1404
1405       toSAX2 ($handler)
1406           Send SAX events for the twig to the SAX2 handler $handler
1407
1408       flush_toSAX1 ($handler)
1409           Same as flush, except that SAX events are sent to the SAX1 handler
1410           $handler instead of the twig being printed
1411
1412       flush_toSAX2 ($handler)
1413           Same as flush, except that SAX events are sent to the SAX2 handler
1414           $handler instead of the twig being printed
1415
1416       ignore
1417           This method should be called during parsing, usually in
1418           "start_tag_handlers".  It causes the element to be skipped during
1419           the parsing: the twig is not built for this element, it will not be
1420           accessible during parsing or after it. The element will not take up
1421           any memory and parsing will be faster.
1422
1423           Note that this method can also be called on an element. If the ele‐
1424           ment is a parent of the current element then this element will be
1425           ignored (the twig will not be built any more for it and what has
1426           already been built will be deleted).
1427
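               A sketch of "ignore" called from a "start_tag_handlers" entry.
               The element names ("secret", "keep") are hypothetical, and the
               example assumes that calling "ignore" on the twig skips the
               element whose start tag triggered the handler.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document
my $xml = '<doc><keep>visible</keep>'
        . '<secret><deep>hidden</deep></secret>'
        . '<keep>also visible</keep></doc>';

my $twig = XML::Twig->new(
    start_tag_handlers => {
        # called when <secret> opens: skip the element and everything in it
        secret => sub { my ($t) = @_; $t->ignore; },
    },
);
$twig->parse($xml);

# Only the elements that were not ignored are in the tree
my @tags = map { $_->tag } $twig->root->children;
print "@tags\n";
```
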
1428       set_pretty_print  ($style)
1429           Set the pretty print method, amongst '"none"' (default),
1430           '"nsgmls"', '"nice"', '"indented"', "indented_c", '"wrapped"',
1431           '"record"' and '"record_c"'
1432
1433           WARNING: the pretty print style is a GLOBAL variable, so once set
1434           it's applied to ALL "print"'s (and "sprint"'s). Same goes if you
1435           use XML::Twig with "mod_perl" . This should not be a problem as the
1436           XML that's generated is valid anyway, and XML processors (as well
1437           as HTML processors, including browsers) should not care. Let me
1438           know if this is a big problem, but at the moment the perfor‐
1439           mance/cleanliness trade-off clearly favors the global approach.
1440
1441       set_empty_tag_style  ($style)
1442           Set the empty tag display style ('"normal"', '"html"' or
1443           '"expand"'). As with "set_pretty_print" this sets a global flag.
1444
1445           "normal" outputs an empty tag '"<tag/>"', "html" adds a space
1446           '"<tag />"' for elements that can be empty in XHTML and "expand"
1447           outputs '"<tag></tag>"'
1448
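               The output styles can be compared on a small document. This is
               a sketch only; because both settings are global, it resets them
               to their defaults at the end.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><br/><p>text</p></doc>');

$twig->set_empty_tag_style('expand');
my $expanded = $twig->sprint;           # empty element written as <br></br>

$twig->set_empty_tag_style('html');
my $html = $twig->sprint;               # empty element written as <br />

$twig->set_empty_tag_style('normal');   # back to the default style
my $normal = $twig->sprint;             # empty element written as <br/>

$twig->set_pretty_print('indented');
my $indented = $twig->sprint;           # same content, spread over several lines
$twig->set_pretty_print('none');        # back to the default style

print "$expanded\n$html\n$normal\n$indented\n";
```
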
1449       set_remove_cdata  ($flag)
1450           set (or unset) the flag that forces the twig to output CDATA sec‐
1451           tions as regular (escaped) PCDATA
1452
1453       print_prolog     ($optional_filehandle, %options)
1454           Prints the prolog (XML declaration + DTD + entity declarations) of
1455           a document.
1456
1457           options: see "flush".
1458
1459       prolog     ($optional_filehandle, %options)
1460           Return the prolog (XML declaration + DTD + entity declarations) of
1461           a document.
1462
1463           options: see "flush".
1464
1465       finish
1466           Call Expat "finish" method.  Unsets all handlers (including inter‐
1467           nal ones that set context), but expat continues parsing to the end
1468           of the document or until it finds an error.  It should finish up a
1469           lot faster than with the handlers set.
1470
1471       finish_print
1472           Stop twig processing, flush the twig and proceed to finish printing
1473           the document as fast as possible. Use this method when modifying a
1474           document and the modification is done.
1475
1476       set_expand_external_entities
1477           Same as using the "expand_external_ents" option when creating the
1478           twig
1479
1480       set_input_filter
1481           Same as using the "input_filter" option when creating the twig
1482
1483       set_keep_atts_order
1484           Same as using the "keep_atts_order" option when creating the twig
1485
1486       set_keep_encoding
1487           Same as using the "keep_encoding" option when creating the twig
1488
1489       set_output_filter
1490           Same as using the "output_filter" option when creating the twig
1491
1492       set_output_text_filter
1493           Same as using the "output_text_filter" option when creating the
1494           twig
1495
1496       add_stylesheet ($type, @options)
1497           Adds an external stylesheet to an XML document.
1498
1499           Supported types and options:
1500
1501           xsl option: the url of the stylesheet
1502
1503               Example:
1504
1505                 $t->add_stylesheet( xsl => "xsl_style.xsl");
1506
1507               will generate the following PI at the beginning of the docu‐
1508               ment:
1509
1510                 <?xml-stylesheet type="text/xsl" href="xsl_style.xsl"?>
1511
1512           css option: the url of the stylesheet
1513
1514       Methods inherited from XML::Parser::Expat
1515           A twig inherits all the relevant methods from XML::Parser::Expat.
1516           These methods can only be used during the parsing phase (they will
1517           generate a fatal error otherwise).
1518
1519           Inherited methods are:
1520
1521           depth
1522               Returns the size of the context list.
1523
1524           in_element
1525               Returns true if NAME is equal to the name of the innermost cur‐
1526               rently opened element. If namespace processing is being used
1527               and you want to check against a name that may be in a names‐
1528               pace, then use the generate_ns_name method to create the NAME
1529               argument.
1530
1531           within_element
1532               Returns the number of times the given name appears in the con‐
1533               text list.  If namespace processing is being used and you want
1534               to check against a name that may be in a namespace, then use
1535               the generate_ns_name method to create the NAME argument.
1536
1537           context
1538               Returns a list of element names that represent open elements,
1539               with the last one being the innermost. Inside start and end tag
1540               handlers, this will be the tag of the parent element.
1541
1542           current_line
1543               Returns the line number of the current position of the parse.
1544
1545           current_column
1546               Returns the column number of the current position of the parse.
1547
1548           current_byte
1549               Returns the current position of the parse.
1550
1551           position_in_context
1552               Returns a string that shows the current parse position. LINES
1553               should be an integer >= 0 that represents the number of lines
1554               on either side of the current parse line to place into the
1555               returned string.
1556
1557           base ([NEWBASE])
1558               Returns the current value of the base for resolving relative
1559               URIs.  If NEWBASE is supplied, changes the base to that value.
1560
1561           current_element
1562               Returns the name of the innermost currently opened element.
1563               Inside start or end handlers, returns the parent of the element
1564               associated with those tags.
1565
1566           element_index
1567               Returns an integer that is the depth-first visit order of the
1568               current element. This will be zero outside of the root ele‐
1569               ment. For example, this will return 1 when called from the
1570               start handler for the root element start tag.
1571
1572           recognized_string
1573               Returns the string from the document that was recognized in
1574               order to call the current handler. For instance, when called
1575               from a start handler, it will give us the start-tag string.
1576               The string is encoded in UTF-8.  This method doesn't return a
1577               meaningful string inside declaration handlers.
1578
1579           original_string
1580               Returns the verbatim string from the document that was recog‐
1581               nized in order to call the current handler. The string is in
1582               the original document encoding. This method doesn't return a
1583               meaningful string inside declaration handlers.
1584
1585           xpcroak
1586               Concatenate onto the given message the current line number
1587               within the XML document plus the message implied by ErrorCon‐
1588               text. Then croak with the formed message.
1589
1590           xpcarp
1591               Concatenate onto the given message the current line number
1592               within the XML document plus the message implied by ErrorCon‐
1593               text. Then carp with the formed message.
1594
1595           xml_escape(TEXT [, CHAR [, CHAR ...]])
1596               Returns TEXT with markup characters turned into character enti‐
1597               ties.  Any additional characters provided as arguments are also
1598               turned into character references where found in TEXT.
1599
1600               (this method is broken on some versions of expat/XML::Parser)
1601
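               Since the inherited methods are only usable while parsing, they
               are typically called from a handler. A minimal sketch, with a
               made-up document and a hypothetical @report array, that records
               the parser position each time a "para" element is completed:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document spread over several lines
my $xml = "<doc>\n  <sec>\n    <para>one</para>\n  </sec>\n"
        . "  <para>two</para>\n</doc>";

my @report;
my $twig = XML::Twig->new(
    twig_handlers => {
        para => sub {
            my ($t, $para) = @_;
            # current_line is inherited from XML::Parser::Expat and is
            # only valid while the parse is running
            push @report, { text => $para->text, line => $t->current_line };
        },
    },
);
$twig->parse($xml);

printf "%s (line %d)\n", $_->{text}, $_->{line} for @report;
```
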
1602       path ( $optional_tag)
1603           Return the element context in a form similar to XPath's short form:
1604           '"/root/tag1/../tag"'
1605
1606       get_xpath  ( $optional_array_ref, $xpath, $optional_offset)
1607           Performs a "get_xpath" on the document root (see "XML::Twig::Elt")
1608
1609           If the $optional_array_ref argument is used the array must contain
1610           elements. The $xpath expression is applied to each element in turn
1611           and the result is the union of all results. This way a first query can
1612           be refined in further steps.
1613
1614       find_nodes ( $optional_array_ref, $xpath, $optional_offset)
1615           same as "get_xpath"
1616
1617       findnodes ( $optional_array_ref, $xpath, $optional_offset)
1618           same as "get_xpath" (similar to the XML::LibXML method)
1619
1620       findvalue ( $optional_array_ref, $xpath, $optional_offset)
1621           Return the "join" of all texts of the results of applying
1622           "get_xpath" to the node (similar to the XML::LibXML method)
1623
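               A short sketch of the query methods called with just an XPath
               expression (the optional array reference and offset are left
               out). The document is made up for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse( '<doc><section><title>Intro</title><para>a</para></section>'
            . '<section><title>Usage</title></section></doc>' );

# List of matching elements, in document order
my @titles = $twig->findnodes('/doc/section/title');

# Texts of all matches joined into a single string
my $joined = $twig->findvalue('/doc/section/title');

print scalar(@titles), " titles, joined: $joined\n";
```
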
1624       subs_text ($regexp, $replace)
1625           subs_text does text substitution on the whole document, similar to
1626           perl's " s///" operator.
1627
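               For instance, a spelling change across a whole document could
               be sketched like this (sample text is made up):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><p>The colour red</p><p>Another colour</p></doc>');

# Replace the matched text in every text node of the document
$twig->subs_text( qr/colour/, 'color');

my @texts = map { $_->text } $twig->root->children('p');
print "$_\n" for @texts;
```
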
1628       dispose
1629           Useful only if you don't have "Scalar::Util" or "WeakRef"
1630           installed.
1631
1632           Reclaims properly the memory used by an XML::Twig object. As the
1633           object has circular references it never goes out of scope, so if
1634           you want to parse lots of XML documents then the memory leak
1635           becomes a problem. Use "$twig->dispose" to clear this problem.
1636
1637       create_accessors (list_of_attribute_names)
1638           A convenience method that creates l-valued accessors for
1639           attributes.  So "$twig->create_accessors( 'foo')" will create a
1640           "foo" method that can be called on elements:
1641
1642             $elt->foo;         # equivalent to $elt->{'att'}->{'foo'};
1643             $elt->foo( 'bar'); # equivalent to $elt->set_att( foo => 'bar');
1644
1645       set_do_not_escape_amp_in_atts
1646           An evil method, that I only document because Test::Pod::Coverage
1647           complains otherwise, but really, you don't want to know about it.
1648
1649       XML::Twig::Elt
1650
1651       new          ($optional_tag, $optional_atts, @optional_content)
1652           The "tag" is optional (but then you can't have any content), the
1653           $optional_atts argument is a reference to a hash of attributes,
1654           the content can be just a string or a list of strings and elements.
1655           A content of '"#EMPTY"' creates an empty element.
1656
1657            Examples: my $elt= XML::Twig::Elt->new();
1658                      my $elt= XML::Twig::Elt->new( para => { align => 'center' });
1659                      my $elt= XML::Twig::Elt->new( para => { align => 'center' }, 'foo');
1660                      my $elt= XML::Twig::Elt->new( br   => '#EMPTY');
1661                      my $elt= XML::Twig::Elt->new( 'para');
1662                      my $elt= XML::Twig::Elt->new( para => 'this is a para');
1663                      my $elt= XML::Twig::Elt->new( para => $elt3, 'another para');
1664
1665           The strings are not parsed, the element is not attached to any
1666           twig.
1667
1668           WARNING: if you rely on ID's then you will have to set the id your‐
1669           self. At this point the element does not belong to a twig yet, so
1670           the ID attribute is not known so it won't be stored in the ID
1671           list.
1672
1673           Note that "#COMMENT", "#PCDATA" or "#CDATA" are valid tag names,
1674           that will create text elements.
1675
1676           To create an element "foo" containing a CDATA section:
1677
1678                      my $foo= XML::Twig::Elt->new( '#CDATA' => "content of the CDATA section")
1679                                             ->wrap_in( 'foo');
1680
1681           An attribute of '#CDATA', will create the content of the attribute
1682           as CDATA:
1683
1684             my $elt= XML::Twig::Elt->new( 'p' => { '#CDATA' => 1}, 'foo < bar');
1685
1686           creates an element
1687
1688             <p><![CDATA[foo < bar]]></p>
1689
1690       parse         ($string, %args)
1691           Creates an element from an XML string. The string is actually
1692           parsed as a new twig, then the root of that twig is returned.  The
1693           arguments in %args are passed to the twig.  As always if the parse
1694           fails the parser will die, so use an eval if you want to trap syn‐
1695           tax errors.
1696
1697           As obviously the element does not exist beforehand this method has
1698           to be called on the class:
1699
1700             my $elt= XML::Twig::Elt->parse( "<a> string to parse, with <sub/>
1701                                             <elements>, actually tons of </elements>
1702                             h</a>");
1703
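               A minimal sketch of "parse" called on the class, including the
               "eval" wrapper mentioned above for trapping syntax errors. The
               XML strings are made up for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# Well-formed input: the root of the temporary twig is returned
my $elt  = XML::Twig::Elt->parse('<p>some <b>bold</b> text</p>');
my $tag  = $elt->tag;
my $bold = $elt->first_child('b')->text;

# Malformed input: the parser dies, so trap the error with eval
my $bad   = eval { XML::Twig::Elt->parse('<p>unclosed') };
my $error = $@;

print "parsed <$tag>, bold text: $bold\n";
print "second parse failed\n" unless defined $bad;
```
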
1704       set_inner_xml ($string)
1705           Sets the content of the element to be the tree created from the
1706           string
1707
1708       set_inner_html ($string)
1709           Sets the content of the element, after parsing the string with an
1710           HTML parser (HTML::Parser)
1711
1712       print         ($optional_filehandle, $optional_pretty_print_style)
1713           Prints an entire element, including the tags, optionally to a
1714           $optional_filehandle, optionally with a $optional_pretty_print_style.
1715
1716           The print outputs XML data so base entities are escaped.
1717
1718       sprint       ($elt, $optional_no_enclosing_tag)
1719           Return the xml string for an entire element, including the tags.
1720           If the optional second argument is true then only the string inside
1721           the element is returned (the start and end tag for $elt are not).
1722           The text is XML-escaped: base entities (& and < in text, & < and "
1723           in attribute values) are turned into entities.
1724
1725       gi  Return the gi of the element (the gi is the "generic identifier",
1726           the tag name in SGML parlance).
1727
1728           "tag" and "name" are synonyms of "gi".
1729
1730       tag Same as "gi"
1731
1732       name
1733           Same as "tag"
1734
1735       set_gi         ($tag)
1736           Set the gi (tag) of an element
1737
1738       set_tag        ($tag)
1739           Set the tag (="gi") of an element
1740
1741       set_name       ($name)
1742           Set the name (="tag") of an element
1743
1744       root
1745           Return the root of the twig in which the element is contained.
1746
1747       twig
1748           Return the twig containing the element.
1749
1750       parent        ($optional_condition)
1751           Return the parent of the element, or the first ancestor matching
1752           the $optional_condition
1753
1754       first_child   ($optional_condition)
1755           Return the first child of the element, or the first child matching
1756           the $optional_condition
1757
1758       has_child ($optional_condition)
1759           Return the first child of the element, or the first child matching
1760           the $optional_condition (same as first_child)
1761
1762       has_children ($optional_condition)
1763           Return the first child of the element, or the first child matching
1764           the $optional_condition (same as first_child)
1765
1766       first_child_text   ($optional_condition)
1767           Return the text of the first child of the element, or the first
1768           child matching the $optional_condition.
1769           If there is no first_child then returns ''. This avoids getting
1770           the child, checking for its existence then getting the text for
1771           trivial cases.
1772
1773           Similar methods are available for the other navigation methods:
1774
1775           last_child_text
1776           prev_sibling_text
1777           next_sibling_text
1778           prev_elt_text
1779           next_elt_text
1780           child_text
1781           parent_text
1782
1783           All these methods also exist in a "trimmed" variant:
1784
1785           first_child_trimmed_text
1786           last_child_trimmed_text
1787           prev_sibling_trimmed_text
1788           next_sibling_trimmed_text
1789           prev_elt_trimmed_text
1790           next_elt_trimmed_text
1791           child_trimmed_text
1792           parent_trimmed_text
1793       field         ($optional_condition)
1794           Same method as "first_child_text" with a different name
1795
1796       trimmed_field         ($optional_condition)
1797           Same method as "first_child_trimmed_text" with a different name
1798
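               The text shortcuts can be sketched as follows, with a made-up
               record. Note how a missing child yields an empty string rather
               than an error.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse( '<book><title>  Processing   XML  </title>'
            . '<author>A. Writer</author></book>' );
my $book = $twig->root;

my $trimmed = $book->first_child_trimmed_text('title');  # spaces normalized
my $same    = $book->trimmed_field('title');             # same method, other name
my $author  = $book->field('author');                    # text of the first <author>
my $missing = $book->field('isbn');                      # no such child: ''

print "[$trimmed] [$author] [$missing]\n";
```
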
1799       set_field ($condition, $optional_atts, @list_of_elt_and_strings)
1800           Set the content of the first child of the element that matches
1801           $condition, the rest of the arguments are the same as for
1802           "set_content"
1803
1804           If no child matches $condition _and_ if $condition is a valid XML
1805           element name, then a new element by that name is created and
1806           inserted as the last child.
1807
1808       first_child_matches   ($optional_condition)
1809           Return the element if the first child of the element (if it exists)
1810           passes the $optional_condition, "undef" otherwise
1811
1812             if( $elt->first_child_matches( 'title')) ...
1813
1814           is equivalent to
1815
1816             if( $elt->{first_child} && $elt->{first_child}->passes( 'title'))
1817
1818           "first_child_is" is another name for this method
1819
1820           Similar methods are available for the other navigation methods:
1821
1822           last_child_matches
1823           prev_sibling_matches
1824           next_sibling_matches
1825           prev_elt_matches
1826           next_elt_matches
1827           child_matches
1828           parent_matches
1829       is_first_child ($optional_condition)
1830           returns true (the element) if the element is the first child of its
1831           parent (optionally that satisfies the $optional_condition)
1832
1833       is_last_child ($optional_condition)
1834           returns true (the element) if the element is the last child of its
1835           parent (optionally that satisfies the $optional_condition)
1836
1837       prev_sibling  ($optional_condition)
1838           Return the previous sibling of the element, or the previous sibling
1839           matching $optional_condition
1840
1841       next_sibling  ($optional_condition)
1842           Return the next sibling of the element, or the first one matching
1843           $optional_condition.
1844
1845       next_elt     ($optional_elt, $optional_condition)
1846           Return the next elt (optionally matching $optional_condition) of
1847           the element. This is defined as the next element which opens after
1848           the current element opens.  Which usually means the first child of
1849           the element.  Counter-intuitive as it might look this allows you to
1850           loop through the whole document by starting from the root.
1851
1852           The $optional_elt is the root of a subtree. When the "next_elt" is
1853           out of the subtree then the method returns undef. You can then walk
1854           a sub tree with:
1855
1856             my $elt= $subtree_root;
1857             while( $elt= $elt->next_elt( $subtree_root))
1858               { # insert processing code here
1859               }
1860
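               A runnable version of the subtree walk, as a sketch on a small
               made-up document. The walk starts at the "a" element and stops
               as soon as "next_elt" would leave that subtree.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><a><b><e/></b><c/></a><d/></doc>');

my $subtree_root = $twig->root->first_child('a');

my @tags;
my $elt = $subtree_root;
while ( $elt = $elt->next_elt($subtree_root) ) {
    push @tags, $elt->tag;      # elements are visited in document order
}

print "@tags\n";                # prints "b e c": <d/> is outside the subtree
```
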
1861       prev_elt     ($optional_condition)
1862           Return the previous elt (optionally matching $optional_condition)
1863           of the element. This is the first element which opens before the
1864           current one.  It is usually either the last descendant of the pre‐
1865           vious sibling or simply the parent
1866
1867       next_n_elt   ($offset, $optional_condition)
1868           Return the $offset-th element that matches the $optional_condition
1869
1870       following_elt
1871           Return the following element (as per the XPath following axis)
1872
1873       preceding_elt
1874           Return the preceding element (as per the XPath preceding axis)
1875
1876       following_elts
1877           Return the list of following elements (as per the XPath following
1878           axis)
1879
1880       preceding_elts
1881           Return the list of preceding elements (as per the XPath preceding
1882           axis)
1883
1884       children     ($optional_condition)
1885           Return the list of children (optionally which matches
1886           $optional_condition) of the element. The list is in document order.
1887
1888       children_count ($optional_condition)
1889           Return the number of children of the element (optionally which
1890           matches $optional_condition)
1891
1892       children_text ($optional_condition)
1893           Return an array containing the text of children of the element
1894           (optionally which matches $optional_condition)
1895
1896       children_trimmed_text ($optional_condition)
1897           Return an array containing the trimmed text of children of the ele‐
1898           ment (optionally which matches $optional_condition)
1899
1900       children_copy ($optional_condition)
1901           Return a list of elements that are copies of the children of the
1902           element, optionally which matches $optional_condition
1903
1904       descendants     ($optional_condition)
1905           Return the list of all descendants (optionally which matches
1906           $optional_condition) of the element. This is the equivalent of the
1907           "getElementsByTagName" of the DOM (by the way, if you are really a
1908           DOM addict, you can use "getElementsByTagName" instead)
1909
1910       getElementsByTagName ($optional_condition)
1911           Same as "descendants"
1912
1913       find_by_tag_name ($optional_condition)
1914           Same as "descendants"
1915
1916       descendants_or_self ($optional_condition)
1917           Same as "descendants" except that the element itself is included in
1918           the list if it matches the $optional_condition
1919
1920       first_descendant  ($optional_condition)
1921           Return the first descendant of the element that matches the condi‐
1922           tion
1923
1924       last_descendant  ($optional_condition)
1925           Return the last descendant of the element that matches the condi‐
1926           tion
1927
1928       ancestors    ($optional_condition)
1929           Return the list of ancestors (optionally matching $optional_condi‐
1930           tion) of the element.  The list is ordered from the innermost
1931           ancestor to the outermost one
1932
1933           NOTE: the element itself is not part of the list, in order to
1934           include it you will have to use ancestors_or_self
1935
1936       ancestors_or_self     ($optional_condition)
1937           Return the list of ancestors (optionally matching $optional_condi‐
1938           tion) of the element, including the element (if it matches the
1939           condition).  The list is ordered from the innermost ancestor to the
1940           outermost one
1941
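               The ordering of the two ancestor lists can be checked with a
               small sketch (document and variable names are made up):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><section><para><b>x</b></para></section></doc>');

my $bold = $twig->first_elt('b');

# Innermost ancestor first, the element itself excluded
my @up = map { $_->tag } $bold->ancestors;

# Same list, starting with the element itself
my @up_or_self = map { $_->tag } $bold->ancestors_or_self;

# Only the ancestors that match a condition
my @sections = map { $_->tag } $bold->ancestors('section');

print "@up\n@up_or_self\n@sections\n";
```
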
1942       passes ($condition)
1943           Return the element if it passes the $condition
1944
1945       att          ($att)
1946           Return the value of attribute $att or "undef"
1947
1948       set_att      ($att, $att_value)
1949           Set the attribute of the element to the given value
1950
1951           You can actually set several attributes this way:
1952
1953             $elt->set_att( att1 => "val1", att2 => "val2");
1954
1955       del_att      ($att)
1956           Delete the attribute for the element
1957
1958           You can actually delete several attributes at once:
1959
1960             $elt->del_att( 'att1', 'att2', 'att3');
1961
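               The three attribute methods together, as a minimal sketch on a
               free-standing element (attribute names are made up):

```perl
use strict;
use warnings;
use XML::Twig;

my $elt = XML::Twig::Elt->new( para => { align => 'center' }, 'hello');

$elt->set_att( class => 'intro', lang => 'en');   # set several attributes at once
$elt->del_att( 'align');                          # remove one

my $class = $elt->att('class');    # 'intro'
my $align = $elt->att('align');    # undef, the attribute is gone

print $elt->sprint, "\n";
```
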
1962       cut Cut the element from the tree. The element still exists, it can be
1963           copied or pasted somewhere else, it is just not attached to the
1964           tree anymore.
1965
1966           Note that the "old" links to the parent, previous and next siblings
1967           can still be accessed using the former_* methods
1968
1969       former_next_sibling
1970           Returns the former next sibling of a cut node (or undef if the node
1971           has not been cut)
1972
1973           This makes it easier to write loops where you cut elements:
1974
1975               my $child= $parent->first_child( 'achild');
1976               while( $child && $child->att( 'cut'))
1977                 { $child->cut; $child= $child->former_next_sibling; }
1978
1979       former_prev_sibling
1980           Returns the former previous sibling of a cut node (or undef if the
1981           node has not been cut)
1982
1983       former_parent
1984           Returns the former parent of a cut node (or undef if the node has
1985           not been cut)
1986
1987       cut_children ($optional_condition)
1988           Cut all the children of the element (or all of those which satisfy
1989           the $optional_condition).
1990
1991           Return the list of children
1992
1993       copy        ($elt)
1994           Return a copy of the element. The copy is a "deep" copy: all sub
1995           elements of the element are duplicated.
1996
1997       paste       ($optional_position, $ref)
1998           Paste a (previously "cut" or newly generated) element. Die if the
1999           element already belongs to a tree.
2000
2001           Note that the calling element is pasted:
2002
2003             $child->paste( first_child => $existing_parent);
2004             $new_sibling->paste( after => $this_sibling_is_already_in_the_tree);
2005
2006           or
2007
2008             my $new_elt= XML::Twig::Elt->new( tag => $content);
2009             $new_elt->paste( $position => $existing_elt);
2010
2011           Example:
2012
             my $t= XML::Twig->new->parsefile( 'doc.xml');
             my $toc= $t->root->new( 'toc');
             $toc->paste( $t->root); # $toc is pasted as first child of the root
             foreach my $title ($t->findnodes( '/doc/section/title'))
               { my $title_toc= $title->copy;
                 # paste $title_toc as the last child of toc
                 $title_toc->paste( last_child => $toc);
               }
2021
2022           Position options:
2023
2024           first_child (default)
2025               The element is pasted as the first child of $ref
2026
2027           last_child
2028               The element is pasted as the last child of $ref
2029
2030           before
2031               The element is pasted before $ref, as its previous sibling.
2032
2033           after
2034               The element is pasted after $ref, as its next sibling.
2035
2036           within
2037               In this case an extra argument, $offset, should be supplied.
2038               The element will be pasted in the reference element (or in its
2039               first text child) at the given offset. To achieve this the ref‐
2040               erence element will be split at the offset.
2041
2042           Note that you can call directly the underlying method:
2043
2044           paste_before
2045           paste_after
2046           paste_first_child
2047           paste_last_child
           paste_within

       move       ($optional_position, $ref)
2050           Move an element in the tree.  This is just a "cut" then a "paste".
2051           The syntax is the same as "paste".
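
           The following sketch shows paste with the default position, paste
           with an explicit position, and move. The document, the tags and
           the variable names are invented for the example.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse( '<doc><p>one</p><p>two</p></doc>');
my $root = $twig->root;

# No position given: the element is pasted as the first child.
my $title = XML::Twig::Elt->new( title => 'Heading');
$title->paste( $root);

# Explicit position, relative to an element already in the tree.
my $note = XML::Twig::Elt->new( note => 'end');
$note->paste( after => $root->last_child);

my $before_move = $root->sprint;
# <doc><title>Heading</title><p>one</p><p>two</p><note>end</note></doc>

# move is a cut followed by a paste, with the same arguments as paste.
$title->move( last_child => $root);
my $after_move = $root->sprint;
# <doc><p>one</p><p>two</p><note>end</note><title>Heading</title></doc>

print "$before_move\n$after_move\n";
```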
2052
2053       replace       ($ref)
           Replaces an element in the tree. Sometimes it is just not possible
           to "cut" an element then "paste" another in its place, so "replace"
           comes in handy.  The calling element replaces $ref.
2057
2058       replace_with   (@elts)
2059           Replaces the calling element with one or more elements
2060
2061       delete
           Cut the element and free its memory.
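
           A short sketch of replace, replace_with and delete follows. The
           tags used here are placeholders chosen for the example.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse( '<doc><old>x</old><hr/><tmp>gone</tmp></doc>');
my $root = $twig->root;

# replace: the calling element takes the place of its argument.
my $new = XML::Twig::Elt->new( new => 'y');
$new->replace( $root->first_child( 'old'));

# replace_with: the calling element is replaced by one or more elements.
$root->first_child( 'hr')->replace_with(
    XML::Twig::Elt->new( p => 'first'),
    XML::Twig::Elt->new( p => 'second'),
);

# delete: the element is cut and discarded.
$root->first_child( 'tmp')->delete;

my $result = $root->sprint;
# <doc><new>y</new><p>first</p><p>second</p></doc>
print "$result\n";
```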
2063
2064       prefix       ($text, $optional_option)
           Add a prefix to an element. If the element is a "PCDATA" element
           the text is added to the pcdata, if the element's first child is a
           "PCDATA" then the text is added to its pcdata, otherwise a new
           "PCDATA" element is created and pasted as the first child of the
           element.

           If the option is "asis" then the prefix is added as is: it is
           created in a separate "PCDATA" element with an "asis" property.
           You can then write:
2074
2075             $elt1->prefix( '<b>', 'asis');
2076
2077           to create a "<b>" in the output of "print".
2078
2079       suffix       ($text, $optional_option)
           Add a suffix to an element. If the element is a "PCDATA" element
           the text is added to the pcdata, if the element's last child is a
           "PCDATA" then the text is added to its pcdata, otherwise a new
           "PCDATA" element is created and pasted as the last child of the
           element.

           If the option is "asis" then the suffix is added as is: it is
           created in a separate "PCDATA" element with an "asis" property.
           You can then write:
2089
2090             $elt2->suffix( '</b>', 'asis');
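
           The difference between a plain and an "asis" prefix or suffix can
           be sketched as follows. The document and variable names are
           assumptions made for the example.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse( '<doc><p>body</p><q>text</q></doc>');
my $p = $twig->root->first_child( 'p');
my $q = $twig->root->first_child( 'q');

# Plain prefix and suffix: the strings are merged into the existing text.
$p->prefix( 'Note: ');
$p->suffix( '.');
my $plain = $p->sprint;     # <p>Note: body.</p>

# asis: the strings are output unescaped, so they show up as markup.
$q->prefix( '<b>', 'asis');
$q->suffix( '</b>', 'asis');
my $markup = $q->sprint;    # <q><b>text</b></q>

print "$plain\n$markup\n";
```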
2091
2092       trim
2093           Trim the element in-place: spaces at the beginning and at the end
2094           of the element are discarded and multiple spaces within the element
2095           (or its descendants) are replaced by a single space.
2096
2097           Note that in some cases you can still end up with multiple spaces,
2098           if they are split between several elements:
2099
2100             <doc>  text <b>  hah! </b>  yep</doc>
2101
2102           gets trimmed to
2103
2104             <doc>text <b> hah! </b> yep</doc>
2105
2106           This is somewhere in between a bug and a feature.
2107
2108       simplify (%options)
2109           Return a data structure suspiciously similar to XML::Simple's.
           Options are identical to XMLin options, see XML::Simple doc for
           more details (or use Data::Dumper or YAML to dump the data
           structure)
2113
2114           content_key
2115           forcearray
2116           keyattr
2117           noattr
2118           normalize_space
2119               aka normalise_space
2120
2121           variables (%var_hash)
2122               %var_hash is a hash { name => value }
2123
2124               This option allows variables in the XML to be expanded when the
2125               file is read. (there is no facility for putting the variable
2126               names back if you regenerate XML using XMLout).
2127
2128               A 'variable' is any text of the form ${name} (or $name) which
2129               occurs in an attribute value or in the text content of an ele‐
2130               ment. If 'name' matches a key in the supplied hashref, ${name}
2131               will be replaced with the corresponding value from the hashref.
2132               If no matching key is found, the variable will not be replaced.
2133
2134           var_att ($attribute_name)
2135               This option gives the name of an attribute that will be used to
2136               create variables in the XML:
2137
2138                 <dirs>
2139                   <dir name="prefix">/usr/local</dir>
2140                   <dir name="exec_prefix">$prefix/bin</dir>
2141                 </dirs>
2142
2143               use "var => 'name'" to get $prefix replaced by /usr/local in
2144               the generated data structure
2145
2146               By default variables are captured by the following regexp:
2147               /$(\w+)/
2148
2149           var_regexp (regexp)
2150               This option changes the regexp used to capture variables. The
2151               variable name should be in $1
2152
2153           group_tags { grouping tag => grouped tag, grouping tag 2 => grouped
2154           tag 2...}
2155               Option used to simplify the structure: elements listed will not
2156               be used.  Their children will be, they will be considered chil‐
2157               dren of the element parent.
2158
2159               If the element is:
2160
2161                 <config host="laptop.xmltwig.com">
2162                   <server>localhost</server>
2163                   <dirs>
2164                     <dir name="base">/home/mrodrigu/standards</dir>
2165                     <dir name="tools">$base/tools</dir>
2166                   </dirs>
2167                   <templates>
2168                     <template name="std_def">std_def.templ</template>
2169                     <template name="dummy">dummy</template>
2170                   </templates>
2171                 </config>
2172
2173               Then calling simplify with "group_tags => { dirs => 'dir', tem‐
2174               plates => 'template'}" makes the data structure be exactly as
2175               if the start and end tags for "dirs" and "templates" were not
2176               there.
2177
2178               A YAML dump of the structure
2179
2180                 base: '/home/mrodrigu/standards'
2181                 host: laptop.xmltwig.com
2182                 server: localhost
2183                 template:
2184                   - std_def.templ
                   - dummy
2186                 tools: '$base/tools'
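
           A minimal sketch of calling simplify and inspecting the result
           with Data::Dumper is given below. The small config document is
           invented for the example, and the exact shape of the grouped
           structure is best checked from the dump rather than assumed.

```perl
use strict;
use warnings;
use XML::Twig;
use Data::Dumper;

# Hypothetical configuration document, used only for illustration.
my $xml = '<config host="laptop">'
        .   '<server>localhost</server>'
        .   '<dirs><dir name="base">/home</dir><dir name="tools">/opt</dir></dirs>'
        . '</config>';
my $twig = XML::Twig->new->parse( $xml);

# Default options: attributes and text-only children become hash entries.
my $plain = $twig->root->simplify;
print "host: $plain->{host}, server: $plain->{server}\n";

# group_tags: the "dirs" wrapper is left out of the structure.
my $grouped = $twig->root->simplify( group_tags => { dirs => 'dir' });
print Dumper( $grouped);
```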
2187
2188       split_at        ($offset)
2189           Split a text ("PCDATA" or "CDATA") element in 2 at $offset, the
2190           original element now holds the first part of the string and a new
2191           element holds the right part. The new element is returned
2192
2193           If the element is not a text element then the first text child of
2194           the element is split
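
           For instance, splitting a text element at offset 5 could look like
           this (the document is a made-up example):

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse( '<p>hello world</p>');
my $text = $twig->root->first_child;      # the #PCDATA child of <p>

# The original element keeps the first 5 characters,
# the returned element holds the rest.
my $right = $text->split_at( 5);

my $left_text  = $text->text;             # 'hello'
my $right_text = $right->text;            # ' world'
my $nb_parts   = scalar $twig->root->children;
print "[$left_text][$right_text] in $nb_parts parts\n";
```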
2195
2196       split        ( $optional_regexp, $tag1, $atts1, $tag2, $atts2...)
2197           Split the text descendants of an element in place, the text is
2198           split using the $regexp, if the regexp includes () then the matched
2199           separators will be wrapped in elements.  $1 is wrapped in $tag1,
2200           with attributes $atts1 if $atts1 is given (as a hashref), $2 is
2201           wrapped in $tag2...
2202
           If $elt is "<p>tati tata <b>tutu tati titi</b> tata tati tata</p>"
2204
2205             $elt->split( qr/(ta)ti/, 'foo', {type => 'toto'} )
2206
2207           will change $elt to
2208
2209             <p><foo type="toto">ta</foo> tata <b>tutu <foo type="toto">ta</foo>
2210                 titi</b> tata <foo type="toto">ta</foo> tata</p>
2211
2212           The regexp can be passed either as a string or as "qr//" (perl
2213           5.005 and later), it defaults to \s+ just as the "split" built-in
2214           (but this would be quite a useless behaviour without the
2215           $optional_tag parameter)
2216
2217           $optional_tag defaults to PCDATA or CDATA, depending on the initial
2218           element type
2219
2220           The list of descendants is returned (including un-touched original
2221           elements and newly created ones)
2222
2223       mark        ( $regexp, $optional_tag, $optional_attribute_ref)
2224           This method behaves exactly as split, except only the newly created
2225           elements are returned
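
           A small sketch of mark, wrapping every captured match in a new
           element. The text and the "ref" tag are assumptions made for the
           example.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse( '<p>see RFC 2616 and RFC 7230</p>');
my $p    = $twig->root;

# Each match of the capturing group is wrapped in a <ref> element.
my @marked = $p->mark( qr/(RFC \d+)/, 'ref');

my $result = $p->sprint;
print "$result\n";
print "marked: ", join( ', ', map { $_->text } @marked), "\n";
```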
2226
2227       wrap_children ( $regexp_string, $tag, $optional_attribute_hashref)
2228           Wrap the children of the element that match the regexp in an ele‐
2229           ment $tag.  If $optional_attribute_hashref is passed then the new
2230           element will have these attributes.
2231
           The $regexp_string includes tags, within pointy brackets, as in
           "<title><para>+" and the usual Perl modifiers (+*?...).  Tags can
           be further qualified with attributes:
           "<para type="warning" classif="cosmic_secret">+". The values for
           attributes should be xml-escaped: "<candy type="M&amp;Ms">*"
           ("<", "&", ">" and """ should be escaped).
2238
2239           Note that elements might get extra "id" attributes in the process.
2240           See add_id.  Use strip_att to remove unwanted id's.
2241
2242           Here is an example:
2243
2244           If the element $elt has the following content:
2245
2246             <elt>
2247              <p>para 1</p>
2248              <l_l1_1>list 1 item 1 para 1</l_l1_1>
2249                <l_l1>list 1 item 1 para 2</l_l1>
2250              <l_l1_n>list 1 item 2 para 1 (only para)</l_l1_n>
2251              <l_l1_n>list 1 item 3 para 1</l_l1_n>
2252                <l_l1>list 1 item 3 para 2</l_l1>
2253                <l_l1>list 1 item 3 para 3</l_l1>
2254              <l_l1_1>list 2 item 1 para 1</l_l1_1>
2255                <l_l1>list 2 item 1 para 2</l_l1>
2256              <l_l1_n>list 2 item 2 para 1 (only para)</l_l1_n>
2257              <l_l1_n>list 2 item 3 para 1</l_l1_n>
2258                <l_l1>list 2 item 3 para 2</l_l1>
2259                <l_l1>list 2 item 3 para 3</l_l1>
2260             </elt>
2261
2262           Then the code
2263
2264             $elt->wrap_children( q{<l_l1_1><l_l1>*} , li => { type => "ul1" });
2265             $elt->wrap_children( q{<l_l1_n><l_l1>*} , li => { type => "ul" });
2266
2267             $elt->wrap_children( q{<li type="ul1"><li type="ul">+}, "ul");
2268             $elt->strip_att( 'id');
2269             $elt->strip_att( 'type');
2270             $elt->print;
2271
2272           will output:
2273
2274             <elt>
2275                <p>para 1</p>
2276                <ul>
2277                  <li>
2278                    <l_l1_1>list 1 item 1 para 1</l_l1_1>
2279                    <l_l1>list 1 item 1 para 2</l_l1>
2280                  </li>
2281                  <li>
2282                    <l_l1_n>list 1 item 2 para 1 (only para)</l_l1_n>
2283                  </li>
2284                  <li>
2285                    <l_l1_n>list 1 item 3 para 1</l_l1_n>
2286                    <l_l1>list 1 item 3 para 2</l_l1>
2287                    <l_l1>list 1 item 3 para 3</l_l1>
2288                  </li>
2289                </ul>
2290                <ul>
2291                  <li>
2292                    <l_l1_1>list 2 item 1 para 1</l_l1_1>
2293                    <l_l1>list 2 item 1 para 2</l_l1>
2294                  </li>
2295                  <li>
2296                    <l_l1_n>list 2 item 2 para 1 (only para)</l_l1_n>
2297                  </li>
2298                  <li>
2299                    <l_l1_n>list 2 item 3 para 1</l_l1_n>
2300                    <l_l1>list 2 item 3 para 2</l_l1>
2301                    <l_l1>list 2 item 3 para 3</l_l1>
2302                  </li>
2303                </ul>
2304             </elt>
2305
2306       subs_text ($regexp, $replace)
2307           subs_text does text substitution, similar to perl's " s///" opera‐
2308           tor.
2309
           $regexp must be a perl regexp, created with the "qr" operator.
2311
2312           $replace can include "$1, $2"... from the $regexp. It can also be
2313           used to create element and entities, by using "&elt( tag => { att
2314           => val }, text)" (similar syntax as "new") and "&ent( name)".
2315
2316           Here is a rather complex example:
2317
2318             $elt->subs_text( qr{(?<!do not )link to (http://([^\s,]*))},
2319                              'see &elt( a =>{ href => $1 }, $2)'
2320                            );
2321
           This will replace text like link to http://www.xmltwig.com by see
           <a href="http://www.xmltwig.com">www.xmltwig.com</a>, but not do
           not link to...
2325
2326           Generating entities (here replacing spaces with &nbsp;):
2327
2328             $elt->subs_text( qr{ }, '&ent( "&nbsp;")');
2329
2330           or, using a variable:
2331
2332             my $ent="&nbsp;";
2333             $elt->subs_text( qr{ }, "&ent( '$ent')");
2334
2335           Note that the substitution is always global, as in using the "g"
2336           modifier in a perl substitution, and that it is performed on all
2337           text descendants of the element.
2338
           Bug: in the $regexp, you can only use "\1", "\2"... if the
           replacement expression does not include elements or attributes. eg

             $t->subs_text( qr/((t[aiou])\2)/, '$2');             # ok, replaces toto, tata, titi, tutu by to, ta, ti, tu
             $t->subs_text( qr/((t[aiou])\2)/, '&elt(p => $1)' ); # NOK, does not find toto...
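
           The two kinds of replacement (plain text and "&elt") can be tried
           on a small document. This is only a sketch: the sample text is
           made up.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse(
    '<p>link to http://www.xmltwig.com, in colour</p>');
my $p = $twig->root;

# Plain text replacement, applied to all text descendants.
$p->subs_text( qr/colour/, 'color');

# Replacement that creates an element: $1 is the full URL, $2 the host.
$p->subs_text( qr{link to (http://([^\s,]*))},
               'see &elt( a => { href => $1 }, $2)'
             );

my $link = $p->first_child( 'a');
print $p->sprint, "\n";
```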
2344
2345       add_id ($optional_coderef)
2346           Add an id to the element.
2347
2348           The id is an attribute, "id" by default, see the "id" option for
2349           XML::Twig "new" to change it. Use an id starting with "#" to get an
2350           id that's not output by print, flush or sprint, yet that allows you
2351           to use the elt_id method to get the element easily.
2352
2353           If the element already has an id, no new id is generated.
2354
           By default the method creates an id of the form "twig_id_<nnnn>",
           where "<nnnn>" is a number, incremented each time the method is
           called successfully.
2358
2359       set_id_seed ($prefix)
           By default the id generated by "add_id" is "twig_id_<nnnn>",
           "set_id_seed" changes the prefix to $prefix and resets the number
           to 1.
2363
2364       strip_att ($att)
2365           Remove the attribute $att from all descendants of the element
2366           (including the element)
2367
2368           Return the element
2369
2370       change_att_name ($old_name, $new_name)
2371           Change the name of the attribute from $old_name to $new_name. If
2372           there is no attribute $old_name nothing happens.
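
           The attribute helpers above can be combined as in this sketch.
           The document and attribute names are invented for the example,
           and the generated id is printed rather than assumed.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse(
    '<doc><p class="a"/><p id="keep" class="b"/></doc>');
my $root = $twig->root;
my( $first, $second) = $root->children( 'p');

# add_id generates an id only when the element has none.
$first->add_id;
$second->add_id;
my $generated = $first->id;
my $kept      = $second->id;            # still 'keep'
print "generated: $generated, kept: $kept\n";

# Rename an attribute on each child, then drop all ids in the subtree.
$_->change_att_name( class => 'type') for $root->children( 'p');
$root->strip_att( 'id');

my $result = $root->sprint;             # <doc><p type="a"/><p type="b"/></doc>
print "$result\n";
```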
2373
2374       sort_children_on_value( %options)
           Sort the children of the element in place according to their text.
           All children are sorted.

           %options are

             type  : numeric |  alpha     (default: alpha)
             order : normal  |  reverse   (default: normal)

           Return the element, with its children sorted.
2386
2387       sort_children_on_att ($att, %options)
2388           Sort the children of the  element in place according to attribute
2389           $att.  %options are the same as for "sort_children_on_value"
2390
2391           Return the element.
2392
2393       sort_children_on_field ($tag, %options)
2394           Sort the children of the element in place, according to the field
2395           $tag (the text of the first child of the child with this tag).
2396           %options are the same as for "sort_children_on_value".
2397
2398           Return the element, with its children sorted
2399
2400       sort_children( $get_key, %options)
2401           Sort the children of the element in place. The $get_key argument is
2402           a reference to a function that returns the sort key when passed an
2403           element.
2404
2405           For example:
2406
2407             $elt->sort_children( sub { $_[0]->{'att'}->{"nb"} + $_[0]->text },
2408                                  type => 'numeric', order => 'reverse'
2409                                );
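
           The sorting methods can be compared side by side on a small
           list. The document below is a made-up example.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse(
    '<list><item n="10">b</item><item n="9">c</item><item n="100">a</item></list>');
my $list = $twig->root;

# Numeric sort on the "n" attribute.
$list->sort_children_on_att( 'n', type => 'numeric');
my $by_att = join ',', map { $_->att( 'n') } $list->children;   # 9,10,100

# Default (alpha, normal order) sort on the text of the children.
$list->sort_children_on_value;
my $by_text = join ',', map { $_->text } $list->children;       # a,b,c

# Custom key function, numeric and reversed.
$list->sort_children( sub { $_[0]->att( 'n') },
                      type => 'numeric', order => 'reverse'
                    );
my $custom = join ',', map { $_->att( 'n') } $list->children;   # 100,10,9

print "$by_att | $by_text | $custom\n";
```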
2410
2411       field_to_att ($cond, $att)
           Turn the text of the first sub-element matched by $cond into the
           value of attribute $att of the element. If $att is omitted then
           $cond is used as the name of the attribute, which makes sense only
           if $cond is a valid element (and attribute) name.
2416
2417           The sub-element is then cut.
2418
2419       att_to_field ($att, $tag)
           Take the value of attribute $att and create a sub-element $tag as
           first child of the element. If $tag is omitted then $att is used as
           the name of the sub-element.
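
           A brief sketch of both conversions, on an invented "book"
           element:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse( '<book isbn="123"><title>Perl</title></book>');
my $book = $twig->root;

# The text of <title> becomes the "title" attribute, <title> is cut.
$book->field_to_att( 'title');
my $title_att = $book->att( 'title');             # Perl

# The value of "isbn" is used to create an <isbn> sub-element.
$book->att_to_field( 'isbn');
my $isbn_text = $book->first_child_text( 'isbn'); # 123

print $book->sprint, "\n";
```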
2423
2424       get_xpath  ($xpath, $optional_offset)
2425           Return a list of elements satisfying the $xpath. $xpath is an
2426           XPATH-like expression.
2427
2428           A subset of the XPATH abbreviated syntax is covered:
2429
2430             tag
2431             tag[1] (or any other positive number)
2432             tag[last()]
2433             tag[@att] (the attribute exists for the element)
2434             tag[@att="val"]
2435             tag[@att=~ /regexp/]
             tag[@att1="val1" and @att2="val2"]
             tag[@att1="val1" or @att2="val2"]
             tag[string()="toto"] (returns tag elements whose text (as per the text
                                  method) is toto)
             tag[string()=~/regexp/] (returns tag elements whose text (as per the text
                                     method) matches regexp)
2442             expressions can start with / (search starts at the document root)
2443             expressions can start with . (search starts at the current element)
2444             // can be used to get all descendants instead of just direct children
2445             * matches any tag
2446
           So the following examples from the XPath recommendation
           <http://www.w3.org/TR/xpath.html#path-abbrev> work:
2449
2450             para selects the para element children of the context node
2451             * selects all element children of the context node
2452             para[1] selects the first para child of the context node
2453             para[last()] selects the last para child of the context node
2454             */para selects all para grandchildren of the context node
2455             /doc/chapter[5]/section[2] selects the second section of the fifth chapter
2456                of the doc
2457             chapter//para selects the para element descendants of the chapter element
2458                children of the context node
2459             //para selects all the para descendants of the document root and thus selects
2460                all para elements in the same document as the context node
2461             //olist/item selects all the item elements in the same document as the
2462                context node that have an olist parent
2463             .//para selects the para element descendants of the context node
2464             .. selects the parent of the context node
2465             para[@type="warning"] selects all para children of the context node that have
2466                a type attribute with value warning
2467             employee[@secretary and @assistant] selects all the employee children of the
2468                context node that have both a secretary attribute and an assistant
2469                attribute
2470
2471           The elements will be returned in the document order.
2472
2473           If $optional_offset is used then only one element will be returned,
2474           the one with the appropriate offset in the list, starting at 0
2475
2476           Quoting and interpolating variables can be a pain when the Perl
2477           syntax and the XPATH syntax collide, so use alternate quoting mech‐
2478           anisms like q or qq (I like q{} and qq{} myself).
2479
2480           Here are some more examples to get you started:
2481
2482             my $p1= "p1";
2483             my $p2= "p2";
2484             my @res= $t->get_xpath( qq{p[string( "$p1") or string( "$p2")]});
2485
             my $a= "a1";
             my @res= $t->get_xpath( qq{//*[\@att="$a"]}); # \@ so that @att is not interpolated
2488
2489             my $val= "a1";
2490             my $exp= qq{//p[ \@att='$val']}; # you need to use \@ or you will get a warning
2491             my @res= $t->get_xpath( $exp);
2492
           Note that the only supported regexp delimiter is / and that you
           must backslash all / in regexps AND in regular strings.
2495
2496           XML::Twig does not provide natively full XPATH support, but you can
2497           use "XML::Twig::XPath" to get "findnodes" to use "XML::XPath" as
2498           the XPath engine, with full coverage of the spec.
2499
2503       find_nodes
           same as "get_xpath"
2505
2506       findnodes
2507           same as "get_xpath"
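
           A few of the expressions listed above, run against a small
           document, as a sketch. The document is invented, and single
           quotes are used so that Perl does not interpolate the "@".

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse(
      '<doc><para type="warning">w1</para><para>p2</para>'
    . '<sec><para type="warning">w3</para></sec></doc>');
my $root = $twig->root;

# All para elements in the document with type="warning", in document order.
my @warnings = $twig->get_xpath( '//para[@type="warning"]');
my $warning_text = join ',', map { $_->text } @warnings;         # w1,w3

# Only the para children of the root.
my @direct = $root->get_xpath( 'para');
my $direct_text = join ',', map { $_->text } @direct;            # w1,p2

# With an offset a single element is returned instead of a list.
my $last = $root->get_xpath( 'para[last()]', 0);
my $last_text = $last->text;                                     # p2

# All para descendants of the current element.
my $nb_all = scalar( my @all = $root->get_xpath( './/para'));    # 3

print "$warning_text | $direct_text | $last_text | $nb_all\n";
```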
2508
2509       text @optional_options
2510           Return a string consisting of all the "PCDATA" and "CDATA" in an
2511           element, without any tags. The text is not XML-escaped: base enti‐
2512           ties such as "&" and "<" are not escaped.
2513
2514           The '"no_recurse"' option will only return the text of the element,
2515           not of any included sub-elements (same as "text_only").
2516
2517       text_only
2518           Same as "text" except that the text returned doesn't include the
2519           text of sub-elements.
2520
2521       trimmed_text
2522           Same as "text" except that the text is trimmed: leading and trail‐
2523           ing spaces are discarded, consecutive spaces are collapsed
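
           The three text methods differ as sketched below, on a made-up
           paragraph with extra spaces and one sub-element.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse( '<p>  Hello <b>big</b>   world </p>');
my $p    = $twig->root;

my $all     = $p->text;          # '  Hello big   world '
my $own     = $p->text_only;     # text of <p> itself, without the text of <b>
my $trimmed = $p->trimmed_text;  # 'Hello big world'

print "[$all]\n[$own]\n[$trimmed]\n";
```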
2524
2525       set_text        ($string)
2526           Set the text for the element: if the element is a "PCDATA", just
2527           set its text, otherwise cut all the children of the element and
2528           create a single "PCDATA" child for it, which holds the text.
2529
2530       merge ($elt2)
2531           Move the content of $elt2 within the element
2532
2533       insert         ($tag1, [$optional_atts1], $tag2, [$optional_atts2],...)
           For each tag in the list, insert an element $tag as the only child
           of the element.  The element gets the optional attributes in
           "$optional_atts<n>".  All children of the element are set as
           children of the new element.  The upper level element is returned.

             $p->insert( table => { border=> 1}, 'tr', 'td')

           puts the content of $p in a table with a visible border, a single
           "tr" and a single "td", and returns the "table" element:
2543
2544             <p><table border="1"><tr><td>original content of p</td></tr></table></p>
2545
2546       wrap_in        (@tag)
2547           Wrap elements in @tag as the successive ancestors of the element,
2548           returns the new element.  "$elt->wrap_in( 'td', 'tr', 'table')"
2549           wraps the element as a single cell in a table for example.
2550
           Optionally each tag can be followed by a hashref of attributes, that
           will be set on the wrapping element:

             $elt->wrap_in( p => { class => "advisory" }, div => { class => "intro", id => "div_intro" });
2555
2556       insert_new_elt ($opt_position, $tag, $opt_atts_hashref, @opt_content)
           Combines a "new" and a "paste": creates a new element using $tag,
           $opt_atts_hashref and @opt_content which are arguments similar to
           those for "new", then pastes it, using $opt_position or
           'first_child', relative to $elt.
2561
2562           Return the newly created element
2563
2564       erase
2565           Erase the element: the element is deleted and all of its children
2566           are pasted in its place.
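
           The wrapping and unwrapping methods can be chained as in this
           sketch. The document, the "caption" element and its attribute are
           assumptions made for the example.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse( '<doc><p>content</p></doc>');
my $root = $twig->root;
my $p    = $root->first_child( 'p');

# Wrap <p> in a td, itself in a tr, itself in a table.
$p->wrap_in( 'td', 'tr', 'table');
my $wrapped = $root->sprint;
# <doc><table><tr><td><p>content</p></td></tr></table></doc>

# Create and paste a new element in one call.
my $table = $root->first_child( 'table');
$table->insert_new_elt( first_child => 'caption', { class => 'c' }, 'A table');

# erase removes <p> but leaves its content in place.
$p->erase;
my $final = $root->sprint;
# <doc><table><caption class="c">A table</caption><tr><td>content</td></tr></table></doc>

print "$wrapped\n$final\n";
```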
2567
2568       set_content    ( $optional_atts, @list_of_elt_and_strings) (
2569       $optional_atts, '#EMPTY')
2570           Set the content for the element, from a list of strings and ele‐
2571           ments.  Cuts all the element children, then pastes the list ele‐
2572           ments as the children.  This method will create a "PCDATA" element
2573           for any strings in the list.
2574
2575           The $optional_atts argument is the ref of a hash of attributes. If
2576           this argument is used then the previous attributes are deleted,
2577           otherwise they are left untouched.
2578
           WARNING: if you rely on ID's then you will have to set the id
           yourself. At this point the element does not belong to a twig yet,
           so the ID attribute is not known and it won't be stored in the ID
           list.

           A content of '"#EMPTY"' creates an empty element.
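
           The different argument forms of set_content can be sketched as
           follows, on a stand-alone element created for the example.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical stand-alone element, used only for illustration.
my $p = XML::Twig::Elt->new( p => { class => 'old' }, 'old text');

# Strings become PCDATA children, elements are pasted as they are.
# Without an attribute hashref the existing attributes are kept.
$p->set_content( 'some ', XML::Twig::Elt->new( b => 'bold'), ' text');
my $mixed = $p->sprint;            # <p class="old">some <b>bold</b> text</p>

# With an attribute hashref the previous attributes are replaced.
$p->set_content( { class => 'new' }, 'plain');
my $class = $p->att( 'class');     # new
my $text  = $p->text;              # plain

# '#EMPTY' turns the element into an empty element.
$p->set_content( '#EMPTY');
print "$mixed\n$class $text\n", $p->sprint, "\n";
```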
2585
2586       namespace ($optional_prefix)
2587           Return the URI of the namespace that $optional_prefix or the ele‐
2588           ment name belongs to. If the name doesn't belong to any namespace,
2589           "undef" is returned.
2590
2591       local_name
2592           Return the local name (without the prefix) for the element
2593
2594       ns_prefix
2595           Return the namespace prefix for the element
2596
2597       current_ns_prefixes
           Return a list of namespace prefixes valid for the element. The
           order of the prefixes in the list has no meaning. If the default
           namespace is currently bound, '' appears in the list.
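
           A sketch of the namespace accessors on a prefixed element. The
           document is invented, and the twig is created without any
           namespace-related option, so tags keep their prefix.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse(
    '<doc xmlns:svg="http://www.w3.org/2000/svg"><svg:rect/></doc>');
my $rect = $twig->root->first_child;

my $tag    = $rect->tag;          # svg:rect
my $local  = $rect->local_name;   # rect
my $prefix = $rect->ns_prefix;    # svg
my $uri    = $rect->namespace;    # http://www.w3.org/2000/svg

print "$tag: local name $local, prefix $prefix, namespace $uri\n";
print "prefixes in scope: ", join( ' ', $rect->current_ns_prefixes), "\n";
```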
2601
2602       inherit_att  ($att, @optional_tag_list)
2603           Return the value of an attribute inherited from parent tags. The
2604           value returned is found by looking for the attribute in the element
2605           then in turn in each of its ancestors. If the @optional_tag_list is
2606           supplied only those ancestors whose tag is in the list will be
2607           checked.
2608
2609       all_children_are ($optional_condition)
           Return 1 if all children of the element pass the
           $optional_condition, 0 otherwise
2612
2613       level       ($optional_condition)
2614           Return the depth of the element in the twig (root is 0).  If
2615           $optional_condition is given then only ancestors that match the
2616           condition are counted.
2617
2618           WARNING: in a tree created using the "twig_roots" option this will
2619           not return the level in the document tree, level 0 will be the doc‐
2620           ument root, level 1 will be the "twig_roots" elements. During the
2621           parsing (in a "twig_handler") you can use the "depth" method on the
2622           twig object to get the real parsing depth.
2623
2624       in           ($potential_parent)
2625           Return true if the element is in the potential_parent ($poten‐
2626           tial_parent is an element)
2627
2628       in_context   ($cond, $optional_level)
2629           Return true if the element is included in an element which passes
2630           $cond optionally within $optional_level levels. The returned value
2631           is the including element.
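
           The ancestor-related methods (inherit_att, level, in and
           in_context) can be illustrated on a small document. This is a
           sketch: tags and attribute values are made up.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse(
      '<doc lang="en"><sec><p>one</p></sec>'
    . '<sec lang="fr"><p>two</p></sec></doc>');
my $root = $twig->root;
my( $p1, $p2) = $root->descendants( 'p');

# The attribute is looked up on the element, then on each ancestor.
my $lang1 = $p1->inherit_att( 'lang');           # en (from doc)
my $lang2 = $p2->inherit_att( 'lang');           # fr (from the second sec)
my $lang3 = $p2->inherit_att( 'lang', 'doc');    # en (only doc is checked)

# Depth in the tree, optionally counting only matching ancestors.
my $depth     = $p1->level;                      # 2
my $sec_depth = $p1->level( 'sec');              # 1

# Ancestor tests.
my $in_root  = $p1->in( $root) ? 1 : 0;          # 1
my $sec      = $p2->in_context( 'sec');          # the enclosing sec element
my $sec_lang = $sec->att( 'lang');               # fr

print "$lang1 $lang2 $lang3 $depth $sec_depth $in_root $sec_lang\n";
```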
2632
2633       pcdata
2634           Return the text of a "PCDATA" element or "undef" if the element is
2635           not "PCDATA".
2636
2637       pcdata_xml_string
2638           Return the text of a PCDATA element or undef if the element is not
2639           PCDATA.  The text is "XML-escaped" ('&' and '<' are replaced by
2640           '&amp;' and '&lt;')
2641
2642       set_pcdata     ($text)
2643           Set the text of a "PCDATA" element.
2644
2645       append_pcdata  ($text)
2646           Add the text at the end of a "PCDATA" element.
2647
2648       is_cdata
2649           Return 1 if the element is a "CDATA" element, returns 0 otherwise.
2650
2651       is_text
2652           Return 1 if the element is a "CDATA" or "PCDATA" element, returns 0
2653           otherwise.
2654
2655       cdata
2656           Return the text of a "CDATA" element or "undef" if the element is
2657           not "CDATA".
2658
2659       cdata_string
2660           Return the XML string of a "CDATA" element, including the opening
2661           and closing markers.
2662
2663       set_cdata     ($text)
2664           Set the text of a "CDATA" element.
2665
2666       append_cdata  ($text)
2667           Add the text at the end of a "CDATA" element.
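
           The PCDATA and CDATA accessors can be compared on a document
           holding one text node of each kind. The document is an invented
           example.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document, used only for illustration.
my $twig = XML::Twig->new->parse(
    '<doc><t>a &amp; b</t><c><![CDATA[x < y]]></c></doc>');
my $pcdata = $twig->root->first_child( 't')->first_child;
my $cdata  = $twig->root->first_child( 'c')->first_child;

my $raw     = $pcdata->pcdata;             # a & b
my $escaped = $pcdata->pcdata_xml_string;  # a &amp; b

my $is_cdata  = $cdata->is_cdata;          # true
my $cdata_raw = $cdata->cdata;             # x < y
my $cdata_xml = $cdata->cdata_string;      # <![CDATA[x < y]]>

# Both kinds of element count as text.
my $both_text = ( $pcdata->is_text && $cdata->is_text) ? 1 : 0;

$pcdata->append_pcdata( ' & c');
my $appended = $pcdata->pcdata;            # a & b & c

print "$raw | $escaped | $cdata_raw | $cdata_xml | $appended\n";
```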
2668
2669       remove_cdata
2670           Turns all "CDATA" sections in the element into regular "PCDATA"
2671           elements. This is useful when converting XML to HTML, as browsers
2672           do not support CDATA sections.
2673
2674       extra_data
2675           Return the extra_data (comments and PI's) attached to an element
2676
2677       set_extra_data     ($extra_data)
2678           Set the extra_data (comments and PI's) attached to an element
2679
2680       append_extra_data  ($extra_data)
2681           Append extra_data to the existing extra_data before the element (if
2682           no previous extra_data exists then it is created)
2683
2684       set_asis
2685           Set a property of the element that causes it to be output without
2686           being XML escaped by the print functions: if it contains "a < b" it
2687           will be output as such and not as "a &lt; b". This can be useful to
           create text elements that will be output as markup. Note that all
           "PCDATA" descendants of the element are also marked as having the
           property (they are the ones that are actually impacted by the
           change).
2692
2693           If the element is a "CDATA" element it will also be output asis,
2694           without the "CDATA" markers. The same goes for any "CDATA" descen‐
2695           dant of the element
2696
2697       set_not_asis
2698           Unsets the "asis" property for the element and its text descen‐
2699           dants.
2700
2701       is_asis
2702           Return the "asis" property status of the element ( 1 or "undef")
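
           The effect of the "asis" property on output can be sketched with
           a stand-alone element created for the example.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical stand-alone element, used only for illustration.
my $p = XML::Twig::Elt->new( p => 'a < b');

my $escaped = $p->sprint;            # <p>a &lt; b</p>

# With the asis property the text is output without XML escaping.
$p->set_asis;
my $flag = $p->is_asis;              # 1
my $raw  = $p->sprint;               # <p>a < b</p>

# Removing the property restores normal escaping.
$p->set_not_asis;
my $restored = $p->sprint;           # <p>a &lt; b</p>

print "$escaped\n$raw\n$restored\n";
```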
2703
2704       closed
           Return true if the element has been closed. Might be useful if you
2706           are somewhere in the tree, during the parse, and have no idea
2707           whether a parent element is completely loaded or not.
2708
2709       get_type
2710           Return the type of the element: '"#ELT"' for "real" elements, or
2711           '"#PCDATA"', '"#CDATA"', '"#COMMENT"', '"#ENT"', '"#PI"'
2712
2713       is_elt
2714           Return the tag if the element is a "real" element, or 0 if it is
2715           "PCDATA", "CDATA"...
2716
2717       contains_only_text
2718           Return 1 if the element does not contain any other "real" element
2719
2720       contains_only ($exp)
2721           Return the list of children if all children of the element match
2722           the expression $exp
2723
2724             if( $para->contains_only( 'tt')) { ... }
2725
2726       contains_a_single ($exp)
2727           If the element contains a single child that matches the expression
2728           $exp returns that element. Otherwise returns 0.
2729
2730       is_field
2731           same as "contains_only_text"
2732
2733       is_pcdata
2734           Return 1 if the element is a "PCDATA" element, returns 0 otherwise.
2735
2736       is_ent
2737           Return 1 if the element is an entity (an unexpanded entity) ele‐
2738           ment, return 0 otherwise.
2739
2740       is_empty
2741           Return 1 if the element is empty, 0 otherwise
2742
2743       set_empty
           Flags the element as empty. No further check is made, so if the
           element is actually not empty the output will be messed up. The
           only effect of this method is that the output will be "<tag
           att="value"/>".
2748
2749       set_not_empty
           Flags the element as not empty. If it is actually empty then the
           element will be output as "<tag att="value"></tag>".
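
           As an illustration of the empty flag, the sketch below prints an
           empty element before and after "set_not_empty". The sample
           document and variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><e att="value"/></doc>');

my $e = $twig->root->first_child('e');

my $was_empty = $e->is_empty;   # true: parsed from an empty tag
my $short     = $e->sprint;     # empty-tag form

$e->set_not_empty;              # only changes how the element is output
my $long = $e->sprint;          # start tag followed by end tag

print "$short\n$long\n";
```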
2752
2753       is_pi
2754           Return 1 if the element is a processing instruction ("#PI") ele‐
2755           ment, return 0 otherwise.
2756
2757       target
2758           Return the target of a processing instruction
2759
2760       set_target ($target)
2761           Set the target of a processing instruction
2762
2763       data
2764           Return the data part of a processing instruction
2765
2766       set_data ($data)
2767           Set the data of a processing instruction
2768
2769       set_pi ($target, $data)
2770           Set the target and data of a processing instruction
2771
2772       pi_string
2773           Return the string form of a processing instruction ("<?target
2774           data?>")
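
           The processing instruction methods only apply to "#PI" elements.
           The sketch below assumes the twig is created with the "pi =>
           'process'" option of "XML::Twig->new" (described with the other
           constructor options), so that processing instructions are kept
           as elements in the tree. The document is a made-up example.

```perl
use strict;
use warnings;
use XML::Twig;

# pi => 'process' turns processing instructions into #PI elements
my $twig = XML::Twig->new( pi => 'process' );
$twig->parse('<doc><?app do this?><p>text</p></doc>');

# pick the first processing instruction among the root's children
my ($pi) = grep { $_->is_pi } $twig->root->children;

my $target = $pi->target;
my $data   = $pi->data;
print "target: $target, data: $data\n";

# change the data, then print the instruction in its <?target data?> form
$pi->set_data('do that');
print $pi->pi_string, "\n";
```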
2775
2776       is_comment
2777           Return 1 if the element is a comment ("#COMMENT") element, return 0
2778           otherwise.
2779
2780       set_comment ($comment_text)
2781           Set the text for a comment
2782
2783       comment
2784           Return the content of a comment (just the text, not the "<!--" and
2785           "-->")
2786
2787       comment_string
2788           Return the XML string for a comment ("<!-- comment -->")
2789
2790       set_ent ($entity)
           Set a (non-expanded) entity ("#ENT"). $entity is the entity text
           ("&ent;")
2793
2794       ent Return the entity for an entity ("#ENT") element ("&ent;")
2795
2796       ent_name
2797           Return the entity name for an entity ("#ENT") element ("ent")
2798
2799       ent_string
2800           Return the entity, either expanded if the expanded version is
2801           available, or non-expanded ("&ent;") otherwise
2802
2803       child ($offset, $optional_condition)
2804           Return the $offset-th child of the element, optionally the $off‐
2805           set-th child that matches $optional_condition. The children are
2806           treated as a list, so "$elt->child( 0)" is the first child, while
2807           "$elt->child( -1)" is the last child.
2808
2809       child_text ($offset, $optional_condition)
           Return the text of a child or "undef" if the child does not
           exist. Arguments are the same as "child".
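
           The offsets behave like Perl list indexes. A short sketch, using
           a made-up document, of "child" and "child_text" with and without
           a condition:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse(
  '<list><item>one</item><item>two</item><note>n</note><item>three</item></list>'
);
my $list = $twig->root;

my $first      = $list->child(0)->text;            # first child
my $last       = $list->child(-1)->text;           # last child
my $third_item = $list->child_text( 2, 'item');    # third child that is an item

print "$first $last $third_item\n";
```

           Note that with a condition the offset counts only the matching
           children, so the "note" element is skipped in the last call.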
2812
2813       last_child    ($optional_condition)
2814           Return the last child of the element, or the last child matching
2815           $optional_condition (ie the last of the element children matching
2816           the condition).
2817
2818       last_child_text   ($optional_condition)
2819           Same as "first_child_text" but for the last child.
2820
2821       sibling  ($offset, $optional_condition)
           Return the next or previous $offset-th sibling of the element, or
           the $offset-th one matching $optional_condition. If $offset is
           negative then a previous sibling is returned, if $offset is
           positive then a next sibling is returned. "$offset=0" returns the
           element if there is no condition or if the element matches the
           condition, "undef" otherwise.
2828
2829       sibling_text ($offset, $optional_condition)
2830           Return the text of a sibling or "undef" if the sibling does not
2831           exist.  Arguments are the same as "sibling".
2832
2833       prev_siblings ($optional_condition)
           Return the list of previous siblings (optionally matching
2835           $optional_condition) for the element. The elements are ordered in
2836           document order.
2837
2838       next_siblings ($optional_condition)
           Return the list of siblings (optionally matching
           $optional_condition) following the element. The elements are
           ordered in document order.
2842
2843       pos ($optional_condition)
2844           Return the position of the element in the children list. The first
2845           child has a position of 1 (as in XPath).
2846
2847           If the $optional_condition is given then only siblings that match
2848           the condition are counted. If the element itself does not match the
2849           condition then 0 is returned.
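
           The sibling and position methods can be compared on a small
           example. This is only a sketch: the document and variable names
           are invented, and the element used is the second "item" of the
           list.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse(
  '<list><item>one</item><note>n</note><item>two</item><item>three</item></list>'
);

# the second item element, which is the third child of the list
my $two = ( $twig->root->children('item') )[1];

my $pos           = $two->pos;                      # counts all siblings
my $pos_in_items  = $two->pos('item');              # counts only items
my $prev          = $two->sibling(-1)->text;        # previous sibling
my $prev_item     = $two->sibling( -1, 'item')->text;
my $next_text     = $two->sibling_text(1);          # next sibling's text
my @before        = map { $_->text } $two->prev_siblings;   # document order

print "$pos $pos_in_items $prev $prev_item $next_text @before\n";
```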
2850
2851       atts
2852           Return a hash ref containing the element attributes
2853
2854       set_atts      ({att1=>$att1_val, att2=> $att2_val... })
2855           Set the element attributes with the hash ref supplied as the argu‐
2856           ment
2857
2858       del_atts
2859           Deletes all the element attributes.
2860
2861       att_nb
2862           Return the number of attributes for the element
2863
2864       has_atts
           Return true if the element has attributes (in fact return the
           number of attributes, thus being an alias to "att_nb")
2867
2868       has_no_atts
2869           Return true if the element has no attributes, false (0) otherwise
2870
2871       att_names
           Return a list of the attribute names for the element
2873
2874       att_xml_string ($att, $optional_quote)
2875           Return the attribute value, where '&', '<' and $quote (" by
2876           default) are XML-escaped
2877
           If $optional_quote is passed then it is used as the quote.
2879
2880       set_id       ($id)
           Set the "id" attribute of the element to the value.  See "elt_id"
           to change the id attribute name
2883
2884       id  Gets the id attribute value
2885
2886       del_id       ($id)
2887           Deletes the "id" attribute of the element and remove it from the id
2888           list for the document
2889
2890       class
2891           Return the "class" attribute for the element (methods on the
2892           "class" attribute are quite convenient when dealing with XHTML, or
2893           plain XML that will eventually be displayed using CSS)
2894
2895       set_class ($class)
2896           Set the "class" attribute for the element to $class
2897
2898       add_to_class ($class)
2899           Add $class to the element "class" attribute: the new class is added
2900           only if it is not already present. Note that classes are sorted
2901           alphabetically, so the "class" attribute can be changed even if the
2902           class is already there
2903
2904       att_to_class ($att)
2905           Set the "class" attribute to the value of attribute $att
2906
2907       add_att_to_class ($att)
2908           Add the value of attribute $att to the "class" attribute of the
2909           element
2910
2911       move_att_to_class ($att)
2912           Add the value of attribute $att to the "class" attribute of the
2913           element and delete the attribute
2914
2915       tag_to_class
2916           Set the "class" attribute of the element to the element tag
2917
2918       add_tag_to_class
2919           Add the element tag to its "class" attribute
2920
2921       set_tag_class ($new_tag)
2922           Add the element tag to its "class" attribute and sets the tag to
2923           $new_tag
2924
2925       in_class ($class)
2926           Return true (1) if the element is in the class $class (if $class is
2927           one of the tokens in the element "class" attribute)
2928
2929       tag_to_span
           Change the element tag to "span" and set its class to the old tag

       tag_to_div
           Change the element tag to "div" and set its class to the old tag
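
           A brief sketch of the "class" helpers, on a made-up element. It
           shows that "add_to_class" keeps the class list sorted and that
           "tag_to_span" replaces the class with the old tag.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><note class="b">text</note></doc>');
my $note = $twig->root->first_child('note');

# classes are sorted alphabetically, so "a" ends up before "b"
$note->add_to_class('a');
my $classes = $note->class;

my $has_a = $note->in_class('a') ? 1 : 0;
my $has_c = $note->in_class('c') ? 1 : 0;

# the tag becomes span and the class is set to the old tag
$note->tag_to_span;
print $note->sprint, "\n";
```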
2934
2935       DESTROY
2936           Frees the element from memory.
2937
2938       start_tag
2939           Return the string for the start tag for the element, including the
2940           "/>" at the end of an empty element tag
2941
2942       end_tag
2943           Return the string for the end tag of an element.  For an empty ele‐
2944           ment, this returns the empty string ('').
2945
2946       xml_string @optional_options
2947           Equivalent to "$elt->sprint( 1)", returns the string for the entire
2948           element, excluding the element's tags (but nested element tags are
2949           present)
2950
2951           The '"no_recurse"' option will only return the text of the element,
2952           not of any included sub-elements (same as "xml_text_only").
2953
2954       inner_xml
2955           Another synonym for xml_string
2956
2957       outer_xml
           Another synonym for sprint
2959
2960       xml_text
           Return the text of the element, encoded (and processed by the
           current "output_filter" or "output_encoding" options), without any
           tag.
2963
2964       xml_text_only
2965           Same as "xml_text" except that the text returned doesn't include
2966           the text of sub-elements.
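
           The differences between the string methods are easiest to see
           side by side. The sketch below uses a made-up paragraph; "sprint"
           and "text" are XML::Twig::Elt methods documented elsewhere in
           this manual.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><p>a &amp; <b>bold</b> text</p></doc>');
my $p = $twig->root->first_child('p');

my $outer    = $p->sprint;       # element with its own tags
my $inner    = $p->xml_string;   # content only, nested tags kept
my $xml_text = $p->xml_text;     # no tags, text still XML-escaped
my $text     = $p->text;         # no tags, plain text

print join( "\n", $outer, $inner, $xml_text, $text), "\n";
```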
2967
2968       set_pretty_print ($style)
2969           Set the pretty print method, amongst '"none"' (default),
2970           '"nsgmls"', '"nice"', '"indented"', '"record"' and '"record_c"'
2971
2972           pretty_print styles:
2973
2974           none
2975               the default, no "\n" is used
2976
2977           nsgmls
2978               nsgmls style, with "\n" added within tags
2979
2980           nice
2981               adds "\n" wherever possible (NOT SAFE, can lead to invalid XML)
2982
2983           indented
2984               same as "nice" plus indents elements (NOT SAFE, can lead to
2985               invalid XML)
2986
2987           record
2988               table-oriented pretty print, one field per line
2989
2990           record_c
2991               table-oriented pretty print, more compact than "record", one
2992               record per line
2993
2994       set_empty_tag_style ($style)
2995           Set the method to output empty tags, amongst '"normal"' (default),
2996           '"html"', and '"expand"',
2997
2998           "normal" outputs an empty tag '"<tag/>"', "html" adds a space
2999           '"<tag />"' for elements that can be empty in XHTML and "expand"
3000           outputs '"<tag></tag>"'
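
           The three empty tag styles can be compared with a short loop.
           This is a sketch on a made-up document. Since the style is a
           module-wide setting, the example resets it to "normal" at the
           end.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><br/></doc>');
my $root = $twig->root;

my %out;
foreach my $style (qw( normal html expand)) {
    $root->set_empty_tag_style( $style);
    $out{$style} = $root->sprint;
    print "$style: $out{$style}\n";
}

# the setting is global, so restore the default
$root->set_empty_tag_style('normal');
```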
3001
3002       set_remove_cdata  ($flag)
           Set (or unset) the flag that forces the twig to output CDATA
           sections as regular (escaped) PCDATA
3005
3006       set_indent ($string)
3007           Set the indentation for the indented pretty print style (default is
3008           2 spaces)
3009
3010       set_quote ($quote)
           Set the quotes used for attributes. Can be '"double"' (default) or
           '"single"'
3013
3014       cmp       ($elt)
           Compare the order of the 2 elements in a twig.

           $a is the <A>...</A> element, $b is the <B>...</B> element

             document                        $a->cmp( $b)
             <A> ... </A> ... <B>  ... </B>     -1
             <A> ... <B>  ... </B> ... </A>     -1
             <B> ... </B> ... <A>  ... </A>      1
             <B> ... <A>  ... </A> ... </B>      1
              $a == $b                           0
              $a and $b not in the same tree   undef
3026
3027       before       ($elt)
           Return 1 if the element starts before $elt, 0 otherwise. If the 2
           elements are not in the same twig then return "undef". With $a
           the element and $b the argument, this is equivalent to:

               if( $a->cmp( $b) == -1) { return 1; } else { return 0; }
3032
3033       after       ($elt)
           Return 1 if the element starts after $elt, 0 otherwise. If the 2
           elements are not in the same twig then return "undef". With $a
           the element and $b the argument, this is equivalent to:

               if( $a->cmp( $b) == 1) { return 1; } else { return 0; }
3038
3039       other comparison methods
3040           lt
3041           le
3042           gt
3043           ge
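
           The comparison methods can be tried on a small tree. This is a
           sketch with invented element names. The variables are called
           $elt_a, $elt_b and $elt_c so that they do not clash with the $a
           and $b used by Perl's "sort".

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><A><C/></A><B/></doc>');

my $root  = $twig->root;
my $elt_a = $root->first_child('A');
my $elt_b = $root->first_child('B');
my $elt_c = $elt_a->first_child('C');

my $a_vs_b = $elt_a->cmp( $elt_b);   # A ends before B starts
my $b_vs_a = $elt_b->cmp( $elt_a);
my $a_vs_c = $elt_a->cmp( $elt_c);   # A contains C but starts first
my $a_vs_a = $elt_a->cmp( $elt_a);

# cmp can be used directly as a sort function to get document order
my @sorted = map { $_->tag } sort { $a->cmp( $b) } ( $elt_b, $elt_c, $elt_a);

print "$a_vs_b $b_vs_a $a_vs_c $a_vs_a @sorted\n";
```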
3044       path
3045           Return the element context in a form similar to XPath's short form:
3046           '"/root/tag1/../tag"'
3047
3048       xpath
3049           Return a unique XPath expression that can be used to find the ele‐
3050           ment again.
3051
3052           It looks like "/doc/sect[3]/title": unique elements do not have an
3053           index, the others do.
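
           To illustrate the difference between "path" and "xpath", the
           sketch below calls both on the title of the second section of a
           made-up document.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse(
  '<doc><sect><title>one</title></sect><sect><title>two</title></sect></doc>'
);

# the title of the second sect element
my $title = ( $twig->root->children('sect') )[1]->first_child('title');

my $path  = $title->path;    # context only, no index
my $xpath = $title->xpath;   # index added where the tag is not unique

print "$path\n$xpath\n";
```

           Only "sect" gets an index here: "doc" and "title" are the only
           elements with that tag among their siblings.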
3054
3055       private methods
3056           Low-level methods on the twig:
3057
3058           set_parent        ($parent)
3059           set_first_child   ($first_child)
3060           set_last_child    ($last_child)
3061           set_prev_sibling  ($prev_sibling)
3062           set_next_sibling  ($next_sibling)
3063           set_twig_current
3064           del_twig_current
3065           twig_current
3066           flush
3067               This method should NOT be used, always flush the twig, not an
3068               element.
3069
3070           contains_text
3071
3072           Those methods should not be used, unless of course you find some
3073           creative and interesting, not to mention useful, ways to do it.
3074
3075       cond
3076
3077       Most of the navigation functions accept a condition as an optional
3078       argument The first element (or all elements for "children " or "ances‐
3079       tors ") that passes the condition is returned.
3080
3081       The condition is a single step of an XPath expression using the XPath
3082       subset defined by "get_xpath". Additional conditions are:
3083
3084       The condition can be
3085
3086       #ELT
3087           return a "real" element (not a PCDATA, CDATA, comment or pi ele‐
3088           ment)
3089
3090       #TEXT
3091           return a PCDATA or CDATA element
3092
3093       regular expression
3094           return an element whose tag matches the regexp. The regexp has to
3095           be created with "qr//" (hence this is available only on perl 5.005
3096           and above)
3097
3098       code reference
3099           applies the code, passing the current element as argument, if the
3100           code returns true then the element is returned, if it returns false
3101           then the code is applied to the next candidate.
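
       The additional condition types can be sketched as follows, using
       "first_child" and "children" on a made-up document with mixed
       content. The variable names are illustrative only.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc>intro<para>a</para><pre>b</pre><note>c</note></doc>');
my $root = $twig->root;

# #ELT skips the leading text and returns the first "real" element
my $first_elt = $root->first_child('#ELT')->tag;

# #TEXT returns the first PCDATA (or CDATA) child
my $first_text = $root->first_child('#TEXT')->text;

# a qr// regular expression is matched against the tag
my @p_tags = map { $_->tag } $root->children( qr/^p/);

# a code reference receives each candidate element as its argument
my $by_code = $root->first_child( sub { $_[0]->text eq 'c' })->tag;

print "$first_elt $first_text @p_tags $by_code\n";
```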
3102
3103       XML::Twig::XPath
3104
3105       XML::Twig implements a subset of XPath through the "get_xpath" method.
3106
3107       If you want to use the whole XPath power, then you can use
3108       "XML::Twig::XPath" instead. In this case "XML::Twig" uses "XML::XPath"
3109       to execute XPath queries.  You will of course need "XML::XPath"
3110       installed to be able to use "XML::Twig::XPath".
3111
3112       See XML::XPath for more information.
3113
3114       The methods you can use are:
3115
3116       findnodes              ($path)
3117           return a list of nodes found by $path.
3118
3119       findnodes_as_string    ($path)
3120           return the nodes found reproduced as XML. The result is not guaran‐
3121           teed to be valid XML though.
3122
3123       findvalue              ($path)
3124           return the concatenation of the text content of the result nodes
3125
3126       In order for "XML::XPath" to be used as the XPath engine the following
3127       methods are included in "XML::Twig":
3128
3129       in XML::Twig
3130
3131       getRootNode
3132       getParentNode
3133       getChildNodes
3134
3135       in XML::Twig::Elt
3136
3137       string_value
3138       toString
3139       getName
3140       getRootNode
3141       getNextSibling
3142       getPreviousSibling
3143       isElementNode
3144       isTextNode
3145       isPI
3146       isPINode
3147       isProcessingInstructionNode
3148       isComment
3149       isCommentNode
3150       getTarget
3151       getChildNodes
3152       getElementById
3153
3154       XML::Twig::XPath::Elt
3155
3156       The methods you can use are the same as on "XML::Twig::XPath" elements:
3157
3158       findnodes              ($path)
3159           return a list of nodes found by $path.
3160
3161       findnodes_as_string    ($path)
3162           return the nodes found reproduced as XML. The result is not guaran‐
3163           teed to be valid XML though.
3164
3165       findvalue              ($path)
3166           return the concatenation of the text content of the result nodes
3167
3168       XML::Twig::Entity_list
3169
3170       new Create an entity list.
3171
3172       add         ($ent)
3173           Add an entity to an entity list.
3174
3175       add_new_ent ($name, $val, $sysid, $pubid, $ndata)
3176           Create a new entity and add it to the entity list
3177
       delete     ($ent or $tag)
3179           Delete an entity (defined by its name or by the Entity object) from
3180           the list.
3181
3182       print      ($optional_filehandle)
3183           Print the entity list.
3184
3185       list
3186           Return the list as an array
3187
3188       XML::Twig::Entity
3189
3190       new        ($name, $val, $sysid, $pubid, $ndata)
3191           Same arguments as the Entity handler for XML::Parser.
3192
3193       print       ($optional_filehandle)
3194           Print an entity declaration.
3195
3196       name
3197           Return the name of the entity
3198
3199       val Return the value of the entity
3200
3201       sysid
3202           Return the system id for the entity (for NDATA entities)
3203
3204       pubid
3205           Return the public id for the entity (for NDATA entities)
3206
3207       ndata
3208           Return true if the entity is an NDATA entity
3209
3210       text
3211           Return the entity declaration text.
3212

EXAMPLES

       Additional examples (and a complete tutorial) can be found on the
       XML::Twig Page <http://www.xmltwig.com/xmltwig/>
3216
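       To figure out what flush does, call the following script with an XML
       file and an element name as arguments:

         use strict;
         use warnings;
         use XML::Twig;

         # usage: script.pl <file.xml> <element name>
         my ($file, $elt)= @ARGV;

         # flush the twig (print and free what has been parsed so far) each
         # time an $elt element is closed, and mark where the flush happened
         my $t= XML::Twig->new( twig_handlers =>
             { $elt => sub {$_[0]->flush; print "\n[flushed here]\n";} });
         $t->parsefile( $file, ErrorContext => 2);
         $t->flush;      # output the end of the document
         print "\n";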
3228

NOTES

3230       Subclassing XML::Twig
3231
3232       Useful methods:
3233
3234       elt_class
3235           In order to subclass "XML::Twig" you will probably need to subclass
3236           also "XML::Twig::Elt". Use the "elt_class" option when you create
3237           the "XML::Twig" object to get the elements created in a different
           class (which should be a subclass of "XML::Twig::Elt").
3239
3240       add_options
           If you inherit "XML::Twig" new method but want to add more options
           to it you can use this method to prevent XML::Twig from issuing
           warnings for those additional options.
3244
3245       DTD Handling
3246
3247       There are 3 possibilities here.  They are:
3248
3249       No DTD
3250           No doctype, no DTD information, no entity information, the world is
3251           simple...
3252
3253       Internal DTD
3254           The XML document includes an internal DTD, and maybe entity decla‐
3255           rations.
3256
3257           If you use the load_DTD option when creating the twig the DTD
3258           information and the entity declarations can be accessed.
3259
3260           The DTD and the entity declarations will be "flush"'ed (or
3261           "print"'ed) either as is (if they have not been modified) or as
3262           reconstructed (poorly, comments are lost, order is not kept, due to
           its content this DTD should not be viewed by anyone) if they have
3264           been modified. You can also modify them directly by changing the
3265           "$twig->{twig_doctype}->{internal}" field (straight from
3266           XML::Parser, see the "Doctype" handler doc)
3267
3268       External DTD
3269           The XML document includes a reference to an external DTD, and maybe
3270           entity declarations.
3271
           If you use the "load_DTD" option when creating the twig the DTD
           information and the entity declarations can be accessed. The
           entity declarations will be "flush"'ed (or "print"'ed) either as
           is (if they have not been modified) or as reconstructed (badly,
           comments are lost, order is not kept).
3277
           You can change the doctype through the "$twig->set_doctype" method
           and print the dtd through the "$twig->dtd_text" or
           "$twig->dtd_print" methods.
3282
3283           If you need to modify the entity list this is probably the easiest
3284           way to do it.
3285
3286       Flush
3287
3288       If you set handlers and use "flush", do not forget to flush the twig
3289       one last time AFTER the parsing, or you might be missing the end of the
3290       document.
3291
3292       Remember that element handlers are called when the element is CLOSED,
3293       so if you have handlers for nested elements the inner handlers will be
       called first. This makes it trickier than it would seem to number
       nested clauses, for example.
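
       The order in which handlers fire can be checked with a short sketch
       that records each call. The element names "outer" and "inner" and
       the @order array are invented for the illustration.

```perl
use strict;
use warnings;
use XML::Twig;

my @order;   # tags, in the order their handlers are called

my $twig = XML::Twig->new(
    twig_handlers => {
        outer => sub { push @order, $_[1]->tag },
        inner => sub { push @order, $_[1]->tag },
    },
);
$twig->parse('<doc><outer><inner/></outer></doc>');

# the inner element is closed first, so its handler runs first
print "@order\n";
```

       A numbering scheme that relies on the outer element having been seen
       first therefore needs to be applied from the outer handler, once all
       its descendants are available.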
3296

BUGS

3298       entity handling
           Due to XML::Parser behaviour, non-base entities in attribute values
           disappear: 'att="val&ent;"' will be turned into "att => val",
           unless you use the "keep_encoding" argument to "XML::Twig->new"
3302
3303       DTD handling
           The DTD handling methods are quite buggy. No one uses them and it
3305           seems very difficult to get them to work in all cases, including
3306           with several slightly incompatible versions of XML::Parser and of
3307           libexpat.
3308
3309           Basically you can read the DTD, output it back properly, and update
3310           entities, but not much more.
3311
3312           So use XML::Twig with standalone documents, or with documents
           referring to an external DTD, but don't expect it to properly parse
3314           and even output back the DTD.
3315
3316       memory leak
3317           If you use a lot of twigs you might find that you leak quite a lot
           of memory (about 2KB per twig). You can use the "dispose" method
3319           to free that memory after you are done.
3320
3321           If you create elements the same thing might happen, use the
3322           "delete" method to get rid of them.
3323
3324           Alternatively installing the "Scalar::Util" (or "WeakRef") module
3325           on a version of Perl that supports it (>5.6.0) will get rid of the
3326           memory leaks automagically.
3327
3328       ID list
3329           The ID list is NOT updated when elements are cut or deleted.
3330
3331       change_gi
3332           This method will not function properly if you do:
3333
3334                $twig->change_gi( $old1, $new);
3335                $twig->change_gi( $old2, $new);
3336                $twig->change_gi( $new, $even_newer);
3337
3338       sanity check on XML::Parser method calls
3339           XML::Twig should really prevent calls to some XML::Parser methods,
3340           especially the "setHandlers" method.
3341
3342       pretty printing
3343           Pretty printing (at least using the '"indented"' style) is hard to
3344           get right!  Only elements that belong to the document will be prop‐
3345           erly indented. Printing elements that do not belong to the twig
3346           makes it impossible for XML::Twig to figure out their depth, and
3347           thus their indentation level.
3348
3349           Also there is an unavoidable bug when using "flush" and pretty
3350           printing for elements with mixed content that start with an embed‐
3351           ded element:
3352
3353             <elt><b>b</b>toto<b>bold</b></elt>
3354
3355             will be output as
3356
3357             <elt>
3358               <b>b</b>toto<b>bold</b></elt>
3359
3360           if you flush the twig when you find the "<b>" element
3361

Globals

3363       These are the things that can mess up calling code, especially if
       threaded.  They might also cause problems under mod_perl.
3365
3366       Exported constants
3367           Whether you want them or not you get them! These are subroutines to
           use as constants when creating or testing elements
3369
3370             PCDATA  return '#PCDATA'
3371             CDATA   return '#CDATA'
3372             PI      return '#PI', I had the choice between PROC and PI :--(
3373
3374       Module scoped values: constants
3375           these should cause no trouble:
3376
3377             %base_ent= ( '>' => '&gt;',
3378                          '<' => '&lt;',
3379                          '&' => '&amp;',
3380                          "'" => '&apos;',
3381                          '"' => '&quot;',
3382                        );
3383             CDATA_START   = "<![CDATA[";
3384             CDATA_END     = "]]>";
3385             PI_START      = "<?";
3386             PI_END        = "?>";
3387             COMMENT_START = "<!--";
3388             COMMENT_END   = "-->";
3389
3390           pretty print styles
3391
3392             ( $NSGMLS, $NICE, $INDENTED, $INDENTED_C, $WRAPPED, $RECORD1, $RECORD2)= (1..7);
3393
3394           empty tag output style
3395
3396             ( $HTML, $EXPAND)= (1..2);
3397
3398       Module scoped values: might be changed
3399           Most of these deal with pretty printing, so the worst that can hap‐
3400           pen is probably that XML output does not look right, but is still
3401           valid and processed identically by XML processors.
3402
           $empty_tag_style can mess up HTML browsers though and changing $ID
3404           would most likely create problems.
3405
3406             $pretty=0;           # pretty print style
3407             $quote='"';          # quote for attributes
3408             $INDENT= '  ';       # indent for indented pretty print
3409             $empty_tag_style= 0; # how to display empty tags
3410             $ID                  # attribute used as an id ('id' by default)
3411
3412       Module scoped values: definitely changed
3413           These 2 variables are used to replace tags by an index, thus saving
3414           some space when creating a twig. If they really cause you too much
3415           trouble, let me know, it is probably possible to create either a
3416           switch or at least a version of XML::Twig that does not perform
3417           this optimisation.
3418
3419             %gi2index;     # tag => index
3420             @index2gi;     # list of tags
3421
3422       If you need to manipulate all those values, you can use the following
3423       methods on the XML::Twig object:
3424
3425       global_state
           Return a hashref with all the global variables used by XML::Twig
3427
3428           The hash has the following fields:  "pretty", "quote", "indent",
3429           "empty_tag_style", "keep_encoding", "expand_external_entities",
3430           "output_filter", "output_text_filter", "keep_atts_order"
3431
3432       set_global_state ($state)
3433           Set the global state, $state is a hashref
3434
3435       save_global_state
3436           Save the current global state
3437
3438       restore_global_state
           Restore the previously saved (using "save_global_state") state
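
       A typical use of these methods is to change an output setting
       temporarily and then put things back. The sketch below is a minimal
       illustration on a made-up document: it saves the state, switches to
       the "indented" pretty print style, and restores the saved state.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><p>a</p></doc>');

my $before = $twig->root->sprint;

$twig->save_global_state;                # remember the current settings
$twig->set_pretty_print('indented');     # change a module-wide setting
my $indented = $twig->root->sprint;

$twig->restore_global_state;             # back to the saved settings
my $after = $twig->root->sprint;

print $indented;
print "restored\n" if $before eq $after;
```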
3440

TODO

3442       SAX handlers
3443           Allowing XML::Twig to work on top of any SAX parser
3444
3445       multiple twigs are not well supported
3446           A number of twig features are just global at the moment. These
3447           include the ID list and the "tag pool" (if you use "change_gi" then
3448           you change the tag for ALL twigs).
3449
           A future version will try to support this while trying not to be too
           hard on performance (at least when a single twig is used!).
3452

AUTHOR

3454       Michel Rodriguez <mirod@xmltwig.com>
3455

LICENSE

3457       This library is free software; you can redistribute it and/or modify it
3458       under the same terms as Perl itself.
3459
3460       Bug reports should be sent using: RT
3461       <http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-Twig>
3462
3463       Comments can be sent to mirod@xmltwig.com
3464
       The XML::Twig page is at <http://www.xmltwig.com/xmltwig/>. It includes
       the development version of the module, a slightly better version of the
       documentation, examples, and a tutorial, Processing XML efficiently
       with Perl and XML::Twig:
       <http://www.xmltwig.com/xmltwig/tutorial/index.html>
3470

SEE ALSO

3472       Complete docs, including a tutorial, examples, an easier to use HTML
3473       version of the docs, a quick reference card and a FAQ are available at
3474       <http://www.xmltwig.com/xmltwig/>
3475
3476       XML::Parser, XML::Parser::Expat, XML::XPath, Encode, Text::Iconv,
       Scalar::Util
3478
3479       Alternative Modules
3480
       XML::Twig is not the only XML processing module available on CPAN (far
3482       from it!).
3483
3484       The main alternative I would recommend is XML::LibXML.
3485
3486       Here is a quick comparison of the 2 modules:
3487
3488       XML::LibXML, actually "libxml2" on which it is based, sticks to the
3489       standards, and implements a good number of them in a rather strict way:
3490       XML, XPath, DOM, RelaxNG, I must be forgetting a couple (XInclude?). It
3491       is fast and rather frugal memory-wise.
3492
3493       XML::Twig is older: when I started writing it XML::Parser/expat was the
3494       only game in town. It implements XML and that's about it (plus a subset
3495       of XPath, and you can use XML::Twig::XPath if you have XML::XPath
3496       installed for full support). It is slower and requires more memory for
3497       a full tree than XML::LibXML. On the plus side (yes, there is a plus
       side!) it lets you process a big document in chunks, and thus lets you
3499       tackle documents that couldn't be loaded in memory by XML::LibXML, and
3500       it offers a lot (and I mean a LOT!) of higher-level methods, for every‐
3501       thing, from adding structure to "low-level" XML, to shortcuts for XHTML
3502       conversions and more. It also DWIMs quite a bit, getting comments and
3503       non-significant whitespaces out of the way but preserving them in the
       output for example. As it does not stick to the DOM, it also usually
3505       leads to shorter code than in XML::LibXML.
3506
3507       Beyond the pure features of the 2 modules, XML::LibXML seems to be
       preferred by "XML-purists", while XML::Twig seems to be more used by
3509       Perl Hackers who have to deal with XML. As you have noted, XML::Twig
3510       also comes with quite a lot of docs, but I am sure if you ask for help
3511       about XML::LibXML here or on Perlmonks you will get answers.
3512
3513       Note that it is actually quite hard for me to compare the 2 modules: on
3514       one hand I know XML::Twig inside-out and I can get it to do pretty much
3515       anything I need to (or I improve it ;--), while I have a very basic
3516       knowledge of XML::LibXML.  So feature-wise, I'd rather use XML::Twig
3517       ;--). On the other hand, I am painfully aware of some of the deficien‐
3518       cies, potential bugs and plain ugly code that lurk in XML::Twig, even
3519       though you are unlikely to be affected by them (unless for example you
       need to change the DTD of a document programmatically), while I haven't
       looked much into XML::LibXML so it still looks shiny and clean to me.
3522
       That said, if you need to process a document that is too big to fit in
       memory and XML::Twig is too slow for you, my reluctant advice would be to
3525       use "bare" XML::Parser.  It won't be as easy to use as XML::Twig: basi‐
3526       cally with XML::Twig you trade some speed (depending on what you do
3527       from a factor 3 to... none) for ease-of-use, but it will be easier IMHO
3528       than using SAX (albeit not standard), and at this point a LOT faster
       (see the last test in
       <http://www.xmltwig.com/article/simple_benchmark/>).
3531
3532
3533
3534perl v5.8.8                       2007-02-13                           Twig(3)