Twig(3)               User Contributed Perl Documentation              Twig(3)

NAME

       XML::Twig - A Perl module for processing huge XML documents in tree
       mode.

SYNOPSIS

       Note that this documentation is intended as a reference to the module.

       Complete docs, including a tutorial, examples, an easier to use HTML
       version, a quick reference card and a FAQ are available at
       <http://www.xmltwig.com/xmltwig>

       Small documents (loaded in memory as a tree):

         my $twig=XML::Twig->new();    # create the twig
         $twig->parsefile( 'doc.xml'); # build it
         my_process( $twig);           # use twig methods to process it
         $twig->print;                 # output the twig

       Huge documents (processed in combined stream/tree mode):

         # at most one div will be loaded in memory
         my $twig=XML::Twig->new(
           twig_handlers =>
             { title   => sub { $_->set_tag( 'h2') }, # change title tags to h2
               para    => sub { $_->set_tag( 'p')  }, # change para to p
               hidden  => sub { $_->delete;       },  # remove hidden elements
               list    => \&my_list_process,          # process list elements
               div     => sub { $_[0]->flush;     },  # output and free memory
             },
           pretty_print => 'indented',                # output will be nicely formatted
           empty_tags   => 'html',                    # outputs <empty_tag />
                                );
         $twig->parsefile( 'doc.xml');                # parse, triggering the handlers
         $twig->flush;                                # flush the end of the document

       See XML::Twig 101 for other ways to use the module, as a filter for
       example.

DESCRIPTION

       This module provides a way to process XML documents. It is built on top
       of "XML::Parser".

       The module offers a tree interface to the document, while allowing you
       to output the parts of it that have been completely processed.

       It allows minimal resource (CPU and memory) usage by building the tree
       only for the parts of the documents that need actual processing,
       through the use of the "twig_roots" and "twig_print_outside_roots"
       options. The "finish" and "finish_print" methods also help to
       improve performance.

       XML::Twig tries to make simple things easy, so it does its best to
       take care of a lot of the (usually) annoying (but sometimes necessary)
       features that come with XML and XML::Parser.

XML::Twig 101

       XML::Twig can be used either on "small" XML documents (that fit in
       memory) or on huge ones, by processing parts of the document and
       outputting or discarding them once they are processed.

   Loading an XML document and processing it
         my $t= XML::Twig->new();
         $t->parse( '<d><title>title</title><para>p 1</para><para>p 2</para></d>');
         my $root= $t->root;
         $root->set_tag( 'html');                 # change doc to html
         my $title= $root->first_child( 'title'); # get the title
         $title->set_tag( 'h1');                  # turn it into h1
         my @para= $root->children( 'para');      # get the para children
         foreach my $para (@para)
           { $para->set_tag( 'p'); }              # turn them into p
         $t->print;                               # output the document

       Other useful methods include:

       att: "$elt->{'att'}->{'foo'}" returns the "foo" attribute for an
       element,

       set_att: "$elt->set_att( foo => "bar")" sets the "foo" attribute to
       the "bar" value,

       next_sibling: "$elt->{next_sibling}" returns the next sibling in the
       document (in the example "$title->{next_sibling}" is the first "para";
       you can also, and actually should, use "$elt->next_sibling( 'para')" to
       get it).

       The document can also be transformed through the use of the cut, copy,
       paste and move methods: "$title->cut; $title->paste( after => $p);" for
       example.

       And much, much more, see XML::Twig::Elt.

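       As an illustration of the cut and paste methods mentioned above, here
       is a minimal sketch that moves the title after the first para. It
       reuses the sample document from this section; the variable names
       ($first_para, $result) are chosen for the example only.

```perl
use strict;
use warnings;
use XML::Twig;

my $t= XML::Twig->new();
$t->parse( '<d><title>title</title><para>p 1</para><para>p 2</para></d>');

my $root=       $t->root;
my $title=      $root->first_child( 'title');    # the title element
my $first_para= $title->next_sibling( 'para');   # the first para after it

$title->cut;                                     # detach the title from the tree
$title->paste( after => $first_para);            # re-attach it after the first para

my $result= $root->sprint;                       # the modified tree as a string
print "$result\n";
```

       Using sprint instead of print returns the serialized tree as a string,
       which is convenient when the result needs further checking.
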
   Processing an XML document chunk by chunk
       One of the strengths of XML::Twig is that it lets you work with files
       that do not fit in memory (BTW storing an XML document in memory as a
       tree is quite memory-expensive, the expansion factor often being around
       10).

       To do this you can define handlers that will be called once a specific
       element has been completely parsed. In these handlers you can access
       the element and process it as you see fit, using the navigation and the
       cut-n-paste methods, plus lots of convenient ones like "prefix". Once
       the element is completely processed you can then "flush" it, which
       will output it and free the memory. You can also "purge" it if you
       don't need to output it (if you are just extracting some data from the
       document for example). The handler will be called again once the next
       relevant element has been parsed.

         my $t= XML::Twig->new( twig_handlers =>
                                 { section => \&section,
                                   para   => sub { $_->set_tag( 'p'); }
                                 },
                              );
         $t->parsefile( 'doc.xml');
         $t->flush; # don't forget to flush one last time in the end or anything
                    # after the last </section> tag will not be output

         # the handler is called once a section is completely parsed, ie when
         # the end tag for section is found, it receives the twig itself and
         # the element (including all its sub-elements) as arguments
         sub section
           { my( $t, $section)= @_;      # arguments for all twig_handlers
             $section->set_tag( 'div');  # change the tag name, my favourite method...
             # let's use the attribute nb as a prefix to the title
             my $title= $section->first_child( 'title'); # find the title
             my $nb= $title->{'att'}->{'nb'}; # get the attribute
             $title->prefix( "$nb - ");  # easy isn't it?
             $section->flush;            # outputs the section and frees memory
           }

       There is of course more to it: you can trigger handlers on more
       elaborate conditions than just the name of the element, "section/title"
       for example.

         my $t= XML::Twig->new( twig_handlers =>
                                  { 'section/title' => sub { $_->print } }
                              )
                         ->parsefile( 'doc.xml');

       Here "sub { $_->print }" simply prints the current element ($_ is
       aliased to the element in the handler).

       You can also trigger a handler on a test on an attribute:

         my $t= XML::Twig->new( twig_handlers =>
                             { 'section[@level="1"]' => sub { $_->print } }
                              )
                         ->parsefile( 'doc.xml');

       You can also use "start_tag_handlers" to process an element as soon as
       the start tag is found. Besides "prefix" you can also use "suffix".

   Processing just parts of an XML document
       The twig_roots mode builds only the required sub-trees from the
       document. Anything outside of the twig roots will just be ignored:

         my $t= XML::Twig->new(
              # the twig will include just the root and selected titles
                  twig_roots   => { 'section/title' => \&print_n_purge,
                                    'annex/title'   => \&print_n_purge
                  }
                             );
         $t->parsefile( 'doc.xml');

         sub print_n_purge
           { my( $t, $elt)= @_;
             print $elt->text;    # print the text (including sub-element texts)
             $t->purge;           # frees the memory
           }

       You can use that mode when you want to process parts of a document but
       are not interested in the rest and you don't want to pay the price,
       either in time or memory, to build the tree for it.

   Building an XML filter
       You can combine the "twig_roots" and the "twig_print_outside_roots"
       options to build filters, which let you modify selected elements and
       will output the rest of the document as is.

       This would convert prices in $ to prices in Euro in a document:

         my $t= XML::Twig->new(
                  twig_roots   => { 'price' => \&convert, },   # process prices
                  twig_print_outside_roots => 1,               # print the rest
                             );
         $t->parsefile( 'doc.xml');

         sub convert
           { my( $t, $price)= @_;
             my $currency=  $price->{'att'}->{'currency'};     # get the currency
             if( $currency eq 'USD')
               { my $usd_price= $price->text;                  # get the price
                 # %rate is just a conversion table
                 my $euro_price= $usd_price * $rate{usd2euro};
                 $price->set_text( $euro_price);               # set the new price
                 $price->set_att( currency => 'EUR');          # don't forget this!
               }
             $price->print;                                    # output the price
           }

   XML::Twig and various versions of Perl, XML::Parser and expat:
       Before being uploaded to CPAN, XML::Twig 3.22 has been tested under the
       following environments:

       linux-x86
           perl 5.6.2, expat 1.95.8, XML::Parser 2.34
           perl 5.8.0, expat 1.95.8, XML::Parser 2.34
           perl 5.8.7, expat 1.95.8, XML::Parser 2.34

       Solaris
           perl 5.6.1, expat 1.95.2, XML::Parser 2.31

       XML::Twig is a lot more sensitive to variations in versions of perl,
       XML::Parser and expat than to the OS, so this should cover some
       reasonable configurations.

       The "recommended configuration" is perl 5.8.3+ (for good Unicode
       support), XML::Parser 2.31+ and expat 1.95.5+

       See <http://testers.cpan.org/search?request=dist&dist=XML-Twig> for the
       CPAN testers reports on XML::Twig, which list all tested
       configurations.

       An Atom feed of the CPAN Testers results is available at
       <http://xmltwig.com/rss/twig_testers.rss>

       Finally:

       XML::Twig does NOT work with expat 1.95.4
       XML::Twig only works with XML::Parser 2.27 in perl 5.6.*
           Note that I can't compile XML::Parser 2.27 anymore, so I can't
           guarantee that it still works

       XML::Parser 2.28 does not really work

       When in doubt, upgrade expat, XML::Parser and Scalar::Util

       Finally, for some optional features, XML::Twig depends on some
       additional modules. The complete list, which depends somewhat on the
       version of Perl that you are running, is given by running
       "t/zz_dump_config.t"

Simplifying XML processing

       Whitespaces
           Whitespaces that look non-significant are discarded. This behaviour
           can be controlled using the "keep_spaces", "keep_spaces_in" and
           "discard_spaces_in" options.

       Encoding
           You can specify that you want the output in the same encoding as
           the input (provided you have valid XML, which means you have to
           specify the encoding either in the document or when you create the
           Twig object) using the "keep_encoding" option.

           You can also use "output_encoding" to convert the internal UTF-8
           format to the required encoding.

       Comments and Processing Instructions (PI)
           Comments and PIs can be hidden from the processing, but still
           appear in the output (they are carried by the "real" element
           closest to them).

       Pretty Printing
           XML::Twig can output the document pretty printed so it is easier to
           read for us humans.

       Surviving an untimely death
           XML parsers are supposed to react violently when fed improper XML.
           XML::Parser just dies.

           XML::Twig provides the "safe_parse" and the "safe_parsefile"
           methods which wrap the parse in an eval and return either the
           parsed twig or 0 in case of failure.

       Private attributes
           Attributes with a name starting with # (illegal in XML) will not be
           output, so you can safely use them to store temporary values during
           processing. Note that you can store anything in a private
           attribute, not just text: it's just a regular Perl variable, so a
           reference to an object or a huge data structure is perfectly fine.

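       Two of the features above, "safe_parse" and private attributes, can be
       sketched together as follows. This is only an illustration: the sample
       documents and the "#seen" attribute name are made up for the example,
       and it assumes that the error message of a failed parse is left in $@.

```perl
use strict;
use warnings;
use XML::Twig;

# malformed XML: safe_parse returns a false value instead of dying
my $bad= XML::Twig->new();
my $ok=  $bad->safe_parse( '<doc><unclosed></doc>');
print "parse failed: $@\n" unless $ok;

# well-formed XML: safe_parse returns a true value
my $t= XML::Twig->new();
$t->safe_parse( '<doc><item>text</item></doc>') or die "unexpected failure: $@";

my $item= $t->root->first_child( 'item');
$item->set_att( '#seen' => [ 1, 2, 3]);   # private attribute holding a reference
$item->set_att( id => 'i1');              # regular attribute

my $out= $t->root->sprint;                # '#seen' is left out of the output
print "$out\n";
```

       The private attribute stays available to the rest of the program
       through the usual attribute methods, it is simply skipped on output.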

CLASSES

       XML::Twig uses a very limited number of classes. The ones you are most
       likely to use are "XML::Twig" of course, which represents a complete
       XML document, including the document itself (the root of the document
       itself is "root"), its handlers, its input or output filters... The
       other main class is "XML::Twig::Elt", which models an XML element.
       Element here has a very wide definition: it can be a regular element,
       but also text, with an element "tag" of "#PCDATA" (or "#CDATA"), an
       entity (tag is "#ENT"), a Processing Instruction ("#PI") or a comment
       ("#COMMENT").

       Those are the 2 commonly used classes.

       You might want to look at the "elt_class" option if you want to
       subclass "XML::Twig::Elt".

       Attributes are just attached to their parent element, they are not
       objects per se. (Please use the provided methods "att" and "set_att" to
       access them. If you access them as a hash, your code becomes
       implementation dependent and might break in the future.)

       Other classes that are seldom used are "XML::Twig::Entity_list" and
       "XML::Twig::Entity".

       If you use "XML::Twig::XPath" instead of "XML::Twig", elements are then
       created as "XML::Twig::XPath::Elt".

METHODS

   XML::Twig
       A twig is a subclass of XML::Parser, so all XML::Parser methods can be
       called on a twig object, including parse and parsefile.  "setHandlers"
       on the other hand cannot be used, see "BUGS".

       new This is a class method, the constructor for XML::Twig. Options are
           passed as keyword value pairs. Recognized options are the same as
           XML::Parser, plus some XML::Twig specifics.

           New Options:

           twig_handlers
               This argument consists of a hash "{ expression => \&handler }"
               where expression is an XPath-like expression (+ some others).

               XPath expressions are limited to using the child and descendant
               axes (indeed you can't specify an axis), and predicates cannot
               be nested.  You can use the "string", or "string(<tag>)"
               function (except in "twig_roots" triggers).

               Additionally you can use regexps (/ delimited) to match
               attribute and string values.

               Examples:

                 foo
                 foo/bar
                 foo//bar
                 /foo/bar
                 /foo//bar
                 /foo/bar[@att1 = "val1" and @att2 = "val2"]/baz[@a >= 1]
                 foo[string()=~ /^duh!+/]
                 /foo[string(bar)=~ /\d+/]/baz[@att != 3]

               #CDATA can be used to call a handler for a CDATA section.
               #COMMENT can be used to call a handler for comments.

               Some additional (non-XPath) expressions are also provided for
               convenience:

               processing instructions
                   '?' or '#PI' triggers the handler for any processing
                   instruction, and '?<target>' or '#PI <target>' triggers a
                   handler for processing instructions with the given target
                   (ex: '#PI xml-stylesheet').

               level(<level>)
                   Triggers the handler on any element at that level in the
                   tree (root is level 1)

               _all_
                   Triggers the handler for all elements in the tree

               _default_
                   Triggers the handler for each element that does NOT have
                   any other handler.

               Expressions are evaluated against the input document, which
               means that even if you have changed the tag of an element
               (changing the tag of a parent element from a handler for
               example) the change will not impact the expression evaluation.
               There is an exception to this: "private" attributes (whose
               names start with a '#', and which can only be created during
               the parsing, as they are not valid XML) are checked against
               the current twig.

               Handlers are triggered in fixed order, sorted by their type
               (xpath expressions first, then regexps, then level), then by
               whether they specify a full path (starting at the root element)
               or not, then by number of steps in the expression, then
               number of predicates, then number of tests in predicates.
               Handlers where the last step does not specify a tag
               ("foo/bar/*") are triggered after other XPath handlers.
               Finally "_all_" handlers are triggered last.

               Important: once a handler has been triggered, if it returns 0
               then no other handler is called, except an "_all_" handler
               which will be called anyway.

               If a handler returns a true value and other handlers apply,
               then the next applicable handler will be called, and so on.
               The exception to that rule is when the "do_not_chain_handlers"
               option is set, in which case only the first handler will be
               called.

               Note that it might be a good idea to explicitly return a short
               true value (like 1) from handlers: this ensures that other
               applicable handlers are called even if the last statement for
               the handler happens to evaluate to false. This might also
               speed up the code by avoiding the result of the last statement
               of the code being copied and passed to the code managing
               handlers.  It can really pay to have 1 instead of a long string
               returned.

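               The return value rules above can be sketched as follows. This
               is a small illustration only, based on the behaviour described
               in this section: the sample document and the counters
               ($para_count, @all_seen) are made up for the example.

```perl
use strict;
use warnings;
use XML::Twig;

my $para_count= 0;   # number of times the para handler ran
my @all_seen;        # tags seen by the _all_ handler

my $t= XML::Twig->new(
  twig_handlers =>
    { # returning 0 stops the chain of regular handlers for this element
      para  => sub { $para_count++; return 0; },
      # _all_ is documented as being called regardless of that return value
      _all_ => sub { push @all_seen, $_->tag; return 1; },
    },
);
$t->parse( '<doc><para>p 1</para><para>p 2</para></doc>');

print "para handler calls: $para_count\n";
print "_all_ handler saw: @all_seen\n";
```

               Returning an explicit 1 or 0, as both handlers do here, keeps
               the chaining behaviour independent of whatever the last
               statement of the handler happens to evaluate to.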
               When an element is CLOSED the corresponding handler is called,
               with 2 arguments: the twig and the "Element". The twig
               includes the document tree that has been built so far, the
               element is the complete sub-tree for the element. This means
               that handlers for inner elements are called before handlers for
               outer elements.

               $_ is also set to the element, so it is easy to write inline
               handlers like

                 para => sub { $_->set_tag( 'p'); }

               Text is stored in elements whose tag is #PCDATA (due to mixed
               content, text and sub-element in an element, there is no way to
               store the text as just an attribute of the enclosing element).

               Warning: if you have used purge or flush on the twig the
               element might not be complete, some of its children might have
               been entirely flushed or purged, and the start tag might even
               have been printed (by "flush") already, so changing its tag
               might not give the expected result.

           twig_roots
               This argument lets you build the tree only for those elements
               you are interested in.

                 Example: my $t= XML::Twig->new( twig_roots => { title => 1, subtitle => 1});
                          $t->parsefile( $file);
                          my $t= XML::Twig->new( twig_roots => { 'section/title' => 1});
                          $t->parsefile( $file);

               These return a twig containing a document including only the
               selected elements ("title" and "subtitle" in the first case),
               as children of the root element.

               You can use generic_attribute_condition, attribute_condition,
               full_path, partial_path, tag, tag_regexp, _default_ and _all_
               to trigger the building of the twig.  string_condition and
               regexp_condition cannot be used, as the content of the element,
               and the string, have not yet been parsed when the condition is
               checked.

               WARNING: paths are checked against the document. Even if the
               "twig_roots" option is used they will be checked against the
               full document tree, not the virtual tree created by XML::Twig

               WARNING: twig_roots elements should NOT be nested, that would
               hopelessly confuse XML::Twig ;--(

               Note: you can set handlers (twig_handlers) using twig_roots

                 Example: my $t= XML::Twig->new( twig_roots =>
                                                  { title    => sub { $_[1]->print; },
                                                    subtitle => \&process_subtitle
                                                  }
                                              );
                          $t->parsefile( $file);

           twig_print_outside_roots
               To be used in conjunction with the "twig_roots" argument. When
               set to a true value this will print the document outside of the
               "twig_roots" elements.

                 Example: my $t= XML::Twig->new( twig_roots => { title => \&number_title },
                                                 twig_print_outside_roots => 1,
                                               );
                          $t->parsefile( $file);
                          { my $nb;
                            sub number_title
                              { my( $twig, $title)= @_;
                                $nb++;
                                $title->prefix( "$nb ");
                                $title->print;
                              }
                          }

               This example prints the document outside of the title element,
               calls "number_title" for each "title" element, prints it, and
               then resumes printing the document. The twig is built only for
               the "title" elements.

               If the value is a reference to a file handle then the document
               outside the "twig_roots" elements will be output to this file
               handle:

                 open( OUT, ">out_file") or die "cannot open out file out_file:$!";
                 my $t= XML::Twig->new( twig_roots => { title => \&number_title },
                                        # default output to OUT
                                        twig_print_outside_roots => \*OUT,
                                      );

                        { my $nb;
                          sub number_title
                            { my( $twig, $title)= @_;
                              $nb++;
                              $title->prefix( "$nb ");
                              $title->print( \*OUT);    # you have to print to \*OUT here
                            }
                        }

           start_tag_handlers
               A hash "{ expression => \&handler }". Sets element handlers
               that are called when the element is open (at the end of the
               XML::Parser "Start" handler). The handlers are called with 2
               params: the twig and the element. The element is empty at that
               point, its attributes are created though.

               You can use generic_attribute_condition, attribute_condition,
               full_path, partial_path, tag, tag_regexp, _default_  and _all_
               to trigger the handler.

               string_condition and regexp_condition cannot be used, as the
               content of the element, and the string, have not yet been
               parsed when the condition is checked.

               The main uses for those handlers are to change the tag name
               (you might have to do it as soon as you find the open tag if
               you plan to "flush" the twig at some point in the element), and
               to create temporary attributes that will be used when
               processing sub-elements with "twig_handlers".

               You should also use it to change tags if you use "flush". If
               you change the tag in a regular "twig_handler" then the start
               tag might already have been flushed.

               Note: "start_tag" handlers can be called outside of
               "twig_roots" if this argument is used. In this case handlers
               are called with the following arguments: $t (the twig), $tag
               (the tag of the element) and %att (a hash of the attributes of
               the element).

               If the "twig_print_outside_roots" argument is also used and the
               last handler called returns a "true" value, then the start
               tag will be output as it appeared in the original document. If
               the handler returns a "false" value then the start tag will
               not be printed (so you can print a modified string yourself for
               example).

               Note that you can use the ignore method in "start_tag_handlers"
               (and only there).

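               The second use mentioned above, creating temporary attributes
               for later handlers, might look like the following sketch. The
               sample document, the "#nb" private attribute and the counter
               are assumptions made for the illustration.

```perl
use strict;
use warnings;
use XML::Twig;

my $nb= 0;   # running section counter

my $t= XML::Twig->new(
  start_tag_handlers =>
    { # called on <section>: store the section number in a private attribute
      section => sub { my( $twig, $section)= @_;
                       $section->set_att( '#nb' => ++$nb);
                       return 1;
                     },
    },
  twig_handlers =>
    { # called on </title>: read the number back from the parent section
      'section/title' => sub { my( $twig, $title)= @_;
                               my $section_nb= $title->parent->att( '#nb');
                               $title->prefix( "$section_nb - ");
                               return 1;
                             },
    },
);
$t->parse( '<doc><section><title>A</title></section>'
         . '<section><title>B</title></section></doc>');

my $out= $t->root->sprint;   # private attributes are not part of the output
print "$out\n";
```

               The attribute is set when the start tag is seen, so it is
               already available when the handler for the nested title runs.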
           end_tag_handlers
               A hash "{ expression => \&handler }". Sets element handlers
               that are called when the element is closed (at the end of the
               XML::Parser "End" handler). The handlers are called with 2
               params: the twig and the tag of the element.

               twig_handlers are called when an element is completely parsed,
               so why have this redundant option? There is only one use for
               "end_tag_handlers": when using the "twig_roots" option, to
               trigger a handler for an element outside the roots.  It is for
               example very useful to number titles in a document using nested
               sections:

                 my @no= (0);
                 my $no;
                 my $t= XML::Twig->new(
                         start_tag_handlers =>
                          { section => sub { $no[$#no]++; $no= join '.', @no; push @no, 0; } },
                         twig_roots         =>
                          { title   => sub { $_[1]->prefix( $no); $_[1]->print; } },
                         end_tag_handlers   => { section => sub { pop @no;  } },
                         twig_print_outside_roots => 1
                                     );
                 $t->parsefile( $file);

               Using the "end_tag_handlers" argument without "twig_roots" will
               result in an error.

           do_not_chain_handlers
               If this option is set to a true value, then only one handler
               will be called for each element, even if several satisfy the
               condition.

               Note that the "_all_" handler will still be called regardless.

           ignore_elts
               This option lets you ignore elements when building the twig.
               This is useful in cases where you cannot use "twig_roots" to
               ignore elements, for example if the element to ignore is a
               sibling of elements you are interested in.

               Example:

                 my $twig= XML::Twig->new( ignore_elts => { elt => 1 });
                 $twig->parsefile( 'doc.xml');

               This will build the complete twig for the document, except that
               all "elt" elements (and their children) will be left out.

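               To make the effect visible, here is a small self-contained
               sketch that parses a string instead of a file. The sample
               document and its "keep" and "sub" tags are made up for the
               illustration.

```perl
use strict;
use warnings;
use XML::Twig;

my $twig= XML::Twig->new( ignore_elts => { elt => 1 });
$twig->parse( '<doc><keep>a</keep><elt><sub>b</sub></elt><keep>c</keep></doc>');

# the elt element and its sub child were never added to the tree
my $out= $twig->root->sprint;
print "$out\n";
```

               The ignored element sits between two siblings that are kept,
               which is the case "twig_roots" cannot handle easily.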
           char_handler
               A reference to a subroutine that will be called every time
               "PCDATA" is found.

               The subroutine receives the string as argument, and returns the
               modified string:

                 # we want all strings in upper case
                 sub my_char_handler
                   { my( $text)= @_;
                     $text= uc( $text);
                     return $text;
                   }

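               The handler still has to be passed to the constructor. A
               minimal sketch of the wiring, with a made-up sample document,
               could look like this:

```perl
use strict;
use warnings;
use XML::Twig;

# we want all strings in upper case
sub my_char_handler
  { my( $text)= @_;
    return uc( $text);
  }

my $t= XML::Twig->new( char_handler => \&my_char_handler);
$t->parse( '<doc><p>hello</p><p>world</p></doc>');

# collect the (already upper-cased) text of each p element
my $text= join ' ', map { $_->text } $t->root->children( 'p');
print "$text\n";
```

               The handler may be called several times for a single run of
               text, so it is safest to keep it stateless, as it is here.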
           elt_class
               The name of a class used to store elements. This class should
               inherit from "XML::Twig::Elt" (and by default it is
               "XML::Twig::Elt"). This option is used to subclass the element
               class and extend it with new methods.

               This option is needed because during the parsing of the XML,
               elements are created by "XML::Twig", without any control from
               the user code.

           keep_atts_order
               Setting this option to a true value causes the attribute hash
               to be tied to a "Tie::IxHash" object.  This means that
               "Tie::IxHash" needs to be installed for this option to be
               available. It also means that the hash keeps its order, so you
               will get the attributes in order. This allows outputting the
               attributes in the same order as they were in the original
               document.

628           keep_encoding
629               This is a (slightly?) evil option: if the XML document is not
630               UTF-8 encoded and you want to keep it that way, then setting
631               keep_encoding will use the"Expat" original_string method for
632               character, thus keeping the original encoding, as well as the
633               original entities in the strings.
634
635               See the "t/test6.t" test file to see what results you can
636               expect from the various encoding options.
637
638               WARNING: if the original encoding is multi-byte then attribute
639               parsing will be EXTREMELY unsafe under any Perl before 5.6, as
640               it uses regular expressions which do not deal properly with
641               multi-byte characters. You can specify an alternate function to
642               parse the start tags with the "parse_start_tag" option (see
643               below).
644
645               WARNING: this option is NOT used when parsing with the non-
646               blocking parser ("parse_start", "parse_more", "parse_done"
647               methods), which you probably should not use with XML::Twig
648               anyway as they are totally untested!
649
650           output_encoding
651               This option generates an output_filter using "Encode",
652               "Text::Iconv" or "Unicode::Map8" and "Unicode::String", and
653               sets the encoding in the XML declaration. This is the easiest
654               way to deal with encodings. If you need more sophisticated
655               features, look at "output_filter" below.
656
657           output_filter
658               This option is used to convert the character encoding of the
659               output document.  It is passed either a string corresponding to
660               a predefined filter or a subroutine reference. The filter will
661               be called every time a document or element is processed by the
662               "print" functions ("print", "sprint", "flush").
663
664               Pre-defined filters:
665
666               latin1
667                   uses either "Encode", "Text::Iconv" or "Unicode::Map8" and
668                   "Unicode::String" or a regexp (which works only with
669                   XML::Parser 2.27), in this order, to convert all characters
670                   to ISO-8859-1 (aka latin1)
671
672               html
673                   does the same conversion as "latin1", plus encodes entities
674                   using "HTML::Entities" (oddly enough you will need to have
675                   HTML::Entities installed for it to be available). This
676                   should only be used if the tags and attribute names
677                   themselves are in US-ASCII, or they will be converted and
678                   the output will not be valid XML any more
679
680               safe
681                   converts the output to ASCII (US) only, plus character
682                   entities ("&#nnn;"). This should be used only if the tags
683                   and attribute names themselves are in US-ASCII, or they
684                   will be converted and the output will not be valid XML any
685                   more
686
687               safe_hex
688                   same as "safe" except that the character entities are in
689                   hexadecimal ("&#xnnn;")
690
691               encode_convert ($encoding)
692                   Return a subref that can be used to convert utf8 strings to
693                   $encoding.  Uses "Encode".
694
695                      my $conv = XML::Twig::encode_convert( 'latin1');
696                      my $t = XML::Twig->new(output_filter => $conv);
697
698               iconv_convert ($encoding)
699                   this function is used to create a filter subroutine that
700                   will be used to convert the characters to the target
701                   encoding using "Text::Iconv" (which needs to be installed,
702                   look at the documentation for the module and for the
703                   "iconv" library to find out which encodings are available
704                   on your system)
705
706                      my $conv = XML::Twig::iconv_convert( 'latin1');
707                      my $t = XML::Twig->new(output_filter => $conv);
708
709               unicode_convert ($encoding)
710                   this function is used to create a filter subroutine that
711                   will be used to convert the characters to the target
712                   encoding using "Unicode::String" and "Unicode::Map8"
713                   (which need to be installed, look at the documentation for
714                   the modules to find out which encodings are available on
715                   your system)
716
717                      my $conv = XML::Twig::unicode_convert( 'latin1');
718                      my $t = XML::Twig->new(output_filter => $conv);
719
720               The "text" and "att" methods do not use the filter, so their
721               results are always in unicode.
722
723               Those predeclared filters are based on subroutines that can be
724               used by themselves (as "XML::Twig::foo").
725
726               html_encode ($string)
727                   Use "HTML::Entities" to encode a utf8 string
728
729               safe_encode ($string)
730                   Use either a regexp (perl < 5.8) or "Encode" to encode non-
731                   ascii characters in the string in "&#<nnnn>;" format
732
733               safe_encode_hex ($string)
734                   Use either a regexp (perl < 5.8) or "Encode" to encode non-
735                   ascii characters in the string in "&#x<nnnn>;" format
736
737               regexp2latin1 ($string)
738                   Use a regexp to encode a utf8 string into latin 1
739                   (ISO-8859-1). Does not work with Perl 5.8.0!
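
               As an illustration of calling one of these subroutines
               directly, here is a minimal sketch. It assumes a Perl recent
               enough (5.8 or later) for "Encode" to be used, and the sample
               string is arbitrary:

```perl
use strict;
use warnings;
use XML::Twig;

# "caf\x{e9}" is "cafe" with an acute accent: the last character is not US-ASCII.
my $string  = "caf\x{e9}";

# Replace non-ASCII characters with decimal numeric character references.
my $encoded = XML::Twig::safe_encode( $string);
print "$encoded\n";
```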
740
741           output_text_filter
742               same as output_filter, except it doesn't apply to the brackets
743               and quotes around attribute values. This is useful for all
744               filters that could change the tagging, basically anything that
745               does not just change the encoding of the output. "html", "safe"
746               and "safe_hex" are better used with this option.
747
748           input_filter
749               This option is similar to "output_filter" except the filter is
750               applied to the characters before they are stored in the twig,
751               at parsing time.
752
753           remove_cdata
754               Setting this option to a true value will force the twig to
755               output CDATA sections as regular (escaped) PCDATA
756
757           parse_start_tag
758               If you use the "keep_encoding" option then this option can be
759               used to replace the default parsing function. You should
760               provide a coderef (a reference to a subroutine) as the
761               argument, this subroutine takes the original tag (given by
762               XML::Parser::Expat "original_string()" method) and returns a
763               tag and the attributes in a hash (or in a list
764               attribute_name/attribute value).
765
766           expand_external_ents
767               When this option is used external entities (that are defined)
768               are expanded when the document is output using "print"
769               functions such as "print", "sprint", "flush" and "xml_string".
770               Note that in the twig the entity will be stored as an
771               element with a tag '"#ENT"'; the entity will not be expanded
772               there, so you might want to process the entities before
773               outputting it.
774
775               If an external entity is not available, then the parse will
776               fail.
777
778               A special case is when the value of this option is -1. In that
779               case a missing entity will not cause the parser to die, but its
780               "name", "sysid" and "pubid" will be stored in the twig as
781               "$twig->{twig_missing_system_entities}" (a reference to an
782               array of hashes { name => <name>, sysid => <sysid>, pubid =>
783               <pubid> }). Yes, this is a bit of a hack, but it's useful in
784               some cases.
785
786           load_DTD
787               If this argument is set to a true value, "parse" or "parsefile"
788               on the twig will load the DTD information. This information
789               can then be accessed through the twig, in a "DTD_handler" for
790               example. This will load even an external DTD.
791
792               Default and fixed values for attributes will also be filled,
793               based on the DTD.
794
795               Note that to do this the module will generate a temporary file
796               in the current directory. If this is a problem let me know and
797               I will add an option to specify an alternate directory.
798
799               See "DTD Handling" for more information
800
801           DTD_handler
802               Set a handler that will be called once the doctype (and the
803               DTD) have been loaded, with 2 arguments, the twig and the DTD.
804
805           no_prolog
806               Does not output a prolog (XML declaration and DTD)
807
808           id  This optional argument gives the name of an attribute that can
809               be used as an ID in the document. Elements whose ID is known
810               can be accessed through the elt_id method. id defaults to 'id'.
811               See "BUGS".
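
               A short sketch of the option in use, assuming a document whose
               elements carry a "key" attribute (the attribute name and the
               sample data are invented for the example):

```perl
use strict;
use warnings;
use XML::Twig;

# Treat the "key" attribute, rather than the default "id", as the element ID.
my $twig = XML::Twig->new( id => 'key');
$twig->parse( '<doc><item key="a1">first</item><item key="b2">second</item></doc>');

# Elements can then be looked up directly by the value of that attribute.
my $item = $twig->elt_id( 'b2');
print $item->text, "\n";
```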
812
813           discard_spaces
814               If this optional argument is set to a true value then spaces
815               are discarded when they look non-significant: strings
816               containing only spaces are discarded.  This argument is set to
817               true by default.
818
819           keep_spaces
820               If this optional argument is set to a true value then all
821               spaces in the document are kept, and stored as "PCDATA".
822
823               Warning: adding this option can result in changes in the twig
824               generated: space that was previously discarded might end up in
825               a new text element. See the difference by calling the following
826               code with 0 and 1 as arguments:
827
828                 perl -MXML::Twig -e'print XML::Twig->new( keep_spaces => shift)->parse( "<d> \n<e/></d>")->_dump'
829
830               "keep_spaces" and "discard_spaces" cannot both be set.
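
               The same comparison can be sketched as a small script that
               lists the children of the root in each case (the sample
               document is the one used in the one-liner above):

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = "<d> \n<e/></d>";
my %count;

for my $keep ( 0, 1)
  { my $twig     = XML::Twig->new( keep_spaces => $keep)->parse( $xml);
    my @children = $twig->root->children;
    $count{$keep} = scalar @children;
    # text children show up with the tag #PCDATA
    printf "keep_spaces => %d: %d child(ren): %s\n",
           $keep, scalar @children, join( ', ', map { $_->tag } @children);
  }
```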
831
832           discard_spaces_in
833               This argument sets "keep_spaces" to true but will cause the
834               twig builder to discard spaces in the elements listed.
835
836               The syntax for using this argument is:
837
838                 XML::Twig->new( discard_spaces_in => [ 'elt1', 'elt2']);
839
840           keep_spaces_in
841               This argument sets "discard_spaces" to true but will cause the
842               twig builder to keep spaces in the elements listed.
843
844               The syntax for using this argument is:
845
846                 XML::Twig->new( keep_spaces_in => [ 'elt1', 'elt2']);
847
848               Warning: adding this option can result in changes in the twig
849               generated: space that was previously discarded might end up in
850               a new text element.
851
852           pretty_print
853               Set the pretty print method, amongst '"none"' (default),
854               '"nsgmls"', '"nice"', '"indented"', '"indented_c"',
855               '"indented_a"', '"indented_close_tag"', '"cvs"', '"wrapped"',
856               '"record"' and '"record_c"'
857
858               pretty_print formats:
859
860               none
861                   The document is output as one long string, with no line
862                   breaks except those found within text elements
863
864               nsgmls
865                   Line breaks are inserted in safe places: that is within
866                   tags, between a tag and an attribute, between attributes
867                   and before the > at the end of a tag.
868
869                   This is quite ugly but better than "none", and it is very
870                   safe, the document will still be valid (conforming to its
871                   DTD).
872
873                   This is how the SGML parser "sgmls" splits documents, hence
874                   the name.
875
876               nice
877                   This option inserts line breaks before any tag that does
878                   not contain text (so elements with textual content are not
879                   broken, as the \n would be significant).
880
881                   WARNING: this option leaves the document well-formed but
882                   might make it invalid (not conformant to its DTD). If you
883                   have elements declared as
884
885                     <!ELEMENT foo (#PCDATA|bar)>
886
887                   then a "foo" element including a "bar" one will be printed
888                   as
889
890                     <foo>
891                     <bar>bar is just pcdata</bar>
892                     </foo>
893
894                   This is invalid, as the parser will take the line break
895                   after the "foo" tag as a sign that the element contains
896                   PCDATA, it will then die when it finds the "bar" tag. This
897                   may or may not be important for you, but be aware of it!
898
899               indented
900                   Same as "nice" (and with the same warning) but indents
901                   elements according to their level
902
903               indented_c
904                   Same as "indented" but a little more compact: the closing
905                   tags are on the same line as the preceding text
906
907               indented_close_tag
908                   Same as "indented" except that the closing tag is also
909                   indented, to line up with the tags within the element
910
911               indented_a
912                   This formats XML files in a line-oriented version control
913                   friendly way.  The format is described in
914                   <http://tinyurl.com/2kwscq> (that's an Oracle document with
915                   an insanely long URL).
916
917                   Note that to be totally conformant to the "spec", the order
918                   of attributes should not be changed, so if they are not
919                   already in alphabetical order you will need to use the
920                   "keep_atts_order" option.
921
922               cvs Same as "indented_a".
923
924               wrapped
925                   Same as "indented_c" but lines are wrapped using
926                   Text::Wrap::wrap. The default length for lines is the
927                   default for $Text::Wrap::columns, and can be changed by
928                   changing that variable.
929
930               record
931                   This is a record-oriented pretty print, that displays data
932                   in records, one field per line (which looks a LOT like
933                   "indented")
934
935               record_c
936                   Stands for record compact, one record per line
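
               To compare a few of these formats side by side, something
               along these lines can be used (a sketch only; the sample
               document is arbitrary and the exact layout may vary between
               versions of the module):

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = '<doc><section><title>T</title><para>text</para></section></doc>';
my %out;

for my $style ( 'none', 'indented', 'record_c')
  { my $twig = XML::Twig->new( pretty_print => $style)->parse( $xml);
    $out{$style} = $twig->sprint;           # the whole document as a string
    print "--- $style ---\n", $out{$style}, "\n";
  }
```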
937
938           empty_tags
939               Set the empty tag display style ('"normal"', '"html"' or
940               '"expand"').
941
942               "normal" outputs an empty tag '"<tag/>"', "html" adds a space
943               '"<tag />"' for elements that can be empty in XHTML and
944               "expand" outputs '"<tag></tag>"'
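
               A minimal sketch showing the three styles on the same empty
               element (the "br" element is just a convenient example):

```perl
use strict;
use warnings;
use XML::Twig;

my %out;
for my $style ( 'normal', 'html', 'expand')
  { my $twig = XML::Twig->new( empty_tags => $style)->parse( '<doc><br/></doc>');
    $out{$style} = $twig->sprint;
    print "$style: $out{$style}\n";
  }
```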
945
946           quote
947               Set the quote character for attributes ('"single"' or
948               '"double"').
949
950           escape_gt
951               By default XML::Twig does not escape the character > in its
952               output, as it is not mandated by the XML spec. With this option
953               on, > will be replaced by "&gt;"
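
               The "quote" and "escape_gt" options can be combined, as in
               this sketch (the element and attribute names are made up):

```perl
use strict;
use warnings;
use XML::Twig;

# Single quotes around attribute values, and > escaped in the text.
my $twig = XML::Twig->new( quote => 'single', escape_gt => 1);
$twig->parse( '<doc att="v">a &gt; b</doc>');

my $out = $twig->sprint;
print "$out\n";
```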
954
955           comments
956               Set the way comments are processed: '"drop"' (default),
957               '"keep"' or '"process"'
958
959               Comments processing options:
960
961               drop
962                   drops the comments, they are not read, nor printed to the
963                   output
964
965               keep
966                   comments are loaded and will appear on the output, they are
967                   not accessible within the twig and will not interfere with
968                   processing though
969
970                   Note: comments in the middle of a text element such as
971
972                     <p>text <!-- comment --> more text</p>
973
974                   are kept at their original position in the text. Using
975                   "print" methods like "print" or "sprint" will return the
976                   comments in the text. Using "text" or "field" on the other
977                   hand will not.
978
979                   Any use of "set_pcdata" on the "#PCDATA" element (directly
980                   or through other methods like "set_content") will delete
981                   the comment(s).
982
983               process
984                   comments are loaded in the twig and will be treated as
985                   regular elements (their "tag" is "#COMMENT"). This can
986                   interfere with processing if you expect
987                   "$elt->{first_child}" to be an element but find a comment
988                   there.  Validation will not protect you from this as
989                   comments can happen anywhere.  You can use
990                   "$elt->first_child( 'tag')" (which is a good habit anyway)
991                   to get where you want.
992
993                   Consider using "process" if you are outputting SAX events
994                   from XML::Twig.
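
               The effect of "process" on navigation can be sketched as
               follows (the sample document is invented for the example):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new( comments => 'process');
$twig->parse( '<doc><!-- a note --><p>text</p></doc>');

my $root    = $twig->root;
my $first   = $root->first_child;          # the comment is the first child
my $first_p = $root->first_child( 'p');    # asking for a tag skips the comment

print $first->tag, "\n";
print $first_p->tag, "\n";
```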
995
996           pi  Set the way processing instructions are processed: '"drop"',
997               '"keep"' (default) or '"process"'
998
999               Note that you can also set PI handlers in the "twig_handlers"
1000               option:
1001
1002                 '?'       => \&handler
1003                 '?target' => \&handler_2
1004
1005               The handlers will be called with 2 parameters, the twig and the
1006               PI element if "pi" is set to "process", and with 3, the twig,
1007               the target and the data if "pi" is set to "keep". Of course
1008               they will not be called if "pi" is set to "drop".
1009
1010               If "pi" is set to "keep" the handler should return a string
1011               that will be used as-is as the PI text (it should look like
1012               "<?target data?>", or '' if you want to remove the PI).
1013
1014               Only one handler will be called, "?target" or "?" if no
1015               specific handler for that target is available.
1016
1017           map_xmlns
1018               This option is passed a hashref that maps URIs to prefixes.
1019               The prefixes in the document will be replaced by the ones in
1020               the map. The mapped prefixes can (actually have to) be used to
1021               trigger handlers, navigate or query the document.
1022
1023               Here is an example:
1024
1025                 my $t= XML::Twig->new( map_xmlns => {'http://www.w3.org/2000/svg' => "svg"},
1026                                        twig_handlers =>
1027                                          { 'svg:circle' => sub { $_->set_att( r => 20) } },
1028                                        pretty_print => 'indented',
1029                                      )
1030                                 ->parse( '<doc xmlns:gr="http://www.w3.org/2000/svg">
1031                                             <gr:circle cx="10" cy="90" r="10"/>
1032                                          </doc>'
1033                                        )
1034                                 ->print;
1035
1036               This will output:
1037
1038                 <doc xmlns:svg="http://www.w3.org/2000/svg">
1039                    <svg:circle cx="10" cy="90" r="20"/>
1040                 </doc>
1041
1042           keep_original_prefix
1043               When used with "map_xmlns" this option will make "XML::Twig"
1044               use the original namespace prefixes when outputting a document.
1045               The mapped prefix will still be used for triggering handlers
1046               and in navigation and query methods.
1047
1048                 my $t= XML::Twig->new( map_xmlns => {'http://www.w3.org/2000/svg' => "svg"},
1049                                        twig_handlers =>
1050                                          { 'svg:circle' => sub { $_->set_att( r => 20) } },
1051                                        keep_original_prefix => 1,
1052                                        pretty_print => 'indented',
1053                                      )
1054                                 ->parse( '<doc xmlns:gr="http://www.w3.org/2000/svg">
1055                                             <gr:circle cx="10" cy="90" r="10"/>
1056                                          </doc>'
1057                                        )
1058                                 ->print;
1059
1060               This will output:
1061
1062                 <doc xmlns:gr="http://www.w3.org/2000/svg">
1063                    <gr:circle cx="10" cy="90" r="20"/>
1064                 </doc>
1065
1066           index ($arrayref or $hashref)
1067               This option creates lists of specific elements during the
1068               parsing of the XML.  It takes a reference to either a list of
1069               triggering expressions or to a hash name => expression, and for
1070               each one generates the list of elements that match the
1071               expression. The list can be accessed through the "index"
1072               method.
1073
1074               example:
1075
1076                 # using an array ref
1077                 my $t= XML::Twig->new( index => [ 'div', 'table' ])
1078                                 ->parsefile( 'foo.xml');
1079                 my $divs= $t->index( 'div');
1080                 my $first_div= $divs->[0];
1081                 my $last_table= $t->index( table => -1);
1082
1083                 # using a hashref to name the indexes
1084                 my $t= XML::Twig->new( index => { email => 'a[@href=~/^\s*mailto:/]'})
1085                                 ->parsefile( 'foo.xml');
1086                 my $last_emails= $t->index( email => -1);
1087
1088               Note that the index is not maintained after the parsing. If
1089               elements are deleted, renamed or otherwise hurt during
1090               processing, the index is NOT updated.
1091
1092           Note: I _HATE_ the Java-like name of arguments used by most XML
1093           modules.  So in pure TIMTOWTDI fashion all arguments can be written
1094           either as "UglyJavaLikeName" or as "readable_perl_name":
1095           "twig_print_outside_roots" or "TwigPrintOutsideRoots" (or even
1096           "twigPrintOutsideRoots" {shudder}).  XML::Twig normalizes them
1097           before processing them.
1098
1099       parse ( $source)
1100           The $source parameter should either be a string containing the
1101           whole XML document, or it should be an open "IO::Handle".
1102           Constructor options to "XML::Parser::Expat" given as keyword-value
1103           pairs may follow the $source parameter. These override, for this
1104           call, any options or attributes passed through from the XML::Parser
1105           instance.
1106
1107           A die call is thrown if a parse error occurs. Otherwise it will
1108           return the twig built by the parse. Use "safe_parse" if you want
1109           the parsing to return even when an error occurs.
1110
1111           If this method is called as a class method ("XML::Twig->parse(
1112           $some_xml_or_html)") then an XML::Twig object is created, using the
1113           parameters except the last one (eg "XML::Twig->parse( pretty_print
1114           => 'indented', $some_xml_or_html)") and "xparse" is called on it.
1115
1116       parsestring
1117           This is just an alias for "parse" for backwards compatibility.
1118
1119       parsefile (FILE [, OPT => OPT_VALUE [...]])
1120           Open "FILE" for reading, then call "parse" with the open handle.
1121           The file is closed no matter how "parse" returns.
1122
1123           A "die" call is thrown if a parse error occurs. Otherwise it will
1124           return the twig built by the parse. Use "safe_parsefile" if you
1125           want the parsing to return even when an error occurs.
1126
1127       parsefile_inplace ( $file, $optional_extension)
1128           Parse and update a file "in place". It does this by creating a temp
1129           file, selecting it as the default for print() statements (and
1130           methods), then parsing the input file. If the parsing is
1131           successful, then the temp file is moved to replace the input file.
1132
1133           If an extension is given then the original file is backed-up (the
1134           rules for the extension are the same as the rule for the -i option
1135           in perl).
1136
1137       parsefile_html_inplace ( $file, $optional_extension)
1138           Same as parsefile_inplace, except that it parses HTML instead of
1139           XML
1140
1141       parseurl ($url $optional_user_agent)
1142           Gets the data from $url and parses it. The data is piped to the
1143           parser in chunks the size of the XML::Parser::Expat buffer, so
1144           memory consumption and hopefully speed are optimal.
1145
1146           For most (read "small") XML it is probably as efficient (and easier
1147           to debug) to just "get" the XML file and then parse it as a string.
1148
1149             use XML::Twig;
1150             use LWP::Simple;
1151             my $twig= XML::Twig->new();
1152             $twig->parse( LWP::Simple::get( $URL ));
1153
1154           or
1155
1156             use XML::Twig;
1157             my $twig= XML::Twig->nparse( $URL);
1158
1159           If the $optional_user_agent argument is used then it is used,
1160           otherwise a new one is created.
1161
1162       safe_parse ( SOURCE [, OPT => OPT_VALUE [...]])
1163           This method is similar to "parse" except that it wraps the parsing
1164           in an "eval" block. It returns the twig on success and 0 on failure
1165           (the twig object also contains the parsed twig). $@ contains the
1166           error message on failure.
1167
1168           Note that the parsing still stops as soon as an error is detected,
1169           there is no way to keep going after an error.
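
           A minimal sketch of the usual calling pattern, using one
           well-formed and one malformed string (both invented for the
           example):

```perl
use strict;
use warnings;
use XML::Twig;

my @ok;
for my $xml ( '<doc><p>fine</p></doc>', '<doc><p>broken</doc>')
  { my $twig = XML::Twig->new;
    if( $twig->safe_parse( $xml))
      { push @ok, 1;
        print "parsed, root is ", $twig->root->tag, "\n";
      }
    else
      { push @ok, 0;
        print "parse failed: $@\n";       # $@ holds the parser's error message
      }
  }
```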
1170
1171       safe_parsefile (FILE [, OPT => OPT_VALUE [...]])
1172           This method is similar to "parsefile" except that it wraps the
1173           parsing in an "eval" block. It returns the twig on success and 0 on
1174           failure (the twig object also contains the parsed twig). $@
1175           contains the error message on failure.
1176
1177           Note that the parsing still stops as soon as an error is detected,
1178           there is no way to keep going after an error.
1179
1180       safe_parseurl ($url $optional_user_agent)
1181           Same as "parseurl" except that it wraps the parsing in an "eval"
1182           block. It returns the twig on success and 0 on failure (the twig
1183           object also contains the parsed twig). $@ contains the error
1184           message on failure.
1185
1186       parse_html ($string_or_fh)
1187           parse an HTML string or file handle (by converting it to XML using
1188           HTML::TreeBuilder, which needs to be available).
1189
1190           This works nicely, but some information gets lost in the process:
1191           newlines are removed, and (at least on the version I use), comments
1192           get an extra CDATA section inside ( <!-- foo --> becomes <!--
1193           <![CDATA[ foo ]]> -->).
1194
1195       parsefile_html
1196           parse an HTML file (by converting it to XML using
1197           HTML::TreeBuilder, which needs to be available). The file is loaded
1198           completely in memory and converted to XML before being parsed.
1199
1200           Alpha: implementation, and thus generated XML could change.
1201
1202       safe_parseurl_html ($url $optional_user_agent)
1203           Same as "parseurl_html" except that it wraps the parsing in an
1204           "eval" block.  It returns the twig on success and 0 on failure (the
1205           twig object also contains the parsed twig). $@ contains the error
1206           message on failure.
1207
1208       safe_parsefile_html ($file $optional_user_agent)
1209           Same as "parsefile_html" except that it wraps the parsing in an
1210           "eval" block.  It returns the twig on success and 0 on failure (the
1211           twig object also contains the parsed twig). $@ contains the error
1212           message on failure.
1213
1214       safe_parse_html ($string_or_fh)
1215           Same as "parse_html" except that it wraps the parsing in an "eval"
1216           block.  It returns the twig on success and 0 on failure (the twig
1217           object also contains the parsed twig). $@ contains the error
1218           message on failure.
1219
1220       xparse ($thing_to_parse)
1221           parse the $thing_to_parse, whether it is a filehandle, a string, an
1222           HTML file, an HTML URL, a URL or a file.
1223
1224           Note that this is mostly a convenience method for one-off scripts.
1225           For example files that end in '.htm' or '.html' are parsed first as
1226           XML, and if this fails as HTML. This is certainly not the most
1227           efficient way to do this in general.
1228
1229       nparse ($optional_twig_options, $thing_to_parse)
1230           create a twig with the $optional_twig_options, and parse the
1231           $thing_to_parse, whether it is a filehandle, a string, an HTML
1232           file, an HTML URL, a URL or a file.
1233
1234           Examples:
1235
1236              XML::Twig->nparse( "file.xml");
1237              XML::Twig->nparse( error_context => 1, "file://file.xml");
1238
1239       nparse_pp ($optional_twig_options, $thing_to_parse)
1240           same as "nparse" but also sets the "pretty_print" option to
1241           "indented".
1242
1243       nparse_e ($optional_twig_options, $thing_to_parse)
1244           same as "nparse" but also sets the "error_context" option to 1.
1245
1246       nparse_ppe ($optional_twig_options, $thing_to_parse)
1247           same as "nparse" but also sets the "pretty_print" option to
1248           "indented" and the "error_context" option to 1.
1249
1250       parser
1251           This method returns the "expat" object (actually the
1252           XML::Parser::Expat object) used during parsing. It is useful for
1253           example to call XML::Parser::Expat methods on it. To get the line
1254           of a tag for example use "$t->parser->current_line".
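
           For instance, line numbers can be collected from inside a handler,
           as in this sketch (the sample document is arbitrary):

```perl
use strict;
use warnings;
use XML::Twig;

my @lines;
my $twig = XML::Twig->new(
  twig_handlers =>
    { p => sub { my( $t, $p) = @_;
                 # current_line is an XML::Parser::Expat method
                 push @lines, $t->parser->current_line;
               },
    },
);
$twig->parse( "<doc>\n<p>one</p>\n<p>two</p>\n</doc>");

print "p elements were completed on lines: @lines\n";
```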
1255
1256       setTwigHandlers ($handlers)
1257           Set the twig_handlers. $handlers is a reference to a hash similar
1258           to the one in the "twig_handlers" option of new. All previous
1259           handlers are unset.  The method returns the reference to the
1260           previous handlers.
1261
1262       setTwigHandler ($exp $handler)
1263           Set a single twig_handler for elements matching $exp. $handler is a
1264           reference to a subroutine. If the handler was previously set then
1265           the reference to the previous handler is returned.
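
           A short sketch of setting a handler after the twig has been
           created (the "title" elements are invented for the example):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
my @titles;

# No handler was set for title before, so there is no previous handler here.
my $previous = $twig->setTwigHandler( title => sub { push @titles, $_->text });

$twig->parse( '<doc><title>One</title><title>Two</title></doc>');
print join( ', ', @titles), "\n";
```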
1266
1267       setStartTagHandlers ($handlers)
1268           Set the start_tag handlers. $handlers is a reference to a hash
1269           similar to the one in the "start_tag_handlers" option of new. All
1270           previous handlers are unset.  The method returns the reference to
1271           the previous handlers.
1272
1273       setStartTagHandler ($exp $handler)
1274           Set a single start_tag handler for elements matching $exp.
1275           $handler is a reference to a subroutine. If the handler was
1276           previously set then the reference to the previous handler is
1277           returned.
1278
1279       setEndTagHandlers ($handlers)
1280           Set the end_tag handlers. $handlers is a reference to a hash
1281           similar to the one in the "end_tag_handlers" option of new. All
1282           previous handlers are unset.  The method returns the reference to
1283           the previous handlers.
1284
1285       setEndTagHandler ($exp, $handler)
1286           Set a single end_tag handler for elements matching $exp. $handler
1287           is a reference to a subroutine. If the handler was previously set
1288           then the reference to the previous handler is returned.
1289
1290       setTwigRoots ($handlers)
1291           Same as using the "twig_roots" option when creating the twig
1292
1293       setCharHandler ($exp, $handler)
1294           Set a "char_handler"
1295
1296       setIgnoreEltsHandler ($exp)
1297           Set an "ignore_elt" handler (elements that match $exp will be
1298           ignored)
1299
1300       setIgnoreEltsHandlers ($exp)
1301           Set all "ignore_elt" handlers (previous handlers are replaced)
1302
1303       dtd Return the dtd (an XML::Twig::DTD object) of a twig
1304
1305       xmldecl
1306           Return the XML declaration for the document, or a default one if it
1307           doesn't have one
1308
1309       doctype
1310           Return the doctype for the document
1311
1312       doctype_name
1313           returns the doctype of the document from the doctype declaration
1314
1315       system_id
1316           returns the system value of the DTD of the document from the
1317           doctype declaration
1318
1319       public_id
1320           returns the public doctype of the document from the doctype
1321           declaration
1322
1323       internal_subset
1324           returns the internal subset of the DTD
1325
1326       dtd_text
1327           Return the DTD text
1328
1329       dtd_print
1330           Print the DTD
1331
1332       model ($tag)
1333           Return the model (in the DTD) for the element $tag
1334
1335       root
1336           Return the root element of a twig
1337
1338       set_root ($elt)
1339           Set the root of a twig
1340
1341       first_elt ($optional_condition)
1342           Return the first element matching $optional_condition of a twig, if
1343           no condition is given then the root is returned
1344
1345       last_elt ($optional_condition)
1346           Return the last element matching $optional_condition of a twig, if
1347           no condition is given then the last element of the twig is returned
1348
1349       elt_id        ($id)
1350           Return the element whose "id" attribute is $id
1351
1352       getEltById
1353           Same as "elt_id"
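
               The lookup methods above can be tried on a small in-memory
               document. This is only a sketch; the document and the id values
               are made up for the example:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document with two sections carrying "id" attributes.
my $xml = '<doc><sec id="s1"><p>first</p></sec><sec id="s2"><p>last</p></sec></doc>';

my $twig = XML::Twig->new();
$twig->parse($xml);

my $root    = $twig->first_elt;         # no condition: the root element
my $first_p = $twig->first_elt('p');    # first <p> in document order
my $sec     = $twig->elt_id('s2');      # element whose id attribute is "s2"

print $root->tag, "\n";
print $first_p->text, "\n";
print $sec->first_child_text('p'), "\n";
```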
1354
1355       index ($index_name, $optional_index)
1356           If the $optional_index argument is present, return the
1357           corresponding element in the index (created using the "index"
1358           option for "XML::Twig->new")
1359
1360           If the argument is not present, return an arrayref to the index
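
               A short sketch of both call forms, assuming the index was
               requested with an array reference of element names when the twig
               was created (the document is illustrative):

```perl
use strict;
use warnings;
use XML::Twig;

# Ask the twig to index <item> elements while parsing.
my $twig = XML::Twig->new( index => ['item'] );
$twig->parse('<list><item>a</item><item>b</item><item>c</item></list>');

my $all    = $twig->index('item');       # array reference to the whole index
my $second = $twig->index( 'item', 1 );  # element at position 1 in the index

print scalar(@$all), " items, second is '", $second->text, "'\n";
```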
1361
1362       normalize
1363           merge together all consecutive pcdata elements in the document (if
1364           for example you have turned some elements into pcdata using
1365           "erase", this will give you a "clean" document in which all
1366           text elements are as long as possible).
1367
1368       encoding
1369           This method returns the encoding of the XML document, as defined by
1370           the "encoding" attribute in the XML declaration (ie it is "undef"
1371           if the attribute is not defined)
1372
1373       set_encoding
1374           This method sets the value of the "encoding" attribute in the XML
1375           declaration.  Note that if the document did not have a declaration
1376           it is generated (with an XML version of 1.0)
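
               The following sketch shows the declaration being created on a
               document that had none. The input and the chosen encoding name
               are assumptions made for the example:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new();
$twig->parse('<doc>text</doc>');    # the input has no XML declaration

my $before = $twig->encoding;       # nothing declared yet
$twig->set_encoding('ISO-8859-1');  # creates the declaration if needed

my $after   = $twig->encoding;
my $version = $twig->xml_version;
my $decl    = $twig->xmldecl;

print 'before: ', ( defined $before ? $before : 'undef' ), "\n";
print "after:  $after (XML version $version)\n";
print $decl;
```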
1377
1378       xml_version
1379           This method returns the XML version, as defined by the "version"
1380           attribute in the XML declaration (ie it is "undef" if the attribute
1381           is not defined)
1382
1383       set_xml_version
1384           This method sets the value of the "version" attribute in the XML
1385           declaration.  If the declaration did not exist it is created.
1386
1387       standalone
1388           This method returns the value of the "standalone" declaration for
1389           the document
1390
1391       set_standalone
1392           This method sets the value of the "standalone" attribute in the XML
1393           declaration.  Note that if the document did not have a declaration
1394           it is generated (with an XML version of 1.0)
1395
1396       set_output_encoding
1397           Set the "encoding" "attribute" in the XML declaration
1398
1399       set_doctype ($name, $system, $public, $internal)
1400           Set the doctype of the document. If an argument is "undef" (or not
1401           present) then its former value is retained; if a false ('' or 0)
1402           value is passed then the former value is deleted.
1403
1404       entity_list
1405           Return the entity list of a twig
1406
1407       entity_names
1408           Return the list of all defined entities
1409
1410       entity ($entity_name)
1411           Return the entity
1412
1413       change_gi      ($old_gi, $new_gi)
1414           Performs a (very fast) global change. All elements $old_gi are now
1415           $new_gi. This is a bit dangerous though and should be avoided if
1416           possible, as the new tag might be ignored in subsequent processing.
1417
1418           See "BUGS"
1419
1420       flush            ($optional_filehandle, %options)
1421           Flushes a twig up to (and including) the current element, then
1422           deletes all unnecessary elements from the tree that's kept in
1423           memory.  "flush" keeps track of which elements need to be
1424           open/closed, so if you flush from handlers you don't have to worry
1425           about anything. Just keep flushing the twig every time you're done
1426           with a sub-tree and it will come out well-formed. After the whole
1427           parse, don't forget to "flush" one more time to print the end of
1428           the document.  The doctype and entity declarations are also
1429           printed.
1430
1431           "flush" takes an optional filehandle as an argument.
1432
1433           options: use the "update_DTD" option if you have updated the
1434           (internal) DTD and/or the entity list and you want the updated DTD
1435           to be output
1436
1437           The "pretty_print" option sets the pretty printing of the document.
1438
1439              Example: $t->flush( Update_DTD => 1);
1440                       $t->flush( $filehandle, pretty_print => 'indented');
1441                       $t->flush( \*FILE);
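
               A minimal sketch of the flush-as-you-go pattern, writing to an
               in-memory filehandle so the result can be inspected. The
               document, the handler body and the variable names are
               illustrative assumptions:

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = '<doc><item>one</item><item>two</item></doc>';

# Collect the flushed output in a scalar through an in-memory filehandle.
my $flushed = '';
open my $out, '>', \$flushed or die "cannot open in-memory handle: $!";

my $twig = XML::Twig->new(
    twig_handlers => {
        item => sub {
            my ( $t, $elt ) = @_;
            $elt->set_tag('li');    # change the element, then output it
            $t->flush($out);        # print what is complete and free it
        },
    },
);

$twig->parse($xml);
$twig->flush($out);                 # print the end of the document
close $out;

print "$flushed\n";
```

               Each call to flush in the handler outputs the part of the
               document that is complete so far, so only one item is kept in
               memory at a time.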
1442
1443       flush_up_to ($elt, $optional_filehandle, %options)
1444           Flushes up to the $elt element. This allows you to keep part of the
1445           tree in memory when you "flush".
1446
1447           options: see flush.
1448
1449       purge
1450           Does the same as a "flush" except it does not print the twig. It
1451           just deletes all elements that have been completely parsed so far.
1452
1453       purge_up_to ($elt)
1454           Purges up to the $elt element. This allows you to keep part of the
1455           tree in memory when you "purge".
1456
1457       print            ($optional_filehandle, %options)
1458           Prints the whole document associated with the twig. To be used only
1459           AFTER the parse.
1460
1461           options: see "flush".
1462
1463       print_to_file    ($filename, %options)
1464           Prints the whole document associated with the twig to file
1465           $filename.  To be used only AFTER the parse.
1466
1467           options: see "flush".
1468
1469       sprint
1470           Return the text of the whole document associated with the twig. To
1471           be used only AFTER the parse.
1472
1473           options: see "flush".
1474
1475       trim
1476           Trim the document: gets rid of initial and trailing spaces, and
1477           replaces multiple spaces by a single one.
1478
1479       toSAX1 ($handler)
1480           Send SAX events for the twig to the SAX1 handler $handler
1481
1482       toSAX2 ($handler)
1483           Send SAX events for the twig to the SAX2 handler $handler
1484
1485       flush_toSAX1 ($handler)
1486           Same as flush, except that SAX events are sent to the SAX1 handler
1487           $handler instead of the twig being printed
1488
1489       flush_toSAX2 ($handler)
1490           Same as flush, except that SAX events are sent to the SAX2 handler
1491           $handler instead of the twig being printed
1492
1493       ignore
1494           This method should be called during parsing, usually in
1495           "start_tag_handlers".  It causes the element to be skipped during
1496           the parsing: the twig is not built for this element, it will not be
1497           accessible during parsing or after it. The element will not take up
1498           any memory and parsing will be faster.
1499
1500           Note that this method can also be called on an element. If the
1501           element is a parent of the current element then this element will
1502           be ignored (the twig will not be built any more for it and what has
1503           already been built will be deleted).
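
               A sketch of the usual form, calling ignore from a start tag
               handler. The element names are hypothetical:

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = '<doc><keep>a</keep><secret><data>b</data></secret><keep>c</keep></doc>';

my $twig = XML::Twig->new(
    start_tag_handlers => {
        # Called when the <secret> start tag is seen: skip the whole element.
        secret => sub { my ($t) = @_; $t->ignore; },
    },
);

$twig->parse($xml);

my $result = $twig->sprint;
print "$result\n";
```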
1504
1505       set_pretty_print  ($style)
1506           Set the pretty print method, amongst '"none"' (default),
1507           '"nsgmls"', '"nice"', '"indented"', "indented_c", '"wrapped"',
1508           '"record"' and '"record_c"'
1509
1510           WARNING: the pretty print style is a GLOBAL variable, so once set
1511           it's applied to ALL "print"'s (and "sprint"'s). Same goes if you
1512           use XML::Twig with "mod_perl" . This should not be a problem as the
1513           XML that's generated is valid anyway, and XML processors (as well
1514           as HTML processors, including browsers) should not care. Let me
1515           know if this is a big problem, but at the moment the
1516           performance/cleanliness trade-off clearly favors the global
1517           approach.
1518
1519       set_empty_tag_style  ($style)
1520           Set the empty tag display style ('"normal"', '"html"' or
1521           '"expand"'). As with "set_pretty_print" this sets a global flag.
1522
1523           "normal" outputs an empty tag '"<tag/>"', "html" adds a space
1524           '"<tag />"' for elements that can be empty in XHTML and "expand"
1525           outputs '"<tag></tag>"'
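
               The three styles can be compared on one empty element. This is a
               sketch on a made-up document; because the flag is global, the
               style is set back to "normal" at the end:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new();
$twig->parse('<doc><br/></doc>');

$twig->set_empty_tag_style('expand');
my $expanded = $twig->sprint;    # separate start and end tags

$twig->set_empty_tag_style('html');
my $html = $twig->sprint;        # space before the slash for <br>

$twig->set_empty_tag_style('normal');    # restore the global default
my $normal = $twig->sprint;

print "$_\n" for $expanded, $html, $normal;
```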
1526
1527       set_remove_cdata  ($flag)
1528           set (or unset) the flag that forces the twig to output CDATA
1529           sections as regular (escaped) PCDATA
1530
1531       print_prolog     ($optional_filehandle, %options)
1532           Prints the prolog (XML declaration + DTD + entity declarations) of
1533           a document.
1534
1535           options: see "flush".
1536
1537       prolog     ($optional_filehandle, %options)
1538           Return the prolog (XML declaration + DTD + entity declarations) of
1539           a document.
1540
1541           options: see "flush".
1542
1543       finish
1544           Call Expat "finish" method.  Unsets all handlers (including
1545           internal ones that set context), but expat continues parsing to the
1546           end of the document or until it finds an error.  It should finish
1547           up a lot faster than with the handlers set.
1548
1549       finish_print
1550           Stops twig processing, flushes the twig and proceeds to finish
1551           printing the document as fast as possible. Use this method when
1552           modifying a document and the modification is done.
1553
1554       finish_now
1555           Stops twig processing, does not finish parsing the document (which
1556           could actually be not well-formed after the point where
1557           "finish_now" is called).  Execution resumes after the "parse" or
1558           "parsefile" call. The content of the twig is what has been parsed
1559           so far (all open elements at the time "finish_now" is called are
1560           considered closed).
1561
1562       set_expand_external_entities
1563           Same as using the "expand_external_ents" option when creating the
1564           twig
1565
1566       set_input_filter
1567           Same as using the "input_filter" option when creating the twig
1568
1569       set_keep_atts_order
1570           Same as using the "keep_atts_order" option when creating the twig
1571
1572       set_keep_encoding
1573           Same as using the "keep_encoding" option when creating the twig
1574
1575       escape_gt
1576           usually XML::Twig does not escape > in its output. Using this
1577           option makes it replace > by &gt;
1578
1579       do_not_escape_gt
1580           reverts XML::Twig behavior to its default of not escaping > in its
1581           output.
1582
1583       set_output_filter
1584           Same as using the "output_filter" option when creating the twig
1585
1586       set_output_text_filter
1587           Same as using the "output_text_filter" option when creating the
1588           twig
1589
1590       add_stylesheet ($type, @options)
1591           Adds an external stylesheet to an XML document.
1592
1593           Supported types and options:
1594
1595           xsl option: the url of the stylesheet
1596
1597               Example:
1598
1599                 $t->add_stylesheet( xsl => "xsl_style.xsl");
1600
1601               will generate the following PI at the beginning of the
1602               document:
1603
1604                 <?xml-stylesheet type="text/xsl" href="xsl_style.xsl"?>
1605
1606           css option: the url of the stylesheet
1607
1608       Methods inherited from XML::Parser::Expat
1609           A twig inherits all the relevant methods from XML::Parser::Expat.
1610           These methods can only be used during the parsing phase (they will
1611           generate a fatal error otherwise).
1612
1613           Inherited methods are:
1614
1615           depth
1616               Returns the size of the context list.
1617
1618           in_element (NAME)
1619               Returns true if NAME is equal to the name of the innermost
1620               currently opened element. If namespace processing is being
1621               used and you want to check against a name that may be in a
1622               namespace, then use the generate_ns_name method to create the
1623               NAME argument.
1624
1625           within_element (NAME)
1626               Returns the number of times the given name appears in the
1627               context list.  If namespace processing is being used and you
1628               want to check against a name that may be in a namespace, then
1629               use the generate_ns_name method to create the NAME
1630               argument.
1631
1632           context
1633               Returns a list of element names that represent open elements,
1634               with the last one being the innermost. Inside start and end tag
1635               handlers, this will be the tag of the parent element.
1636
1637           current_line
1638               Returns the line number of the current position of the parse.
1639
1640           current_column
1641               Returns the column number of the current position of the parse.
1642
1643           current_byte
1644               Returns the current position of the parse.
1645
1646           position_in_context (LINES)
1647               Returns a string that shows the current parse position. LINES
1648               should be an integer >= 0 that represents the number of lines
1649               on either side of the current parse line to place into the
1650               returned string.
1651
1652           base ([NEWBASE])
1653               Returns the current value of the base for resolving relative
1654               URIs.  If NEWBASE is supplied, changes the base to that value.
1655
1656           current_element
1657               Returns the name of the innermost currently opened element.
1658               Inside start or end handlers, returns the parent of the element
1659               associated with those tags.
1660
1661           element_index
1662               Returns an integer that is the depth-first visit order of the
1663               current element. This will be zero outside of the root
1664               element. For example, this will return 1 when called from the
1665               start handler for the root element start tag.
1666
1667           recognized_string
1668               Returns the string from the document that was recognized in
1669               order to call the current handler. For instance, when called
1670               from a start handler, it will give us the start-tag string.
1671               The string is encoded in UTF-8.  This method doesn't return a
1672               meaningful string inside declaration handlers.
1673
1674           original_string
1675               Returns the verbatim string from the document that was
1676               recognized in order to call the current handler. The string is
1677               in the original document encoding. This method doesn't return a
1678               meaningful string inside declaration handlers.
1679
1680           xpcroak
1681               Concatenate onto the given message the current line number
1682               within the XML document plus the message implied by
1683               ErrorContext. Then croak with the formed message.
1684
1685           xpcarp
1686               Concatenate onto the given message the current line number
1687               within the XML document plus the message implied by
1688               ErrorContext. Then carp with the formed message.
1689
1690           xml_escape(TEXT [, CHAR [, CHAR ...]])
1691               Returns TEXT with markup characters turned into character
1692               entities.  Any additional characters provided as arguments are
1693               also turned into character references where found in TEXT.
1694
1695               (this method is broken on some versions of expat/XML::Parser)
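
               Since the inherited methods only work while the parse is
               running, they are typically called from a handler. A sketch,
               with an illustrative document and a hypothetical @seen array:

```perl
use strict;
use warnings;
use XML::Twig;

# Each <item> sits on its own line of the input.
my $xml = "<doc>\n  <item>a</item>\n  <item>b</item>\n</doc>";

my @seen;
my $twig = XML::Twig->new(
    start_tag_handlers => {
        item => sub {
            my ( $t, $elt ) = @_;
            # Position and context as reported by the underlying parser.
            push @seen, { line => $t->current_line, depth => $t->depth };
        },
    },
);

$twig->parse($xml);

printf "item at line %d, depth %d\n", $_->{line}, $_->{depth} for @seen;
```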
1696
1697       path ( $optional_tag)
1698           Return the element context in a form similar to XPath's short form:
1699           '"/root/tag1/../tag"'
1700
1701       get_xpath  ( $optional_array_ref, $xpath, $optional_offset)
1702           Performs a "get_xpath" on the document root (see "XML::Twig::Elt")
1703
1704           If the $optional_array_ref argument is used the array must contain
1705           elements. The $xpath expression is applied to each element in turn
1706           and the result is the union of all results. This way a first query can
1707           be refined in further steps.
1708
1709       find_nodes ( $optional_array_ref, $xpath, $optional_offset)
1710           same as "get_xpath"
1711
1712       findnodes ( $optional_array_ref, $xpath, $optional_offset)
1713           same as "get_xpath" (similar to the XML::LibXML method)
1714
1715       findvalue ( $optional_array_ref, $xpath, $optional_offset)
1716           Return the "join" of all texts of the results of applying
1717           "get_xpath" to the node (similar to the XML::LibXML method)
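
               A brief sketch of the query methods on a made-up document (the
               element and attribute names are assumptions for the example):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new();
$twig->parse('<doc><item id="a">one</item><item id="b">two</item></doc>');

# Elements selected by an attribute test.
my @hits = $twig->get_xpath('//item[@id="b"]');

# Texts of all matching elements, joined together.
my $joined = $twig->findvalue('//item');

print scalar(@hits), " hit: ", $hits[0]->text, "\n";
print "$joined\n";
```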
1718
1719       subs_text ($regexp, $replace)
1720           subs_text does text substitution on the whole document, similar to
1721           perl's "s///" operator.
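
               A sketch of a plain string replacement over a whole document,
               assuming the regexp is built with qr (the text is illustrative):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new();
$twig->parse('<doc><p>The colour red</p><p>No colour</p></doc>');

# Replace every match in every text element of the document.
$twig->subs_text( qr/colour/, 'color' );

my $result = $twig->sprint;
print "$result\n";
```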
1722
1723       dispose
1724           Useful only if you don't have "Scalar::Util" or "WeakRef"
1725           installed.
1726
1727           Reclaims properly the memory used by an XML::Twig object. As the
1728           object has circular references it never goes out of scope, so if
1729           you want to parse lots of XML documents then the memory leak
1730           becomes a problem. Use "$twig->dispose" to clear this problem.
1731
1732       create_accessors (list_of_attribute_names)
1733           A convenience method that creates l-valued accessors for
1734           attributes.  So "$twig->create_accessors( 'foo')" will create a
1735           "foo" method that can be called on elements:
1736
1737             $elt->foo;         # equivalent to $elt->{'att'}->{'foo'};
1738             $elt->foo( 'bar'); # equivalent to $elt->set_att( foo => 'bar');
1739
1740       set_do_not_escape_amp_in_atts
1741           An evil method, that I only document because Test::Pod::Coverage
1742           complains otherwise, but really, you don't want to know about it.
1743
1744   XML::Twig::Elt
1745       new          ($optional_tag, $optional_atts, @optional_content)
1746           The "tag" is optional (but then you can't have a content), the
1747           $optional_atts argument is a reference to a hash of attributes, the
1748           content can be just a string or a list of strings and elements. A
1749           content of '"#EMPTY"' creates an empty element.
1750
1751            Examples: my $elt= XML::Twig::Elt->new();
1752                      my $elt= XML::Twig::Elt->new( para => { align => 'center' });
1753                      my $elt= XML::Twig::Elt->new( para => { align => 'center' }, 'foo');
1754                      my $elt= XML::Twig::Elt->new( br   => '#EMPTY');
1755                      my $elt= XML::Twig::Elt->new( 'para');
1756                      my $elt= XML::Twig::Elt->new( para => 'this is a para');
1757                      my $elt= XML::Twig::Elt->new( para => $elt3, 'another para');
1758
1759           The strings are not parsed, the element is not attached to any
1760           twig.
1761
1762           WARNING: if you rely on ID's then you will have to set the id
1763           yourself. At this point the element does not belong to a twig yet,
1764           so the ID attribute is not known so it won't be stored in the ID
1765           list.
1766
1767           Note that "#COMMENT", "#PCDATA" or "#CDATA" are valid tag names,
1768           that will create text elements.
1769
1770           To create an element "foo" containing a CDATA section:
1771
1772                      my $foo= XML::Twig::Elt->new( '#CDATA' => "content of the CDATA section")
1773                                             ->wrap_in( 'foo');
1774
1775           An attribute of '#CDATA', will create the content of the element as
1776           CDATA:
1777
1778             my $elt= XML::Twig::Elt->new( 'p' => { '#CDATA' => 1}, 'foo < bar');
1779
1780           creates an element
1781
1782             <p><![CDATA[foo < bar]]></p>
1783
1784       parse         ($string, %args)
1785           Creates an element from an XML string. The string is actually
1786           parsed as a new twig, then the root of that twig is returned.  The
1787           arguments in %args are passed to the twig.  As always if the parse
1788           fails the parser will die, so use an eval if you want to trap
1789           syntax errors.
1790
1791           As obviously the element does not exist beforehand this method has
1792           to be called on the class:
1793
1794             my $elt= parse XML::Twig::Elt( "<a> string to parse, with <sub/>
1795                                             <elements>, actually tons of </elements>
1796                             h</a>");
1797
1798       set_inner_xml ($string)
1799           Sets the content of the element to be the tree created from the
1800           string
1801
1802       set_inner_html ($string)
1803           Sets the content of the element, after parsing the string with an
1804           HTML parser (HTML::Parser)
1805
1806       print         ($optional_filehandle, $optional_pretty_print_style)
1807           Prints an entire element, including the tags, optionally to a
1808           $optional_filehandle, optionally with a $optional_pretty_print_style.
1809
1810           The print outputs XML data so base entities are escaped.
1811
1812       sprint       ($elt, $optional_no_enclosing_tag)
1813           Return the xml string for an entire element, including the tags.
1814           If the optional second argument is true then only the string inside
1815           the element is returned (the start and end tag for $elt are not).
1816           The text is XML-escaped: base entities (& and < in text, & < and "
1817           in attribute values) are turned into entities.
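
               A sketch of both forms of the call on an element built from a
               string; the markup is made up for the example:

```perl
use strict;
use warnings;
use XML::Twig;

my $p = XML::Twig::Elt->parse('<p class="x">a <b>bold</b> word &amp; more</p>');

my $with_tags = $p->sprint;       # whole element, tags included
my $inner     = $p->sprint(1);    # content only, no enclosing <p> tags

print "$with_tags\n";
print "$inner\n";
```

               The ampersand is stored unescaped in the tree and is written
               back as an entity on output.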
1818
1819       gi  Return the gi of the element (the gi is the "generic identifier"
1820           the tag name in SGML parlance).
1821
1822           "tag" and "name" are synonyms of "gi".
1823
1824       tag Same as "gi"
1825
1826       name
1827           Same as "tag"
1828
1829       set_gi         ($tag)
1830           Set the gi (tag) of an element
1831
1832       set_tag        ($tag)
1833           Set the tag (="gi") of an element
1834
1835       set_name       ($name)
1836           Set the name (="tag") of an element
1837
1838       root
1839           Return the root of the twig in which the element is contained.
1840
1841       twig
1842           Return the twig containing the element.
1843
1844       parent        ($optional_condition)
1845           Return the parent of the element, or the first ancestor matching
1846           the $optional_condition
1847
1848       first_child   ($optional_condition)
1849           Return the first child of the element, or the first child matching
1850           the $optional_condition
1851
1852       has_child ($optional_condition)
1853           Return the first child of the element, or the first child matching
1854           the $optional_condition (same as first_child)
1855
1856       has_children ($optional_condition)
1857           Return the first child of the element, or the first child matching
1858           the $optional_condition (same as first_child)
1859
1860       first_child_text   ($optional_condition)
1861           Return the text of the first child of the element, or the first
1862           child matching the $optional_condition. If there is no
1863           first_child then returns ''. This avoids getting the child,
1864           checking for its existence, then getting the text for trivial
1865           cases.
1866
1867           Similar methods are available for the other navigation methods:
1868
1869           last_child_text
1870           prev_sibling_text
1871           next_sibling_text
1872           prev_elt_text
1873           next_elt_text
1874           child_text
1875           parent_text
1876
1877           All these methods also exist in a "trimmed" variant:
1878
1879           first_child_trimmed_text
1880           last_child_trimmed_text
1881           prev_sibling_trimmed_text
1882           next_sibling_trimmed_text
1883           prev_elt_trimmed_text
1884           next_elt_trimmed_text
1885           child_trimmed_text
1886           parent_trimmed_text

1887       field         ($condition)
1888           Same method as "first_child_text" with a different name
1889
1890       fields         ($condition_list)
1891           Return the list of fields (text of first child matching the
1892           conditions), missing fields are returned as the empty string.
1893
1894           Same method as "first_child_text" with a different name
1895
1896       trimmed_field         ($optional_condition)
1897           Same method as "first_child_trimmed_text" with a different name
1898
1899       set_field ($condition, $optional_atts, @list_of_elt_and_strings)
1900           Set the content of the first child of the element that matches
1901           $condition, the rest of the arguments is the same as for
1902           "set_content"
1903
1904           If no child matches $condition _and_ if $condition is a valid XML
1905           element name, then a new element by that name is created and
1906           inserted as the last child.
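
               The field methods treat an element as a simple record. A sketch,
               using a hypothetical person record:

```perl
use strict;
use warnings;
use XML::Twig;

my $person = XML::Twig::Elt->parse(
    '<person><name> Ada </name><city>London</city></person>');

my $name    = $person->field('name');            # text as stored
my $trimmed = $person->trimmed_field('name');    # surrounding spaces removed
my $missing = $person->field('email');           # no such child: empty string

$person->set_field( city  => 'Paris' );            # replaces existing content
$person->set_field( email => 'ada@example.org' );  # created as last child

my $record = $person->sprint;
print "$record\n";
```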
1907
1908       first_child_matches   ($optional_condition)
1909           Return the element if the first child of the element (if it exists)
1910           passes the $optional_condition, "undef" otherwise
1911
1912             if( $elt->first_child_matches( 'title')) ...
1913
1914           is equivalent to
1915
1916             if( $elt->{first_child} && $elt->{first_child}->passes( 'title'))
1917
1918           "first_child_is" is another name for this method
1919
1920           Similar methods are available for the other navigation methods:
1921
1922           last_child_matches
1923           prev_sibling_matches
1924           next_sibling_matches
1925           prev_elt_matches
1926           next_elt_matches
1927           child_matches
1928           parent_matches

1929       is_first_child ($optional_condition)
1930           returns true (the element) if the element is the first child of its
1931           parent (optionally that satisfies the $optional_condition)
1932
1933       is_last_child ($optional_condition)
1934           returns true (the element) if the element is the last child of its
1935           parent (optionally that satisfies the $optional_condition)
1936
1937       prev_sibling  ($optional_condition)
1938           Return the previous sibling of the element, or the previous sibling
1939           matching $optional_condition
1940
1941       next_sibling  ($optional_condition)
1942           Return the next sibling of the element, or the first one matching
1943           $optional_condition.
1944
1945       next_elt     ($optional_elt, $optional_condition)
1946           Return the next elt (optionally matching $optional_condition) of
1947           the element. This is defined as the next element which opens after
1948           the current element opens.  Which usually means the first child of
1949           the element.  Counter-intuitive as it might look this allows you to
1950           loop through the whole document by starting from the root.
1951
1952           The $optional_elt is the root of a subtree. When the "next_elt" is
1953           out of the subtree then the method returns undef. You can then walk
1954           a sub tree with:
1955
1956             my $elt= $subtree_root;
1957             while( $elt= $elt->next_elt( $subtree_root))
1958               { # insert processing code here
1959               }
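
               A complete version of such a subtree walk, as a sketch on an
               illustrative document. Text elements are also visited by
               next_elt, so they are skipped here with is_text:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new();
$twig->parse('<doc><sec><p>a</p><note>b</note></sec><sec><p>c</p></sec></doc>');

# Walk only the first <sec>; the second one is outside the subtree.
my $subtree_root = $twig->root->first_child('sec');

my @tags;
my $elt = $subtree_root;
while ( $elt = $elt->next_elt($subtree_root) ) {
    next if $elt->is_text;    # skip text elements
    push @tags, $elt->tag;
}

print "@tags\n";
```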
1960
1961       prev_elt     ($optional_condition)
1962           Return the previous elt (optionally matching $optional_condition)
1963           of the element. This is the first element which opens before the
1964           current one.  It is usually either the last descendant of the
1965           previous sibling or simply the parent
1966
1967       next_n_elt   ($offset, $optional_condition)
1968           Return the $offset-th element that matches the $optional_condition
1969
1970       following_elt
1971           Return the following element (as per the XPath following axis)
1972
1973       preceding_elt
1974           Return the preceding element (as per the XPath preceding axis)
1975
1976       following_elts
1977           Return the list of following elements (as per the XPath following
1978           axis)
1979
1980       preceding_elts
1981           Return the list of preceding elements (as per the XPath preceding
1982           axis)
1983
1984       children     ($optional_condition)
1985           Return the list of children (optionally which matches
1986           $optional_condition) of the element. The list is in document order.
1987
1988       children_count ($optional_condition)
1989           Return the number of children of the element (optionally which
1990           matches $optional_condition)
1991
1992       children_text ($optional_condition)
1993           In array context, returns an array containing the text of children
1994           of the element (optionally which matches $optional_condition)
1995
1996           In scalar context, returns the concatenation of the text of
1997           children of the element
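
           A minimal sketch of the children methods described above. The
           sample document and the variable names are made up for
           illustration only:

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document, only used for this illustration
my $twig= XML::Twig->new;
$twig->parse( '<list><item>one</item><note>n</note><item>two</item></list>');
my $root= $twig->root;

my @items = $root->children( 'item');        # item elements, in document order
my $nb    = $root->children_count( 'item');  # number of item children
my @texts = $root->children_text( 'item');   # array context: one text per child
my $joined= $root->children_text( 'item');   # scalar context: concatenated text

print "$nb items: @texts ($joined)\n";
```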
1998
1999       children_trimmed_text ($optional_condition)
2000           In array context, returns an array containing the trimmed text of
2001           children of the element (optionally those matching
2002           $optional_condition)
2003
2004           In scalar context, returns the concatenation of the trimmed text of
2005           children of the element
2006
2007       children_copy ($optional_condition)
2008           Return a list of elements that are copies of the children of the
2009           element, optionally those matching $optional_condition
2010
2011       descendants     ($optional_condition)
2012           Return the list of all descendants (optionally those matching
2013           $optional_condition) of the element. This is the equivalent of the
2014           "getElementsByTagName" of the DOM (by the way, if you are really a
2015           DOM addict, you can use "getElementsByTagName" instead)
2016
2017       getElementsByTagName ($optional_condition)
2018           Same as "descendants"
2019
2020       find_by_tag_name ($optional_condition)
2021           Same as "descendants"
2022
2023       descendants_or_self ($optional_condition)
2024           Same as "descendants" except that the element itself is included in
2025           the list if it matches the $optional_condition
2026
2027       first_descendant  ($optional_condition)
2028           Return the first descendant of the element that matches the
2029           condition
2030
2031       last_descendant  ($optional_condition)
2032           Return the last descendant of the element that matches the
2033           condition
2034
2035       ancestors    ($optional_condition)
2036           Return the list of ancestors (optionally matching
2037           $optional_condition) of the element.  The list is ordered from the
2038           innermost ancestor to the outermost one
2039
2040           NOTE: the element itself is not part of the list, in order to
2041           include it you will have to use ancestors_or_self
2042
2043       ancestors_or_self     ($optional_condition)
2044           Return the list of ancestors (optionally matching
2045           $optional_condition) of the element, including the element (if it
2046           matches the condition).  The list is ordered from the innermost
2047           ancestor to the outermost one
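
           The difference between the two methods can be sketched as
           follows (the document and variable names are assumptions made for
           the example):

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document
my $twig= XML::Twig->new;
$twig->parse( '<doc><section><para><b>bold</b></para></section></doc>');
my $bold= $twig->root->first_descendant( 'b');

# innermost ancestor first, the element itself is not included
my @anc_tags = map { $_->tag } $bold->ancestors;

# same list, but starting with the element itself
my @self_tags= map { $_->tag } $bold->ancestors_or_self;

print "ancestors:         @anc_tags\n";
print "ancestors_or_self: @self_tags\n";
```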
2048
2049       passes ($condition)
2050           Return the element if it passes the $condition
2051
2052       att          ($att)
2053           Return the value of attribute $att or "undef"
2054
2055       set_att      ($att, $att_value)
2056           Set the attribute of the element to the given value
2057
2058           You can actually set several attributes this way:
2059
2060             $elt->set_att( att1 => "val1", att2 => "val2");
2061
2062       del_att      ($att)
2063           Delete the attribute for the element
2064
2065           You can actually delete several attributes at once:
2066
2067             $elt->del_att( 'att1', 'att2', 'att3');
2068
2069       att_exists ($att)
2070           Returns true if the attribute $att exists for the element, false
2071           otherwise
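
           A short sketch tying the attribute methods together. The element
           is built with "XML::Twig::Elt->new" and the attribute names are
           invented for the example:

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical element, created outside of any document
my $elt= XML::Twig::Elt->new( para => { class => 'intro' }, 'some text');

$elt->set_att( id => 'p1', lang => 'en');    # set several attributes at once
my $class  = $elt->att( 'class');            # value of an existing attribute
my $has_id = $elt->att_exists( 'id');        # true: id was just set
my $missing= $elt->att( 'no_such_att');      # undef for a missing attribute

$elt->del_att( 'lang', 'class');             # delete several attributes at once
my $xml= $elt->sprint;
print "$xml\n";
```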
2072
2073       cut Cut the element from the tree. The element still exists, it can be
2074           copied or pasted somewhere else, it is just not attached to the
2075           tree anymore.
2076
2077           Note that the "old" links to the parent, previous and next siblings
2078           can still be accessed using the former_* methods
2079
2080       former_next_sibling
2081           Returns the former next sibling of a cut node (or undef if the node
2082           has not been cut)
2083
2084           This makes it easier to write loops where you cut elements:
2085
2086               my $child= $parent->first_child( 'achild');
2087               while( $child && $child->att( 'cut'))
2088                 { $child->cut; $child= $child->former_next_sibling; }
2089
2090       former_prev_sibling
2091           Returns the former previous sibling of a cut node (or undef if the
2092           node has not been cut)
2093
2094       former_parent
2095           Returns the former parent of a cut node (or undef if the node has
2096           not been cut)
2097
2098       cut_children ($optional_condition)
2099           Cut all the children of the element (or all of those which satisfy
2100           the $optional_condition).
2101
2102           Return the list of children
2103
2104       copy        ($elt)
2105           Return a copy of the element. The copy is a "deep" copy: all sub
2106           elements of the element are duplicated.
2107
2108       paste       ($optional_position, $ref)
2109           Paste a (previously "cut" or newly generated) element. Die if the
2110           element already belongs to a tree.
2111
2112           Note that the calling element is pasted:
2113
2114             $child->paste( first_child => $existing_parent);
2115             $new_sibling->paste( after => $this_sibling_is_already_in_the_tree);
2116
2117           or
2118
2119             my $new_elt= XML::Twig::Elt->new( tag => $content);
2120             $new_elt->paste( $position => $existing_elt);
2121
2122           Example:
2123
2124             my $t= XML::Twig->new->parsefile( 'doc.xml');
2125             my $toc= $t->root->new( 'toc');
2126             $toc->paste( $t->root); # $toc is pasted as first child of the root
2127             foreach my $title ($t->findnodes( '/doc/section/title'))
2128               { my $title_toc= $title->copy;
2129                 # paste $title_toc as the last child of toc
2130                 $title_toc->paste( last_child => $toc)
2131               }
2132
2133           Position options:
2134
2135           first_child (default)
2136               The element is pasted as the first child of $ref
2137
2138           last_child
2139               The element is pasted as the last child of $ref
2140
2141           before
2142               The element is pasted before $ref, as its previous sibling.
2143
2144           after
2145               The element is pasted after $ref, as its next sibling.
2146
2147           within
2148               In this case an extra argument, $offset, should be supplied.
2149               The element will be pasted in the reference element (or in its
2150               first text child) at the given offset. To achieve this the
2151               reference element will be split at the offset.
2152
2153           Note that you can call the underlying methods directly:
2154
2155           paste_before
2156           paste_after
2157           paste_first_child
2158           paste_last_child
2159           paste_within

2160       move       ($optional_position, $ref)
2161           Move an element in the tree.  This is just a "cut" then a "paste".
2162           The syntax is the same as "paste".
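
           For instance, assuming a small made-up document, a "title" that
           comes last can be moved to the front of its parent:

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document
my $twig= XML::Twig->new;
$twig->parse( '<doc><intro>i</intro><body>b</body><title>t</title></doc>');
my $root= $twig->root;

# the calling element is the one that moves, as with paste
$root->first_child( 'title')->move( first_child => $root);

my $xml= $root->sprint;
print "$xml\n";
```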
2163
2164       replace       ($ref)
2165           Replaces an element in the tree. Sometimes it is just not possible
2166           to "cut" an element then "paste" another in its place, so "replace"
2167           comes in handy.  The calling element replaces $ref.
2168
2169       replace_with   (@elts)
2170           Replaces the calling element with one or more elements
2171
2172       delete
2173           Cut the element and free its memory.
2174
2175       prefix       ($text, $optional_option)
2176           Add a prefix to an element. If the element is a "PCDATA" element
2177           the text is added to the pcdata, if the element's first child is a
2178           "PCDATA" then the text is added to its pcdata, otherwise a new
2179           "PCDATA" element is created and pasted as the first child of the
2180           element.
2181
2182           If the option is "asis" then the prefix is added as is: it is
2183           created in a separate "PCDATA" element with an "asis" property. You
2184           can then write:
2185
2186             $elt1->prefix( '<b>', 'asis');
2187
2188           to create a "<b>" in the output of "print".
2189
2190       suffix       ($text, $optional_option)
2191           Add a suffix to an element. If the element is a "PCDATA" element
2192           the text is added to the pcdata, if the element's last child is a
2193           "PCDATA" then the text is added to its pcdata, otherwise a new
2194           PCDATA element is created and pasted as the last child of the
2195           element.
2196
2197           If the option is "asis" then the suffix is added as is: it is
2198           created in a separate "PCDATA" element with an "asis" property. You
2199           can then write:
2200
2201             $elt2->suffix( '</b>', 'asis');
2202
2203       trim
2204           Trim the element in-place: spaces at the beginning and at the end
2205           of the element are discarded and multiple spaces within the element
2206           (or its descendants) are replaced by a single space.
2207
2208           Note that in some cases you can still end up with multiple spaces,
2209           if they are split between several elements:
2210
2211             <doc>  text <b>  hah! </b>  yep</doc>
2212
2213           gets trimmed to
2214
2215             <doc>text <b> hah! </b> yep</doc>
2216
2217           This is somewhere in between a bug and a feature.
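
           A minimal sketch of the simple case, where all the text sits in
           one element (the sample string is an assumption):

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document with extra spaces
my $twig= XML::Twig->new;
$twig->parse( '<p>  some   text  </p>');
my $root= $twig->root;

$root->trim;                 # trims in place

my $xml= $root->sprint;
print "$xml\n";
```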
2218
2219       normalize
2220           Merge together all consecutive pcdata elements in the element (if
2221           for example you have turned some elements into pcdata using
2222           "erase", this will give you a "clean" element in which all
2223           text fragments are as long as possible).
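
           The "erase" case mentioned above can be sketched like this. The
           document is made up, and the number of text children before
           "normalize" is printed rather than assumed, as it may depend on
           the version of the module:

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document
my $twig= XML::Twig->new;
$twig->parse( '<p>some <b>bold</b> text</p>');
my $root= $twig->root;

$root->first_child( 'b')->erase;       # the content of b replaces b
print "children before normalize: ", $root->children_count, "\n";

$root->normalize;                      # merge consecutive text elements
my $nb_after= $root->children_count;
my $text    = $root->text;
print "children after normalize: $nb_after ($text)\n";
```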
2224
2225       simplify (%options)
2226           Return a data structure suspiciously similar to XML::Simple's.
2227           Options are identical to XMLin options, see XML::Simple doc for
2228           more details (or use Data::Dumper or YAML to dump the data
2229           structure)
2230
2231           content_key
2232           forcearray
2233           keyattr
2234           noattr
2235           normalize_space
2236               aka normalise_space
2237
2238           variables (%var_hash)
2239               %var_hash is a hash { name => value }
2240
2241               This option allows variables in the XML to be expanded when the
2242               file is read. (there is no facility for putting the variable
2243               names back if you regenerate XML using XMLout).
2244
2245               A 'variable' is any text of the form ${name} (or $name) which
2246               occurs in an attribute value or in the text content of an
2247               element. If 'name' matches a key in the supplied hashref,
2248               ${name} will be replaced with the corresponding value from the
2249               hashref. If no matching key is found, the variable will not be
2250               replaced.
2251
2252           var_att ($attribute_name)
2253               This option gives the name of an attribute that will be used to
2254               create variables in the XML:
2255
2256                 <dirs>
2257                   <dir name="prefix">/usr/local</dir>
2258                   <dir name="exec_prefix">$prefix/bin</dir>
2259                 </dirs>
2260
2261               Use "var_att => 'name'" to get $prefix replaced by /usr/local in
2262               the generated data structure
2263
2264               By default variables are captured by the following regexp:
2265               /\$(\w+)/
2266
2267           var_regexp (regexp)
2268               This option changes the regexp used to capture variables. The
2269               variable name should be in $1
2270
2271           group_tags { grouping tag => grouped tag, grouping tag 2 => grouped
2272           tag 2...}
2273               Option used to simplify the structure: the grouping elements
2274               listed are left out of the result, and their children are
2275               treated as children of the grouping element's parent.
2276
2277               If the element is:
2278
2279                 <config host="laptop.xmltwig.com">
2280                   <server>localhost</server>
2281                   <dirs>
2282                     <dir name="base">/home/mrodrigu/standards</dir>
2283                     <dir name="tools">$base/tools</dir>
2284                   </dirs>
2285                   <templates>
2286                     <template name="std_def">std_def.templ</template>
2287                     <template name="dummy">dummy</template>
2288                   </templates>
2289                 </config>
2290
2291               Then calling simplify with "group_tags => { dirs => 'dir',
2292               templates => 'template'}" makes the data structure be exactly
2293               as if the start and end tags for "dirs" and "templates" were
2294               not there.
2295
2296               A YAML dump of the structure:
2297
2298                 base: '/home/mrodrigu/standards'
2299                 host: laptop.xmltwig.com
2300                 server: localhost
2301                 template:
2302                   - std_def.templ
2303                   - dummy
2304                 tools: '$base/tools'
2305
2306       split_at        ($offset)
2307           Split a text ("PCDATA" or "CDATA") element in 2 at $offset, the
2308           original element now holds the first part of the string and a new
2309           element holds the right part. The new element is returned
2310
2311           If the element is not a text element then the first text child of
2312           the element is split
2313
2314       split        ( $optional_regexp, $tag1, $atts1, $tag2, $atts2...)
2315           Split the text descendants of an element in place, the text is
2316           split using the $regexp, if the regexp includes () then the matched
2317           separators will be wrapped in elements.  $1 is wrapped in $tag1,
2318           with attributes $atts1 if $atts1 is given (as a hashref), $2 is
2319           wrapped in $tag2...
2320
2321           If $elt is "<p>tati tata <b>tutu tati titi</b> tata tati tata</p>"
2322
2323             $elt->split( qr/(ta)ti/, 'foo', {type => 'toto'} )
2324
2325           will change $elt to
2326
2327             <p><foo type="toto">ta</foo> tata <b>tutu <foo type="toto">ta</foo>
2328                 titi</b> tata <foo type="toto">ta</foo> tata</p>
2329
2330           The regexp can be passed either as a string or as "qr//" (perl
2331           5.005 and later), it defaults to \s+ just as the "split" built-in
2332           (but this would be quite a useless behaviour without the
2333           $optional_tag parameter)
2334
2335           $optional_tag defaults to PCDATA or CDATA, depending on the initial
2336           element type
2337
2338           The list of descendants is returned (including un-touched original
2339           elements and newly created ones)
2340
2341       mark        ( $regexp, $optional_tag, $optional_attribute_ref)
2342           This method behaves exactly as split, except only the newly created
2343           elements are returned
2344
2345       wrap_children ( $regexp_string, $tag, $optional_attribute_hashref)
2346           Wrap the children of the element that match the regexp in an
2347           element $tag.  If $optional_attribute_hashref is passed then the
2348           new element will have these attributes.
2349
2350           The $regexp_string includes tags, within pointy brackets, as in
2351           "<title><para>+" and the usual Perl modifiers (+*?...).  Tags can
2352           be further qualified with attributes: "<para type="warning"
2353           classif="cosmic_secret">+". The values for attributes should be
2354           xml-escaped: "<candy type="M&amp;Ms">*" ("<", "&", ">" and """
2355           should be escaped).
2356
2357           Note that elements might get extra "id" attributes in the process.
2358           See add_id.  Use strip_att to remove unwanted id's.
2359
2360           Here is an example:
2361
2362           If the element $elt has the following content:
2363
2364             <elt>
2365              <p>para 1</p>
2366              <l_l1_1>list 1 item 1 para 1</l_l1_1>
2367                <l_l1>list 1 item 1 para 2</l_l1>
2368              <l_l1_n>list 1 item 2 para 1 (only para)</l_l1_n>
2369              <l_l1_n>list 1 item 3 para 1</l_l1_n>
2370                <l_l1>list 1 item 3 para 2</l_l1>
2371                <l_l1>list 1 item 3 para 3</l_l1>
2372              <l_l1_1>list 2 item 1 para 1</l_l1_1>
2373                <l_l1>list 2 item 1 para 2</l_l1>
2374              <l_l1_n>list 2 item 2 para 1 (only para)</l_l1_n>
2375              <l_l1_n>list 2 item 3 para 1</l_l1_n>
2376                <l_l1>list 2 item 3 para 2</l_l1>
2377                <l_l1>list 2 item 3 para 3</l_l1>
2378             </elt>
2379
2380           Then the code
2381
2382             $elt->wrap_children( q{<l_l1_1><l_l1>*} , li => { type => "ul1" });
2383             $elt->wrap_children( q{<l_l1_n><l_l1>*} , li => { type => "ul" });
2384
2385             $elt->wrap_children( q{<li type="ul1"><li type="ul">+}, "ul");
2386             $elt->strip_att( 'id');
2387             $elt->strip_att( 'type');
2388             $elt->print;
2389
2390           will output:
2391
2392             <elt>
2393                <p>para 1</p>
2394                <ul>
2395                  <li>
2396                    <l_l1_1>list 1 item 1 para 1</l_l1_1>
2397                    <l_l1>list 1 item 1 para 2</l_l1>
2398                  </li>
2399                  <li>
2400                    <l_l1_n>list 1 item 2 para 1 (only para)</l_l1_n>
2401                  </li>
2402                  <li>
2403                    <l_l1_n>list 1 item 3 para 1</l_l1_n>
2404                    <l_l1>list 1 item 3 para 2</l_l1>
2405                    <l_l1>list 1 item 3 para 3</l_l1>
2406                  </li>
2407                </ul>
2408                <ul>
2409                  <li>
2410                    <l_l1_1>list 2 item 1 para 1</l_l1_1>
2411                    <l_l1>list 2 item 1 para 2</l_l1>
2412                  </li>
2413                  <li>
2414                    <l_l1_n>list 2 item 2 para 1 (only para)</l_l1_n>
2415                  </li>
2416                  <li>
2417                    <l_l1_n>list 2 item 3 para 1</l_l1_n>
2418                    <l_l1>list 2 item 3 para 2</l_l1>
2419                    <l_l1>list 2 item 3 para 3</l_l1>
2420                  </li>
2421                </ul>
2422             </elt>
2423
2424       subs_text ($regexp, $replace)
2425           subs_text does text substitution, similar to perl's "s///"
2426           operator.
2427
2428           $regexp must be a perl regexp, created with the "qr" operator.
2429
2430           $replace can include "$1, $2"... from the $regexp. It can also be
2431           used to create elements and entities, by using "&elt( tag => { att
2432           => val }, text)" (similar syntax as "new") and "&ent( name)".
2433
2434           Here is a rather complex example:
2435
2436             $elt->subs_text( qr{(?<!do not )link to (http://([^\s,]*))},
2437                              'see &elt( a =>{ href => $1 }, $2)'
2438                            );
2439
2440           This will replace text like link to http://www.xmltwig.com by see
2441           <a href="http://www.xmltwig.com">www.xmltwig.com</a>, but not do not
2442           link to...
2443
2444           Generating entities (here replacing spaces with &nbsp;):
2445
2446             $elt->subs_text( qr{ }, '&ent( "&nbsp;")');
2447
2448           or, using a variable:
2449
2450             my $ent="&nbsp;";
2451             $elt->subs_text( qr{ }, "&ent( '$ent')");
2452
2453           Note that the substitution is always global, as in using the "g"
2454           modifier in a perl substitution, and that it is performed on all
2455           text descendants of the element.
2456
2457           Bug: in the $regexp, you can only use "\1", "\2"... if the
2458           replacement expression does not include elements or attributes, e.g.:
2459
2460             $t->subs_text( qr/((t[aiou])\2)/, '$2');             # ok, replaces toto, tata, titi, tutu by to, ta, ti, tu
2461             $t->subs_text( qr/((t[aiou])\2)/, '&elt(p => $1)' ); # NOK, does not find toto...
2462
2463       add_id ($optional_coderef)
2464           Add an id to the element.
2465
2466           The id is an attribute, "id" by default, see the "id" option for
2467           XML::Twig "new" to change it. Use an id starting with "#" to get an
2468           id that's not output by print, flush or sprint, yet that allows you
2469           to use the elt_id method to get the element easily.
2470
2471           If the element already has an id, no new id is generated.
2472
2473           By default the method creates an id of the form "twig_id_<nnnn>",
2474           where "<nnnn>" is a number, incremented each time the method is
2475           called successfully.
2476
2477       set_id_seed ($prefix)
2478           By default the id generated by "add_id" is "twig_id_<nnnn>",
2479           "set_id_seed" changes the prefix to $prefix and resets the number
2480           to 1
2481
2482       strip_att ($att)
2483           Remove the attribute $att from all descendants of the element
2484           (including the element)
2485
2486           Return the element
2487
2488       change_att_name ($old_name, $new_name)
2489           Change the name of the attribute from $old_name to $new_name. If
2490           there is no attribute $old_name nothing happens.
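
           A small sketch of "strip_att" and "change_att_name" used
           together. The document and attribute names are invented for the
           example:

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document
my $twig= XML::Twig->new;
$twig->parse( '<doc id="d1"><p id="p1" CLASS="x">text</p></doc>');
my $root= $twig->root;

$root->strip_att( 'id');                                      # element and all descendants
$root->first_child( 'p')->change_att_name( CLASS => 'class'); # rename one attribute

my $xml= $root->sprint;
print "$xml\n";
```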
2491
2492       lc_attnames
2493           Lowercase the names of all the attributes of the element.
2494
2495       sort_children_on_value( %options)
2496           Sort the children of the element in place according to their text.
2497           All children are sorted.
2498
2499           Return the element, with its children sorted.
2500
2501           %options are
2502
2503             type  : numeric |  alpha     (default: alpha)
2504             order : normal  |  reverse   (default: normal)
2507
2508       sort_children_on_att ($att, %options)
2509           Sort the children of the  element in place according to attribute
2510           $att.  %options are the same as for "sort_children_on_value"
2511
2512           Return the element.
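
           The options can be sketched on a made-up list, sorted first
           numerically on an attribute, then alphabetically (reversed) on
           the text of the children:

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document
my $twig= XML::Twig->new;
$twig->parse( '<list><item nb="10">ten</item><item nb="9">nine</item>'
            . '<item nb="100">hundred</item></list>');
my $root= $twig->root;

# numeric sort on the nb attribute
$root->sort_children_on_att( 'nb', type => 'numeric');
my $by_att= join( ',', map { $_->att( 'nb') } $root->children);

# alphabetical sort on the text, in reverse order
$root->sort_children_on_value( order => 'reverse');
my $by_value= join( ',', map { $_->text } $root->children);

print "by att: $by_att\nby value: $by_value\n";
```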
2513
2514       sort_children_on_field ($tag, %options)
2515           Sort the children of the element in place, according to the field
2516           $tag (the text of the first child of the child with this tag).
2517           %options are the same as for "sort_children_on_value".
2518
2519           Return the element, with its children sorted
2520
2521       sort_children( $get_key, %options)
2522           Sort the children of the element in place. The $get_key argument is
2523           a reference to a function that returns the sort key when passed an
2524           element.
2525
2526           For example:
2527
2528             $elt->sort_children( sub { $_[0]->{'att'}->{"nb"} + $_[0]->text },
2529                                  type => 'numeric', order => 'reverse'
2530                                );
2531
2532       field_to_att ($cond, $att)
2533           Turn the text of the first sub-element matched by $cond into the
2534           value of attribute $att of the element. If $att is omitted then
2535           $cond is used as the name of the attribute, which makes sense only
2536           if $cond is a valid element (and attribute) name.
2537
2538           The sub-element is then cut.
2539
2540       att_to_field ($att, $tag)
2541           Take the value of attribute $att and create a sub-element $tag as
2542           first child of the element. If $tag is omitted then $att is used as
2543           the name of the sub-element.
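
           A sketch of a round trip between a field and an attribute. The
           "person" document and the "fullname" tag are assumptions made for
           the example:

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document
my $twig= XML::Twig->new;
$twig->parse( '<person><name>Ann</name><age>42</age></person>');
my $root= $twig->root;

# the name sub-element becomes the name attribute and is cut
$root->field_to_att( 'name');
my $att_value = $root->att( 'name');
my $name_child= $root->first_child( 'name');   # no such child any more

# the attribute goes back into a first child, under a different tag
$root->att_to_field( name => 'fullname');
my $first= $root->first_child;

print $root->sprint, "\n";
```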
2544
2545       get_xpath  ($xpath, $optional_offset)
2546           Return a list of elements satisfying the $xpath. $xpath is an
2547           XPATH-like expression.
2548
2549           A subset of the XPATH abbreviated syntax is covered:
2550
2551             tag
2552             tag[1] (or any other positive number)
2553             tag[last()]
2554             tag[@att] (the attribute exists for the element)
2555             tag[@att="val"]
2556             tag[@att=~ /regexp/]
2557             tag[@att1="val1" and @att2="val2"]
2558             tag[@att1="val1" or @att2="val2"]
2559             tag[string()="toto"] (returns tag elements whose text (as per the text method)
2560                                  is toto)
2561             tag[string()=~/regexp/] (returns tag elements whose text (as per the text
2562                                     method) matches regexp)
2563             expressions can start with / (search starts at the document root)
2564             expressions can start with . (search starts at the current element)
2565             // can be used to get all descendants instead of just direct children
2566             * matches any tag
2567
2568           So the following examples from the XPath
2569           recommendation <http://www.w3.org/TR/xpath.html#path-abbrev> work:
2570
2571             para selects the para element children of the context node
2572             * selects all element children of the context node
2573             para[1] selects the first para child of the context node
2574             para[last()] selects the last para child of the context node
2575             */para selects all para grandchildren of the context node
2576             /doc/chapter[5]/section[2] selects the second section of the fifth chapter
2577                of the doc
2578             chapter//para selects the para element descendants of the chapter element
2579                children of the context node
2580             //para selects all the para descendants of the document root and thus selects
2581                all para elements in the same document as the context node
2582             //olist/item selects all the item elements in the same document as the
2583                context node that have an olist parent
2584             .//para selects the para element descendants of the context node
2585             .. selects the parent of the context node
2586             para[@type="warning"] selects all para children of the context node that have
2587                a type attribute with value warning
2588             employee[@secretary and @assistant] selects all the employee children of the
2589                context node that have both a secretary attribute and an assistant
2590                attribute
2591
2592           The elements will be returned in the document order.
2593
2594           If $optional_offset is used then only one element will be returned,
2595           the one with the appropriate offset in the list, starting at 0
2596
2597           Quoting and interpolating variables can be a pain when the Perl
2598           syntax and the XPATH syntax collide, so use alternate quoting
2599           mechanisms like q or qq (I like q{} and qq{} myself).
2600
2601           Here are some more examples to get you started:
2602
2603             my $p1= "p1";
2604             my $p2= "p2";
2605             my @res= $t->get_xpath( qq{p[string( "$p1") or string( "$p2")]});
2606
2607             my $att_val= "a1";
2608             my @res2= $t->get_xpath( qq{//*[\@att="$att_val"]});
2609
2610             my $val= "a1";
2611             my $exp= qq{//p[ \@att='$val']}; # you need to use \@ or you will get a warning
2612             my @res3= $t->get_xpath( $exp);
2613
2614           Note that the only supported regexp delimiter is / and that you
2615           must backslash all / in regexps AND in regular strings.
2616
2617           XML::Twig does not provide natively full XPATH support, but you can
2618           use "XML::Twig::XPath" to get "findnodes" to use "XML::XPath" as
2619           the XPath engine, with full coverage of the spec.
2623
2624       find_nodes
2625           Same as "get_xpath"
2626
2627       findnodes
2628           Same as "get_xpath"
2629
2630       text @optional_options
2631           Return a string consisting of all the "PCDATA" and "CDATA" in an
2632           element, without any tags. The text is not XML-escaped: base
2633           entities such as "&" and "<" are not escaped.
2634
2635           The "no_recurse" option will only return the text of the element,
2636           not of any included sub-elements (same as "text_only").
2637
2638       text_only
2639           Same as "text" except that the text returned doesn't include the
2640           text of sub-elements.
2641
2642       trimmed_text
2643           Same as "text" except that the text is trimmed: leading and
2644           trailing spaces are discarded, consecutive spaces are collapsed
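
           The three text methods can be compared on a made-up element with
           mixed content:

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document
my $twig= XML::Twig->new;
$twig->parse( '<p>some <b>bold</b> text</p>');
my $root= $twig->root;

my $text     = $root->text;           # includes the text of sub-elements
my $text_only= $root->text_only;      # skips the text of the b sub-element
my $trimmed  = $root->trimmed_text;   # like text, with spaces cleaned up

print "text:         '$text'\n";
print "text_only:    '$text_only'\n";
print "trimmed_text: '$trimmed'\n";
```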
2645
2646       set_text        ($string)
2647           Set the text for the element: if the element is a "PCDATA", just
2648           set its text, otherwise cut all the children of the element and
2649           create a single "PCDATA" child for it, which holds the text.
2650
2651       merge ($elt2)
2652           Move the content of $elt2 within the element
2653
2654       insert         ($tag1, [$optional_atts1], $tag2, [$optional_atts2],...)
2655           For each tag in the list inserts an element $tag as the only child
2656           of the element.  The element gets the optional attributes in
2657           "$optional_atts<n>".  All children of the element are set as
2658           children of the new element.  The upper level element is returned.
2659
2660             $p->insert( table => { border=> 1}, 'tr', 'td')
2661
2662           puts the content of $p in a table with a visible border, a single
2663           "tr" and a single "td", and returns the "table" element:
2664
2665             <p><table border="1"><tr><td>original content of p</td></tr></table></p>
2666
2667       wrap_in        (@tag)
2668           Wrap elements in @tag as the successive ancestors of the element,
2669           returns the new element.  "$elt->wrap_in( 'td', 'tr', 'table')"
2670           wraps the element as a single cell in a table for example.
2671
2672           Optionally each tag can be followed by a hashref of attributes,
2673           that will be set on the wrapping element:
2674
2675             $elt->wrap_in( p => { class => "advisory" }, div => { class => "intro", id => "div_intro" });
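
           The table case mentioned above can be sketched on a made-up
           document. Where "insert" adds elements inside the calling
           element, "wrap_in" adds them around it:

```perl
use strict;
use warnings;
use XML::Twig;

# hypothetical sample document
my $twig= XML::Twig->new;
$twig->parse( '<doc><p>text</p></doc>');
my $root= $twig->root;

# tags are listed from the innermost wrapper to the outermost one
$root->first_child( 'p')->wrap_in( 'td', 'tr', 'table');

my $xml= $root->sprint;
print "$xml\n";
```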
2676
2677       insert_new_elt ($opt_position, $tag, $opt_atts_hashref, @opt_content)
2678           Combines a "new" and a "paste": creates a new element using $tag,
2679           $opt_atts_hashref and @opt_content which are arguments similar to
2680           those for "new", then paste it, using $opt_position or
2681           'first_child', relative to $elt.
2682
2683           Return the newly created element
2684
2685       erase
2686           Erase the element: the element is deleted and all of its children
2687           are pasted in its place.
2688
2689       set_content    ( $optional_atts, @list_of_elt_and_strings) (
2690       $optional_atts, '#EMPTY')
2691           Set the content for the element, from a list of strings and
2692           elements.  Cuts all the element children, then pastes the list
2693           elements as the children.  This method will create a "PCDATA"
2694           element for any strings in the list.
2695
2696           The $optional_atts argument is the ref of a hash of attributes. If
2697           this argument is used then the previous attributes are deleted,
2698           otherwise they are left untouched.
2699
2700           WARNING: if you rely on IDs then you will have to set the id
2701           yourself. At this point the element does not belong to a twig yet,
2702           so the ID attribute is not known and the id won't be stored in the
2703           ID list.
2704
2705           A content of '"#EMPTY"' creates an empty element.
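
           A sketch of both forms, with and without the attribute hashref
           (the document and attribute values are assumptions made for the
           example):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><p class="old">old text</p></doc>');
my $p = $twig->root->first_child('p');

# no hashref: the content is replaced, the attributes are left untouched
$p->set_content( 'new ', XML::Twig::Elt->new( b => 'bold'), ' text');
my $class_after_first = $p->att('class');
my $xml_after_first   = $p->sprint;

# with a hashref: the previous attributes are replaced too
$p->set_content( { class => 'new' }, 'replaced');

print $xml_after_first, "\n", $p->sprint, "\n";
```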
2706
2707       namespace ($optional_prefix)
2708           Return the URI of the namespace that $optional_prefix or the
2709           element name belongs to. If the name doesn't belong to any
2710           namespace, "undef" is returned.
2711
2712       local_name
2713           Return the local name (without the prefix) for the element
2714
2715       ns_prefix
2716           Return the namespace prefix for the element
2717
2718       current_ns_prefixes
2719           Return a list of namespace prefixes valid for the element. The
2720           order of the prefixes in the list has no meaning. If the default
2721           namespace is currently bound, '' appears in the list.
2722
2723       inherit_att  ($att, @optional_tag_list)
2724           Return the value of an attribute inherited from parent tags. The
2725           value returned is found by looking for the attribute in the element
2726           then in turn in each of its ancestors. If the @optional_tag_list is
2727           supplied only those ancestors whose tag is in the list will be
2728           checked.
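
           For example, a sketch using a hypothetical "lang" attribute set
           at different levels of a made-up document:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc lang="en"><sect lang="fr"><p>bonjour</p></sect><sect><p>hello</p></sect></doc>');
my( $fr_para, $en_para) = $twig->root->descendants('p');

my $nearest   = $fr_para->inherit_att('lang');         # found on the enclosing sect
my $from_root = $en_para->inherit_att('lang');         # sect has none, found on doc
my $doc_only  = $fr_para->inherit_att('lang', 'doc');  # only doc elements are checked

print "$nearest $from_root $doc_only\n";
```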
2729
2730       all_children_are ($optional_condition)
2731           Return 1 if all children of the element pass the
2732           $optional_condition, 0 otherwise.
2733
2734       level       ($optional_condition)
2735           Return the depth of the element in the twig (root is 0).  If
2736           $optional_condition is given then only ancestors that match the
2737           condition are counted.
2738
2739           WARNING: in a tree created using the "twig_roots" option this will
2740           not return the level in the document tree, level 0 will be the
2741           document root, level 1 will be the "twig_roots" elements. During
2742           the parsing (in a "twig_handler") you can use the "depth" method on
2743           the twig object to get the real parsing depth.
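
           A short sketch on a fully loaded tree (no "twig_roots"), with
           made-up nested "sect" elements:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><sect><sect><p>x</p></sect></sect></doc>');
my $p = ( $twig->root->descendants('p') )[0];

my $depth      = $p->level;           # counts doc, sect and sect
my $sect_depth = $p->level('sect');   # counts only the sect ancestors

print "$depth $sect_depth\n";
```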
2744
2745       in           ($potential_parent)
2746           Return true if the element is in the potential_parent
2747           ($potential_parent is an element)
2748
2749       in_context   ($cond, $optional_level)
2750           Return true if the element is included in an element which passes
2751           $cond optionally within $optional_level levels. The returned value
2752           is the including element.
2753
2754       pcdata
2755           Return the text of a "PCDATA" element or "undef" if the element is
2756           not "PCDATA".
2757
2758       pcdata_xml_string
2759           Return the text of a "PCDATA" element or undef if the element is
2760           not "PCDATA".  The text is "XML-escaped" ('&' and '<' are replaced
2761           by '&amp;' and '&lt;')
2762
2763       set_pcdata     ($text)
2764           Set the text of a "PCDATA" element. This method does not check that
2765           the element is indeed a "PCDATA" so usually you should use
2766           "set_text" instead.
2767
2768       append_pcdata  ($text)
2769           Add the text at the end of a "PCDATA" element.
2770
2771       is_cdata
2772           Return 1 if the element is a "CDATA" element, returns 0 otherwise.
2773
2774       is_text
2775           Return 1 if the element is a "CDATA" or "PCDATA" element, returns 0
2776           otherwise.
2777
2778       cdata
2779           Return the text of a "CDATA" element or "undef" if the element is
2780           not "CDATA".
2781
2782       cdata_string
2783           Return the XML string of a "CDATA" element, including the opening
2784           and closing markers.
2785
2786       set_cdata     ($text)
2787           Set the text of a "CDATA" element.
2788
2789       append_cdata  ($text)
2790           Add the text at the end of a "CDATA" element.
2791
2792       remove_cdata
2793           Turns all "CDATA" sections in the element into regular "PCDATA"
2794           elements. This is useful when converting XML to HTML, as browsers
2795           do not support CDATA sections.
2796
2797       extra_data
2798           Return the extra_data (comments and PI's) attached to an element
2799
2800       set_extra_data     ($extra_data)
2801           Set the extra_data (comments and PI's) attached to an element
2802
2803       append_extra_data  ($extra_data)
2804           Append extra_data to the existing extra_data before the element (if
2805           no previous extra_data exists then it is created)
2806
2807       set_asis
2808           Set a property of the element that causes it to be output without
2809           being XML escaped by the print functions: if it contains "a < b" it
2810           will be output as such and not as "a &lt; b". This can be useful to
2811           create text elements that will be output as markup. Note that all
2812           "PCDATA" descendants of the element are also marked as having the
2813           property (they are the ones that are actually impacted by the
2814           change).
2815
2816           If the element is a "CDATA" element it will also be output asis,
2817           without the "CDATA" markers. The same goes for any "CDATA"
2818           descendant of the element
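
           As an illustration, a sketch that stores markup as text and then
           marks it "asis" (the "raw" tag is a hypothetical name):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><raw>placeholder</raw></doc>');
my $raw = $twig->root->first_child('raw');

# the string is stored as text, so '<' is escaped on output
$raw->set_text('<b>bold</b>');
my $escaped = $raw->sprint;

# once marked asis, the same text is output verbatim
$raw->set_asis;
my $verbatim = $raw->sprint;

print "$escaped\n$verbatim\n";
```

           Nothing checks that asis text is well-formed, so the output is
           only as valid as the string you put in.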
2819
2820       set_not_asis
2821           Unsets the "asis" property for the element and its text
2822           descendants.
2823
2824       is_asis
2825           Return the "asis" property status of the element ( 1 or "undef")
2826
2827       closed
2828           Return true if the element has been closed. Might be useful if you
2829           are somewhere in the tree, during the parse, and have no idea
2830           whether a parent element is completely loaded or not.
2831
2832       get_type
2833           Return the type of the element: '"#ELT"' for "real" elements, or
2834           '"#PCDATA"', '"#CDATA"', '"#COMMENT"', '"#ENT"', '"#PI"'
2835
2836       is_elt
2837           Return the tag if the element is a "real" element, or 0 if it is
2838           "PCDATA", "CDATA"...
2839
2840       contains_only_text
2841           Return 1 if the element does not contain any other "real" element
2842
2843       contains_only ($exp)
2844           Return the list of children if all children of the element match
2845           the expression $exp
2846
2847             if( $para->contains_only( 'tt')) { ... }
2848
2849       contains_a_single ($exp)
2850           If the element contains a single child that matches the expression
2851           $exp returns that element. Otherwise returns 0.
2852
2853       is_field
2854           same as "contains_only_text"
2855
2856       is_pcdata
2857           Return 1 if the element is a "PCDATA" element, returns 0 otherwise.
2858
2859       is_ent
2860           Return 1 if the element is an entity (an unexpanded entity)
2861           element, return 0 otherwise.
2862
2863       is_empty
2864           Return 1 if the element is empty, 0 otherwise
2865
2866       set_empty
2867           Flags the element as empty. No further check is made, so if the
2868           element is actually not empty the output will be messed up. The
2869           only effect of this method is that the element will be output as
2870           "<tag att="value"/>".
2871
2872       set_not_empty
2873           Flags the element as not empty. If it is actually empty then the
2874           element will be output as "<tag att="value"></tag>".
2875
2876       is_pi
2877           Return 1 if the element is a processing instruction ("#PI")
2878           element, return 0 otherwise.
2879
2880       target
2881           Return the target of a processing instruction
2882
2883       set_target ($target)
2884           Set the target of a processing instruction
2885
2886       data
2887           Return the data part of a processing instruction
2888
2889       set_data ($data)
2890           Set the data of a processing instruction
2891
2892       set_pi ($target, $data)
2893           Set the target and data of a processing instruction
2894
2895       pi_string
2896           Return the string form of a processing instruction ("<?target
2897           data?>")
2898
2899       is_comment
2900           Return 1 if the element is a comment ("#COMMENT") element, return 0
2901           otherwise.
2902
2903       set_comment ($comment_text)
2904           Set the text for a comment
2905
2906       comment
2907           Return the content of a comment (just the text, not the "<!--" and
2908           "-->")
2909
2910       comment_string
2911           Return the XML string for a comment ("<!-- comment -->")
2912
2913       set_ent ($entity)
2914           Set a (non-expanded) entity ("#ENT"). $entity is the entity text
2915           ("&ent;")
2916
2917       ent Return the entity for an entity ("#ENT") element ("&ent;")
2918
2919       ent_name
2920           Return the entity name for an entity ("#ENT") element ("ent")
2921
2922       ent_string
2923           Return the entity, either expanded if the expanded version is
2924           available, or non-expanded ("&ent;") otherwise
2925
2926       child ($offset, $optional_condition)
2927           Return the $offset-th child of the element, optionally the
2928           $offset-th child that matches $optional_condition. The children are
2929           treated as a list, so "$elt->child( 0)" is the first child, while
2930           "$elt->child( -1)" is the last child.
2931
2932       child_text ($offset, $optional_condition)
2933           Return the text of a child or "undef" if the child does not
2934           exist. Arguments are the same as "child".
2935
2936       last_child    ($optional_condition)
2937           Return the last child of the element, or the last child matching
2938           $optional_condition (ie the last of the element children matching
2939           the condition).
2940
2941       last_child_text   ($optional_condition)
2942           Same as "first_child_text" but for the last child.
2943
2944       sibling  ($offset, $optional_condition)
2945           Return the next or previous $offset-th sibling of the element, or
2946           the $offset-th one matching $optional_condition. If $offset is
2947           negative then a previous sibling is returned, if $offset is
2948           positive then a next sibling is returned. "$offset=0" returns the
2949           element if there is no condition or if the element matches the
2950           condition, "undef" otherwise.
2951
2952       sibling_text ($offset, $optional_condition)
2953           Return the text of a sibling or "undef" if the sibling does not
2954           exist.  Arguments are the same as "sibling".
2955
2956       prev_siblings ($optional_condition)
2957           Return the list of previous siblings (optionally matching
2958           $optional_condition) for the element. The elements are ordered in
2959           document order.
2960
2961       next_siblings ($optional_condition)
2962           Return the list of siblings (optionally matching
2963           $optional_condition) following the element. The elements are
2964           ordered in document order.
2965
2966       pos ($optional_condition)
2967           Return the position of the element in the children list. The first
2968           child has a position of 1 (as in XPath).
2969
2970           If the $optional_condition is given then only siblings that match
2971           the condition are counted. If the element itself does not match the
2972           condition then 0 is returned.
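
           The offsets used by "child", "sibling" and "pos" can be compared
           in a short sketch (the list document is made up for the
           example):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<list><item>a</item><note>n</note><item>b</item><item>c</item></list>');
my $root = $twig->root;

my $first       = $root->child( 0);            # children are indexed from 0
my $last        = $root->child( -1);           # negative offsets count from the end
my $second_item = $root->child( 1, 'item');    # only item children are counted

my $next        = $first->sibling( 1);         # the note
my $later_item  = $first->sibling( 2, 'item'); # second item after the first one

my $pos         = $last->pos;                  # positions start at 1
my $item_pos    = $last->pos( 'item');         # only item siblings are counted
my $no_match    = $next->pos( 'item');         # the note is not an item

print join( ' ', $second_item->text, $later_item->text, $pos, $item_pos, $no_match), "\n";
```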
2973
2974       atts
2975           Return a hash ref containing the element attributes
2976
2977       set_atts      ({ att1=>$att1_val, att2=> $att2_val... })
2978           Set the element attributes with the hash ref supplied as the
2979           argument. The previous attributes are lost (ie the attributes set
2980           by "set_atts" replace all of the attributes of the element).
2981
2982           You can also pass a list instead of a hashref: "$elt->set_atts(
2983           att1 => 'val1',...)"
2984
2985       del_atts
2986           Deletes all the element attributes.
2987
2988       att_nb
2989           Return the number of attributes for the element
2990
2991       has_atts
2992           Return true if the element has attributes (in fact return the
2993           number of attributes, thus being an alias to "att_nb")
2994
2995       has_no_atts
2996           Return true if the element has no attributes, false (0) otherwise
2997
2998       att_names
2999           return a list of the attribute names for the element
3000
3001       att_xml_string ($att, $options)
3002           Return the attribute value, where '&', '<' and quote (" or the
3003           value of the quote option at twig creation) are XML-escaped.
3004
3005           The options are passed as a hashref. A true "escape_gt" also escapes
3006           '>': "$elt->att_xml_string( 'myatt', { escape_gt => 1 })".
3007
3008       set_id       ($id)
3009           Set the "id" attribute of the element to the value.  See "elt_id"
3010           to change the id attribute name
3011
3012       id  Gets the id attribute value
3013
3014       del_id       ($id)
3015           Deletes the "id" attribute of the element and remove it from the id
3016           list for the document
3017
3018       class
3019           Return the "class" attribute for the element (methods on the
3020           "class" attribute are quite convenient when dealing with XHTML, or
3021           plain XML that will eventually be displayed using CSS)
3022
3023       set_class ($class)
3024           Set the "class" attribute for the element to $class
3025
3026       add_to_class ($class)
3027           Add $class to the element "class" attribute: the new class is added
3028           only if it is not already present. Note that classes are sorted
3029           alphabetically, so the "class" attribute can be changed even if the
3030           class is already there
3031
3032       att_to_class ($att)
3033           Set the "class" attribute to the value of attribute $att
3034
3035       add_att_to_class ($att)
3036           Add the value of attribute $att to the "class" attribute of the
3037           element
3038
3039       move_att_to_class ($att)
3040           Add the value of attribute $att to the "class" attribute of the
3041           element and delete the attribute
3042
3043       tag_to_class
3044           Set the "class" attribute of the element to the element tag
3045
3046       add_tag_to_class
3047           Add the element tag to its "class" attribute
3048
3049       set_tag_class ($new_tag)
3050           Add the element tag to its "class" attribute and sets the tag to
3051           $new_tag
3052
3053       in_class ($class)
3054           Return true (1) if the element is in the class $class (if $class is
3055           one of the tokens in the element "class" attribute)
3056
3057       tag_to_span
3058           Change the element tag to "span" and set its class to the old tag
3059
3060       tag_to_div
3061           Change the element tag to "div" and set its class to the old tag
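
           A sketch of how the class methods combine when turning custom
           markup into XHTML-like output (the "warning" tag and "type"
           attribute are hypothetical):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><warning type="big">careful</warning></doc>');
my $w = $twig->root->first_child('warning');

$w->tag_to_div;                  # tag becomes div, class becomes the old tag
$w->add_att_to_class( 'type');   # the value of type is added to the class

print $w->tag, ': ', $w->class, "\n";
```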
3062
3063       DESTROY
3064           Frees the element from memory.
3065
3066       start_tag
3067           Return the string for the start tag for the element, including the
3068           "/>" at the end of an empty element tag
3069
3070       end_tag
3071           Return the string for the end tag of an element.  For an empty
3072           element, this returns the empty string ('').
3073
3074       xml_string @optional_options
3075           Equivalent to "$elt->sprint( 1)", returns the string for the entire
3076           element, excluding the element's tags (but nested element tags are
3077           present)
3078
3079           The '"no_recurse"' option will only return the text of the element,
3080           not of any included sub-elements (same as "xml_text_only").
3081
3082       inner_xml
3083           Another synonym for xml_string
3084
3085       outer_xml
3086           Another synonym for sprint
3087
3088       xml_text
3089           Return the text of the element, encoded (and processed by the
3090           current "output_filter" or "output_encoding" options), without any
3091           tag.
3092
3093       xml_text_only
3094           Same as "xml_text" except that the text returned doesn't include
3095           the text of sub-elements.
3096
3097       set_pretty_print ($style)
3098           Set the pretty print method, amongst '"none"' (default),
3099           '"nsgmls"', '"nice"', '"indented"', '"record"' and '"record_c"'
3100
3101           pretty_print styles:
3102
3103           none
3104               the default, no "\n" is used
3105
3106           nsgmls
3107               nsgmls style, with "\n" added within tags
3108
3109           nice
3110               adds "\n" wherever possible (NOT SAFE, can lead to invalid XML)
3111
3112           indented
3113               same as "nice" plus indents elements (NOT SAFE, can lead to
3114               invalid XML)
3115
3116           record
3117               table-oriented pretty print, one field per line
3118
3119           record_c
3120               table-oriented pretty print, more compact than "record", one
3121               record per line
3122
3123       set_empty_tag_style ($style)
3124           Set the method to output empty tags, amongst '"normal"' (default),
3125           '"html"', and '"expand"'.
3126
3127           "normal" outputs an empty tag '"<tag/>"', "html" adds a space
3128           '"<tag />"' for elements that can be empty in XHTML and "expand"
3129           outputs '"<tag></tag>"'
3130
3131       set_remove_cdata  ($flag)
3132           set (or unset) the flag that forces the twig to output CDATA
3133           sections as regular (escaped) PCDATA
3134
3135       set_indent ($string)
3136           Set the indentation for the indented pretty print style (default is
3137           2 spaces)
3138
3139       set_quote ($quote)
3140           Set the quotes used for attributes. Can be '"double"' (default) or
3141           '"single"'
3142
3143       cmp       ($elt)
3144             Compare the order of the 2 elements in a twig.
3145
3146             $a is the <A>...</A> element, $b is the <B>...</B> element
3147
3148             document                        $a->cmp( $b)
3149             <A> ... </A> ... <B>  ... </B>     -1
3150             <A> ... <B>  ... </B> ... </A>     -1
3151             <B> ... </B> ... <A>  ... </A>      1
3152             <B> ... <A>  ... </A> ... </B>      1
3153              $a == $b                           0
3154              $a and $b not in the same tree   undef
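
           The table above can be checked with a short sketch; the
           variable names are hypothetical ($a and $b are avoided because
           Perl reserves them for "sort"):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><A><B/></A><C/></doc>');
my $elt_a = $twig->root->first_child('A');
my $elt_b = $elt_a->first_child('B');
my $elt_c = $twig->root->first_child('C');

my $nested   = $elt_a->cmp( $elt_b);   # A starts before the B it contains
my $reversed = $elt_b->cmp( $elt_a);
my $sibling  = $elt_a->cmp( $elt_c);   # A is entirely before C
my $same     = $elt_a->cmp( $elt_a);

print "$nested $reversed $sibling $same\n";
```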
3155
3156       before       ($elt)
3157           Return 1 if the element starts before $elt, 0 otherwise. If the 2
3158           elements are not in the same twig then return "undef".
3159
3160               if( $a->cmp( $b) == -1) { return 1; } else { return 0; }
3161
3162       after       ($elt)
3163           Return 1 if the element starts after $elt, 0 otherwise. If the 2
3164           elements are not in the same twig then return "undef".
3165
3166               if( $a->cmp( $b) == 1) { return 1; } else { return 0; }
3167
3168       other comparison methods
3169           lt
3170           le
3171           gt
3172           ge
3173       path
3174           Return the element context in a form similar to XPath's short form:
3175           '"/root/tag1/../tag"'
3176
3177       xpath
3178           Return a unique XPath expression that can be used to find the
3179           element again.
3180
3181           It looks like "/doc/sect[3]/title": unique elements do not have an
3182           index, the others do.
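
           A sketch contrasting "path" and "xpath" on a made-up document
           with two "sect" elements; feeding the result of "xpath" to
           "get_xpath" is assumed to find the same element again:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><sect><title>t1</title></sect><sect><title>t2</title></sect></doc>');
my $title = ( $twig->root->descendants('title') )[1];

my $path  = $title->path;    # context only, no index
my $xpath = $title->xpath;   # indexed where needed, in the /doc/sect[n]/title form

# the expression returned by xpath should lead back to the element
my $found = ( $twig->get_xpath( $xpath) )[0];

print "$path\n$xpath\n", $found->text, "\n";
```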
3183
3184       private methods
3185           Low-level methods on the twig:
3186
3187           set_parent        ($parent)
3188           set_first_child   ($first_child)
3189           set_last_child    ($last_child)
3190           set_prev_sibling  ($prev_sibling)
3191           set_next_sibling  ($next_sibling)
3192           set_twig_current
3193           del_twig_current
3194           twig_current
3195           flush
3196               This method should NOT be used, always flush the twig, not an
3197               element.
3198
3199           contains_text
3200
3201           Those methods should not be used, unless of course you find some
3202           creative and interesting, not to mention useful, ways to do it.
3203
3204   cond
3205       Most of the navigation functions accept a condition as an optional
3206       argument. The first element (or all elements for "children" or
3207       "ancestors") that passes the condition is returned.
3208
3209       The condition is a single step of an XPath expression using the XPath
3210       subset defined by "get_xpath". Additional conditions are:
3211
3214       #ELT
3215           return a "real" element (not a PCDATA, CDATA, comment or pi
3216           element)
3217
3218       #TEXT
3219           return a PCDATA or CDATA element
3220
3221       regular expression
3222           return an element whose tag matches the regexp. The regexp has to
3223           be created with "qr//" (hence this is available only on perl 5.005
3224           and above)
3225
3226       code reference
3227           applies the code, passing the current element as argument, if the
3228           code returns true then the element is returned, if it returns false
3229           then the code is applied to the next candidate.
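
       The kinds of condition listed above can be sketched as follows, on a
       made-up document (the tags and the "class" value are only
       illustrative):

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><h1>T</h1><p>one</p><h2>S</h2><p class="x">two</p></doc>');
my $root = $twig->root;

my $first_p  = $root->first_child( 'p');                            # tag name
my $by_att   = $root->first_child( 'p[@class="x"]');                # XPath-like step
my @headings = $root->children( qr/^h\d$/);                         # regexp on the tag
my $classy   = $root->first_child( sub { $_[0]->att( 'class') });   # code reference
my $text     = $first_p->first_child( '#TEXT');                     # PCDATA or CDATA
my $last_elt = $root->last_child( '#ELT');                          # "real" element

print join( ' ', $first_p->text, $by_att->text, scalar @headings), "\n";
```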
3230
3231   XML::Twig::XPath
3232       XML::Twig implements a subset of XPath through the "get_xpath" method.
3233
3234       If you want to use the whole XPath power, then you can use
3235       "XML::Twig::XPath" instead. In this case "XML::Twig" uses "XML::XPath"
3236       to execute XPath queries.  You will of course need "XML::XPath"
3237       installed to be able to use "XML::Twig::XPath".
3238
3239       See XML::XPath for more information.
3240
3241       The methods you can use are:
3242
3243       findnodes              ($path)
3244           return a list of nodes found by $path.
3245
3246       findnodes_as_string    ($path)
3247           return the nodes found reproduced as XML. The result is not
3248           guaranteed to be valid XML though.
3249
3250       findvalue              ($path)
3251           return the concatenation of the text content of the result nodes
3252
3253       In order for "XML::XPath" to be used as the XPath engine the following
3254       methods are included in "XML::Twig":
3255
3256       in XML::Twig
3257
3258       getRootNode
3259       getParentNode
3260       getChildNodes
3261
3262       in XML::Twig::Elt
3263
3264       string_value
3265       toString
3266       getName
3267       getRootNode
3268       getNextSibling
3269       getPreviousSibling
3270       isElementNode
3271       isTextNode
3272       isPI
3273       isPINode
3274       isProcessingInstructionNode
3275       isComment
3276       isCommentNode
3277       getTarget
3278       getChildNodes
3279       getElementById
3280
3281   XML::Twig::XPath::Elt
3282       The methods you can use are the same as on "XML::Twig::XPath" elements:
3283
3284       findnodes              ($path)
3285           return a list of nodes found by $path.
3286
3287       findnodes_as_string    ($path)
3288           return the nodes found reproduced as XML. The result is not
3289           guaranteed to be valid XML though.
3290
3291       findvalue              ($path)
3292           return the concatenation of the text content of the result nodes
3293
3294   XML::Twig::Entity_list
3295       new Create an entity list.
3296
3297       add         ($ent)
3298           Add an entity to an entity list.
3299
3300       add_new_ent ($name, $val, $sysid, $pubid, $ndata, $param)
3301           Create a new entity and add it to the entity list
3302
3303       delete     ($ent or $tag)
3304           Delete an entity (defined by its name or by the Entity object) from
3305           the list.
3306
3307       print      ($optional_filehandle)
3308           Print the entity list.
3309
3310       list
3311           Return the list as an array
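
       A sketch of how the entity list might be inspected and updated after
       parsing a document with an internal subset. The entity names and
       values are made up, and the list is assumed to be reachable through
       the twig's "entity_list" method:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse( '<!DOCTYPE doc [ <!ENTITY owner "ACME"> <!ENTITY product "Widget"> ]>'
            . '<doc>&product;</doc>');

my $list = $twig->entity_list;

# each item of the list is an XML::Twig::Entity object
my @before = sort map { $_->name } $list->list;

$list->add_new_ent( version => '1.0');   # name and value of a new entity
$list->delete( 'product');               # delete by name

my @after = sort map { $_->name } $list->list;

print "@before\n@after\n";
```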
3312
3313   XML::Twig::Entity
3314       new        ($name, $val, $sysid, $pubid, $ndata, $param)
3315           Same arguments as the Entity handler for XML::Parser.
3316
3317       print       ($optional_filehandle)
3318           Print an entity declaration.
3319
3320       name
3321           Return the name of the entity
3322
3323       val Return the value of the entity
3324
3325       sysid
3326           Return the system id for the entity (for NDATA entities)
3327
3328       pubid
3329           Return the public id for the entity (for NDATA entities)
3330
3331       ndata
3332           Return true if the entity is an NDATA entity
3333
3334       param
3335           Return true if the entity is a parameter entity
3336
3337       text
3338           Return the entity declaration text.
3339

EXAMPLES

3341       Additional examples (and a complete tutorial) can be found on the
3342       XML::Twig Page <http://www.xmltwig.com/xmltwig/>.
3343
3344       To figure out what flush does, call the following script with an XML
3345       file and an element name as arguments:
3346
3347         use XML::Twig;
3348
3349         my ($file, $elt)= @ARGV;
3350         my $t= XML::Twig->new( twig_handlers =>
3351             { $elt => sub {$_[0]->flush; print "\n[flushed here]\n";} });
3352         $t->parsefile( $file, ErrorContext => 2);
3353         $t->flush;
3354         print "\n";
3355

NOTES

3357   Subclassing XML::Twig
3358       Useful methods:
3359
3360       elt_class
3361           In order to subclass "XML::Twig" you will probably also need to
3362           subclass "XML::Twig::Elt". Use the "elt_class" option when you create
3363           the "XML::Twig" object to get the elements created in a different
3364           class (which should be a subclass of "XML::Twig::Elt").
3365
3366       add_options
3367           If you inherit the "XML::Twig" new method but want to add more
3368           options to it, you can use this method to prevent XML::Twig from
3369           issuing warnings for those additional options.
3370
3371   DTD Handling
3372       There are 3 possibilities here.  They are:
3373
3374       No DTD
3375           No doctype, no DTD information, no entity information, the world is
3376           simple...
3377
3378       Internal DTD
3379           The XML document includes an internal DTD, and maybe entity
3380           declarations.
3381
3382           If you use the load_DTD option when creating the twig the DTD
3383           information and the entity declarations can be accessed.
3384
3385           The DTD and the entity declarations will be "flush"'ed (or
3386           "print"'ed) either as is (if they have not been modified) or as
3387           reconstructed (poorly, comments are lost, order is not kept, due to
3388           its content this DTD should not be viewed by anyone) if they have
3389           been modified. You can also modify them directly by changing the
3390           "$twig->{twig_doctype}->{internal}" field (straight from
3391           XML::Parser, see the "Doctype" handler doc)
3392
3393       External DTD
3394           The XML document includes a reference to an external DTD, and maybe
3395           entity declarations.
3396
3397           If you use the "load_DTD" option when creating the twig the DTD
3398           information and the entity declarations can be accessed. The entity
3399           declarations will be "flush"'ed (or "print"'ed) either as is (if
3400           they have not been modified) or as reconstructed (badly, comments
3401           are lost, order is not kept).
3402
3403           You can change the doctype through the "$twig->set_doctype" method
3404           and print the dtd through the "$twig->dtd_text" or
3405           "$twig->dtd_print" methods.
3407
3408           If you need to modify the entity list this is probably the easiest
3409           way to do it.
3410
3411   Flush
3412       If you set handlers and use "flush", do not forget to flush the twig
3413       one last time AFTER the parsing, or you might be missing the end of the
3414       document.
3415
3416       Remember that element handlers are called when the element is CLOSED,
3417       so if you have handlers for nested elements the inner handlers will be
3418       called first. This makes it trickier than it would seem to number
3419       nested clauses, for example.
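
       The calling order can be seen in a small sketch, assuming a made-up
       document with nested "clause" elements and hypothetical "id" values:

```perl
use strict;
use warnings;
use XML::Twig;

my @order;   # ids of the clauses, in the order their handlers are called

my $twig = XML::Twig->new(
  twig_handlers => { clause => sub { push @order, $_->att( 'id'); } },
);
$twig->parse('<doc><clause id="outer"><clause id="inner"/></clause></doc>');

# the inner clause is closed, and so handled, before the outer one
print "@order\n";
```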
3420

BUGS

3422       entity handling
3423           Due to XML::Parser behaviour, non-base entities in attribute values
3424           disappear: "att="val&ent;"" will be turned into "att => val",
3425           unless you use the "keep_encoding" argument to "XML::Twig->new"
3426
3427       DTD handling
3428           The DTD handling methods are quite buggy. No one uses them and it
3429           seems very difficult to get them to work in all cases, including
3430           with several slightly incompatible versions of XML::Parser and of
3431           libexpat.
3432
3433           Basically you can read the DTD, output it back properly, and update
3434           entities, but not much more.
3435
3436           So use XML::Twig with standalone documents, or with documents
3437           referring to an external DTD, but don't expect it to properly parse
3438           and even output back the DTD.
3439
3440       memory leak
3441           If you use a lot of twigs you might find that you leak quite a lot
3442           of memory (about 2K per twig). You can use the "dispose" method
3443           to free that memory after you are done.
3444
3445           If you create elements the same thing might happen, use the
3446           "delete" method to get rid of them.
3447
3448           Alternatively installing the "Scalar::Util" (or "WeakRef") module
3449           on a version of Perl that supports it (>5.6.0) will get rid of the
3450           memory leaks automagically.
3451
3452       ID list
3453           The ID list is NOT updated when elements are cut or deleted.
3454
3455       change_gi
3456           This method will not function properly if you do:
3457
3458                $twig->change_gi( $old1, $new);
3459                $twig->change_gi( $old2, $new);
3460                $twig->change_gi( $new, $even_newer);
3461
3462       sanity check on XML::Parser method calls
3463           XML::Twig should really prevent calls to some XML::Parser methods,
3464           especially the "setHandlers" method.
3465
3466       pretty printing
3467           Pretty printing (at least using the '"indented"' style) is hard to
3468           get right!  Only elements that belong to the document will be
3469           properly indented. Printing elements that do not belong to the twig
3470           makes it impossible for XML::Twig to figure out their depth, and
3471           thus their indentation level.
3472
3473           Also there is an unavoidable bug when using "flush" and pretty
3474           printing for elements with mixed content that start with an
3475           embedded element:
3476
3477             <elt><b>b</b>toto<b>bold</b></elt>
3478
3479             will be output as
3480
3481             <elt>
3482               <b>b</b>toto<b>bold</b></elt>
3483
3484           if you flush the twig when you find the "<b>" element
3485

Globals

3487       These are the things that can mess up calling code, especially if
3488       threaded.  They might also cause problems under mod_perl.
3489
3490       Exported constants
3491           Whether you want them or not you get them! These are subroutines to
3492           use as constants when creating or testing elements:
3493
3494             PCDATA  return '#PCDATA'
3495             CDATA   return '#CDATA'
3496             PI      return '#PI', I had the choice between PROC and PI :--(
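
               To show where these values turn up, here is a small sketch
               that lists the tags of the children of a mixed-content
               element. It compares against the literal string '#PCDATA'
               from the table above rather than calling the exported
               subroutine, and the sample document is made up:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<p>some <b>bold</b> text</p>');

# Text children report the special '#PCDATA' tag.
my @tags = map { $_->tag } $twig->root->children;

# is_pcdata is the method equivalent of comparing the tag to '#PCDATA'.
my @text = grep { $_->is_pcdata } $twig->root->children;

print join(', ', @tags), "\n";
```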
3497
3498       Module scoped values: constants
3499           These should cause no trouble:
3500
3501             %base_ent= ( '>' => '&gt;',
3502                          '<' => '&lt;',
3503                          '&' => '&amp;',
3504                          "'" => '&apos;',
3505                          '"' => '&quot;',
3506                        );
3507             CDATA_START   = "<![CDATA[";
3508             CDATA_END     = "]]>";
3509             PI_START      = "<?";
3510             PI_END        = "?>";
3511             COMMENT_START = "<!--";
3512             COMMENT_END   = "-->";
3513
3514           pretty print styles
3515
3516             ( $NSGMLS, $NICE, $INDENTED, $INDENTED_C, $WRAPPED, $RECORD1, $RECORD2)= (1..7);
3517
3518           empty tag output style
3519
3520             ( $HTML, $EXPAND)= (1..2);
3521
3522       Module scoped values: might be changed
3523           Most of these deal with pretty printing, so the worst that can
3524           happen is probably that XML output does not look right, but is
3525           still valid and processed identically by XML processors.
3526
3527           $empty_tag_style can mess up HTML browsers though, and changing $ID
3528           would most likely create problems.
3529
3530             $pretty=0;           # pretty print style
3531             $quote='"';          # quote for attributes
3532             $INDENT= '  ';       # indent for indented pretty print
3533             $empty_tag_style= 0; # how to display empty tags
3534             $ID                  # attribute used as an id ('id' by default)
3535
3536       Module scoped values: definitely changed
3537           These 2 variables are used to replace tags by an index, thus saving
3538           some space when creating a twig. If they really cause you too much
3539           trouble, let me know, it is probably possible to create either a
3540           switch or at least a version of XML::Twig that does not perform
3541           this optimization.
3542
3543             %gi2index;     # tag => index
3544             @index2gi;     # list of tags
3545
3546       If you need to manipulate all those values, you can use the following
3547       methods on the XML::Twig object:
3548
3549       global_state
3550           Return a hashref with all the global variables used by XML::Twig
3551
3552           The hash has the following fields:  "pretty", "quote", "indent",
3553           "empty_tag_style", "keep_encoding", "expand_external_entities",
3554           "output_filter", "output_text_filter", "keep_atts_order"
3555
3556       set_global_state ($state)
3557           Set the global state, $state is a hashref
3558
3559       save_global_state
3560           Save the current global state
3561
3562       restore_global_state
3563           Restore the previously saved (using "save_global_state") state
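
           A hedged sketch of how these methods might be combined: the
           current settings are saved, the pretty print style is changed
           for one output, and the saved settings are then restored. The
           sample document is made up, and the numeric values stored in
           the "pretty" field are not relied upon:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<doc><elt>text</elt></doc>');

# Remember the module-wide output settings before touching them.
my $before = $twig->global_state->{pretty};
$twig->save_global_state;

# Changing the pretty print style modifies a module scoped value.
$twig->set_pretty_print('indented');
my $during   = $twig->global_state->{pretty};
my $indented = $twig->sprint;

# Put the saved settings back for other code that uses XML::Twig.
$twig->restore_global_state;
my $after = $twig->global_state->{pretty};

print $indented;
```

           Saving and restoring around a temporary change keeps the
           change from leaking into other twigs in the same process.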
3564

TODO

3566       SAX handlers
3567           Allowing XML::Twig to work on top of any SAX parser
3568
3569       multiple twigs are not well supported
3570           A number of twig features are just global at the moment. These
3571           include the ID list and the "tag pool" (if you use "change_gi" then
3572           you change the tag for ALL twigs).
3573
3574           A future version will try to support this while trying not to be too
3575           hard on performance (at least when a single twig is used!).
3576

AUTHOR

3578       Michel Rodriguez <mirod@xmltwig.com>
3579

LICENSE

3581       This library is free software; you can redistribute it and/or modify it
3582       under the same terms as Perl itself.
3583
3584       Bug reports should be sent using: RT
3585       <http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-Twig>
3586
3587       Comments can be sent to mirod@xmltwig.com
3588
3589       The XML::Twig page is at <http://www.xmltwig.com/xmltwig/>. It includes
3590       the development version of the module, a slightly better version of the
3591       documentation, examples and a tutorial, Processing XML efficiently
3592       with Perl and XML::Twig:
3593       <http://www.xmltwig.com/xmltwig/tutorial/index.html>
3594

SEE ALSO

3596       Complete docs, including a tutorial, examples, an easier to use HTML
3597       version of the docs, a quick reference card and a FAQ are available at
3598       <http://www.xmltwig.com/xmltwig/>
3599
3600       git repository at <http://github.com/mirod/xmltwig>
3601
3602       XML::Parser, XML::Parser::Expat, XML::XPath, Encode, Text::Iconv,
3603       Scalar::Util
3604
3605   Alternative Modules
3606       XML::Twig is not the only XML processing module available on CPAN (far
3607       from it!).
3608
3609       The main alternative I would recommend is XML::LibXML.
3610
3611       Here is a quick comparison of the 2 modules:
3612
3613       XML::LibXML, actually "libxml2" on which it is based, sticks to the
3614       standards, and implements a good number of them in a rather strict way:
3615       XML, XPath, DOM, RelaxNG, I must be forgetting a couple (XInclude?). It
3616       is fast and rather frugal memory-wise.
3617
3618       XML::Twig is older: when I started writing it XML::Parser/expat was the
3619       only game in town. It implements XML and that's about it (plus a subset
3620       of XPath, and you can use XML::Twig::XPath if you have XML::XPathEngine
3621       installed for full support). It is slower and requires more memory for
3622       a full tree than XML::LibXML. On the plus side (yes, there is a plus
3623       side!) it lets you process a big document in chunks, and thus lets you
3624       tackle documents that couldn't be loaded in memory by XML::LibXML, and
3625       it offers a lot (and I mean a LOT!) of higher-level methods, for
3626       everything, from adding structure to "low-level" XML, to shortcuts for
3627       XHTML conversions and more. It also DWIMs quite a bit, getting comments
3628       and non-significant whitespaces out of the way but preserving them in
3629       the output for example. As it does not stick to the DOM, it also
3630       usually leads to shorter code than in XML::LibXML.
3631
3632       Beyond the pure features of the 2 modules, XML::LibXML seems to be
3633       preferred by "XML-purists", while XML::Twig seems to be more used by
3634       Perl Hackers who have to deal with XML. As you have noted, XML::Twig
3635       also comes with quite a lot of docs, but I am sure if you ask for help
3636       about XML::LibXML here or on Perlmonks you will get answers.
3637
3638       Note that it is actually quite hard for me to compare the 2 modules: on
3639       one hand I know XML::Twig inside-out and I can get it to do pretty much
3640       anything I need to (or I improve it ;--), while I have a very basic
3641       knowledge of XML::LibXML.  So feature-wise, I'd rather use XML::Twig
3642       ;--). On the other hand, I am painfully aware of some of the
3643       deficiencies, potential bugs and plain ugly code that lurk in
3644       XML::Twig, even though you are unlikely to be affected by them (unless
3645       for example you need to change the DTD of a document programmatically),
3646       while I haven't looked much into XML::LibXML so it still looks shiny
3647       and clean to me.
3648
3649       That said, if you need to process a document that is too big to fit in
3650       memory and XML::Twig is too slow for you, my reluctant advice would be
3651       to use "bare" XML::Parser.  It won't be as easy to use as XML::Twig:
3652       basically with XML::Twig you trade some speed (depending on what you do
3653       from a factor 3 to... none) for ease-of-use, but it will be easier IMHO
3654       than using SAX (albeit not standard), and at this point a LOT faster
3655       (see the last test in
3656       <http://www.xmltwig.com/article/simple_benchmark/>).
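
           For readers who have not used "bare" XML::Parser, here is a
           minimal sketch of the event-driven style it implies. It counts
           elements by tag name from Start events only, so no tree is
           built. The sample document and the %count hash are made up for
           illustration:

```perl
use strict;
use warnings;
use XML::Parser;

# Count elements by tag name using only Start events; no tree is built.
my %count;
my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $tag, %atts) = @_;
            $count{$tag}++;
        },
    },
);

$parser->parse('<doc><para/><para/><list><item/></list></doc>');

print "$_: $count{$_}\n" for sort keys %count;
```

           All state has to be kept by hand in variables such as %count,
           which is the ease-of-use cost mentioned above.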
3657
3658
3659
3660perl v5.10.1                      2010-08-22                           Twig(3)