XML::LibXML::Parser(3)  User Contributed Perl Documentation  XML::LibXML::Parser(3)

NAME

       XML::LibXML::Parser - Parsing XML Data with XML::LibXML

SYNOPSIS

         use XML::LibXML '1.70';

         # Parser constructor

         $parser = XML::LibXML->new();
         $parser = XML::LibXML->new(option=>value, ...);
         $parser = XML::LibXML->new({option=>value, ...});

         # Parsing XML

         $dom = XML::LibXML->load_xml(
             location => $file_or_url
             # parser options ...
           );
         $dom = XML::LibXML->load_xml(
             string => $xml_string
             # parser options ...
           );
         $dom = XML::LibXML->load_xml(
             string => (\$xml_string)
             # parser options ...
           );
         $dom = XML::LibXML->load_xml({
             IO => $perl_file_handle
             # parser options ...
           });
         $dom = $parser->load_xml(...);

         # Parsing HTML

         $dom = XML::LibXML->load_html(...);
         $dom = $parser->load_html(...);

         # Parsing well-balanced XML chunks

         $fragment = $parser->parse_balanced_chunk( $wbxmlstring, $encoding );

         # Processing XInclude

         $parser->process_xincludes( $doc );
         $parser->processXIncludes( $doc );

         # Old-style parser interfaces

         $doc = $parser->parse_file( $xmlfilename );
         $doc = $parser->parse_fh( $io_fh );
         $doc = $parser->parse_string( $xmlstring );
         $doc = $parser->parse_html_file( $htmlfile, \%opts );
         $doc = $parser->parse_html_fh( $io_fh, \%opts );
         $doc = $parser->parse_html_string( $htmlstring, \%opts );

         # Push parser

         $parser->parse_chunk($string, $terminate);
         $parser->init_push();
         $parser->push(@data);
         $doc = $parser->finish_push( $recover );

         # Set/query parser options

         $parser->option_exists($name);
         $parser->get_option($name);
         $parser->set_option($name,$value);
         $parser->set_options({$name=>$value,...});

         # XML catalogs

         $parser->load_catalog( $catalog_file );

PARSING

       An XML document is read into a data structure such as a DOM tree by a
       piece of software called a parser. XML::LibXML currently provides four
       different parser interfaces:

       ·   A DOM Pull-Parser

       ·   A DOM Push-Parser

       ·   A SAX Parser

       ·   A DOM based SAX Parser.

   Creating a Parser Instance
       XML::LibXML provides an OO interface to the libxml2 parser functions.
       Thus you have to create a parser instance before you can parse any XML
       data.

       new
             $parser = XML::LibXML->new();
             $parser = XML::LibXML->new(option=>value, ...);
             $parser = XML::LibXML->new({option=>value, ...});

           Create a new XML and HTML parser instance. Each parser instance
           holds default values for various parser options. Optionally, one
           can pass a hash reference or a list of option => value pairs to set
           a different default set of options.  Unless specified otherwise,
           the options "load_ext_dtd" and "expand_entities" are set to 1. See
           "Parser Options" for a list of libxml2 parser's options.

   DOM Parser
       One of the common parser interfaces of XML::LibXML is the DOM parser.
       This parser reads XML data into a DOM-like data structure, so each tag
       can be accessed and transformed.

       XML::LibXML's DOM parser is capable of parsing not only XML data but
       also (strict) HTML files. There are three ways to parse documents: as
       a string, as a Perl filehandle, or as a filename/URL. The return value
       from each is an XML::LibXML::Document object, which is a DOM object.

       All of the functions listed below will throw an exception if the
       document is invalid. To prevent this from terminating your program,
       wrap the call in an eval{} block.

       load_xml
             $dom = XML::LibXML->load_xml(
                 location => $file_or_url
                 # parser options ...
               );
             $dom = XML::LibXML->load_xml(
                 string => $xml_string
                 # parser options ...
               );
             $dom = XML::LibXML->load_xml(
                 string => (\$xml_string)
                 # parser options ...
               );
             $dom = XML::LibXML->load_xml({
                 IO => $perl_file_handle
                 # parser options ...
               });
             $dom = $parser->load_xml(...);

           This function is available since XML::LibXML 1.70. It provides an
           easy-to-use interface to the XML parser that parses a given file
           (or URL), string, or input stream to a DOM tree. The arguments can
           be passed in a HASH reference or as name => value pairs. The
           function can be called as a class method or an object method. In
           both cases it internally creates a new parser instance passing the
           specified parser options; if called as an object method, it clones
           the original parser (preserving its settings) and additionally
           applies the specified options to the new parser. See the
           constructor "new" and "Parser Options" for more information.

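           The calling conventions described for load_xml can be sketched as
           follows. This is a minimal illustration, not part of the module:
           the sample XML string and the variable names are assumptions made
           for the example.

```perl
use strict;
use warnings;
use XML::LibXML '1.70';

# Illustrative input; any well-formed XML string would do.
my $xml = '<root><item id="1">first</item><item id="2">second</item></root>';

# Class-method call with name => value pairs; eval traps parse errors.
my $dom = eval { XML::LibXML->load_xml(string => $xml, no_blanks => 1) };
die "parse failed: $@" unless $dom;

# The result is an XML::LibXML::Document, so the usual DOM calls apply.
my @ids = map { $_->getAttribute('id') }
          $dom->documentElement->getChildrenByTagName('item');
print join(',', @ids), "\n";

# Malformed input throws; the exception is caught and $bad stays undef.
my $bad = eval { XML::LibXML->load_xml(string => '<root><item></root>') };
print "second parse failed as expected\n" unless defined $bad;
```
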
       load_html
             $dom = XML::LibXML->load_html(...);
             $dom = $parser->load_html(...);

           This function is available since XML::LibXML 1.70. It has the same
           usage as "load_xml", providing an interface to the HTML parser. See
           "load_xml" for more information.

       Parsing HTML may cause problems, especially if the ampersand ('&') is
       used.  This is a common problem when the parsed HTML contains links to
       CGI scripts. Such links cause the parser to throw errors. In such
       cases libxml2 still parses the entire document as if there were no
       error, but the error causes XML::LibXML to stop the parsing process.
       However, the document is not lost.  Such HTML documents should be
       parsed using the recover flag. By default recovering is deactivated.

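       A minimal sketch of parsing such HTML with the recover flag might look
       like this. The HTML snippet is an invented example; "recover => 2" is
       the value documented under "Parser Options" for recovering without
       warnings.

```perl
use strict;
use warnings;
use XML::LibXML '1.70';

# Illustrative HTML with an unescaped ampersand in a CGI-style link.
my $html = '<html><body>'
         . '<a href="search.cgi?q=perl&page=2">next page</a>'
         . '</body></html>';

# recover => 2 asks the parser to continue past recoverable errors
# without reporting them as warnings.
my $dom = XML::LibXML->load_html(string => $html, recover => 2);

my @links = $dom->findnodes('//a');
print scalar(@links), " link(s) found\n";
```
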
       The functions described above are implemented to parse well-formed
       documents.  In some cases a program gets well-balanced XML instead of
       well-formed documents (e.g. an XML fragment from a database). With
       XML::LibXML there is no need for your code to wrap such fragments,
       because XML::LibXML can parse well-balanced XML fragments directly.

       parse_balanced_chunk
             $fragment = $parser->parse_balanced_chunk( $wbxmlstring, $encoding );

           This function parses a well-balanced XML string into an
           XML::LibXML::DocumentFragment. The first argument contains the
           input string; the optional second argument can be used to specify
           the character encoding of the input (UTF-8 is assumed by default).

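           As an illustration, a fragment with two sibling elements and no
           single root can be parsed and attached to an existing document. The
           element names and the target document are assumptions for this
           sketch.

```perl
use strict;
use warnings;
use XML::LibXML;

my $parser = XML::LibXML->new;

# Two sibling elements and no single root: well balanced, not well formed.
my $fragment = $parser->parse_balanced_chunk('<a>one</a><b>two</b>');

# Append the fragment's children to the root element of another document.
my $doc = XML::LibXML->load_xml(string => '<root/>');
$doc->documentElement->appendChild($fragment);

my @names = map { $_->nodeName } $doc->documentElement->childNodes;
print join(',', @names), "\n";
```
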
       parse_xml_chunk
           This is the old name of parse_balanced_chunk(). Because it may
           cause confusion with the push parser interface, this function
           should not be used anymore.

       By default XML::LibXML does not process XInclude tags within an XML
       document (see the options section below). XML::LibXML allows one to
       post-process a document to expand XInclude tags.

       process_xincludes
             $parser->process_xincludes( $doc );

           After a document is parsed into a DOM structure, you may want to
           expand the document's XInclude tags. This function processes the
           given document structure and expands all XInclude tags (or throws
           an error) by using the flags and callbacks of the given parser
           instance.

           Note that the resulting tree contains some extra nodes (of type
           XML_XINCLUDE_START and XML_XINCLUDE_END) after successfully
           processing the document. These nodes indicate where data was
           included into the original tree.  If the document is serialized,
           these extra nodes will not show up.

           Remember: a document with processed XIncludes differs from the
           original document after serialization, because the original
           XInclude tags will not get restored!

           If the parser flag "expand_xinclude" is set to 1, you do not need
           to post-process the parsed document.

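           A sketch of post-processing XIncludes, assuming the included
           document lives in a temporary file created just for this example
           (the file contents and element names are invented):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::LibXML;

# Write the document to be included into a temporary file.
my ($fh, $path) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$fh} '<part>included text</part>';
close $fh or die "close failed: $!";

# The main document refers to that file through an xi:include element.
my $xml = qq{<doc xmlns:xi="http://www.w3.org/2001/XInclude">}
        . qq{<xi:include href="$path"/></doc>};

my $parser = XML::LibXML->new;
my $doc    = $parser->load_xml(string => $xml);

# Expand the XInclude tags after parsing; errors are thrown as exceptions.
$parser->process_xincludes($doc);

my $text = $doc->findvalue('/doc/part');
print "$text\n";
```
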
       processXIncludes
             $parser->processXIncludes( $doc );

           This is an alias for process_xincludes, with a Java-like function
           name.

       parse_file
             $doc = $parser->parse_file( $xmlfilename );

           This function parses an XML document from a file or network;
           $xmlfilename can be either a filename or a URL. Note that for
           parsing files, this function is the fastest choice, about 6-8 times
           faster than parse_fh().

       parse_fh
             $doc = $parser->parse_fh( $io_fh );

           parse_fh() parses an IOREF or a subclass of IO::Handle.

           Because the data comes from an open handle, libxml2's parser does
           not know about the base URI of the document. To set the base URI
           one should use parse_fh() as follows:

             my $doc = $parser->parse_fh( $io_fh, $baseuri );

       parse_string
             $doc = $parser->parse_string( $xmlstring );

           This function is similar to parse_fh(), but it parses an XML
           document that is available as a single string in memory, or
           alternatively as a reference to a scalar containing a string.
           Again, you can pass an optional base URI to the function.

             my $doc = $parser->parse_string( $xmlstring, $baseuri );
             my $doc = $parser->parse_string(\$xmlstring, $baseuri);

       parse_html_file
             $doc = $parser->parse_html_file( $htmlfile, \%opts );

           Similar to parse_file() but parses HTML (strict) documents;
           $htmlfile can be a filename or a URL.

           An optional second argument can be used to pass some options to the
           HTML parser as a HASH reference. See options labeled with HTML in
           "Parser Options".

       parse_html_fh
             $doc = $parser->parse_html_fh( $io_fh, \%opts );

           Similar to parse_fh() but parses HTML (strict) streams.

           An optional second argument can be used to pass some options to the
           HTML parser as a HASH reference. See options labeled with HTML in
           "Parser Options".

           Note: the encoding option may not work correctly with this function
           in libxml2 < 2.6.27 if the HTML file declares its charset using a
           META tag.

       parse_html_string
             $doc = $parser->parse_html_string( $htmlstring, \%opts );

           Similar to parse_string() but parses HTML (strict) strings.

           An optional second argument can be used to pass some options to the
           HTML parser as a HASH reference. See options labeled with HTML in
           "Parser Options".

   Push Parser
       XML::LibXML provides a push parser interface. Rather than pulling the
       data from a given source, the push parser waits for the data to be
       pushed into it.

       This allows one to parse large documents without waiting for the parser
       to finish. The interface is especially useful if a program needs to
       pre-process the incoming pieces of XML (e.g. to detect document
       boundaries).

       While the XML::LibXML parse_*() functions require the data to be
       well-formed XML, the push parser will take any arbitrary string that
       contains some XML data. The only requirement is that all the pushed
       strings together form a well-formed document. With the push parser
       interface a program can interrupt the parsing process as required,
       where the parse_*() functions do not give enough flexibility.

       Unlike the pull parser implemented in parse_fh() or parse_file(), the
       push parser is not able to find out about the document's end itself.
       Thus the calling program needs to indicate explicitly when the parsing
       is done.

       In XML::LibXML this is done by a single function:

       parse_chunk
             $parser->parse_chunk($string, $terminate);

           parse_chunk() tries to parse a given chunk of data, which isn't
           necessarily well-balanced data. The function takes two parameters:
           the chunk of data as a string and, optionally, a termination flag.
           If the termination flag is set to a true value (e.g. 1), the
           parsing will be stopped and the resulting document will be
           returned, as the following example shows:

             my $parser = XML::LibXML->new;
             for my $string ( "<", "foo", ' bar="hello world"', "/>") {
                  $parser->parse_chunk( $string );
             }
             my $doc = $parser->parse_chunk("", 1); # terminate the parsing

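           The same pattern applies when the data arrives in blocks from a
           handle. In this sketch an in-memory filehandle stands in for a
           socket or a large file; the sample XML and the block size of 16
           bytes are arbitrary assumptions.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = '<log><entry n="1"/><entry n="2"/><entry n="3"/></log>';

# An in-memory filehandle stands in for a socket or a large file.
open my $in, '<', \$xml or die "cannot open in-memory handle: $!";

my $parser = XML::LibXML->new;

# Feed the parser fixed-size blocks; block boundaries may fall anywhere.
while (read($in, my $buffer, 16)) {
    $parser->parse_chunk($buffer);
}
close $in;

# An empty chunk with a true termination flag ends the parse
# and returns the document.
my $doc = $parser->parse_chunk('', 1);

my $count = $doc->findvalue('count(//entry)');
print "$count entries\n";
```
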
       Internally XML::LibXML provides three functions that control the push
       parser process:

       init_push
             $parser->init_push();

           Initializes the push parser.

       push
             $parser->push(@data);

           This function pushes the data stored inside the array to libxml2's
           parser. Each entry in @data must be a normal scalar! This method
           can be called repeatedly.

       finish_push
             $doc = $parser->finish_push( $recover );

           This function returns the result of the parsing process. If this
           function is called without a parameter it will complain about
           non-well-formed documents, as the following example shows:

             eval {
                 $parser->push( "<foo>", "bar" );
                 $doc = $parser->finish_push();    # will report broken XML
             };
             if ( $@ ) {
                # ...
             }

           This can be annoying if the closing tag is missed by accident. If
           $recover is 1, the push parser can be used to restore broken or
           non-well-formed (XML) documents. The following code will restore
           the document:

             eval {
                 $parser->push( "<foo>", "bar" );
                 $doc = $parser->finish_push(1);   # will return the data parsed
                                                   # unless an error happened
             };

             print $doc->toString(); # returns "<foo>bar</foo>"

           Of course finish_push() will return nothing if no data was pushed
           to the parser before.

   Pull Parser (Reader)
       XML::LibXML also provides a pull-parser interface similar to the
       XmlReader interface in .NET. This interface is almost streaming, and is
       usually faster and simpler to use than SAX. See XML::LibXML::Reader.

   Direct SAX Parser
       XML::LibXML provides a direct SAX parser in the XML::LibXML::SAX
       module.

   DOM based SAX Parser
       XML::LibXML also provides a DOM based SAX parser. The SAX parser is
       defined in the module XML::LibXML::SAX::Parser. As it is not a stream
       based parser, it parses documents into a DOM and traverses the DOM tree
       instead.

       The API of this parser is exactly the same as any other Perl SAX2
       parser. See XML::SAX::Intro for details.

       Aside from the regular parsing methods, you can access the DOM tree
       traverser directly, using the generate() method:

         my $doc = build_yourself_a_document();
         my $saxparser = XML::LibXML::SAX::Parser->new( ... );
         $saxparser->generate( $doc );

       This is useful for serializing DOM trees, for example trees that you
       have already processed or that result from XSLT processing.

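       A fuller sketch of generate() with a handler attached is given here.
       The ElementCounter package is a hypothetical handler written for this
       example; it only implements start_element and counts elements by name.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::SAX::Parser;

# Hypothetical handler: counts start_element events by element name.
package ElementCounter;

sub new { return bless { seen => {} }, shift }

sub start_element {
    my ($self, $element) = @_;
    $self->{seen}{ $element->{Name} }++;
    return;
}

package main;

my $doc = XML::LibXML->load_xml(
    string => '<list><item/><item/><note/></list>'
);

my $handler   = ElementCounter->new;
my $saxparser = XML::LibXML::SAX::Parser->new(Handler => $handler);

# Walk the existing DOM tree and send SAX events to the handler.
$saxparser->generate($doc);

for my $name (sort keys %{ $handler->{seen} }) {
    printf "%s: %d\n", $name, $handler->{seen}{$name};
}
```
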
       WARNING

       This is NOT a streaming SAX parser. As I said above, this parser reads
       the entire document into a DOM and serialises it. Some people couldn't
       read that in the paragraph above so I've added this warning. If you
       want a streaming SAX parser look at the XML::LibXML::SAX man page.

SERIALIZATION

       XML::LibXML provides some functions to serialize nodes and documents.
       The serialization functions are described on the XML::LibXML::Node
       manpage or the XML::LibXML::Document manpage. XML::LibXML checks three
       global flags that alter the serialization process:

       ·   skipXMLDeclaration

       ·   skipDTD

       ·   setTagCompression

       Of these three flags only setTagCompression is available for all
       serialization functions.

       Because XML::LibXML does not set these flags itself, one has to define
       them locally as the following example shows:

         local $XML::LibXML::skipXMLDeclaration = 1;
         local $XML::LibXML::skipDTD = 1;
         local $XML::LibXML::setTagCompression = 1;

       If skipXMLDeclaration is defined and not '0', the XML declaration is
       omitted during serialization.

       If skipDTD is defined and not '0', an existing DTD will not be
       serialized with the document.

       If setTagCompression is defined and not '0', empty tags are displayed
       as open and closing tags rather than the shortcut. For example the
       empty tag foo will be rendered as <foo></foo> rather than <foo/>.

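       To illustrate the effect of these flags, the following sketch
       serializes the same document twice, once with the defaults and once
       with two flags set locally. The sample document is an assumption for
       the example.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(string => '<root><empty/></root>');

# Default serialization.
my $plain = $doc->toString;

my $tweaked;
{
    # The flags are package variables; local limits them to this block.
    local $XML::LibXML::skipXMLDeclaration = 1;
    local $XML::LibXML::setTagCompression  = 1;
    $tweaked = $doc->toString;
}

print $plain, $tweaked, "\n";
```
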

PARSER OPTIONS

       Handling of libxml2 parser options has been unified and improved in
       XML::LibXML 1.70. You can now set default options for a particular
       parser instance by passing them to the constructor as
       "XML::LibXML->new({name=>value, ...})" or
       "XML::LibXML->new(name=>value,...)". The options can be queried and
       changed using the following methods (pre-1.70 interfaces such as
       "$parser->load_ext_dtd(0)" also exist, see below):

       option_exists
             $parser->option_exists($name);

           Returns 1 if the current XML::LibXML version supports the option
           $name, otherwise returns 0 (note that this does not necessarily
           mean that the option is supported by the underlying libxml2
           library).

       get_option
             $parser->get_option($name);

           Returns the current value of the parser option $name.

       set_option
             $parser->set_option($name,$value);

           Sets option $name to value $value.

       set_options
             $parser->set_options({$name=>$value,...});

           Sets multiple parsing options at once.

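       Taken together, the four methods might be used as in this sketch. The
       option names are those documented in this section, except
       "no_such_option", which is a made-up name used to show a negative
       result.

```perl
use strict;
use warnings;
use XML::LibXML '1.70';

# Defaults for this parser instance are set in the constructor.
my $parser = XML::LibXML->new(no_network => 1, line_numbers => 1);

my $known   = $parser->option_exists('no_blanks');       # known option
my $unknown = $parser->option_exists('no_such_option');  # made-up name
my $network = $parser->get_option('no_network');         # value set above

# Change one option, then several at once.
$parser->set_option('line_numbers', 0);
$parser->set_options({ no_blanks => 1, recover => 0 });

printf "no_blanks is now %s\n", $parser->get_option('no_blanks');
```
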
       IMPORTANT NOTE: This documentation reflects the parser flags available
       in libxml2 2.7.3. Some options have no effect if an older version of
       libxml2 is used.

       Each of the flags listed below is labeled

       /parser/
           if it can be used with an "XML::LibXML" parser object (i.e. passed
           to "XML::LibXML->new", "XML::LibXML->set_option", etc.)

       /html/
           if it can be passed to the "parse_html_*" methods

       /reader/
           if it can be used with the "XML::LibXML::Reader".

       Unless specified otherwise, the default for boolean valued options is 0
       (false).

       The available options are:

       URI
           /parser, html, reader/

           When parsing strings or file handles, XML::LibXML doesn't know
           the base URI of the document. To make relative references such as
           XIncludes work, one has to set a base URI, which is then used for
           the parsed document.

       line_numbers
           /parser, html, reader/

           If this option is activated, libxml2 will store the line number of
           each element node in the parsed document. The line number can be
           obtained using the "line_number()" method of the
           "XML::LibXML::Node" class (for non-element nodes this may report
           the line number of the containing element). The line numbers are
           also used for reporting positions of validation errors.

           IMPORTANT: Due to limitations in the libxml2 library, line numbers
           greater than 65535 will be returned as 65535. Unfortunately, this
           is a long and sad story; please see
           <http://bugzilla.gnome.org/show_bug.cgi?id=325533> for more
           details.

       encoding
           /html/

           character encoding of the input

       recover
           /parser, html, reader/

           recover from errors; possible values are 0, 1, and 2

           A true value turns on recovery mode, which allows one to parse
           broken XML or HTML data. The recovery mode allows the parser to
           return the successfully parsed portion of the input document. This
           is useful for almost well-formed documents, where for example a
           closing tag is missing somewhere. Still, XML::LibXML will only
           parse until the first fatal (non-recoverable) error occurs,
           reporting recoverable parsing errors as warnings. To suppress even
           these warnings, use recover=>2.

           Note that validation is switched off automatically in recovery
           mode.

       expand_entities
           /parser, reader/

           substitute entities; possible values are 0 and 1; default is 1

           Note that although setting this flag to 0 disables entity
           substitution, it does not prevent the parser from loading external
           entities. When substitution of an external entity is disabled, the
           entity will be represented in the document tree by an
           XML_ENTITY_REF_NODE node whose subtree will be the content obtained
           by parsing the external resource. Although this nesting is visible
           from the DOM, it is transparent to the XPath data model, so it is
           possible to match nodes in an unexpanded entity by the same XPath
           expression as if the entity were expanded.  See also
           ext_ent_handler.

       ext_ent_handler
           /parser/

           Provide a custom external entity handler to be used when
           expand_entities is set to 1. Possible value is a subroutine
           reference.

           This feature does not work properly in libxml2 < 2.6.27!

           The subroutine provided is called whenever the parser needs to
           retrieve the content of an external entity. It is called with two
           arguments: the system ID (URI) and the public ID. The value
           returned by the subroutine is parsed as the content of the entity.

           This method can be used to completely disable entity loading, e.g.
           to prevent exploits of the type described at
           (<http://searchsecuritychannel.techtarget.com/generic/0,295582,sid97_gci1304703,00.html>),
           where a service is tricked into exposing its private data by
           letting it parse a remote file (RSS feed) that contains an entity
           reference to a local file (e.g. "/etc/fstab").

           A more granular solution to this problem, however, is provided by
           custom URL resolvers, as in

             my $c = XML::LibXML::InputCallback->new();
             sub match {   # accept file:/ URIs except for XML catalogs in /etc/xml/
               my ($uri) = @_;
               return ($uri=~m{^file:/}
                       and $uri !~ m{^file:///etc/xml/})
                      ? 1 : 0;
             }
             $c->register_callbacks([ \&match, sub{}, sub{}, sub{} ]);
             $parser->input_callbacks($c);

       load_ext_dtd
           /parser, reader/

           load the external DTD subset while parsing; possible values are 0
           and 1. Unless specified, XML::LibXML sets this option to 1.

           This flag is also required for DTD validation, to provide complete
           attributes, and to expand entities, regardless of whether the
           document has an internal subset. Thus switching off external DTD
           loading will disable entity expansion, validation, and complete
           attributes on internal subsets as well.

       complete_attributes
           /parser, reader/

           create default DTD attributes; possible values are 0 and 1

       validation
           /parser, reader/

           validate with the DTD; possible values are 0 and 1

       suppress_errors
           /parser, html, reader/

           suppress error reports; possible values are 0 and 1

       suppress_warnings
           /parser, html, reader/

           suppress warning reports; possible values are 0 and 1

       pedantic_parser
           /parser, html, reader/

           pedantic error reporting; possible values are 0 and 1

       no_blanks
           /parser, html, reader/

           remove blank nodes; possible values are 0 and 1

       no_defdtd
           /html/

           do not add a default DOCTYPE; possible values are 0 and 1

           The default (0) is to add a DTD when the input HTML lacks one.

       expand_xinclude or xinclude
           /parser, reader/

           Implement XInclude substitution; possible values are 0 and 1

           Expands XInclude tags immediately while parsing the document. Note
           that the parser will use the URI resolvers installed via
           "XML::LibXML::InputCallback" to parse the included document (if
           any).

       no_xinclude_nodes
           /parser, reader/

           do not generate XINCLUDE START/END nodes; possible values are 0 and
           1

       no_network
           /parser, html, reader/

           Forbid network access; possible values are 0 and 1

           If set to true, all attempts to fetch non-local resources (such as
           DTD or external entities) will fail (unless custom callbacks are
           defined).

           It may be necessary to use the flag "recover" for processing
           documents requiring such resources while networking is off.

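           As a sketch only, a parser for input from an untrusted source
           could combine no_network with the DTD and entity options described
           above. The option set shown is an assumption about what a cautious
           configuration might look like, not a complete defense, and the
           sample feed is invented.

```perl
use strict;
use warnings;
use XML::LibXML '1.70';

# Conservative settings for untrusted input (illustrative, not exhaustive).
my $parser = XML::LibXML->new(
    no_network      => 1,   # do not fetch non-local resources
    load_ext_dtd    => 0,   # do not load the external DTD subset
    expand_entities => 0,   # leave entity references unexpanded
);

my $xml = '<feed><title>example</title></feed>';

# load_xml called as an object method clones the parser and its settings.
my $doc = eval { $parser->load_xml(string => $xml) };
die "rejected: $@" unless $doc;

my $title = $doc->findvalue('/feed/title');
print "$title\n";
```
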
       clean_namespaces
           /parser, reader/

           remove redundant namespace declarations during parsing; possible
           values are 0 and 1.

       no_cdata
           /parser, html, reader/

           merge CDATA as text nodes; possible values are 0 and 1

       no_basefix
           /parser, reader/

           do not fix up XInclude xml:base URIs; possible values are 0 and 1

       huge
           /parser, html, reader/

           relax any hardcoded limit from the parser; possible values are 0
           and 1. Unless specified, XML::LibXML sets this option to 0.

           Note: the default value for this option was changed to protect
           against denial of service through entity expansion attacks. Before
           enabling the option, ensure you have taken alternative measures to
           protect your application against this type of attack.

       gdome
           /parser/

           THIS OPTION IS EXPERIMENTAL!

           Although quite powerful, XML::LibXML's DOM implementation is
           incomplete with respect to the DOM level 2 or level 3
           specifications. XML::GDOME is based on libxml2 as well, and
           provides a rather complete DOM implementation by wrapping libgdome.
           This flag allows you to make use of XML::LibXML's full parser
           options and XML::GDOME's DOM implementation at the same time.

           To make use of this function, one has to install libgdome and
           configure XML::LibXML to use this library. For this you need to
           rebuild XML::LibXML!

           Note: this feature was not seriously tested in recent XML::LibXML
           releases.

       For compatibility with XML::LibXML versions prior to 1.70, the
       following methods are also supported for querying and setting the
       corresponding parser options (if called without arguments, the methods
       return the current value of the corresponding parser option; with an
       argument, they set the option to the given value):

         $parser->validation();
         $parser->recover();
         $parser->pedantic_parser();
         $parser->line_numbers();
         $parser->load_ext_dtd();
         $parser->complete_attributes();
         $parser->expand_xinclude();
         $parser->gdome_dom();
         $parser->clean_namespaces();
         $parser->no_network();

       The following obsolete methods trigger parser options in some special
       way:

       recover_silently
             $parser->recover_silently(1);

           If called without an argument, returns true if the current value of
           the "recover" parser option is 2 and returns false otherwise. With
           a true argument sets the "recover" parser option to 2; with a false
           argument sets the "recover" parser option to 0.

       expand_entities
             $parser->expand_entities(0);

           Get/set the "expand_entities" option. If called with a true
           argument, also turns the "load_ext_dtd" option to 1.

       keep_blanks
             $parser->keep_blanks(0);

           This is actually the opposite of the "no_blanks" parser option. If
           used without an argument, retrieves the negated value of
           "no_blanks". If used with an argument, sets "no_blanks" to the
           opposite value.

       base_uri
             $parser->base_uri( $your_base_uri );

           Get/set the "URI" option.

XML CATALOGS

       "libxml2" supports XML catalogs. Catalogs are used to map remote
       resources to their local copies. Using catalogs can speed up parsing
       if many external resources from remote addresses are loaded into the
       parsed documents (such as DTDs or XIncludes).

       Note that libxml2 has a global pool of loaded catalogs, so if you apply
       the method "load_catalog" to one parser instance, all parser instances
       will start using the catalog (in addition to other previously loaded
       catalogs).

       Note also that catalogs are not used when a custom external entity
       handler is specified. Currently it is not possible to use both types
       of resolving systems at the same time.

       load_catalog
             $parser->load_catalog( $catalog_file );

           Loads the XML catalog file $catalog_file.

             # Global external entity loader (similar to ext_ent_handler option
             # but this works really globally, also in XML::LibXSLT include etc..)

             XML::LibXML::externalEntityLoader(\&my_loader);

ERROR REPORTING

       XML::LibXML throws exceptions during parsing, validation or XPath
       processing (and on some other occasions). These errors can be caught
       by using eval blocks. The error is stored in $@. There are two
       implementations: in the old one $@ is just a message string; in the
       new one $@ is an object of the class XML::LibXML::Error. This class
       overloads the "" operator so that when printed, the object flattens to
       the usual error message.

       XML::LibXML throws errors as they occur. Overlooking this is a very
       common misunderstanding in the use of XML::LibXML. If the eval is
       omitted, XML::LibXML will always halt your script by "croaking" (see
       the Carp man page for details).

       Also note that an increasing number of functions throw errors if bad
       data is passed as arguments. If you cannot ensure that valid data is
       passed to XML::LibXML, you should eval these functions.

       Note: since version 1.59, get_last_error() is no longer available in
       XML::LibXML for thread-safety reasons.

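       A sketch of catching a parse error and handling both forms of $@ is
       shown here. The broken input is invented for the example; line() and
       message() are accessors of XML::LibXML::Error.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);
use XML::LibXML;

# Deliberately broken input: <b> is never closed.
my $doc = eval { XML::LibXML->load_xml(string => "<a>\n<b>\n</a>") };

my $report;
if (my $err = $@) {
    if (blessed($err) && $err->isa('XML::LibXML::Error')) {
        # Structured error object: individual fields are available.
        $report = sprintf 'line %d: %s', $err->line, $err->message;
    }
    else {
        # Plain message string from the old-style implementation.
        $report = "parse error: $err";
    }
}

print defined $report ? "$report\n" : "parsed without error\n";
```
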

AUTHORS

       Matt Sergeant, Christian Glahn, Petr Pajas

VERSION

       2.0132

COPYRIGHT

       2001-2007, AxKit.com Ltd.

       2002-2006, Christian Glahn.

       2006-2009, Petr Pajas.

LICENSE

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.26.3                      2017-10-28            XML::LibXML::Parser(3)