LibXML(3)             User Contributed Perl Documentation            LibXML(3)

NAME

       XML::LibXML - Perl Binding for libxml2

SYNOPSIS

         use XML::LibXML;
         my $dom = XML::LibXML->load_xml(string => <<'EOT');
         <some-xml/>
         EOT

         $Version_String = XML::LibXML::LIBXML_DOTTED_VERSION;
         $Version_ID = XML::LibXML::LIBXML_VERSION;
         $DLL_Version = XML::LibXML::LIBXML_RUNTIME_VERSION;
         $libxmlnode = XML::LibXML->import_GDOME( $node, $deep );
         $gdomenode = XML::LibXML->export_GDOME( $node, $deep );

DESCRIPTION

       This module is an interface to libxml2, providing XML and HTML parsers
       with DOM, SAX and XMLReader interfaces, a large subset of the DOM
       Layer 3 interface and an XML::XPath-like interface to the XPath API of
       libxml2. The module is split into several packages which are not
       described in this section; unless stated otherwise, you only need to
       "use XML::LibXML;" in your programs.

       For further information, please check the following documentation:

       XML::LibXML::Parser
           Parsing XML files with XML::LibXML

       XML::LibXML::DOM
           XML::LibXML Document Object Model (DOM) Implementation

       XML::LibXML::SAX
           XML::LibXML direct SAX parser

       XML::LibXML::Reader
           Reading XML with a pull-parser

       XML::LibXML::Dtd
           XML::LibXML frontend for DTD validation

       XML::LibXML::RelaxNG
           XML::LibXML frontend for RelaxNG schema validation

       XML::LibXML::Schema
           XML::LibXML frontend for W3C Schema schema validation

       XML::LibXML::XPathContext
           API for evaluating XPath expressions with enhanced support for the
           evaluation context

       XML::LibXML::InputCallback
           Implementing custom URI Resolver and input callbacks

       XML::LibXML::Common
           Common functions for XML::LibXML related Classes

       The nodes in the Document Object Model (DOM) are represented by the
       following classes (most of which "inherit" from XML::LibXML::Node):

       XML::LibXML::Document
           XML::LibXML class for DOM document nodes

       XML::LibXML::Node
           Abstract base class for XML::LibXML DOM nodes

       XML::LibXML::Element
           XML::LibXML class for DOM element nodes

       XML::LibXML::Text
           XML::LibXML class for DOM text nodes

       XML::LibXML::Comment
           XML::LibXML class for comment DOM nodes

       XML::LibXML::CDATASection
           XML::LibXML class for DOM CDATA sections

       XML::LibXML::Attr
           XML::LibXML DOM attribute class

       XML::LibXML::DocumentFragment
           XML::LibXML's DOM L2 Document Fragment implementation

       XML::LibXML::Namespace
           XML::LibXML DOM namespace nodes

       XML::LibXML::PI
           XML::LibXML DOM processing instruction nodes
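
       As an illustration of how the node classes listed above show up in
       practice, the following minimal sketch parses a small, made-up document
       and prints the Perl class of the document, its root element, one
       attribute and each child of the root. The sample XML and the variable
       names are assumptions chosen for this example only.

```perl
use strict;
use warnings;
use XML::LibXML;

# A small sample document (hypothetical content) with several node kinds.
my $doc = XML::LibXML->load_xml(string => <<'EOT');
<root id="1"><!-- note --><item>text</item><![CDATA[raw]]><?target data?></root>
EOT

my $root = $doc->documentElement;

# Collect the class of the document, the root element, its "id" attribute
# and every direct child of the root element.
my @nodes   = ($doc, $root, $root->getAttributeNode('id'), $root->childNodes);
my @classes = map { ref($_) } @nodes;

print "$_\n" for @classes;
```

       Each printed name should be one of the XML::LibXML::* classes from the
       list above, which is a quick way to see which class a given node
       belongs to before consulting the corresponding manual page.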

ENCODINGS SUPPORT IN XML::LIBXML

       Recall that since version 5.6.1, Perl distinguishes between character
       strings (internally encoded in UTF-8) and so-called binary data and,
       accordingly, applies either character or byte semantics to them. A
       scalar representing a character string is distinguished from a byte
       string by a special flag (UTF8). Please refer to perlunicode for
       details.

       XML::LibXML's API is designed to deal with many encodings of XML
       documents completely transparently, so that the application using
       XML::LibXML can be completely ignorant about the encoding of the XML
       documents it works with. On the other hand, functions like
       "XML::LibXML::Document->setEncoding" give the user control over the
       document encoding.

       To ensure the aforementioned transparency and uniformity, most
       functions of XML::LibXML that work with in-memory trees accept and
       return data as character strings (i.e. UTF-8 encoded with the UTF8 flag
       on) regardless of the original document encoding; however, the
       functions related to I/O operations (i.e. parsing and saving) operate
       with binary data (in the original document encoding) obeying the
       encoding declaration of the XML documents.

       Below we summarize basic rules and principles regarding encoding:

       1.  Do NOT apply any encoding-related PerlIO layers (":utf8" or
           ":encoding(...)") to file handles that are an input for the parser
           or an output for a serializer of (full) XML documents. This is
           because the conversion of the data to/from the internal character
           representation is provided by libxml2 itself, which must be able to
           enforce the encoding specified by the "<?xml version="1.0"
           encoding="..."?>" declaration. Here is an example to follow:

             use XML::LibXML;
             # load
             open my $fh, '<', 'file.xml' or die "Cannot open file.xml: $!";
             binmode $fh; # drop all PerlIO layers possibly created by a use open pragma
             my $doc = XML::LibXML->load_xml(IO => $fh);

             # save
             open my $out, '>', 'out.xml' or die "Cannot open out.xml: $!";
             binmode $out; # as above
             $doc->toFH($out);
             # or
             print {$out} $doc->toString();

       2.  All functions working with DOM accept and return character strings
           (UTF-8 encoded with UTF8 flag on). E.g.

             my $doc = XML::LibXML::Document->new('1.0',$some_encoding);
             my $element = $doc->createElement($name);
             $element->appendText($text);
             $xml_fragment = $element->toString(); # returns a character string
             $xml_document = $doc->toString(); # returns a byte string

           where $some_encoding is the document encoding that will be used
           when saving the document, and $name and $text contain character
           strings (UTF-8 encoded with UTF8 flag on). Note that the method
           "toString" returns XML as a character string if applied to a node
           other than the Document node, and as a byte string containing the
           appropriate

             <?xml version="1.0" encoding="..."?>

           declaration if applied to an XML::LibXML::Document.

       3.  DOM methods also accept binary strings in the original encoding of
           the document to which the node belongs (UTF-8 is assumed if the
           node is not attached to any document). Exploiting this feature is
           NOT RECOMMENDED since it is considered bad practice.

             my $doc = XML::LibXML::Document->new('1.0','iso-8859-2');
             my $text = $doc->createTextNode($some_latin2_encoded_byte_string);
             # WORKS, BUT NOT RECOMMENDED!

       NOTE: libxml2 support for many encodings is based on the iconv library.
       The actual list of supported encodings may vary from platform to
       platform. To test if your platform works correctly with your language
       encoding, build a simple document in the particular encoding and try to
       parse it with XML::LibXML to see if the parser produces any errors.
       Occasional crashes were reported on rare platforms that ship with a
       broken version of iconv.
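
       The rules above, and the platform test suggested in the NOTE, can be
       tried out with a short script. The following is only a sketch under
       stated assumptions: the element name "word", the sample text and the
       choice of iso-8859-2 are made up for illustration. It builds a document
       from character strings, serializes it to bytes in the declared
       encoding, and parses those bytes back.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample text as a character string: U+0159, "e", U+010D (assumed example data).
my $sample = "\x{159}e\x{10D}";

# Build a document that will be serialized as ISO-8859-2.
my $doc  = XML::LibXML::Document->new('1.0', 'iso-8859-2');
my $root = $doc->createElement('word');
$root->appendText($sample);
$doc->setDocumentElement($root);

# Serializing the document yields a byte string in the declared encoding.
my $bytes = $doc->toString();

# Serializing a single node yields a character string.
my $chars = $root->toString();

# Parse the bytes back; load_xml dies if the parser reports an error.
my $reparsed = eval { XML::LibXML->load_xml(string => $bytes) };
die "iso-8859-2 does not seem to work on this platform: $@" unless $reparsed;

my $text = $reparsed->documentElement->textContent;
printf "serialized document: %d bytes\n", length $bytes;
printf "round trip: %s\n", $text eq $sample ? 'ok' : 'FAILED';
```

       If the parser raises an error or the round trip fails, that suggests
       the encoding is not handled correctly by the local libxml2 and iconv
       installation.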

THREAD SUPPORT

       Since version 1.67, XML::LibXML partially supports Perl threads in
       Perl >= 5.8.8. XML::LibXML can be used with threads in two ways:

       By default, all XML::LibXML classes use the CLONE_SKIP class method to
       prevent Perl from copying XML::LibXML::* objects when a new thread is
       spawned. In this mode, all XML::LibXML::* objects are thread specific.
       This is the safest way to work with XML::LibXML in threads.

       Alternatively, one may use

         use threads;
         use XML::LibXML qw(:threads_shared);

       to indicate that all XML::LibXML node and parser objects should be
       shared between the main thread and any thread spawned from it. For
       example, in

         my $doc = XML::LibXML->load_xml(location => $filename);
         my $thr = threads->new(sub{
           # code working with $doc
           1;
         });
         $thr->join;

       the variable $doc refers to the exact same XML::LibXML::Document in the
       spawned thread as in the main thread.

       Without using mutex locks, parallel threads may read the same document
       (i.e. any node that belongs to the document), parse files, and modify
       different documents.

       However, if there is a chance that some of the threads will attempt to
       modify a document (or even create new nodes based on that document,
       e.g. with "$doc->createElement") that other threads may be reading at
       the same time, the user is responsible for creating a mutex lock and
       using it both in the thread that modifies and in the thread that
       reads:

         my $doc = XML::LibXML->load_xml(location => $filename);
         my $mutex : shared;
         my $thr = threads->new(sub{
           lock $mutex;
           my $el = $doc->createElement('foo');
           # ...
           1;
         });
         {
           lock $mutex;
           my $root = $doc->documentElement;
           print $root->nodeName, "\n";
         }
         $thr->join;

       Note that libxml2 uses dictionaries to store short strings and these
       dictionaries are kept on a document node. Without mutex locks, it could
       happen in the previous example that the thread modifies the dictionary
       while other threads attempt to read from it, which could easily lead to
       a crash.

VERSION INFORMATION

       Sometimes it is useful to find out which version of libxml2
       XML::LibXML was compiled against. In most cases this is for debugging
       or to check whether a given installation provides all the
       functionality the package needs. The functions
       XML::LibXML::LIBXML_DOTTED_VERSION and XML::LibXML::LIBXML_VERSION
       provide this version information. Both functions simply pass through
       the values of the similarly named macros of libxml2. Similarly,
       XML::LibXML::LIBXML_RUNTIME_VERSION returns the version of the
       (usually dynamically) linked libxml2.

       XML::LibXML::LIBXML_DOTTED_VERSION
             $Version_String = XML::LibXML::LIBXML_DOTTED_VERSION;

           Returns the version string of the libxml2 version XML::LibXML was
           compiled for. This will be "2.6.2" for "libxml2 2.6.2".

       XML::LibXML::LIBXML_VERSION
             $Version_ID = XML::LibXML::LIBXML_VERSION;

           Returns the version id of the libxml2 version XML::LibXML was
           compiled for. This will be "20602" for "libxml2 2.6.2". Do not
           confuse this version id with $XML::LibXML::VERSION. The latter
           contains the version of XML::LibXML itself, while the former
           contains the version of libxml2 XML::LibXML was compiled for.

       XML::LibXML::LIBXML_RUNTIME_VERSION
             $DLL_Version = XML::LibXML::LIBXML_RUNTIME_VERSION;

           Returns a version string of the libxml2 which is (usually
           dynamically) linked by XML::LibXML. This will be "20602" for
           libxml2 released as "2.6.2" and something like "20602-CVS2032" for
           a CVS build of libxml2.

           XML::LibXML issues a warning if the version of libxml2 dynamically
           linked to it is less than the version of libxml2 which it was
           compiled against.
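
       A typical use of these functions is a start-up check. The sketch below
       prints all three values and compares the runtime version against a
       minimum; the threshold 20600 and the variable names are assumptions
       made up for this example, not requirements of XML::LibXML.

```perl
use strict;
use warnings;
use XML::LibXML;

my $compiled = XML::LibXML::LIBXML_VERSION();          # numeric id, e.g. 20602
my $dotted   = XML::LibXML::LIBXML_DOTTED_VERSION();   # dotted form, e.g. "2.6.2"
my $runtime  = XML::LibXML::LIBXML_RUNTIME_VERSION();  # may carry a suffix

# Keep only the leading digits of the runtime version so that it can be
# compared numerically even if a suffix such as "-CVS2032" is present.
my ($runtime_id) = $runtime =~ /^(\d+)/;

printf "XML::LibXML %s, compiled against libxml2 %s (%s), running with %s\n",
    $XML::LibXML::VERSION, $dotted, $compiled, $runtime;

# Hypothetical minimum version required by the application.
my $min_version = 20600;
warn "libxml2 runtime is older than $min_version\n"
    if defined $runtime_id && $runtime_id < $min_version;
```

       Comparing the numeric id rather than the dotted string avoids
       string-comparison pitfalls such as "2.10.0" sorting before "2.9.0".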

EXPORTS

       By default the module exports all constants and functions listed in the
       :all tag, described below.

EXPORT TAGS

       ":all"
           Includes the tags ":libxml", ":encoding", and ":ns" described
           below.

       ":libxml"
           Exports integer constants for DOM node types.

             XML_ELEMENT_NODE            => 1
             XML_ATTRIBUTE_NODE          => 2
             XML_TEXT_NODE               => 3
             XML_CDATA_SECTION_NODE      => 4
             XML_ENTITY_REF_NODE         => 5
             XML_ENTITY_NODE             => 6
             XML_PI_NODE                 => 7
             XML_COMMENT_NODE            => 8
             XML_DOCUMENT_NODE           => 9
             XML_DOCUMENT_TYPE_NODE      => 10
             XML_DOCUMENT_FRAG_NODE      => 11
             XML_NOTATION_NODE           => 12
             XML_HTML_DOCUMENT_NODE      => 13
             XML_DTD_NODE                => 14
             XML_ELEMENT_DECL            => 15
             XML_ATTRIBUTE_DECL          => 16
             XML_ENTITY_DECL             => 17
             XML_NAMESPACE_DECL          => 18
             XML_XINCLUDE_START          => 19
             XML_XINCLUDE_END            => 20

       ":encoding"
           Exports two encoding conversion functions from XML::LibXML::Common.

             encodeToUTF8()
             decodeFromUTF8()

       ":ns"
           Exports two convenience constants: the implicit namespace of the
           reserved "xml:" prefix, and the implicit namespace for the reserved
           "xmlns:" prefix.

             XML_XML_NS    => 'http://www.w3.org/XML/1998/namespace'
             XML_XMLNS_NS  => 'http://www.w3.org/2000/xmlns/'
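
       The following sketch shows one way the exported names might be used
       together. The sample document and the variable names are assumptions
       for illustration; the two conversion functions are called with the
       encoding name first and the string second, as documented in
       XML::LibXML::Common.

```perl
use strict;
use warnings;
use XML::LibXML qw(:libxml :ns :encoding);

# Hypothetical sample document with a text, a comment and an element child.
my $doc  = XML::LibXML->load_xml(string => '<a xml:lang="en">hi<!-- c --><b/></a>');
my $root = $doc->documentElement;

# :libxml - classify child nodes by comparing nodeType with the constants.
my %count = (element => 0, text => 0, comment => 0);
for my $child ($root->childNodes) {
    my $type = $child->nodeType;
    if    ($type == XML_ELEMENT_NODE()) { $count{element}++ }
    elsif ($type == XML_TEXT_NODE())    { $count{text}++ }
    elsif ($type == XML_COMMENT_NODE()) { $count{comment}++ }
}

# :ns - read the xml:lang attribute through the namespace constant.
my $lang = $root->getAttributeNS(XML_XML_NS(), 'lang');

# :encoding - convert a Latin-1 byte string to a character string and back.
my $latin1 = "caf\xE9";
my $chars  = encodeToUTF8('iso-8859-1', $latin1);
my $back   = decodeFromUTF8('iso-8859-1', $chars);

print "elements=$count{element} text=$count{text} comments=$count{comment}\n";
print "xml:lang=$lang\n";
print "encoding round trip ", ($back eq $latin1 ? "ok" : "FAILED"), "\n";
```

       The constants are called with explicit parentheses here only to keep
       the parsing unambiguous under "use strict"; the bare names work as
       well once they have been imported.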

RELATED MODULES

       The modules described in this section are not part of the XML::LibXML
       package itself. As they support some additional features, they are
       mentioned here.

       XML::LibXSLT
           XSLT 1.0 Processor using libxslt and XML::LibXML

       XML::LibXML::Iterator
           XML::LibXML Implementation of the DOM Traversal Specification

       XML::CompactTree::XS
           Uses XML::LibXML::Reader to parse an XML document or element very
           efficiently into native Perl data structures, which are less
           flexible but significantly faster to process than DOM.

XML::LIBXML AND XML::GDOME

       Note: THE FUNCTIONS DESCRIBED HERE ARE STILL EXPERIMENTAL

       Although both modules make use of libxml2's XML capabilities, the DOM
       implementations of the two modules are not compatible. It is still
       possible to exchange nodes from one DOM to the other. The concept of
       this exchange is similar to the function cloneNode(): the particular
       node is copied at the low level to the opposite DOM implementation.

       Since the DOM implementations cannot coexist within one document, one
       is forced to copy each node that should be used. Because you are always
       keeping two nodes, this may have a considerable impact on a machine's
       memory usage.

       XML::LibXML provides two functions to export or import GDOME nodes:
       import_GDOME() and export_GDOME(). Both functions take two parameters:
       the node and a flag for recursive import. The flag works as in
       cloneNode().

       The two functions allow one to export and import XML::GDOME nodes
       explicitly. However, XML::LibXML also allows the transparent import of
       XML::GDOME nodes in functions such as appendChild(), insertAfter() and
       so on. While native nodes are automatically adopted in most functions,
       XML::GDOME nodes are always cloned in advance. Thus, if the original
       node is modified after the operation, the node in the XML::LibXML
       document will not reflect the change.

       import_GDOME
             $libxmlnode = XML::LibXML->import_GDOME( $node, $deep );

           This clones an XML::GDOME node to an XML::LibXML node explicitly.

       export_GDOME
             $gdomenode = XML::LibXML->export_GDOME( $node, $deep );

           Allows one to clone an XML::LibXML node into an XML::GDOME node.

CONTACTS

       For bug reports, please use the CPAN request tracker on
       http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-LibXML

       For suggestions and other issues related to XML::LibXML you may use
       the perl XML mailing list ("perl-xml@listserv.ActiveState.com"), where
       most XML-related Perl modules are discussed. In case of problems you
       should check the archives of that list first, since many problems have
       already been discussed there. You can find the list's archives and
       subscription options at
       <http://aspn.activestate.com/ASPN/Mail/Browse/Threaded/perl-xml>.

AUTHORS

       Matt Sergeant, Christian Glahn, Petr Pajas

VERSION

       2.0134

COPYRIGHT

       2001-2007, AxKit.com Ltd.

       2002-2006, Christian Glahn.

       2006-2009, Petr Pajas.

LICENSE

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.28.1                      2019-02-10                         LibXML(3)