LibXML(3)             User Contributed Perl Documentation            LibXML(3)

NAME

       XML::LibXML - Perl Binding for libxml2

SYNOPSIS

         use XML::LibXML;
         my $dom = XML::LibXML->load_xml(string => <<'EOT');
         <some-xml/>
         EOT

         $Version_String = XML::LibXML::LIBXML_DOTTED_VERSION;
         $Version_ID = XML::LibXML::LIBXML_VERSION;
         $DLL_Version = XML::LibXML::LIBXML_RUNTIME_VERSION;
         $libxmlnode = XML::LibXML->import_GDOME( $node, $deep );
         $gdomenode = XML::LibXML->export_GDOME( $node, $deep );

DESCRIPTION

       This module is an interface to libxml2, providing XML and HTML parsers
       with DOM, SAX and XMLReader interfaces, a large subset of the DOM Layer
       3 interface, and an XML::XPath-like interface to the XPath API of
       libxml2. The module is split into several packages which are not
       described in this section; unless stated otherwise, you only need to
       "use XML::LibXML;" in your programs.

       For further information, please check the following documentation:

       XML::LibXML::Parser
           Parsing XML files with XML::LibXML

       XML::LibXML::DOM
           XML::LibXML Document Object Model (DOM) Implementation

       XML::LibXML::SAX
           XML::LibXML direct SAX parser

       XML::LibXML::Reader
           Reading XML with a pull-parser

       XML::LibXML::Dtd
           XML::LibXML frontend for DTD validation

       XML::LibXML::RelaxNG
           XML::LibXML frontend for RelaxNG schema validation

       XML::LibXML::Schema
           XML::LibXML frontend for W3C XML Schema validation

       XML::LibXML::XPathContext
           API for evaluating XPath expressions with enhanced support for the
           evaluation context

       XML::LibXML::InputCallback
           Implementing custom URI Resolver and input callbacks

       XML::LibXML::Common
           Common functions for XML::LibXML related Classes

       The nodes in the Document Object Model (DOM) are represented by the
       following classes (most of which "inherit" from XML::LibXML::Node):

       XML::LibXML::Document
           XML::LibXML class for DOM document nodes

       XML::LibXML::Node
           Abstract base class for XML::LibXML DOM nodes

       XML::LibXML::Element
           XML::LibXML class for DOM element nodes

       XML::LibXML::Text
           XML::LibXML class for DOM text nodes

       XML::LibXML::Comment
           XML::LibXML class for comment DOM nodes

       XML::LibXML::CDATASection
           XML::LibXML class for DOM CDATA sections

       XML::LibXML::Attr
           XML::LibXML DOM attribute class

       XML::LibXML::DocumentFragment
           XML::LibXML's DOM L2 Document Fragment implementation

       XML::LibXML::Namespace
           XML::LibXML DOM namespace nodes

       XML::LibXML::PI
           XML::LibXML DOM processing instruction nodes

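       As an illustration of the node classes listed above, the following
       minimal sketch parses a small document and prints the class of a few
       of its nodes. The sample XML and the variable names are invented for
       this example; it assumes only that XML::LibXML is installed.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample document, invented for this illustration.
my $doc = XML::LibXML->load_xml(string => <<'EOT');
<root id="r1"><!-- note -->hello<![CDATA[raw]]></root>
EOT

my $root = $doc->documentElement;

# The document, its root element, the root's attribute, then each child node.
my @nodes = ($doc, $root, $root->getAttributeNode('id'), $root->childNodes);

for my $node (@nodes) {
    # Each node in this sample is expected to derive from XML::LibXML::Node.
    printf "%-28s isa Node: %s\n", ref($node),
        $node->isa('XML::LibXML::Node') ? 'yes' : 'no';
}
```

       Checking ref() like this is mainly useful for exploration; code that
       needs to branch on the kind of node would more typically compare
       nodeType against the exported node type constants.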

ENCODINGS SUPPORT IN XML::LIBXML

       Recall that since version 5.6.1, Perl distinguishes between character
       strings (internally encoded in UTF-8) and so-called binary data and,
       accordingly, applies either character or byte semantics to them. A
       scalar representing a character string is distinguished from a byte
       string by a special flag (UTF8).  Please refer to perlunicode for
       details.

       XML::LibXML's API is designed to deal with many encodings of XML
       documents completely transparently, so that the application using
       XML::LibXML can be completely ignorant about the encoding of the XML
       documents it works with. On the other hand, functions like
       "XML::LibXML::Document->setEncoding" give the user control over the
       document encoding.

       To ensure the aforementioned transparency and uniformity, most
       functions of XML::LibXML that work with in-memory trees accept and
       return data as character strings (i.e. UTF-8 encoded with the UTF8 flag
       on) regardless of the original document encoding; however, the
       functions related to I/O operations (i.e. parsing and saving) operate
       with binary data (in the original document encoding) obeying the
       encoding declaration of the XML documents.

       Below we summarize basic rules and principles regarding encoding:

       1.  Do NOT apply any encoding-related PerlIO layers (":utf8" or
           ":encoding(...)") to file handles that are an input for the parser
           or an output for a serializer of (full) XML documents. This is
           because the conversion of the data to/from the internal character
           representation is provided by libxml2 itself, which must be able to
           enforce the encoding specified by the "<?xml version="1.0"
           encoding="..."?>" declaration. Here is an example to follow:

             use XML::LibXML;
             open my $fh, '<', 'file.xml' or die "file.xml: $!";
             binmode $fh; # drop all PerlIO layers possibly created by a use open pragma
             my $doc = XML::LibXML->load_xml(IO => $fh);
             open my $out, '>', 'out.xml' or die "out.xml: $!";
             binmode $out; # as above
             $doc->toFH($out);
             # or
             print {$out} $doc->toString();

       2.  All functions working with DOM accept and return character strings
           (UTF-8 encoded with UTF8 flag on). E.g.

             my $doc = XML::LibXML::Document->new('1.0',$some_encoding);
             my $element = $doc->createElement($name);
             $element->appendText($text);
             $xml_fragment = $element->toString(); # returns a character string
             $xml_document = $doc->toString(); # returns a byte string

           where $some_encoding is the document encoding that will be used
           when saving the document, and $name and $text contain character
           strings (UTF-8 encoded with UTF8 flag on). Note that the method
           "toString" returns XML as a character string if applied to a node
           other than the Document node, and a byte string containing the
           appropriate

             <?xml version="1.0" encoding="..."?>

           declaration if applied to an XML::LibXML::Document.

       3.  DOM methods also accept binary strings in the original encoding of
           the document to which the node belongs (UTF-8 is assumed if the
           node is not attached to any document). Exploiting this feature is
           NOT RECOMMENDED since it is considered a bad practice.

             my $doc = XML::LibXML::Document->new('1.0','iso-8859-2');
             my $text = $doc->createTextNode($some_latin2_encoded_byte_string);
             # WORKS, BUT NOT RECOMMENDED!

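       The difference between the character strings of rule 2 and the byte
       strings produced by serializing a whole document can be observed with
       a short sketch such as the one below. The sample text and variable
       names are invented for this illustration, and the Encode module is
       used here only to turn the serialized bytes back into characters.

```perl
use strict;
use warnings;
use Encode qw(decode);
use XML::LibXML;

# "\x{263A}" is above U+00FF, so Perl stores $text as a character string.
my $text = "caf\x{e9} \x{263A}";

my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $doc->createElement('p');
$doc->setDocumentElement($root);
$root->appendText($text);

my $fragment = $root->toString();  # per rule 2: a character string
my $bytes    = $doc->toString();   # per rule 2: bytes in the document encoding

# Decode the serialized document before treating it as text again.
my $decoded = decode('UTF-8', $bytes);

printf "fragment: %d characters\n", length $fragment;
printf "document: %d bytes, %d characters after decoding\n",
    length $bytes, length $decoded;
```

       Because the sample text contains multi-byte characters, the byte
       length of the serialized document should exceed its character length
       after decoding.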
       NOTE: libxml2 support for many encodings is based on the iconv library.
       The actual list of supported encodings may vary from platform to
       platform. To test if your platform works correctly with your language
       encoding, build a simple document in the particular encoding and try to
       parse it with XML::LibXML to see if the parser produces any errors.
       Occasional crashes were reported on rare platforms that ship with a
       broken version of iconv.

THREAD SUPPORT

       XML::LibXML since 1.67 partially supports Perl threads in Perl >=
       5.8.8.  XML::LibXML can be used with threads in two ways:

       By default, all XML::LibXML classes use the CLONE_SKIP class method to
       prevent Perl from copying XML::LibXML::* objects when a new thread is
       spawned. In this mode, all XML::LibXML::* objects are thread specific.
       This is the safest way to work with XML::LibXML in threads.

       Alternatively, one may use

         use threads;
         use XML::LibXML qw(:threads_shared);

       to indicate that all XML::LibXML node and parser objects should be
       shared between the main thread and any thread spawned from there. For
       example, in

         my $doc = XML::LibXML->load_xml(location => $filename);
         my $thr = threads->new(sub{
           # code working with $doc
           1;
         });
         $thr->join;

       the variable $doc refers to the exact same XML::LibXML::Document in the
       spawned thread as in the main thread.

       Without using mutex locks, parallel threads may read the same document
       (i.e. any node that belongs to the document), parse files, and modify
       different documents.

       However, if there is a chance that some of the threads will attempt to
       modify a document (or even create new nodes based on that document,
       e.g. with "$doc->createElement") that other threads may be reading at
       the same time, the user is responsible for creating a mutex lock and
       using it both in the thread that modifies and in the thread that reads:

         use threads::shared; # needed for the ": shared" attribute
         my $doc = XML::LibXML->load_xml(location => $filename);
         my $mutex : shared;
         my $thr = threads->new(sub{
           lock $mutex;
           my $el = $doc->createElement('foo');
           # ...
           1;
         });
         {
           lock $mutex;
           my $root = $doc->documentElement;
           print $root->nodeName, "\n";
         }
         $thr->join;

       Note that libxml2 uses dictionaries to store short strings and these
       dictionaries are kept on a document node. Without mutex locks, it could
       happen in the previous example that the thread modifies the dictionary
       while other threads attempt to read from it, which could easily lead to
       a crash.

VERSION INFORMATION

       Sometimes it is useful to find out which version of libxml2
       XML::LibXML was compiled against. In most cases this is for debugging
       or to check whether a given installation provides all the
       functionality the package needs. The functions
       XML::LibXML::LIBXML_DOTTED_VERSION and XML::LibXML::LIBXML_VERSION
       provide this version information. Both functions simply pass through
       the values of the similarly named macros of libxml2. Similarly,
       XML::LibXML::LIBXML_RUNTIME_VERSION returns the version of the
       (usually dynamically) linked libxml2.

       XML::LibXML::LIBXML_DOTTED_VERSION
             $Version_String = XML::LibXML::LIBXML_DOTTED_VERSION;

           Returns the version string of the libxml2 version XML::LibXML was
           compiled against.  This will be "2.6.2" for "libxml2 2.6.2".

       XML::LibXML::LIBXML_VERSION
             $Version_ID = XML::LibXML::LIBXML_VERSION;

           Returns the version id of the libxml2 version XML::LibXML was
           compiled against.  This will be "20602" for "libxml2 2.6.2". Do
           not confuse this version id with $XML::LibXML::VERSION. The latter
           contains the version of XML::LibXML itself, while the former
           contains the version of libxml2 XML::LibXML was compiled against.

       XML::LibXML::LIBXML_RUNTIME_VERSION
             $DLL_Version = XML::LibXML::LIBXML_RUNTIME_VERSION;

           Returns a version string of the libxml2 which is (usually
           dynamically) linked by XML::LibXML. This will be "20602" for
           libxml2 released as "2.6.2" and something like "20602-CVS2032" for
           a CVS build of libxml2.

           XML::LibXML issues a warning if the version of libxml2 dynamically
           linked to it is less than the version of libxml2 which it was
           compiled against.

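       For debugging, the three version functions can be combined into a
       small report like the sketch below. The variable names and the message
       wording are invented for this example; the regular expression that
       strips a possible suffix from the runtime version is likewise only an
       assumption based on the "20602-CVS2032" form mentioned above.

```perl
use strict;
use warnings;
use XML::LibXML;

my $compiled = XML::LibXML::LIBXML_VERSION();          # numeric id
my $dotted   = XML::LibXML::LIBXML_DOTTED_VERSION();   # dotted string
my $runtime  = XML::LibXML::LIBXML_RUNTIME_VERSION();  # may carry a suffix

# Keep only the leading digits so a suffixed build still compares numerically.
my ($runtime_id) = $runtime =~ /^(\d+)/;

print "XML::LibXML $XML::LibXML::VERSION\n";
print "compiled against libxml2 $dotted ($compiled)\n";
print "running with libxml2 $runtime\n";

if (defined $runtime_id && $runtime_id < $compiled) {
    print "note: the runtime libxml2 is older than the compile-time one\n";
}
```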

EXPORTS

       By default the module exports all constants and functions listed in the
       :all tag, described below.

EXPORT TAGS

       ":all"
           Includes the tags ":libxml", ":encoding", and ":ns" described
           below.

       ":libxml"
           Exports integer constants for DOM node types.

             XML_ELEMENT_NODE            => 1
             XML_ATTRIBUTE_NODE          => 2
             XML_TEXT_NODE               => 3
             XML_CDATA_SECTION_NODE      => 4
             XML_ENTITY_REF_NODE         => 5
             XML_ENTITY_NODE             => 6
             XML_PI_NODE                 => 7
             XML_COMMENT_NODE            => 8
             XML_DOCUMENT_NODE           => 9
             XML_DOCUMENT_TYPE_NODE      => 10
             XML_DOCUMENT_FRAG_NODE      => 11
             XML_NOTATION_NODE           => 12
             XML_HTML_DOCUMENT_NODE      => 13
             XML_DTD_NODE                => 14
             XML_ELEMENT_DECL            => 15
             XML_ATTRIBUTE_DECL          => 16
             XML_ENTITY_DECL             => 17
             XML_NAMESPACE_DECL          => 18
             XML_XINCLUDE_START          => 19
             XML_XINCLUDE_END            => 20

       ":encoding"
           Exports two encoding conversion functions from XML::LibXML::Common.

             encodeToUTF8()
             decodeFromUTF8()

       ":ns"
           Exports two convenience constants: the implicit namespace of the
           reserved "xml:" prefix, and the implicit namespace for the reserved
           "xmlns:" prefix.

             XML_XML_NS    => 'http://www.w3.org/XML/1998/namespace'
             XML_XMLNS_NS  => 'http://www.w3.org/2000/xmlns/'

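       A minimal sketch of how the exported constants might be used follows.
       It imports the ":libxml" and ":ns" tags explicitly, filters child
       nodes by node type, and reads an attribute in the namespace of the
       "xml:" prefix. The sample document and variable names are invented
       for this illustration.

```perl
use strict;
use warnings;
use XML::LibXML qw(:libxml :ns);

# Sample document, invented for this illustration.
my $doc = XML::LibXML->load_xml(string => <<'EOT');
<list xml:lang="en"><item>one</item> and <item>two</item><!-- done --></list>
EOT

my $root = $doc->documentElement;

# Keep only the element children by comparing against the exported constant.
my @elements = grep { $_->nodeType == XML_ELEMENT_NODE } $root->childNodes;
print "element children: ", scalar(@elements), "\n";

# The xml: prefix is bound implicitly, so XML_XML_NS can be used directly.
my $lang = $root->getAttributeNS(XML_XML_NS, 'lang');
print "xml:lang = $lang\n";
```

       Comparing against the named constants rather than the bare integers
       keeps such filters readable.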

RELATED MODULES

       The modules described in this section are not part of the XML::LibXML
       package itself. As they support some additional features, they are
       mentioned here.

       XML::LibXSLT
           XSLT 1.0 Processor using libxslt and XML::LibXML

       XML::LibXML::Iterator
           XML::LibXML Implementation of the DOM Traversal Specification

       XML::CompactTree::XS
           Uses XML::LibXML::Reader to parse an XML document or element very
           efficiently into native Perl data structures, which are less
           flexible but significantly faster to process than DOM.

XML::LIBXML AND XML::GDOME

       Note: THE FUNCTIONS DESCRIBED HERE ARE STILL EXPERIMENTAL

       Although both modules make use of libxml2's XML capabilities, their
       DOM implementations are not compatible. It is still possible, however,
       to exchange nodes from one DOM to the other. The concept of this
       exchange is similar to the function cloneNode(): the particular node
       is copied at a low level to the opposite DOM implementation.

       Since the DOM implementations cannot coexist within one document, one
       is forced to copy each node that should be used. Because you are
       always keeping two copies of each node, this may have a considerable
       impact on a machine's memory usage.

       XML::LibXML provides two functions to export or import GDOME nodes:
       import_GDOME() and export_GDOME(). Both functions take two parameters:
       the node and a flag for recursive import. The flag works as in
       cloneNode().

       The two functions allow you to export and import XML::GDOME nodes
       explicitly; however, XML::LibXML also allows the transparent import of
       XML::GDOME nodes in functions such as appendChild(), insertAfter() and
       so on. While native nodes are automatically adopted in most functions,
       XML::GDOME nodes are always cloned in advance. Thus, if the original
       node is modified after the operation, the node in the XML::LibXML
       document will not reflect the change.

       import_GDOME
             $libxmlnode = XML::LibXML->import_GDOME( $node, $deep );

           This clones an XML::GDOME node to an XML::LibXML node explicitly.

       export_GDOME
             $gdomenode = XML::LibXML->export_GDOME( $node, $deep );

           Clones an XML::LibXML node into an XML::GDOME node.

CONTACTS

       For bug reports, please use the CPAN request tracker at
       http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-LibXML

       For suggestions and other issues related to XML::LibXML, you may use
       the perl XML mailing list ("perl-xml@listserv.ActiveState.com"), where
       most XML-related Perl modules are discussed. In case of problems you
       should check the archives of that list first, since many problems have
       already been discussed there. You can find the list's archives and
       subscription options at
       http://aspn.activestate.com/ASPN/Mail/Browse/Threaded/perl-xml.

AUTHORS

       Matt Sergeant, Christian Glahn, Petr Pajas

VERSION

       1.70

COPYRIGHT

       2001-2007, AxKit.com Ltd.

       2002-2006, Christian Glahn.

       2006-2009, Petr Pajas.

perl v5.12.0                      2009-10-07                         LibXML(3)