XML::LibXML::Document(3)  User Contributed Perl Documentation  XML::LibXML::Document(3)

NAME

       XML::LibXML::Document - XML::LibXML DOM Document Class

SYNOPSIS

         use XML::LibXML;
         # Only methods specific to Document nodes are listed here,
         # see XML::LibXML::Node manpage for other methods

         $dom = XML::LibXML::Document->new( $version, $encoding );
         $dom = XML::LibXML::Document->createDocument( $version, $encoding );
         $strURI = $doc->URI();
         $doc->setURI($strURI);
         $strEncoding = $doc->encoding();
         $strEncoding = $doc->actualEncoding();
         $doc->setEncoding($new_encoding);
         $strVersion = $doc->version();
         $doc->standalone;
         $doc->setStandalone($numvalue);
         my $compression = $doc->compression;
         $doc->setCompression($ziplevel);
         $docstring = $dom->toString($format);
         $c14nstr = $doc->toStringC14N($comment_flag, $xpath [, $xpath_context ]);
         $ec14nstr = $doc->toStringEC14N($comment_flag, $xpath [, $xpath_context ], $inclusive_prefix_list);
         $str = $doc->serialize($format);
         $state = $doc->toFile($filename, $format);
         $state = $doc->toFH($fh, $format);
         $str = $document->toStringHTML();
         $str = $document->serialize_html();
         $bool = $dom->is_valid();
         $dom->validate();
         $root = $dom->documentElement();
         $dom->setDocumentElement( $root );
         $element = $dom->createElement( $nodename );
         $element = $dom->createElementNS( $namespaceURI, $qname );
         $text = $dom->createTextNode( $content_text );
         $comment = $dom->createComment( $comment_text );
         $attrnode = $doc->createAttribute($name [,$value]);
         $attrnode = $doc->createAttributeNS( $namespaceURI, $name [,$value] );
         $fragment = $doc->createDocumentFragment();
         $cdata = $dom->createCDATASection( $cdata_content );
         my $pi = $doc->createProcessingInstruction( $target, $data );
         my $entref = $doc->createEntityReference($refname);
         $dtd = $document->createInternalSubset( $rootnode, $public, $system);
         $dtd = $document->createExternalSubset( $rootnode_name, $publicId, $systemId);
         $document->importNode( $node );
         $document->adoptNode( $node );
         my $dtd = $doc->externalSubset;
         my $dtd = $doc->internalSubset;
         $doc->setExternalSubset($dtd);
         $doc->setInternalSubset($dtd);
         my $dtd = $doc->removeExternalSubset();
         my $dtd = $doc->removeInternalSubset();
         my @nodelist = $doc->getElementsByTagName($tagname);
         my @nodelist = $doc->getElementsByTagNameNS($nsURI,$tagname);
         my @nodelist = $doc->getElementsByLocalName($localname);
         my $node = $doc->getElementById($id);
         $dom->indexElements();

DESCRIPTION

       The Document Class is in most cases the result of a parsing process.
       But sometimes it is necessary to create a Document from scratch. The
       DOM Document Class provides functions that conform to the DOM Core
       naming style.

       It inherits all functions from XML::LibXML::Node as specified in the
       DOM specification. This gives access to document-level nodes other
       than the root element, such as a "DTD". Support for these nodes is
       currently limited.

       Although the DOM concept generally binds every node to a document,
       XML::LibXML also lets you create unbound nodes (e.g. with
       $CLASS->new()). It is suggested that you always create nodes through
       a document instead. The node need not actually be inserted into the
       document tree, but once it is bound to a document, its strings can be
       relied on to have the correct encoding. If an unbound text node is
       created from a string in a legacy encoding such as ISO-8859-1, the
       "toString" function may not return the expected result.

       To prevent such problems, it is recommended to pass all data to
       XML::LibXML methods as character strings (i.e. UTF-8 encoded, with the
       UTF8 flag on).

METHODS

       Many functions listed here are documented in detail in the DOM Level
       3 specification (<http://www.w3.org/TR/DOM-Level-3-Core/>). Please
       refer to the specification for the full description.
90
       new
             $dom = XML::LibXML::Document->new( $version, $encoding );

           Alias for createDocument().

       createDocument
             $dom = XML::LibXML::Document->createDocument( $version, $encoding );

           The constructor for the document class. It takes the version
           string and the encoding string as parameters. Calling
           createDocument() with both creates a document with the declaration:

             <?xml version="your version" encoding="your encoding"?>

           Both parameters are optional. The default value for $version is
           "1.0". If the $encoding parameter is not set, the encoding is left
           unset, which means UTF-8 is implied.

           Calling createDocument() without any parameters results in the
           following declaration:

             <?xml version="1.0"?>

           Alternatively, one can call this constructor directly from the
           XML::LibXML class level to save some typing. This has no effect on
           the class of the returned instance, which is always
           XML::LibXML::Document.

             my $document = XML::LibXML->createDocument( "1.0", "UTF-8" );

           is therefore a shortcut for

             my $document = XML::LibXML::Document->createDocument( "1.0", "UTF-8" );
124
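           As a minimal sketch of building a document from scratch, the
           following creates a document, gives it a root element and
           serializes it. The element name "root" is only an illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Create an empty document with an explicit version and encoding.
my $doc = XML::LibXML::Document->createDocument( '1.0', 'UTF-8' );

# "root" is a hypothetical element name chosen for this example.
my $root = $doc->createElement('root');
$doc->setDocumentElement($root);

# On a document node, toString returns the XML declaration plus the tree.
my $xml = $doc->toString;
print $xml;
```
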
125       URI
126             $strURI = $doc->URI();
127
           Returns the URI (or filename) of the original document. For
           documents obtained by parsing a string or a filehandle without
           using the URI parsing argument of the corresponding "parse_*"
           function, the result is a generated string unknown-XYZ, where XYZ
           is some number; for documents created with the constructor "new",
           the URI is undefined.
134
135           The value can be modified by calling "setURI" method on the
136           document node.
137
138       setURI
139             $doc->setURI($strURI);
140
141           Sets the URI of the document reported by the method URI (see also
142           the URI argument to the various "parse_*" functions).
143
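           The interplay of URI and setURI might look as follows. This is
           only a sketch; the URL is a made-up example value.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );

# A document created with new() has no URI yet.
my $before = $doc->URI;

# Hypothetical URI, used for illustration only.
$doc->setURI('http://example.org/sample.xml');
my $after = $doc->URI;

print defined $before ? "before: $before\n" : "before: (undefined)\n";
print "after: $after\n";
```
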
144       encoding
145             $strEncoding = $doc->encoding();
146
147           returns the encoding string of the document.
148
149             my $doc = XML::LibXML->createDocument( "1.0", "ISO-8859-15" );
150             print $doc->encoding; # prints ISO-8859-15
151
152       actualEncoding
153             $strEncoding = $doc->actualEncoding();
154
           Returns the encoding in which the XML will be returned by
           $doc->toString(). This is usually the original encoding of the
           document as declared in the XML declaration and returned by
           $doc->encoding. If the original encoding is not known (e.g. if
           the document was created in memory or parsed from XML without a
           declared encoding), 'UTF-8' is returned.

             my $doc = XML::LibXML->createDocument( "1.0", "ISO-8859-15" );
             print $doc->actualEncoding; # prints ISO-8859-15
164
165       setEncoding
166             $doc->setEncoding($new_encoding);
167
           This method changes the encoding declared in the XML declaration
           of the document. The value also affects the encoding in which the
           document is serialized to XML by $doc->toString(). Call
           setEncoding() without an argument to remove the encoding
           declaration.
172
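           The following sketch illustrates how encoding(), actualEncoding()
           and setEncoding() relate, and how the declared encoding changes
           the bytes produced by toString(). The element name and the text
           are arbitrary sample data.

```perl
use strict;
use warnings;
use XML::LibXML;

# No encoding is declared here, so UTF-8 is implied.
my $doc  = XML::LibXML::Document->new('1.0');
my $root = $doc->createElement('name');
$doc->setDocumentElement($root);

# Pass a character string with the UTF8 flag on, as recommended above.
my $text = "caf\x{e9}";
utf8::upgrade($text);
$root->appendText($text);

my $declared = $doc->encoding;          # nothing declared yet
my $actual   = $doc->actualEncoding;    # falls back to UTF-8

# The same tree, serialized as bytes in two different encodings.
$doc->setEncoding('ISO-8859-1');
my $latin1_bytes = $doc->toString;
$doc->setEncoding('UTF-8');
my $utf8_bytes = $doc->toString;

printf "ISO-8859-1: %d bytes, UTF-8: %d bytes\n",
    length $latin1_bytes, length $utf8_bytes;
```
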
173       version
174             $strVersion = $doc->version();
175
176           returns the version string of the document
177
178           getVersion() is an alternative form of this function.
179
180       standalone
181             $doc->standalone
182
           This function returns the numerical value of the standalone
           attribute in the document's XML declaration. It returns 1 if
           standalone="yes" was found, 0 if standalone="no" was found, and -1
           if standalone was not specified (the default on creation).
187
188       setStandalone
189             $doc->setStandalone($numvalue);
190
           This method alters the value of the document's standalone
           attribute. Set it to 1 for standalone="yes", to 0 for
           standalone="no", or to -1 to remove the standalone attribute from
           the XML declaration.
195
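           A short sketch of reading and changing the standalone value; the
           root element name is arbitrary.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new('1.0');
$doc->setDocumentElement( $doc->createElement('root') );

# Freshly created documents have no standalone attribute.
my $initial = $doc->standalone;

# Request standalone="yes" in the XML declaration.
$doc->setStandalone(1);
my $xml = $doc->toString;

print "initial: $initial\n";
print $xml;
```
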
196       compression
197             my $compression = $doc->compression;
198
199           libxml2 allows reading of documents directly from gzipped files. In
200           this case the compression variable is set to the compression level
201           of that file (0-8). If XML::LibXML parsed a different source or the
202           file wasn't compressed, the returned value will be -1.
203
204       setCompression
205             $doc->setCompression($ziplevel);
206
           If one intends to write the document directly to a file, it is
           possible to set the compression level for a given document. This
           level can be in the range from 0 to 8. To disable compression, use
           -1 (the default).
211
212           Note that this feature will only work if libxml2 is compiled with
213           zlib support and toFile() is used for output.
214
215       toString
216             $docstring = $dom->toString($format);
217
           toString is a DOM serializing function: it serializes the DOM tree
           into an XML string, ready for output.

           IMPORTANT: unlike toString for other nodes, on document nodes this
           function returns the XML as a byte string in the original encoding
           of the document (see the actualEncoding() method)! This means you
           can simply do:

             open my $out, '>', $file or die "Cannot open $file: $!";
             binmode $out; # the data is already encoded; write raw bytes
             print {$out} $doc->toString;

           regardless of the actual encoding of the document. See the section
           on encodings in XML::LibXML for more details.

           The optional $format parameter controls the indentation of the
           output. It is expected to be an integer and can take three
           different values:

           If $format is 0, then the document is dumped as it was originally
           parsed.

           If $format is 1, libxml2 will add ignorable white space, so the
           nodes' content is easier to read. Existing text nodes will not be
           altered.

           If $format is 2 (or higher), libxml2 will act as for $format == 1,
           but it also adds a leading and a trailing line break to each text
           node.

           libxml2 uses a hard-coded indentation of 2 space characters per
           indentation level. This value cannot be altered at run time.
249
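           To make the effect of $format concrete, this sketch serializes
           the same small tree with $format 0 and 1. The element names are
           placeholders.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc  = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $root = $doc->createElement('root');
$doc->setDocumentElement($root);
$root->appendChild( $doc->createElement('child') );

# $format 0: no white space is added.
my $compact = $doc->toString(0);

# $format 1: libxml2 indents nested elements by two spaces per level.
my $indented = $doc->toString(1);

print $compact;
print $indented;
```
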
250       toStringC14N
251             $c14nstr = $doc->toStringC14N($comment_flag, $xpath [, $xpath_context ]);
252
253           See the documentation in XML::LibXML::Node.
254
255       toStringEC14N
256             $ec14nstr = $doc->toStringEC14N($comment_flag, $xpath [, $xpath_context ], $inclusive_prefix_list);
257
258           See the documentation in XML::LibXML::Node.
259
260       serialize
261             $str = $doc->serialize($format);
262
           An alias for toString(). This function name was added to be more
           consistent with libxml2.
265
266       serialize_c14n
267           An alias for toStringC14N().
268
269       serialize_exc_c14n
270           An alias for toStringEC14N().
271
272       toFile
273             $state = $doc->toFile($filename, $format);
274
           This function is similar to toString(), but it writes the document
           directly to a file. This is very useful if one needs to store
           large documents.
278
279           The format parameter has the same behaviour as in toString().
280
281       toFH
282             $state = $doc->toFH($fh, $format);
283
284           This function is similar to toString(), but it writes the document
285           directly to a filehandle or a stream. A byte stream in the document
286           encoding is passed to the file handle. Do NOT apply any
287           ":encoding(...)" or ":utf8" PerlIO layer to the filehandle! See the
288           section on encodings in XML::LibXML for more details.
289
290           The format parameter has the same behaviour as in toString().
291
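           A possible way to use toFH, sketched with a temporary file from
           File::Temp (an assumption of this example, not a requirement of
           the method). The handle is put into binary mode and gets no
           encoding layer, as advised above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Handle;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );
$doc->setDocumentElement( $doc->createElement('root') );

# Temporary file used only for this illustration.
my ( $fh, $path ) = tempfile( UNLINK => 1 );
binmode $fh;                 # raw bytes, no :encoding(...) or :utf8 layer
$doc->toFH( $fh, 0 );
close $fh or die "Cannot close $path: $!";

# Read the file back as raw bytes to inspect what was written.
open my $in, '<:raw', $path or die "Cannot open $path: $!";
my $written = do { local $/; <$in> };
close $in;

print $written;
```
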
292       toStringHTML
293             $str = $document->toStringHTML();
294
           toStringHTML serializes the tree to a byte string in the document
           encoding as HTML. With this method indenting is automatic and
           managed by libxml2 internally.
298
299       serialize_html
300             $str = $document->serialize_html();
301
302           An alias for toStringHTML().
303
304       is_valid
305             $bool = $dom->is_valid();
306
           Returns either TRUE or FALSE depending on whether the DOM Tree is a
           valid Document or not.

           You may also pass in an XML::LibXML::Dtd object, to validate
           against an external DTD:

             if (!$dom->is_valid($dtd)) {
                 warn("document is not valid!");
             }
316
317       validate
318             $dom->validate();
319
           This is an exception-throwing equivalent of is_valid. If the
           document is not valid, it throws an exception containing the
           error. This allows much better error reporting than the plain
           true/false result of is_valid.

           Again, you may pass in a DTD object.
326
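           The difference between is_valid and validate can be sketched as
           follows. The DTD and both sample documents are invented for this
           example.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical DTD: a <note> must contain exactly one <to>.
my $dtd = XML::LibXML::Dtd->parse_string(
    '<!ELEMENT note (to)> <!ELEMENT to (#PCDATA)>'
);

my $parser = XML::LibXML->new;
my $good   = $parser->parse_string('<note><to>Ann</to></note>');
my $bad    = $parser->parse_string('<note><from>Bob</from></note>');

# is_valid only reports true or false.
my $good_ok = $good->is_valid($dtd) ? 1 : 0;
my $bad_ok  = $bad->is_valid($dtd)  ? 1 : 0;

# validate throws, so the reason for the failure is available in $@.
my $error;
eval { $bad->validate($dtd); 1 } or $error = $@;

print "good: $good_ok, bad: $bad_ok\n";
print "validation error: $error\n" if defined $error;
```
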
327       documentElement
328             $root = $dom->documentElement();
329
           Returns the root element of the Document. A document can have just
           one root element to contain the document's data.

           Optionally one can use getDocumentElement.
334
335       setDocumentElement
336             $dom->setDocumentElement( $root );
337
338           This function enables you to set the root element for a document.
339           The function supports the import of a node from a different
340           document tree, but does not support a document fragment as $root.
341
342       createElement
343             $element = $dom->createElement( $nodename );
344
345           This function creates a new Element Node bound to the DOM with the
346           name $nodename.
347
348       createElementNS
349             $element = $dom->createElementNS( $namespaceURI, $qname );
350
           This function creates a new Element Node bound to the DOM with the
           qualified name $qname and placed in the given namespace.
353
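           For illustration, a namespaced root element could be created like
           this. The SVG namespace and the "svg" prefix are example values
           only.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );

# The qualified name carries the prefix; the URI names the namespace.
my $svg = $doc->createElementNS( 'http://www.w3.org/2000/svg', 'svg:svg' );
$doc->setDocumentElement($svg);

print $svg->prefix, " ", $svg->localname, " ", $svg->namespaceURI, "\n";
print $doc->toString;
```
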
354       createTextNode
355             $text = $dom->createTextNode( $content_text );
356
           Like createElement, but it creates a Text Node bound to the DOM.
359
360       createComment
361             $comment = $dom->createComment( $comment_text );
362
           Like createElement, but it creates a Comment Node bound to the
           DOM.
365
366       createAttribute
367             $attrnode = $doc->createAttribute($name [,$value]);
368
369           Creates a new Attribute node.
370
371       createAttributeNS
             $attrnode = $doc->createAttributeNS( $namespaceURI, $name [,$value] );
373
374           Creates an Attribute bound to a namespace.
375
376       createDocumentFragment
377             $fragment = $doc->createDocumentFragment();
378
379           This function creates a DocumentFragment.
380
381       createCDATASection
             $cdata = $dom->createCDATASection( $cdata_content );
383
384           Similar to createTextNode and createComment, this function creates
385           a CDataSection bound to the current DOM.
386
387       createProcessingInstruction
388             my $pi = $doc->createProcessingInstruction( $target, $data );
389
           Creates a processing instruction node.

           Since this method name is quite long, one may use its short form
           createPI().
394
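           The create* methods above can be combined as in the following
           sketch. All names and contents ("item", "id", "render" and so on)
           are hypothetical sample data.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc  = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $item = $doc->createElement('item');
$doc->setDocumentElement($item);

# Every node is created through the document, so it is bound to it.
$item->setAttributeNode( $doc->createAttribute( 'id', 'a1' ) );
$item->appendChild( $doc->createComment(' a note ') );
$item->appendChild( $doc->createTextNode('5 < 6') );
$item->appendChild( $doc->createCDATASection('<raw>') );
$item->appendChild(
    $doc->createProcessingInstruction( 'render', 'mode="fast"' ) );

my $xml = $item->toString;
print "$xml\n";
```
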
395       createEntityReference
396             my $entref = $doc->createEntityReference($refname);
397
           If a document has a DTD specified, one can create entity references
           by using this function. If one wants to add an entity reference to
           the document, this reference has to be created by this function.

           An entity reference is unique to a document and cannot be passed to
           other documents as other nodes can be passed.

           NOTE: Text content containing something that looks like an entity
           reference will not be expanded to a real entity reference unless
           it is a predefined entity.

             my $string = "&foo;";
             $some_element->appendText( $string );
             print $some_element->toString; # the text is serialized as "&amp;foo;"
412
413       createInternalSubset
414             $dtd = $document->createInternalSubset( $rootnode, $public, $system);
415
416           This function creates and adds an internal subset to the given
417           document.  Because the function automatically adds the DTD to the
418           document there is no need to add the created node explicitly to the
419           document.
420
             my $document = XML::LibXML::Document->new();
             my $dtd      = $document->createInternalSubset( "foo", undef, "foo.dtd" );

           will result in the following XML document:

             <?xml version="1.0"?>
             <!DOCTYPE foo SYSTEM "foo.dtd">
428
429           By setting the public parameter it is possible to set PUBLIC DTDs
430           to a given document. So
431
432             my $document = XML::LibXML::Document->new();
433             my $dtd      = $document->createInternalSubset( "foo", "-//FOO//DTD FOO 0.1//EN", undef );
434
435           will cause the following declaration to be created on the document:
436
437             <?xml version="1.0"?>
438             <!DOCTYPE foo PUBLIC "-//FOO//DTD FOO 0.1//EN">
439
440       createExternalSubset
441             $dtd = $document->createExternalSubset( $rootnode_name, $publicId, $systemId);
442
443           This function is similar to "createInternalSubset()" but this DTD
444           is considered to be external and is therefore not added to the
445           document itself. Nevertheless it can be used for validation
446           purposes.
447
448       importNode
449             $document->importNode( $node );
450
           If a node is not part of this document, it can be imported into
           it. As specified in the DOM Level 2 Specification, the node is not
           altered or removed from its original document
           ("$node->cloneNode(1)" is called implicitly).
455
456           NOTE: Don't try to use importNode() to import sub-trees that
457           contain an entity reference - even if the entity reference is the
458           root node of the sub-tree. This will cause serious problems to your
459           program. This is a limitation of libxml2 and not of XML::LibXML
460           itself.
461
462       adoptNode
463             $document->adoptNode( $node );
464
           If a node is not part of this document, it can be adopted by it.
           As specified in the DOM Level 3 Specification, the node is not
           altered, but it is removed from its original document.

           After a document has adopted a node, the node, its attributes and
           all its descendants belong to the new document. Because the node
           no longer belongs to the old document, it is unlinked from its old
           location first.

           NOTE: Don't try to use adoptNode() on sub-trees that contain
           entity references - even if the entity reference is the root node
           of the sub-tree. This will cause serious problems to your program.
           This is a limitation of libxml2 and not of XML::LibXML itself.
478
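           The contrast between importNode (copy) and adoptNode (move) can
           be sketched like this; the sample document and element names are
           made up for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

my $src = XML::LibXML->new->parse_string('<a><b/><c/></a>');
my $dst = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $root = $dst->createElement('root');
$dst->setDocumentElement($root);

# importNode copies: the original <b/> stays in the source document.
my ($first) = $src->getElementsByTagName('b');
my $copy = $dst->importNode($first);
$root->appendChild($copy);

# adoptNode moves: <c/> is unlinked from the source document.
my ($second) = $src->getElementsByTagName('c');
$dst->adoptNode($second);
$root->appendChild($second);

my $src_xml = $src->documentElement->toString;
my $dst_xml = $dst->documentElement->toString;
print "source: $src_xml\n";
print "target: $dst_xml\n";
```
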
479       externalSubset
480             my $dtd = $doc->externalSubset;
481
482           If a document has an external subset defined it will be returned by
483           this function.
484
           NOTE: Dtd nodes are not ordinary nodes in libxml2. The support for
           these nodes in XML::LibXML is still limited. In particular, one
           may not want to use common node functions on doctype declaration
           nodes!
488
489       internalSubset
490             my $dtd = $doc->internalSubset;
491
492           If a document has an internal subset defined it will be returned by
493           this function.
494
           NOTE: Dtd nodes are not ordinary nodes in libxml2. The support for
           these nodes in XML::LibXML is still limited. In particular, one
           may not want to use common node functions on doctype declaration
           nodes!
498
499       setExternalSubset
500             $doc->setExternalSubset($dtd);
501
502           EXPERIMENTAL!
503
504           This method sets a DTD node as an external subset of the given
505           document.
506
507       setInternalSubset
508             $doc->setInternalSubset($dtd);
509
510           EXPERIMENTAL!
511
512           This method sets a DTD node as an internal subset of the given
513           document.
514
515       removeExternalSubset
516             my $dtd = $doc->removeExternalSubset();
517
518           EXPERIMENTAL!
519
520           If a document has an external subset defined it can be removed from
521           the document by using this function. The removed dtd node will be
522           returned.
523
524       removeInternalSubset
525             my $dtd = $doc->removeInternalSubset();
526
527           EXPERIMENTAL!
528
529           If a document has an internal subset defined it can be removed from
530           the document by using this function. The removed dtd node will be
531           returned.
532
533       getElementsByTagName
534             my @nodelist = $doc->getElementsByTagName($tagname);
535
           Implements the DOM Level 2 function.
537
538           In SCALAR context this function returns a XML::LibXML::NodeList
539           object.
540
541       getElementsByTagNameNS
542             my @nodelist = $doc->getElementsByTagNameNS($nsURI,$tagname);
543
           Implements the DOM Level 2 function.
545
546           In SCALAR context this function returns a XML::LibXML::NodeList
547           object.
548
549       getElementsByLocalName
550             my @nodelist = $doc->getElementsByLocalName($localname);
551
           This fetches all nodes in the document that have the given local
           name.
554
555           In SCALAR context this function returns a XML::LibXML::NodeList
556           object.
557
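           The three lookup methods differ in how they treat namespaces, and
           each returns a list or a NodeList depending on context. A small
           sketch, using an invented document and the placeholder namespace
           "urn:example":

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->new->parse_string(
    '<list xmlns:x="urn:example"><item/><x:item/><item/></list>'
);

# Matches the full tag name, so the prefixed x:item is not included.
my @plain = $doc->getElementsByTagName('item');

# Matches the local name only, regardless of prefix or namespace.
my @local = $doc->getElementsByLocalName('item');

# Matches the local name within one specific namespace.
my @in_ns = $doc->getElementsByTagNameNS( 'urn:example', 'item' );

# In scalar context an XML::LibXML::NodeList object is returned.
my $list = $doc->getElementsByTagName('item');
my $size = $list->size;

printf "plain=%d local=%d ns=%d list=%d\n",
    scalar @plain, scalar @local, scalar @in_ns, $size;
```
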
558       getElementById
559             my $node = $doc->getElementById($id);
560
561           Returns the element that has an ID attribute with the given value.
562           If no such element exists, this returns undef.
563
564           Note: the ID of an element may change while manipulating the
565           document. For documents with a DTD, the information about ID
566           attributes is only available if DTD loading/validation has been
567           requested. For HTML documents parsed with the HTML parser ID
568           detection is done automatically. In XML documents, all "xml:id"
569           attributes are considered to be of type ID. You can test ID-ness of
570           an attribute node with $attr->isId().
571
572           In versions 1.59 and earlier this method was called
573           getElementsById() (plural) by mistake. Starting from 1.60 this name
574           is maintained as an alias only for backward compatibility.
575
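           Since "xml:id" attributes are treated as IDs without a DTD, a
           lookup can be sketched with a plain parsed string. The document
           and the ID values are sample data.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->new->parse_string(
    '<r><e xml:id="n1"/><e xml:id="n2"/></r>'
);

# Returns the element carrying the ID, or undef if there is none.
my $hit  = $doc->getElementById('n2');
my $miss = $doc->getElementById('nope');

print "found: ", $hit->toString, "\n" if defined $hit;
print "no element with ID 'nope'\n" unless defined $miss;
```
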
576       indexElements
577             $dom->indexElements();
578
579           This function causes libxml2 to stamp all elements in a document
580           with their document position index which considerably speeds up
581           XPath queries for large documents. It should only be used with
582           static documents that won't be further changed by any DOM methods,
583           because once a document is indexed, XPath will always prefer the
584           index to other methods of determining the document order of nodes.
585           XPath could therefore return improperly ordered node-lists when
586           applied on a document that has been changed after being indexed. It
587           is of course possible to use this method to re-index a modified
588           document before using it with XPath again. This function is not a
589           part of the DOM specification.
590
           This function returns the number of elements indexed, -1 if an
           error occurred, or -2 if this feature is not available in the
           running libxml2.
594
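           A minimal sketch of indexing a static document before running
           XPath queries on it. The sample document is invented, and the
           return value is checked because the feature may be unavailable.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->new->parse_string('<a><b/><c/></a>');

# Index once, after the document has reached its final shape.
my $count = $doc->indexElements;
if ( $count < 0 ) {
    warn "indexElements not available or failed (code $count)\n";
}

# XPath queries run as usual; the index only affects node ordering speed.
my @nodes = $doc->findnodes('//*');
print "indexed: $count, selected: ", scalar(@nodes), "\n";
```
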

AUTHORS

       Matt Sergeant, Christian Glahn, Petr Pajas

VERSION

       1.70

COPYRIGHT

       2001-2007, AxKit.com Ltd.

       2002-2006, Christian Glahn.

       2006-2009, Petr Pajas.

perl v5.10.1                      2009-10-07          XML::LibXML::Document(3)