XML::LibXML::Document(3)   User Contributed Perl Documentation   XML::LibXML::Document(3)

NAME
       XML::LibXML::Document - XML::LibXML DOM Document Class

SYNOPSIS
         use XML::LibXML;
         # Only methods specific to Document nodes are listed here,
         # see XML::LibXML::Node manpage for other methods

         $dom = XML::LibXML::Document->new( $version, $encoding );
         $dom = XML::LibXML::Document->createDocument( $version, $encoding );
         $strURI = $doc->URI();
         $doc->setURI($strURI);
         $strEncoding = $doc->encoding();
         $strEncoding = $doc->actualEncoding();
         $doc->setEncoding($new_encoding);
         $strVersion = $doc->version();
         $doc->standalone;
         $doc->setStandalone($numvalue);
         my $compression = $doc->compression;
         $doc->setCompression($ziplevel);
         $docstring = $dom->toString($format);
         $c14nstr = $doc->toStringC14N($comment_flag, $xpath [, $xpath_context ]);
         $ec14nstr = $doc->toStringEC14N($comment_flag, $xpath [, $xpath_context ], $inclusive_prefix_list);
         $str = $doc->serialize($format);
         $state = $doc->toFile($filename, $format);
         $state = $doc->toFH($fh, $format);
         $str = $document->toStringHTML();
         $str = $document->serialize_html();
         $bool = $dom->is_valid();
         $dom->validate();
         $root = $dom->documentElement();
         $dom->setDocumentElement( $root );
         $element = $dom->createElement( $nodename );
         $element = $dom->createElementNS( $namespaceURI, $qname );
         $text = $dom->createTextNode( $content_text );
         $comment = $dom->createComment( $comment_text );
         $attrnode = $doc->createAttribute($name [,$value]);
         $attrnode = $doc->createAttributeNS( $namespaceURI, $name [,$value] );
         $fragment = $doc->createDocumentFragment();
         $cdata = $dom->createCDATASection( $cdata_content );
         my $pi = $doc->createProcessingInstruction( $target, $data );
         my $entref = $doc->createEntityReference($refname);
         $dtd = $document->createInternalSubset( $rootnode, $public, $system);
         $dtd = $document->createExternalSubset( $rootnode_name, $publicId, $systemId);
         $document->importNode( $node );
         $document->adoptNode( $node );
         my $dtd = $doc->externalSubset;
         my $dtd = $doc->internalSubset;
         $doc->setExternalSubset($dtd);
         $doc->setInternalSubset($dtd);
         my $dtd = $doc->removeExternalSubset();
         my $dtd = $doc->removeInternalSubset();
         my @nodelist = $doc->getElementsByTagName($tagname);
         my @nodelist = $doc->getElementsByTagNameNS($nsURI,$tagname);
         my @nodelist = $doc->getElementsByLocalName($localname);
         my $node = $doc->getElementById($id);
         $dom->indexElements();

DESCRIPTION
       The Document class is in most cases the result of a parsing process,
       but sometimes it is necessary to create a document from scratch. The
       DOM Document class provides functions that conform to the DOM Core
       naming style.

       It inherits all functions from XML::LibXML::Node as specified in the
       DOM specification. This gives access to nodes other than the root
       element at document level, such as a DTD. Support for these nodes is
       currently limited.

       While nodes are generally bound to a document in the DOM concept, it
       is suggested that one should always create a node not bound to any
       document. There is no need to actually insert the node into the
       document, but once a node is bound to a document, it is reasonably
       safe to assume that all strings have the correct encoding. If an
       unbound text node with an ISO encoded string is created (e.g. with
       $CLASS->new()), the "toString" function may not return the expected
       result.

       To prevent such problems, it is recommended to pass all data to
       XML::LibXML methods as character strings (i.e. UTF-8 encoded, with the
       UTF8 flag on).

METHODS
       Many functions listed here are extensively documented in the DOM Level
       3 specification (<http://www.w3.org/TR/DOM-Level-3-Core/>). Please
       refer to the specification for details.

       new
             $dom = XML::LibXML::Document->new( $version, $encoding );

           An alias for createDocument().

       createDocument
             $dom = XML::LibXML::Document->createDocument( $version, $encoding );

           The constructor for the document class. As parameters it takes the
           version string and the encoding string. Simply calling
           createDocument() will create the document:

             <?xml version="your version" encoding="your encoding"?>

           Both parameters are optional. The default value for $version is
           "1.0". If the $encoding parameter is not set, the encoding is left
           unset, which means UTF-8 is implied.

           Calling createDocument() without any parameters results in the
           following declaration:

             <?xml version="1.0"?>

           Alternatively, one can call this constructor directly from the
           XML::LibXML class level, to avoid some typing. This has no effect
           on the class of the instance, which is always
           XML::LibXML::Document.

             my $document = XML::LibXML->createDocument( "1.0", "UTF-8" );

           is therefore a shortcut for

             my $document = XML::LibXML::Document->createDocument( "1.0", "UTF-8" );

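As a minimal sketch of creating a document from scratch, the constructor can be combined with createElement() and setDocumentElement(). The element name "root" is a hypothetical choice for this illustration and does not come from the original text.

```perl
use strict;
use warnings;
use XML::LibXML;

# Create an empty document with an explicit version and encoding.
my $doc = XML::LibXML::Document->createDocument( "1.0", "UTF-8" );

# "root" is a hypothetical element name chosen for this example.
my $root = $doc->createElement("root");
$doc->setDocumentElement($root);

# On document nodes, toString() returns bytes in the document encoding.
my $xml = $doc->toString();
print $xml;
```

Without a root element the document serializes to little more than the XML declaration, so setting one is usually the first step after construction.
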
       URI
             $strURI = $doc->URI();

           Returns the URI (or filename) of the original document. For
           documents obtained by parsing a string or a filehandle without
           using the URI parsing argument of the corresponding "parse_*"
           function, the result is a generated string unknown-XYZ, where XYZ
           is some number; for documents created with the constructor "new",
           the URI is undefined.

           The value can be modified by calling the "setURI" method on the
           document node.

       setURI
             $doc->setURI($strURI);

           Sets the URI of the document reported by the method URI (see also
           the URI argument to the various "parse_*" functions).

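The following sketch shows the pair of accessors on a document built with new(), where the URI starts out undefined as described above. The file name "example.xml" is an assumed placeholder.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new();

# A document created with new() has no URI yet.
my $before = $doc->URI;

# "example.xml" is a hypothetical name used only for illustration.
$doc->setURI("example.xml");
my $after = $doc->URI;

print( ( defined $before ? $before : "(undef)" ), " -> $after\n" );
```
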
       encoding
             $strEncoding = $doc->encoding();

           Returns the encoding string of the document.

             my $doc = XML::LibXML->createDocument( "1.0", "ISO-8859-15" );
             print $doc->encoding; # prints ISO-8859-15

       actualEncoding
             $strEncoding = $doc->actualEncoding();

           Returns the encoding in which the XML will be returned by
           $doc->toString(). This is usually the original encoding of the
           document as declared in the XML declaration and returned by
           $doc->encoding. If the original encoding is not known (e.g. if
           the document was created in memory or parsed from XML without a
           declared encoding), 'UTF-8' is returned.

             my $doc = XML::LibXML->createDocument( "1.0", "ISO-8859-15" );
             print $doc->actualEncoding; # prints ISO-8859-15

       setEncoding
             $doc->setEncoding($new_encoding);

           This method changes the encoding declaration in the XML
           declaration of the document. The value also affects the encoding
           in which the document is serialized to XML by $doc->toString().
           Call setEncoding() without an argument to remove the encoding
           declaration.

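The relationship between encoding(), actualEncoding() and setEncoding() can be sketched as follows. The example assumes a document created without an encoding, so that the UTF-8 fallback described above becomes visible; the variable names are illustrative only.

```perl
use strict;
use warnings;
use XML::LibXML;

# No encoding is passed, so none is declared.
my $doc = XML::LibXML->createDocument("1.0");

my $declared = $doc->encoding;         # undefined: nothing declared
my $actual   = $doc->actualEncoding;   # falls back to UTF-8

# Declare an encoding; this also changes what toString() emits.
$doc->setEncoding("ISO-8859-1");
my $changed = $doc->encoding;

# Calling setEncoding() without an argument removes the declaration again.
$doc->setEncoding();
my $removed = $doc->encoding;

print "actual: $actual, changed: $changed\n";
```
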
       version
             $strVersion = $doc->version();

           Returns the version string of the document.

           getVersion() is an alternative form of this function.

       standalone
             $doc->standalone;

           This function returns the numerical value of the standalone
           attribute of a document's XML declaration. It returns 1 if
           standalone="yes" was found, 0 if standalone="no" was found and -1
           if standalone was not specified (default on creation).

       setStandalone
             $doc->setStandalone($numvalue);

           This method alters the value of a document's standalone
           attribute. Set it to 1 to set standalone="yes", to 0 to set
           standalone="no", or to -1 to remove the standalone attribute from
           the XML declaration.

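A short sketch of the numeric values described above, using a freshly created document. The element name "root" is an assumption made for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( "1.0", "UTF-8" );
$doc->setDocumentElement( $doc->createElement("root") );

# Not specified on creation.
my $initial = $doc->standalone;

# 1 corresponds to standalone="yes" in the XML declaration.
$doc->setStandalone(1);
my ($declaration) = split /\n/, $doc->toString();

print "initial value: $initial\n";
print "declaration:   $declaration\n";
```
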
       compression
             my $compression = $doc->compression;

           libxml2 allows reading of documents directly from gzipped files. In
           this case the compression variable is set to the compression level
           of that file (0-8). If XML::LibXML parsed a different source or the
           file wasn't compressed, the returned value will be -1.

       setCompression
             $doc->setCompression($ziplevel);

           If one intends to write the document directly to a file, it is
           possible to set the compression level for a given document. This
           level can be in the range from 0 to 8. If XML::LibXML should not
           try to compress, use -1 (default).

           Note that this feature will only work if libxml2 is compiled with
           zlib support and toFile() is used for output.

       toString
             $docstring = $dom->toString($format);

           toString is a DOM serializing function, so the DOM tree is
           serialized into an XML string, ready for output.

           IMPORTANT: unlike toString for other nodes, on document nodes this
           function returns the XML as a byte string in the original encoding
           of the document (see the actualEncoding() method)! This means you
           can simply do:

             open my $out_fh, '>', $file or die "Cannot open $file: $!";
             print {$out_fh} $doc->toString;

           regardless of the actual encoding of the document. See the section
           on encodings in XML::LibXML for more details.

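To illustrate the byte-string behaviour, here is a sketch that assumes an ISO-8859-1 document containing one non-ASCII character. The element name "word" and the sample text are made up for the example; the text is passed in as a character string, as recommended in the DESCRIPTION.

```perl
use strict;
use warnings;
use Encode qw(decode);
use XML::LibXML;

my $doc  = XML::LibXML::Document->createDocument( "1.0", "ISO-8859-1" );
my $root = $doc->createElement("word");
$doc->setDocumentElement($root);

# Build a character string (UTF8 flag on) from UTF-8 encoded bytes.
my $text = decode( "UTF-8", "caf\xC3\xA9" );
$root->appendText($text);

# The result is a byte string in the document encoding, so the
# accented character occupies the single ISO-8859-1 byte 0xE9.
my $bytes = $doc->toString();

printf "UTF8 flag is %s\n", utf8::is_utf8($bytes) ? "on" : "off";
```

Because the result is already encoded, it should be written to a handle without any additional encoding layer.
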
           The optional $format parameter controls the indentation of the
           output. It is expected to be an integer value. If it is used, it
           can take three different values:

           If $format is 0, then the document is dumped as it was originally
           parsed.

           If $format is 1, libxml2 will add ignorable white space, so the
           nodes' content is easier to read. Existing text nodes will not be
           altered.

           If $format is 2 (or higher), libxml2 will act as with $format == 1,
           but it adds a leading and a trailing line break to each text node.

           libxml2 uses a hard-coded indentation of 2 space characters per
           indentation level. This value cannot be altered at run time.

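The effect of the $format values 0 and 1 can be compared with a small sketch. The element names "list" and "item" are hypothetical; the tree contains no text nodes, so libxml2 is free to add indentation.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc  = XML::LibXML::Document->new( "1.0", "UTF-8" );
my $root = $doc->createElement("list");
$doc->setDocumentElement($root);

# Two empty child elements, no text nodes in between.
$root->appendChild( $doc->createElement("item") ) for 1 .. 2;

my $compact = $doc->toString(0);   # dumped as is
my $pretty  = $doc->toString(1);   # with ignorable white space added

print $compact, "\n", $pretty;
```
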
       toStringC14N
             $c14nstr = $doc->toStringC14N($comment_flag, $xpath [, $xpath_context ]);

           See the documentation in XML::LibXML::Node.

       toStringEC14N
             $ec14nstr = $doc->toStringEC14N($comment_flag, $xpath [, $xpath_context ], $inclusive_prefix_list);

           See the documentation in XML::LibXML::Node.

       serialize
             $str = $doc->serialize($format);

           An alias for toString(). This function name was added to be more
           consistent with libxml2.

       serialize_c14n
           An alias for toStringC14N().

       serialize_exc_c14n
           An alias for toStringEC14N().

       toFile
             $state = $doc->toFile($filename, $format);

           This function is similar to toString(), but it writes the document
           directly to a file. It is very useful if one needs to store large
           documents.

           The format parameter has the same behaviour as in toString().

       toFH
             $state = $doc->toFH($fh, $format);

           This function is similar to toString(), but it writes the document
           directly to a filehandle or a stream. A byte stream in the document
           encoding is passed to the filehandle. Do NOT apply any
           ":encoding(...)" or ":utf8" PerlIO layer to the filehandle! See the
           section on encodings in XML::LibXML for more details.

           The format parameter has the same behaviour as in toString().

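A hedged sketch of both output methods, writing to a temporary file created with File::Temp (an assumption of this example, not something the original text prescribes). The handle is kept in raw mode, in line with the warning about PerlIO layers; the slurp() helper is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Handle;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( "1.0", "UTF-8" );
$doc->setDocumentElement( $doc->createElement("root") );

# Hypothetical helper: read a whole file back as raw bytes.
sub slurp {
    my ($path) = @_;
    open my $in, '<:raw', $path or die "Cannot open $path: $!";
    local $/;
    my $content = <$in>;
    close $in;
    return $content;
}

# toFH: pass a raw handle, with no :encoding(...) or :utf8 layer.
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
binmode $fh;
$doc->toFH( $fh, 0 );
close $fh or die "Cannot close $filename: $!";
my $via_fh = slurp($filename);

# toFile: let libxml2 write the file itself.
$doc->toFile( $filename, 0 );
my $via_file = slurp($filename);

print $via_fh;
```
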
       toStringHTML
             $str = $document->toStringHTML();

           toStringHTML serializes the tree to a byte string in the document
           encoding as HTML. With this method, indenting is automatic and
           managed by libxml2 internally.

       serialize_html
             $str = $document->serialize_html();

           An alias for toStringHTML().

       is_valid
             $bool = $dom->is_valid();

           Returns either TRUE or FALSE depending on whether the DOM tree is a
           valid document or not.

           You may also pass in an XML::LibXML::Dtd object, to validate
           against an external DTD:

             if (!$dom->is_valid($dtd)) {
                 warn("document is not valid!");
             }

       validate
             $dom->validate();

           This is an exception-throwing equivalent of is_valid. If the
           document is not valid, it will throw an exception containing the
           error. This allows much better error reporting than a simple true
           or false result.

           Again, you may pass in a DTD object.

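The difference between the two methods can be sketched with a document that carries its own internal DTD. The DTD, the element names "note" and "extra", and the use of load_xml() are assumptions made for this illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>
<note>hello</note>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# The parsed document conforms to its internal DTD.
my $valid_before = $doc->is_valid ? 1 : 0;

# <extra/> is not declared in the DTD, so the document becomes invalid.
$doc->documentElement->appendChild( $doc->createElement("extra") );
my $valid_after = $doc->is_valid ? 1 : 0;

# validate() throws instead of returning false; capture the message.
my $error;
eval { $doc->validate; 1 } or $error = $@;

print "valid before: $valid_before, after: $valid_after\n";
print "validate() reported: $error\n" if defined $error;
```
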
       documentElement
             $root = $dom->documentElement();

           Returns the root element of the document. A document can have just
           one root element to contain the document's data.

           Alternatively, one can use getDocumentElement().

       setDocumentElement
             $dom->setDocumentElement( $root );

           This function sets the root element for a document. It supports
           the import of a node from a different document tree, but does not
           support a document fragment as $root.

       createElement
             $element = $dom->createElement( $nodename );

           This function creates a new Element Node bound to the DOM with the
           name $nodename.

       createElementNS
             $element = $dom->createElementNS( $namespaceURI, $qname );

           This function creates a new Element Node bound to the DOM with the
           name $qname and placed in the given namespace.

       createTextNode
             $text = $dom->createTextNode( $content_text );

           Equivalent to createElement, but it creates a Text Node bound to
           the DOM.

       createComment
             $comment = $dom->createComment( $comment_text );

           Equivalent to createElement, but it creates a Comment Node bound
           to the DOM.

370
371 createAttributeNS
372 $attrnode = $doc->createAttributeNS( namespaceURI, $name [,$value] );
373
374 Creates an Attribute bound to a namespace.
375
376 createDocumentFragment
377 $fragment = $doc->createDocumentFragment();
378
379 This function creates a DocumentFragment.
380
381 createCDATASection
382 $cdata = $dom->createCDATASection( $cdata_content );
383
384 Similar to createTextNode and createComment, this function creates
385 a CDataSection bound to the current DOM.
386
387 createProcessingInstruction
388 my $pi = $doc->createProcessingInstruction( $target, $data );
389
390 create a processing instruction node.
391
392 Since this method is quite long one may use its short form
393 createPI().
394
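The factory methods above can be combined as in the following sketch. All element names, the namespace URI "http://example.org/ns", the attribute and the processing instruction are hypothetical values; appendChild() and setAttributeNode() are node methods documented in XML::LibXML::Node and XML::LibXML::Element.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc  = XML::LibXML::Document->new( "1.0", "UTF-8" );
my $root = $doc->createElement("catalog");
$doc->setDocumentElement($root);

# A comment as the first child of the root element.
$root->appendChild( $doc->createComment(" generated example ") );

# A namespaced element with a text node; "&" is escaped on output.
my $item = $doc->createElementNS( "http://example.org/ns", "ex:item" );
$item->appendChild( $doc->createTextNode("Tea & biscuits") );
$root->appendChild($item);

# An attribute node created by the document, then attached.
my $attr = $doc->createAttribute( "id", "i1" );
$item->setAttributeNode($attr);

# A CDATA section and a processing instruction (short form createPI).
$root->appendChild( $doc->createCDATASection("1 < 2") );
$root->appendChild( $doc->createPI( "render", 'mode="fast"' ) );

my $out = $doc->toString();
print $out;
```
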
       createEntityReference
             my $entref = $doc->createEntityReference($refname);

           If a document has a DTD specified, one can create entity references
           by using this function. If one wants to add an entity reference to
           the document, this reference has to be created by this function.

           An entity reference is unique to a document and cannot be passed to
           other documents as other nodes can be passed.

           NOTE: Text content containing something that looks like an entity
           reference will not be expanded to a real entity reference unless
           it is a predefined entity.

             my $string = "&foo;";
             $some_element->appendText( $string );
             print $some_element->textContent; # prints "&foo;"

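The contrast drawn in the NOTE can be sketched as follows, assuming a document whose internal subset declares an entity named "foo". The DTD and the element names are invented for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml( string => <<'XML' );
<!DOCTYPE doc [
  <!ELEMENT doc ANY>
  <!ENTITY foo "bar">
]>
<doc><a/><b/></doc>
XML

my ( $plain, $with_ref ) = $doc->documentElement->childNodes;

# Appended as text: the ampersand is literal and gets escaped on output.
$plain->appendText("&foo;");

# Appended as a node: a real entity reference.
$with_ref->appendChild( $doc->createEntityReference("foo") );

print $plain->toString,    "\n";
print $with_ref->toString, "\n";
```
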
       createInternalSubset
             $dtd = $document->createInternalSubset( $rootnode, $public, $system);

           This function creates and adds an internal subset to the given
           document. Because the function automatically adds the DTD to the
           document, there is no need to add the created node explicitly to
           the document.

             my $document = XML::LibXML::Document->new();
             my $dtd = $document->createInternalSubset( "foo", undef, "foo.dtd" );

           will result in the following XML document:

             <?xml version="1.0"?>
             <!DOCTYPE foo SYSTEM "foo.dtd">

           By setting the public parameter it is possible to set PUBLIC DTDs
           for a given document. So

             my $document = XML::LibXML::Document->new();
             my $dtd = $document->createInternalSubset( "foo", "-//FOO//DTD FOO 0.1//EN", undef );

           will cause the following declaration to be created in the document:

             <?xml version="1.0"?>
             <!DOCTYPE foo PUBLIC "-//FOO//DTD FOO 0.1//EN">

       createExternalSubset
             $dtd = $document->createExternalSubset( $rootnode_name, $publicId, $systemId);

           This function is similar to "createInternalSubset()", but this DTD
           is considered to be external and is therefore not added to the
           document itself. Nevertheless, it can be used for validation
           purposes.

       importNode
             $document->importNode( $node );

           If a node is not part of a document, it can be imported to another
           document. As specified in the DOM Level 2 Specification, the node
           will not be altered or removed from its original document
           ("$node->cloneNode(1)" will get called implicitly).

           NOTE: Don't try to use importNode() to import sub-trees that
           contain an entity reference - even if the entity reference is the
           root node of the sub-tree. This will cause serious problems for
           your program. This is a limitation of libxml2 and not of
           XML::LibXML itself.

       adoptNode
             $document->adoptNode( $node );

           If a node is not part of a document, it can be imported to another
           document. As specified in the DOM Level 3 Specification, the node
           will not be altered, but it will be removed from its original
           document.

           After a document has adopted a node, the node, its attributes and
           all its descendants belong to the new document. Because the node
           does not belong to the old document, it will be unlinked from its
           old location first.

           NOTE: Don't try to use adoptNode() to import sub-trees that
           contain entity references - even if the entity reference is the
           root node of the sub-tree. This will cause serious problems for
           your program. This is a limitation of libxml2 and not of
           XML::LibXML itself.

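The copy-versus-move distinction can be sketched with two small documents. The documents and element names below are assumptions for the example, and neither sub-tree contains entity references.

```perl
use strict;
use warnings;
use XML::LibXML;

my $src = XML::LibXML->load_xml(
    string => '<src><item>one</item><item>two</item></src>' );
my $dst = XML::LibXML->load_xml( string => '<dst/>' );

my ( $first, $second ) = $src->documentElement->childNodes;

# importNode(): the source keeps its node, the target gets a copy.
my $copy = $dst->importNode($first);
$dst->documentElement->appendChild($copy);

# adoptNode(): the node itself is unlinked from the source and moved.
$dst->adoptNode($second);
$dst->documentElement->appendChild($second);

print $src->documentElement->toString, "\n";
print $dst->documentElement->toString, "\n";
```

In both cases the node still has to be inserted into the target tree explicitly; importing or adopting only changes which document owns it.
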
       externalSubset
             my $dtd = $doc->externalSubset;

           If a document has an external subset defined, it will be returned
           by this function.

           NOTE: Dtd nodes are not ordinary nodes in libxml2. Support for
           these nodes in XML::LibXML is still limited. In particular, one
           may not want to use common node functions on doctype declaration
           nodes!

       internalSubset
             my $dtd = $doc->internalSubset;

           If a document has an internal subset defined, it will be returned
           by this function.

           NOTE: Dtd nodes are not ordinary nodes in libxml2. Support for
           these nodes in XML::LibXML is still limited. In particular, one
           may not want to use common node functions on doctype declaration
           nodes!

       setExternalSubset
             $doc->setExternalSubset($dtd);

           EXPERIMENTAL!

           This method sets a DTD node as an external subset of the given
           document.

       setInternalSubset
             $doc->setInternalSubset($dtd);

           EXPERIMENTAL!

           This method sets a DTD node as an internal subset of the given
           document.

       removeExternalSubset
             my $dtd = $doc->removeExternalSubset();

           EXPERIMENTAL!

           If a document has an external subset defined, it can be removed
           from the document by using this function. The removed DTD node
           will be returned.

       removeInternalSubset
             my $dtd = $doc->removeInternalSubset();

           EXPERIMENTAL!

           If a document has an internal subset defined, it can be removed
           from the document by using this function. The removed DTD node
           will be returned.

       getElementsByTagName
             my @nodelist = $doc->getElementsByTagName($tagname);

           Implements the DOM Level 2 function.

           In SCALAR context this function returns an XML::LibXML::NodeList
           object.

       getElementsByTagNameNS
             my @nodelist = $doc->getElementsByTagNameNS($nsURI,$tagname);

           Implements the DOM Level 2 function.

           In SCALAR context this function returns an XML::LibXML::NodeList
           object.

       getElementsByLocalName
             my @nodelist = $doc->getElementsByLocalName($localname);

           This fetches all nodes from a given document with the given local
           name.

           In SCALAR context this function returns an XML::LibXML::NodeList
           object.

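The three lookup methods and the two calling contexts can be compared in a short sketch. The sample document and the namespace URI "http://example.org/ns" are assumptions for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml( string =>
    '<root xmlns:ex="http://example.org/ns"><item/><item/><ex:item/></root>' );

# List context: plain Perl lists of nodes.
my @by_name  = $doc->getElementsByTagName("item");
my @by_ns    = $doc->getElementsByTagNameNS( "http://example.org/ns", "item" );
my @by_local = $doc->getElementsByLocalName("item");

# Scalar context: an XML::LibXML::NodeList object.
my $list = $doc->getElementsByTagName("item");

printf "by name: %d, by namespace: %d, by local name: %d, list size: %d\n",
    scalar @by_name, scalar @by_ns, scalar @by_local, $list->size;
```
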
       getElementById
             my $node = $doc->getElementById($id);

           Returns the element that has an ID attribute with the given value.
           If no such element exists, this returns undef.

           Note: the ID of an element may change while manipulating the
           document. For documents with a DTD, the information about ID
           attributes is only available if DTD loading/validation has been
           requested. For HTML documents parsed with the HTML parser, ID
           detection is done automatically. In XML documents, all "xml:id"
           attributes are considered to be of type ID. You can test ID-ness of
           an attribute node with $attr->isId().

           In versions 1.59 and earlier this method was called
           getElementsById() (plural) by mistake. Starting from 1.60 this name
           is maintained as an alias only for backward compatibility.

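A minimal sketch of the "xml:id" case mentioned in the note, which needs no DTD. The document content and the ID values are hypothetical.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml( string =>
    '<doc><section xml:id="intro">Hello</section></doc>' );

# xml:id attributes are treated as type ID in XML documents.
my $hit  = $doc->getElementById("intro");
my $miss = $doc->getElementById("missing");

print "found: ", $hit->nodeName, "\n" if defined $hit;
print "no element with ID 'missing'\n" unless defined $miss;
```
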
       indexElements
             $dom->indexElements();

           This function causes libxml2 to stamp all elements in a document
           with their document position index, which considerably speeds up
           XPath queries for large documents. It should only be used with
           static documents that won't be further changed by any DOM methods,
           because once a document is indexed, XPath will always prefer the
           index to other methods of determining the document order of nodes.
           XPath could therefore return improperly ordered node-lists when
           applied to a document that has been changed after being indexed.
           It is of course possible to use this method to re-index a modified
           document before using it with XPath again. This function is not
           part of the DOM specification.

           This function returns the number of elements indexed, -1 if an
           error occurred, or -2 if this feature is not available in the
           running libxml2.

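As a sketch of the intended usage, a static document is indexed once before XPath queries are run against it. The tiny sample document is an assumption; on such a small tree no speed-up would be measurable, so the example only demonstrates the call and its return value.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml( string => '<a><b/><c/></a>' );

# Index once, after the last modification of the tree.
my $count = $doc->indexElements();

# Later XPath queries use the index to determine document order.
my @nodes = $doc->findnodes('//*');

print "indexElements returned $count; XPath matched ", scalar(@nodes), " elements\n";
```
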
AUTHORS
       Matt Sergeant, Christian Glahn, Petr Pajas

VERSION
       2.0018

COPYRIGHT
       2001-2007, AxKit.com Ltd.

       2002-2006, Christian Glahn.

       2006-2009, Petr Pajas.

perl v5.16.3                      2013-05-13          XML::LibXML::Document(3)