Mojo::DOM58(3)        User Contributed Perl Documentation       Mojo::DOM58(3)

NAME

       Mojo::DOM58 - Minimalistic HTML/XML DOM parser with CSS selectors

SYNOPSIS

         use Mojo::DOM58;

         # Parse
         my $dom = Mojo::DOM58->new('<div><p id="a">Test</p><p id="b">123</p></div>');

         # Find
         say $dom->at('#b')->text;
         say $dom->find('p')->map('text')->join("\n");
         say $dom->find('[id]')->map(attr => 'id')->join("\n");

         # Iterate
         $dom->find('p[id]')->reverse->each(sub { say $_->{id} });

         # Loop
         for my $e ($dom->find('p[id]')->each) {
           say $e->{id}, ':', $e->text;
         }

         # Modify
         $dom->find('div p')->last->append('<p id="c">456</p>');
         $dom->at('#c')->prepend($dom->new_tag('p', id => 'd', '789'));
         $dom->find(':not(p)')->map('strip');

         # Render
         say "$dom";

DESCRIPTION

       Mojo::DOM58 is a minimalistic and relaxed pure-perl HTML/XML DOM parser
       based on Mojo::DOM. It supports the HTML Living Standard
       <https://html.spec.whatwg.org/> and Extensible Markup Language (XML)
       1.0 <https://www.w3.org/TR/xml/>, and matching based on CSS3 selectors
       <https://www.w3.org/TR/selectors/>. It will even try to interpret
       broken HTML and XML, so you should not use it for validation.

FORK INFO

       Mojo::DOM58 is a fork of Mojo::DOM and tracks features and fixes to
       stay closely compatible with upstream. It differs only in the
       standalone format and compatibility with Perl 5.8. Any bugs or patches
       not related to these changes should be reported directly to the
       Mojolicious issue tracker.

       This release of Mojo::DOM58 is up to date with version 9.0 of
       Mojolicious.

NODES AND ELEMENTS

       When we parse an HTML/XML fragment, it gets turned into a tree of
       nodes.

         <!DOCTYPE html>
         <html>
           <head><title>Hello</title></head>
           <body>World!</body>
         </html>

       There are currently eight different kinds of nodes, "cdata", "comment",
       "doctype", "pi", "raw", "root", "tag" and "text". Elements are nodes of
       the type "tag".

         root
         |- doctype (html)
         +- tag (html)
            |- tag (head)
            |  +- tag (title)
            |     +- raw (Hello)
            +- tag (body)
               +- text (World!)

       While all node types are represented as Mojo::DOM58 objects, some
       methods like "attr" and "namespace" only apply to elements.
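
       As a rough illustration of the tree above, the following sketch parses
       the same document (written without whitespace between the tags, so that
       no additional "text" nodes appear) and lists the type of every node
       below the root. It relies on the "descendant_nodes" and "type" methods
       described under "METHODS".

```perl
use strict;
use warnings;
use Mojo::DOM58;

# Same document as above, without whitespace between the tags
my $dom = Mojo::DOM58->new(
  '<!DOCTYPE html><html><head><title>Hello</title></head>'
  . '<body>World!</body></html>');

# The object returned by "new" represents the "root" node
print $dom->type, "\n";

# Walk all nodes below the root in document order and collect their types
my $types = $dom->descendant_nodes->map('type')->join(', ');
print "$types\n";
```

       Only the nodes of type "tag" are elements, so "find" and "at" will
       never return the doctype or the text nodes listed here.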

CASE-SENSITIVITY

       Mojo::DOM58 defaults to HTML semantics, which means that all tag and
       attribute names are lowercased and selectors need to be lowercase as
       well.

         # HTML semantics
         my $dom = Mojo::DOM58->new('<P ID="greeting">Hi!</P>');
         say $dom->at('p[id]')->text;

       If an XML declaration is found, the parser will automatically switch
       into XML mode and everything becomes case-sensitive.

         # XML semantics
         my $dom = Mojo::DOM58->new('<?xml version="1.0"?><P ID="greeting">Hi!</P>');
         say $dom->at('P[ID]')->text;

       HTML or XML semantics can also be forced with the "xml" method.

         # Force HTML semantics
         my $dom = Mojo::DOM58->new->xml(0)->parse('<P ID="greeting">Hi!</P>');
         say $dom->at('p[id]')->text;

         # Force XML semantics
         my $dom = Mojo::DOM58->new->xml(1)->parse('<P ID="greeting">Hi!</P>');
         say $dom->at('P[ID]')->text;
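
       The effect of the two modes is easiest to see in the rendered output.
       The sketch below is only an illustration; it assumes that "xml" can
       also be called without arguments to query the active mode.

```perl
use strict;
use warnings;
use Mojo::DOM58;

# HTML semantics: tag and attribute names are lowercased while parsing
my $html = Mojo::DOM58->new('<P ID="greeting">Hi!</P>');
print "$html\n";    # <p id="greeting">Hi!</p>

# XML semantics are detected from the XML declaration, case is preserved
my $xml = Mojo::DOM58->new('<?xml version="1.0"?><P ID="greeting">Hi!</P>');
print "$xml\n";     # <?xml version="1.0"?><P ID="greeting">Hi!</P>

# Without arguments, "xml" reports whether XML mode is active
print $xml->xml ? "XML\n" : "HTML\n";

# A lowercase selector does not match the uppercase names in XML mode
print defined $xml->at('p[id]') ? "matched\n" : "no match\n";
```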

SELECTORS

       Mojo::DOM58 uses a CSS selector engine based on Mojo::DOM::CSS. All CSS
       selectors that make sense for a standalone parser are supported.

       *   Any element.

             my $all = $dom->find('*');

       E   An element of type "E".

             my $title = $dom->at('title');

       E[foo]
           An "E" element with a "foo" attribute.

             my $links = $dom->find('a[href]');

       E[foo="bar"]
           An "E" element whose "foo" attribute value is exactly equal to
           "bar".

             my $case_sensitive = $dom->find('input[type="hidden"]');
             my $case_sensitive = $dom->find('input[type=hidden]');

       E[foo="bar" i]
           An "E" element whose "foo" attribute value is exactly equal to any
           (ASCII-range) case-permutation of "bar". Note that this selector is
           EXPERIMENTAL and might change without warning!

             my $case_insensitive = $dom->find('input[type="hidden" i]');
             my $case_insensitive = $dom->find('input[type=hidden i]');
             my $case_insensitive = $dom->find('input[class~="foo" i]');

           This selector is part of Selectors Level 4
           <https://dev.w3.org/csswg/selectors-4>, which is still a work in
           progress.

       E[foo="bar" s]
           An "E" element whose "foo" attribute value is exactly and
           case-sensitively equal to "bar". Note that this selector is
           EXPERIMENTAL and might change without warning!

             my $case_sensitive = $dom->find('input[type="hidden" s]');

           This selector is part of Selectors Level 4
           <https://dev.w3.org/csswg/selectors-4>, which is still a work in
           progress.

       E[foo~="bar"]
           An "E" element whose "foo" attribute value is a list of
           whitespace-separated values, one of which is exactly equal to
           "bar".

             my $foo = $dom->find('input[class~="foo"]');
             my $foo = $dom->find('input[class~=foo]');

       E[foo^="bar"]
           An "E" element whose "foo" attribute value begins exactly with the
           string "bar".

             my $begins_with = $dom->find('input[name^="f"]');
             my $begins_with = $dom->find('input[name^=f]');

       E[foo$="bar"]
           An "E" element whose "foo" attribute value ends exactly with the
           string "bar".

             my $ends_with = $dom->find('input[name$="o"]');
             my $ends_with = $dom->find('input[name$=o]');

       E[foo*="bar"]
           An "E" element whose "foo" attribute value contains the substring
           "bar".

             my $contains = $dom->find('input[name*="fo"]');
             my $contains = $dom->find('input[name*=fo]');

       E[foo|="en"]
           An "E" element whose "foo" attribute has a hyphen-separated list of
           values beginning (from the left) with "en".

             my $english = $dom->find('link[hreflang|=en]');

       E:root
           An "E" element, root of the document.

             my $root = $dom->at(':root');

       E:nth-child(n)
           An "E" element, the "n-th" child of its parent.

             my $third = $dom->find('div:nth-child(3)');
             my $odd   = $dom->find('div:nth-child(odd)');
             my $even  = $dom->find('div:nth-child(even)');
             my $top3  = $dom->find('div:nth-child(-n+3)');

       E:nth-last-child(n)
           An "E" element, the "n-th" child of its parent, counting from the
           last one.

             my $third    = $dom->find('div:nth-last-child(3)');
             my $odd      = $dom->find('div:nth-last-child(odd)');
             my $even     = $dom->find('div:nth-last-child(even)');
             my $bottom3  = $dom->find('div:nth-last-child(-n+3)');

       E:nth-of-type(n)
           An "E" element, the "n-th" sibling of its type.

             my $third = $dom->find('div:nth-of-type(3)');
             my $odd   = $dom->find('div:nth-of-type(odd)');
             my $even  = $dom->find('div:nth-of-type(even)');
             my $top3  = $dom->find('div:nth-of-type(-n+3)');

       E:nth-last-of-type(n)
           An "E" element, the "n-th" sibling of its type, counting from the
           last one.

             my $third    = $dom->find('div:nth-last-of-type(3)');
             my $odd      = $dom->find('div:nth-last-of-type(odd)');
             my $even     = $dom->find('div:nth-last-of-type(even)');
             my $bottom3  = $dom->find('div:nth-last-of-type(-n+3)');

       E:first-child
           An "E" element, first child of its parent.

             my $first = $dom->find('div p:first-child');

       E:last-child
           An "E" element, last child of its parent.

             my $last = $dom->find('div p:last-child');

       E:first-of-type
           An "E" element, first sibling of its type.

             my $first = $dom->find('div p:first-of-type');

       E:last-of-type
           An "E" element, last sibling of its type.

             my $last = $dom->find('div p:last-of-type');

       E:only-child
           An "E" element, only child of its parent.

             my $lonely = $dom->find('div p:only-child');

       E:only-of-type
           An "E" element, only sibling of its type.

             my $lonely = $dom->find('div p:only-of-type');

       E:empty
           An "E" element that has no children (including text nodes).

             my $empty = $dom->find(':empty');

       E:any-link
           Alias for "E:link". Note that this selector is EXPERIMENTAL and
           might change without warning! This selector is part of Selectors
           Level 4 <https://dev.w3.org/csswg/selectors-4>, which is still a
           work in progress.

       E:link
           An "E" element being the source anchor of a hyperlink of which the
           target is not yet visited (":link") or already visited
           (":visited"). Note that Mojo::DOM58 is not stateful, therefore
           ":any-link", ":link" and ":visited" yield exactly the same results.

             my $links = $dom->find(':any-link');
             my $links = $dom->find(':link');
             my $links = $dom->find(':visited');

       E:visited
           Alias for "E:link".

       E:scope
           An "E" element being a designated reference element. Note that this
           selector is EXPERIMENTAL and might change without warning!

             my $scoped = $dom->find('a:not(:scope > a)');
             my $scoped = $dom->find('div :scope p');
             my $scoped = $dom->find('~ p');

           This selector is part of Selectors Level 4
           <https://dev.w3.org/csswg/selectors-4>, which is still a work in
           progress.

       E:checked
           A user interface element "E" which is checked (for instance a
           radio-button or checkbox).

             my $input = $dom->find(':checked');

       E.warning
           An "E" element whose class is "warning".

             my $warning = $dom->find('div.warning');

       E#myid
           An "E" element with "ID" equal to "myid".

             my $foo = $dom->at('div#foo');

       E:not(s1, s2)
           An "E" element that does not match either compound selector "s1" or
           compound selector "s2". Note that support for compound selectors is
           EXPERIMENTAL and might change without warning!

             my $others = $dom->find('div p:not(:first-child, :last-child)');

           Support for compound selectors was added as part of Selectors Level
           4 <https://dev.w3.org/csswg/selectors-4>, which is still a work in
           progress.

       E:is(s1, s2)
           An "E" element that matches compound selector "s1" and/or compound
           selector "s2". Note that this selector is EXPERIMENTAL and might
           change without warning!

             my $headers = $dom->find(':is(section, article, aside, nav) h1');

           This selector is part of Selectors Level 4
           <https://dev.w3.org/csswg/selectors-4>, which is still a work in
           progress.

       E:has(rs1, rs2)
           An "E" element, if either of the relative selectors "rs1" or "rs2",
           when evaluated with "E" as the ":scope" element, matches an
           element. Note that this selector is EXPERIMENTAL and might change
           without warning!

             my $link = $dom->find('a:has(> img)');

           This selector is part of Selectors Level 4
           <https://dev.w3.org/csswg/selectors-4>, which is still a work in
           progress. Also be aware that this feature is currently marked
           "at-risk", so there is a high chance that it will get removed
           completely.

       A|E An "E" element that belongs to the namespace alias "A" from CSS
           Namespaces Module Level 3
           <https://www.w3.org/TR/css-namespaces-3/>. Key/value pairs passed
           to selector methods are used to declare namespace aliases.

             my $elem = $dom->find('lq|elem', lq => 'http://example.com/q-markup');

           Using an empty alias searches for an element that belongs to no
           namespace.

             my $div = $dom->find('|div');

       E F An "F" element descendant of an "E" element.

             my $headlines = $dom->find('div h1');

       E > F
           An "F" element child of an "E" element.

             my $headlines = $dom->find('html > body > div > h1');

       E + F
           An "F" element immediately preceded by an "E" element.

             my $second = $dom->find('h1 + h2');

       E ~ F
           An "F" element preceded by an "E" element.

             my $second = $dom->find('h1 ~ h2');

       E, F, G
           Elements of type "E", "F" and "G".

             my $headlines = $dom->find('h1, h2, h3');

       E[foo=bar][bar=baz]
           An "E" element whose attributes match all following attribute
           selectors.

             my $links = $dom->find('a[foo^=b][foo$=ar]');
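
       To see several of the selectors listed above working together, the
       following sketch runs them against a small form. The markup and the
       "names" helper are made up for this example and are not part of
       Mojo::DOM58.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new(<<'HTML');
<form>
  <input type="hidden" name="foo" value="1">
  <input type="text" name="fob" class="a b">
  <input type="checkbox" name="bar" checked>
</form>
HTML

# Hypothetical helper: "name" attributes of all elements matching a selector
sub names { $dom->find(shift)->map(attr => 'name')->join(',') }

print names('input[name^=fo]'),                 "\n";    # foo,fob
print names('input[class~=b]'),                 "\n";    # fob
print names('input[type="HIDDEN" i]'),          "\n";    # foo
print names(':checked'),                        "\n";    # bar
print names('input:nth-child(-n+2)'),           "\n";    # foo,fob
print names('form > input:not([type=hidden])'), "\n";    # fob,bar
print names('input[type=hidden] ~ input'),      "\n";    # fob,bar
```

       Note that the structural pseudo-classes only count element siblings,
       so the whitespace between the "input" tags does not affect
       ":nth-child".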

OPERATORS

       Mojo::DOM58 overloads the following operators.

   array
         my @nodes = @$dom;

       Alias for "child_nodes".

         # "<!-- Test -->"
         $dom->parse('<!-- Test --><b>123</b>')->[0];

   bool
         my $bool = !!$dom;

       Always true.

   hash
         my %attrs = %$dom;

       Alias for "attr".

         # "test"
         $dom->parse('<div id="test">Test</div>')->at('div')->{id};

   stringify
         my $str = "$dom";

       Alias for "to_string".
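
       The four overloads can be combined on a single element. The following
       is a small sketch with made-up markup, using only the operators listed
       above plus the "type" method.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new('<div id="test"><!-- note --><b>123</b></div>');
my $div = $dom->at('div');

# hash: attributes of the element
print $div->{id}, "\n";          # test

# array: child nodes of the element (a comment and a tag)
print scalar(@$div), "\n";       # 2
print $div->[0]->type, "\n";     # comment

# stringify: same as "to_string"
print "$div\n";                  # <div id="test"><!-- note --><b>123</b></div>

# bool: an object is always true, so only "undef" signals a failed lookup
print "found\n" if $div;
```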

FUNCTIONS

       Mojo::DOM58 implements the following functions, which can be imported
       individually.

   tag_to_html
         my $str = tag_to_html 'div', id => 'foo', 'safe content';

       Generate HTML/XML tag and render it right away. This is a significantly
       faster alternative to "new_tag" for template systems that have to
       generate a lot of tags.
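
       A minimal sketch of how "tag_to_html" might be used to build a list
       without creating a Mojo::DOM58 object per tag. The item names are
       invented for the example; text content is escaped in the same way as
       for "new_tag".

```perl
use strict;
use warnings;
use Mojo::DOM58 'tag_to_html';

# Made-up data for the example
my @items = ('Apples', 'Pears & plums');

# Render one "li" tag per item and concatenate the resulting strings
my $html = join '', map { tag_to_html('li', class => 'item', $_) } @items;
print "$html\n";
# <li class="item">Apples</li><li class="item">Pears &amp; plums</li>
```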

METHODS

       Mojo::DOM58 implements the following methods.

   new
         my $dom = Mojo::DOM58->new;
         my $dom = Mojo::DOM58->new('<foo bar="baz">I ♥ Mojo::DOM58!</foo>');

       Construct a new scalar-based Mojo::DOM58 object and "parse" HTML/XML
       fragment if necessary.

   new_tag
         my $tag = Mojo::DOM58->new_tag('div');
         my $tag = $dom->new_tag('div');
         my $tag = $dom->new_tag('div', id => 'foo', hidden => undef);
         my $tag = $dom->new_tag('div', 'safe content');
         my $tag = $dom->new_tag('div', id => 'foo', 'safe content');
         my $tag = $dom->new_tag('div', data => {mojo => 'rocks'}, 'safe content');
         my $tag = $dom->new_tag('div', id => 'foo', sub { 'unsafe content' });

       Construct a new Mojo::DOM58 object for an HTML/XML tag with or without
       attributes and content. The "data" attribute may contain a hash
       reference with key/value pairs to generate attributes from.

         # "<br>"
         $dom->new_tag('br');

         # "<div></div>"
         $dom->new_tag('div');

         # "<div id="foo" hidden></div>"
         $dom->new_tag('div', id => 'foo', hidden => undef);

         # "<div>test &amp; 123</div>"
         $dom->new_tag('div', 'test & 123');

         # "<div id="foo">test &amp; 123</div>"
         $dom->new_tag('div', id => 'foo', 'test & 123');

         # "<div data-foo="1" data-bar="test">test &amp; 123</div>"
         $dom->new_tag('div', data => {foo => 1, Bar => 'test'}, 'test & 123');

         # "<div id="foo">test & 123</div>"
         $dom->new_tag('div', id => 'foo', sub { 'test & 123' });

         # "<div>Hello<b>Mojo!</b></div>"
         $dom->parse('<div>Hello</div>')->at('div')
           ->append_content($dom->new_tag('b', 'Mojo!'))->root;

   all_text
         my $text = $dom->all_text;

       Extract text content from all descendant nodes of this element. For
       HTML documents "script" and "style" elements are excluded.

         # "foo\nbarbaz\n"
         $dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->at('div')->all_text;

   ancestors
         my $collection = $dom->ancestors;
         my $collection = $dom->ancestors('div ~ p');

       Find all ancestor elements of this node matching the CSS selector and
       return a collection containing these elements as Mojo::DOM58 objects.
       All selectors listed in "SELECTORS" are supported.

         # List tag names of ancestor elements
         say $dom->ancestors->map('tag')->join("\n");

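       As a small sketch of "ancestors" with and without a selector, the
       example below starts from a deeply nested element. The markup is made
       up; the collection starts with the closest ancestor.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new('<div><p><b><i>Test</i></b></p></div>');
my $i   = $dom->at('i');

# All ancestor elements, closest first
my $all = $i->ancestors->map('tag')->join(',');
print "$all\n";         # b,p,div

# Only the ancestors matching a selector
my $blocks = $i->ancestors('p, div')->map('tag')->join(',');
print "$blocks\n";      # p,div
```
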
   append
         $dom = $dom->append('<p>I ♥ Mojo::DOM58!</p>');
         $dom = $dom->append(Mojo::DOM58->new);

       Append HTML/XML fragment to this node (for all node types other than
       "root").

         # "<div><h1>Test</h1><h2>123</h2></div>"
         $dom->parse('<div><h1>Test</h1></div>')
           ->at('h1')->append('<h2>123</h2>')->root;

         # "<p>Test 123</p>"
         $dom->parse('<p>Test</p>')->at('p')
           ->child_nodes->first->append(' 123')->root;

   append_content
         $dom = $dom->append_content('<p>I ♥ Mojo::DOM58!</p>');
         $dom = $dom->append_content(Mojo::DOM58->new);

       Append HTML/XML fragment (for "root" and "tag" nodes) or raw content to
       this node's content.

         # "<div><h1>Test123</h1></div>"
         $dom->parse('<div><h1>Test</h1></div>')
           ->at('h1')->append_content('123')->root;

         # "<!-- Test 123 --><br>"
         $dom->parse('<!-- Test --><br>')
           ->child_nodes->first->append_content('123 ')->root;

         # "<p>Test<i>123</i></p>"
         $dom->parse('<p>Test</p>')->at('p')->append_content('<i>123</i>')->root;

   at
         my $result = $dom->at('div ~ p');
         my $result = $dom->at('svg|line', svg => 'http://www.w3.org/2000/svg');

       Find first descendant element of this element matching the CSS selector
       and return it as a Mojo::DOM58 object, or "undef" if none could be
       found. All selectors listed in "SELECTORS" are supported.

         # Find first element with "svg" namespace definition
         my $namespace = $dom->at('[xmlns\:svg]')->{'xmlns:svg'};

       Trailing key/value pairs can be used to declare XML namespace aliases.

         # "<rect />"
         $dom->parse('<svg xmlns="http://www.w3.org/2000/svg"><rect /></svg>')
           ->at('svg|rect', svg => 'http://www.w3.org/2000/svg');

   attr
         my $hash = $dom->attr;
         my $foo  = $dom->attr('foo');
         $dom     = $dom->attr({foo => 'bar'});
         $dom     = $dom->attr(foo => 'bar');

       This element's attributes.

         # Remove an attribute
         delete $dom->attr->{id};

         # Attribute without value
         $dom->attr(selected => undef);

         # List id attributes
         say $dom->find('*')->map(attr => 'id')->compact->join("\n");

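       The getter and setter forms can be mixed freely. The following sketch,
       with made-up markup, adds an attribute to every link, removes another
       one and then reads the remaining values back.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new('<a href="/a">A</a><a href="/b" id="x">B</a>');

# Set an attribute on every link ($_ is the current element)
$dom->find('a')->each(sub { $_->attr(rel => 'nofollow') });

# Remove an attribute through the hash reference returned by the getter
delete $dom->at('#x')->attr->{id};

print "$dom\n";
# <a href="/a" rel="nofollow">A</a><a href="/b" rel="nofollow">B</a>

# Read a single attribute from each element
my $hrefs = $dom->find('a')->map(attr => 'href')->join(',');
print "$hrefs\n";    # /a,/b
```
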
   child_nodes
         my $collection = $dom->child_nodes;

       Return a collection containing all child nodes of this element as
       Mojo::DOM58 objects.

         # "<p><b>123</b></p>"
         $dom->parse('<p>Test<b>123</b></p>')->at('p')->child_nodes->first->remove;

         # "<!DOCTYPE html>"
         $dom->parse('<!DOCTYPE html><b>123</b>')->child_nodes->first;

         # " Test "
         $dom->parse('<b>123</b><!-- Test -->')->child_nodes->last->content;

   children
         my $collection = $dom->children;
         my $collection = $dom->children('div ~ p');

       Find all child elements of this element matching the CSS selector and
       return a collection containing these elements as Mojo::DOM58 objects.
       All selectors listed in "SELECTORS" are supported.

         # Show tag name of random child element
         say $dom->children->shuffle->first->tag;

   content
         my $str = $dom->content;
         $dom    = $dom->content('<p>I ♥ Mojo::DOM58!</p>');
         $dom    = $dom->content(Mojo::DOM58->new);

       Return this node's content or replace it with HTML/XML fragment (for
       "root" and "tag" nodes) or raw content.

         # "<b>Test</b>"
         $dom->parse('<div><b>Test</b></div>')->at('div')->content;

         # "<div><h1>123</h1></div>"
         $dom->parse('<div><h1>Test</h1></div>')->at('h1')->content('123')->root;

         # "<p><i>123</i></p>"
         $dom->parse('<p>Test</p>')->at('p')->content('<i>123</i>')->root;

         # "<div><h1></h1></div>"
         $dom->parse('<div><h1>Test</h1></div>')->at('h1')->content('')->root;

         # " Test "
         $dom->parse('<!-- Test --><br>')->child_nodes->first->content;

         # "<div><!-- 123 -->456</div>"
         $dom->parse('<div><!-- Test -->456</div>')
           ->at('div')->child_nodes->first->content(' 123 ')->root;

   descendant_nodes
         my $collection = $dom->descendant_nodes;

       Return a collection containing all descendant nodes of this element as
       Mojo::DOM58 objects.

         # "<p><b>123</b></p>"
         $dom->parse('<p><!-- Test --><b>123<!-- 456 --></b></p>')
           ->descendant_nodes->grep(sub { $_->type eq 'comment' })
           ->map('remove')->first;

         # "<p><b>test</b>test</p>"
         $dom->parse('<p><b>123</b>456</p>')
           ->at('p')->descendant_nodes->grep(sub { $_->type eq 'text' })
           ->map(content => 'test')->first->root;

   find
         my $collection = $dom->find('div ~ p');
         my $collection = $dom->find('svg|line', svg => 'http://www.w3.org/2000/svg');

       Find all descendant elements of this element matching the CSS selector
       and return a collection containing these elements as Mojo::DOM58
       objects. All selectors listed in "SELECTORS" are supported.

         # Find a specific element and extract information
         my $id = $dom->find('div')->[23]{id};

         # Extract information from multiple elements
         my @headers = $dom->find('h1, h2, h3')->map('text')->each;

         # Count all the different tags
         my $hash = $dom->find('*')->reduce(sub { $a->{$b->tag}++; $a }, {});

         # Find elements with a class that contains dots
         my @divs = $dom->find('div.foo\.bar')->each;

       Trailing key/value pairs can be used to declare XML namespace aliases.

         # "<rect />"
         $dom->parse('<svg xmlns="http://www.w3.org/2000/svg"><rect /></svg>')
           ->find('svg|rect', svg => 'http://www.w3.org/2000/svg')->first;

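       A typical use of "find" is to pull structured data out of a document.
       The following sketch, with made-up markup, collects the target and
       text of every real link in a list and skips anchors without an "href"
       attribute.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new(<<'HTML');
<ul>
  <li><a href="/one">One</a></li>
  <li><a href="/two">Two</a></li>
  <li><a name="anchor">No link</a></li>
</ul>
HTML

# One [href, text] pair per matching element ($_ is the current element)
my @links
  = $dom->find('li > a[href]')->map(sub { [$_->{href}, $_->text] })->each;

printf "%s => %s\n", @$_ for @links;
# /one => One
# /two => Two
```
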
   following
         my $collection = $dom->following;
         my $collection = $dom->following('div ~ p');

       Find all sibling elements after this node matching the CSS selector and
       return a collection containing these elements as Mojo::DOM58 objects.
       All selectors listed in "SELECTORS" are supported.

         # List tags of sibling elements after this node
         say $dom->following->map('tag')->join("\n");

   following_nodes
         my $collection = $dom->following_nodes;

       Return a collection containing all sibling nodes after this node as
       Mojo::DOM58 objects.

         # "C"
         $dom->parse('<p>A</p><!-- B -->C')->at('p')->following_nodes->last->content;

   matches
         my $bool = $dom->matches('div ~ p');
         my $bool = $dom->matches('svg|line', svg => 'http://www.w3.org/2000/svg');

       Check if this element matches the CSS selector. All selectors listed in
       "SELECTORS" are supported.

         # True
         $dom->parse('<p class="a">A</p>')->at('p')->matches('.a');
         $dom->parse('<p class="a">A</p>')->at('p')->matches('p[class]');

         # False
         $dom->parse('<p class="a">A</p>')->at('p')->matches('.b');
         $dom->parse('<p class="a">A</p>')->at('p')->matches('p[id]');

       Trailing key/value pairs can be used to declare XML namespace aliases.

         # True
         $dom->parse('<svg xmlns="http://www.w3.org/2000/svg"><rect /></svg>')
           ->matches('svg|rect', svg => 'http://www.w3.org/2000/svg');

   namespace
         my $namespace = $dom->namespace;

       Find this element's namespace, or return "undef" if none could be
       found.

         # "http://www.w3.org/2000/svg"
         Mojo::DOM58->new('<svg xmlns:svg="http://www.w3.org/2000/svg"><svg:circle>3.14</svg:circle></svg>')->at('svg\:circle')->namespace;

         # Find namespace for an element with namespace prefix
         my $namespace = $dom->at('svg > svg\:circle')->namespace;

         # Find namespace for an element that may or may not have a namespace prefix
         my $namespace = $dom->at('svg > circle')->namespace;

   next
         my $sibling = $dom->next;

       Return Mojo::DOM58 object for next sibling element, or "undef" if there
       are no more siblings.

         # "<h2>123</h2>"
         $dom->parse('<div><h1>Test</h1><h2>123</h2></div>')->at('h1')->next;

   next_node
         my $sibling = $dom->next_node;

       Return Mojo::DOM58 object for next sibling node, or "undef" if there
       are no more siblings.

         # "456"
         $dom->parse('<p><b>123</b><!-- Test -->456</p>')
           ->at('b')->next_node->next_node;

         # " Test "
         $dom->parse('<p><b>123</b><!-- Test -->456</p>')
           ->at('b')->next_node->content;

   parent
         my $parent = $dom->parent;

       Return Mojo::DOM58 object for parent of this node, or "undef" if this
       node has no parent.

         # "<b><i>Test</i></b>"
         $dom->parse('<p><b><i>Test</i></b></p>')->at('i')->parent;

   parse
         $dom = $dom->parse('<foo bar="baz">I ♥ Mojo::DOM58!</foo>');

       Parse HTML/XML fragment.

         # Parse XML
         my $dom = Mojo::DOM58->new->xml(1)->parse('<foo>I ♥ Mojo::DOM58!</foo>');

   preceding
         my $collection = $dom->preceding;
         my $collection = $dom->preceding('div ~ p');

       Find all sibling elements before this node matching the CSS selector
       and return a collection containing these elements as Mojo::DOM58
       objects. All selectors listed in "SELECTORS" are supported.

         # List tags of sibling elements before this node
         say $dom->preceding->map('tag')->join("\n");

   preceding_nodes
         my $collection = $dom->preceding_nodes;

       Return a collection containing all sibling nodes before this node as
       Mojo::DOM58 objects.

         # "A"
         $dom->parse('A<!-- B --><p>C</p>')->at('p')->preceding_nodes->first->content;

   prepend
         $dom = $dom->prepend('<p>I ♥ Mojo::DOM58!</p>');
         $dom = $dom->prepend(Mojo::DOM58->new);

       Prepend HTML/XML fragment to this node (for all node types other than
       "root").

         # "<div><h1>Test</h1><h2>123</h2></div>"
         $dom->parse('<div><h2>123</h2></div>')
           ->at('h2')->prepend('<h1>Test</h1>')->root;

         # "<p>Test 123</p>"
         $dom->parse('<p>123</p>')
           ->at('p')->child_nodes->first->prepend('Test ')->root;

   prepend_content
         $dom = $dom->prepend_content('<p>I ♥ Mojo::DOM58!</p>');
         $dom = $dom->prepend_content(Mojo::DOM58->new);

       Prepend HTML/XML fragment (for "root" and "tag" nodes) or raw content
       to this node's content.

         # "<div><h2>Test123</h2></div>"
         $dom->parse('<div><h2>123</h2></div>')
           ->at('h2')->prepend_content('Test')->root;

         # "<!-- Test 123 --><br>"
         $dom->parse('<!-- 123 --><br>')
           ->child_nodes->first->prepend_content(' Test')->root;

         # "<p><i>123</i>Test</p>"
         $dom->parse('<p>Test</p>')->at('p')->prepend_content('<i>123</i>')->root;

   previous
         my $sibling = $dom->previous;

       Return Mojo::DOM58 object for previous sibling element, or "undef" if
       there are no more siblings.

         # "<h1>Test</h1>"
         $dom->parse('<div><h1>Test</h1><h2>123</h2></div>')->at('h2')->previous;

   previous_node
         my $sibling = $dom->previous_node;

       Return Mojo::DOM58 object for previous sibling node, or "undef" if
       there are no more siblings.

         # "123"
         $dom->parse('<p>123<!-- Test --><b>456</b></p>')
           ->at('b')->previous_node->previous_node;

         # " Test "
         $dom->parse('<p>123<!-- Test --><b>456</b></p>')
           ->at('b')->previous_node->content;

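       The element-based methods ("next", "previous") and the node-based ones
       ("next_node", "previous_node") differ in what they skip. The sketch
       below walks the same made-up list both ways; since these methods
       return "undef" at the end, they can drive a simple loop.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom
  = Mojo::DOM58->new('<ul><li>A</li><!-- gap --><li>B</li><li>C</li></ul>');

# Element siblings only: the comment is skipped
my @texts;
for (my $li = $dom->at('li'); $li; $li = $li->next) {
  push @texts, $li->text;
}
print join(',', @texts), "\n";    # A,B,C

# All sibling nodes: the comment is visited as well
my @types;
for (my $node = $dom->at('li'); $node; $node = $node->next_node) {
  push @types, $node->type;
}
print join(',', @types), "\n";    # tag,comment,tag,tag

# Walking backwards from the last element
my $before_last = $dom->find('li')->last->previous->text;
print "$before_last\n";           # B
```
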
828   remove
829         my $parent = $dom->remove;
830
831       Remove this node and return "root" (for "root" nodes) or "parent".
832
833         # "<div></div>"
834         $dom->parse('<div><h1>Test</h1></div>')->at('h1')->remove;
835
836         # "<p><b>456</b></p>"
837         $dom->parse('<p>123<b>456</b></p>')
838           ->at('p')->child_nodes->first->remove->root;
839
840   replace
841         my $parent = $dom->replace('<div>I ♥ Mojo::DOM58!</div>');
842         my $parent = $dom->replace(Mojo::DOM58->new);
843
844       Replace this node with HTML/XML fragment and return "root" (for "root"
845       nodes) or "parent".
846
847         # "<div><h2>123</h2></div>"
848         $dom->parse('<div><h1>Test</h1></div>')->at('h1')->replace('<h2>123</h2>');
849
850         # "<p><b>123</b></p>"
851         $dom->parse('<p>Test</p>')
852           ->at('p')->child_nodes->[0]->replace('<b>123</b>')->root;
853
854   root
855         my $root = $dom->root;
856
857       Return Mojo::DOM58 object for "root" node.
858
859   selector
860         my $selector = $dom->selector;
861
862       Get a unique CSS selector for this element.
863
864         # "ul:nth-child(1) > li:nth-child(2)"
865         $dom->parse('<ul><li>Test</li><li>123</li></ul>')->find('li')->last->selector;
866
867         # "p:nth-child(1) > b:nth-child(1) > i:nth-child(1)"
868         $dom->parse('<p><b><i>Test</i></b></p>')->at('i')->selector;
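
       A generated selector can be fed back into "at" to locate the same
       element again. The following is a minimal sketch of that round trip;
       the markup and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new('<ul><li>Test</li><li>123</li></ul>');

# Compute a selector for the last <li>, then look the element up again with it
my $selector = $dom->find('li')->last->selector;
my $again    = $dom->at($selector);

print "$selector\n";
print $again->text, "\n";
```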

   strip
         my $parent = $dom->strip;

       Remove this element while preserving its content and return "parent".

         # "<div>Test</div>"
         $dom->parse('<div><h1>Test</h1></div>')->at('h1')->strip;

   tag
         my $tag = $dom->tag;
         $dom    = $dom->tag('div');

       This element's tag name.

         # List tag names of child elements
         say $dom->children->map('tag')->join("\n");

   tap
         $dom = $dom->tap(sub {...});

       Equivalent to "tap" in Mojo::Base.

   text
         my $text = $dom->text;

       Extract text content from this element only (not including child
       elements).

         # "bar"
         $dom->parse("<div>foo<p>bar</p>baz</div>")->at('p')->text;

         # "foo\nbaz\n"
         $dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->at('div')->text;
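
       To make the "this element only" behaviour concrete, the sketch below
       compares "text" with "all_text" on the same element. The sample
       markup is an assumption chosen for illustration.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $div = Mojo::DOM58->new('<div>foo<p>bar</p>baz</div>')->at('div');

# Only the text nodes that are direct children of <div>
my $own = $div->text;

# Text from <div> and from all of its descendants
my $all = $div->all_text;

print "text:     $own\n";
print "all_text: $all\n";
```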

   to_string
         my $str = $dom->to_string;

       Render this node and its content to HTML/XML.

         # "<b>Test</b>"
         $dom->parse('<div><b>Test</b></div>')->at('div b')->to_string;

       To extract text content from all descendant nodes, see "all_text".

   tree
         my $tree = $dom->tree;
         $dom     = $dom->tree(['root']);

       Document Object Model. Note that this structure should only be used
       very carefully since it is very dynamic.

   type
         my $type = $dom->type;

       This node's type, usually "cdata", "comment", "doctype", "pi", "raw",
       "root", "tag" or "text".

         # "cdata"
         $dom->parse('<![CDATA[Test]]>')->child_nodes->first->type;

         # "comment"
         $dom->parse('<!-- Test -->')->child_nodes->first->type;

         # "doctype"
         $dom->parse('<!DOCTYPE html>')->child_nodes->first->type;

         # "pi"
         $dom->parse('<?xml version="1.0"?>')->child_nodes->first->type;

         # "raw"
         $dom->parse('<title>Test</title>')->at('title')->child_nodes->first->type;

         # "root"
         $dom->parse('<p>Test</p>')->type;

         # "tag"
         $dom->parse('<p>Test</p>')->at('p')->type;

         # "text"
         $dom->parse('<p>Test</p>')->at('p')->child_nodes->first->type;
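
       Since "child_nodes" returns every kind of node, "type" is a handy way
       to tell them apart. A small sketch, assuming the sample paragraph
       below, that lists the node types and then keeps only the elements:

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom   = Mojo::DOM58->new('<p>123<!-- Test --><b>456</b></p>');
my $nodes = $dom->at('p')->child_nodes;

# One type name per child node, in document order
my $types = $nodes->map('type')->join(', ');

# Keep only element nodes and list their tag names
my $tags = $nodes->grep(sub { $_->type eq 'tag' })->map('tag')->join(', ');

print "types: $types\n";
print "tags:  $tags\n";
```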

   val
         my $value = $dom->val;

       Extract value from form element (such as "button", "input", "option",
       "select" and "textarea"), or return "undef" if this element has no
       value. In the case of "select" with "multiple" attribute, find "option"
       elements with "selected" attribute and return an array reference with
       all values, or "undef" if none could be found.

         # "a"
         $dom->parse('<input name=test value=a>')->at('input')->val;

         # "b"
         $dom->parse('<textarea>b</textarea>')->at('textarea')->val;

         # "c"
         $dom->parse('<option value="c">Test</option>')->at('option')->val;

         # "d"
         $dom->parse('<select><option selected>d</option></select>')
           ->at('select')->val;

         # "e"
         $dom->parse('<select multiple><option selected>e</option></select>')
           ->at('select')->val->[0];

         # "on"
         $dom->parse('<input name=test type=checkbox>')->at('input')->val;
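
       Combined with "find", "val" can collect the current values of a whole
       form. The sketch below is one possible way to do that; the form
       markup, the field names and the %form hash are hypothetical.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new(<<'HTML');
<form>
  <input name="user" value="alice">
  <textarea name="bio">Perl hacker</textarea>
  <select name="lang"><option selected>perl</option><option>raku</option></select>
</form>
HTML

# Map each field's "name" attribute to its current value
my %form;
$dom->find('input, textarea, select')->each(sub {
  my $field = shift;
  $form{ $field->attr('name') } = $field->val;
});

print "$_ = $form{$_}\n" for sort keys %form;
```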

   with_roles
         my $new_class = Mojo::DOM58->with_roles('Mojo::DOM58::Role::One');
         my $new_class = Mojo::DOM58->with_roles('+One', '+Two');
         $dom          = $dom->with_roles('+One', '+Two');

       Equivalent to "with_roles" in Mojo::Base. Note that role support
       depends on Role::Tiny (2.000001+).

   wrap
         $dom = $dom->wrap('<div></div>');
         $dom = $dom->wrap(Mojo::DOM58->new);

       Wrap HTML/XML fragment around this node (for all node types other than
       "root"), placing it as the last child of the first innermost element.

         # "<p>123<b>Test</b></p>"
         $dom->parse('<b>Test</b>')->at('b')->wrap('<p>123</p>')->root;

         # "<div><p><b>Test</b></p>123</div>"
         $dom->parse('<b>Test</b>')->at('b')->wrap('<div><p></p>123</div>')->root;

         # "<p><b>Test</b></p><p>123</p>"
         $dom->parse('<b>Test</b>')->at('b')->wrap('<p></p><p>123</p>')->root;

         # "<p><b>Test</b></p>"
         $dom->parse('<p>Test</p>')->at('p')->child_nodes->first->wrap('<b>')->root;

   wrap_content
         $dom = $dom->wrap_content('<div></div>');
         $dom = $dom->wrap_content(Mojo::DOM58->new);

       Wrap HTML/XML fragment around this node's content (for "root" and "tag"
       nodes), placing that content as the last children of the first
       innermost element.

         # "<p><b>123Test</b></p>"
         $dom->parse('<p>Test</p>')->at('p')->wrap_content('<b>123</b>')->root;

         # "<p><b>Test</b></p><p>123</p>"
         $dom->parse('<b>Test</b>')->wrap_content('<p></p><p>123</p>');
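
       The difference between "wrap" and "wrap_content" is easiest to see
       when both are applied to the same element with the same fragment.
       This is only an illustrative sketch; the markup and variable names
       are assumptions.

```perl
use strict;
use warnings;
use Mojo::DOM58;

# "wrap" puts the fragment around the element itself
my $outside = Mojo::DOM58->new('<p>Test</p>')
  ->at('p')->wrap('<span></span>')->root->to_string;

# "wrap_content" puts the fragment around the element's children
my $inside = Mojo::DOM58->new('<p>Test</p>')
  ->at('p')->wrap_content('<span></span>')->root->to_string;

print "wrap:         $outside\n";
print "wrap_content: $inside\n";
```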

   xml
         my $bool = $dom->xml;
         $dom     = $dom->xml($bool);

       Disable HTML semantics in the parser and activate case-sensitivity.
       Defaults to auto-detection based on XML declarations.
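
       As a rough sketch of what case-sensitivity means in practice, the
       same markup can be parsed in both modes. In HTML mode names are
       expected to be matched in lowercase, while in XML mode they have to
       be written exactly as they appear in the document. The sample markup
       is an assumption.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $markup = '<P ID="greeting">Hi!</P>';

# HTML mode (the default here, since there is no XML declaration)
my $html      = Mojo::DOM58->new($markup);
my $html_text = $html->at('p[id]')->text;

# XML mode, enabled explicitly before parsing
my $xml      = Mojo::DOM58->new->xml(1)->parse($markup);
my $xml_text = $xml->at('P[ID]')->text;

print "HTML mode: $html_text\n";
print "XML mode:  $xml_text\n";
```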

COLLECTION METHODS

       Some Mojo::DOM58 methods return an array-based collection object based
       on Mojo::Collection, which can either be accessed directly as an array
       reference, or with the following methods.

         # Chain methods
         $collection->map(sub { ucfirst })->shuffle->each(sub {
           my ($word, $num) = @_;
           say "$num: $word";
         });

         # Access array directly to manipulate collection
         $collection->[23] += 100;
         say for @$collection;

   compact
         my $new = $collection->compact;

       Create a new collection with all elements that are defined and not an
       empty string.

         # $collection contains (0, 1, undef, 2, '', 3)
         $collection->compact->join(', '); # "0, 1, 2, 3"

   each
         my @elements = $collection->each;
         $collection  = $collection->each(sub {...});

       Evaluate callback for each element in collection or return all elements
       as a list if none has been provided. The element will be the first
       argument passed to the callback and is also available as $_.

         # Make a numbered list
         $collection->each(sub {
           my ($e, $num) = @_;
           say "$num: $e";
         });

   first
         my $first = $collection->first;
         my $first = $collection->first(qr/foo/);
         my $first = $collection->first(sub {...});
         my $first = $collection->first($method);
         my $first = $collection->first($method, @args);

       Evaluate regular expression/callback for, or call method on, each
       element in collection and return the first one that matched the regular
       expression, or for which the callback/method returned true. The element
       will be the first argument passed to the callback and is also available
       as $_.

         # Longer version
         my $first = $collection->first(sub { $_->$method(@args) });

         # Find first value that contains the word "mojo"
         my $interesting = $collection->first(qr/mojo/i);

         # Find first value that is greater than 5
         my $greater = $collection->first(sub { $_ > 5 });

   flatten
         my $new = $collection->flatten;

       Flatten nested collections/arrays recursively and create a new
       collection with all elements.

         # $collection contains (1, [2, [3, 4], 5, [6]], 7)
         $collection->flatten->join(', '); # "1, 2, 3, 4, 5, 6, 7"

   grep
         my $new = $collection->grep(qr/foo/);
         my $new = $collection->grep(sub {...});
         my $new = $collection->grep($method);
         my $new = $collection->grep($method, @args);

       Evaluate regular expression/callback for, or call method on, each
       element in collection and create a new collection with all elements
       that matched the regular expression, or for which the callback/method
       returned true. The element will be the first argument passed to the
       callback and is also available as $_.

         # Longer version
         my $new = $collection->grep(sub { $_->$method(@args) });

         # Find all values that contain the word "mojo"
         my $interesting = $collection->grep(qr/mojo/i);

         # Find all values that are greater than 5
         my $greater = $collection->grep(sub { $_ > 5 });
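
       Collections are normally obtained from a parsed document rather than
       built by hand. The sketch below, which assumes a made-up list of
       words, shows "first" and "grep" applied to the text of some list
       items.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new(
  '<ul><li>Mojo</li><li>Perl</li><li>mojolicious</li></ul>');

# A collection of plain strings, one per <li>
my $words = $dom->find('li')->map('text');

# First match only, using a regular expression
my $first = $words->first(qr/mojo/i);

# All matches, using a regular expression and then a callback
my $all  = $words->grep(qr/mojo/i)->join(', ');
my $long = $words->grep(sub { length($_) > 4 })->join(', ');

print "first: $first\n";
print "all:   $all\n";
print "long:  $long\n";
```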

   head
         my $new = $collection->head(4);
         my $new = $collection->head(-2);

       Create a new collection with up to the specified number of elements
       from the beginning of the collection. A negative number will count from
       the end.

         # $collection contains ('A', 'B', 'C', 'D', 'E')
         $collection->head(3)->join(' '); # "A B C"
         $collection->head(-3)->join(' '); # "A B"

   join
         my $string = $collection->join;
         my $string = $collection->join("\n");

       Turn collection into string.

         # Join all values with commas
         $collection->join(', ');

   last
         my $last = $collection->last;

       Return the last element in collection.

   map
         my $new = $collection->map(sub {...});
         my $new = $collection->map($method);
         my $new = $collection->map($method, @args);

       Evaluate callback for, or call method on, each element in collection
       and create a new collection from the results. The element will be the
       first argument passed to the callback and is also available as $_.

         # Longer version
         my $new = $collection->map(sub { $_->$method(@args) });

         # Append the word "mojo" to all values
         my $domified = $collection->map(sub { $_ . 'mojo' });

   reduce
         my $result = $collection->reduce(sub {...});
         my $result = $collection->reduce(sub {...}, $initial);

       Reduce elements in collection with a callback and return the final
       result. The callback sees the accumulated value as $a and the current
       element as $b. The first element will be used as initial value if none
       has been provided.

         # Calculate the sum of all values
         my $sum = $collection->reduce(sub { $a + $b });

         # Count how often each value occurs in collection
         my $hash = $collection->reduce(sub { $a->{$b}++; $a }, {});
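
       The counting idiom above works well on values extracted from a
       document. As a sketch, assuming the sample markup below, this tallies
       how often each tag name occurs:

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new('<ul><li>a</li><li>b</li></ul><p>c</p>');

# Collect every element's tag name, then fold the names into a hash of counts
my $count = $dom->find('*')->map('tag')->reduce(sub { $a->{$b}++; $a }, {});

print "$_: $count->{$_}\n" for sort keys %$count;
```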

   reverse
         my $new = $collection->reverse;

       Create a new collection with all elements in reverse order.

   slice
         my $new = $collection->slice(4 .. 7);

       Create a new collection with all selected elements.

         # $collection contains ('A', 'B', 'C', 'D', 'E')
         $collection->slice(1, 2, 4)->join(' '); # "B C E"

   shuffle
         my $new = $collection->shuffle;

       Create a new collection with all elements in random order.

   size
         my $size = $collection->size;

       Number of elements in collection.

   sort
         my $new = $collection->sort;
         my $new = $collection->sort(sub {...});

       Sort elements based on return value of callback and create a new
       collection from the results.

         # Sort values case-insensitively
         my $case_insensitive = $collection->sort(sub { uc($a) cmp uc($b) });

   tail
         my $new = $collection->tail(4);
         my $new = $collection->tail(-2);

       Create a new collection with up to the specified number of elements
       from the end of the collection. A negative number will count from the
       beginning.

         # $collection contains ('A', 'B', 'C', 'D', 'E')
         $collection->tail(3)->join(' '); # "C D E"
         $collection->tail(-3)->join(' '); # "D E"
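
       "head", "tail" and "slice" can be combined to pick parts of a result
       set. A short sketch, assuming an ordered list with five made-up
       items:

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new(
  '<ol><li>A</li><li>B</li><li>C</li><li>D</li><li>E</li></ol>');
my $items = $dom->find('li')->map('text');

my $first_two = $items->head(2)->join(' ');             # first two items
my $last_two  = $items->tail(2)->join(' ');             # last two items
my $middle    = $items->slice(1 .. 3)->join(' ');       # items at index 1 to 3
my $trimmed   = $items->head(-1)->tail(-1)->join(' ');  # drop last, then first

print "$_\n" for $first_two, $last_two, $middle, $trimmed;
```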

   tap
         $collection = $collection->tap(sub {...});

       Equivalent to "tap" in Mojo::Base.

   to_array
         my $array = $collection->to_array;

       Turn collection into array reference.

   uniq
         my $new = $collection->uniq;
         my $new = $collection->uniq(sub {...});
         my $new = $collection->uniq($method);
         my $new = $collection->uniq($method, @args);

       Create a new collection without duplicate elements, using the string
       representation of either the elements or the return value of the
       callback/method to decide uniqueness. Note that "undef" and empty
       string are treated the same.

         # Longer version
         my $new = $collection->uniq(sub { $_->$method(@args) });

         # $collection contains ('foo', 'bar', 'bar', 'baz')
         $collection->uniq->join(' '); # "foo bar baz"

         # $collection contains ([1, 2], [2, 1], [3, 2])
         $collection->uniq(sub { $_->[1] })->to_array; # "[[1, 2], [2, 1]]"
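
       The method form of "uniq" is convenient for removing elements that
       share an attribute value. The following sketch assumes three made-up
       links, two of which point to the same location.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new(
  '<a href="/a">1</a><a href="/b">2</a><a href="/a">3</a>');

# Keep only the first link for each distinct "href" value
my $links = $dom->find('a')->uniq(attr => 'href')->map('text')->join(' ');

# Alternatively, reduce to the distinct attribute values themselves
my $hrefs = $dom->find('a')->map(attr => 'href')->uniq->sort->join(' ');

print "links: $links\n";
print "hrefs: $hrefs\n";
```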

   with_roles
         $collection = $collection->with_roles('Mojo::Collection::Role::One');

       Equivalent to "with_roles" in Mojo::Base. Note that role support
       depends on Role::Tiny (2.000001+).

DEBUGGING

       You can set the "MOJO_DOM58_CSS_DEBUG" environment variable to get some
       advanced diagnostics information printed to "STDERR".

         MOJO_DOM58_CSS_DEBUG=1

BUGS

       Report issues related to the format of this distribution or Perl 5.8
       support to the public bugtracker. Any other issues should be reported
       directly to the upstream Mojolicious issue tracker.

AUTHOR

       Dan Book <dbook@cpan.org>

       Code and tests adapted from Mojo::DOM, a lightweight DOM parser by the
       Mojolicious team.

CONTRIBUTORS

       Matt S Trout (mst)

COPYRIGHT AND LICENSE

       Copyright (c) 2008-2016 Sebastian Riedel and others.

       Copyright (c) 2016 "AUTHOR" and "CONTRIBUTORS" for adaptation to
       standalone format.

       This is free software, licensed under:

         The Artistic License 2.0 (GPL Compatible)

SEE ALSO

       Mojo::DOM, HTML::TreeBuilder, XML::LibXML, XML::Twig, XML::Smart



perl v5.36.0                      2022-07-22                    Mojo::DOM58(3)