Mojo::DOM58(3)        User Contributed Perl Documentation       Mojo::DOM58(3)

NAME

       Mojo::DOM58 - Minimalistic HTML/XML DOM parser with CSS selectors

SYNOPSIS

         use Mojo::DOM58;

         # Parse
         my $dom = Mojo::DOM58->new('<div><p id="a">Test</p><p id="b">123</p></div>');

         # Find
         say $dom->at('#b')->text;
         say $dom->find('p')->map('text')->join("\n");
         say $dom->find('[id]')->map(attr => 'id')->join("\n");

         # Iterate
         $dom->find('p[id]')->reverse->each(sub { say $_->{id} });

         # Loop
         for my $e ($dom->find('p[id]')->each) {
           say $e->{id}, ':', $e->text;
         }

         # Modify
         $dom->find('div p')->last->append('<p id="c">456</p>');
         $dom->at('#c')->prepend($dom->new_tag('p', id => 'd', '789'));
         $dom->find(':not(p)')->map('strip');

         # Render
         say "$dom";

DESCRIPTION

       Mojo::DOM58 is a minimalistic and relaxed pure-Perl HTML/XML DOM parser
       based on Mojo::DOM. It supports the HTML Living Standard
       <https://html.spec.whatwg.org/> and Extensible Markup Language (XML)
       1.0 <http://www.w3.org/TR/xml/>, and matching based on CSS3 selectors
       <http://www.w3.org/TR/selectors/>. It will even try to interpret broken
       HTML and XML, so you should not use it for validation.

FORK INFO

       Mojo::DOM58 is a fork of Mojo::DOM and tracks features and fixes to
       stay closely compatible with upstream. It differs only in the
       standalone format and compatibility with Perl 5.8. Any bugs or patches
       not related to these changes should be reported directly to the
       Mojolicious issue tracker.

       This release of Mojo::DOM58 is up to date with version 8.09 of
       Mojolicious.

NODES AND ELEMENTS

       When we parse an HTML/XML fragment, it gets turned into a tree of
       nodes.

         <!DOCTYPE html>
         <html>
           <head><title>Hello</title></head>
           <body>World!</body>
         </html>

       There are currently eight different kinds of nodes, "cdata", "comment",
       "doctype", "pi", "raw", "root", "tag" and "text". Elements are nodes of
       the type "tag".

         root
         |- doctype (html)
         +- tag (html)
            |- tag (head)
            |  +- tag (title)
            |     +- raw (Hello)
            +- tag (body)
               +- text (World!)

       While all node types are represented as Mojo::DOM58 objects, some
       methods like "attr" and "namespace" only apply to elements.

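       The tree above can be inspected from code. The following is a minimal
       sketch, assuming the document is written without whitespace between
       tags (so that no whitespace-only "text" nodes appear); it uses only
       the "descendant_nodes", "find", "type" and "tag" methods documented
       below.

```perl
use strict;
use warnings;
use Mojo::DOM58;

# Same document as above, but without whitespace between the tags
# (an assumption made here to keep the node list short)
my $dom = Mojo::DOM58->new(
  '<!DOCTYPE html><html><head><title>Hello</title></head>'
  . '<body>World!</body></html>');

# Types of all nodes below the root, in document order
my $types = $dom->descendant_nodes->map('type')->join(', ');
print "$types\n";    # doctype, tag, tag, tag, raw, tag, text

# Only nodes of type "tag" are elements
my $tags = $dom->find('*')->map('tag')->join(', ');
print "$tags\n";     # html, head, title, body
```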

CASE-SENSITIVITY

       Mojo::DOM58 defaults to HTML semantics, which means that all tag and
       attribute names are lowercased and selectors need to be lowercase as
       well.

         # HTML semantics
         my $dom = Mojo::DOM58->new('<P ID="greeting">Hi!</P>');
         say $dom->at('p[id]')->text;

       If an XML declaration is found, the parser will automatically switch
       into XML mode and everything becomes case-sensitive.

         # XML semantics
         my $dom = Mojo::DOM58->new('<?xml version="1.0"?><P ID="greeting">Hi!</P>');
         say $dom->at('P[ID]')->text;

       HTML or XML semantics can also be forced with the "xml" method.

         # Force HTML semantics
         my $dom = Mojo::DOM58->new->xml(0)->parse('<P ID="greeting">Hi!</P>');
         say $dom->at('p[id]')->text;

         # Force XML semantics
         my $dom = Mojo::DOM58->new->xml(1)->parse('<P ID="greeting">Hi!</P>');
         say $dom->at('P[ID]')->text;

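       The effect of the two modes is also visible when rendering. As a small
       sketch (the variable names are only illustrative), the same markup is
       parsed once with each forced mode and rendered back with "to_string".

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $markup = '<P ID="greeting">Hi!</P>';

# HTML semantics: tag and attribute names are lowercased while parsing
my $html = Mojo::DOM58->new->xml(0)->parse($markup);
print $html->to_string, "\n";    # <p id="greeting">Hi!</p>

# XML semantics: names keep their original case
my $xml = Mojo::DOM58->new->xml(1)->parse($markup);
print $xml->to_string, "\n";     # <P ID="greeting">Hi!</P>
```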

SELECTORS

       Mojo::DOM58 uses a CSS selector engine based on Mojo::DOM::CSS. All CSS
       selectors that make sense for a standalone parser are supported.

       *   Any element.

             my $all = $dom->find('*');

       E   An element of type "E".

             my $title = $dom->at('title');

       E[foo]
           An "E" element with a "foo" attribute.

             my $links = $dom->find('a[href]');

       E[foo="bar"]
           An "E" element whose "foo" attribute value is exactly equal to
           "bar".

             my $case_sensitive = $dom->find('input[type="hidden"]');
             my $case_sensitive = $dom->find('input[type=hidden]');

       E[foo="bar" i]
           An "E" element whose "foo" attribute value is exactly equal to any
           (ASCII-range) case-permutation of "bar". Note that this selector is
           EXPERIMENTAL and might change without warning!

             my $case_insensitive = $dom->find('input[type="hidden" i]');
             my $case_insensitive = $dom->find('input[type=hidden i]');
             my $case_insensitive = $dom->find('input[class~="foo" i]');

           This selector is part of Selectors Level 4
           <http://dev.w3.org/csswg/selectors-4>, which is still a work in
           progress.

       E[foo~="bar"]
           An "E" element whose "foo" attribute value is a list of whitespace-
           separated values, one of which is exactly equal to "bar".

             my $foo = $dom->find('input[class~="foo"]');
             my $foo = $dom->find('input[class~=foo]');

       E[foo^="bar"]
           An "E" element whose "foo" attribute value begins exactly with the
           string "bar".

             my $begins_with = $dom->find('input[name^="f"]');
             my $begins_with = $dom->find('input[name^=f]');

       E[foo$="bar"]
           An "E" element whose "foo" attribute value ends exactly with the
           string "bar".

             my $ends_with = $dom->find('input[name$="o"]');
             my $ends_with = $dom->find('input[name$=o]');

       E[foo*="bar"]
           An "E" element whose "foo" attribute value contains the substring
           "bar".

             my $contains = $dom->find('input[name*="fo"]');
             my $contains = $dom->find('input[name*=fo]');

       E[foo|="en"]
           An "E" element whose "foo" attribute has a hyphen-separated list of
           values beginning (from the left) with "en".

             my $english = $dom->find('link[hreflang|=en]');

       E:root
           An "E" element, root of the document.

             my $root = $dom->at(':root');

       E:nth-child(n)
           An "E" element, the "n-th" child of its parent.

             my $third = $dom->find('div:nth-child(3)');
             my $odd   = $dom->find('div:nth-child(odd)');
             my $even  = $dom->find('div:nth-child(even)');
             my $top3  = $dom->find('div:nth-child(-n+3)');

       E:nth-last-child(n)
           An "E" element, the "n-th" child of its parent, counting from the
           last one.

             my $third    = $dom->find('div:nth-last-child(3)');
             my $odd      = $dom->find('div:nth-last-child(odd)');
             my $even     = $dom->find('div:nth-last-child(even)');
             my $bottom3  = $dom->find('div:nth-last-child(-n+3)');

       E:nth-of-type(n)
           An "E" element, the "n-th" sibling of its type.

             my $third = $dom->find('div:nth-of-type(3)');
             my $odd   = $dom->find('div:nth-of-type(odd)');
             my $even  = $dom->find('div:nth-of-type(even)');
             my $top3  = $dom->find('div:nth-of-type(-n+3)');

       E:nth-last-of-type(n)
           An "E" element, the "n-th" sibling of its type, counting from the
           last one.

             my $third    = $dom->find('div:nth-last-of-type(3)');
             my $odd      = $dom->find('div:nth-last-of-type(odd)');
             my $even     = $dom->find('div:nth-last-of-type(even)');
             my $bottom3  = $dom->find('div:nth-last-of-type(-n+3)');

       E:first-child
           An "E" element, first child of its parent.

             my $first = $dom->find('div p:first-child');

       E:last-child
           An "E" element, last child of its parent.

             my $last = $dom->find('div p:last-child');

       E:first-of-type
           An "E" element, first sibling of its type.

             my $first = $dom->find('div p:first-of-type');

       E:last-of-type
           An "E" element, last sibling of its type.

             my $last = $dom->find('div p:last-of-type');

       E:only-child
           An "E" element, only child of its parent.

             my $lonely = $dom->find('div p:only-child');

       E:only-of-type
           An "E" element, only sibling of its type.

             my $lonely = $dom->find('div p:only-of-type');

       E:empty
           An "E" element that has no children (including text nodes).

             my $empty = $dom->find(':empty');

       E:link
           An "E" element being the source anchor of a hyperlink of which the
           target is not yet visited (":link") or already visited
           (":visited"). Note that Mojo::DOM58 is not stateful, so ":link"
           and ":visited" yield exactly the same results.

             my $links = $dom->find(':link');
             my $links = $dom->find(':visited');

       E:visited
           Alias for "E:link".

       E:checked
           A user interface element "E" which is checked (for instance a
           radio-button or checkbox).

             my $input = $dom->find(':checked');

       E.warning
           An "E" element whose class is "warning".

             my $warning = $dom->find('div.warning');

       E#myid
           An "E" element with "ID" equal to "myid".

             my $foo = $dom->at('div#foo');

       E:not(s1, s2)
           An "E" element that does not match either compound selector "s1" or
           compound selector "s2". Note that support for compound selectors is
           EXPERIMENTAL and might change without warning!

             my $others = $dom->find('div p:not(:first-child, :last-child)');

           Support for compound selectors was added as part of Selectors Level
           4 <http://dev.w3.org/csswg/selectors-4>, which is still a work in
           progress.

       E:matches(s1, s2)
           An "E" element that matches compound selector "s1" and/or compound
           selector "s2". Note that this selector is EXPERIMENTAL and might
           change without warning!

             my $headers = $dom->find(':matches(section, article, aside, nav) h1');

           This selector is part of Selectors Level 4
           <http://dev.w3.org/csswg/selectors-4>, which is still a work in
           progress.

       A|E An "E" element that belongs to the namespace alias "A" from CSS
           Namespaces Module Level 3
           <https://www.w3.org/TR/css-namespaces-3/>. Key/value pairs passed
           to selector methods are used to declare namespace aliases.

             my $elem = $dom->find('lq|elem', lq => 'http://example.com/q-markup');

           Using an empty alias searches for an element that belongs to no
           namespace.

             my $div = $dom->find('|div');

       E F An "F" element descendant of an "E" element.

             my $headlines = $dom->find('div h1');

       E > F
           An "F" element child of an "E" element.

             my $headlines = $dom->find('html > body > div > h1');

       E + F
           An "F" element immediately preceded by an "E" element.

             my $second = $dom->find('h1 + h2');

       E ~ F
           An "F" element preceded by an "E" element.

             my $second = $dom->find('h1 ~ h2');

       E, F, G
           Elements of type "E", "F" and "G".

             my $headlines = $dom->find('h1, h2, h3');

       E[foo=bar][bar=baz]
           An "E" element whose attributes match all following attribute
           selectors.

             my $links = $dom->find('a[foo^=b][foo$=ar]');

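       To see several of these selectors working together, here is a small
       sketch against a made-up form. The markup and the field names "foo",
       "bar" and "baz" are assumptions chosen for illustration only.

```perl
use strict;
use warnings;
use Mojo::DOM58;

# Hypothetical form used only for this illustration
my $dom = Mojo::DOM58->new(<<'HTML');
<form>
  <input type="hidden" name="foo" value="1">
  <input type="text" name="bar" class="a b">
  <input type="checkbox" name="baz" checked>
</form>
HTML

# Print the "name" attribute of every element each selector matches
for my $selector (
  'input[type=hidden]',           # foo
  'input[name^=ba]',              # bar,baz
  'input[class~=b]',              # bar
  'input:checked',                # baz
  'form > input:nth-child(2)',    # bar
  'input:not([type=text])',       # foo,baz
) {
  my $names = $dom->find($selector)->map(attr => 'name')->join(',');
  print "$selector => $names\n";
}
```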

OPERATORS

       Mojo::DOM58 overloads the following operators.

   array
         my @nodes = @$dom;

       Alias for "child_nodes".

         # "<!-- Test -->"
         $dom->parse('<!-- Test --><b>123</b>')->[0];

   bool
         my $bool = !!$dom;

       Always true.

   hash
         my %attrs = %$dom;

       Alias for "attr".

         # "test"
         $dom->parse('<div id="test">Test</div>')->at('div')->{id};

   stringify
         my $str = "$dom";

       Alias for "to_string".

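       A short sketch that exercises all four overloads on one element may
       make the aliases easier to picture. The markup is invented for the
       example.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new('<div id="test"><b>123</b>456</div>');
my $div = $dom->at('div');

# array: the child nodes of the element (here a "tag" and a "text" node)
my @nodes = @$div;
print scalar(@nodes), "\n";        # 2
print $div->[1]->content, "\n";    # 456

# hash: the attributes of the element
print $div->{id}, "\n";            # test

# bool: objects are always true, so a failed lookup (undef) is easy to spot
print "no <p> element\n" unless $dom->at('p');

# stringify: same result as calling to_string
my $str = "$div";
print "$str\n";                    # <div id="test"><b>123</b>456</div>
```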

FUNCTIONS

       Mojo::DOM58 implements the following functions, which can be imported
       individually.

   tag_to_html
         my $str = tag_to_html 'div', id => 'foo', 'safe content';

       Generate HTML/XML tag and render it right away. This is a significantly
       faster alternative to "new_tag" for template systems that have to
       generate a lot of tags.

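       As a minimal sketch of importing and calling the function, assuming it
       escapes plain string content the same way the "new_tag" examples below
       show:

```perl
use strict;
use warnings;
use Mojo::DOM58 'tag_to_html';

# Plain string content is escaped, as in the "new_tag" examples
my $div = tag_to_html('div', id => 'foo', 'test & 123');
print "$div\n";    # <div id="foo">test &amp; 123</div>

# Void elements are rendered without an end tag
my $br = tag_to_html('br');
print "$br\n";     # <br>
```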

METHODS

       Mojo::DOM58 implements the following methods.

   new
         my $dom = Mojo::DOM58->new;
         my $dom = Mojo::DOM58->new('<foo bar="baz">I ♥ Mojo::DOM58!</foo>');

       Construct a new scalar-based Mojo::DOM58 object and "parse" HTML/XML
       fragment if necessary.

   new_tag
         my $tag = Mojo::DOM58->new_tag('div');
         my $tag = $dom->new_tag('div');
         my $tag = $dom->new_tag('div', id => 'foo', hidden => undef);
         my $tag = $dom->new_tag('div', 'safe content');
         my $tag = $dom->new_tag('div', id => 'foo', 'safe content');
         my $tag = $dom->new_tag('div', data => {mojo => 'rocks'}, 'safe content');
         my $tag = $dom->new_tag('div', id => 'foo', sub { 'unsafe content' });

       Construct a new Mojo::DOM58 object for an HTML/XML tag with or without
       attributes and content. The "data" attribute may contain a hash
       reference with key/value pairs to generate attributes from.

         # "<br>"
         $dom->new_tag('br');

         # "<div></div>"
         $dom->new_tag('div');

         # "<div id="foo" hidden></div>"
         $dom->new_tag('div', id => 'foo', hidden => undef);

         # "<div>test &amp; 123</div>"
         $dom->new_tag('div', 'test & 123');

         # "<div id="foo">test &amp; 123</div>"
         $dom->new_tag('div', id => 'foo', 'test & 123');

         # "<div data-foo="1" data-bar="test">test &amp; 123</div>"
         $dom->new_tag('div', data => {foo => 1, Bar => 'test'}, 'test & 123');

         # "<div id="foo">test & 123</div>"
         $dom->new_tag('div', id => 'foo', sub { 'test & 123' });

         # "<div>Hello<b>Mojo!</b></div>"
         $dom->parse('<div>Hello</div>')->at('div')
           ->append_content($dom->new_tag('b', 'Mojo!'))->root;

   all_text
         my $text = $dom->all_text;

       Extract text content from all descendant nodes of this element.

         # "foo\nbarbaz\n"
         $dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->at('div')->all_text;

   ancestors
         my $collection = $dom->ancestors;
         my $collection = $dom->ancestors('div ~ p');

       Find all ancestor elements of this node matching the CSS selector and
       return a collection containing these elements as Mojo::DOM58 objects.
       All selectors listed in "SELECTORS" are supported.

         # List tag names of ancestor elements
         say $dom->ancestors->map('tag')->join("\n");

   append
         $dom = $dom->append('<p>I ♥ Mojo::DOM58!</p>');
         $dom = $dom->append(Mojo::DOM58->new);

       Append HTML/XML fragment to this node (for all node types other than
       "root").

         # "<div><h1>Test</h1><h2>123</h2></div>"
         $dom->parse('<div><h1>Test</h1></div>')
           ->at('h1')->append('<h2>123</h2>')->root;

         # "<p>Test 123</p>"
         $dom->parse('<p>Test</p>')->at('p')
           ->child_nodes->first->append(' 123')->root;

   append_content
         $dom = $dom->append_content('<p>I ♥ Mojo::DOM58!</p>');
         $dom = $dom->append_content(Mojo::DOM58->new);

       Append HTML/XML fragment (for "root" and "tag" nodes) or raw content to
       this node's content.

         # "<div><h1>Test123</h1></div>"
         $dom->parse('<div><h1>Test</h1></div>')
           ->at('h1')->append_content('123')->root;

         # "<!-- Test 123 --><br>"
         $dom->parse('<!-- Test --><br>')
           ->child_nodes->first->append_content('123 ')->root;

         # "<p>Test<i>123</i></p>"
         $dom->parse('<p>Test</p>')->at('p')->append_content('<i>123</i>')->root;

   at
         my $result = $dom->at('div ~ p');
         my $result = $dom->at('svg|line', svg => 'http://www.w3.org/2000/svg');

       Find first descendant element of this element matching the CSS selector
       and return it as a Mojo::DOM58 object, or "undef" if none could be
       found. All selectors listed in "SELECTORS" are supported.

         # Find first element with "svg" namespace definition
         my $namespace = $dom->at('[xmlns\:svg]')->{'xmlns:svg'};

       Trailing key/value pairs can be used to declare XML namespace aliases.

         # "<rect />"
         $dom->parse('<svg xmlns="http://www.w3.org/2000/svg"><rect /></svg>')
           ->at('svg|rect', svg => 'http://www.w3.org/2000/svg');

   attr
         my $hash = $dom->attr;
         my $foo  = $dom->attr('foo');
         $dom     = $dom->attr({foo => 'bar'});
         $dom     = $dom->attr(foo => 'bar');

       This element's attributes.

         # Remove an attribute
         delete $dom->attr->{id};

         # Attribute without value
         $dom->attr(selected => undef);

         # List id attributes
         say $dom->find('*')->map(attr => 'id')->compact->join("\n");

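       The getter and setter forms of "attr" can be combined as in the
       following sketch. The link markup and attribute values are made up for
       the example.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom  = Mojo::DOM58->new('<a href="/about" id="link">About</a>');
my $link = $dom->at('a');

# Set a single attribute, then several at once via a hash reference
$link->attr(class => 'nav');
$link->attr({rel => 'help', title => 'About us'});

# Remove an attribute through the hash returned by the getter
delete $link->attr->{id};

# List what is left, sorted by attribute name
for my $name (sort keys %{ $link->attr }) {
  print "$name=", $link->attr($name), "\n";
}
```

       With the input above this lists "class", "href", "rel" and "title",
       but no longer "id".
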
   child_nodes
         my $collection = $dom->child_nodes;

       Return a collection containing all child nodes of this element as
       Mojo::DOM58 objects.

         # "<p><b>123</b></p>"
         $dom->parse('<p>Test<b>123</b></p>')->at('p')->child_nodes->first->remove;

         # "<!DOCTYPE html>"
         $dom->parse('<!DOCTYPE html><b>123</b>')->child_nodes->first;

         # " Test "
         $dom->parse('<b>123</b><!-- Test -->')->child_nodes->last->content;

   children
         my $collection = $dom->children;
         my $collection = $dom->children('div ~ p');

       Find all child elements of this element matching the CSS selector and
       return a collection containing these elements as Mojo::DOM58 objects.
       All selectors listed in "SELECTORS" are supported.

         # Show tag name of random child element
         say $dom->children->shuffle->first->tag;

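       The difference between "child_nodes" (all node types) and "children"
       (elements only) is easiest to see side by side. A small sketch with
       invented markup:

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new('<p>Test<b>123</b><!-- note --><i>456</i></p>');
my $p   = $dom->at('p');

# child_nodes includes text and comment nodes
my $node_types = $p->child_nodes->map('type')->join(',');
print "$node_types\n";    # text,tag,comment,tag

# children only returns elements, optionally filtered by a selector
my $all_tags = $p->children->map('tag')->join(',');
print "$all_tags\n";      # b,i

my $only_i = $p->children('i')->map('text')->join(',');
print "$only_i\n";        # 456
```
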
   content
         my $str = $dom->content;
         $dom    = $dom->content('<p>I ♥ Mojo::DOM58!</p>');
         $dom    = $dom->content(Mojo::DOM58->new);

       Return this node's content or replace it with HTML/XML fragment (for
       "root" and "tag" nodes) or raw content.

         # "<b>Test</b>"
         $dom->parse('<div><b>Test</b></div>')->at('div')->content;

         # "<div><h1>123</h1></div>"
         $dom->parse('<div><h1>Test</h1></div>')->at('h1')->content('123')->root;

         # "<p><i>123</i></p>"
         $dom->parse('<p>Test</p>')->at('p')->content('<i>123</i>')->root;

         # "<div><h1></h1></div>"
         $dom->parse('<div><h1>Test</h1></div>')->at('h1')->content('')->root;

         # " Test "
         $dom->parse('<!-- Test --><br>')->child_nodes->first->content;

         # "<div><!-- 123 -->456</div>"
         $dom->parse('<div><!-- Test -->456</div>')
           ->at('div')->child_nodes->first->content(' 123 ')->root;

   descendant_nodes
         my $collection = $dom->descendant_nodes;

       Return a collection containing all descendant nodes of this element as
       Mojo::DOM58 objects.

         # "<p><b>123</b></p>"
         $dom->parse('<p><!-- Test --><b>123<!-- 456 --></b></p>')
           ->descendant_nodes->grep(sub { $_->type eq 'comment' })
           ->map('remove')->first;

         # "<p><b>test</b>test</p>"
         $dom->parse('<p><b>123</b>456</p>')
           ->at('p')->descendant_nodes->grep(sub { $_->type eq 'text' })
           ->map(content => 'test')->first->root;

   find
         my $collection = $dom->find('div ~ p');
         my $collection = $dom->find('svg|line', svg => 'http://www.w3.org/2000/svg');

       Find all descendant elements of this element matching the CSS selector
       and return a collection containing these elements as Mojo::DOM58
       objects. All selectors listed in "SELECTORS" are supported.

         # Find a specific element and extract information
         my $id = $dom->find('div')->[23]{id};

         # Extract information from multiple elements
         my @headers = $dom->find('h1, h2, h3')->map('text')->each;

         # Count all the different tags
         my $hash = $dom->find('*')->reduce(sub { $a->{$b->tag}++; $a }, {});

         # Find elements with a class that contains dots
         my @divs = $dom->find('div.foo\.bar')->each;

       Trailing key/value pairs can be used to declare XML namespace aliases.

         # "<rect />"
         $dom->parse('<svg xmlns="http://www.w3.org/2000/svg"><rect /></svg>')
           ->find('svg|rect', svg => 'http://www.w3.org/2000/svg')->first;

   following
         my $collection = $dom->following;
         my $collection = $dom->following('div ~ p');

       Find all sibling elements after this node matching the CSS selector and
       return a collection containing these elements as Mojo::DOM58 objects.
       All selectors listed in "SELECTORS" are supported.

         # List tags of sibling elements after this node
         say $dom->following->map('tag')->join("\n");

   following_nodes
         my $collection = $dom->following_nodes;

       Return a collection containing all sibling nodes after this node as
       Mojo::DOM58 objects.

         # "C"
         $dom->parse('<p>A</p><!-- B -->C')->at('p')->following_nodes->last->content;

   matches
         my $bool = $dom->matches('div ~ p');
         my $bool = $dom->matches('svg|line', svg => 'http://www.w3.org/2000/svg');

       Check if this element matches the CSS selector. All selectors listed in
       "SELECTORS" are supported.

         # True
         $dom->parse('<p class="a">A</p>')->at('p')->matches('.a');
         $dom->parse('<p class="a">A</p>')->at('p')->matches('p[class]');

         # False
         $dom->parse('<p class="a">A</p>')->at('p')->matches('.b');
         $dom->parse('<p class="a">A</p>')->at('p')->matches('p[id]');

       Trailing key/value pairs can be used to declare XML namespace aliases.

         # True
         $dom->parse('<svg xmlns="http://www.w3.org/2000/svg"><rect /></svg>')
           ->matches('svg|rect', svg => 'http://www.w3.org/2000/svg');

   namespace
         my $namespace = $dom->namespace;

       Find this element's namespace, or return "undef" if none could be
       found.

         # Find namespace for an element with namespace prefix
         my $namespace = $dom->at('svg > svg\:circle')->namespace;

         # Find namespace for an element that may or may not have a namespace prefix
         my $namespace = $dom->at('svg > circle')->namespace;

   next
         my $sibling = $dom->next;

       Return Mojo::DOM58 object for next sibling element, or "undef" if there
       are no more siblings.

         # "<h2>123</h2>"
         $dom->parse('<div><h1>Test</h1><h2>123</h2></div>')->at('h1')->next;

   next_node
         my $sibling = $dom->next_node;

       Return Mojo::DOM58 object for next sibling node, or "undef" if there
       are no more siblings.

         # "456"
         $dom->parse('<p><b>123</b><!-- Test -->456</p>')
           ->at('b')->next_node->next_node;

         # " Test "
         $dom->parse('<p><b>123</b><!-- Test -->456</p>')
           ->at('b')->next_node->content;

   parent
         my $parent = $dom->parent;

       Return Mojo::DOM58 object for parent of this node, or "undef" if this
       node has no parent.

         # "<b><i>Test</i></b>"
         $dom->parse('<p><b><i>Test</i></b></p>')->at('i')->parent;

   parse
         $dom = $dom->parse('<foo bar="baz">I ♥ Mojo::DOM58!</foo>');

       Parse HTML/XML fragment.

         # Parse XML
         my $dom = Mojo::DOM58->new->xml(1)->parse('<foo>I ♥ Mojo::DOM58!</foo>');

   preceding
         my $collection = $dom->preceding;
         my $collection = $dom->preceding('div ~ p');

       Find all sibling elements before this node matching the CSS selector
       and return a collection containing these elements as Mojo::DOM58
       objects. All selectors listed in "SELECTORS" are supported.

         # List tags of sibling elements before this node
         say $dom->preceding->map('tag')->join("\n");

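       As a concrete illustration of "following" and "preceding", the sketch
       below starts from the second item of a made-up list and collects its
       sibling elements in both directions.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new(
  '<ul><li>A</li><li id="b">B</li><li>C</li><li class="x">D</li></ul>');
my $item = $dom->at('#b');

# Sibling elements after and before the starting element
my $after  = $item->following->map('text')->join(',');
my $before = $item->preceding->map('text')->join(',');
print "$after\n";     # C,D
print "$before\n";    # A

# An optional selector narrows the result
my $after_x = $item->following('.x')->map('text')->join(',');
print "$after_x\n";   # D
```
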
   preceding_nodes
         my $collection = $dom->preceding_nodes;

       Return a collection containing all sibling nodes before this node as
       Mojo::DOM58 objects.

         # "A"
         $dom->parse('A<!-- B --><p>C</p>')->at('p')->preceding_nodes->first->content;

   prepend
         $dom = $dom->prepend('<p>I ♥ Mojo::DOM58!</p>');
         $dom = $dom->prepend(Mojo::DOM58->new);

       Prepend HTML/XML fragment to this node (for all node types other than
       "root").

         # "<div><h1>Test</h1><h2>123</h2></div>"
         $dom->parse('<div><h2>123</h2></div>')
           ->at('h2')->prepend('<h1>Test</h1>')->root;

         # "<p>Test 123</p>"
         $dom->parse('<p>123</p>')
           ->at('p')->child_nodes->first->prepend('Test ')->root;

   prepend_content
         $dom = $dom->prepend_content('<p>I ♥ Mojo::DOM58!</p>');
         $dom = $dom->prepend_content(Mojo::DOM58->new);

       Prepend HTML/XML fragment (for "root" and "tag" nodes) or raw content
       to this node's content.

         # "<div><h2>Test123</h2></div>"
         $dom->parse('<div><h2>123</h2></div>')
           ->at('h2')->prepend_content('Test')->root;

         # "<!-- Test 123 --><br>"
         $dom->parse('<!-- 123 --><br>')
           ->child_nodes->first->prepend_content(' Test')->root;

         # "<p><i>123</i>Test</p>"
         $dom->parse('<p>Test</p>')->at('p')->prepend_content('<i>123</i>')->root;

   previous
         my $sibling = $dom->previous;

       Return Mojo::DOM58 object for previous sibling element, or "undef" if
       there are no more siblings.

         # "<h1>Test</h1>"
         $dom->parse('<div><h1>Test</h1><h2>123</h2></div>')->at('h2')->previous;

   previous_node
         my $sibling = $dom->previous_node;

       Return Mojo::DOM58 object for previous sibling node, or "undef" if
       there are no more siblings.

         # "123"
         $dom->parse('<p>123<!-- Test --><b>456</b></p>')
           ->at('b')->previous_node->previous_node;

         # " Test "
         $dom->parse('<p>123<!-- Test --><b>456</b></p>')
           ->at('b')->previous_node->content;

   remove
         my $parent = $dom->remove;

       Remove this node and return "root" (for "root" nodes) or "parent".

         # "<div></div>"
         $dom->parse('<div><h1>Test</h1></div>')->at('h1')->remove;

         # "<p><b>456</b></p>"
         $dom->parse('<p>123<b>456</b></p>')
           ->at('p')->child_nodes->first->remove->root;

   replace
         my $parent = $dom->replace('<div>I ♥ Mojo::DOM58!</div>');
         my $parent = $dom->replace(Mojo::DOM58->new);

       Replace this node with HTML/XML fragment and return "root" (for "root"
       nodes) or "parent".

         # "<div><h2>123</h2></div>"
         $dom->parse('<div><h1>Test</h1></div>')->at('h1')->replace('<h2>123</h2>');

         # "<p><b>123</b></p>"
         $dom->parse('<p>Test</p>')
           ->at('p')->child_nodes->[0]->replace('<b>123</b>')->root;

   root
         my $root = $dom->root;

       Return Mojo::DOM58 object for "root" node.

   selector
         my $selector = $dom->selector;

       Get a unique CSS selector for this element.

         # "ul:nth-child(1) > li:nth-child(2)"
         $dom->parse('<ul><li>Test</li><li>123</li></ul>')->find('li')->last->selector;

         # "p:nth-child(1) > b:nth-child(1) > i:nth-child(1)"
         $dom->parse('<p><b><i>Test</i></b></p>')->at('i')->selector;

   strip
         my $parent = $dom->strip;

       Remove this element while preserving its content and return "parent".

         # "<div>Test</div>"
         $dom->parse('<div><h1>Test</h1></div>')->at('h1')->strip;

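       "strip", "remove" and "replace" all edit the tree in place, which the
       following sketch shows on one invented document: "strip" keeps the
       content of an element, "remove" drops the node entirely, and
       "replace" swaps it for new markup.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new(
  '<div><h1>Title</h1><p>Hello <b>big</b> world</p><hr></div>');

# strip: drop the <b> tags but keep their content
$dom->find('b')->map('strip');

# remove: drop the <hr> element entirely
$dom->at('hr')->remove;

# replace: swap the <h1> for an <h2>
$dom->at('h1')->replace('<h2>Title</h2>');

my $result = "$dom";
print "$result\n";    # <div><h2>Title</h2><p>Hello big world</p></div>
```
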
830   tag
831         my $tag = $dom->tag;
832         $dom    = $dom->tag('div');
833
834       This element's tag name.
835
836         # List tag names of child elements
837         say $dom->children->map('tag')->join("\n");
838
839   tap
840         $dom = $dom->tap(sub {...});
841
842       Equivalent to "tap" in Mojo::Base.
843
844   text
845         my $text = $dom->text;
846
847       Extract text content from this element only (not including child
848       elements).
849
850         # "bar"
851         $dom->parse("<div>foo<p>bar</p>baz</div>")->at('p')->text;
852
853         # "foo\nbaz\n"
854         $dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->at('div')->text;
855
856   to_string
857         my $str = $dom->to_string;
858
859       Render this node and its content to HTML/XML.
860
861         # "<b>Test</b>"
862         $dom->parse('<div><b>Test</b></div>')->at('div b')->to_string;
863
864   tree
865         my $tree = $dom->tree;
866         $dom     = $dom->tree(['root']);
867
868       Document Object Model. Note that this structure should only be used
869       very carefully since it is very dynamic.
870
871   type
872         my $type = $dom->type;
873
874       This node's type, usually "cdata", "comment", "doctype", "pi", "raw",
875       "root", "tag" or "text".
876
877         # "cdata"
878         $dom->parse('<![CDATA[Test]]>')->child_nodes->first->type;
879
880         # "comment"
881         $dom->parse('<!-- Test -->')->child_nodes->first->type;
882
883         # "doctype"
884         $dom->parse('<!DOCTYPE html>')->child_nodes->first->type;
885
886         # "pi"
887         $dom->parse('<?xml version="1.0"?>')->child_nodes->first->type;
888
889         # "raw"
890         $dom->parse('<title>Test</title>')->at('title')->child_nodes->first->type;
891
892         # "root"
893         $dom->parse('<p>Test</p>')->type;
894
895         # "tag"
896         $dom->parse('<p>Test</p>')->at('p')->type;
897
898         # "text"
899         $dom->parse('<p>Test</p>')->at('p')->child_nodes->first->type;
900
901   val
902         my $value = $dom->val;
903
904       Extract value from form element (such as "button", "input", "option",
905       "select" and "textarea"), or return "undef" if this element has no
906       value. In the case of "select" with "multiple" attribute, find "option"
907       elements with "selected" attribute and return an array reference with
908       all values, or "undef" if none could be found.
909
910         # "a"
911         $dom->parse('<input name=test value=a>')->at('input')->val;
912
913         # "b"
914         $dom->parse('<textarea>b</textarea>')->at('textarea')->val;
915
916         # "c"
917         $dom->parse('<option value="c">Test</option>')->at('option')->val;
918
919         # "d"
920         $dom->parse('<select><option selected>d</option></select>')
921           ->at('select')->val;
922
923         # "e"
924         $dom->parse('<select multiple><option selected>e</option></select>')
925           ->at('select')->val->[0];
926
927         # "on"
928         $dom->parse('<input name=test type=checkbox>')->at('input')->val;
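
       The rules above can be combined to collect a whole form in one pass.
       The following is a minimal sketch, not part of the module: the markup,
       the field names and the %form hash are invented for the example.

```perl
use strict;
use warnings;
use Mojo::DOM58;

# Hypothetical form markup, used only for this illustration
my $dom = Mojo::DOM58->new(<<'HTML');
<form>
  <input name="user" value="alice">
  <select name="color" multiple>
    <option selected>red</option>
    <option>green</option>
    <option selected>blue</option>
  </select>
  <textarea name="note">hi</textarea>
</form>
HTML

# Map each field name to whatever "val" returns for that element
my %form;
$dom->find('input, select, textarea')->each(sub {
  my $e = shift;
  $form{$e->{name}} = $e->val;
});
```

       With this input, $form{user} is a plain string, while $form{color} is
       an array reference because the "select" carries the "multiple"
       attribute.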
929
930   with_roles
931         my $new_class = Mojo::DOM58->with_roles('Mojo::DOM58::Role::One');
932         my $new_class = Mojo::DOM58->with_roles('+One', '+Two');
933         $dom          = $dom->with_roles('+One', '+Two');
934
935       Equivalent to "with_roles" in Mojo::Base. Note that role support
936       depends on Role::Tiny (2.000001+).
937
938   wrap
939         $dom = $dom->wrap('<div></div>');
940         $dom = $dom->wrap(Mojo::DOM58->new);
941
942       Wrap HTML/XML fragment around this node (for all node types other than
943       "root"), placing it as the last child of the first innermost element.
944
945         # "<p>123<b>Test</b></p>"
946         $dom->parse('<b>Test</b>')->at('b')->wrap('<p>123</p>')->root;
947
948         # "<div><p><b>Test</b></p>123</div>"
949         $dom->parse('<b>Test</b>')->at('b')->wrap('<div><p></p>123</div>')->root;
950
951         # "<p><b>Test</b></p><p>123</p>"
952         $dom->parse('<b>Test</b>')->at('b')->wrap('<p></p><p>123</p>')->root;
953
954         # "<p><b>Test</b></p>"
955         $dom->parse('<p>Test</p>')->at('p')->child_nodes->first->wrap('<b>')->root;
956
957   wrap_content
958         $dom = $dom->wrap_content('<div></div>');
959         $dom = $dom->wrap_content(Mojo::DOM58->new);
960
961       Wrap HTML/XML fragment around this node's content (for "root" and "tag"
962       nodes), placing that content last inside the first innermost element.
963
964         # "<p><b>123Test</b></p>"
965         $dom->parse('<p>Test</p>')->at('p')->wrap_content('<b>123</b>')->root;
966
967         # "<p><b>Test</b></p><p>123</p>"
968         $dom->parse('<b>Test</b>')->wrap_content('<p></p><p>123</p>');
969
970   xml
971         my $bool = $dom->xml;
972         $dom     = $dom->xml($bool);
973
974       Disable HTML semantics in parser and activate case-sensitivity,
975       defaults to auto detection based on XML declarations.
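
       To make the difference concrete, the sketch below parses the same
       markup in HTML mode, in forced XML mode, and with an XML declaration
       that should trigger auto detection. The variable names are invented
       for the example.

```perl
use strict;
use warnings;
use Mojo::DOM58;

# HTML semantics (the default): tag names are lowercased by the parser
my $html     = Mojo::DOM58->new('<Foo>bar</Foo>');
my $html_tag = $html->at('foo')->tag;

# XML semantics forced before parsing: case is preserved and significant
my $xml     = Mojo::DOM58->new->xml(1)->parse('<Foo>bar</Foo>');
my $xml_tag = $xml->at('Foo')->tag;
my $miss    = $xml->at('foo');    # no match, selectors are case-sensitive

# Auto detection: an XML declaration switches the parser to XML mode
my $auto     = Mojo::DOM58->new('<?xml version="1.0"?><Foo>bar</Foo>');
my $detected = $auto->xml ? 1 : 0;
```

       In this sketch $html_tag is "foo", $xml_tag is "Foo", $miss is
       undefined, and $detected is 1.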
976

COLLECTION METHODS

978       Some Mojo::DOM58 methods return an array-based collection object based
979       on Mojo::Collection, which can either be accessed directly as an array
980       reference, or with the following methods.
981
982         # Chain methods
983         $collection->map(sub { ucfirst })->shuffle->each(sub {
984           my ($word, $num) = @_;
985           say "$num: $word";
986         });
987
988         # Access array directly to manipulate collection
989         $collection->[23] += 100;
990         say for @$collection;
991
992   compact
993         my $new = $collection->compact;
994
995       Create a new collection with all elements that are defined and not an
996       empty string.
997
998         # $collection contains (0, 1, undef, 2, '', 3)
999         $collection->compact->join(', '); # "0, 1, 2, 3"
1000
1001   each
1002         my @elements = $collection->each;
1003         $collection  = $collection->each(sub {...});
1004
1005       Evaluate callback for each element in collection, or return all
1006       elements as a list if none has been provided. The callback receives the
1007       element (also available as $_) and its position, counting from 1.
1008
1009         # Make a numbered list
1010         $collection->each(sub {
1011           my ($e, $num) = @_;
1012           say "$num: $e";
1013         });
1014
1015   first
1016         my $first = $collection->first;
1017         my $first = $collection->first(qr/foo/);
1018         my $first = $collection->first(sub {...});
1019         my $first = $collection->first($method);
1020         my $first = $collection->first($method, @args);
1021
1022       Evaluate regular expression/callback for, or call method on, each
1023       element in collection and return the first one that matched the regular
1024       expression, or for which the callback/method returned true. The element
1025       will be the first argument passed to the callback and is also available
1026       as $_.
1027
1028         # Longer version
1029         my $first = $collection->first(sub { $_->$method(@args) });
1030
1031         # Find first value that contains the word "mojo"
1032         my $interesting = $collection->first(qr/mojo/i);
1033
1034         # Find first value that is greater than 5
1035         my $greater = $collection->first(sub { $_ > 5 });
1036
1037   flatten
1038         my $new = $collection->flatten;
1039
1040       Flatten nested collections/arrays recursively and create a new
1041       collection with all elements.
1042
1043         # $collection contains (1, [2, [3, 4], 5, [6]], 7)
1044         $collection->flatten->join(', '); # "1, 2, 3, 4, 5, 6, 7"
1045
1046   grep
1047         my $new = $collection->grep(qr/foo/);
1048         my $new = $collection->grep(sub {...});
1049         my $new = $collection->grep($method);
1050         my $new = $collection->grep($method, @args);
1051
1052       Evaluate regular expression/callback for, or call method on, each
1053       element in collection and create a new collection with all elements
1054       that matched the regular expression, or for which the callback/method
1055       returned true. The element will be the first argument passed to the
1056       callback and is also available as $_.
1057
1058         # Longer version
1059         my $new = $collection->grep(sub { $_->$method(@args) });
1060
1061         # Find all values that contain the word "mojo"
1062         my $interesting = $collection->grep(qr/mojo/i);
1063
1064         # Find all values that are greater than 5
1065         my $greater = $collection->grep(sub { $_ > 5 });
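
       The method-name form is convenient when the collection holds
       Mojo::DOM58 objects. As a sketch, the example below filters elements
       with the "matches" method; the markup and the "on" class are made up
       for the illustration.

```perl
use strict;
use warnings;
use Mojo::DOM58;

# Hypothetical markup, used only for this illustration
my $dom = Mojo::DOM58->new(
  '<ul><li class="on">A</li><li>B</li><li class="on">C</li></ul>');

# Calls $_->matches('.on') on every element and keeps those returning true
my $active = $dom->find('li')->grep(matches => '.on');
my $texts  = $active->map('text')->join(', ');
```

       Only the first and third list items carry the class, so $texts should
       be "A, C".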
1066
1067   join
1068         my $string = $collection->join;
1069         my $string = $collection->join("\n");
1070
1071       Turn collection into string.
1072
1073         # Join all values with commas
1074         $collection->join(', ');
1075
1076   last
1077         my $last = $collection->last;
1078
1079       Return the last element in collection.
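
       A short sketch of "last" on a collection returned by "find". It
       assumes, based on how array references behave in Perl, that an empty
       collection yields "undef" instead of raising an error.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new('<ul><li>A</li><li>B</li><li>C</li></ul>');

# Last matching element
my $last = $dom->find('li')->last->text;

# Nothing matches "dd", so there is no last element (assumed to be undef)
my $none = $dom->find('dd')->last;
```

       Because $none may be undefined, check it before chaining further
       method calls onto the result.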
1080
1081   map
1082         my $new = $collection->map(sub {...});
1083         my $new = $collection->map($method);
1084         my $new = $collection->map($method, @args);
1085
1086       Evaluate callback for, or call method on, each element in collection
1087       and create a new collection from the results. The element will be the
1088       first argument passed to the callback and is also available as $_.
1089
1090         # Longer version
1091         my $new = $collection->map(sub { $_->$method(@args) });
1092
1093         # Append the word "mojo" to all values
1094         my $domified = $collection->map(sub { $_ . 'mojo' });
1095
1096   reduce
1097         my $result = $collection->reduce(sub {...});
1098         my $result = $collection->reduce(sub {...}, $initial);
1099
1100       Reduce elements in collection with callback, the first element will be
1101       used as initial value if none has been provided.
1102
1103         # Calculate the sum of all values
1104         my $sum = $collection->reduce(sub { $a + $b });
1105
1106         # Count how often each value occurs in collection
1107         my $hash = $collection->reduce(sub { $a->{$b}++; $a }, {});
1108
1109   reverse
1110         my $new = $collection->reverse;
1111
1112       Create a new collection with all elements in reverse order.
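
       For illustration, the sketch below reverses a collection of elements
       and confirms that the original collection keeps its order. The ids
       are invented for the example.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom   = Mojo::DOM58->new('<p id="a">1</p><p id="b">2</p><p id="c">3</p>');
my $found = $dom->find('p');

# "reverse" returns a new collection and leaves $found untouched
my $reversed = $found->reverse->map(attr => 'id')->join(',');
my $original = $found->map(attr => 'id')->join(',');
```

       Here $reversed should be "c,b,a" while $original stays "a,b,c".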
1113
1114   slice
1115         my $new = $collection->slice(4 .. 7);
1116
1117       Create a new collection with all selected elements.
1118
1119         # $collection contains ('A', 'B', 'C', 'D', 'E')
1120         $collection->slice(1, 2, 4)->join(' '); # "B C E"
1121
1122   shuffle
1123         my $new = $collection->shuffle;
1124
1125       Create a new collection with all elements in random order.
1126
1127   size
1128         my $size = $collection->size;
1129
1130       Number of elements in collection.
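
       Since "find" always returns a collection, "size" is a simple way to
       test whether a selector matched anything. A minimal sketch with
       made-up markup:

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom = Mojo::DOM58->new('<ul><li>A</li><li>B</li></ul>');

my $count = $dom->find('li')->size;    # two list items
my $empty = $dom->find('dd')->size;    # no match, empty collection
```

       In this sketch $count is 2 and $empty is 0.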
1131
1132   sort
1133         my $new = $collection->sort;
1134         my $new = $collection->sort(sub {...});
1135
1136       Sort elements based on return value of callback, or as strings if none
1137       has been provided, and create a new collection from the results.
1138
1139         # Sort values case-insensitive
1140         my $case_insensitive = $collection->sort(sub { uc($a) cmp uc($b) });
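
       Text extracted from a document is always a string, so numeric data
       usually needs an explicit comparison. The sketch below contrasts the
       two calls; it assumes that "sort" without a callback falls back to
       Perl's default string comparison.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom  = Mojo::DOM58->new('<i>10</i><i>9</i><i>2</i>');
my $nums = $dom->find('i')->map('text');

# No callback: values compared as strings (assumed default)
my $by_string = $nums->sort->join(', ');

# Callback using $a and $b, as with Perl's built-in sort
my $by_number = $nums->sort(sub { $a <=> $b })->join(', ');
```

       Under that assumption $by_string is "10, 2, 9", whereas $by_number is
       "2, 9, 10".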
1141
1142   tap
1143         $collection = $collection->tap(sub {...});
1144
1145       Equivalent to "tap" in Mojo::Base.
1146
1147   to_array
1148         my $array = $collection->to_array;
1149
1150       Turn collection into array reference.
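
       A plain array reference is handy when handing results to code that
       does not expect a collection object. The sketch below assumes that
       "to_array" returns a shallow copy, so changes to the array do not
       affect the collection.

```perl
use strict;
use warnings;
use Mojo::DOM58;

my $dom        = Mojo::DOM58->new('<p>a</p><p>b</p>');
my $collection = $dom->find('p')->map('text');

# Unblessed array reference holding the same elements
my $array = $collection->to_array;

# Assumption: this only changes the copy, not $collection
push @$array, 'c';
```

       After the "push", @$array holds three values, while $collection
       should still report a size of 2.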
1151
1152   uniq
1153         my $new = $collection->uniq;
1154         my $new = $collection->uniq(sub {...});
1155         my $new = $collection->uniq($method);
1156         my $new = $collection->uniq($method, @args);
1157
1158       Create a new collection without duplicate elements, using the string
1159       representation of either the elements or the return value of the
1160       callback/method to decide uniqueness. Note that "undef" and empty
1161       string are treated the same.
1162
1163         # Longer version
1164         my $new = $collection->uniq(sub { $_->$method(@args) });
1165
1166         # $collection contains ('foo', 'bar', 'bar', 'baz')
1167         $collection->uniq->join(' '); # "foo bar baz"
1168
1169         # $collection contains ([1, 2], [2, 1], [3, 2])
1170         $collection->uniq(sub { $_->[1] })->to_array; # "[[1, 2], [2, 1]]"
1171
1172   with_roles
1173         $collection = $collection->with_roles('Mojo::Collection::Role::One');
1174
1175       Equivalent to "with_roles" in Mojo::Base. Note that role support
1176       depends on Role::Tiny (2.000001+).
1177

BUGS

1179       Report issues related to the format of this distribution or Perl 5.8
1180       support to the public bugtracker. Any other issues should be reported
1181       directly to the upstream Mojolicious issue tracker.
1182

AUTHOR

1184       Dan Book <dbook@cpan.org>
1185
1186       Code and tests adapted from Mojo::DOM, a lightweight DOM parser by the
1187       Mojolicious team.
1188

CONTRIBUTORS

1190       Matt S Trout (mst)
1191

COPYRIGHT AND LICENSE

1193       Copyright (c) 2008-2016 Sebastian Riedel and others.
1194
1195       Copyright (c) 2016 "AUTHOR" and "CONTRIBUTORS" for adaptation to
1196       standalone format.
1197
1198       This is free software, licensed under:
1199
1200         The Artistic License 2.0 (GPL Compatible)
1201

SEE ALSO

1203       Mojo::DOM, HTML::TreeBuilder, XML::LibXML, XML::Twig, XML::Smart
1204
1205
1206
1207perl v5.30.1                      2020-01-30                    Mojo::DOM58(3)