1Tree::DAG_Node(3) User Contributed Perl Documentation Tree::DAG_Node(3)
2
3
4
6 Tree::DAG_Node - (super)class for representing nodes in a tree
7
9 Using as a base class:
10
11 package Game::Tree::Node; # or whatever you're doing
12 use Tree::DAG_Node;
13 @ISA = qw(Tree::DAG_Node);
14 ...your own methods overriding/extending
15 the methods in Tree::DAG_Node...
16
17 Using as a class of its own:
18
19 use Tree::DAG_Node;
20 my $root = Tree::DAG_Node->new();
21 $root->name("I'm the tops");
22 my $new_daughter = $root->new_daughter;
23 $new_daughter->name("More");
24 ...
25
27 This class encapsulates/makes/manipulates objects that represent nodes
28 in a tree structure. The tree structure is not an object itself, but is
29 emergent from the linkages you create between nodes. This class
30 provides the methods for making linkages that can be used to build up a
31 tree, while preventing you from ever making any kinds of linkages which
32 are not allowed in a tree (such as having a node be its own mother or
33 ancestor, or having a node have two mothers).
34
35 This is what I mean by a "tree structure", a bit redundantly stated:
36
37 * A tree is a special case of an acyclic directed graph.
38
39 * A tree is a network of nodes where there's exactly one root node
40 (i.e., 'the top'), and the only primary relationship between nodes is
41 the mother-daugher relationship.
42
43 * No node can be its own mother, or its mother's mother, etc.
44
45 * Each node in the tree has exactly one "parent" (node in the "up"
46 direction) -- except the root, which is parentless.
47
48 * Each node can have any number (0 to any finite number) of daughter
49 nodes. A given node's daughter nodes constitute an ordered list.
50 (However, you are free to consider this ordering irrelevant. Some
51 applications do need daughters to be ordered, so I chose to consider
52 this the general case.)
53
54 * A node can appear in only one tree, and only once in that tree.
55 Notably (notable because it doesn't follow from the two above points),
56 a node cannot appear twice in its mother's daughter list.
57
58 * In other words, there's an idea of up (toward the root) versus down
59 (away from the root), and left (i.e., toward the start (index 0) of a
60 given node's daughter list) versus right (toward the end of a given
61 node's daughter list).
62
63 Trees as described above have various applications, among them:
64 representing syntactic constituency, in formal linguistics;
65 representing contingencies in a game tree; representing abstract syntax
66 in the parsing of any computer language -- whether in expression trees
67 for programming languages, or constituency in the parse of a markup
68 language document. (Some of these might not use the fact that
69 daughters are ordered.)
70
71 (Note: B-Trees are a very special case of the above kinds of trees, and
72 are best treated with their own class. Check CPAN for modules
73 encapsulating B-Trees; or if you actually want a database, and for some
74 reason ended up looking here, go look at AnyDBM_File.)
75
76 Many base classes are not usable except as such -- but Tree::DAG_Node
77 can be used as a normal class. You can go ahead and say:
78
79 use Tree::DAG_Node;
80 my $root = Tree::DAG_Node->new();
81 $root->name("I'm the tops");
82 $new_daughter = Tree::DAG_Node->new();
83 $new_daughter->name("More");
84 $root->add_daughter($new_daughter);
85
86 and so on, constructing and linking objects from Tree::DAG_Node and
87 making useful tree structures out of them.
88
90 This class is big and provides lots of methods. If your problem is
91 simple (say, just representing a simple parse tree), this class might
92 seem like using an atomic sledgehammer to swat a fly. But the
93 complexity of this module's bells and whistles shouldn't detract from
94 the efficiency of using this class for a simple purpose. In fact, I'd
95 be very surprised if any one user ever had use for more that even a
96 third of the methods in this class. And remember: an atomic
97 sledgehammer will kill that fly.
98
100 Implementationally, each node in a tree is an object, in the sense of
101 being an arbitrarily complex data structure that belongs to a class
102 (presumably Tree::DAG_Node, or ones derived from it) that provides
103 methods.
104
105 The attributes of a node-object are:
106
107 mother -- this node's mother. undef if this is a root.
108 daughters -- the (possibly empty) list of daughters of this node.
109 name -- the name for this node.
110 Need not be unique, or even printable. This is printed in some of
111 the various dumper methods, but it's up to you if you don't put
112 anything meaningful or printable here.
113
114 attributes -- whatever the user wants to use it for.
115 Presumably a hashref to whatever other attributes the user wants to
116 store without risk of colliding with the object's real attributes.
117 (Example usage: attributes to an SGML tag -- you definitely
118 wouldn't want the existence of a "mother=foo" pair in such a tag to
119 collide with a node object's 'mother' attribute.)
120
121 Aside from (by default) initializing it to {}, and having the
122 access method called "attributes" (described a ways below), I don't
123 do anything with the "attributes" in this module. I basically
124 intended this so that users who don't want/need to bother deriving
125 a class from Tree::DAG_Node, could still attach whatever data they
126 wanted in a node.
127
128 "mother" and "daughters" are attributes that relate to linkage -- they
129 are never written to directly, but are changed as appropriate by the
130 "linkage methods", discussed below.
131
132 The other two (and whatever others you may add in derived classes) are
133 simply accessed thru the same-named methods, discussed further below.
134
135 ABOUT THE DOCUMENTED INTERFACE
136 Stick to the documented interface (and comments in the source --
137 especially ones saying "undocumented!" and/or "disfavored!" -- do not
138 count as documentation!), and don't rely on any behavior that's not in
139 the documented interface.
140
141 Specifically, unless the documentation for a particular method says
142 "this method returns thus-and-such a value", then you should not rely
143 on it returning anything meaningful.
144
145 A passing acquintance with at least the broader details of the source
146 code for this class is assumed for anyone using this class as a base
147 class -- especially if you're overriding existing methods, and
148 definitely if you're overriding linkage methods.
149
151 the constructor CLASS->new() or CLASS->new({...options...})
152 This creates a new node object, calls
153 $object->_init({...options...}) to provide it sane defaults (like:
154 undef name, undef mother, no daughters, 'attributes' setting of a
155 new empty hashref), and returns the object created. (If you just
156 said "CLASS->new()" or "CLASS->new", then it pretends you called
157 "CLASS->new({})".)
158
159 Currently no options for putting in {...options...} are part of the
160 documented interface, but the options is here in case you want to
161 add such behavior in a derived class.
162
163 Read on if you plan on using Tree::DAG_New as a base class.
164 (Otherwise feel free to skip to the description of _init.)
165
166 There are, in my mind, two ways to do object construction:
167
168 Way 1: create an object, knowing that it'll have certain
169 uninteresting sane default values, and then call methods to change
170 those values to what you want. Example:
171
172 $node = Tree::DAG_Node->new;
173 $node->name('Supahnode!');
174 $root->add_daughter($node);
175 $node->add_daughters(@some_others)
176
177 Way 2: be able to specify some/most/all the object's attributes in
178 the call to the constructor. Something like:
179
180 $node = Tree::DAG_Node->new({
181 name => 'Supahnode!',
182 mother => $root,
183 daughters => \@some_others
184 });
185
186 After some deliberation, I've decided that the second way is a Bad
187 Thing. First off, it is not markedly more concise than the first
188 way. Second off, it often requires subtly different syntax (e.g.,
189 \@some_others vs @some_others). It just complicates things for the
190 programmer and the user, without making either appreciably happier.
191
192 (This is not to say that options in general for a constructor are
193 bad -- "random_network", discussed far below, necessarily takes
194 options. But note that those are not options for the default
195 values of attributes.)
196
197 Anyway, if you use Tree::DAG_Node as a superclass, and you add
198 attributes that need to be initialized, what you need to do is
199 provide an _init method that calls $this->SUPER::_init($options) to
200 use its superclass's _init method, and then initializes the new
201 attributes:
202
203 sub _init {
204 my($this, $options) = @_[0,1];
205 $this->SUPER::_init($options); # call my superclass's _init to
206 # init all the attributes I'm inheriting
207
208 # Now init /my/ new attributes:
209 $this->{'amigos'} = []; # for example
210 }
211
212 ...or, as I prefer when I'm being a neat freak:
213
214 sub _init {
215 my($this, $options) = @_[0,1];
216 $this->SUPER::_init($options);
217
218 $this->_init_amigos($options);
219 }
220
221 sub _init_amigos {
222 my $this = $_[0];
223 # Or my($this,$options) = @_[0,1]; if I'm using $options
224 $this->{'amigos'} = [];
225 }
226
227 In other words, I like to have each attribute initialized thru a
228 method named _init_[attribute], which should expect the object as
229 $_[0] and the the options hashref (or {} if none was given) as
230 $_[1]. If you insist on having your _init recognize options for
231 setting attributes, you might as well have them dealt with by the
232 appropriate _init_[attribute] method, like this:
233
234 sub _init {
235 my($this, $options) = @_[0,1];
236 $this->SUPER::_init($options);
237
238 $this->_init_amigos($options);
239 }
240
241 sub _init_amigos {
242 my($this,$options) = @_[0,1]; # I need options this time
243 $this->{'amigos'} = [];
244 $this->amigos(@{$options->{'amigos'}}) if $options->{'amigos'};
245 }
246
247 All this bookkeeping looks silly with just one new attribute in a
248 class derived straight from Tree::DAG_Node, but if there's lots of
249 new attributes running around, and if you're deriving from a class
250 derived from a class derived from Tree::DAG_Node, then tidy
251 stratification/modularization like this can keep you sane.
252
253 the constructor $obj->new() or $obj->new({...options...})
254 Just another way to get at the "new" method. This does not copy
255 $obj, but merely constructs a new object of the same class as it.
256 Saves you the bother of going $class = ref $obj; $obj2 =
257 $class->new;
258
259 the method $node->_init({...options...})
260 Initialize the object's attribute values. See the discussion
261 above. Presumably this should be called only by the guts of the
262 "new" constructor -- never by the end user.
263
264 Currently there are no documented options for putting in
265 {...options...}, but (in case you want to disregard the above rant)
266 the option exists for you to use {...options...} for something
267 useful in a derived class.
268
269 Please see the source for more information.
270
271 see also (below) the constructors "new_daughter" and
272 "new_daughter_left"
273
275 $node->daughters
276 This returns the (possibly empty) list of daughters for $node.
277
278 $node->mother
279 This returns what node is $node's mother. This is undef if $node
280 has no mother -- i.e., if it is a root.
281
282 $mother->add_daughters( LIST )
283 This method adds the node objects in LIST to the (right) end of
284 $mother's "daughter" list. Making a node N1 the daughter of
285 another node N2 also means that N1's "mother" attribute is
286 "automatically" set to N2; it also means that N1 stops being
287 anything else's daughter as it becomes N2's daughter.
288
289 If you try to make a node its own mother, a fatal error results.
290 If you try to take one of a a node N1's ancestors and make it also
291 a daughter of N1, a fatal error results. A fatal error results if
292 anything in LIST isn't a node object.
293
294 If you try to make N1 a daughter of N2, but it's already a daughter
295 of N2, then this is a no-operation -- it won't move such nodes to
296 the end of the list or anything; it just skips doing anything with
297 them.
298
299 $node->add_daughter( LIST )
300 An exact synonym for $node->add_daughters(LIST)
301
302 $mother->add_daughters_left( LIST )
303 This method is just like "add_daughters", except that it adds the
304 node objects in LIST to the (left) beginning of $mother's daughter
305 list, instead of the (right) end of it.
306
307 $node->add_daughter_left( LIST )
308 An exact synonym for $node->add_daughters_left( LIST )
309
310 Note:
311 The above link-making methods perform basically an "unshift" or
312 "push" on the mother node's daughter list. To get the full range
313 of list-handling functionality, copy the daughter list, and change
314 it, and then call "set_daughters" on the result:
315
316 @them = $mother->daughters;
317 @removed = splice(@them, 0,2, @new_nodes);
318 $mother->set_daughters(@them);
319
320 Or consider a structure like:
321
322 $mother->set_daughters(
323 grep($_->name =~ /NP/ ,
324 $mother->daughters
325 )
326 );
327
328 the constructor $daughter = $mother->new_daughter, or
329 the constructor $daughter = $mother->new_daughter({...options...})
330 This constructs a new node (of the same class as $mother), and adds
331 it to the (right) end of the daughter list of $mother. This is
332 essentially the same as going
333
334 $daughter = $mother->new;
335 $mother->add_daughter($daughter);
336
337 but is rather more efficient because (since $daughter is guaranteed
338 new and isn't linked to/from anything), it doesn't have to check
339 that $daughter isn't an ancestor of $mother, isn't already daughter
340 to a mother it needs to be unlinked from, isn't already in
341 $mother's daughter list, etc.
342
343 As you'd expect for a constructor, it returns the node-object
344 created.
345
346 the constructor $mother->new_daughter_left, or
347 $mother->new_daughter_left({...options...})
348 This is just like $mother->new_daughter, but adds the new daughter
349 to the left (start) of $mother's daughter list.
350
351 $mother->remove_daughters( LIST )
352 This removes the nodes listed in LIST from $mother's daughter list.
353 This is a no-operation if LIST is empty. If there are things in
354 LIST that aren't a current daughter of $mother, they are ignored.
355
356 Not to be confused with $mother->clear_daughters.
357
358 $node->remove_daughter( LIST )
359 An exact synonym for $node->remove_daughters( LIST )
360
361 $node->unlink_from_mother
362 This removes node from the daughter list of its mother. If it has
363 no mother, this is a no-operation.
364
365 Returns the mother unlinked from (if any).
366
367 $mother->clear_daughters
368 This unlinks all $mother's daughters. Returns the the list of what
369 used to be $mother's daughters.
370
371 Not to be confused with $mother->remove_daughters( LIST ).
372
373 $mother->set_daughters( LIST )
374 This unlinks all $mother's daughters, and replaces them with the
375 daughters in LIST.
376
377 Currently implemented as just $mother->clear_daughters followed by
378 $mother->add_daughters( LIST ).
379
380 $node->replace_with( LIST )
381 This replaces $node in its mother's daughter list, by unlinking
382 $node and replacing it with the items in LIST. This returns a list
383 consisting of $node followed by LIST, i.e., the nodes that replaced
384 it.
385
386 LIST can include $node itself (presumably at most once). LIST can
387 also be empty-list. However, if any items in LIST are sisters to
388 $node, they are ignored, and are not in the copy of LIST passed as
389 the return value.
390
391 As you might expect for any linking operation, the items in LIST
392 cannot be $node's mother, or any ancestor to it; and items in LIST
393 are, of course, unlinked from their mothers (if they have any) as
394 they're linked to $node's mother.
395
396 (In the special (and bizarre) case where $node is root, this simply
397 calls $this->unlink_from_mother on all the items in LIST, making
398 them roots of their own trees.)
399
400 Note that the daughter-list of $node is not necessarily affected;
401 nor are the daughter-lists of the items in LIST. I mention this in
402 case you think replace_with switches one node for another, with
403 respect to its mother list and its daughter list, leaving the rest
404 of the tree unchanged. If that's what you want, replacing $Old with
405 $New, then you want:
406
407 $New->set_daughters($Old->clear_daughters);
408 $Old->replace_with($New);
409
410 (I can't say $node's and LIST-items' daughter lists are never
411 affected my replace_with -- they can be affected in this case:
412
413 $N1 = ($node->daughters)[0]; # first daughter of $node
414 $N2 = ($N1->daughters)[0]; # first daughter of $N1;
415 $N3 = Tree::DAG_Node->random_network; # or whatever
416 $node->replace_with($N1, $N2, $N3);
417
418 As a side affect of attaching $N1 and $N2 to $node's mother,
419 they're unlinked from their parents ($node, and $N1, replectively).
420 But N3's daughter list is unaffected.
421
422 In other words, this method does what it has to, as you'd expect it
423 to.
424
425 $node->replace_with_daughters
426 This replaces $node in its mother's daughter list, by unlinking
427 $node and replacing it with its daughters. In other words, $node
428 becomes motherless and daughterless as its daughters move up and
429 take its place. This returns a list consisting of $node followed
430 by the nodes that were its daughters.
431
432 In the special (and bizarre) case where $node is root, this simply
433 unlinks its daughters from it, making them roots of their own
434 trees.
435
436 Effectively the same as $node->replace_with($node->daughters), but
437 more efficient, since less checking has to be done. (And I also
438 think $node->replace_with_daughters is a more common operation in
439 tree-wrangling than $node->replace_with(LIST), so deserves a named
440 method of its own, but that's just me.)
441
442 $node->add_left_sisters( LIST )
443 This adds the elements in LIST (in that order) as immediate left
444 sisters of $node. In other words, given that B's mother's
445 daughter-list is (A,B,C,D), calling B->add_left_sisters(X,Y) makes
446 B's mother's daughter-list (A,X,Y,B,C,D).
447
448 If LIST is empty, this is a no-op, and returns empty-list.
449
450 This is basically implemented as a call to
451 $node->replace_with(LIST, $node), and so all replace_with's
452 limitations and caveats apply.
453
454 The return value of $node->add_left_sisters( LIST ) is the elements
455 of LIST that got added, as returned by replace_with -- minus the
456 copies of $node you'd get from a straight call to
457 $node->replace_with(LIST, $node).
458
459 $node->add_left_sister( LIST )
460 An exact synonym for $node->add_left_sisters(LIST)
461
462 $node->add_right_sisters( LIST )
463 Just like add_left_sisters (which see), except that the the
464 elements in LIST (in that order) as immediate right sisters of
465 $node;
466
467 In other words, given that B's mother's daughter-list is (A,B,C,D),
468 calling B->add_right_sisters(X,Y) makes B's mother's daughter-list
469 (A,B,X,Y,C,D).
470
471 $node->add_right_sister( LIST )
472 An exact synonym for $node->add_right_sisters(LIST)
473
475 $node->name or $node->name(SCALAR)
476 In the first form, returns the value of the node object's "name"
477 attribute. In the second form, sets it to the value of SCALAR.
478
479 $node->attributes or $node->attributes(SCALAR)
480 In the first form, returns the value of the node object's
481 "attributes" attribute. In the second form, sets it to the value
482 of SCALAR. I intend this to be used to store a reference to a
483 (presumably anonymous) hash the user can use to store whatever
484 attributes he doesn't want to have to store as object attributes.
485 In this case, you needn't ever set the value of this. (_init has
486 already initialized it to {}.) Instead you can just do...
487
488 $node->attributes->{'foo'} = 'bar';
489
490 ...to write foo => bar.
491
492 $node->attribute or $node->attribute(SCALAR)
493 An exact synonym for $node->attributes or $node->attributes(SCALAR)
494
496 $node->is_node
497 This always returns true. More pertinently,
498 $object->can('is_node') is true (regardless of what "is_node" would
499 do if called) for objects belonging to this class or for any class
500 derived from it.
501
502 $node->ancestors
503 Returns the list of this node's ancestors, starting with its
504 mother, then grandmother, and ending at the root. It does this by
505 simply following the 'mother' attributes up as far as it can. So
506 if $item IS the root, this returns an empty list.
507
508 Consider that scalar($node->ancestors) returns the ply of this node
509 within the tree -- 2 for a granddaughter of the root, etc., and 0
510 for root itself.
511
512 $node->root
513 Returns the root of whatever tree $node is a member of. If $node
514 is the root, then the result is $node itself.
515
516 $node->is_daughter_of($node2)
517 Returns true iff $node is a daughter of $node2. Currently
518 implemented as just a test of ($it->mother eq $node2).
519
520 $node->self_and_descendants
521 Returns a list consisting of itself (as element 0) and all the
522 descendants of $node. Returns just itself if $node is a
523 terminal_node.
524
525 (Note that it's spelled "descendants", not "descendents".)
526
527 $node->descendants
528 Returns a list consisting of all the descendants of $node. Returns
529 empty-list if $node is a terminal_node.
530
531 (Note that it's spelled "descendants", not "descendents".)
532
533 $node->leaves_under
534 Returns a list (going left-to-right) of all the leaf nodes under
535 $node. ("Leaf nodes" are also called "terminal nodes" -- i.e.,
536 nodes that have no daughters.) Returns $node in the degenerate
537 case of $node being a leaf itself.
538
539 $node->depth_under
540 Returns an integer representing the number of branches between this
541 $node and the most distant leaf under it. (In other words, this
542 returns the ply of subtree starting of $node. Consider
543 scalar($it->ancestors) if you want the ply of a node within the
544 whole tree.)
545
546 $node->generation
547 Returns a list of all nodes (going left-to-right) that are in
548 $node's generation -- i.e., that are the some number of nodes down
549 from the root. $root->generation is just $root.
550
551 Of course, $node is always in its own generation.
552
553 $node->generation_under(NODE2)
554 Like $node->generation, but returns only the nodes in $node's
555 generation that are also descendants of NODE2 -- in other words,
556
557 @us = $node->generation_under( $node->mother->mother );
558
559 is all $node's first cousins (to borrow yet more kinship
560 terminology) -- assuming $node does indeed have a grandmother.
561 Actually "cousins" isn't quite an apt word, because @us ends up
562 including $node's siblings and $node.
563
564 Actually, "generation_under" is just an alias to "generation", but
565 I figure that this:
566
567 @us = $node->generation_under($way_upline);
568
569 is a bit more readable than this:
570
571 @us = $node->generation($way_upline);
572
573 But it's up to you.
574
575 $node->generation_under($node) returns just $node.
576
577 If you call $node->generation_under($node) but NODE2 is not $node
578 or an ancestor of $node, it behaves as if you called just
579 $node->generation().
580
581 $node->self_and_sisters
582 Returns a list of all nodes (going left-to-right) that have the
583 same mother as $node -- including $node itself. This is just like
584 $node->mother->daughters, except that that fails where $node is
585 root, whereas $root->self_and_siblings, as a special case, returns
586 $root.
587
588 (Contrary to how you may interpret how this method is named, "self"
589 is not (necessarily) the first element of what's returned.)
590
591 $node->sisters
592 Returns a list of all nodes (going left-to-right) that have the
593 same mother as $node -- not including $node itself. If $node is
594 root, this returns empty-list.
595
596 $node->left_sister
597 Returns the node that's the immediate left sister of $node. If
598 $node is the leftmost (or only) daughter of its mother (or has no
599 mother), then this returns undef.
600
601 (See also $node->add_left_sisters(LIST).)
602
603 $node->left_sisters
604 Returns a list of nodes that're sisters to the left of $node. If
605 $node is the leftmost (or only) daughter of its mother (or has no
606 mother), then this returns an empty list.
607
608 (See also $node->add_left_sisters(LIST).)
609
610 $node->right_sister
611 Returns the node that's the immediate right sister of $node. If
612 $node is the rightmost (or only) daughter of its mother (or has no
613 mother), then this returns undef.
614
615 (See also $node->add_right_sisters(LIST).)
616
617 $node->right_sisters
618 Returns a list of nodes that're sisters to the right of $node. If
619 $node is the rightmost (or only) daughter of its mother (or has no
620 mother), then this returns an empty list.
621
622 (See also $node->add_right_sisters(LIST).)
623
624 $node->my_daughter_index
625 Returns what index this daughter is, in its mother's "daughter"
626 list. In other words, if $node is ($node->mother->daughters)[3],
627 then $node->my_daughter_index returns 3.
628
629 As a special case, returns 0 if $node has no mother.
630
631 $node->address or $anynode->address(ADDRESS)
632 With the first syntax, returns the address of $node within its
633 tree, based on its position within the tree. An address is formed
634 by noting the path between the root and $node, and concatenating
635 the daughter-indices of the nodes this passes thru (starting with 0
636 for the root, and ending with $node).
637
638 For example, if to get from node ROOT to node $node, you pass thru
639 ROOT, A, B, and $node, then the address is determined as:
640
641 * ROOT's my_daughter_index is 0.
642
643 * A's my_daughter_index is, suppose, 2. (A is index 2 in ROOT's
644 daughter list.)
645
646 * B's my_daughter_index is, suppose, 0. (B is index 0 in A's
647 daughter list.)
648
649 * $node's my_daughter_index is, suppose, 4. ($node is index 4 in
650 B's daughter list.)
651
652 The address of the above-described $node is, therefore, "0:2:0:4".
653
654 (As a somewhat special case, the address of the root is always "0";
655 and since addresses start from the root, all addresses start with a
656 "0".)
657
658 The second syntax, where you provide an address, starts from the
659 root of the tree $anynode belongs to, and returns the node
660 corresponding to that address. Returns undef if no node
661 corresponds to that address. Note that this routine may be
662 somewhat liberal in its interpretation of what can constitute an
663 address; i.e., it accepts "0.2.0.4", besides "0:2:0:4".
664
665 Also note that the address of a node in a tree is meaningful only
666 in that tree as currently structured.
667
668 (Consider how ($address1 cmp $address2) may be magically meaningful
669 to you, if you mant to figure out what nodes are to the right of
670 what other nodes.)
671
672 $node->common(LIST)
673 Returns the lowest node in the tree that is ancestor-or-self to the
674 nodes $node and LIST.
675
676 If the nodes are far enough apart in the tree, the answer is just
677 the root.
678
679 If the nodes aren't all in the same tree, the answer is undef.
680
681 As a degenerate case, if LIST is empty, returns $node.
682
683 $node->common_ancestor(LIST)
684 Returns the lowest node that is ancestor to all the nodes given (in
685 nodes $node and LIST). In other words, it answers the question:
686 "What node in the tree, as low as possible, is ancestor to the
687 nodes given ($node and LIST)?"
688
689 If the nodes are far enough apart, the answer is just the root --
690 except if any of the nodes are the root itself, in which case the
691 answer is undef (since the root has no ancestor).
692
693 If the nodes aren't all in the same tree, the answer is undef.
694
695 As a degenerate case, if LIST is empty, returns $node's mother;
696 that'll be undef if $node is root.
697
699 $node->walk_down({ callback => \&foo, callbackback => \&foo, ... })
700 Performs a depth-first traversal of the structure at and under
701 $node. What it does at each node depends on the value of the
702 options hashref, which you must provide. There are three options,
703 "callback" and "callbackback" (at least one of which must be
704 defined, as a sub reference), and "_depth". This is what
705 "walk_down" does, in pseudocode form:
706
707 * Start at the $node given.
708
709 * If there's a "callback", call it with $node as the first
710 argument, and the options hashref as the second argument (which
711 contains the potentially useful "_depth", remember). This function
712 must return true or false -- if false, it will block the next step:
713
714 * If $node has any daughter nodes, increment "_depth", and call
715 $daughter->walk_down(options_hashref) for each daughter (in order,
716 of course), where options_hashref is the same hashref it was called
717 with. When this returns, decrements "_depth".
718
719 * If there's a "callbackback", call just it as with "callback" (but
720 tossing out the return value). Note that "callback" returning
721 false blocks traversal below $node, but doesn't block calling
722 callbackback for $node. (Incidentally, in the unlikely case that
723 $node has stopped being a node object, "callbackback" won't get
724 called.)
725
726 * Return.
727
728 $node->walk_down is the way to recursively do things to a tree (if
729 you start at the root) or part of a tree; if what you're doing is
730 best done via pre-pre order traversal, use "callback"; if what
731 you're doing is best done with post-order traversal, use
732 "callbackback". "walk_down" is even the basis for plenty of the
733 methods in this class. See the source code for examples both
734 simple and horrific.
735
736 Note that if you don't specify "_depth", it effectively defaults to
737 0. You should set it to scalar($node->ancestors) if you want
738 "_depth" to reflect the true depth-in-the-tree for the nodes
739 called, instead of just the depth below $node. (If $node is the
740 root, there's difference, of course.)
741
742 And by the way, it's a bad idea to modify the tree from the
743 callback. Unpredictable things may happen. I instead suggest
744 having your callback add to a stack of things that need changing,
745 and then, once "walk_down" is all finished, changing those nodes
746 from that stack.
747
748 Note that the existence of "walk_down" doesn't mean you can't write
749 you own special-use traversers.
750
751 @lines = $node->dump_names({ ...options... });
752 Dumps, as an indented list, the names of the nodes starting at
753 $node, and continuing under it. Options are:
754
755 * _depth -- A nonnegative number. Indicating the depth to consider
756 $node as being at (and so the generation under that is that plus
757 one, etc.). Defaults to 0. You may choose to use set _depth =>
758 scalar($node->ancestors).
759
760 * tick -- a string to preface each entry with, between the
761 indenting-spacing and the node's name. Defaults to empty-string.
762 You may prefer "*" or "-> " or someting.
763
764 * indent -- the string used to indent with. Defaults to " " (two
765 spaces). Another sane value might be ". " (period, space).
766 Setting it to empty-string suppresses indenting.
767
768 The dump is not printed, but is returned as a list, where each item
769 is a line, with a "\n" at the end.
770
771 the constructor CLASS->random_network({...options...})
772 the method $node->random_network({...options...})
773 In the first case, constructs a randomly arranged network under a
774 new node, and returns the root node of that tree. In the latter
775 case, constructs the network under $node.
776
777 Currently, this is implemented a bit half-heartedly, and half-
778 wittedly. I basically needed to make up random-looking networks to
779 stress-test the various tree-dumper methods, and so wrote this. If
780 you actually want to rely on this for any application more serious
781 than that, I suggest examining the source code and seeing if this
782 does really what you need (say, in reliability of randomness); and
783 feel totally free to suggest changes to me (especially in the form
784 of "I rewrote "random_network", here's the code...")
785
786 It takes four options:
787
788 * max_node_count -- maximum number of nodes this tree will be
789 allowed to have (counting the root). Defaults to 25.
790
791 * min_depth -- minimum depth for the tree. Defaults to 2. Leaves
792 can be generated only after this depth is reached, so the tree will
793 be at least this deep -- unless max_node_count is hit first.
794
795 * max_depth -- maximum depth for the tree. Defaults to 3 plus
796 min_depth. The tree will not be deeper than this.
797
798 * max_children -- maximum number of children any mother in the tree
799 can have. Defaults to 4.
800
801 the constructor CLASS->lol_to_tree($lol);
802 Converts something like bracket-notation for "Chomsky trees" (or
803 rather, the closest you can come with Perl
804 list-of-lists(-of-lists(-of-lists))) into a tree structure.
805 Returns the root of the tree converted.
806
807 The conversion rules are that: 1) if the last (possibly the only)
808 item in a given list is a scalar, then that is used as the "name"
809 attribute for the node based on this list. 2) All other items in
810 the list represent daughter nodes of the current node --
811 recursively so, if they are list references; otherwise, (non-
812 terminal) scalars are considered to denote nodes with that name.
813 So ['Foo', 'Bar', 'N'] is an alternate way to represent [['Foo'],
814 ['Bar'], 'N'].
815
816 An example will illustrate:
817
818 use Tree::DAG_Node;
819 $lol =
820 [
821 [
822 [ [ 'Det:The' ],
823 [ [ 'dog' ], 'N'], 'NP'],
824 [ '/with rabies\\', 'PP'],
825 'NP'
826 ],
827 [ 'died', 'VP'],
828 'S'
829 ];
830 $tree = Tree::DAG_Node->lol_to_tree($lol);
831 $diagram = $tree->draw_ascii_tree;
832 print map "$_\n", @$diagram;
833
834 ...returns this tree:
835
836 |
837 <S>
838 |
839 /------------------\
840 | |
841 <NP> <VP>
842 | |
843 /---------------\ <died>
844 | |
845 <NP> <PP>
846 | |
847 /-------\ </with rabies\>
848 | |
849 <Det:The> <N>
850 |
851 <dog>
852
853 By the way (and this rather follows from the above rules), when
854 denoting a LoL tree consisting of just one node, this:
855
856 $tree = Tree::DAG_Node->lol_to_tree( 'Lonely' );
857
858 is okay, although it'd probably occur to you to denote it only as:
859
860 $tree = Tree::DAG_Node->lol_to_tree( ['Lonely'] );
861
862 which is of course fine, too.
863
864 $node->tree_to_lol_notation({...options...})
865 Dumps a tree (starting at $node) as the sort of LoL-like bracket
866 notation you see in the above example code. Returns just one big
867 block of text. The only option is "multiline" -- if true, it dumps
868 the text as the sort of indented structure as seen above; if false
869 (and it defaults to false), dumps it all on one line (with no
870 indenting, of course).
871
872 For example, starting with the tree from the above example, this:
873
874 print $tree->tree_to_lol_notation, "\n";
875
876 prints the following (which I've broken over two lines for sake of
877 printablitity of documentation):
878
879 [[[['Det:The'], [['dog'], 'N'], 'NP'], [["/with rabies\x5c"],
880 'PP'], 'NP'], [['died'], 'VP'], 'S'],
881
882 Doing this:
883
884 print $tree->tree_to_lol_notation({ multiline => 1 });
885
886 prints the same content, just spread over many lines, and prettily
887 indented.
888
889 $node->tree_to_lol
890 Returns that tree (starting at $node) represented as a LoL, like
891 what $lol, above, holds. (This is as opposed to
892 "tree_to_lol_notation", which returns the viewable code like what
893 gets evaluated and stored in $lol, above.)
894
895 Lord only knows what you use this for -- maybe for feeding to
896 Data::Dumper, in case "tree_to_lol_notation" doesn't do just what
897 you want?
898
899 the constructor CLASS->simple_lol_to_tree($simple_lol);
900 This is like lol_to_tree, except that rule 1 doesn't apply -- i.e.,
901 all scalars (or really, anything not a listref) in the LoL-
902 structure end up as named terminal nodes, and only terminal nodes
903 get names (and, of course, that name comes from that scalar value).
904 This method is useful for making things like expression trees, or
905 at least starting them off. Consider that this:
906
907 $tree = Tree::DAG_Node->simple_lol_to_tree(
908 [ 'foo', ['bar', ['baz'], 'quux'], 'zaz', 'pati' ]
909 );
910
911 converts from something like a Lispish or Iconish tree, if you
912 pretend the brackets are parentheses.
913
914 Note that there is a (possibly surprising) degenerate case of what
915 I'm calling a "simple-LoL", and it's like this:
916
917 $tree = Tree::DAG_Node->simple_lol_to_tree('Lonely');
918
919 This is the (only) way you can specify a tree consisting of only a
920 single node, which here gets the name 'Lonely'.
921
922 $node->tree_to_simple_lol
923 Returns that tree (starting at $node) represented as a simple-LoL
924 -- i.e., one where non-terminal nodes are represented as listrefs,
925 and terminal nodes are gotten from the contents of those nodes'
926 "name' attributes.
927
928 Note that in the case of $node being terminal, what you get back is
929 the same as $node->name.
930
931 Compare to tree_to_simple_lol_notation.
932
933 $node->tree_to_simple_lol_notation({...options...})
934 A simple-LoL version of tree_to_lol_notation (which see); takes the
935 same options.
936
937 $list_r = $node->draw_ascii_tree({ ... options ... })
938 Draws a nice ASCII-art representation of the tree structure at-and-
939 under $node, with $node at the top. Returns a reference to the
940 list of lines (with no "\n"s or anything at the end of them) that
941 make up the picture.
942
943 Example usage:
944
945 print map("$_\n", @{$tree->draw_ascii_tree});
946
947 draw_ascii_tree takes parameters you set in the options hashref:
948
949 * "no_name" -- if true, "draw_ascii_tree" doesn't print the name of
950 the node; simply prints a "*". Defaults to 0 (i.e., print the node
951 name.)
952
953 * "h_spacing" -- number 0 or greater. Sets the number of spaces
954 inserted horizontally between nodes (and groups of nodes) in a
955 tree. Defaults to 1.
956
957 * "h_compact" -- number 0 or 1. Sets the extent to which
958 "draw_ascii_tree" tries to save horizontal space. Defaults to 1.
959 If I think of a better scrunching algorithm, there'll be a "2"
960 setting for this.
961
962 * "v_compact" -- number 0, 1, or 2. Sets the degree to which
963 "draw_ascii_tree" tries to save vertical space. Defaults to 1.
964
965 This occasionally returns trees that are a bit cock-eyed in parts;
966 if anyone can suggest a better drawing algorithm, I'd be
967 appreciative.
968
969 $node->copy_tree or $node->copy_tree({...options...})
970 This returns the root of a copy of the tree that $node is a member
971 of. If you pass no options, copy_tree pretends you've passed {}.
972
973 This method is currently implemented as just a call to
974 $this->root->copy_at_and_under({...options...}), but magic may be
975 added in the future.
976
977 Options you specify are passed down to calls to $node->copy.
978
979 $node->copy_at_and_under or $node->copy_at_and_under({...options...})
980 This returns a copy of the subtree consisting of $node and
981 everything under it.
982
983 If you pass no options, copy_at_and_under pretends you've passed
984 {}.
985
986 This works by recursively building up the new tree from the leaves,
987 duplicating nodes using $orig_node->copy($options_ref) and then
988 linking them up into a new tree of the same shape.
989
990 Options you specify are passed down to calls to $node->copy.
991
992 the constructor $node->copy or $node->copy({...options...})
993 Returns a copy of $node, minus its daughter or mother attributes
994 (which are set back to default values).
995
996 If you pass no options, "copy" pretends you've passed {}.
997
998 Magic happens with the 'attributes' attribute: if it's a hashref
999 (and it usually is), the new node doesn't end up with the same
1000 hashref, but with ref to a hash with the content duplicated from
1001 the original's hashref. If 'attributes' is not a hashref, but
1002 instead an object that belongs to a class that provides a method
1003 called "copy", then that method is called, and the result saved in
1004 the clone's 'attribute' attribute. Both of these kinds of magic
1005 are disabled if the options you pass to "copy" (maybe via
1006 "copy_tree", or "copy_at_and_under") includes ("no_attribute_copy"
1007 => 1).
1008
1009 The options hashref you pass to "copy" (derictly or indirectly)
1010 gets changed slightly after you call "copy" -- it gets an entry
1011 called "from_to" added to it. Chances are you would never know nor
1012 care, but this is reserved for possible future use. See the source
1013 if you are wildly curious.
1014
1015 Note that if you are using $node->copy (whether directly or via
1016 $node->copy_tree or $node->copy_at_or_under), and it's not properly
1017 copying object attributes containing references, you probably
1018 shouldn't fight it or try to fix it -- simply override copy_tree
1019 with:
1020
1021 sub copy_tree {
1022 use Storable qw(dclone);
1023 my $this = $_[0];
1024 return dclone($this->root);
1025 # d for "deep"
1026 }
1027
1028 or
1029
1030 sub copy_tree {
1031 use Data::Dumper;
1032 my $this = $_[0];
1033 $Data::Dumper::Purity = 1;
1034 return eval(Dumper($this->root));
1035 }
1036
1037 Both of these avoid you having to reinvent the wheel.
1038
1039 How to override copy_at_or_under with something that uses Storable
1040 or Data::Dumper is left as an exercise to the reader.
1041
1042 Consider that if in a derived class, you add attributes with really
1043 bizarre contents (like a unique-for-all-time-ID), you may need to
1044 override "copy". Consider:
1045
1046 sub copy {
1047 my($it, @etc) = @_;
1048 $it->SUPER::copy(@etc);
1049 $it->{'UID'} = &get_new_UID;
1050 }
1051
1052 ...or the like. See the source of Tree::DAG_Node::copy for
1053 inspiration.
1054
1055 $node->delete_tree
1056 Destroys the entire tree that $node is a member of (starting at the
1057 root), by nulling out each node-object's attributes (including,
1058 most importantly, its linkage attributes -- hopefully this is more
1059 than sufficient to eliminate all circularity in the data
1060 structure), and then moving it into the class DEADNODE.
1061
1062 Use this when you're finished with the tree in question, and want
1063 to free up its memory. (If you don't do this, it'll get freed up
1064 anyway when your program ends.)
1065
1066 If you try calling any methods on any of the node objects in the
1067 tree you've destroyed, you'll get an error like:
1068
1069 Can't locate object method "leaves_under"
1070 via package "DEADNODE".
1071
1072 So if you see that, that's what you've done wrong. (Actually, the
1073 class DEADNODE does provide one method: a no-op method
1074 "delete_tree". So if you want to delete a tree, but think you may
1075 have deleted it already, it's safe to call $node->delete_tree on it
1076 (again).)
1077
1078 The "delete_tree" method is needed because Perl's garbage collector
1079 would never (as currently implemented) see that it was time to de-
1080 allocate the memory the tree uses -- until either you call
1081 $node->delete_tree, or until the program stops (at "global
1082 destruction" time, when everything is unallocated).
1083
1084 Incidentally, there are better ways to do garbage-collecting on a
1085 tree, ways which don't require the user to explicitly call a method
1086 like "delete_tree" -- they involve dummy classes, as explained at
1087 "http://mox.perl.com/misc/circle-destroy.pod"
1088
1089 However, introducing a dummy class concept into Tree::DAG_Node
1090 would be rather a distraction. If you want to do this with your
1091 derived classes, via a DESTROY in a dummy class (or in a tree-
1092 metainformation class, maybe), then feel free to.
1093
1094 The only case where I can imagine "delete_tree" failing to totally
1095 void the tree, is if you use the hashref in the "attributes"
1096 attribute to store (presumably among other things) references to
1097 other nodes' "attributes" hashrefs -- which 1) is maybe a bit odd,
1098 and 2) is your problem, because it's your hash structure that's
1099 circular, not the tree's. Anyway, consider:
1100
1101 # null out all my "attributes" hashes
1102 $anywhere->root->walk_down({
1103 'callback' => sub {
1104 $hr = $_[0]->attributes; %$hr = (); return 1;
1105 }
1106 });
1107 # And then:
1108 $anywhere->delete_tree;
1109
1110 (I suppose "delete_tree" is a "destructor", or as close as you can
1111 meaningfully come for a circularity-rich data structure in Perl.)
1112
1113 When and How to Destroy
1114 It should be clear to you that if you've built a big parse tree or
1115 something, and then you're finished with it, you should call
1116 $some_node->delete_tree on it if you want the memory back.
1117
1118 But consider this case: you've got this tree:
1119
1120 A
1121 / | \
1122 B C D
1123 | | \
1124 E X Y
1125
1126 Let's say you decide you don't want D or any of its descendants in the
1127 tree, so you call D->unlink_from_mother. This does NOT automagically
1128 destroy the tree D-X-Y. Instead it merely splits the tree into two:
1129
1130 A D
1131 / \ / \
1132 B C X Y
1133 |
1134 E
1135
1136 To destroy D and its little tree, you have to explicitly call
1137 delete_tree on it.
1138
1139 Note, however, that if you call C->unlink_from_mother, and if you don't
1140 have a link to C anywhere, then it does magically go away. This is
1141 because nothing links to C -- whereas with the D-X-Y tree, D links to X
1142 and Y, and X and Y each link back to D. Note that calling
1143 C->delete_tree is harmless -- after all, a tree of only one node is
1144 still a tree.
1145
1146 So, this is a surefire way of getting rid of all $node's children and
1147 freeing up the memory associated with them and their descendants:
1148
1149 foreach my $it ($node->clear_daughters) { $it->delete_tree }
1150
1151 Just be sure not to do this:
1152
1153 foreach my $it ($node->daughters) { $it->delete_tree }
1154 $node->clear_daughters;
1155
1156 That's bad; the first call to $_->delete_tree will climb to the root of
1157 $node's tree, and nuke the whole tree, not just the bits under $node.
1158 You might as well have just called $node->delete_tree. (Moreavor, once
1159 $node is dead, you can't call clear_daughters on it, so you'll get an
1160 error there.)
1161
1163 If you find a bug in this library, report it to me as soon as possible,
1164 at the address listed in the MAINTAINER section, below. Please try to
1165 be as specific as possible about how you got the bug to occur.
1166
1168 If you develop a given routine for dealing with trees in some way, and
1169 use it a lot, then if you think it'd be of use to anyone else, do email
1170 me about it; it might be helpful to others to include that routine, or
1171 something based on it, in a later version of this module.
1172
1173 It's occurred to me that you might like to (and might yourself develop
1174 routines to) draw trees in something other than ASCII art. If you do
1175 so -- say, for PostScript output, or for output interpretable by some
1176 external plotting program -- I'd be most interested in the results.
1177
1179 This module uses "strict", but I never wrote it with -w warnings in
1180 mind -- so if you use -w, do not be surprised if you see complaints
1181 from the guts of DAG_Node. As long as there is no way to turn off -w
1182 for a given module (instead of having to do it in every single
1183 subroutine with a "local $^W"), I'm not going to change this. However,
1184 I do, at points, get bursts of ambition, and I try to fix code in
1185 DAG_Node that generates warnings, as I come across them -- which is
1186 only occasionally. Feel free to email me any patches for any such
1187 fixes you come up with, tho.
1188
1189 Currently I don't assume (or enforce) anything about the class
1190 membership of nodes being manipulated, other than by testing whether
1191 each one provides a method "is_node", a la:
1192
1193 die "Not a node!!!" unless UNIVERSAL::can($node, "is_node");
1194
1195 So, as far as I'm concerned, a given tree's nodes are free to belong to
1196 different classes, just so long as they provide/inherit "is_node", the
1197 few methods that this class relies on to navigate the tree, and have
1198 the same internal object structure, or a superset of it. Presumably
1199 this would be the case for any object belonging to a class derived from
1200 "Tree::DAG_Node", or belonging to "Tree::DAG_Node" itself.
1201
1202 When routines in this class access a node's "mother" attribute, or its
1203 "daughters" attribute, they (generally) do so directly (via
1204 $node->{'mother'}, etc.), for sake of efficiency. But classes derived
1205 from this class should probably do this instead thru a method (via
1206 $node->mother, etc.), for sake of portability, abstraction, and general
1207 goodness.
1208
1209 However, no routines in this class (aside from, necessarily, "_init",
1210 "_init_name", and "name") access the "name" attribute directly;
1211 routines (like the various tree draw/dump methods) get the "name" value
1212 thru a call to $obj->name(). So if you want the object's name to not
1213 be a real attribute, but instead have it derived dynamically from some
1214 feature of the object (say, based on some of its other attributes, or
1215 based on its address), you can to override the "name" method, without
1216 causing problems. (Be sure to consider the case of $obj->name as a
1217 write method, as it's used in "lol_to_tree" and "random_network".)
1218
1220 HTML::Element
1221
1222 Wirth, Niklaus. 1976. Algorithms + Data Structures = Programs
1223 Prentice-Hall, Englewood Cliffs, NJ.
1224
1225 Knuth, Donald Ervin. 1997. Art of Computer Programming, Volume 1,
1226 Third Edition: Fundamental Algorithms. Addison-Wesley, Reading, MA.
1227
1228 Wirth's classic, currently and lamentably out of print, has a good
1229 section on trees. I find it clearer than Knuth's (if not quite as
1230 encyclopedic), probably because Wirth's example code is in a block-
1231 structured high-level language (basically Pascal), instead of in
1232 assembler (MIX).
1233
1234 Until some kind publisher brings out a new printing of Wirth's book,
1235 try poking around used bookstores (or "www.abebooks.com") for a copy.
1236 I think it was also republished in the 1980s under the title Algorithms
1237 and Data Structures, and in a German edition called Algorithmen und
1238 Datenstrukturen. (That is, I'm sure books by Knuth were published
1239 under those titles, but I'm assuming that they're just later
1240 printings/editions of Algorithms + Data Structures = Programs.)
1241
1243 David Hand, "<cogent@cpan.org>"
1244
1246 Sean M. Burke, "<sburke@cpan.org>"
1247
1249 Copyright 1998-2001, 2004, 2007 by Sean M. Burke and David Hand.
1250
1251 This program is free software; you can redistribute it and/or modify it
1252 under the same terms as Perl itself.
1253
1254 This program is distributed in the hope that it will be useful, but
1255 without any warranty; without even the implied warranty of
1256 merchantability or fitness for a particular purpose.
1257
1258
1259
1260perl v5.10.1 2007-12-09 Tree::DAG_Node(3)