1Data::Stag(3) User Contributed Perl Documentation Data::Stag(3)

NAME
Data::Stag - Structured Tags datastructures

SYNOPSIS
    # PROCEDURAL USAGE
    use Data::Stag qw(:all);
    $doc = stag_parse($file);
    @persons = stag_find($doc, "person");
    foreach $p (@persons) {
        printf "%s, %s phone: %s\n",
          stag_sget($p, "family_name"),
          stag_sget($p, "given_name"),
          stag_sget($p, "phone_no"),
          ;
    }

    # OBJECT-ORIENTED USAGE
    use Data::Stag;
    $doc = Data::Stag->parse($file);
    @persons = $doc->find("person");
    foreach $p (@persons) {
        printf "%s, %s phone: %s\n",
          $p->sget("family_name"),
          $p->sget("given_name"),
          $p->sget("phone_no"),
          ;
    }

DESCRIPTION
This module is for manipulating data as hierarchical tag/value pairs
(Structured TAGs or Simple Tree AGgregates). These datastructures can
be represented as nested arrays, which have the advantage of being
native to perl. A simple example is shown below:
38
39 [ person=> [ [ family_name => $family_name ],
40 [ given_name => $given_name ],
41 [ phone_no => $phone_no ] ] ],
42
43 Data::Stag uses a subset of XML for import and export. This means the
44 module can also be used as a general XML parser/writer (with certain
45 caveats).
46
47 The above set of structured tags can be represented in XML as
48
49 <person>
50 <family_name>...</family_name>
51 <given_name>...</given_name>
52 <phone_no>...</phone_no>
53 </person>
54
55 This datastructure can be examined, manipulated and exported using Stag
56 functions or methods:
57
    $document = Data::Stag->parse($file);
    @persons = $document->find('person');
    foreach my $person (@persons) {
        $person->set('full_name',
                     $person->sget('given_name') . ' ' .
                     $person->sget('family_name'));
    }
65
66 Advanced querying is performed by passing functions, for example:
67
68 # get all people in dataset with name starting 'A'
69 @persons =
70 $document->where('person',
71 sub {shift->sget('family_name') =~ /^A/});
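
As a small self-contained variant of the query above, the sketch below builds the tree in memory rather than parsing a file. The element names and the sample people are made up for the illustration, and it assumes, as in the example above, that where() hands each candidate node to the subroutine and returns the nodes for which it is true.

```perl
use strict;
use warnings;
use Data::Stag;

# hypothetical sample data, built in memory instead of parsed from a file
my $document = Data::Stag->new(people => [
    [ person => [ [ given_name => 'Ada'  ], [ family_name => 'Adams' ] ] ],
    [ person => [ [ given_name => 'Bob'  ], [ family_name => 'Brown' ] ] ],
    [ person => [ [ given_name => 'Cleo' ], [ family_name => 'Abbot' ] ] ],
]);

# keep only the person nodes for which the test subroutine returns true
my @a_people = $document->where('person',
                                sub { shift->sget('family_name') =~ /^A/ });

print $_->sget('given_name'), "\n" foreach @a_people;
```

The test is an ordinary Perl closure, so it can combine any number of conditions without a separate query language.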
72
One of the things that marks this module out against other XML modules
is this emphasis on a functional approach, as a complement to an
object-oriented or procedural approach.
76
77 For full information on the stag project, see
78 <http://stag.sourceforge.net>
79
80 PROCEDURAL VS OBJECT-ORIENTED USAGE
Depending on your preference, this module can be used as a set of
procedural subroutine calls, or as method calls upon Data::Stag
objects, or both.
84
85 In procedural mode, all the subroutine calls are prefixed "stag_" to
86 avoid namespace clashes. The following three calls are equivalent:
87
88 $person = stag_find($doc, "person");
89 $person = $doc->find("person");
90 $person = $doc->find_person;
91
92 In object mode, you can treat any tree element as if it is an object
93 with automatically defined methods for getting/setting the tag values.
94
95 USE OF XML
96 Nested arrays can be imported and exported as XML, as well as other
97 formats. XML can be slurped into memory all at once (using less memory
98 than an equivalent DOM tree), or a simplified SAX style event handling
99 model can be used. Similarly, data can be exported all at once, or as a
100 series of events.
101
102 Although this module can be used as a general XML tool, it is intended
103 primarily as a tool for manipulating hierarchical data using nested
104 tag/value pairs.
105
106 This module is more suited to dealing with data-oriented documents than
107 text-oriented documents.
108
109 By using a simpler subset of XML equivalent to a basic data tree
110 structure, we can write simpler, cleaner code.
111
112 This module is ideally suited to element-only XML (that is, XML without
113 attributes or mixed elements).
114
115 If you are using attributes or mixed elements, it is useful to know
116 what is going on under the hood.
117
118 All attributes are turned into elements; they are nested inside an
119 element with name '@'.
120
121 For example, the following piece of XML
122
123 <foo id="x">
124 <bar>ugh</bar>
125 </foo>
126
127 Gets represented internally as
128
129 <foo>
130 <@>
131 <id>x</id>
132 </@>
133 <bar>ugh</bar>
134 </foo>
135
136 Of course, this is not valid XML. However, it is just an internal
137 representation - when exporting back to XML it will look like normal
138 XML with attributes again.
139
140 Mixed content cannot be represented in a simple tree format, so this is
141 also expanded.
142
143 The following piece of XML
144
145 <paragraph id="1" color="green">
146 example of <bold>mixed</bold>content
147 </paragraph>
148
149 gets parsed as if it were actually:
150
151 <paragraph>
152 <@>
153 <id>1</id>
154 <color>green</color>
155 </@>
156 <.>example of</.>
157 <bold>mixed</bold>
158 <.>content</.>
159 </paragraph>
160
When using stag with XML that contains attributes or mixed content,
you can treat '@' and '.' as normal elements.
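
In the nested-array form used throughout this document, the expanded paragraph above might be written as in the sketch below. It is plain Perl with no dependency on the module; the variable names and the helper attr() are invented for the illustration.

```perl
use strict;
use warnings;

# <paragraph id="1" color="green">example of <bold>mixed</bold>content</paragraph>
# written as nested [ name => data ] pairs; '@' holds the attributes and
# '.' holds each run of text in the mixed content
my $paragraph =
  [ paragraph => [
      [ '@'  => [ [ id    => '1'     ],
                  [ color => 'green' ] ] ],
      [ '.'  => 'example of' ],
      [ bold => 'mixed'      ],
      [ '.'  => 'content'    ],
  ] ];

# hypothetical helper: look up one attribute by walking the '@' child
sub attr {
    my ($node, $attr_name) = @_;
    my ($name, $data) = @$node;
    return unless ref $data;                 # terminal node: no attributes
    foreach my $kid (@$data) {
        next unless $kid->[0] eq '@';
        foreach my $pair (@{ $kid->[1] }) {
            return $pair->[1] if $pair->[0] eq $attr_name;
        }
    }
    return;
}

# text runs are ordinary children named '.'
my @text = map { $_->[1] } grep { $_->[0] eq '.' } @{ $paragraph->[1] };

print attr($paragraph, 'color'), "\n";
print join('|', @text), "\n";
```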
163
164 SAX
165
166 This module can also be used as part of a SAX-style event generation /
167 handling framework - see Data::Stag::BaseHandler
168
169 PERL REPRESENTATION
170
171 Because nested arrays are native to perl, we can specify an XML
172 datastructure directly in perl without going through multiple object
173 calls.
174
175 For example, instead of using XML::Writer for the lengthy
176
    $obj->startTag("record");
    $obj->startTag("field1");
    $obj->characters("foo");
    $obj->endTag("field1");
    $obj->startTag("field2");
    $obj->characters("bar");
    $obj->endTag("field2");
    $obj->endTag("record");
185
186 We can instead write
187
188 $struct = [ record => [
189 [ field1 => 'foo'],
190 [ field2 => 'bar']]];
191
192 PARSING
193
194 The following example is for parsing out subsections of a tree and
195 changing sub-elements
196
    use Data::Stag qw(:all);
    my $tree = stag_parse($xmlfile);
    my ($subtree) = stag_findnode($tree, $element);
    stag_set($subtree, $sub_element, $new_val);
    print stag_xml($subtree);
202
203 OBJECT ORIENTED
204
205 The same can be done in a more OO fashion
206
    use Data::Stag qw(:all);
    my $tree = Data::Stag->parse($xmlfile);
    my ($subtree) = $tree->findnode($element);
    $subtree->set($sub_element, $new_val);
    print $subtree->xml;
212
213 IN A STREAM
214
Rather than parsing a whole file into memory all at once (which may
not be suitable for very large files), you can take an event handling
approach. The easiest way to do this is to register which nodes in the
file you are interested in using the makehandler method. The parser
will sweep through the file, building objects as it goes, and handing
the object to a subroutine that you specify.
221
222 For example:
223
224 use Data::Stag;
225 # catch the end of 'person' elements
226 my $h = Data::Stag->makehandler( person=> sub {
227 my ($self, $person) = @_;
228 printf "name:%s phone:%s\n",
229 $person->get_name,
230 $person->get_phone;
231 return; # clear node
232 });
233 Data::Stag->parse(-handler=>$h,
234 -file=>$f);
235
236 see Data::Stag::BaseHandler for writing handlers
237
238 See the Stag website at <http://stag.sourceforge.net> for more
239 examples.
240
241 STRUCTURED TAGS TREE DATA STRUCTURE
A tree of structured tags is represented as a recursively nested array;
the elements of the array represent nodes in the tree.

A node is a name/data pair that can represent tags and values. A node
is represented using a reference to an array, where the first element
of the array is the tagname, or element, and the second element is the
data.
249
250 This can be visualised as a box:
251
252 +-----------+
253 |Name | Data|
254 +-----------+
255
256 In perl, we represent this pair as a reference to an array
257
258 [ Name => $Data ]
259
260 The Data can either be a list of child nodes (subtrees), or a data
261 value.
262
The terminal nodes (leaves of the tree) contain data values; this is
represented in perl using primitive scalars.
265
266 For example:
267
268 [ Name => 'Fred' ]
269
For non-terminal nodes, the Data is a reference to an array, where each
element of the array is a new node.
272
    +-----------+
    |Name | Data|
    +-----------+
            |||   +-----------+
            ||+-->|Name | Data|
            ||    +-----------+
            ||
            ||    +-----------+
            |+--->|Name | Data|
            |     +-----------+
            |
            |     +-----------+
            +---->|Name | Data|
                  +-----------+
287
288 In perl this would be:
289
290 [ Name => [
291 [Name1 => $Data1],
292 [Name2 => $Data2],
293 [Name3 => $Data3],
294 ]
295 ];
296
297 The extra level of nesting is required to be able to store any node in
298 the tree using a single variable. This representation has lots of
299 advantages over others, eg hashes and mixed hash/array structures.
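
The recursive shape described above can be walked with a few lines of plain Perl. The sketch below does not use the module; the subroutine name leaf_paths and the sample record are invented for the illustration.

```perl
use strict;
use warnings;

my $tree =
  [ person => [
      [ name    => 'Fred' ],
      [ address => [
          [ street => '1 Main St' ],
          [ city   => 'Springfield' ],
      ] ],
  ] ];

# return "path=value" strings for every terminal node under $node
sub leaf_paths {
    my ($node, $prefix) = @_;
    my ($name, $data) = @$node;
    my $path = defined $prefix ? "$prefix/$name" : $name;
    # a non-terminal node holds a reference to an array of child nodes
    return map { leaf_paths($_, $path) } @$data if ref($data) eq 'ARRAY';
    # a terminal node holds a primitive scalar
    return "$path=$data";
}

my @leaves = leaf_paths($tree);
print "$_\n" foreach @leaves;
```

Because every node, terminal or not, is the same two-element pair, a single subroutine can handle any position in the tree.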
300
301 MANIPULATION AND QUERYING
The following example is taken from biology; we have a list of species
(mouse, human, fly) and a list of genes found in those species. These
are cross-referenced by an identifier called tax_id. We can do a
relational-style inner join on this identifier, as follows:
306
307 use Data::Stag qw(:all);
308 my $tree =
309 Data::Stag->new(
310 'db' => [
311 [ 'species_set' => [
312 [ 'species' => [
313 [ 'common_name' => 'house mouse' ],
314 [ 'binomial' => 'Mus musculus' ],
315 [ 'tax_id' => '10090' ]]],
316 [ 'species' => [
317 [ 'common_name' => 'fruit fly' ],
318 [ 'binomial' => 'Drosophila melanogaster' ],
319 [ 'tax_id' => '7227' ]]],
320 [ 'species' => [
321 [ 'common_name' => 'human' ],
322 [ 'binomial' => 'Homo sapiens' ],
323 [ 'tax_id' => '9606' ]]]]],
324 [ 'gene_set' => [
325 [ 'gene' => [
326 [ 'symbol' => 'HGNC' ],
327 [ 'tax_id' => '9606' ],
328 [ 'phenotype' => 'Hemochromatosis' ],
329 [ 'phenotype' => 'Porphyria variegata' ],
330 [ 'GO_term' => 'iron homeostasis' ],
331 [ 'map' => '6p21.3' ]]],
332 [ 'gene' => [
333 [ 'symbol' => 'Hfe' ],
334 [ 'synonym' => 'MR2' ],
335 [ 'tax_id' => '10090' ],
336 [ 'GO_term' => 'integral membrane protein' ],
337 [ 'map' => '13 A2-A4' ]]]]]]
338 );
339
340 # inner join of species and gene parts of tree,
341 # based on 'tax_id' element
342 my $gene_set = $tree->find("gene_set"); # get <gene_set> element
343 my $species_set = $tree->find("species_set"); # get <species_set> element
344 $gene_set->ijoin("gene", "tax_id", $species_set); # INNER JOIN
345
346 print "Reorganised data:\n";
347 print $gene_set->xml;
348
    # find all human genes (species/common_name = human) whose symbol starts with 'H'
    my @genes =
      $gene_set->where('gene',
                       sub { my $g = shift;
                             $g->get_symbol =~ /^H/ &&
                             $g->findval("common_name") eq ('human')});
355
356 print "Human genes beginning 'H'\n";
357 print $_->xml foreach @genes;
358
359 S-Expression (Lisp) representation
360 The data represented using this module can be represented as Lisp-style
361 S-Expressions.
362
363 See Data::Stag::SxprParser and Data::Stag::SxprWriter
364
365 If we execute this code on the XML from the example above
366
367 $stag = Data::Stag->parse($xmlfile);
368 print $stag->sxpr;
369
370 The following S-Expression will be printed:
371
372 '(db
373 (species_set
374 (species
375 (common_name "house mouse")
376 (binomial "Mus musculus")
377 (tax_id "10090"))
378 (species
379 (common_name "fruit fly")
380 (binomial "Drosophila melanogaster")
381 (tax_id "7227"))
382 (species
383 (common_name "human")
384 (binomial "Homo sapiens")
385 (tax_id "9606")))
386 (gene_set
387 (gene
388 (symbol "HGNC")
389 (tax_id "9606")
390 (phenotype "Hemochromatosis")
391 (phenotype "Porphyria variegata")
392 (GO_term "iron homeostasis")
393 (map
394 (cytological
395 (chromosome "6")
396 (band "p21.3"))))
397 (gene
398 (symbol "Hfe")
399 (synonym "MR2")
400 (tax_id "10090")
401 (GO_term "integral membrane protein")))
402 (similarity_set
403 (pair
404 (symbol "HGNC")
405 (symbol "Hfe"))
406 (pair
407 (symbol "WNT3A")
408 (symbol "Wnt3a"))))
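
To show how directly the nested arrays map onto this notation, here is a sketch in plain Perl that renders a small tree on one line. It is not the module's writer (see Data::Stag::SxprWriter for that); the name to_sxpr is invented, the output is not indented, and terminal values are always quoted without escaping embedded quotes.

```perl
use strict;
use warnings;

sub to_sxpr {
    my ($node) = @_;
    my ($name, $data) = @$node;
    # non-terminal: render each child recursively inside the parentheses
    return "($name " . join(' ', map { to_sxpr($_) } @$data) . ")"
        if ref($data) eq 'ARRAY';
    # terminal: quoted data value
    return qq{($name "$data")};
}

my $species =
  [ species => [
      [ common_name => 'human' ],
      [ binomial    => 'Homo sapiens' ],
      [ tax_id      => '9606' ],
  ] ];

my $sxpr = to_sxpr($species);
print "$sxpr\n";
```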
409
410 TIPS FOR EMACS USERS AND LISP PROGRAMMERS
411
412 If you use emacs, you can save this as a file with the ".el" suffix and
413 get syntax highlighting for editing this file. Quotes around the
414 terminal node data items are optional.
415
416 If you know emacs lisp or any other lisp, this also turns out to be a
417 very nice language for manipulating these datastructures. Try copying
418 and pasting the above s-expression to the emacs scratch buffer and
419 playing with it in lisp.
420
421 INDENTED TEXT REPRESENTATION
422 Data::Stag has its own text format for writing data trees. Again, this
423 is only possible because we are working with a subset of XML (no
424 attributes, no mixed elements). The data structure above can be written
425 as follows -
426
427 db:
428 species_set:
429 species:
430 common_name: house mouse
431 binomial: Mus musculus
432 tax_id: 10090
433 species:
434 common_name: fruit fly
435 binomial: Drosophila melanogaster
436 tax_id: 7227
437 species:
438 common_name: human
439 binomial: Homo sapiens
440 tax_id: 9606
441 gene_set:
442 gene:
443 symbol: HGNC
444 tax_id: 9606
445 phenotype: Hemochromatosis
446 phenotype: Porphyria variegata
447 GO_term: iron homeostasis
448 map: 6p21.3
449 gene:
450 symbol: Hfe
451 synonym: MR2
452 tax_id: 10090
453 GO_term: integral membrane protein
454 map: 13 A2-A4
455 similarity_set:
456 pair:
457 symbol: HGNC
458 symbol: Hfe
459 pair:
460 symbol: WNT3A
461 symbol: Wnt3a
462
463 See Data::Stag::ITextParser and Data::Stag::ITextWriter
464
465 NESTED ARRAY SPECIFICATION II
466 To avoid excessive square bracket usage, you can specify a structure
467 like this:
468
469 use Data::Stag qw(:all);
470
471 *N = \&stag_new;
472 my $tree =
473 N(top=>[
474 N('personset'=>[
475 N('person'=>[
476 N('name'=>'davey'),
477 N('address'=>'here'),
478 N('description'=>[
479 N('hair'=>'green'),
480 N('eyes'=>'two'),
481 N('teeth'=>5),
482 ]
483 ),
484 N('pets'=>[
485 N('petname'=>'igor'),
486 N('petname'=>'ginger'),
487 ]
488 ),
489
490 ],
491 ),
492 N('person'=>[
493 N('name'=>'shuggy'),
494 N('address'=>'there'),
495 N('description'=>[
496 N('hair'=>'red'),
497 N('eyes'=>'three'),
498 N('teeth'=>1),
499 ]
500 ),
501 N('pets'=>[
502 N('petname'=>'thud'),
503 N('petname'=>'spud'),
504 ]
505 ),
506 ]
507 ),
508 ]
509 ),
510 N('animalset'=>[
511 N('animal'=>[
512 N('name'=>'igor'),
513 N('class'=>'rat'),
514 N('description'=>[
515 N('fur'=>'white'),
516 N('eyes'=>'red'),
517 N('teeth'=>50),
518 ],
519 ),
520 ],
521 ),
522 ]
523 ),
524
525 ]
526 );
527
528 # find all people
529 my @persons = stag_find($tree, 'person');
530
531 # write xml for all red haired people
532 foreach my $p (@persons) {
533 print stag_xml($p)
534 if stag_tmatch($p, "hair", "red");
535 } ;
536
537 # find all people that have name == shuggy
538 my @p =
539 stag_qmatch($tree,
540 "person",
541 "name",
542 "shuggy");
543
545 As well as the methods listed below, a node can be treated as if it is
546 a data object of a class determined by the element.
547
548 For example, the following are equivalent.
549
550 $node->get_name;
551 $node->get('name');
552
553 $node->set_name('fred');
554 $node->set('name', 'fred');
555
556 This is really just syntactic sugar. The autoloaded methods are not
557 checked against any schema, although this may be added in future.
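
The kind of sugar involved can be pictured with Perl's AUTOLOAD mechanism. The sketch below is a stand-alone illustration, not the module's code: the package Sketch::Node and its simplified get() and set() are invented, and set() here handles a single value only.

```perl
use strict;
use warnings;

package Sketch::Node;

our $AUTOLOAD;

sub new {
    my ($class, $name, $data) = @_;
    return bless [ $name => $data ], $class;
}

# values of the direct children called $element
sub get {
    my ($self, $element) = @_;
    my @v = map { $_->[1] } grep { $_->[0] eq $element } @{ $self->[1] };
    return wantarray ? @v : $v[0];
}

# replace the first child called $element, or append a new one
sub set {
    my ($self, $element, $value) = @_;
    foreach my $kid (@{ $self->[1] }) {
        if ($kid->[0] eq $element) { $kid->[1] = $value; return $value }
    }
    push @{ $self->[1] }, [ $element => $value ];
    return $value;
}

# get_name(...) becomes get('name', ...); set_name(...) becomes set('name', ...)
sub AUTOLOAD {
    my $self = shift;
    my $method = $AUTOLOAD;
    $method =~ s/.*:://;
    return if $method eq 'DESTROY';
    my ($op, $element) = $method =~ /^(get|set)_(\w+)$/
        or die "no such method: $method";
    return $self->$op($element, @_);
}

package main;

my $node = Sketch::Node->new(person => [ [ name => 'shuggy' ] ]);
$node->set_name('fred');
print $node->get_name, "\n";
```

Nothing checks that 'name' is a legal element, which matches the remark above that the autoloaded methods are not validated against a schema.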
558
560 A stag tree can be indexed as a hash for direct retrieval; see
561 Data::Stag::HashDB
562
563 This index can be made persistent as a DB file; see Data::Stag::StagDB
564
565 If you wish to use Stag in conjunction with a relational database, you
566 should install DBIx::DBStag
567
569 All method calls are also available as procedural subroutine calls;
570 unless otherwise noted, the subroutine call is the same as the method
571 call, but with the string stag_ prefixed to the method name. The first
572 argument should be a Data::Stag datastructure.
573
574 To import all subroutines into the current namespace, use this idiom:
575
576 use Data::Stag qw(:all);
577 $doc = stag_parse($file);
578 @persons = stag_find($doc, 'person');
579
580 If you wish to use this module procedurally, and you are too lazy to
581 prefix all calls with stag_, use this idiom:
582
583 use Data::Stag qw(:lazy);
584 $doc = parse($file);
585 @persons = find($doc, 'person');
586
587 But beware of clashes!
588
Most method calls also have a handy short mnemonic. Use of these is
optional. Software engineering types prefer longer names, in the belief
that this leads to clearer code. Hacker types prefer shorter names, as
this requires fewer keystrokes, and leads to a more compact
representation of the code. It is expected that if you do use this
module, then its usage will be fairly ubiquitous within your code, and
the mnemonics will become familiar, much like the qw and s/ operators
in perl. As always with perl, the decision is yours.
597
598 Some methods take a single parameter or list of parameters; some have
599 large lists of parameters that can be passed in any order. If the
600 documentation states:
601
602 Args: [x str], [y int], [z ANY]
603
604 Then the method can be called like this:
605
606 $stag->foo("this is x", 55, $ref);
607
608 or like this:
609
610 $stag->foo(-z=>$ref, -x=>"this is x", -y=>55);
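
One way a subroutine can accept both calling styles is sketched below in plain Perl. This is only an illustration of the convention, not the module's own argument handling; foo and its parameters x, y and z are the placeholders from the text, and the sketch assumes the first positional argument never starts with a dash.

```perl
use strict;
use warnings;

sub foo {
    my @args  = @_;
    my @order = qw(x y z);          # the documented parameter order
    my %p;
    if (@args && !ref($args[0]) && defined $args[0] && $args[0] =~ /^-[a-z]/) {
        # named style: -x => ..., -y => ..., in any order
        my %named = @args;
        $p{$_} = $named{"-$_"} foreach @order;
    }
    else {
        # positional style: values arrive in the documented order
        @p{@order} = @args;
    }
    return \%p;
}

my $ref = [ 1, 2, 3 ];
my $by_position = foo("this is x", 55, $ref);
my $by_name     = foo(-z => $ref, -x => "this is x", -y => 55);
printf "%s / %s\n", $by_position->{y}, $by_name->{y};
```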
611
612 INITIALIZATION METHODS
613 new
614
615 Title: new
616
617 Args: element str, data STAG-DATA
618 Returns: Data::Stag node
619 Example: $node = stag_new();
620 Example: $node = Data::Stag->new;
621 Example: $node = Data::Stag->new(person => [[name=>$n], [phone=>$p]]);
622
623 creates a new instance of a Data::Stag node
624
625 stagify (nodify)
626
627 Title: stagify
628 Synonym: nodify
629 Args: data ARRAY-REF
630 Returns: Data::Stag node
631 Example: $node = stag_stagify([person => [[name=>$n], [phone=>$p]]]);
632
633 turns a perl array reference into a Data::Stag node.
634
635 similar to new
636
637 parse
638
639 Title: parse
640
641 Args: [file str], [format str], [handler obj], [fh FileHandle]
642 Returns: Data::Stag node
643 Example: $node = stag_parse($fn);
644 Example: $node = stag_parse(-fh=>$fh, -handler=>$h, -errhandler=>$eh);
645 Example: $node = Data::Stag->parse(-file=>$fn, -handler=>$myhandler);
646
647 slurps a file or string into a Data::Stag node structure. Will guess
648 the format (xml, sxpr, itext, indent) from the suffix if it is not
649 given.
650
The format can also be the name of a parsing module, or an actual
parser object.
653
654 The handler is any object that can take nested Stag events
655 (start_event, end_event, evbody) which are generated from the parse. If
656 the handler is omitted, all events will be cached and the resulting
657 tree will be returned.
658
659 See Data::Stag::BaseHandler for writing your own handlers
660
661 See Data::Stag::BaseGenerator for details on parser classes, and error
662 handling
663
664 parsestr
665
666 Title: parsestr
667
668 Args: [str str], [format str], [handler obj]
669 Returns: Data::Stag node
670 Example: $node = stag_parsestr('(a (b (c "1")))');
671 Example: $node = Data::Stag->parsestr(-str=>$str, -handler=>$myhandler);
672
673 Similar to parse(), except the first argument is a string
674
675 from
676
677 Title: from
678
679 Args: format str, source str
680 Returns: Data::Stag node
681 Example: $node = stag_from('xml', $fn);
682 Example: $node = stag_from('xmlstr', q[<top><x>1</x></top>]);
683 Example: $node = Data::Stag->from($parser, $fn);
684
685 Similar to parse
686
687 slurps a file or string into a Data::Stag node structure.
688
689 The format can also be the name of a parsing module, or an actual
690 parser object
691
692 unflatten
693
694 Title: unflatten
695
696 Args: data array
697 Returns: Data::Stag node
698 Example: $node = stag_unflatten(person=>[name=>$n, phone=>$p, address=>[street=>$s, city=>$c]]);
699
700 Creates a node structure from a semi-flattened representation, in which
701 children of a node are represented as a flat list of data rather than a
702 list of array references.
703
704 This means a structure can be specified as:
705
706 person=>[name=>$n,
707 phone=>$p,
708 address=>[street=>$s,
709 city=>$c]]
710
711 Instead of:
712
713 [person=>[ [name=>$n],
714 [phone=>$p],
715 [address=>[ [street=>$s],
716 [city=>$c] ] ]
717 ]
718 ]
719
720 The former gets converted into the latter for the internal
721 representation
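
The conversion can be pictured with the following plain-Perl sketch. It is an illustration rather than the module's implementation: the name unflatten_sketch is invented, and it assumes that every array reference in value position is itself a flat name/value list.

```perl
use strict;
use warnings;

# turn (name => value, name => [ name => value, ... ]) into [ name => data ] nodes
sub unflatten_sketch {
    my ($name, $data) = @_;
    # a plain scalar becomes a terminal node
    return [ $name => $data ] unless ref($data) eq 'ARRAY';
    my @flat = @$data;
    my @kids;
    while (@flat) {
        # consume the flat list two items at a time: name, then value
        my ($kid_name, $kid_data) = splice(@flat, 0, 2);
        push @kids, unflatten_sketch($kid_name, $kid_data);
    }
    return [ $name => \@kids ];
}

my $node = unflatten_sketch(
    person => [ name    => 'fred',
                phone   => '555-1111',
                address => [ street => '1 Main St',
                             city   => 'Springfield' ] ]);

# $node now has the fully nested form:
# [ person => [ [ name => 'fred' ],
#               [ phone => '555-1111' ],
#               [ address => [ [ street => '1 Main St' ],
#                              [ city => 'Springfield' ] ] ] ] ]
print $node->[1][2][1][1][0], "\n";
```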
722
723 makehandler
724
725 Title: makehandler
726
727 Args: hash of CODEREFs keyed by element name
728 OR a string containing the name of a module
729 Returns: L<Data::Stag::BaseHandler>
730 Example: $h = Data::Stag->makehandler(%subs);
731 Example: $h = Data::Stag->makehandler("My::FooHandler");
732 Example: $h = Data::Stag->makehandler('xml');
733
This creates a Stag event handler. The argument is a hash of
subroutines keyed by element/node name. After each node is fired by the
parser/generator, the subroutine is called, passing the handler object
and the stag node as arguments. Whatever the subroutine returns is
placed back into the tree.

For example, for a parser/generator that fires events with the
following tree form
742
743 <person>
744 <name>foo</name>
745 ...
746 </person>
747
748 we can create a handler that writes person/name like this:
749
    $h = Data::Stag->makehandler(
        person => sub { my ($self, $stag) = @_;
                        print $stag->get_name;
                        return $stag; # don't change tree
                      });
    $stag = Data::Stag->parse(-str=>"(...)", -handler=>$h);
756
757 See Data::Stag::BaseHandler for details on handlers
758
759 getformathandler
760
761 Title: getformathandler
762
763 Args: format str OR L<Data::Stag::BaseHandler>
764 Returns: L<Data::Stag::BaseHandler>
765 Example: $h = Data::Stag->getformathandler('xml');
766 $h->file("my.xml");
767 Data::Stag->parse(-fn=>$fn, -handler=>$h);
768
769 Creates a Stag event handler - this handler can be passed to an event
770 generator / parser. Built in handlers include:
771
772 xml Generates xml tags from events
773
774 sxpr
775 Generates S-Expressions from events
776
777 itext
778 Generates itext format from events
779
780 indent
781 Generates indent format from events
782
783 All the above are kinds of Data::Stag::Writer
784
785 chainhandler
786
787 Title: chainhandler
788
789 Args: blocked events - str or str[]
790 initial handler - handler object
791 final handler - handler object
792 Returns:
793 Example: $h = Data::Stag->chainhandler('foo', $processor, 'xml')
794
chains handlers together - for example, you may want to make transforms
on an event stream, and then pass the event stream to another handler,
such as an xml handler
798
799 $processor = Data::Stag->makehandler(
800 a => sub { my ($self,$stag) = @_;
801 $stag->set_foo("bar");
802 return $stag
803 },
804 b => sub { my ($self,$stag) = @_;
805 $stag->set_blah("eek");
806 return $stag
807 },
808 );
809 $chainh = Data::Stag->chainhandler(['a', 'b'], $processor, 'xml');
810 $stag = Data::Stag->parse(-str=>"(...)", -handler=>$chainh)
811
812 If the inner handler has a method CONSUMES(), this method will
813 determine the blocked events if none are specified.
814
815 see also the script stag-handle.pl
816
817 RECURSIVE SEARCHING
818 find (f)
819
820 Title: find
821 Synonym: f
822
823 Args: element str
824 Returns: node[] or ANY
825 Example: @persons = stag_find($struct, 'person');
826 Example: @persons = $struct->find('person');
827
828 recursively searches tree for all elements of the given type, and
829 returns all nodes or data elements found.
830
if the element found is a non-terminal node, will return the node; if
the element found is a terminal (leaf) node, will return the data value
833
834 the element argument can be a path
835
836 @names = $struct->find('department/person/name');
837
838 will find name in the nested structure below:
839
840 (department
841 (person
842 (name "foo")))
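
The return rule (nodes for non-terminals, data values for terminals) can be mimicked in plain Perl as below. This is a sketch of the behaviour described above, not the module's code; find_sketch is an invented name, it does not handle paths, and continuing the search beneath a node that already matched is a choice made for the sketch.

```perl
use strict;
use warnings;

sub find_sketch {
    my ($node, $wanted) = @_;
    my ($name, $data) = @$node;
    my @found;
    if ($name eq $wanted) {
        # non-terminal: hand back the node itself; terminal: hand back the value
        push @found, ref($data) eq 'ARRAY' ? $node : $data;
    }
    # keep searching below non-terminal nodes
    push @found, map { find_sketch($_, $wanted) } @$data
        if ref($data) eq 'ARRAY';
    return @found;
}

my $struct =
  [ department => [
      [ person => [ [ name => 'foo' ] ] ],
      [ person => [ [ name => 'bar' ] ] ],
  ] ];

my @names   = find_sketch($struct, 'name');     # two plain strings
my @persons = find_sketch($struct, 'person');   # two array references (nodes)
print "@names\n";
```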
843
844 findnode (fn)
845
846 Title: findnode
847 Synonym: fn
848
849 Args: element str
850 Returns: node[]
851 Example: @persons = stag_findnode($struct, 'person');
852 Example: @persons = $struct->findnode('person');
853
854 recursively searches tree for all elements of the given type, and
855 returns all nodes found.
856
857 paths can also be used (see find)
858
859 findval (fv)
860
861 Title: findval
862 Synonym: fv
863
864 Args: element str
865 Returns: ANY[] or ANY
866 Example: @names = stag_findval($struct, 'name');
867 Example: @names = $struct->findval('name');
868 Example: $firstname = $struct->findval('name');
869
870 recursively searches tree for all elements of the given type, and
871 returns all data values found. the data values could be primitive
872 scalars or nodes.
873
874 paths can also be used (see find)
875
876 sfindval (sfv)
877
878 Title: sfindval
879 Synonym: sfv
880
881 Args: element str
882 Returns: ANY
883 Example: $name = stag_sfindval($struct, 'name');
884 Example: $name = $struct->sfindval('name');
885
886 as findval, but returns the first value found
887
888 paths can also be used (see find)
889
890 findvallist (fvl)
891
892 Title: findvallist
893 Synonym: fvl
894
895 Args: element str[]
896 Returns: ANY[]
897 Example: ($name, $phone) = stag_findvallist($personstruct, 'name', 'phone');
898 Example: ($name, $phone) = $personstruct->findvallist('name', 'phone');
899
900 recursively searches tree for all elements in the list
901
902 DEPRECATED
903
904 DATA ACCESSOR METHODS
905 these allow getting and setting of elements directly underneath the
906 current one
907
908 get (g)
909
910 Title: get
911 Synonym: g
912
913 Args: element str
914 Return: node[] or ANY
915 Example: $name = $person->get('name');
916 Example: @phone_nos = $person->get('phone_no');
917
918 gets the value of the named sub-element
919
if the sub-element is a non-terminal, will return a node(s); if the
sub-element is a terminal (leaf), it will return the data value(s)
922
923 the examples above would work on a data structure like this:
924
925 [person => [ [name => 'fred'],
926 [phone_no => '1-800-111-2222'],
927 [phone_no => '1-415-555-5555']]]
928
929 will return an array or single value depending on the context
930
[equivalent to findval(), except that only direct children (as opposed
to all descendants) are checked]
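
The difference between direct children and all descendants can be seen in a short sketch like the one below. It uses only calls documented on this page; the department data is made up for the illustration.

```perl
use strict;
use warnings;
use Data::Stag;

my $dept = Data::Stag->new(department => [
    [ name   => 'Research' ],
    [ person => [ [ name     => 'fred' ],
                  [ phone_no => '1-800-111-2222' ] ] ],
]);

# get() looks only at the direct children of <department>
my @direct = $dept->get('name');

# findval() searches all descendants, so it also sees person/name
my @all = $dept->findval('name');

print scalar(@direct), " direct, ", scalar(@all), " in total\n";
```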
933
934 paths can also be used, like this:
935
    @phone_nos = $struct->get('person/phone_no');
937
938 sget (sg)
939
940 Title: sget
941 Synonym: sg
942
943 Args: element str
944 Return: ANY
945 Example: $name = $person->sget('name');
946 Example: $phone = $person->sget('phone_no');
947 Example: $phone = $person->sget('department/person/name');
948
949 as get but always returns a single value
950
[equivalent to sfindval(), except that only direct children (as opposed
to all descendants) are checked]
953
954 getl (gl getlist)
955
Title: getl
Synonym: gl
Synonym: getlist

Args: element str[]
Return: node[] or ANY[]
Example: ($name, @phone) = $person->getl('name', 'phone_no');
963
964 returns the data values for a list of sub-elements of a node
965
[equivalent to findvallist(), except that only direct children (as
opposed to all descendants) are checked]
968
969 getn (gn getnode)
970
971 Title: getn
972 Synonym: gn
973 Synonym: getnode
974
975 Args: element str
976 Return: node[]
977 Example: $namestruct = $person->getn('name');
978 Example: @pstructs = $person->getn('phone_no');
979
980 as get but returns the whole node rather than just the data value
981
[equivalent to findnode(), except that only direct children (as opposed
to all descendants) are checked]
984
985 sgetmap (sgm)
986
987 Title: sgetmap
988 Synonym: sgm
989
990 Args: hash
991 Return: hash
992 Example: %h = $person->sgetmap('social-security-no'=>'id',
993 'name' =>'label',
994 'job' =>0,
995 'address' =>'location');
996
returns a hash of key/val pairs built from the data values of the
subnodes of the current element; keys are mapped according to the hash
passed (a value of '' or 0 keeps the key unchanged).
1000
1001 no multivalued data elements are allowed
1002
1003 set (s)
1004
1005 Title: set
1006 Synonym: s
1007
1008 Args: element str, datavalue ANY (list)
1009 Return: ANY
1010 Example: $person->set('name', 'fred'); # single val
1011 Example: $person->set('phone_no', $cellphone, $homephone);
1012
1013 sets the data value of an element for any node. if the element is
1014 multivalued, all the old values will be replaced with the new ones
1015 specified.
1016
1017 ordering will be preserved, unless the element specified does not
1018 exist, in which case, the new tag/value pair will be placed at the end.
1019
1020 for example, if we have a stag node $person
1021
1022 person:
1023 name: shuggy
1024 job: bus driver
1025
1026 if we do this
1027
1028 $person->set('name', ());
1029
1030 we will end up with
1031
1032 person:
1033 job: bus driver
1034
1035 then if we do this
1036
1037 $person->set('name', 'shuggy');
1038
the 'name' node will be placed as the last child
1040
1041 person:
1042 job: bus driver
1043 name: shuggy
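
The ordering rule can be restated as a plain-Perl sketch over the nested-array form. It illustrates the behaviour described above rather than reproducing the module's code; set_sketch is an invented name and it handles terminal values only.

```perl
use strict;
use warnings;

# replace the values of $name in place, or append them if $name is absent
sub set_sketch {
    my ($node, $name, @values) = @_;
    my @new_kids = map { [ $name => $_ ] } @values;
    my @kids;
    my $placed = 0;
    foreach my $kid (@{ $node->[1] }) {
        if ($kid->[0] eq $name) {
            # first occurrence: put the new values at this position;
            # later occurrences of the old element are simply dropped
            push @kids, @new_kids unless $placed++;
        }
        else {
            push @kids, $kid;
        }
    }
    push @kids, @new_kids unless $placed;   # element did not exist: goes last
    $node->[1] = \@kids;
    return $node;
}

my $person = [ person => [ [ name => 'shuggy' ], [ job => 'bus driver' ] ] ];

set_sketch($person, 'name');             # no values: removes <name>
set_sketch($person, 'name', 'shuggy');   # re-added, now after <job>
my @order = map { $_->[0] } @{ $person->[1] };
print "@order\n";
```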
1044
1045 You can also use magic methods, for example
1046
1047 $person->set_name('shuggy');
1048 $person->set_job('bus driver', 'poet');
1049 print $person->itext;
1050
1051 will print
1052
1053 person:
1054 name: shuggy
1055 job: bus driver
1056 job: poet
1057
1058 note that if the datavalue is a non-terminal node as opposed to a
1059 primitive value, then you have to do it like this:
1060
1061 $people = Data::Stag->new(people=>[
1062 [person=>[[name=>'Sherlock Holmes']]],
1063 [person=>[[name=>'Moriarty']]],
1064 ]);
1065 $address = Data::Stag->new(address=>[
1066 [address_line=>"221B Baker Street"],
1067 [city=>"London"],
1068 [country=>"Great Britain"]]);
1069 ($person) = $people->qmatch('person', (name => "Sherlock Holmes"));
1070 $person->set("address", $address->data);
1071
1072 If you are using XML data, you can set attributes like this:
1073
1074 $person->set('@'=>[[id=>$id],[foo=>$foo]]);
1075
1076 unset (u)
1077
1078 Title: unset
1079 Synonym: u
1080
1081 Args: element str, datavalue ANY
1082 Return: ANY
1083 Example: $person->unset('name');
1084 Example: $person->unset('phone_no');
1085
1086 prunes all nodes of the specified element from the current node
1087
1088 You can use magic methods, like this
1089
1090 $person->unset_name;
1091 $person->unset_phone_no;
1092
1093 free
1094
Title: free

Args:
Return:
Example: $person->free;
1101
1102 removes all data from a node. If that node is a subnode of another
1103 node, it is removed altogether
1104
1105 for instance, if we had the data below:
1106
1107 <person>
1108 <name>fred</name>
1109 <address>
1110 ..
1111 </address>
1112 </person>
1113
1114 and called
1115
1116 $person->get_address->free
1117
1118 then the person node would look like this:
1119
1120 <person>
1121 <name>fred</name>
1122 </person>
1123
1124 add (a)
1125
1126 Title: add
1127 Synonym: a
1128
1129 Args: element str, datavalues ANY[]
1130 OR
1131 Data::Stag
1132 Return: ANY
1133 Example: $person->add('phone_no', $cellphone, $homephone);
1134 Example: $person->add_phone_no('1-555-555-5555');
1135 Example: $dataset->add($person)
1136
1137 adds a datavalue or list of datavalues. appends if already existing,
1138 creates new element value pairs if not already existing.
1139
1140 if the argument is a stag node, it will add this node under the current
1141 one.
1142
1143 For example, if we have the following node in $dataset
1144
1145 <dataset>
1146 <person>
1147 <name>jim</name>
1148 </person>
1149 </dataset>
1150
1151 And then we add data to it:
1152
1153 ($person) = $dataset->qmatch('person', name=>'jim');
1154 $person->add('phone_no', '555-1111', '555-2222');
1155
1156 We will be left with:
1157
1158 <dataset>
1159 <person>
1160 <name>jim</name>
1161 <phone_no>555-1111</phone_no>
1162 <phone_no>555-2222</phone_no>
1163 </person>
1164 </dataset>
1165
1166 The above call is equivalent to:
1167
1168 $person->add_phone_no('555-1111', '555-2222');
1169
1170 As well as adding data values, we can add whole nodes:
1171
1172 $dataset->add(person=>[[name=>"fred"],
1173 [phone_no=>"555-3333"]]);
1174
1175 Which is equivalent to
1176
1177 $dataset->add_person([[name=>"fred"],
1178 [phone_no=>"555-3333"]]);
1179
Remember, the value has to be specified as an array reference of nodes.
In general, you should use the addkid() method to add nodes and use
add() to add values.
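
The add() and addkid() calls described above can be combined into one short script. This is a minimal sketch, assuming Data::Stag is installed; it builds the tree with Data::Stag->new from nested arrays rather than parsing a file, and the names and phone numbers are illustrative only.

```perl
use strict;
use warnings;
use Data::Stag;

# Build <dataset><person><name>jim</name></person></dataset> from nested arrays.
my $dataset = Data::Stag->new(dataset => [
    [person => [[name => 'jim']]],
]);

# add() with plain values: appends two phone_no elements to jim.
my ($jim) = $dataset->qmatch('person', name => 'jim');
$jim->add('phone_no', '555-1111', '555-2222');
my @phones = $jim->get('phone_no');

# addkid() with a whole node: appends a second person to the dataset.
$dataset->addkid([person => [[name => 'fred'], [phone_no => '555-3333']]]);
my @people = $dataset->get('person');

printf "jim has %d phone numbers; dataset has %d persons\n",
    scalar(@phones), scalar(@people);
```

Passing values to add() and whole nodes to addkid() follows the recommendation above and keeps the two cases easy to tell apart when reading the code.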
1183
1184 element (e name)
1185
1186 Title: element
1187 Synonym: e
1188 Synonym: name
1189
1190 Args:
1191 Return: element str
1192 Example: $element = $struct->element
1193
1194 returns the element name of the current node.
1195
1196 This is illustrated in the different representation formats below
1197
1198 sxpr
1199 (element "data")
1200
1201 or
1202
1203 (element
1204 (sub_element "..."))
1205
1206 xml
1207 <element>data</element>
1208
1209 or
1210
1211 <element>
1212 <sub_element>...</sub_element>
1213 </element>
1214
1215 perl
1216 [element => $data ]
1217
1218 or
1219
1220 [element => [
1221 [sub_element => "..." ]]]
1222
1223 itext
1224 element: data
1225
1226 or
1227
1228 element:
1229 sub_element: ...
1230
1231 indent
1232 element "data"
1233
1234 or
1235
1236 element
1237 sub_element "..."
1238
1239 kids (k children)
1240
1241 Title: kids
1242 Synonym: k
1243 Synonym: children
1244
1245 Args:
1246 Return: ANY or ANY[]
1247 Example: @nodes = $person->kids
1248 Example: $name = $namestruct->kids
1249
Returns the data value(s) of the current node. If it is a terminal
node, returns a single value, which is the data; if it is non-terminal,
returns an array of nodes.
1253
1254 addkid (ak addchild)
1255
1256 Title: addkid
1257 Synonym: ak
1258 Synonym: addchild
1259
1260 Args: kid node
1261 Return: ANY
1262 Example: $person->addkid($job);
1263
1264 adds a new child node to a non-terminal node, after all the existing
1265 child nodes
1266
1267 You can use this method/procedure to add XML attribute data to a node:
1268
1269 $person->addkid(['@'=>[[id=>$id]]]);
1270
1271 subnodes
1272
1273 Title: subnodes
1274
1275 Args:
1276 Return: ANY[]
1277 Example: @nodes = $person->subnodes
1278
1279 returns the child nodes; returns empty list if this is a terminal node
1280
1281 ntnodes
1282
1283 Title: ntnodes
1284
1285 Args:
1286 Return: ANY[]
1287 Example: @nodes = $person->ntnodes
1288
1289 returns all non-terminal children of current node
1290
1291 tnodes
1292
1293 Title: tnodes
1294
1295 Args:
1296 Return: ANY[]
1297 Example: @nodes = $person->tnodes
1298
1299 returns all terminal children of current node
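
The accessors element, kids, tnodes and ntnodes can be compared side by side on one small tree. This is a sketch only, assuming Data::Stag is installed; the person record is made up for illustration.

```perl
use strict;
use warnings;
use Data::Stag;

# Two terminal children (name, phone_no) and one non-terminal child (address).
my $person = Data::Stag->new(person => [
    [name     => 'jim'],
    [phone_no => '555-1111'],
    [address  => [[city => 'Springfield']]],
]);

my $element  = $person->element;    # name of the root element
my @kids     = $person->kids;       # all child nodes
my @terminal = $person->tnodes;     # children that hold plain data
my @nested   = $person->ntnodes;    # children that hold further nodes

printf "%s: %d kids, %d terminal, %d non-terminal\n",
    $element, scalar(@kids), scalar(@terminal), scalar(@nested);
```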
1300
1301 QUERYING AND ADVANCED DATA MANIPULATION
ijoin (j ij)
1303
1304 Title: ijoin
1305 Synonym: j
1306 Synonym: ij
1307
1308 Args: element str, key str, data Node
1309 Return: undef
1310
Does a relational-style inner join - see previous example in this doc.

key can either be a single node name that must be shared (analogous to
SQL INNER JOIN ... USING), or a key1=key2 equivalence relation
(analogous to SQL INNER JOIN ... ON).
1316
1317 qmatch (qm)
1318
1319 Title: qmatch
1320 Synonym: qm
1321
1322 Args: return-element str, match-element str, match-value str
1323 Return: node[]
1324 Example: @persons = $s->qmatch('person', 'name', 'fred');
1325 Example: @persons = $s->qmatch('person', (job=>'bus driver'));
1326
Queries the node tree for all elements that satisfy the specified
key=val match - see previous example in this doc.

For those inclined to think relationally, this can be thought of as
a query that returns a stag object:
1332
1333 SELECT <return-element> FROM <stag-node> WHERE <match-element> = <match-value>
1334
1335 this always returns an array; this means that calling in a scalar
1336 context will return the number of elements; for example
1337
1338 $n = $s->qmatch('person', (name=>'fred'));
1339
1340 the value of $n will be equal to the number of persons called fred
1341
1342 tmatch (tm)
1343
1344 Title: tmatch
1345 Synonym: tm
1346
1347 Args: element str, value str
1348 Return: bool
1349 Example: @persons = grep {$_->tmatch('name', 'fred')} @persons
1350
Returns true if the value of the specified element matches - see
previous example in this doc.
1353
1354 tmatchhash (tmh)
1355
1356 Title: tmatchhash
1357 Synonym: tmh
1358
1359 Args: match hashref
1360 Return: bool
1361 Example: @persons = grep {$_->tmatchhash({name=>'fred', hair_colour=>'green'})} @persons
1362
Returns true if the node matches a set of constraints, specified as
a hash.
1365
1366 tmatchnode (tmn)
1367
1368 Title: tmatchnode
1369 Synonym: tmn
1370
1371 Args: match node
1372 Return: bool
1373 Example: @persons = grep {$_->tmatchnode([person=>[[name=>'fred'], [hair_colour=>'green']]])} @persons
1374
Returns true if the node matches a set of constraints, specified as
a node.
1377
1378 cmatch (cm)
1379
1380 Title: cmatch
1381 Synonym: cm
1382
1383 Args: element str, value str
Return: int
1385 Example: $n_freds = $personset->cmatch('name', 'fred');
1386
1387 counts the number of matches
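
The difference between qmatch, tmatch and cmatch is easiest to see on one dataset. This is a hedged sketch, assuming Data::Stag is installed; the personset and its contents are invented for the example.

```perl
use strict;
use warnings;
use Data::Stag;

my $personset = Data::Stag->new(personset => [
    [person => [[name => 'fred'], [job => 'bus driver']]],
    [person => [[name => 'fred'], [job => 'clerk']]],
    [person => [[name => 'jim'],  [job => 'bus driver']]],
]);

# qmatch: search the tree and return the matching person nodes.
my @freds = $personset->qmatch('person', name => 'fred');

# tmatch: test one node at a time, here used as a grep filter.
my @drivers = grep { $_->tmatch('job', 'bus driver') }
              $personset->get('person');

# cmatch: count the matches without returning the nodes.
my $n_freds = $personset->cmatch('name', 'fred');

printf "%d freds, %d bus drivers, cmatch count %d\n",
    scalar(@freds), scalar(@drivers), $n_freds;
```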
1388
1389 where (w)
1390
1391 Title: where
1392 Synonym: w
1393
1394 Args: element str, test CODE
1395 Return: Node[]
1396 Example: @rich_persons = $data->where('person', sub {shift->get_salary > 100000});
1397
1398 the tree is queried for all elements of the specified type that satisfy
1399 the coderef (must return a boolean)
1400
1401 my @rich_dog_or_cat_owners =
1402 $data->where('person',
1403 sub {my $p = shift;
1404 $p->get_salary > 100000 &&
1405 $p->where('pet',
1406 sub {shift->get_type =~ /(dog|cat)/})});
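
The nested query above can be tried on a small in-memory dataset. This is a sketch under stated assumptions: Data::Stag is installed, the salary and pet data are invented, and sget('salary') is used in place of the magic get_salary method. The inner where is called in list context so that the test does not depend on how the result behaves as a boolean.

```perl
use strict;
use warnings;
use Data::Stag;

my $data = Data::Stag->new(data => [
    [person => [[name => 'ann'], [salary => 150000],
                [pet => [[type => 'dog']]]]],
    [person => [[name => 'bob'], [salary => 120000],
                [pet => [[type => 'fish']]]]],
    [person => [[name => 'cy'],  [salary => 50000],
                [pet => [[type => 'cat']]]]],
]);

# Simple test: one condition on each person node.
my @rich = $data->where('person', sub { shift->sget('salary') > 100000 });

# Nested test: a rich person who also has at least one dog or cat.
my @rich_dog_or_cat_owners = $data->where('person', sub {
    my $p    = shift;
    my @pets = $p->where('pet',
                         sub { shift->sget('type') =~ /^(?:dog|cat)$/ });
    return $p->sget('salary') > 100000 && @pets > 0;
});

print join(', ', map { $_->sget('name') } @rich_dog_or_cat_owners), "\n";
```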
1407
1408 iterate (i)
1409
1410 Title: iterate
1411 Synonym: i
1412
1413 Args: CODE
1414 Return: Node[]
1415 Example: $data->iterate(sub {
1416 my $stag = shift;
1417 my $parent = shift;
1418 if ($stag->element eq 'pet') {
1419 $parent->set_pet_name($stag->get_name);
1420 }
1421 });
1422
1423 iterates through whole tree calling the specified subroutine.
1424
1425 the first arg passed to the subroutine is the stag node representing
1426 the tree at that point; the second arg is for the parent.
1427
1428 for instance, the example code above would turn this
1429
1430 (person
1431 (name "jim")
1432 (pet
1433 (name "fluffy")))
1434
1435 into this
1436
1437 (person
1438 (name "jim")
1439 (pet_name "fluffy")
1440 (pet
1441 (name "fluffy")))
1442
1443 maptree
1444
1445 Title: maptree
1446
1447 Args: CODE
1448 Return: Node[]
1449 Example: $data->maptree(sub {
1450 my $stag = shift;
1451 my $parent = shift;
1452 if ($stag->element eq 'pet') {
1453 [pet=>$stag->sget_foo]
1454 }
1455 else {
1456 $stag
1457 }
1458 });
1459
1460 MISCELLANEOUS METHODS
1461 duplicate (d)
1462
1463 Title: duplicate
1464 Synonym: d
1465
1466 Args:
1467 Return: Node
1468 Example: $node2 = $node->duplicate;
1469
1470 does a deep copy of a stag structure
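
Because the copy is deep, changing the duplicate should leave the original untouched. A minimal sketch, assuming Data::Stag is installed and using illustrative values:

```perl
use strict;
use warnings;
use Data::Stag;

my $node = Data::Stag->new(person => [[name => 'jim']]);

# duplicate() returns an independent copy of the whole tree.
my $copy = $node->duplicate;
$copy->set('name', 'fred');

printf "original: %s, copy: %s\n",
    $node->sget('name'), $copy->sget('name');
```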
1471
1472 isanode
1473
1474 Title: isanode
1475
1476 Args:
1477 Return: bool
1478 Example: if (stag_isanode($node)) { ... }
1479
1480 hash
1481
1482 Title: hash
1483
1484 Args:
1485 Return: hash
1486 Example: $h = $node->hash;
1487
1488 turns a tree into a hash. all data values will be arrayrefs
1489
1490 pairs
1491
1492 Title: pairs
1493
1494 turns a tree into a hash. all data values will be scalar (IMPORTANT:
1495 this means duplicate values will be lost)
1496
1497 write
1498
1499 Title: write
1500
1501 Args: filename str, format str[optional]
1502 Return:
1503 Example: $node->write("myfile.xml");
1504 Example: $node->write("myfile", "itext");
1505
If the format is not specified, write will try to guess it from the
filename extension.
1507
1508 xml
1509
1510 Title: xml
1511
1518 Args:
1519 Return: xml str
1520 Example: print $node->xml;
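
The xml and write methods can be used together: one returns the serialized tree as a string, the other sends it to a file. This is a sketch only, assuming Data::Stag is installed; the temporary directory and the file name person.xml are hypothetical choices made for the example.

```perl
use strict;
use warnings;
use Data::Stag;
use File::Temp qw(tempdir);

my $node = Data::Stag->new(person => [
    [name     => 'jim'],
    [phone_no => '555-1111'],
]);

# xml() returns the tree serialized as an XML string.
my $xml = $node->xml;
print $xml;

# write() sends the same tree to a file; the format is given explicitly
# here instead of relying on the filename extension.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/person.xml";
$node->write($file, 'xml');
```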
1521
1522 XML METHODS
1523 xslt
1524
1525 Title: xslt
1526
1527 Args: xslt_file str
1528 Return: Node
1529 Example: $new_stag = $stag->xslt('mytransform.xsl');
1530
1531 transforms a stag tree using XSLT
1532
1533 xsltstr
1534
1535 Title: xsltstr
1536
1537 Args: xslt_file str
1538 Return: str
1539 Example: print $stag->xsltstr('mytransform.xsl');
1540
1541 As above, but returns the string of the resulting transform, rather
1542 than a stag tree
1543
1544 sax
1545
1546 Title: sax
1547
1548 Args: saxhandler SAX-CLASS
1549 Return:
1550 Example: $node->sax($mysaxhandler);
1551
1552 turns a tree into a series of SAX events
1553
1554 xpath (xp tree2xpath)
1555
1556 Title: xpath
1557 Synonym: xp
1558 Synonym: tree2xpath
1559
1560 Args:
1561 Return: xpath object
1562 Example: $xp = $node->xpath; $q = $xp->find($xpathquerystr);
1563
1564 xpquery (xpq xpathquery)
1565
1566 Title: xpquery
1567 Synonym: xpq
1568 Synonym: xpathquery
1569
1570 Args: xpathquery str
1571 Return: Node[]
Example: @nodes = $node->xpq($xpathquerystr);
1573
1575 The following scripts come with the stag module
1576
1577 stag-autoschema.pl
1578 writes the implicit stag-schema for a stag file
1579
1580 stag-db.pl
1581 persistent storage and retrieval for stag data (xml, sxpr, itext)
1582
1583 stag-diff.pl
1584 finds the difference between two stag files
1585
1586 stag-drawtree.pl
1587 draws a stag file (xml, itext, sxpr) as a PNG diagram
1588
1589 stag-filter.pl
1590 filters a stag file (xml, itext, sxpr) for nodes of interest
1591
1592 stag-findsubtree.pl
1593 finds nodes in a stag file
1594
1595 stag-flatten.pl
1596 turns stag data into a flat table
1597
1598 stag-grep.pl
1599 filters a stag file (xml, itext, sxpr) for nodes of interest
1600
1601 stag-handle.pl
1602 streams a stag file through a handler into a writer
1603
1604 stag-join.pl
1605 joins two stag files together based around common key
1606
1607 stag-mogrify.pl
1608 mangle stag files
1609
1610 stag-parse.pl
1611 parses a file and fires events (e.g. sxpr to xml)
1612
1613 stag-query.pl
aggregate queries
1615
1616 stag-split.pl
1617 splits a stag file (xml, itext, sxpr) into multiple files
1618
1619 stag-splitter.pl
1620 splits a stag file into multiple files
1621
1622 stag-view.pl
1623 draws an expandable Tk tree diagram showing stag data
1624
To get more documentation on a script, type

stag-<script>.pl -h
1628
1630 none known so far, possibly quite a few undocumented features!
1631
1632 Not a bug, but the underlying default datastructure of nested arrays is
1633 more heavyweight than it needs to be. More lightweight implementations
1634 are possible. Some time I will write a C implementation.
1635
1637 <http://stag.sourceforge.net>
1638
1640 Chris Mungall <cjm AT fruitfly DOT org>
1641
1643 Copyright (c) 2004 Chris Mungall
1644
1645 This module is free software. You may distribute this module under the
1646 same terms as perl itself
1647
1648
1649
1650perl v5.34.0 2021-07-22 Data::Stag(3)