Data::Stag(3)         User Contributed Perl Documentation        Data::Stag(3)

NAME

         Data::Stag - Structured Tags datastructures

SYNOPSIS

         # PROCEDURAL USAGE
         use Data::Stag qw(:all);
         $doc = stag_parse($file);
         @persons = stag_find($doc, "person");
         foreach $p (@persons) {
           printf "%s, %s phone: %s\n",
             stag_sget($p, "family_name"),
             stag_sget($p, "given_name"),
             stag_sget($p, "phone_no"),
           ;
         }

         # OBJECT-ORIENTED USAGE
         use Data::Stag;
         $doc = Data::Stag->parse($file);
         @persons = $doc->find("person");
         foreach $p (@persons) {
           printf "%s, %s phone: %s\n",
             $p->sget("family_name"),
             $p->sget("given_name"),
             $p->sget("phone_no"),
           ;
         }

DESCRIPTION

       This module is for manipulating data as hierarchical tag/value pairs
       (Structured TAGs or Simple Tree AGgregates). These datastructures can
       be represented as nested arrays, which have the advantage of being
       native to perl. A simple example is shown below:

         [ person=> [  [ family_name => $family_name ],
                       [ given_name  => $given_name  ],
                       [ phone_no    => $phone_no    ] ] ],

       Data::Stag uses a subset of XML for import and export. This means the
       module can also be used as a general XML parser/writer (with certain
       caveats).

       The above set of structured tags can be represented in XML as

         <person>
           <family_name>...</family_name>
           <given_name>...</given_name>
           <phone_no>...</phone_no>
         </person>

       This datastructure can be examined, manipulated and exported using Stag
       functions or methods:

         $document = Data::Stag->parse($file);
         @persons = $document->find('person');
         foreach my $person (@persons) {
           $person->set('full_name',
                        $person->sget('given_name') . ' ' .
                        $person->sget('family_name'));
         }

       Advanced querying is performed by passing functions, for example:

         # get all people in dataset with name starting 'A'
         @persons =
           $document->where('person',
                            sub {shift->sget('family_name') =~ /^A/});
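
       The idea of querying with a predicate can be sketched in plain Perl
       on the nested-array form. The helpers find_nodes, first_value and
       where_nodes below are hypothetical names invented for this
       illustration; they are not part of Data::Stag and make no claim to
       match its implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every node named $name, searching recursively.
sub find_nodes {
    my ($node, $name) = @_;
    my ($tag, $data) = @$node;
    my @found = $tag eq $name ? ($node) : ();
    if (ref($data) eq 'ARRAY') {
        push @found, find_nodes($_, $name) for @$data;
    }
    return @found;
}

# Hypothetical helper: data value of the first direct child called $name.
sub first_value {
    my ($node, $name) = @_;
    return undef unless ref($node->[1]) eq 'ARRAY';
    for my $kid (@{ $node->[1] }) {
        return $kid->[1] if $kid->[0] eq $name;
    }
    return undef;
}

# Hypothetical helper: keep only the nodes for which the predicate is true.
sub where_nodes {
    my ($tree, $name, $test) = @_;
    return grep { $test->($_) } find_nodes($tree, $name);
}

my $doc = [ personset => [
    [ person => [ [ family_name => 'Adams' ], [ given_name => 'Ann' ] ] ],
    [ person => [ [ family_name => 'Brown' ], [ given_name => 'Bob' ] ] ],
] ];

# Keep the people whose family name starts with 'A'.
my @a_people = where_nodes($doc, 'person',
                           sub { first_value(shift, 'family_name') =~ /^A/ });
print first_value($_, 'given_name'), "\n" for @a_people;
```

       The predicate receives one node at a time, so any Perl expression
       can serve as the query condition.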

       One of the things that marks this module out against other XML modules
       is its emphasis on a functional approach, alongside the object-oriented
       and procedural approaches.

       For full information on the stag project, see
       <http://stag.sourceforge.net>

   PROCEDURAL VS OBJECT-ORIENTED USAGE
       Depending on your preference, this module can be used as a set of
       procedural subroutine calls, or as method calls upon Data::Stag
       objects, or both.

       In procedural mode, all the subroutine calls are prefixed "stag_" to
       avoid namespace clashes. The following three calls are equivalent:

         $person = stag_find($doc, "person");
         $person = $doc->find("person");
         $person = $doc->find_person;

       In object mode, you can treat any tree element as if it is an object
       with automatically defined methods for getting/setting the tag values.

   USE OF XML
       Nested arrays can be imported and exported as XML, as well as other
       formats. XML can be slurped into memory all at once (using less memory
       than an equivalent DOM tree), or a simplified SAX style event handling
       model can be used. Similarly, data can be exported all at once, or as a
       series of events.

       Although this module can be used as a general XML tool, it is intended
       primarily as a tool for manipulating hierarchical data using nested
       tag/value pairs.

       This module is more suited to dealing with data-oriented documents than
       text-oriented documents.

       By using a simpler subset of XML equivalent to a basic data tree
       structure, we can write simpler, cleaner code.

       This module is ideally suited to element-only XML (that is, XML without
       attributes or mixed elements).

       If you are using attributes or mixed elements, it is useful to know
       what is going on under the hood.

       All attributes are turned into elements; they are nested inside an
       element with name '@'.

       For example, the following piece of XML

         <foo id="x">
           <bar>ugh</bar>
         </foo>

       gets represented internally as

         <foo>
           <@>
             <id>x</id>
           </@>
           <bar>ugh</bar>
         </foo>

       Of course, this is not valid XML. However, it is just an internal
       representation - when exporting back to XML it will look like normal
       XML with attributes again.

       Mixed content cannot be represented in a simple tree format, so this is
       also expanded.

       The following piece of XML

         <paragraph id="1" color="green">
           example of <bold>mixed</bold>content
         </paragraph>

       gets parsed as if it were actually:

         <paragraph>
           <@>
             <id>1</id>
             <color>green</color>
           </@>
           <.>example of</.>
           <bold>mixed</bold>
           <.>content</.>
         </paragraph>

       When using stag with XML that has attributes or mixed content, you can
       treat '@' and '.' as normal elements.
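
       As a rough illustration, the expanded paragraph can be written as a
       nested array and walked in plain Perl. The array layout below is
       inferred from the expanded XML shown above; it is an assumption, not
       output captured from Data::Stag.

```perl
use strict;
use warnings;

# The expanded <paragraph> example written as a nested array (assumed layout).
my $paragraph = [ paragraph => [
    [ '@'  => [ [ id => '1' ], [ color => 'green' ] ] ],
    [ '.'  => 'example of' ],
    [ bold => 'mixed' ],
    [ '.'  => 'content' ],
] ];

# '@' and '.' are visited like any other child element.
my (%attr, @text);
for my $child (@{ $paragraph->[1] }) {
    my ($name, $data) = @$child;
    if ($name eq '@') {
        # each grandchild of '@' is one former attribute
        $attr{ $_->[0] } = $_->[1] for @$data;
    }
    elsif ($name eq '.') {
        # each '.' child is one run of text from the mixed content
        push @text, $data;
    }
}

print "id=$attr{id} color=$attr{color}\n";
print join(' ', @text), "\n";
```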

       SAX

       This module can also be used as part of a SAX-style event generation /
       handling framework - see Data::Stag::BaseHandler.

       PERL REPRESENTATION

       Because nested arrays are native to perl, we can specify an XML
       datastructure directly in perl without going through multiple object
       calls.

       For example, instead of using XML::Writer for the lengthy

         $obj->startTag("record");
         $obj->startTag("field1");
         $obj->characters("foo");
         $obj->endTag("field1");
         $obj->startTag("field2");
         $obj->characters("bar");
         $obj->endTag("field2");
         $obj->endTag("record");

       we can instead write

         $struct = [ record => [
                     [ field1 => 'foo'],
                     [ field2 => 'bar']]];
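
       To show how little code the nested-array form needs, here is a
       minimal sketch of a recursive writer in plain Perl. The to_xml helper
       is hypothetical and is not the writer shipped with Data::Stag; it
       escapes only the characters &, < and >.

```perl
use strict;
use warnings;

# Hypothetical helper: render a nested-array node as indented XML.
sub to_xml {
    my ($node, $depth) = @_;
    $depth ||= 0;
    my ($name, $data) = @$node;
    my $pad = '  ' x $depth;
    if (ref($data) eq 'ARRAY') {
        # non-terminal node: render each child one level deeper
        my $inner = join '', map { to_xml($_, $depth + 1) } @$data;
        return "$pad<$name>\n$inner$pad</$name>\n";
    }
    # terminal node: escape the data value and emit it inline
    my $text = defined $data ? $data : '';
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return "$pad<$name>$text</$name>\n";
}

my $struct = [ record => [
                [ field1 => 'foo' ],
                [ field2 => 'bar' ] ] ];
print to_xml($struct);
```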

       PARSING

       The following example is for parsing out subsections of a tree and
       changing sub-elements:

         use Data::Stag qw(:all);
         my $tree = stag_parse($xmlfile);
         my ($subtree) = stag_findnode($tree, $element);
         stag_set($subtree, $sub_element, $new_val);
         print stag_xml($subtree);

       OBJECT ORIENTED

       The same can be done in a more OO fashion:

         use Data::Stag qw(:all);
         my $tree = Data::Stag->parse($xmlfile);
         my ($subtree) = $tree->findnode($element);
         $subtree->set($sub_element, $new_val);
         print $subtree->xml;

       IN A STREAM

       Rather than parsing a whole file into memory all at once (which may
       not be suitable for very large files), you can take an event handling
       approach. The easiest way to do this is to register which nodes in the
       file you are interested in using the makehandler method. The parser
       will sweep through the file, building objects as it goes, and handing
       each object to a subroutine that you specify.

       For example:

         use Data::Stag;
         # catch the end of 'person' elements
         my $h = Data::Stag->makehandler(
             person => sub {
                 my ($self, $person) = @_;
                 printf "name:%s phone:%s\n",
                   $person->get_name,
                   $person->get_phone;
                 return;   # clear node
             });
         Data::Stag->parse(-handler=>$h,
                           -file=>$f);

       See Data::Stag::BaseHandler for writing handlers.

       See the Stag website at <http://stag.sourceforge.net> for more
       examples.

   STRUCTURED TAGS TREE DATA STRUCTURE
       A tree of structured tags is represented as a recursively nested array;
       the elements of the array represent nodes in the tree.

       A node is a name/data pair that can represent tags and values.  A node
       is represented using a reference to an array, where the first element
       of the array is the tagname, or element, and the second element is the
       data.

       This can be visualised as a box:

         +-----------+
         |Name | Data|
         +-----------+

       In perl, we represent this pair as a reference to an array

         [ Name => $Data ]

       The Data can either be a list of child nodes (subtrees), or a data
       value.

       The terminal nodes (leaves of the tree) contain data values; this is
       represented in perl using primitive scalars.

       For example:

         [ Name => 'Fred' ]

       For non-terminal nodes, the Data is a reference to an array, where each
       element of the array is a new node.

         +-----------+
         |Name | Data|
         +-----------+
                 |||   +-----------+
                 ||+-->|Name | Data|
                 ||    +-----------+
                 ||
                 ||    +-----------+
                 |+--->|Name | Data|
                 |     +-----------+
                 |
                 |     +-----------+
                 +---->|Name | Data|
                       +-----------+

       In perl this would be:

         [ Name => [
                     [Name1 => $Data1],
                     [Name2 => $Data2],
                     [Name3 => $Data3],
                   ]
         ];

       The extra level of nesting is required to be able to store any node in
       the tree using a single variable. This representation has lots of
       advantages over others, eg hashes and mixed hash/array structures.
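
       The distinction between terminal and non-terminal nodes can be
       illustrated with a short plain-Perl walk over such a structure. The
       helpers is_terminal and leaf_paths are hypothetical names used only
       for this sketch.

```perl
use strict;
use warnings;

# A node is terminal when its data slot holds a plain scalar rather than
# a reference to an array of child nodes.
sub is_terminal {
    my ($node) = @_;
    return ref($node->[1]) ne 'ARRAY';
}

# Collect one "path = value" string for every terminal node under $node.
sub leaf_paths {
    my ($node, $prefix) = @_;
    my ($name, $data) = @$node;
    my $path = defined $prefix ? "$prefix/$name" : $name;
    return ("$path = $data") if is_terminal($node);
    return map { leaf_paths($_, $path) } @$data;
}

my $tree = [ person => [
    [ name    => 'Fred' ],
    [ address => [ [ street => '1 Main St' ], [ city => 'Springfield' ] ] ],
] ];

print "$_\n" for leaf_paths($tree);
```

       Because every node has the same two-slot shape, a single recursive
       subroutine is enough to visit the whole tree.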

   MANIPULATION AND QUERYING
       The following example is taken from biology; we have a list of species
       (mouse, human, fly) and a list of genes found in those species. These
       are cross-referenced by an identifier called tax_id. We can do a
       relational-style inner join on this identifier, as follows -

         use Data::Stag qw(:all);
         my $tree =
         Data::Stag->new(
           'db' => [
           [ 'species_set' => [
             [ 'species' => [
               [ 'common_name' => 'house mouse' ],
               [ 'binomial' => 'Mus musculus' ],
               [ 'tax_id' => '10090' ]]],
             [ 'species' => [
               [ 'common_name' => 'fruit fly' ],
               [ 'binomial' => 'Drosophila melanogaster' ],
               [ 'tax_id' => '7227' ]]],
             [ 'species' => [
               [ 'common_name' => 'human' ],
               [ 'binomial' => 'Homo sapiens' ],
               [ 'tax_id' => '9606' ]]]]],
           [ 'gene_set' => [
             [ 'gene' => [
               [ 'symbol' => 'HGNC' ],
               [ 'tax_id' => '9606' ],
               [ 'phenotype' => 'Hemochromatosis' ],
               [ 'phenotype' => 'Porphyria variegata' ],
               [ 'GO_term' => 'iron homeostasis' ],
               [ 'map' => '6p21.3' ]]],
             [ 'gene' => [
               [ 'symbol' => 'Hfe' ],
               [ 'synonym' => 'MR2' ],
               [ 'tax_id' => '10090' ],
               [ 'GO_term' => 'integral membrane protein' ],
               [ 'map' => '13 A2-A4' ]]]]]]
          );

         # inner join of species and gene parts of tree,
         # based on 'tax_id' element
         my $gene_set = $tree->find("gene_set");       # get <gene_set> element
         my $species_set = $tree->find("species_set"); # get <species_set> element
         $gene_set->ijoin("gene", "tax_id", $species_set);   # INNER JOIN

         print "Reorganised data:\n";
         print $gene_set->xml;

         # find all genes starting with letter 'H' where species/common_name=human
         my @genes =
           $gene_set->where('gene',
                            sub { my $g = shift;
                                  $g->get_symbol =~ /^H/ &&
                                  $g->findval("common_name") eq ('human')});

         print "Human genes beginning 'H'\n";
         print $_->xml foreach @genes;

   S-Expression (Lisp) representation
       The data represented using this module can be represented as Lisp-style
       S-Expressions.

       See Data::Stag::SxprParser and Data::Stag::SxprWriter

       If we execute this code on the XML from the example above

         $stag = Data::Stag->parse($xmlfile);
         print $stag->sxpr;

       The following S-Expression will be printed:

         '(db
           (species_set
             (species
               (common_name "house mouse")
               (binomial "Mus musculus")
               (tax_id "10090"))
             (species
               (common_name "fruit fly")
               (binomial "Drosophila melanogaster")
               (tax_id "7227"))
             (species
               (common_name "human")
               (binomial "Homo sapiens")
               (tax_id "9606")))
           (gene_set
             (gene
               (symbol "HGNC")
               (tax_id "9606")
               (phenotype "Hemochromatosis")
               (phenotype "Porphyria variegata")
               (GO_term "iron homeostasis")
               (map
                 (cytological
                   (chromosome "6")
                   (band "p21.3"))))
             (gene
               (symbol "Hfe")
               (synonym "MR2")
               (tax_id "10090")
               (GO_term "integral membrane protein")))
           (similarity_set
             (pair
               (symbol "HGNC")
               (symbol "Hfe"))
             (pair
               (symbol "WNT3A")
               (symbol "Wnt3a"))))
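
       The mapping from nested arrays to S-Expressions is direct, as the
       following plain-Perl sketch suggests. The to_sxpr helper is a
       hypothetical illustration, not Data::Stag::SxprWriter; it always
       quotes terminal values and omits the leading quote character.

```perl
use strict;
use warnings;

# Hypothetical helper: render a nested-array node as an indented
# S-Expression, quoting terminal values.
sub to_sxpr {
    my ($node, $depth) = @_;
    $depth ||= 0;
    my ($name, $data) = @$node;
    my $pad = '  ' x $depth;
    if (ref($data) eq 'ARRAY') {
        # each child goes on its own line, one level deeper
        my $kids = join '', map { "\n" . to_sxpr($_, $depth + 1) } @$data;
        return "$pad($name$kids)";
    }
    (my $text = $data) =~ s/(["\\])/\\$1/g;   # escape quotes and backslashes
    return qq{$pad($name "$text")};
}

my $species = [ species => [
    [ common_name => 'house mouse' ],
    [ tax_id      => '10090' ],
] ];
print to_sxpr($species), "\n";
```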

       TIPS FOR EMACS USERS AND LISP PROGRAMMERS

       If you use emacs, you can save this as a file with the ".el" suffix and
       get syntax highlighting for editing this file. Quotes around the
       terminal node data items are optional.

       If you know emacs lisp or any other lisp, this also turns out to be a
       very nice language for manipulating these datastructures. Try copying
       and pasting the above s-expression to the emacs scratch buffer and
       playing with it in lisp.

   INDENTED TEXT REPRESENTATION
       Data::Stag has its own text format for writing data trees. Again, this
       is only possible because we are working with a subset of XML (no
       attributes, no mixed elements). The data structure above can be written
       as follows -

         db:
           species_set:
             species:
               common_name: house mouse
               binomial: Mus musculus
               tax_id: 10090
             species:
               common_name: fruit fly
               binomial: Drosophila melanogaster
               tax_id: 7227
             species:
               common_name: human
               binomial: Homo sapiens
               tax_id: 9606
           gene_set:
             gene:
               symbol: HGNC
               tax_id: 9606
               phenotype: Hemochromatosis
               phenotype: Porphyria variegata
               GO_term: iron homeostasis
               map: 6p21.3
             gene:
               symbol: Hfe
               synonym: MR2
               tax_id: 10090
               GO_term: integral membrane protein
               map: 13 A2-A4
           similarity_set:
             pair:
               symbol: HGNC
               symbol: Hfe
             pair:
               symbol: WNT3A
               symbol: Wnt3a

       See Data::Stag::ITextParser and Data::Stag::ITextWriter

   NESTED ARRAY SPECIFICATION II
       To avoid excessive square bracket usage, you can specify a structure
       like this:

         use Data::Stag qw(:all);

         *N = \&stag_new;
         my $tree =
           N(top=>[
                   N('personset'=>[
                                   N('person'=>[
                                                N('name'=>'davey'),
                                                N('address'=>'here'),
                                                N('description'=>[
                                                                  N('hair'=>'green'),
                                                                  N('eyes'=>'two'),
                                                                  N('teeth'=>5),
                                                                 ]
                                                 ),
                                                N('pets'=>[
                                                           N('petname'=>'igor'),
                                                           N('petname'=>'ginger'),
                                                          ]
                                                 ),

                                               ],
                                    ),
                                   N('person'=>[
                                                N('name'=>'shuggy'),
                                                N('address'=>'there'),
                                                N('description'=>[
                                                                  N('hair'=>'red'),
                                                                  N('eyes'=>'three'),
                                                                  N('teeth'=>1),
                                                                 ]
                                                 ),
                                                N('pets'=>[
                                                           N('petname'=>'thud'),
                                                           N('petname'=>'spud'),
                                                          ]
                                                 ),
                                               ]
                                    ),
                                  ]
                    ),
                   N('animalset'=>[
                                   N('animal'=>[
                                                N('name'=>'igor'),
                                                N('class'=>'rat'),
                                                N('description'=>[
                                                                  N('fur'=>'white'),
                                                                  N('eyes'=>'red'),
                                                                  N('teeth'=>50),
                                                                 ],
                                                 ),
                                               ],
                                    ),
                                  ]
                    ),

                  ]
            );

         # find all people
         my @persons = stag_find($tree, 'person');

         # write xml for all red haired people
         foreach my $p (@persons) {
           print stag_xml($p)
             if stag_tmatch($p, "hair", "red");
         } ;

         # find all people that have name == shuggy
         my @p =
           stag_qmatch($tree,
                       "person",
                       "name",
                       "shuggy");

NODES AS DATA OBJECTS

       As well as the methods listed below, a node can be treated as if it is
       a data object of a class determined by the element.

       For example, the following are equivalent.

         $node->get_name;
         $node->get('name');

         $node->set_name('fred');
         $node->set('name', 'fred');

       This is really just syntactic sugar. The autoloaded methods are not
       checked against any schema, although this may be added in future.
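
       The general technique behind such autoloaded accessors can be
       sketched with Perl's AUTOLOAD mechanism. The My::Node class below is
       hypothetical and much simpler than Data::Stag; it only shows how a
       call such as get_name can be routed to get('name').

```perl
use strict;
use warnings;

package My::Node;    # hypothetical class, not part of Data::Stag

our $AUTOLOAD;

sub new {
    my ($class, $name, $data) = @_;
    return bless [ $name, $data ], $class;
}

# Value of the first direct child called $tag.
sub get {
    my ($self, $tag) = @_;
    for my $kid (@{ $self->[1] }) {
        return $kid->[1] if $kid->[0] eq $tag;
    }
    return undef;
}

# Overwrite the first child called $tag, or append a new one.
sub set {
    my ($self, $tag, $value) = @_;
    for my $kid (@{ $self->[1] }) {
        if ($kid->[0] eq $tag) {
            $kid->[1] = $value;
            return $value;
        }
    }
    push @{ $self->[1] }, [ $tag => $value ];
    return $value;
}

# Route get_foo / set_foo to get('foo') / set('foo', ...).
sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    return if $method eq 'DESTROY';
    if ($method =~ /^(get|set)_(\w+)$/) {
        my ($op, $tag) = ($1, $2);
        return $self->$op($tag, @_);
    }
    die "Unknown method '$method'";
}

package main;

my $node = My::Node->new(person => [ [ name => 'joe' ] ]);
$node->set_name('fred');
print $node->get_name, "\n";
```

       No schema is consulted in this sketch either: any get_ or set_ call
       is accepted, whether or not a matching child exists.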

INDEXING STAG TREES

       A stag tree can be indexed as a hash for direct retrieval; see
       Data::Stag::HashDB

       This index can be made persistent as a DB file; see Data::Stag::StagDB

       If you wish to use Stag in conjunction with a relational database, you
       should install DBIx::DBStag

STAG METHODS

       All method calls are also available as procedural subroutine calls;
       unless otherwise noted, the subroutine call is the same as the method
       call, but with the string stag_ prefixed to the method name. The first
       argument should be a Data::Stag datastructure.

       To import all subroutines into the current namespace, use this idiom:

         use Data::Stag qw(:all);
         $doc = stag_parse($file);
         @persons = stag_find($doc, 'person');

       If you wish to use this module procedurally, and you are too lazy to
       prefix all calls with stag_, use this idiom:

         use Data::Stag qw(:lazy);
         $doc = parse($file);
         @persons = find($doc, 'person');

       But beware of clashes!

       Most method calls also have a handy short mnemonic. Use of these is
       optional. Software engineering types prefer longer names, in the belief
       that this leads to clearer code. Hacker types prefer shorter names, as
       this requires fewer keystrokes, and leads to a more compact
       representation of the code. It is expected that if you do use this
       module, then its usage will be fairly ubiquitous within your code, and
       the mnemonics will become familiar, much like the qw and s/ operators
       in perl. As always with perl, the decision is yours.

       Some methods take a single parameter or list of parameters; some have
       large lists of parameters that can be passed in any order. If the
       documentation states:

         Args: [x str], [y int], [z ANY]

       Then the method can be called like this:

         $stag->foo("this is x", 55, $ref);

       or like this:

         $stag->foo(-z=>$ref, -x=>"this is x", -y=>55);
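
       One common way to support both calling styles is to normalise the
       argument list before use. The sketch below is an assumption about
       the general technique, not the module's own code; rearrange_args and
       foo are hypothetical names. Note that a positional first argument
       that itself looks like "-name" would be misread by this simple test.

```perl
use strict;
use warnings;

# Hypothetical helper: map either calling style onto a fixed parameter order.
# If the first argument looks like "-name", the list is read as named pairs.
sub rearrange_args {
    my ($order, @args) = @_;
    return @args
        unless @args && defined $args[0] && !ref $args[0]
            && $args[0] =~ /^-[a-z]/i;
    my %named = @args;
    return map { $named{"-$_"} } @$order;
}

# Hypothetical subroutine documented as: Args: [x str], [y int], [z ANY]
sub foo {
    my ($x, $y, $z) = rearrange_args([qw(x y z)], @_);
    return "x=$x y=$y z=$z";
}

print foo("this is x", 55, "ref"), "\n";
print foo(-z => "ref", -x => "this is x", -y => 55), "\n";
```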

   INITIALIZATION METHODS
       new

              Title: new

               Args: element str, data STAG-DATA
            Returns: Data::Stag node
            Example: $node = stag_new();
            Example: $node = Data::Stag->new;
            Example: $node = Data::Stag->new(person => [[name=>$n], [phone=>$p]]);

       Creates a new instance of a Data::Stag node.

       stagify (nodify)

              Title: stagify
            Synonym: nodify
               Args: data ARRAY-REF
            Returns: Data::Stag node
            Example: $node = stag_stagify([person => [[name=>$n], [phone=>$p]]]);

       Turns a perl array reference into a Data::Stag node.

       Similar to new.

       parse

              Title: parse

               Args: [file str], [format str], [handler obj], [fh FileHandle]
            Returns: Data::Stag node
            Example: $node = stag_parse($fn);
            Example: $node = stag_parse(-fh=>$fh, -handler=>$h, -errhandler=>$eh);
            Example: $node = Data::Stag->parse(-file=>$fn, -handler=>$myhandler);

       Slurps a file or string into a Data::Stag node structure. The format
       (xml, sxpr, itext, indent) is guessed from the file suffix if it is not
       given.

       The format can also be the name of a parsing module, or an actual
       parser object.

       The handler is any object that can take nested Stag events
       (start_event, end_event, evbody) which are generated from the parse. If
       the handler is omitted, all events will be cached and the resulting
       tree will be returned.
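
       The flow of nested events can be pictured with a small plain-Perl
       sketch. Here the handler is a hash of callbacks, which is a
       simplification made for this illustration (real handlers are
       objects), and fire_events is a hypothetical name. The order in which
       the events fire for a terminal node is an assumption.

```perl
use strict;
use warnings;

# Hypothetical event generator: walk a nested-array tree and call the
# handler's start_event, evbody and end_event callbacks in document order.
sub fire_events {
    my ($node, $handler) = @_;
    my ($name, $data) = @$node;
    $handler->{start_event}->($name);
    if (ref($data) eq 'ARRAY') {
        fire_events($_, $handler) for @$data;
    }
    else {
        $handler->{evbody}->($data);
    }
    $handler->{end_event}->($name);
    return;
}

# A trivial handler that records each event it receives.
my @log;
my $handler = {
    start_event => sub { push @log, "start $_[0]" },
    evbody      => sub { push @log, "body $_[0]" },
    end_event   => sub { push @log, "end $_[0]" },
};

fire_events([ person => [ [ name => 'fred' ] ] ], $handler);
print "$_\n" for @log;
```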

       See Data::Stag::BaseHandler for writing your own handlers.

       See Data::Stag::BaseGenerator for details on parser classes and error
       handling.

       parsestr

              Title: parsestr

               Args: [str str], [format str], [handler obj]
            Returns: Data::Stag node
            Example: $node = stag_parsestr('(a (b (c "1")))');
            Example: $node = Data::Stag->parsestr(-str=>$str, -handler=>$myhandler);

       Similar to parse(), except the first argument is a string.

       from

              Title: from

               Args: format str, source str
            Returns: Data::Stag node
            Example: $node = stag_from('xml', $fn);
            Example: $node = stag_from('xmlstr', q[<top><x>1</x></top>]);
            Example: $node = Data::Stag->from($parser, $fn);

       Similar to parse: slurps a file or string into a Data::Stag node
       structure.

       The format can also be the name of a parsing module, or an actual
       parser object.

       unflatten

              Title: unflatten

               Args: data array
            Returns: Data::Stag node
            Example: $node = stag_unflatten(person=>[name=>$n, phone=>$p, address=>[street=>$s, city=>$c]]);

       Creates a node structure from a semi-flattened representation, in which
       children of a node are represented as a flat list of data rather than a
       list of array references.

       This means a structure can be specified as:

         person=>[name=>$n,
                  phone=>$p,
                  address=>[street=>$s,
                            city=>$c]]

       Instead of:

         [person=>[ [name=>$n],
                    [phone=>$p],
                    [address=>[ [street=>$s],
                                [city=>$c] ] ]
                  ]
         ]

       The former gets converted into the latter for the internal
       representation.
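
       The conversion described above can be sketched in a few lines of
       plain Perl. The unflatten_pair helper is hypothetical and only
       illustrates the idea; it treats any array-ref value as a flat list
       of child name/data pairs, which is an assumption about the intended
       semantics.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a semi-flattened (name => data) pair into
# the fully nested form. An array-ref value is read as a flat list of
# child name/data pairs; anything else is a terminal value.
sub unflatten_pair {
    my ($name, $data) = @_;
    return [ $name => $data ] unless ref($data) eq 'ARRAY';
    my @flat = @$data;
    my @kids;
    while (@flat) {
        my ($kid_name, $kid_data) = splice @flat, 0, 2;
        push @kids, unflatten_pair($kid_name, $kid_data);
    }
    return [ $name => \@kids ];
}

my $node = unflatten_pair(
    person => [ name    => 'fred',
                phone   => '555-1234',
                address => [ street => '1 Main St',
                             city   => 'Springfield' ] ]);

# The address node is the third child; city is its second child.
my $city_node = $node->[1][2][1][1];
print "$city_node->[0]: $city_node->[1]\n";
```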

       makehandler

              Title: makehandler

               Args: hash of CODEREFs keyed by element name
                     OR a string containing the name of a module
            Returns: Data::Stag::BaseHandler
            Example: $h = Data::Stag->makehandler(%subs);
            Example: $h = Data::Stag->makehandler("My::FooHandler");
            Example: $h = Data::Stag->makehandler('xml');

       This creates a Stag event handler. The argument is a hash of
       subroutines keyed by element/node name. After each node is fired by the
       parser/generator, the subroutine is called, passing the handler object
       and the stag node as arguments. Whatever the subroutine returns is
       placed back into the tree.

       For example, for a parser/generator that fires events with the
       following tree form

         <person>
           <name>foo</name>
           ...
         </person>

       we can create a handler that writes person/name like this:

         $h = Data::Stag->makehandler(
             person => sub {
                 my ($self,$stag) = @_;
                 print $stag->name;
                 return $stag; # don't change tree
             });
         $stag = Data::Stag->parse(-str=>"(...)", -handler=>$h);

       See Data::Stag::BaseHandler for details on handlers.

       getformathandler

              Title: getformathandler

               Args: format str OR Data::Stag::BaseHandler
            Returns: Data::Stag::BaseHandler
            Example: $h = Data::Stag->getformathandler('xml');
                     $h->file("my.xml");
                     Data::Stag->parse(-file=>$fn, -handler=>$h);

       Creates a Stag event handler - this handler can be passed to an event
       generator / parser. Built in handlers include:

       xml
           Generates xml tags from events

       sxpr
           Generates S-Expressions from events

       itext
           Generates itext format from events

       indent
           Generates indent format from events

       All the above are kinds of Data::Stag::Writer

       chainhandler

              Title: chainhandler

               Args: blocked events - str or str[]
                     initial handler - handler object
                     final handler - handler object
            Returns:
            Example: $h = Data::Stag->chainhandler('foo', $processor, 'xml')

       Chains handlers together - for example, you may want to make transforms
       on an event stream, and then pass the event stream to another handler,
       such as an xml handler:

         $processor = Data::Stag->makehandler(
             a => sub {
                 my ($self,$stag) = @_;
                 $stag->set_foo("bar");
                 return $stag
             },
             b => sub {
                 my ($self,$stag) = @_;
                 $stag->set_blah("eek");
                 return $stag
             },
         );
         $chainh = Data::Stag->chainhandler(['a', 'b'], $processor, 'xml');
         $stag = Data::Stag->parse(-str=>"(...)", -handler=>$chainh);

       If the inner handler has a method CONSUMES(), this method will
       determine the blocked events if none are specified.

       See also the script stag-handle.pl

817   RECURSIVE SEARCHING
818       find (f)
819
820              Title: find
821            Synonym: f
822
823               Args: element str
824            Returns: node[] or ANY
825            Example: @persons = stag_find($struct, 'person');
826            Example: @persons = $struct->find('person');
827
828       recursively searches tree for all elements of the given type, and
829       returns all nodes or data elements found.
830
831       if the element found is a non-terminal node, will return the node; if
832       the element found is a terminal (leaf) node, will return the data value
833
834       the element argument can be a path
835
836         @names = $struct->find('department/person/name');
837
838       will find name in the nested structure below:
839
840         (department
841          (person
842           (name "foo")))
843
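       A minimal sketch of the node-versus-value behaviour described above,
       assuming Data::Stag is installed. The sample data and variable names
       are made up for illustration.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data: one department containing two persons.
my $struct = Data::Stag->new(
    department => [
        [ person => [ [ name => 'foo' ], [ phone_no => '555-1111' ] ] ],
        [ person => [ [ name => 'bar' ], [ phone_no => '555-2222' ] ] ],
    ]
);

# 'person' is a non-terminal element, so find() returns nodes.
my @persons = $struct->find('person');

# 'phone_no' is a terminal (leaf) element, so find() returns data values.
my @phones = $struct->find('phone_no');

printf "%d persons; phones: %s\n", scalar(@persons), join(', ', @phones);
```
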
844       findnode (fn)
845
846              Title: findnode
847            Synonym: fn
848
849               Args: element str
850            Returns: node[]
851            Example: @persons = stag_findnode($struct, 'person');
852            Example: @persons = $struct->findnode('person');
853
854       recursively searches tree for all elements of the given type, and
855       returns all nodes found.
856
857       paths can also be used (see find)
858
859       findval (fv)
860
861              Title: findval
862            Synonym: fv
863
864               Args: element str
865            Returns: ANY[] or ANY
866            Example: @names = stag_findval($struct, 'name');
867            Example: @names = $struct->findval('name');
868            Example: $firstname = $struct->findval('name');
869
870       recursively searches tree for all elements of the given type, and
871       returns all data values found. the data values could be primitive
872       scalars or nodes.
873
874       paths can also be used (see find)
875
876       sfindval (sfv)
877
878              Title: sfindval
879            Synonym: sfv
880
881               Args: element str
882            Returns: ANY
883            Example: $name = stag_sfindval($struct, 'name');
884            Example: $name = $struct->sfindval('name');
885
886       as findval, but returns the first value found
887
888       paths can also be used (see find)
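       The three search variants above differ mainly in what they return.
       The sketch below contrasts them on a small hypothetical tree (the
       data and variable names are assumptions, not part of the module).

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $struct = Data::Stag->new(
    people => [
        [ person => [ [ name => 'fred' ] ] ],
        [ person => [ [ name => 'jim'  ] ] ],
    ]
);

# findval() returns the data values of every matching element.
my @values = $struct->findval('name');

# findnode() returns the matching nodes themselves.
my @nodes = $struct->findnode('name');

# sfindval() returns only the first value found.
my $first = $struct->sfindval('name');

print "values: @values\n";
print "first:  $first\n";
print "node element: ", $nodes[0]->element, "\n";
```
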
889
890       findvallist (fvl)
891
892              Title: findvallist
893            Synonym: fvl
894
895               Args: element str[]
896            Returns: ANY[]
897            Example: ($name, $phone) = stag_findvallist($personstruct, 'name', 'phone');
898            Example: ($name, $phone) = $personstruct->findvallist('name', 'phone');
899
900       recursively searches tree for all elements in the list
901
902       DEPRECATED
903
904   DATA ACCESSOR METHODS
905       these allow getting and setting of elements directly underneath the
906       current one
907
908       get (g)
909
910              Title: get
911            Synonym: g
912
913               Args: element str
914             Return: node[] or ANY
915            Example: $name = $person->get('name');
916            Example: @phone_nos = $person->get('phone_no');
917
918       gets the value of the named sub-element
919
920       if the sub-element is a non-terminal, will return a node(s); if the
921       sub-element is a terminal (leaf), it will return the data value(s)
922
923       the examples above would work on a data structure like this:
924
925         [person => [ [name => 'fred'],
926                      [phone_no => '1-800-111-2222'],
927                      [phone_no => '1-415-555-5555']]]
928
929       will return an array or single value depending on the context
930
931       [equivalent to findval(), except that only direct children (as opposed
932       to all descendants) are checked]
933
934       paths can also be used, like this:
935
936        @phone_nos = $struct->get('person/phone_no');
937
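       A short sketch of the context sensitivity and the direct-children
       restriction described above. The tree and variable names are
       hypothetical.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data: a 'people' node wrapping one person.
my $people = Data::Stag->new(
    people => [
        [ person => [
            [ name     => 'fred' ],
            [ phone_no => '1-800-111-2222' ],
            [ phone_no => '1-415-555-5555' ],
        ] ],
    ]
);

# 'person' is non-terminal, so get() returns a node.
my ($person) = $people->get('person');

# List context: all values. Scalar context: a single value.
my @phone_nos = $person->get('phone_no');
my $one_phone = $person->get('phone_no');

# get() only looks at direct children; findval() searches all descendants.
my @direct = $people->get('phone_no');
my @deep   = $people->findval('phone_no');

printf "direct: %d, deep: %d\n", scalar(@direct), scalar(@deep);
```
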
938       sget (sg)
939
940              Title: sget
941            Synonym: sg
942
943               Args: element str
944             Return: ANY
945            Example: $name = $person->sget('name');
946            Example: $phone = $person->sget('phone_no');
947            Example: $name = $struct->sget('department/person/name');
948
949       as get but always returns a single value
950
951       [equivalent to sfindval(), except that only direct children (as opposed
952       to all descendants) are checked]
953
954       getl (gl getlist)
955
956              Title: getl
957            Synonym: gl
958            Synonym: getlist
959
960               Args: element str[]
961             Return: node[] or ANY[]
962            Example: ($name, @phone) = $person->getl('name', 'phone_no');
963
964       returns the data values for a list of sub-elements of a node
965
966       [equivalent to findvallist(), except that only direct children (as
967       opposed to all descendants) are checked]
968
969       getn (gn getnode)
970
971              Title: getn
972            Synonym: gn
973            Synonym: getnode
974
975               Args: element str
976             Return: node[]
977            Example: $namestruct = $person->getn('name');
978            Example: @pstructs = $person->getn('phone_no');
979
980       as get but returns the whole node rather than just the data value
981
982       [equivalent to findnode(), except that only direct children (as opposed
983       to all descendants) are checked]
984
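       To illustrate the difference from get(), the sketch below fetches
       terminal sub-elements as nodes and then reads the element name and
       data from each. The sample data is hypothetical.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $person = Data::Stag->new(
    person => [
        [ name     => 'fred' ],
        [ phone_no => '1-800-111-2222' ],
        [ phone_no => '1-415-555-5555' ],
    ]
);

# getn() returns whole nodes, even for terminal sub-elements.
my @pstructs = $person->getn('phone_no');

for my $node (@pstructs) {
    printf "%s => %s\n", $node->element, $node->data;
}
```
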
985       sgetmap (sgm)
986
987              Title: sgetmap
988            Synonym: sgm
989
990               Args: hash
991             Return: hash
992            Example: %h = $person->sgetmap('social-security-no'=>'id',
993                                           'name'              =>'label',
994                                           'job'               =>0,
995                                           'address'           =>'location');
996
997       returns a hash of key/val pairs built from the data values of the
998       subnodes in the current element; keys are renamed according to the
999       hash passed (a value of '' or 0 keeps the original key).
1000
1001       no multivalued data elements are allowed
1002
1003       set (s)
1004
1005              Title: set
1006            Synonym: s
1007
1008               Args: element str, datavalue ANY (list)
1009             Return: ANY
1010            Example: $person->set('name', 'fred');    # single val
1011            Example: $person->set('phone_no', $cellphone, $homephone);
1012
1013       sets the data value of an element for any node. if the element is
1014       multivalued, all the old values will be replaced with the new ones
1015       specified.
1016
1017       ordering will be preserved, unless the element specified does not
1018       exist, in which case, the new tag/value pair will be placed at the end.
1019
1020       for example, if we have a stag node $person
1021
1022         person:
1023           name: shuggy
1024           job:  bus driver
1025
1026       if we do this
1027
1028         $person->set('name', ());
1029
1030       we will end up with
1031
1032         person:
1033           job:  bus driver
1034
1035       then if we do this
1036
1037         $person->set('name', 'shuggy');
1038
1039       the 'name' node will be placed as the last attribute
1040
1041         person:
1042           job:  bus driver
1043           name: shuggy
1044
1045       You can also use magic methods, for example
1046
1047         $person->set_name('shuggy');
1048         $person->set_job('bus driver', 'poet');
1049         print $person->itext;
1050
1051       will print
1052
1053         person:
1054           name: shuggy
1055           job:  bus driver
1056           job:  poet
1057
1058       note that if the datavalue is a non-terminal node as opposed to a
1059       primitive value, then you have to do it like this:
1060
1061         $people  = Data::Stag->new(people=>[
1062                                             [person=>[[name=>'Sherlock Holmes']]],
1063                                             [person=>[[name=>'Moriarty']]],
1064                                            ]);
1065         $address = Data::Stag->new(address=>[
1066                                              [address_line=>"221B Baker Street"],
1067                                              [city=>"London"],
1068                                              [country=>"Great Britain"]]);
1069         ($person) = $people->qmatch('person', (name => "Sherlock Holmes"));
1070         $person->set("address", $address->data);
1071
1072       If you are using XML data, you can set attributes like this:
1073
1074         $person->set('@'=>[[id=>$id],[foo=>$foo]]);
1075
1076       unset (u)
1077
1078              Title: unset
1079            Synonym: u
1080
1081               Args: element str, datavalue ANY
1082             Return: ANY
1083            Example: $person->unset('name');
1084            Example: $person->unset('phone_no');
1085
1086       prunes all nodes of the specified element from the current node
1087
1088       You can use magic methods, like this
1089
1090         $person->unset_name;
1091         $person->unset_phone_no;
1092
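       A small sketch of unset() and its magic-method form, using made-up
       data. Both calls are made on the top-level node.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $person = Data::Stag->new(
    person => [
        [ name     => 'fred' ],
        [ job      => 'bus driver' ],
        [ phone_no => '1-800-111-2222' ],
        [ phone_no => '1-415-555-5555' ],
    ]
);

# Removes every phone_no sub-element in one call.
$person->unset('phone_no');
my @phones = $person->get('phone_no');

# Magic-method form of unset('name').
$person->unset_name;
my @names = $person->get('name');

# Elements that were not named are left alone.
my $job = $person->sget('job');
print "remaining job: $job\n";
```
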
1093       free
1094
1095              Title: free
1097
1098               Args:
1099             Return:
1100            Example: $person->free;
1101
1102       removes all data from a node. If that node is a subnode of another
1103       node, it is removed altogether
1104
1105       for instance, if we had the data below:
1106
1107         <person>
1108           <name>fred</name>
1109           <address>
1110           ..
1111           </address>
1112         </person>
1113
1114       and called
1115
1116         $person->get_address->free
1117
1118       then the person node would look like this:
1119
1120         <person>
1121           <name>fred</name>
1122         </person>
1123
1124       add (a)
1125
1126              Title: add
1127            Synonym: a
1128
1129               Args: element str, datavalues ANY[]
1130                     OR
1131                     Data::Stag
1132             Return: ANY
1133            Example: $person->add('phone_no', $cellphone, $homephone);
1134            Example: $person->add_phone_no('1-555-555-5555');
1135            Example: $dataset->add($person)
1136
1137       adds a datavalue or list of datavalues. appends if already existing,
1138       creates new element value pairs if not already existing.
1139
1140       if the argument is a stag node, it will add this node under the current
1141       one.
1142
1143       For example, if we have the following node in $dataset
1144
1145        <dataset>
1146          <person>
1147            <name>jim</name>
1148          </person>
1149        </dataset>
1150
1151       And then we add data to it:
1152
1153         ($person) = $dataset->qmatch('person', name=>'jim');
1154         $person->add('phone_no', '555-1111', '555-2222');
1155
1156       We will be left with:
1157
1158        <dataset>
1159          <person>
1160            <name>jim</name>
1161            <phone_no>555-1111</phone_no>
1162            <phone_no>555-2222</phone_no>
1163          </person>
1164        </dataset>
1165
1166       The above call is equivalent to:
1167
1168         $person->add_phone_no('555-1111', '555-2222');
1169
1170       As well as adding data values, we can add whole nodes:
1171
1172         $dataset->add(person=>[[name=>"fred"],
1173                                [phone_no=>"555-3333"]]);
1174
1175       Which is equivalent to
1176
1177         $dataset->add_person([[name=>"fred"],
1178                               [phone_no=>"555-3333"]]);
1179
1180       Remember, the value has to be specified as an array reference of nodes.
1181       In general, you should use the addkid() method to add nodes and use
1182       add() to add values
1183
1184       element (e name)
1185
1186              Title: element
1187            Synonym: e
1188            Synonym: name
1189
1190               Args:
1191             Return: element str
1192            Example: $element = $struct->element
1193
1194       returns the element name of the current node.
1195
1196       This is illustrated in the different representation formats below
1197
1198       sxpr
1199             (element "data")
1200
1201           or
1202
1203             (element
1204              (sub_element "..."))
1205
1206       xml
1207             <element>data</element>
1208
1209           or
1210
1211             <element>
1212               <sub_element>...</sub_element>
1213             </element>
1214
1215       perl
1216             [element => $data ]
1217
1218           or
1219
1220             [element => [
1221                           [sub_element => "..." ]]]
1222
1223       itext
1224             element: data
1225
1226           or
1227
1228             element:
1229               sub_element: ...
1230
1231       indent
1232             element "data"
1233
1234           or
1235
1236             element
1237               sub_element "..."
1238
1239       kids (k children)
1240
1241              Title: kids
1242            Synonym: k
1243            Synonym: children
1244
1245               Args:
1246             Return: ANY or ANY[]
1247            Example: @nodes = $person->kids
1248            Example: $name = $namestruct->kids
1249
1250       returns the data value(s) of the current node; if it is a terminal
1251       node, returns a single value which is the data. if it is non-terminal,
1252       returns an array of nodes
1253
1254       addkid (ak addchild)
1255
1256              Title: addkid
1257            Synonym: ak
1258            Synonym: addchild
1259
1260               Args: kid node
1261             Return: ANY
1262            Example: $person->addkid($job);
1263
1264       adds a new child node to a non-terminal node, after all the existing
1265       child nodes
1266
1267       You can use this method/procedure to add XML attribute data to a node:
1268
1269         $person->addkid(['@'=>[[id=>$id]]]);
1270
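       A minimal sketch of addkid() with a node built separately. The
       names $person and $job and the data are hypothetical.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $person = Data::Stag->new(
    person => [
        [ name => 'shuggy' ],
    ]
);

# Build a separate node and attach it as the last child.
my $job = Data::Stag->new(job => 'bus driver');
$person->addkid($job);

my @children = map { $_->element } $person->subnodes;
print "children: @children\n";
```
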
1271       subnodes
1272
1273              Title: subnodes
1274
1275               Args:
1276             Return: ANY[]
1277            Example: @nodes = $person->subnodes
1278
1279       returns the child nodes; returns empty list if this is a terminal node
1280
1281       ntnodes
1282
1283              Title: ntnodes
1284
1285               Args:
1286             Return: ANY[]
1287            Example: @nodes = $person->ntnodes
1288
1289       returns all non-terminal children of current node
1290
1291       tnodes
1292
1293              Title: tnodes
1294
1295               Args:
1296             Return: ANY[]
1297            Example: @nodes = $person->tnodes
1298
1299       returns all terminal children of current node
1300
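       The sketch below shows how subnodes, tnodes and ntnodes partition
       the children of a node. The sample data is made up.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data: two leaf children and one nested child.
my $person = Data::Stag->new(
    person => [
        [ name    => 'fred' ],
        [ job     => 'bus driver' ],
        [ address => [ [ city => 'London' ] ] ],
    ]
);

# Collect the element names returned by each accessor.
my @all         = map { $_->element } $person->subnodes;
my @terminal    = map { $_->element } $person->tnodes;
my @nonterminal = map { $_->element } $person->ntnodes;

print "all:          @all\n";
print "terminal:     @terminal\n";
print "non-terminal: @nonterminal\n";
```
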
1301   QUERYING AND ADVANCED DATA MANIPULATION
1302       ijoin (j)
1303
1304              Title: ijoin
1305            Synonym: j
1306            Synonym: ij
1307
1308               Args: element str, key str, data Node
1309             Return: undef
1310
1311       does a relational style inner join - see previous example in this doc
1312
1313       key can either be a single node name that must be shared (analogous to
1314       SQL INNER JOIN .. USING), or a key1=key2 equivalence relation
1315       (analogous to SQL INNER JOIN ... ON)
1316
1317       qmatch (qm)
1318
1319              Title: qmatch
1320            Synonym: qm
1321
1322               Args: return-element str, match-element str, match-value str
1323             Return: node[]
1324            Example: @persons = $s->qmatch('person', 'name', 'fred');
1325            Example: @persons = $s->qmatch('person', (job=>'bus driver'));
1326
1327       queries the node tree for all elements that satisfy the specified
1328       key=val match - see previous example in this doc
1329
1330       for those inclined to thinking relationally, this can be thought of as
1331       a query that returns a stag object:
1332
1333         SELECT <return-element> FROM <stag-node> WHERE <match-element> = <match-value>
1334
1335       this always returns an array; this means that calling in a scalar
1336       context will return the number of elements; for example
1337
1338         $n = $s->qmatch('person', (name=>'fred'));
1339
1340       the value of $n will be equal to the number of persons called fred
1341
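       A runnable sketch of the relational-style query above, on a small
       hypothetical data set. It keeps the result in an array and counts
       the matches from that.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $s = Data::Stag->new(
    people => [
        [ person => [ [ name => 'fred' ], [ job => 'bus driver' ] ] ],
        [ person => [ [ name => 'jim'  ], [ job => 'poet'       ] ] ],
        [ person => [ [ name => 'fred' ], [ job => 'poet'       ] ] ],
    ]
);

# SELECT person FROM $s WHERE name = 'fred'
my @freds = $s->qmatch('person', name => 'fred');

# Each result is a person node, so the usual accessors apply.
my @fred_jobs = map { $_->sget('job') } @freds;

printf "%d persons called fred; jobs: %s\n",
    scalar(@freds), join(', ', @fred_jobs);
```
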
1342       tmatch (tm)
1343
1344              Title: tmatch
1345            Synonym: tm
1346
1347               Args: element str, value str
1348             Return: bool
1349            Example: @persons = grep {$_->tmatch('name', 'fred')} @persons
1350
1351       returns true if the value of the specified element matches - see
1352       previous example in this doc
1353
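       A sketch of tmatch() used as a filter over nodes obtained with
       find(). The data and variable names are hypothetical.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $s = Data::Stag->new(
    people => [
        [ person => [ [ name => 'fred' ], [ hair_colour => 'green' ] ] ],
        [ person => [ [ name => 'jim'  ], [ hair_colour => 'green' ] ] ],
    ]
);

my @persons = $s->find('person');

# Keep only the person nodes whose name is 'fred'.
my @freds = grep { $_->tmatch('name', 'fred') } @persons;

printf "%d of %d persons matched\n", scalar(@freds), scalar(@persons);
```
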
1354       tmatchhash (tmh)
1355
1356              Title: tmatchhash
1357            Synonym: tmh
1358
1359               Args: match hashref
1360             Return: bool
1361            Example: @persons = grep {$_->tmatchhash({name=>'fred', hair_colour=>'green'})} @persons
1362
1363       returns true if the node matches a set of constraints, specified as
1364       a hash.
1365
1366       tmatchnode (tmn)
1367
1368              Title: tmatchnode
1369            Synonym: tmn
1370
1371               Args: match node
1372             Return: bool
1373            Example: @persons = grep {$_->tmatchnode([person=>[[name=>'fred'], [hair_colour=>'green']]])} @persons
1374
1375       returns true if the node matches a set of constraints, specified as
1376       a node
1377
1378       cmatch (cm)
1379
1380              Title: cmatch
1381            Synonym: cm
1382
1383               Args: element str, value str
1384             Return: int
1385            Example: $n_freds = $personset->cmatch('name', 'fred');
1386
1387       counts the number of matches
1388
1389       where (w)
1390
1391              Title: where
1392            Synonym: w
1393
1394               Args: element str, test CODE
1395             Return: Node[]
1396            Example: @rich_persons = $data->where('person', sub {shift->get_salary > 100000});
1397
1398       the tree is queried for all elements of the specified type that satisfy
1399       the coderef (must return a boolean)
1400
1401         my @rich_dog_or_cat_owners =
1402           $data->where('person',
1403                        sub {my $p = shift;
1404                             $p->get_salary > 100000 &&
1405                             $p->where('pet',
1406                                       sub {shift->get_type =~ /(dog|cat)/})});
1407
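       A self-contained variant of the salary query above, with made-up
       data so it can be run as is. It uses sget() rather than the magic
       get_salary method, purely as an illustration choice.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $data = Data::Stag->new(
    staff => [
        [ person => [ [ name => 'ann' ], [ salary => 150000 ] ] ],
        [ person => [ [ name => 'bob' ], [ salary => 50000  ] ] ],
    ]
);

# The coderef receives each person node and returns a boolean.
my @rich_persons = $data->where(
    'person',
    sub { shift->sget('salary') > 100000 }
);

print "rich: ", join(', ', map { $_->sget('name') } @rich_persons), "\n";
```
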
1408       iterate (i)
1409
1410              Title: iterate
1411            Synonym: i
1412
1413               Args: CODE
1414             Return: Node[]
1415            Example: $data->iterate(sub {
1416                                        my $stag = shift;
1417                                        my $parent = shift;
1418                                        if ($stag->element eq 'pet') {
1419                                            $parent->set_pet_name($stag->get_name);
1420                                        }
1421                                    });
1422
1423       iterates through whole tree calling the specified subroutine.
1424
1425       the first arg passed to the subroutine is the stag node representing
1426       the tree at that point; the second arg is for the parent.
1427
1428       for instance, the example code above would turn this
1429
1430         (person
1431          (name "jim")
1432          (pet
1433           (name "fluffy")))
1434
1435       into this
1436
1437         (person
1438          (name "jim")
1439          (pet_name "fluffy")
1440          (pet
1441           (name "fluffy")))
1442
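       iterate() can also be used read-only. The sketch below, on made-up
       data, tallies how often each element name is passed to the callback
       and leaves the tree unchanged.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data, mirroring the example above.
my $data = Data::Stag->new(
    person => [
        [ name => 'jim' ],
        [ pet  => [ [ name => 'fluffy' ] ] ],
    ]
);

# Count how many times each element name is seen during the traversal.
my %seen;
$data->iterate(sub {
    my ($stag, $parent) = @_;
    $seen{ $stag->element }++;
});

print "$_: $seen{$_}\n" for sort keys %seen;
```
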
1443       maptree
1444
1445              Title: maptree
1446
1447               Args: CODE
1448             Return: Node[]
1449            Example: $data->maptree(sub {
1450                                        my $stag = shift;
1451                                        my $parent = shift;
1452                                        if ($stag->element eq 'pet') {
1453                                            [pet=>$stag->sget_foo]
1454                                        }
1455                                        else {
1456                                            $stag
1457                                        }
1458                                    });
1459
1460   MISCELLANEOUS METHODS
1461       duplicate (d)
1462
1463              Title: duplicate
1464            Synonym: d
1465
1466               Args:
1467             Return: Node
1468            Example: $node2 = $node->duplicate;
1469
1470       does a deep copy of a stag structure
1471
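       A brief sketch showing that a duplicate can be changed without
       affecting the original. The data and names are hypothetical.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $node = Data::Stag->new(
    person => [
        [ name    => 'fred' ],
        [ address => [ [ city => 'London' ] ] ],
    ]
);

# Copy the structure, then modify only the copy.
my $node2 = $node->duplicate;
$node2->set('name', 'jim');

printf "original: %s, copy: %s\n", $node->sget('name'), $node2->sget('name');
```
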
1472       isanode
1473
1474              Title: isanode
1475
1476               Args:
1477             Return: bool
1478            Example: if (stag_isanode($node)) { ... }
1479
1480       hash
1481
1482              Title: hash
1483
1484               Args:
1485             Return: hash
1486            Example: $h = $node->hash;
1487
1488       turns a tree into a hash. all data values will be arrayrefs
1489
1490       pairs
1491
1492              Title: pairs
1493
1494       turns a tree into a hash. all data values will be scalar (IMPORTANT:
1495       this means duplicate values will be lost)
1496
1497       write
1498
1499              Title: write
1500
1501               Args: filename str, format str[optional]
1502             Return:
1503            Example: $node->write("myfile.xml");
1504            Example: $node->write("myfile", "itext");
1505
1506       will try and guess the format from the extension if not specified
1507
1508       xml
1509
1510              Title: xml
1511
1518               Args:
1519             Return: xml str
1520            Example: print $node->xml;
1521
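       A minimal sketch of serialising a node built in memory with xml().
       The data is made up, and the exact whitespace of the output is not
       assumed here.

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $node = Data::Stag->new(
    person => [
        [ name     => 'fred' ],
        [ phone_no => '1-800-111-2222' ],
    ]
);

# xml() returns the serialised tree as a string.
my $xml = $node->xml;
print $xml;
```
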
1522   XML METHODS
1523       xslt
1524
1525              Title: xslt
1526
1527               Args: xslt_file str
1528             Return: Node
1529            Example: $new_stag = $stag->xslt('mytransform.xsl');
1530
1531       transforms a stag tree using XSLT
1532
1533       xsltstr
1534
1535              Title: xsltstr
1536
1537               Args: xslt_file str
1538             Return: str
1539            Example: print $stag->xsltstr('mytransform.xsl');
1540
1541       As above, but returns the string of the resulting transform, rather
1542       than a stag tree
1543
1544       sax
1545
1546              Title: sax
1547
1548               Args: saxhandler SAX-CLASS
1549             Return:
1550            Example: $node->sax($mysaxhandler);
1551
1552       turns a tree into a series of SAX events
1553
1554       xpath (xp tree2xpath)
1555
1556              Title: xpath
1557            Synonym: xp
1558            Synonym: tree2xpath
1559
1560               Args:
1561             Return: xpath object
1562            Example: $xp = $node->xpath; $q = $xp->find($xpathquerystr);
1563
1564       xpquery (xpq xpathquery)
1565
1566              Title: xpquery
1567            Synonym: xpq
1568            Synonym: xpathquery
1569
1570               Args: xpathquery str
1571             Return: Node[]
1572            Example: @nodes = $node->xpq($xpathquerystr);
1573

STAG SCRIPTS

1575       The following scripts come with the stag module
1576
1577       stag-autoschema.pl
1578           writes the implicit stag-schema for a stag file
1579
1580       stag-db.pl
1581           persistent storage and retrieval for stag data (xml, sxpr, itext)
1582
1583       stag-diff.pl
1584           finds the difference between two stag files
1585
1586       stag-drawtree.pl
1587           draws a stag file (xml, itext, sxpr) as a PNG diagram
1588
1589       stag-filter.pl
1590           filters a stag file (xml, itext, sxpr) for nodes of interest
1591
1592       stag-findsubtree.pl
1593           finds nodes in a stag file
1594
1595       stag-flatten.pl
1596           turns stag data into a flat table
1597
1598       stag-grep.pl
1599           filters a stag file (xml, itext, sxpr) for nodes of interest
1600
1601       stag-handle.pl
1602           streams a stag file through a handler into a writer
1603
1604       stag-join.pl
1605           joins two stag files together based on a common key
1606
1607       stag-mogrify.pl
1608           mangle stag files
1609
1610       stag-parse.pl
1611           parses a file and fires events (e.g. sxpr to xml)
1612
1613       stag-query.pl
1614           aggregate queries
1615
1616       stag-split.pl
1617           splits a stag file (xml, itext, sxpr) into multiple files
1618
1619       stag-splitter.pl
1620           splits a stag file into multiple files
1621
1622       stag-view.pl
1623           draws an expandable Tk tree diagram showing stag data
1624
1625       To get more documentation, type
1626
1627         stag-<script>.pl -h
1628

BUGS

1630       none known so far, possibly quite a few undocumented features!
1631
1632       Not a bug, but the underlying default datastructure of nested arrays is
1633       more heavyweight than it needs to be. More lightweight implementations
1634       are possible. Some time I will write a C implementation.
1635

WEBSITE

1637       <http://stag.sourceforge.net>
1638

AUTHOR

1640       Chris Mungall <cjm AT fruitfly DOT org>
1641
1643       Copyright (c) 2004 Chris Mungall
1644
1645       This module is free software.  You may distribute this module under the
1646       same terms as perl itself.
1647
1648
1649
1650perl v5.36.0                      2022-07-22                     Data::Stag(3)