Data::Stag(3)         User Contributed Perl Documentation        Data::Stag(3)

NAME

         Data::Stag - Structured Tags datastructures

SYNOPSIS

         # PROCEDURAL USAGE
         use Data::Stag qw(:all);
         $doc = stag_parse($file);
         @persons = stag_find($doc, "person");
         foreach $p (@persons) {
           printf "%s, %s phone: %s\n",
             stag_sget($p, "family_name"),
             stag_sget($p, "given_name"),
             stag_sget($p, "phone_no"),
           ;
         }

         # OBJECT-ORIENTED USAGE
         use Data::Stag;
         $doc = Data::Stag->parse($file);
         @persons = $doc->find("person");
         foreach $p (@persons) {
           printf "%s, %s phone: %s\n",
             $p->sget("family_name"),
             $p->sget("given_name"),
             $p->sget("phone_no"),
           ;
         }

DESCRIPTION

       This module is for manipulating data as hierarchical tag/value pairs
       (Structured TAGs or Simple Tree AGgregates). These datastructures can
       be represented as nested arrays, which have the advantage of being
       native to perl. A simple example is shown below:

         [ person=> [  [ family_name => $family_name ],
                       [ given_name  => $given_name  ],
                       [ phone_no    => $phone_no    ] ] ],

       Data::Stag uses a subset of XML for import and export. This means the
       module can also be used as a general XML parser/writer (with certain
       caveats).

       The above set of structured tags can be represented in XML as

         <person>
           <family_name>...</family_name>
           <given_name>...</given_name>
           <phone_no>...</phone_no>
         </person>

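       As a rough illustration of the mapping just described, the following
       plain-Perl sketch renders a nested-array node as XML. It does not use
       Data::Stag: node_to_xml is a hypothetical helper written for this
       example, the sample values are invented, and escaping of special
       characters is left out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Stag): render a [ name => data ]
# node as indented XML. Escaping of <, > and & is omitted for brevity.
sub node_to_xml {
    my ($node, $depth) = @_;
    $depth //= 0;
    my ($name, $data) = @$node;
    my $pad = '  ' x $depth;

    # Terminal node: the data is a plain scalar.
    return "$pad<$name>$data</$name>\n" unless ref($data) eq 'ARRAY';

    # Non-terminal node: the data is a list of child nodes.
    my $inner = join '', map { node_to_xml($_, $depth + 1) } @$data;
    return "$pad<$name>\n$inner$pad</$name>\n";
}

# Invented sample data in the same shape as the person example.
my $person = [ person => [ [ family_name => 'Smith'    ],
                           [ given_name  => 'Fred'     ],
                           [ phone_no    => '555-1234' ] ] ];
print node_to_xml($person);
```

       The recursion mirrors the structure itself: a scalar ends the descent,
       an array reference continues it.
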
       This datastructure can be examined, manipulated and exported using Stag
       functions or methods:

         $document = Data::Stag->parse($file);
         @persons = $document->find('person');
         foreach my $person (@persons) {
           $person->set('full_name',
                        $person->sget('given_name') . ' ' .
                        $person->sget('family_name'));
         }

       Advanced querying is performed by passing functions, for example:

         # get all people in dataset with name starting 'A'
         @persons =
           $document->where('person',
                            sub {shift->sget('family_name') =~ /^A/});

       One of the things that marks this module out against other XML modules
       is this emphasis on a functional approach, alongside the
       object-oriented and procedural approaches.

       For full information on the stag project, see
       <http://stag.sourceforge.net>

       PROCEDURAL VS OBJECT-ORIENTED USAGE

       Depending on your preference, this module can be used as a set of
       procedural subroutine calls, or as method calls upon Data::Stag
       objects, or both.

       In procedural mode, all the subroutine calls are prefixed "stag_" to
       avoid namespace clashes. The following three calls are equivalent:

         $person = stag_find($doc, "person");
         $person = $doc->find("person");
         $person = $doc->find_person;

       In object mode, you can treat any tree element as if it is an object
       with automatically defined methods for getting/setting the tag values.

       USE OF XML

       Nested arrays can be imported and exported as XML, as well as other
       formats. XML can be slurped into memory all at once (using less memory
       than an equivalent DOM tree), or a simplified SAX style event handling
       model can be used. Similarly, data can be exported all at once, or as a
       series of events.

       Although this module can be used as a general XML tool, it is intended
       primarily as a tool for manipulating hierarchical data using nested
       tag/value pairs.

       This module is more suited to dealing with data-oriented documents than
       text-oriented documents.

       By using a simpler subset of XML equivalent to a basic data tree
       structure, we can write simpler, cleaner code.

       This module is ideally suited to element-only XML (that is, XML without
       attributes or mixed elements).

       If you are using attributes or mixed elements, it is useful to know
       what is going on under the hood.

       All attributes are turned into elements; they are nested inside an
       element with name '@'.

       For example, the following piece of XML

         <foo id="x">
           <bar>ugh</bar>
         </foo>

       gets represented internally as

         <foo>
           <@>
             <id>x</id>
           </@>
           <bar>ugh</bar>
         </foo>

       Of course, this is not valid XML. However, it is just an internal
       representation - when exporting back to XML it will look like normal
       XML with attributes again.

       Mixed content cannot be represented in a simple tree format, so this is
       also expanded.

       The following piece of XML

         <paragraph id="1" color="green">
           example of <bold>mixed</bold>content
         </paragraph>

       gets parsed as if it were actually:

         <paragraph>
           <@>
             <id>1</id>
             <color>green</color>
           </@>
           <.>example of</.>
           <bold>mixed</bold>
           <.>content</.>
         </paragraph>

       When using stag with XML that contains attributes or mixed content, you
       can treat '@' and '.' as normal elements.

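       To make the '@' and '.' convention concrete, here is a plain-Perl
       sketch of the paragraph example written as a nested array, assuming the
       expanded form maps onto [ name => data ] pairs in the usual way. The
       helper attributes_of is hypothetical and is not a Data::Stag function.

```perl
use strict;
use warnings;

# Nested-array form of the <paragraph> example, following the expanded
# representation shown above (an assumption about the in-memory layout).
my $para = [ paragraph => [
    [ '@'  => [ [ id => 1 ], [ color => 'green' ] ] ],
    [ '.'  => 'example of' ],
    [ bold => 'mixed' ],
    [ '.'  => 'content' ],
] ];

# Hypothetical helper: collect the children of any '@' element into a hash.
sub attributes_of {
    my ($node) = @_;
    my ($name, $data) = @$node;
    return {} unless ref($data) eq 'ARRAY';
    my %attr;
    for my $child (@$data) {
        next unless $child->[0] eq '@';
        $attr{ $_->[0] } = $_->[1] for @{ $child->[1] };
    }
    return \%attr;
}

my $attr = attributes_of($para);
print "$_=$attr->{$_}\n" for sort keys %$attr;
```

       Because '@' is just another element name, no special-case data type is
       needed to hold attributes.
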
       SAX

       This module can also be used as part of a SAX-style event generation /
       handling framework - see Data::Stag::BaseHandler

       PERL REPRESENTATION

       Because nested arrays are native to perl, we can specify an XML
       datastructure directly in perl without going through multiple object
       calls.

       For example, instead of using XML::Writer for the lengthy

         $obj->startTag("record");
         $obj->startTag("field1");
         $obj->characters("foo");
         $obj->endTag("field1");
         $obj->startTag("field2");
         $obj->characters("bar");
         $obj->endTag("field2");
         $obj->endTag("record");

       We can instead write

         $struct = [ record => [
                     [ field1 => 'foo'],
                     [ field2 => 'bar']]];

       PARSING

       The following example is for parsing out subsections of a tree and
       changing sub-elements

         use Data::Stag qw(:all);
         my $tree = stag_parse($xmlfile);
         my ($subtree) = stag_findnode($tree, $element);
         stag_set($subtree, $sub_element, $new_val);
         print stag_xml($subtree);

       OBJECT ORIENTED

       The same can be done in a more OO fashion

         use Data::Stag qw(:all);
         my $tree = Data::Stag->parse($xmlfile);
         my ($subtree) = $tree->findnode($element);
         $subtree->set($sub_element, $new_val);
         print $subtree->xml;

       IN A STREAM

       Rather than parsing a whole file into memory all at once (which may
       not be suitable for very large files), you can take an event handling
       approach. The easiest way to do this is to register which nodes in the
       file you are interested in using the makehandler method. The parser
       will sweep through the file, building objects as it goes, and handing
       the object to a subroutine that you specify.

       For example:

         use Data::Stag;
         # catch the end of 'person' elements
         my $h = Data::Stag->makehandler( person=> sub {
                                                      my ($self, $person) = @_;
                                                      printf "name:%s phone:%s\n",
                                                        $person->get_name,
                                                        $person->get_phone;
                                                      return;   # clear node
                                                       });
         Data::Stag->parse(-handler=>$h,
                           -file=>$f);

       See Data::Stag::BaseHandler for writing handlers

       See the Stag website at <http://stag.sourceforge.net> for more
       examples.

       STRUCTURED TAGS TREE DATA STRUCTURE

       A tree of structured tags is represented as a recursively nested array;
       the elements of the array represent nodes in the tree.

       A node is a name/data pair that can represent tags and values. A node
       is represented using a reference to an array, where the first element
       of the array is the tagname, or element, and the second element is the
       data.

       This can be visualised as a box:

         +-----------+
         ⎪Name ⎪ Data⎪
         +-----------+

       In perl, we represent this pair as a reference to an array

         [ Name => $Data ]

       The Data can either be a list of child nodes (subtrees), or a data
       value.

       The terminal nodes (leaves of the tree) contain data values; these are
       represented in perl using primitive scalars.

       For example:

         [ Name => 'Fred' ]

       For non-terminal nodes, the Data is a reference to an array, where each
       element of the array is a new node.

         +-----------+
         ⎪Name ⎪ Data⎪
         +-----------+
                 ⎪⎪⎪   +-----------+
                 ⎪⎪+-->⎪Name ⎪ Data⎪
                 ⎪⎪    +-----------+
                 ⎪⎪
                 ⎪⎪    +-----------+
                 ⎪+--->⎪Name ⎪ Data⎪
                 ⎪     +-----------+
                 ⎪
                 ⎪     +-----------+
                 +---->⎪Name ⎪ Data⎪
                       +-----------+

       In perl this would be:

         [ Name => [
                     [Name1 => $Data1],
                     [Name2 => $Data2],
                     [Name3 => $Data3],
                   ]
         ];

       The extra level of nesting is required to be able to store any node in
       the tree using a single variable. This representation has advantages
       over alternatives such as hashes and mixed hash/array structures; for
       example, arrays keep child nodes in order and allow a tag name to be
       repeated.

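       The terminal/non-terminal distinction above is all a program needs in
       order to walk such a tree. The sketch below is plain Perl with no
       Data::Stag calls; leaf_paths is a hypothetical helper and the sample
       values are invented. It lists every leaf together with the path of tag
       names leading to it.

```perl
use strict;
use warnings;

# Hypothetical helper: return one "path = value" string per terminal node.
sub leaf_paths {
    my ($node, $prefix) = @_;
    my ($name, $data) = @$node;
    my $path = defined($prefix) ? "$prefix/$name" : $name;

    # A scalar in the data slot marks a terminal node.
    return ("$path = $data") unless ref($data) eq 'ARRAY';

    # Otherwise the data slot holds a list of child nodes.
    return map { leaf_paths($_, $path) } @$data;
}

my $tree = [ person => [
    [ name  => 'Fred' ],
    [ phone => '555-1234' ],
    [ phone => '555-9876' ],   # repeated tag, kept in order
] ];
print "$_\n" for leaf_paths($tree);
```

       The repeated phone element shows why an array of pairs is used here: a
       plain hash keyed by tag name would keep only one of the two values.
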
       MANIPULATION AND QUERYING

       The following example is taken from biology; we have a list of species
       (mouse, human, fly) and a list of genes found in those species. These
       are cross-referenced by an identifier called tax_id. We can do a
       relational-style inner join on this identifier, as follows:

         use Data::Stag qw(:all);
         my $tree =
         Data::Stag->new(
           'db' => [
           [ 'species_set' => [
             [ 'species' => [
               [ 'common_name' => 'house mouse' ],
               [ 'binomial' => 'Mus musculus' ],
               [ 'tax_id' => '10090' ]]],
             [ 'species' => [
               [ 'common_name' => 'fruit fly' ],
               [ 'binomial' => 'Drosophila melanogaster' ],
               [ 'tax_id' => '7227' ]]],
             [ 'species' => [
               [ 'common_name' => 'human' ],
               [ 'binomial' => 'Homo sapiens' ],
               [ 'tax_id' => '9606' ]]]]],
           [ 'gene_set' => [
             [ 'gene' => [
               [ 'symbol' => 'HGNC' ],
               [ 'tax_id' => '9606' ],
               [ 'phenotype' => 'Hemochromatosis' ],
               [ 'phenotype' => 'Porphyria variegata' ],
               [ 'GO_term' => 'iron homeostasis' ],
               [ 'map' => '6p21.3' ]]],
             [ 'gene' => [
               [ 'symbol' => 'Hfe' ],
               [ 'synonym' => 'MR2' ],
               [ 'tax_id' => '10090' ],
               [ 'GO_term' => 'integral membrane protein' ],
               [ 'map' => '13 A2-A4' ]]]]]]
          );

         # inner join of species and gene parts of tree,
         # based on 'tax_id' element
         my $gene_set = $tree->find("gene_set");       # get <gene_set> element
         my $species_set = $tree->find("species_set"); # get <species_set> element
         $gene_set->ijoin("gene", "tax_id", $species_set);   # INNER JOIN

         print "Reorganised data:\n";
         print $gene_set->xml;

         # find all genes starting with letter 'H' where species/common_name=human
         my @genes =
           $gene_set->where('gene',
                            sub { my $g = shift;
                                  $g->get_symbol =~ /^H/ &&
                                  $g->findval("common_name") eq ('human')});

         print "Human genes beginning 'H'\n";
         print $_->xml foreach @genes;

       S-Expression (Lisp) representation

       The data represented using this module can be represented as Lisp-style
       S-Expressions.

       See Data::Stag::SxprParser and Data::Stag::SxprWriter

       If we execute this code on the XML from the example above

         $stag = Data::Stag->parse($xmlfile);
         print $stag->sxpr;

       The following S-Expression will be printed:

         '(db
           (species_set
             (species
               (common_name "house mouse")
               (binomial "Mus musculus")
               (tax_id "10090"))
             (species
               (common_name "fruit fly")
               (binomial "Drosophila melanogaster")
               (tax_id "7227"))
             (species
               (common_name "human")
               (binomial "Homo sapiens")
               (tax_id "9606")))
           (gene_set
             (gene
               (symbol "HGNC")
               (tax_id "9606")
               (phenotype "Hemochromatosis")
               (phenotype "Porphyria variegata")
               (GO_term "iron homeostasis")
               (map
                 (cytological
                   (chromosome "6")
                   (band "p21.3"))))
             (gene
               (symbol "Hfe")
               (synonym "MR2")
               (tax_id "10090")
               (GO_term "integral membrane protein")))
           (similarity_set
             (pair
               (symbol "HGNC")
               (symbol "Hfe"))
             (pair
               (symbol "WNT3A")
               (symbol "Wnt3a"))))

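       The correspondence between the nested arrays and this notation is close
       enough to write down in a few lines. The following plain-Perl sketch
       uses a hypothetical helper, node_to_sxpr, rather than
       Data::Stag::SxprWriter; it always quotes leaf values, does not escape
       embedded quotes, and does not emit the leading ' shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: render a [ name => data ] node as an S-Expression.
sub node_to_sxpr {
    my ($node, $depth) = @_;
    $depth //= 0;
    my ($name, $data) = @$node;
    my $pad = '  ' x $depth;

    # Terminal node: (name "value")
    return qq{$pad($name "$data")} unless ref($data) eq 'ARRAY';

    # Non-terminal node: open the list, indent the children, close it.
    my @kids = map { node_to_sxpr($_, $depth + 1) } @$data;
    return join("\n", "$pad($name", @kids) . ')';
}

my $species = [ species => [ [ common_name => 'human' ],
                             [ binomial    => 'Homo sapiens' ],
                             [ tax_id      => '9606' ] ] ];
print node_to_sxpr($species), "\n";
```
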
       TIPS FOR EMACS USERS AND LISP PROGRAMMERS

       If you use emacs, you can save this as a file with the ".el" suffix and
       get syntax highlighting for editing this file. Quotes around the
       terminal node data items are optional.

       If you know emacs lisp or any other lisp, this also turns out to be a
       very nice language for manipulating these datastructures. Try copying
       and pasting the above s-expression to the emacs scratch buffer and
       playing with it in lisp.

       INDENTED TEXT REPRESENTATION

       Data::Stag has its own text format for writing data trees. Again, this
       is only possible because we are working with a subset of XML (no
       attributes, no mixed elements). The data structure above can be written
       as follows:

         db:
           species_set:
             species:
               common_name: house mouse
               binomial: Mus musculus
               tax_id: 10090
             species:
               common_name: fruit fly
               binomial: Drosophila melanogaster
               tax_id: 7227
             species:
               common_name: human
               binomial: Homo sapiens
               tax_id: 9606
           gene_set:
             gene:
               symbol: HGNC
               tax_id: 9606
               phenotype: Hemochromatosis
               phenotype: Porphyria variegata
               GO_term: iron homeostasis
               map: 6p21.3
             gene:
               symbol: Hfe
               synonym: MR2
               tax_id: 10090
               GO_term: integral membrane protein
               map: 13 A2-A4
           similarity_set:
             pair:
               symbol: HGNC
               symbol: Hfe
             pair:
               symbol: WNT3A
               symbol: Wnt3a

       See Data::Stag::ITextParser and Data::Stag::ITextWriter

       NESTED ARRAY SPECIFICATION II

       To avoid excessive square bracket usage, you can specify a structure
       like this:

         use Data::Stag qw(:all);

         *N = \&stag_new;
         my $tree =
           N(top=>[
                   N('personset'=>[
                                   N('person'=>[
                                                N('name'=>'davey'),
                                                N('address'=>'here'),
                                                N('description'=>[
                                                                  N('hair'=>'green'),
                                                                  N('eyes'=>'two'),
                                                                  N('teeth'=>5),
                                                                 ]
                                                 ),
                                                N('pets'=>[
                                                           N('petname'=>'igor'),
                                                           N('petname'=>'ginger'),
                                                          ]
                                                 ),

                                               ],
                                    ),
                                   N('person'=>[
                                                N('name'=>'shuggy'),
                                                N('address'=>'there'),
                                                N('description'=>[
                                                                  N('hair'=>'red'),
                                                                  N('eyes'=>'three'),
                                                                  N('teeth'=>1),
                                                                 ]
                                                 ),
                                                N('pets'=>[
                                                           N('petname'=>'thud'),
                                                           N('petname'=>'spud'),
                                                          ]
                                                 ),
                                               ]
                                    ),
                                  ]
                    ),
                   N('animalset'=>[
                                   N('animal'=>[
                                                N('name'=>'igor'),
                                                N('class'=>'rat'),
                                                N('description'=>[
                                                                  N('fur'=>'white'),
                                                                  N('eyes'=>'red'),
                                                                  N('teeth'=>50),
                                                                 ],
                                                 ),
                                               ],
                                    ),
                                  ]
                    ),

                  ]
            );

         # find all people
         my @persons = stag_find($tree, 'person');

         # write xml for all red haired people
         foreach my $p (@persons) {
           print stag_xml($p)
             if stag_tmatch($p, "hair", "red");
         }

         # find all people that have name == shuggy
         my @p =
           stag_qmatch($tree,
                       "person",
                       "name",
                       "shuggy");

NODES AS DATA OBJECTS

       As well as the methods listed below, a node can be treated as if it is
       a data object of a class determined by the element.

       For example, the following are equivalent.

         $node->get_name;
         $node->get('name');

         $node->set_name('fred');
         $node->set('name', 'fred');

       This is really just syntactic sugar. The autoloaded methods are not
       checked against any schema, although this may be added in future.

INDEXING STAG TREES

       A stag tree can be indexed as a hash for direct retrieval; see
       Data::Stag::HashDB

       This index can be made persistent as a DB file; see Data::Stag::StagDB

       If you wish to use Stag in conjunction with a relational database, you
       should install DBIx::DBStag

STAG METHODS

       All method calls are also available as procedural subroutine calls;
       unless otherwise noted, the subroutine call is the same as the method
       call, but with the string stag_ prefixed to the method name. The first
       argument should be a Data::Stag datastructure.

       To import all subroutines into the current namespace, use this idiom:

         use Data::Stag qw(:all);
         $doc = stag_parse($file);
         @persons = stag_find($doc, 'person');

       If you wish to use this module procedurally, and you are too lazy to
       prefix all calls with stag_, use this idiom:

         use Data::Stag qw(:lazy);
         $doc = parse($file);
         @persons = find($doc, 'person');

       But beware of clashes!

       Most method calls also have a handy short mnemonic. Use of these is
       optional. Software engineering types prefer longer names, in the belief
       that this leads to clearer code. Hacker types prefer shorter names, as
       this requires fewer keystrokes, and leads to a more compact
       representation of the code. It is expected that if you do use this
       module, then its usage will be fairly ubiquitous within your code, and
       the mnemonics will become familiar, much like the qw and s/ operators
       in perl. As always with perl, the decision is yours.

       Some methods take a single parameter or list of parameters; some have
       large lists of parameters that can be passed in any order. If the
       documentation states:

         Args: [x str], [y int], [z ANY]

       Then the method can be called like this:

         $stag->foo("this is x", 55, $ref);

       or like this:

         $stag->foo(-z=>$ref, -x=>"this is x", -y=>55);

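       For readers curious how one subroutine can accept both calling styles,
       here is a minimal plain-Perl sketch. It is not Data::Stag's own
       argument handling: unpack_args and foo are hypothetical names, and the
       rule "a first argument that looks like -name means named style" is an
       assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: map either calling style onto an ordered list of
# parameter names. Named style is assumed when the first argument is a
# plain string of the form "-name".
sub unpack_args {
    my ($names, @args) = @_;
    if (@args && defined($args[0]) && !ref($args[0])
              && $args[0] =~ /^-[a-z]/i) {
        my %given = @args;
        return map { $given{"-$_"} } @$names;
    }
    # Positional style: take the arguments in declaration order.
    return @args[0 .. $#$names];
}

# Hypothetical function with the signature "Args: [x str], [y int], [z ANY]".
sub foo {
    my ($x, $y, $z) = unpack_args([qw(x y z)], @_);
    return "x=$x y=$y";
}

print foo("this is x", 55, [1]), "\n";
print foo(-z => [1], -x => "this is x", -y => 55), "\n";
```

       Both calls print the same line. A heuristic like this cannot tell a
       named call from a positional call whose first value happens to start
       with a dash and a letter.
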
       INITIALIZATION METHODS

       new

              Title: new

               Args: element str, data STAG-DATA
            Returns: Data::Stag node
            Example: $node = stag_new();
            Example: $node = Data::Stag->new;
            Example: $node = Data::Stag->new(person => [[name=>$n], [phone=>$p]]);

       Creates a new instance of a Data::Stag node.

       stagify (nodify)

              Title: stagify
            Synonym: nodify
               Args: data ARRAY-REF
            Returns: Data::Stag node
            Example: $node = stag_stagify([person => [[name=>$n], [phone=>$p]]]);

       Turns a perl array reference into a Data::Stag node.

       Similar to new.

       parse

              Title: parse

               Args: [file str], [format str], [handler obj], [fh FileHandle]
            Returns: Data::Stag node
            Example: $node = stag_parse($fn);
            Example: $node = stag_parse(-fh=>$fh, -handler=>$h, -errhandler=>$eh);
            Example: $node = Data::Stag->parse(-file=>$fn, -handler=>$myhandler);

       Slurps a file or string into a Data::Stag node structure. Will guess
       the format (xml, sxpr, itext, indent) from the suffix if it is not
       given.

       The format can also be the name of a parsing module, or an actual
       parser object.

       The handler is any object that can take nested Stag events
       (start_event, end_event, evbody) which are generated from the parse. If
       the handler is omitted, all events will be cached and the resulting
       tree will be returned.

       See Data::Stag::BaseHandler for writing your own handlers

       See Data::Stag::BaseGenerator for details on parser classes, and error
       handling

       parsestr

              Title: parsestr

               Args: [str str], [format str], [handler obj]
            Returns: Data::Stag node
            Example: $node = stag_parsestr('(a (b (c "1")))');
            Example: $node = Data::Stag->parsestr(-str=>$str, -handler=>$myhandler);

       Similar to parse(), except the first argument is a string.

       from

              Title: from

               Args: format str, source str
            Returns: Data::Stag node
            Example: $node = stag_from('xml', $fn);
            Example: $node = stag_from('xmlstr', q[<top><x>1</x></top>]);
            Example: $node = Data::Stag->from($parser, $fn);

       Similar to parse.

       Slurps a file or string into a Data::Stag node structure.

       The format can also be the name of a parsing module, or an actual
       parser object.

       unflatten

              Title: unflatten

               Args: data array
            Returns: Data::Stag node
            Example: $node = stag_unflatten(person=>[name=>$n, phone=>$p, address=>[street=>$s, city=>$c]]);

       Creates a node structure from a semi-flattened representation, in which
       children of a node are represented as a flat list of data rather than a
       list of array references.

       This means a structure can be specified as:

         person=>[name=>$n,
                  phone=>$p,
                  address=>[street=>$s,
                            city=>$c]]

       Instead of:

         [person=>[ [name=>$n],
                    [phone=>$p],
                    [address=>[ [street=>$s],
                                [city=>$c] ] ]
                  ]
         ]

       The former gets converted into the latter for the internal
       representation.

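       The conversion can be pictured with a short plain-Perl sketch. This is
       an illustration only, not the code behind unflatten; unflatten_sketch
       is a hypothetical name, and it assumes every array reference in the
       input is a flat name/value list of even length.

```perl
use strict;
use warnings;

# Hypothetical helper: turn name => [ k1 => v1, k2 => v2, ... ] into
# [ name => [ [ k1 => v1 ], [ k2 => v2 ], ... ] ], recursing into values.
sub unflatten_sketch {
    my ($name, $data) = @_;

    # A scalar value is already a terminal node.
    return [ $name => $data ] unless ref($data) eq 'ARRAY';

    # Consume the flat list two items at a time: child name, child data.
    my @flat = @$data;
    my @children;
    while (my ($child_name, $child_data) = splice(@flat, 0, 2)) {
        push @children, unflatten_sketch($child_name, $child_data);
    }
    return [ $name => \@children ];
}

my $node = unflatten_sketch(
    person => [ name    => 'Fred',
                phone   => '555-1234',
                address => [ street => 'Main St', city => 'Springfield' ] ]);
print $node->[1][2][1][1][0], "\n";   # name of the second child of address
```

       One consequence of the semi-flattened form is that a leaf value cannot
       itself be an array reference, since that would be read as a list of
       children.
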
       makehandler

              Title: makehandler

               Args: hash of CODEREFs keyed by element name
                     OR a string containing the name of a module
            Returns: L<Data::Stag::BaseHandler>
            Example: $h = Data::Stag->makehandler(%subs);
            Example: $h = Data::Stag->makehandler("My::FooHandler");
            Example: $h = Data::Stag->makehandler('xml');

       This creates a Stag event handler. The argument is a hash of
       subroutines keyed by element/node name. After each node is fired by the
       parser/generator, the subroutine is called, passing the handler object
       and the stag node as arguments. Whatever the subroutine returns is
       placed back into the tree.

       For example, for a parser/generator that fires events with the
       following tree form

         <person>
           <name>foo</name>
           ...
         </person>

       we can create a handler that writes person/name like this:

         $h = Data::Stag->makehandler(
                                      person => sub { my ($self,$stag) = @_;
                                                      print $stag->get_name;
                                                      return $stag; # don't change tree
                                                    });
         $stag = Data::Stag->parse(-str=>"(...)", -handler=>$h);

       See Data::Stag::BaseHandler for details on handlers

       getformathandler

              Title: getformathandler

               Args: format str OR L<Data::Stag::BaseHandler>
            Returns: L<Data::Stag::BaseHandler>
            Example: $h = Data::Stag->getformathandler('xml');
                     $h->file("my.xml");
                     Data::Stag->parse(-file=>$fn, -handler=>$h);

       Creates a Stag event handler - this handler can be passed to an event
       generator / parser. Built in handlers include:

       xml Generates xml tags from events

       sxpr
           Generates S-Expressions from events

       itext
           Generates itext format from events

       indent
           Generates indent format from events

       All the above are kinds of Data::Stag::Writer

       chainhandler

              Title: chainhandler

               Args: blocked events - str or str[]
                     initial handler - handler object
                     final handler - handler object
            Returns:
            Example: $h = Data::Stag->chainhandler('foo', $processor, 'xml')

       Chains handlers together. For example, you may want to make transforms
       on an event stream, and then pass the event stream to another handler,
       such as an XML handler:

         $processor = Data::Stag->makehandler(
                                              a => sub { my ($self,$stag) = @_;
                                                         $stag->set_foo("bar");
                                                         return $stag
                                                       },
                                              b => sub { my ($self,$stag) = @_;
                                                         $stag->set_blah("eek");
                                                         return $stag
                                                       },
                                              );
         $chainh = Data::Stag->chainhandler(['a', 'b'], $processor, 'xml');
         $stag = Data::Stag->parse(-str=>"(...)", -handler=>$chainh);

       If the inner handler has a method CONSUMES(), this method will
       determine the blocked events if none are specified.

       See also the script stag-handle.pl

824       RECURSIVE SEARCHING
825
826       find (f)
827
828              Title: find
829            Synonym: f
830
831               Args: element str
832            Returns: node[] or ANY
833            Example: @persons = stag_find($struct, 'person');
834            Example: @persons = $struct->find('person');
835
836       recursively searches tree for all elements of the given type, and
837       returns all nodes or data elements found.
838
       if the element found is a non-terminal node, will return the node; if
       the element found is a terminal (leaf) node, will return the data value
841
842       the element argument can be a path
843
844         @names = $struct->find('department/person/name');
845
846       will find name in the nested structure below:
847
848         (department
849          (person
850           (name "foo")))
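       The recursive search described above can be pictured on the plain
       nested-array representation. What follows is only a rough sketch
       under that assumption: sketch_find is a hypothetical helper, not part
       of Data::Stag, it looks only below the node it is given, and it does
       not handle path arguments.

```perl
use strict;
use warnings;

# Hypothetical helper, not Data::Stag code: collect every descendant of
# $node called $name. A non-terminal match yields the node itself, a
# terminal (leaf) match yields its data value.
sub sketch_find {
    my ($node, $name) = @_;
    my ($element, $data) = @$node;
    return () unless ref($data) eq 'ARRAY';    # terminal: nothing below
    my @found;
    for my $kid (@$data) {
        my ($kid_name, $kid_data) = @$kid;
        if ($kid_name eq $name) {
            push @found, ref($kid_data) eq 'ARRAY' ? $kid : $kid_data;
        }
        push @found, sketch_find($kid, $name);  # keep descending
    }
    return @found;
}

my $tree = [ department => [
    [ person => [ [ name => 'foo' ] ] ],
    [ person => [ [ name => 'bar' ] ] ],
] ];

my @names   = sketch_find($tree, 'name');      # leaf data values
my @persons = sketch_find($tree, 'person');    # whole nodes
```

       Here @names receives plain strings because name is a leaf, while
       @persons receives array references because person has children.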
851
852       findnode (fn)
853
854              Title: findnode
855            Synonym: fn
856
857               Args: element str
858            Returns: node[]
859            Example: @persons = stag_findnode($struct, 'person');
860            Example: @persons = $struct->findnode('person');
861
862       recursively searches tree for all elements of the given type, and
863       returns all nodes found.
864
865       paths can also be used (see find)
866
867       findval (fv)
868
869              Title: findval
870            Synonym: fv
871
872               Args: element str
873            Returns: ANY[] or ANY
874            Example: @names = stag_findval($struct, 'name');
875            Example: @names = $struct->findval('name');
876            Example: $firstname = $struct->findval('name');
877
878       recursively searches tree for all elements of the given type, and
879       returns all data values found. the data values could be primitive
880       scalars or nodes.
881
882       paths can also be used (see find)
883
884       sfindval (sfv)
885
886              Title: sfindval
887            Synonym: sfv
888
889               Args: element str
890            Returns: ANY
891            Example: $name = stag_sfindval($struct, 'name');
892            Example: $name = $struct->sfindval('name');
893
894       as findval, but returns the first value found
895
896       paths can also be used (see find)
897
898       findvallist (fvl)
899
900              Title: findvallist
901            Synonym: fvl
902
903               Args: element str[]
904            Returns: ANY[]
905            Example: ($name, $phone) = stag_findvallist($personstruct, 'name', 'phone');
906            Example: ($name, $phone) = $personstruct->findvallist('name', 'phone');
907
908       recursively searches tree for all elements in the list
909
910       DEPRECATED
911
912       DATA ACCESSOR METHODS
913
914       these allow getting and setting of elements directly underneath the
915       current one
916
917       get (g)
918
919              Title: get
920            Synonym: g
921
922               Args: element str
923             Return: node[] or ANY
924            Example: $name = $person->get('name');
925            Example: @phone_nos = $person->get('phone_no');
926
927       gets the value of the named sub-element
928
       if the sub-element is a non-terminal, will return the node(s); if the
       sub-element is a terminal (leaf), it will return the data value(s)
931
932       the examples above would work on a data structure like this:
933
934         [person => [ [name => 'fred'],
935                      [phone_no => '1-800-111-2222'],
936                      [phone_no => '1-415-555-5555']]]
937
938       will return an array or single value depending on the context
939
940       [equivalent to findval(), except that only direct children (as opposed
941       to all descendents) are checked]
942
943       paths can also be used, like this:
944
         @phone_nos = $struct->get('person/phone_no');
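
       The context-dependent return value and the restriction to direct
       children can be illustrated on the nested-array form. This is a
       minimal sketch, assuming the plain array representation; sketch_get
       is a hypothetical name, not the module's implementation, and paths
       are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper, not Data::Stag code: values of the direct children
# called $name. Leaves give their data value, non-terminals give the node.
sub sketch_get {
    my ($node, $name) = @_;
    my $kids = $node->[1];
    my @vals = map  { ref($_->[1]) eq 'ARRAY' ? $_ : $_->[1] }
               grep { $_->[0] eq $name } @$kids;
    # List context: every match. Scalar context: the first match only.
    return wantarray ? @vals : $vals[0];
}

my $person = [ person => [ [ name     => 'fred' ],
                           [ phone_no => '1-800-111-2222' ],
                           [ phone_no => '1-415-555-5555' ] ] ];

my $name      = sketch_get($person, 'name');       # scalar context
my @phone_nos = sketch_get($person, 'phone_no');   # list context
```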
946
947       sget (sg)
948
949              Title: sget
950            Synonym: sg
951
952               Args: element str
953             Return: ANY
            Example: $name = $person->sget('name');
            Example: $phone = $person->sget('phone_no');
            Example: $name = $struct->sget('department/person/name');
957
958       as get but always returns a single value
959
960       [equivalent to sfindval(), except that only direct children (as opposed
961       to all descendents) are checked]
962
963       getl (gl getlist)
964
              Title: getl
            Synonym: gl
            Synonym: getlist
968
969               Args: element str[]
970             Return: node[] or ANY[]
971            Example: ($name, @phone) = $person->getl('name', 'phone_no');
972
973       returns the data values for a list of sub-elements of a node
974
975       [equivalent to findvallist(), except that only direct children (as
976       opposed to all descendents) are checked]
977
978       getn (gn getnode)
979
980              Title: getn
981            Synonym: gn
982            Synonym: getnode
983
984               Args: element str
985             Return: node[]
986            Example: $namestruct = $person->getn('name');
987            Example: @pstructs = $person->getn('phone_no');
988
989       as get but returns the whole node rather than just the data value
990
991       [equivalent to findnode(), except that only direct children (as opposed
992       to all descendents) are checked]
993
994       sgetmap (sgm)
995
996              Title: sgetmap
997            Synonym: sgm
998
999               Args: hash
1000             Return: hash
1001            Example: %h = $person->sgetmap('social-security-no'=>'id',
1002                                           'name'              =>'label',
1003                                           'job'               =>0,
1004                                           'address'           =>'location');
1005
       returns a hash of key/val pairs built from the data values of the
       subnodes of the current element; keys are renamed according to the
       hash passed (a value of '' or 0 keeps the original key).
1009
1010       no multivalued data elements are allowed
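
       The key renaming can be sketched on the nested-array form. This is
       an illustration only: sketch_sgetmap is a hypothetical helper, and
       the choice to skip subnodes that are not named in the map is an
       assumption of the sketch rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper, not Data::Stag code: build a hash from the direct
# children of $node, renaming keys according to %map.
sub sketch_sgetmap {
    my ($node, %map) = @_;
    my %out;
    for my $kid (@{ $node->[1] }) {
        my ($name, $value) = @$kid;
        next unless exists $map{$name};         # assumption: unmapped keys are skipped
        $out{ $map{$name} || $name } = $value;  # '' or 0 keeps the original key
    }
    return %out;
}

my $person = [ person => [ [ 'social-security-no' => 'X123' ],
                           [ name  => 'shuggy' ],
                           [ job   => 'bus driver' ],
                           [ hobby => 'poetry' ] ] ];

my %h = sketch_sgetmap($person,
                       'social-security-no' => 'id',
                       'name'               => 'label',
                       'job'                => 0);
```

       After this call %h has the keys id, label and job.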
1011
1012       set (s)
1013
1014              Title: set
1015            Synonym: s
1016
1017               Args: element str, datavalue ANY (list)
1018             Return: ANY
1019            Example: $person->set('name', 'fred');    # single val
1020            Example: $person->set('phone_no', $cellphone, $homephone);
1021
       sets the data value of an element for any node. if the element is
       multivalued, all the old values will be replaced with the new ones
       specified.
1025
1026       ordering will be preserved, unless the element specified does not
1027       exist, in which case, the new tag/value pair will be placed at the end.
1028
1029       for example, if we have a stag node $person
1030
1031         person:
1032           name: shuggy
1033           job:  bus driver
1034
1035       if we do this
1036
1037         $person->set('name', ());
1038
1039       we will end up with
1040
1041         person:
1042           job:  bus driver
1043
1044       then if we do this
1045
1046         $person->set('name', 'shuggy');
1047
1048       the 'name' node will be placed as the last attribute
1049
1050         person:
1051           job:  bus driver
1052           name: shuggy
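
       The ordering rule above can be reproduced on a plain nested array.
       The following is a sketch under stated assumptions, not the module's
       code: sketch_set is a hypothetical helper, and placing replacement
       values in the slot of the first old value is a guess at what
       "ordering will be preserved" means for multivalued elements.

```perl
use strict;
use warnings;

# Hypothetical helper, not Data::Stag code: replace every direct child
# called $name with one child per value in @values.
sub sketch_set {
    my ($node, $name, @values) = @_;
    my @new = map { [ $name => $_ ] } @values;
    my @out;
    my $placed = 0;
    for my $kid (@{ $node->[1] }) {
        if ($kid->[0] eq $name) {
            # Assumption: new values take the slot of the first old value.
            push @out, @new unless $placed++;
        }
        else {
            push @out, $kid;
        }
    }
    push @out, @new unless $placed;    # element did not exist: append
    $node->[1] = \@out;
    return $node;
}

my $person = [ person => [ [ name => 'shuggy' ], [ job => 'bus driver' ] ] ];

sketch_set($person, 'name');              # empty list removes the element
my @after_removal = map { $_->[0] } @{ $person->[1] };

sketch_set($person, 'name', 'shuggy');    # no longer present, so appended
my @after_set = map { $_->[0] } @{ $person->[1] };
```

       After the second call the child order is job, then name, matching
       the itext shown above.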
1053
1054       You can also use magic methods, for example
1055
1056         $person->set_name('shuggy');
1057         $person->set_job('bus driver', 'poet');
1058         print $person->itext;
1059
1060       will print
1061
1062         person:
1063           name: shuggy
1064           job:  bus driver
1065           job:  poet
1066
       note that if the datavalue is a non-terminal node as opposed to a
       primitive value, then you have to do it like this:
1069
1070         $people  = Data::Stag->new(people=>[
1071                                             [person=>[[name=>'Sherlock Holmes']]],
1072                                             [person=>[[name=>'Moriarty']]],
1073                                            ]);
1074         $address = Data::Stag->new(address=>[
1075                                              [address_line=>"221B Baker Street"],
1076                                              [city=>"London"],
1077                                              [country=>"Great Britain"]]);
1078         ($person) = $people->qmatch('person', (name => "Sherlock Holmes"));
1079         $person->set("address", $address->data);
1080
1081       If you are using XML data, you can set attributes like this:
1082
1083         $person->set('@'=>[[id=>$id],[foo=>$foo]]);
1084
1085       unset (u)
1086
1087              Title: unset
1088            Synonym: u
1089
1090               Args: element str, datavalue ANY
1091             Return: ANY
1092            Example: $person->unset('name');
1093            Example: $person->unset('phone_no');
1094
1095       prunes all nodes of the specified element from the current node
1096
1097       You can use magic methods, like this
1098
1099         $person->unset_name;
1100         $person->unset_phone_no;
1101
1102       free
1103
              Title: free

               Args:
             Return:
            Example: $person->free;
1110
1111       removes all data from a node. If that node is a subnode of another
1112       node, it is removed altogether
1113
1114       for instance, if we had the data below:
1115
1116         <person>
1117           <name>fred</name>
1118           <address>
1119           ..
1120           </address>
1121         </person>
1122
1123       and called
1124
1125         $person->get_address->free
1126
1127       then the person node would look like this:
1128
1129         <person>
1130           <name>fred</name>
1131         </person>
1132
1133       add (a)
1134
1135              Title: add
1136            Synonym: a
1137
1138               Args: element str, datavalues ANY[]
1139                     OR
1140                     Data::Stag
1141             Return: ANY
1142            Example: $person->add('phone_no', $cellphone, $homephone);
1143            Example: $person->add_phone_no('1-555-555-5555');
1144            Example: $dataset->add($person)
1145
1146       adds a datavalue or list of datavalues. appends if already existing,
1147       creates new element value pairs if not already existing.
1148
1149       if the argument is a stag node, it will add this node under the current
1150       one.
1151
1152       For example, if we have the following node in $dataset
1153
1154        <dataset>
1155          <person>
1156            <name>jim</name>
1157          </person>
1158        </dataset>
1159
1160       And then we add data to it:
1161
1162         ($person) = $dataset->qmatch('person', name=>'jim');
1163         $person->add('phone_no', '555-1111', '555-2222');
1164
1165       We will be left with:
1166
1167        <dataset>
1168          <person>
1169            <name>jim</name>
1170            <phone_no>555-1111</phone_no>
1171            <phone_no>555-2222</phone_no>
1172          </person>
1173        </dataset>
1174
1175       The above call is equivalent to:
1176
1177         $person->add_phone_no('555-1111', '555-2222');
1178
1179       As well as adding data values, we can add whole nodes:
1180
1181         $dataset->add(person=>[[name=>"fred"],
1182                                [phone_no=>"555-3333"]]);
1183
1184       Which is equivalent to
1185
1186         $dataset->add_person([[name=>"fred"],
1187                               [phone_no=>"555-3333"]]);
1188
       Remember, the value has to be specified as an array reference of nodes.
       In general, you should use the addkid() method to add nodes and use
       add() to add values
1192
1193       element (e name)
1194
1195              Title: element
1196            Synonym: e
1197            Synonym: name
1198
1199               Args:
1200             Return: element str
1201            Example: $element = $struct->element
1202
1203       returns the element name of the current node.
1204
1205       This is illustrated in the different representation formats below
1206
1207       sxpr
1208             (element "data")
1209
1210           or
1211
1212             (element
1213              (sub_element "..."))
1214
1215       xml
1216             <element>data</element>
1217
1218           or
1219
1220             <element>
1221               <sub_element>...</sub_element>
1222             </element>
1223
1224       perl
1225             [element => $data ]
1226
1227           or
1228
1229             [element => [
1230                           [sub_element => "..." ]]]
1231
1232       itext
1233             element: data
1234
1235           or
1236
1237             element:
1238               sub_element: ...
1239
1240       indent
1241             element "data"
1242
1243           or
1244
1245             element
1246               sub_element "..."
1247
1248       kids (k children)
1249
1250              Title: kids
1251            Synonym: k
1252            Synonym: children
1253
1254               Args:
1255             Return: ANY or ANY[]
1256            Example: @nodes = $person->kids
1257            Example: $name = $namestruct->kids
1258
1259       returns the data value(s) of the current node; if it is a terminal
1260       node, returns a single value which is the data. if it is non-terminal,
1261       returns an array of nodes
1262
1263       addkid (ak addchild)
1264
1265              Title: addkid
1266            Synonym: ak
1267            Synonym: addchild
1268
1269               Args: kid node
1270             Return: ANY
1271            Example: $person->addkid($job);
1272
1273       adds a new child node to a non-terminal node, after all the existing
1274       child nodes
1275
1276       You can use this method/procedure to add XML attribute data to a node:
1277
1278         $person->addkid(['@'=>[[id=>$id]]]);
1279
1280       subnodes
1281
1282              Title: subnodes
1283
1284               Args:
1285             Return: ANY[]
1286            Example: @nodes = $person->subnodes
1287
1288       returns the child nodes; returns empty list if this is a terminal node
1289
1290       ntnodes
1291
1292              Title: ntnodes
1293
1294               Args:
1295             Return: ANY[]
1296            Example: @nodes = $person->ntnodes
1297
1298       returns all non-terminal children of current node
1299
1300       tnodes
1301
1302              Title: tnodes
1303
1304               Args:
1305             Return: ANY[]
1306            Example: @nodes = $person->tnodes
1307
1308       returns all terminal children of current node
1309
1310       QUERYING AND ADVANCED DATA MANIPULATION
1311
1312       ijoin (j)
1313
1314              Title: ijoin
1315            Synonym: j
1316            Synonym: ij
1317
1318               Args: element str, key str, data Node
1319             Return: undef
1320
1321       does a relational style inner join - see previous example in this doc
1322
       key can either be a single node name that must be shared (analogous to
       SQL INNER JOIN ... USING), or a key1=key2 equivalence relation
       (analogous to SQL INNER JOIN ... ON)
1326
1327       qmatch (qm)
1328
1329              Title: qmatch
1330            Synonym: qm
1331
1332               Args: return-element str, match-element str, match-value str
1333             Return: node[]
1334            Example: @persons = $s->qmatch('person', 'name', 'fred');
1335            Example: @persons = $s->qmatch('person', (job=>'bus driver'));
1336
1337       queries the node tree for all elements that satisfy the specified
1338       key=val match - see previous example in this doc
1339
1340       for those inclined to thinking relationally, this can be thought of as
1341       a query that returns a stag object:
1342
1343         SELECT <return-element> FROM <stag-node> WHERE <match-element> = <match-value>
1344
       this always returns an array; this means that calling in a scalar
       context will return the number of elements; for example
1347
1348         $n = $s->qmatch('person', (name=>'fred'));
1349
1350       the value of $n will be equal to the number of persons called fred
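
       The SELECT analogy and the scalar-context count can be sketched on
       nested arrays. This is an illustration, not Data::Stag code:
       sketch_qmatch is a hypothetical helper, and it assumes the match
       element is a direct leaf child of the returned element, which the
       text above does not spell out.

```perl
use strict;
use warnings;

# Hypothetical helper, not Data::Stag code: every node called $return_el
# that has a direct leaf child $match_el whose value equals $match_val.
sub sketch_qmatch {
    my ($node, $return_el, $match_el, $match_val) = @_;
    my ($element, $data) = @$node;
    my @hits;
    if (ref($data) eq 'ARRAY') {
        my $matched = grep {
            $_->[0] eq $match_el && !ref($_->[1]) && $_->[1] eq $match_val
        } @$data;
        push @hits, $node if $element eq $return_el && $matched;
        push @hits, sketch_qmatch($_, $return_el, $match_el, $match_val)
            for @$data;
    }
    return @hits;    # an array, so scalar context yields the count
}

my $s = [ people => [
    [ person => [ [ name => 'fred' ], [ job => 'bus driver' ] ] ],
    [ person => [ [ name => 'jim'  ], [ job => 'poet' ] ] ],
    [ person => [ [ name => 'fred' ], [ job => 'poet' ] ] ],
] ];

my @persons = sketch_qmatch($s, 'person', 'name', 'fred');   # list of nodes
my $n       = sketch_qmatch($s, 'person', 'name', 'fred');   # number of nodes
```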
1351
1352       tmatch (tm)
1353
1354              Title: tmatch
1355            Synonym: tm
1356
1357               Args: element str, value str
1358             Return: bool
1359            Example: @persons = grep {$_->tmatch('name', 'fred')} @persons
1360
       returns true if the value of the specified element matches - see
       previous example in this doc
1363
1364       tmatchhash (tmh)
1365
1366              Title: tmatchhash
1367            Synonym: tmh
1368
1369               Args: match hashref
1370             Return: bool
1371            Example: @persons = grep {$_->tmatchhash({name=>'fred', hair_colour=>'green'})} @persons
1372
       returns true if the node matches a set of constraints, specified as
       a hash.
1375
1376       tmatchnode (tmn)
1377
1378              Title: tmatchnode
1379            Synonym: tmn
1380
1381               Args: match node
1382             Return: bool
1383            Example: @persons = grep {$_->tmatchnode([person=>[[name=>'fred'], [hair_colour=>'green']]])} @persons
1384
       returns true if the node matches a set of constraints, specified as
       a node
1387
1388       cmatch (cm)
1389
              Title: cmatch
            Synonym: cm

               Args: element str, value str
             Return: int
            Example: $n_freds = $personset->cmatch('name', 'fred');
1396
1397       counts the number of matches
1398
1399       where (w)
1400
1401              Title: where
1402            Synonym: w
1403
1404               Args: element str, test CODE
1405             Return: Node[]
1406            Example: @rich_persons = $data->where('person', sub {shift->get_salary > 100000});
1407
1408       the tree is queried for all elements of the specified type that satisfy
1409       the coderef (must return a boolean)
1410
         my @rich_dog_or_cat_owners =
           $data->where('person',
                        sub {my $p = shift;
                             $p->get_salary > 100000 &&
                             $p->where('pet',
                                       sub {shift->get_type =~ /(dog|cat)/})});
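
       The nested query above can be mimicked on plain nested arrays. This
       is a sketch only, under the assumption that nodes are bare array
       references: sketch_where and kid_value are hypothetical helpers that
       stand in for where and the get_* accessors, and the sample data is
       invented.

```perl
use strict;
use warnings;

# Hypothetical helpers on plain nested arrays, not Data::Stag code.

# Data value of the first direct child called $name (undef if absent).
sub kid_value {
    my ($node, $name) = @_;
    my ($kid) = grep { $_->[0] eq $name } @{ $node->[1] };
    return $kid ? $kid->[1] : undef;
}

# All nodes called $name, at any depth, for which $test returns true.
sub sketch_where {
    my ($node, $name, $test) = @_;
    my ($element, $data) = @$node;
    my @hits;
    push @hits, $node if $element eq $name && $test->($node);
    if (ref($data) eq 'ARRAY') {
        push @hits, sketch_where($_, $name, $test) for @$data;
    }
    return @hits;
}

my $tree = [ people => [
    [ person => [ [ name => 'jim' ],  [ salary => 150000 ],
                  [ pet => [ [ type => 'dog' ] ] ] ] ],
    [ person => [ [ name => 'fred' ], [ salary => 200000 ],
                  [ pet => [ [ type => 'fish' ] ] ] ] ],
] ];

my @rich_dog_or_cat_owners = sketch_where($tree, 'person', sub {
    my $p = shift;
    kid_value($p, 'salary') > 100000
        && scalar sketch_where($p, 'pet',
               sub { kid_value(shift, 'type') =~ /^(?:dog|cat)$/ });
});
```

       The inner call is forced into scalar context so that its match count
       acts as the boolean the outer test needs.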
1417
1418       iterate (i)
1419
1420              Title: iterate
1421            Synonym: i
1422
1423               Args: CODE
1424             Return: Node[]
1425            Example: $data->iterate(sub {
1426                                        my $stag = shift;
1427                                        my $parent = shift;
1428                                        if ($stag->element eq 'pet') {
1429                                            $parent->set_pet_name($stag->get_name);
1430                                        }
1431                                    });
1432
1433       iterates through whole tree calling the specified subroutine.
1434
1435       the first arg passed to the subroutine is the stag node representing
1436       the tree at that point; the second arg is for the parent.
1437
1438       for instance, the example code above would turn this
1439
1440         (person
1441          (name "jim")
1442          (pet
1443           (name "fluffy")))
1444
1445       into this
1446
1447         (person
1448          (name "jim")
1449          (pet_name "fluffy")
1450          (pet
1451           (name "fluffy")))
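
       The calling convention (current node first, parent second) can be
       sketched as a tree walk over nested arrays. This is an illustration
       under assumptions: sketch_iterate is a hypothetical helper, the
       pre-order visiting sequence is a guess, and the callback here only
       records what it sees instead of modifying the tree.

```perl
use strict;
use warnings;

# Hypothetical helper, not Data::Stag code: call $code on every node,
# passing the node and its parent (undef for the root).
sub sketch_iterate {
    my ($node, $code, $parent) = @_;
    $code->($node, $parent);
    my $data = $node->[1];
    if (ref($data) eq 'ARRAY') {
        sketch_iterate($_, $code, $node) for @$data;
    }
    return;
}

my $tree = [ person => [ [ name => 'jim' ],
                         [ pet  => [ [ name => 'fluffy' ] ] ] ] ];

my @seen;
sketch_iterate($tree, sub {
    my ($node, $parent) = @_;
    # Record "parent/element", or just the element name for the root.
    push @seen, ($parent ? "$parent->[0]/" : '') . $node->[0];
});
```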
1452
1453       MISCELLANEOUS METHODS
1454
1455       duplicate (d)
1456
1457              Title: duplicate
1458            Synonym: d
1459
1460               Args:
1461             Return: Node
1462            Example: $node2 = $node->duplicate;
1463
1464       does a deep copy of a stag structure
1465
1466       isanode
1467
1468              Title: isanode
1469
1470               Args:
1471             Return: bool
1472            Example: if (stag_isanode($node)) { ... }
1473
1474       hash
1475
1476              Title: hash
1477
1478               Args:
1479             Return: hash
1480            Example: $h = $node->hash;
1481
1482       turns a tree into a hash. all data values will be arrayrefs
1483
1484       pairs
1485
1486              Title: pairs
1487
1488       turns a tree into a hash. all data values will be scalar (IMPORTANT:
1489       this means duplicate values will be lost)
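
       The difference between hash and pairs can be shown on a flat node.
       This is a minimal sketch, not the module's implementation:
       sketch_hash and sketch_pairs are hypothetical helpers, they look
       only at direct leaf children, and the fact that the last duplicate
       wins in sketch_pairs is a property of this sketch only.

```perl
use strict;
use warnings;

# Hypothetical helpers, not Data::Stag code; direct children only.

# Every value is kept, so each hash entry is an array reference.
sub sketch_hash {
    my ($node) = @_;
    my %h;
    push @{ $h{ $_->[0] } }, $_->[1] for @{ $node->[1] };
    return \%h;
}

# One scalar per key, so repeated elements overwrite each other.
sub sketch_pairs {
    my ($node) = @_;
    my %h;
    $h{ $_->[0] } = $_->[1] for @{ $node->[1] };
    return %h;
}

my $person = [ person => [ [ name     => 'fred' ],
                           [ phone_no => '1-800-111-2222' ],
                           [ phone_no => '1-415-555-5555' ] ] ];

my $h = sketch_hash($person);     # phone_no holds both numbers
my %p = sketch_pairs($person);    # phone_no holds a single number
```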
1490
1491       write
1492
1493              Title: write
1494
1495               Args: filename str, format str[optional]
1496             Return:
1497            Example: $node->write("myfile.xml");
1498            Example: $node->write("myfile", "itext");
1499
1500       will try and guess the format from the extension if not specified
1501
1502       xml
1503
1504              Title: xml
1505
1511               Args:
1512             Return: xml str
1513            Example: print $node->xml;
1514
1515       XML METHODS
1516
1517       xslt
1518
1519              Title: xslt
1520
1521               Args: xslt_file str
1522             Return: Node
1523            Example: $new_stag = $stag->xslt('mytransform.xsl');
1524
1525       transforms a stag tree using XSLT
1526
1527       xsltstr
1528
1529              Title: xsltstr
1530
1531               Args: xslt_file str
1532             Return: str
1533            Example: print $stag->xsltstr('mytransform.xsl');
1534
1535       As above, but returns the string of the resulting transform, rather
1536       than a stag tree
1537
1538       sax
1539
1540              Title: sax
1541
1542               Args: saxhandler SAX-CLASS
1543             Return:
1544            Example: $node->sax($mysaxhandler);
1545
1546       turns a tree into a series of SAX events
1547
1548       xpath (xp tree2xpath)
1549
1550              Title: xpath
1551            Synonym: xp
1552            Synonym: tree2xpath
1553
1554               Args:
1555             Return: xpath object
1556            Example: $xp = $node->xpath; $q = $xp->find($xpathquerystr);
1557
1558       xpquery (xpq xpathquery)
1559
              Title: xpquery
            Synonym: xpq
            Synonym: xpathquery

               Args: xpathquery str
             Return: Node[]
            Example: @nodes = $node->xpq($xpathquerystr);
1567

STAG SCRIPTS

1569       The following scripts come with the stag module
1570
1571       stag-autoschema.pl
1572           writes the implicit stag-schema for a stag file
1573
1574       stag-db.pl
1575           persistent storage and retrieval for stag data (xml, sxpr, itext)
1576
1577       stag-diff.pl
1578           finds the difference between two stag files
1579
1580       stag-drawtree.pl
1581           draws a stag file (xml, itext, sxpr) as a PNG diagram
1582
1583       stag-filter.pl
1584           filters a stag file (xml, itext, sxpr) for nodes of interest
1585
1586       stag-findsubtree.pl
1587           finds nodes in a stag file
1588
1589       stag-flatten.pl
1590           turns stag data into a flat table
1591
1592       stag-grep.pl
1593           filters a stag file (xml, itext, sxpr) for nodes of interest
1594
1595       stag-handle.pl
1596           streams a stag file through a handler into a writer
1597
1598       stag-join.pl
1599           joins two stag files together based around common key
1600
1601       stag-mogrify.pl
1602           mangle stag files
1603
1604       stag-parse.pl
1605           parses a file and fires events (e.g. sxpr to xml)
1606
1607       stag-query.pl
           aggregate queries
1609
1610       stag-split.pl
1611           splits a stag file (xml, itext, sxpr) into multiple files
1612
1613       stag-splitter.pl
1614           splits a stag file into multiple files
1615
1616       stag-view.pl
1617           draws an expandable Tk tree diagram showing stag data
1618
1619       To get more documentation, type
1620
         stag-<script>.pl -h
1622

BUGS

1624       none known so far, possibly quite a few undocumented features!
1625
1626       Not a bug, but the underlying default datastructure of nested arrays is
1627       more heavyweight than it needs to be. More lightweight implementations
1628       are possible. Some time I will write a C implementation.
1629

WEBSITE

1631       <http://stag.sourceforge.net>
1632

AUTHOR

1634       Chris Mungall <cjm AT fruitfly DOT org>
1635
1637       Copyright (c) 2004 Chris Mungall
1638
1639       This module is free software.  You may distribute this module under the
1640       same terms as perl itself
1641
1642
1643
1644perl v5.8.8                       2005-12-16                     Data::Stag(3)