Data::Stag(3) User Contributed Perl Documentation Data::Stag(3)

Data::Stag - Structured Tags datastructures
7
    # PROCEDURAL USAGE
    use Data::Stag qw(:all);
    $doc = stag_parse($file);
    @persons = stag_find($doc, "person");
    foreach $p (@persons) {
        printf "%s, %s phone: %s\n",
          stag_sget($p, "family_name"),
          stag_sget($p, "given_name"),
          stag_sget($p, "phone_no"),
        ;
    }

    # OBJECT-ORIENTED USAGE
    use Data::Stag;
    $doc = Data::Stag->parse($file);
    @persons = $doc->find("person");
    foreach $p (@persons) {
        printf "%s, %s phone: %s\n",
          $p->sget("family_name"),
          $p->sget("given_name"),
          $p->sget("phone_no"),
        ;
    }
32
This module is for manipulating data as hierarchical tag/value pairs
(Structured TAGs or Simple Tree AGgregates). These datastructures can
be represented as nested arrays, which have the advantage of being
native to perl. A simple example is shown below:
38
39 [ person=> [ [ family_name => $family_name ],
40 [ given_name => $given_name ],
41 [ phone_no => $phone_no ] ] ],
42
43 Data::Stag uses a subset of XML for import and export. This means the
44 module can also be used as a general XML parser/writer (with certain
45 caveats).
46
47 The above set of structured tags can be represented in XML as
48
49 <person>
50 <family_name>...</family_name>
51 <given_name>...</given_name>
52 <phone_no>...</phone_no>
53 </person>
54
55 This datastructure can be examined, manipulated and exported using Stag
56 functions or methods:
57
    $document = Data::Stag->parse($file);
    @persons = $document->find('person');
    foreach my $person (@persons) {
        $person->set('full_name',
                     $person->sget('given_name') . ' ' .
                     $person->sget('family_name'));
    }
65
66 Advanced querying is performed by passing functions, for example:
67
68 # get all people in dataset with name starting 'A'
69 @persons =
70 $document->where('person',
71 sub {shift->sget('family_name') =~ /^A/});
72
One of the things that marks this module out against other XML modules
is this emphasis on a functional approach alongside an object-oriented
or procedural approach.

For full information on the stag project, see
<http://stag.sourceforge.net>
79
80 PROCEDURAL VS OBJECT-ORIENTED USAGE
81
Depending on your preference, this module can be used as a set of
procedural subroutine calls, or as method calls upon Data::Stag
objects, or both.
85
86 In procedural mode, all the subroutine calls are prefixed "stag_" to
87 avoid namespace clashes. The following three calls are equivalent:
88
89 $person = stag_find($doc, "person");
90 $person = $doc->find("person");
91 $person = $doc->find_person;
92
93 In object mode, you can treat any tree element as if it is an object
94 with automatically defined methods for getting/setting the tag values.
95
96 USE OF XML
97
98 Nested arrays can be imported and exported as XML, as well as other
99 formats. XML can be slurped into memory all at once (using less memory
100 than an equivalent DOM tree), or a simplified SAX style event handling
101 model can be used. Similarly, data can be exported all at once, or as a
102 series of events.
103
104 Although this module can be used as a general XML tool, it is intended
105 primarily as a tool for manipulating hierarchical data using nested
106 tag/value pairs.
107
108 This module is more suited to dealing with data-oriented documents than
109 text-oriented documents.
110
111 By using a simpler subset of XML equivalent to a basic data tree struc‐
112 ture, we can write simpler, cleaner code.
113
114 This module is ideally suited to element-only XML (that is, XML without
115 attributes or mixed elements).
116
117 If you are using attributes or mixed elements, it is useful to know
118 what is going on under the hood.
119
120 All attributes are turned into elements; they are nested inside an ele‐
121 ment with name '@'.
122
123 For example, the following piece of XML
124
125 <foo id="x">
126 <bar>ugh</bar>
127 </foo>
128
129 Gets represented internally as
130
131 <foo>
132 <@>
133 <id>x</id>
134 </@>
135 <bar>ugh</bar>
136 </foo>
137
138 Of course, this is not valid XML. However, it is just an internal rep‐
139 resentation - when exporting back to XML it will look like normal XML
140 with attributes again.
141
142 Mixed content cannot be represented in a simple tree format, so this is
143 also expanded.
144
145 The following piece of XML
146
147 <paragraph id="1" color="green">
148 example of <bold>mixed</bold>content
149 </paragraph>
150
151 gets parsed as if it were actually:
152
153 <paragraph>
154 <@>
155 <id>1</id>
156 <color>green</color>
157 </@>
158 <.>example of</.>
159 <bold>mixed</bold>
160 <.>content</.>
161 </paragraph>
162
When using stag with XML that has attributes or mixed content, you can
treat '@' and '.' as normal elements.
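
As a rough illustration in plain Perl (no Data::Stag calls), the expanded
paragraph example above corresponds to the nested array below. The helper
kids_named is made up for this sketch; it only shows that '@' and '.' can be
looked up like any other element name.

```perl
use strict;
use warnings;

# The <paragraph> example after expansion: attributes live under '@',
# text fragments of mixed content live under '.'.
my $paragraph =
  [ paragraph => [
      [ '@' => [
          [ id    => '1' ],
          [ color => 'green' ] ] ],
      [ '.'  => 'example of' ],
      [ bold => 'mixed' ],
      [ '.'  => 'content' ] ] ];

# Hypothetical helper: data of the direct children with a given name.
sub kids_named {
    my ($node, $name) = @_;
    return map { $_->[1] } grep { $_->[0] eq $name } @{ $node->[1] };
}

my ($attrs) = kids_named($paragraph, '@');         # array ref of attribute nodes
my %attr    = map { $_->[0] => $_->[1] } @$attrs;  # attribute name => value
my @text    = kids_named($paragraph, '.');         # the text fragments, in order

print "color=$attr{color}\n";
print join(' | ', @text), "\n";
```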
165
166 SAX
167
168 This module can also be used as part of a SAX-style event generation /
169 handling framework - see Data::Stag::BaseHandler
170
171 PERL REPRESENTATION
172
173 Because nested arrays are native to perl, we can specify an XML datas‐
174 tructure directly in perl without going through multiple object calls.
175
176 For example, instead of using XML::Writer for the lengthy
177
    $obj->startTag("record");
    $obj->startTag("field1");
    $obj->characters("foo");
    $obj->endTag("field1");
    $obj->startTag("field2");
    $obj->characters("bar");
    $obj->endTag("field2");
    $obj->endTag("record");
186
187 We can instead write
188
189 $struct = [ record => [
190 [ field1 => 'foo'],
191 [ field2 => 'bar']]];
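
To make the correspondence concrete, here is a minimal sketch of how such a
nested array could be walked and written out as XML. The function to_xml is
hypothetical and is not how Data::Stag itself writes XML; escaping of special
characters is left out to keep it short.

```perl
use strict;
use warnings;

my $struct = [ record => [
                 [ field1 => 'foo' ],
                 [ field2 => 'bar' ] ] ];

# Hypothetical helper (not part of Data::Stag): serialise a node to XML.
# Escaping of &, < and > is omitted to keep the sketch short.
sub to_xml {
    my ($node, $depth) = @_;
    $depth //= 0;
    my ($name, $data) = @$node;
    my $pad = '  ' x $depth;
    if (ref($data) eq 'ARRAY') {          # non-terminal: recurse into children
        return "$pad<$name>\n"
             . join('', map { to_xml($_, $depth + 1) } @$data)
             . "$pad</$name>\n";
    }
    return "$pad<$name>$data</$name>\n";  # terminal: emit the value
}

my $xml = to_xml($struct);
print $xml;
```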
192
193 PARSING
194
195 The following example is for parsing out subsections of a tree and
196 changing sub-elements
197
    use Data::Stag qw(:all);
    my $tree = stag_parse($xmlfile);
    my ($subtree) = stag_findnode($tree, $element);
    stag_set($subtree, $sub_element, $new_val);
    print stag_xml($subtree);
203
204 OBJECT ORIENTED
205
206 The same can be done in a more OO fashion
207
    use Data::Stag qw(:all);
    my $tree = Data::Stag->parse($xmlfile);
    my ($subtree) = $tree->findnode($element);
    $subtree->set($sub_element, $new_val);
    print $subtree->xml;
213
214 IN A STREAM
215
Rather than parsing a whole file into memory all at once (which may
not be suitable for very large files), you can take an event handling
approach. The easiest way to do this is to register which nodes in the
file you are interested in using the makehandler method. The parser
will sweep through the file, building objects as it goes, and handing
each object to a subroutine that you specify.
222
223 For example:
224
225 use Data::Stag;
226 # catch the end of 'person' elements
227 my $h = Data::Stag->makehandler( person=> sub {
228 my ($self, $person) = @_;
229 printf "name:%s phone:%s\n",
230 $person->get_name,
231 $person->get_phone;
232 return; # clear node
233 });
234 Data::Stag->parse(-handler=>$h,
235 -file=>$f);
236
237 see Data::Stag::BaseHandler for writing handlers
238
239 See the Stag website at <http://stag.sourceforge.net> for more exam‐
240 ples.
241
242 STRUCTURED TAGS TREE DATA STRUCTURE
243
A tree of structured tags is represented as a recursively nested array;
the elements of the array represent nodes in the tree.

A node is a name/data pair that can represent tags and values. A node
is represented using a reference to an array, where the first element
of the array is the tagname, or element, and the second element is the
data.
251
252 This can be visualised as a box:
253
254 +-----------+
255 ⎪Name ⎪ Data⎪
256 +-----------+
257
258 In perl, we represent this pair as a reference to an array
259
260 [ Name => $Data ]
261
262 The Data can either be a list of child nodes (subtrees), or a data
263 value.
264
The terminal nodes (leaves of the tree) contain data values; these are
represented in perl using primitive scalars.
267
268 For example:
269
270 [ Name => 'Fred' ]
271
For non-terminal nodes, the Data is a reference to an array, where each
element of the array is a new node.
274
275 +-----------+
276 ⎪Name ⎪ Data⎪
277 +-----------+
278 ⎪⎪⎪ +-----------+
279 ⎪⎪+-->⎪Name ⎪ Data⎪
280 ⎪⎪ +-----------+
281 ⎪⎪
282 ⎪⎪ +-----------+
283 ⎪+--->⎪Name ⎪ Data⎪
284 ⎪ +-----------+
285 ⎪
286 ⎪ +-----------+
287 +---->⎪Name ⎪ Data⎪
288 +-----------+
289
290 In perl this would be:
291
292 [ Name => [
293 [Name1 => $Data1],
294 [Name2 => $Data2],
295 [Name3 => $Data3],
296 ]
297 ];
298
The extra level of nesting is required to be able to store any node in
the tree using a single variable. This representation has lots of
advantages over others, e.g. hashes and mixed hash/array structures.
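
The terminal/non-terminal distinction can be checked directly in plain Perl.
The sketch below uses two hypothetical helpers, is_terminal and leaves, that
are not part of Data::Stag; they only rely on the [ Name => $Data ] layout
described above.

```perl
use strict;
use warnings;

my $tree = [ person => [
               [ name    => 'Fred' ],
               [ address => [
                   [ street => '1 Main St' ],
                   [ city   => 'Springfield' ] ] ] ] ];

# A node is terminal when its data slot is not an array reference.
sub is_terminal {
    my ($node) = @_;
    return ref($node->[1]) ne 'ARRAY';
}

# Collect "path = value" strings for every leaf, depth first.
sub leaves {
    my ($node, $prefix) = @_;
    my $path = defined $prefix ? "$prefix/$node->[0]" : $node->[0];
    return ("$path = $node->[1]") if is_terminal($node);
    return map { leaves($_, $path) } @{ $node->[1] };
}

my @leaves = leaves($tree);
print "$_\n" for @leaves;
```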
302
303 MANIPULATION AND QUERYING
304
The following example is taken from biology; we have a list of species
(mouse, human, fly) and a list of genes found in those species. These
are cross-referenced by an identifier called tax_id. We can do a
relational-style inner join on this identifier, as follows:
309
310 use Data::Stag qw(:all);
311 my $tree =
312 Data::Stag->new(
313 'db' => [
314 [ 'species_set' => [
315 [ 'species' => [
316 [ 'common_name' => 'house mouse' ],
317 [ 'binomial' => 'Mus musculus' ],
318 [ 'tax_id' => '10090' ]]],
319 [ 'species' => [
320 [ 'common_name' => 'fruit fly' ],
321 [ 'binomial' => 'Drosophila melanogaster' ],
322 [ 'tax_id' => '7227' ]]],
323 [ 'species' => [
324 [ 'common_name' => 'human' ],
325 [ 'binomial' => 'Homo sapiens' ],
326 [ 'tax_id' => '9606' ]]]]],
327 [ 'gene_set' => [
328 [ 'gene' => [
329 [ 'symbol' => 'HGNC' ],
330 [ 'tax_id' => '9606' ],
331 [ 'phenotype' => 'Hemochromatosis' ],
332 [ 'phenotype' => 'Porphyria variegata' ],
333 [ 'GO_term' => 'iron homeostasis' ],
334 [ 'map' => '6p21.3' ]]],
335 [ 'gene' => [
336 [ 'symbol' => 'Hfe' ],
337 [ 'synonym' => 'MR2' ],
338 [ 'tax_id' => '10090' ],
339 [ 'GO_term' => 'integral membrane protein' ],
340 [ 'map' => '13 A2-A4' ]]]]]]
341 );
342
343 # inner join of species and gene parts of tree,
344 # based on 'tax_id' element
345 my $gene_set = $tree->find("gene_set"); # get <gene_set> element
346 my $species_set = $tree->find("species_set"); # get <species_set> element
347 $gene_set->ijoin("gene", "tax_id", $species_set); # INNER JOIN
348
349 print "Reorganised data:\n";
350 print $gene_set->xml;
351
    # find all genes starting with letter 'H' where species/common_name=human
353 my @genes =
354 $gene_set->where('gene',
355 sub { my $g = shift;
356 $g->get_symbol =~ /^H/ &&
357 $g->findval("common_name") eq ('human')});
358
359 print "Human genes beginning 'H'\n";
360 print $_->xml foreach @genes;
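
For readers unfamiliar with inner joins, the relational idea can be sketched
on plain nested arrays. This is an illustration only: the helper child_val is
made up, and the shape of the joined tree (the tax_id child replaced by the
matching species node) is an assumption of the sketch that may differ from
what ijoin actually produces.

```perl
use strict;
use warnings;

my $species_set = [ species_set => [
    [ species => [ [ common_name => 'house mouse' ], [ tax_id => '10090' ] ] ],
    [ species => [ [ common_name => 'human' ],       [ tax_id => '9606'  ] ] ] ] ];
my $gene_set = [ gene_set => [
    [ gene => [ [ symbol => 'HGNC' ], [ tax_id => '9606'  ] ] ],
    [ gene => [ [ symbol => 'Hfe'  ], [ tax_id => '10090' ] ] ] ] ];

# Hypothetical helper: data of the first direct child with the given name.
sub child_val {
    my ($node, $name) = @_;
    my ($kid) = grep { $_->[0] eq $name } @{ $node->[1] };
    return $kid ? $kid->[1] : undef;
}

# Index the species nodes by the join key.
my %species_by_id;
for my $sp (@{ $species_set->[1] }) {
    $species_by_id{ child_val($sp, 'tax_id') } = $sp;
}

# Replace each gene's tax_id child with the matching species node;
# genes with no matching species are dropped (inner join).
my @joined;
for my $gene (@{ $gene_set->[1] }) {
    my $sp = $species_by_id{ child_val($gene, 'tax_id') } or next;
    push @joined, [ gene => [
        map { $_->[0] eq 'tax_id' ? $sp : $_ } @{ $gene->[1] } ] ];
}
$gene_set->[1] = \@joined;

for my $gene (@joined) {
    my ($sp) = grep { $_->[0] eq 'species' } @{ $gene->[1] };
    printf "%s: %s\n", child_val($gene, 'symbol'), child_val($sp, 'common_name');
}
```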
361
362 S-Expression (Lisp) representation
363
364 The data represented using this module can be represented as Lisp-style
365 S-Expressions.
366
367 See Data::Stag::SxprParser and Data::Stag::SxprWriter
368
369 If we execute this code on the XML from the example above
370
371 $stag = Data::Stag->parse($xmlfile);
372 print $stag->sxpr;
373
374 The following S-Expression will be printed:
375
376 '(db
377 (species_set
378 (species
379 (common_name "house mouse")
380 (binomial "Mus musculus")
381 (tax_id "10090"))
382 (species
383 (common_name "fruit fly")
384 (binomial "Drosophila melanogaster")
385 (tax_id "7227"))
386 (species
387 (common_name "human")
388 (binomial "Homo sapiens")
389 (tax_id "9606")))
390 (gene_set
391 (gene
392 (symbol "HGNC")
393 (tax_id "9606")
394 (phenotype "Hemochromatosis")
395 (phenotype "Porphyria variegata")
396 (GO_term "iron homeostasis")
397 (map
398 (cytological
399 (chromosome "6")
400 (band "p21.3"))))
401 (gene
402 (symbol "Hfe")
403 (synonym "MR2")
404 (tax_id "10090")
405 (GO_term "integral membrane protein")))
406 (similarity_set
407 (pair
408 (symbol "HGNC")
409 (symbol "Hfe"))
410 (pair
411 (symbol "WNT3A")
412 (symbol "Wnt3a"))))
413
414 TIPS FOR EMACS USERS AND LISP PROGRAMMERS
415
416 If you use emacs, you can save this as a file with the ".el" suffix and
417 get syntax highlighting for editing this file. Quotes around the termi‐
418 nal node data items are optional.
419
420 If you know emacs lisp or any other lisp, this also turns out to be a
421 very nice language for manipulating these datastructures. Try copying
422 and pasting the above s-expression to the emacs scratch buffer and
423 playing with it in lisp.
424
425 INDENTED TEXT REPRESENTATION
426
427 Data::Stag has its own text format for writing data trees. Again, this
428 is only possible because we are working with a subset of XML (no
429 attributes, no mixed elements). The data structure above can be written
430 as follows -
431
432 db:
433 species_set:
434 species:
435 common_name: house mouse
436 binomial: Mus musculus
437 tax_id: 10090
438 species:
439 common_name: fruit fly
440 binomial: Drosophila melanogaster
441 tax_id: 7227
442 species:
443 common_name: human
444 binomial: Homo sapiens
445 tax_id: 9606
446 gene_set:
447 gene:
448 symbol: HGNC
449 tax_id: 9606
450 phenotype: Hemochromatosis
451 phenotype: Porphyria variegata
452 GO_term: iron homeostasis
453 map: 6p21.3
454 gene:
455 symbol: Hfe
456 synonym: MR2
457 tax_id: 10090
458 GO_term: integral membrane protein
459 map: 13 A2-A4
460 similarity_set:
461 pair:
462 symbol: HGNC
463 symbol: Hfe
464 pair:
465 symbol: WNT3A
466 symbol: Wnt3a
467
468 See Data::Stag::ITextParser and Data::Stag::ITextWriter
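
The indentation-based format is simple enough that a reader can be sketched
in a few lines. The function parse_itext below is hypothetical and is not the
Data::Stag::ITextParser implementation; it assumes that nesting is expressed
purely by leading whitespace and that every line has the form "name: value"
or "name:". Malformed input (for example, a child indented under a leaf) is
not handled.

```perl
use strict;
use warnings;

# Hypothetical reader for indented "name: value" text.
sub parse_itext {
    my ($text) = @_;
    my $root  = [ '' => [] ];
    my @stack = ([ -1, $root ]);             # [indent, node] pairs
    for my $line (split /\n/, $text) {
        next unless $line =~ /^(\s*)([^:\s]+):\s*(.*)$/;
        my ($indent, $name, $value) = (length($1), $2, $3);
        # Close every open node that is not an ancestor of this line.
        pop @stack while $stack[-1][0] >= $indent;
        # A line with no value opens a non-terminal node.
        my $node = [ $name => (length($value) ? $value : []) ];
        push @{ $stack[-1][1][1] }, $node;
        push @stack, [ $indent, $node ];
    }
    return $root->[1][0];                    # first top-level node
}

my $text = <<'END';
db:
  species_set:
    species:
      common_name: house mouse
      tax_id: 10090
    species:
      common_name: human
      tax_id: 9606
END

my $tree = parse_itext($text);
print "root element: $tree->[0]\n";
```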
469
470 NESTED ARRAY SPECIFICATION II
471
472 To avoid excessive square bracket usage, you can specify a structure
473 like this:
474
475 use Data::Stag qw(:all);
476
477 *N = \&stag_new;
478 my $tree =
479 N(top=>[
480 N('personset'=>[
481 N('person'=>[
482 N('name'=>'davey'),
483 N('address'=>'here'),
484 N('description'=>[
485 N('hair'=>'green'),
486 N('eyes'=>'two'),
487 N('teeth'=>5),
488 ]
489 ),
490 N('pets'=>[
491 N('petname'=>'igor'),
492 N('petname'=>'ginger'),
493 ]
494 ),
495
496 ],
497 ),
498 N('person'=>[
499 N('name'=>'shuggy'),
500 N('address'=>'there'),
501 N('description'=>[
502 N('hair'=>'red'),
503 N('eyes'=>'three'),
504 N('teeth'=>1),
505 ]
506 ),
507 N('pets'=>[
508 N('petname'=>'thud'),
509 N('petname'=>'spud'),
510 ]
511 ),
512 ]
513 ),
514 ]
515 ),
516 N('animalset'=>[
517 N('animal'=>[
518 N('name'=>'igor'),
519 N('class'=>'rat'),
520 N('description'=>[
521 N('fur'=>'white'),
522 N('eyes'=>'red'),
523 N('teeth'=>50),
524 ],
525 ),
526 ],
527 ),
528 ]
529 ),
530
531 ]
532 );
533
534 # find all people
535 my @persons = stag_find($tree, 'person');
536
537 # write xml for all red haired people
538 foreach my $p (@persons) {
539 print stag_xml($p)
540 if stag_tmatch($p, "hair", "red");
541 } ;
542
543 # find all people that have name == shuggy
544 my @p =
545 stag_qmatch($tree,
546 "person",
547 "name",
548 "shuggy");
549
551 As well as the methods listed below, a node can be treated as if it is
552 a data object of a class determined by the element.
553
554 For example, the following are equivalent.
555
556 $node->get_name;
557 $node->get('name');
558
559 $node->set_name('fred');
560 $node->set('name', 'fred');
561
562 This is really just syntactic sugar. The autoloaded methods are not
563 checked against any schema, although this may be added in future.
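
The kind of sugar described above is commonly built with Perl's AUTOLOAD
mechanism. The sketch below uses a made-up stand-in class, My::Node, to show
the idea; it is not the Data::Stag source, and its get and set methods are
deliberately simplified.

```perl
use strict;
use warnings;

package My::Node;   # hypothetical stand-in class, not Data::Stag itself

our $AUTOLOAD;

sub new { my ($class, $name, $kids) = @_; return bless [ $name, $kids ], $class }

# Values of all direct children called $name.
sub get {
    my ($self, $name) = @_;
    my @vals = map { $_->[1] } grep { $_->[0] eq $name } @{ $self->[1] };
    return wantarray ? @vals : $vals[0];
}

# Replace the first matching child, or append when there is none.
sub set {
    my ($self, $name, $val) = @_;
    my ($kid) = grep { $_->[0] eq $name } @{ $self->[1] };
    if ($kid) { $kid->[1] = $val }
    else      { push @{ $self->[1] }, [ $name => $val ] }
    return $val;
}

# Turn $node->get_foo / $node->set_foo(...) into get('foo') / set('foo', ...).
sub AUTOLOAD {
    my $self = shift;
    my ($method) = $AUTOLOAD =~ /([^:]+)$/;
    return if $method eq 'DESTROY';
    if ($method =~ /^(get|set)_(\w+)$/) {
        my ($op, $element) = ($1, $2);
        return $self->$op($element, @_);
    }
    die "no such method: $method";
}

package main;

my $node = My::Node->new(person => [ [ name => 'shuggy' ] ]);
$node->set_name('fred');
my $name = $node->get_name;
print "$name\n";
```

Because the method name is only examined at call time, nothing stops a caller
from asking for an element that does not exist, which matches the remark that
the autoloaded methods are not checked against a schema.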
564
566 A stag tree can be indexed as a hash for direct retrieval; see
567 Data::Stag::HashDB
568
569 This index can be made persistent as a DB file; see Data::Stag::StagDB
570
571 If you wish to use Stag in conjunction with a relational database, you
572 should install DBIx::DBStag
573
575 All method calls are also available as procedural subroutine calls;
576 unless otherwise noted, the subroutine call is the same as the method
577 call, but with the string stag_ prefixed to the method name. The first
578 argument should be a Data::Stag datastructure.
579
580 To import all subroutines into the current namespace, use this idiom:
581
582 use Data::Stag qw(:all);
583 $doc = stag_parse($file);
584 @persons = stag_find($doc, 'person');
585
586 If you wish to use this module procedurally, and you are too lazy to
587 prefix all calls with stag_, use this idiom:
588
589 use Data::Stag qw(:lazy);
590 $doc = parse($file);
591 @persons = find($doc, 'person');
592
593 But beware of clashes!
594
Most method calls also have a handy short mnemonic. Use of these is
optional. Software engineering types prefer longer names, in the belief
that this leads to clearer code. Hacker types prefer shorter names, as
this requires fewer keystrokes and leads to a more compact
representation of the code. It is expected that if you do use this
module, then its usage will be fairly ubiquitous within your code, and
the mnemonics will become familiar, much like the qw and s/ operators
in perl. As always with perl, the decision is yours.
603
604 Some methods take a single parameter or list of parameters; some have
605 large lists of parameters that can be passed in any order. If the docu‐
606 mentation states:
607
608 Args: [x str], [y int], [z ANY]
609
610 Then the method can be called like this:
611
612 $stag->foo("this is x", 55, $ref);
613
614 or like this:
615
616 $stag->foo(-z=>$ref, -x=>"this is x", -y=>55);
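
One common way to support both calling styles is a small argument-rearranging
helper. The sketch below is an assumption about how this could be done, not a
copy of the module's code; rearrange and foo are hypothetical names, and the
test for "named style" (first argument starts with a dash and a letter) is a
simplification.

```perl
use strict;
use warnings;

# Hypothetical helper: map either calling style onto a fixed parameter order.
sub rearrange {
    my ($order, @args) = @_;
    # Positional call unless the first argument looks like "-name".
    return @args unless @args && defined $args[0] && $args[0] =~ /^-[a-z]/i;
    my %named = @args;
    return map { $named{"-$_"} } @$order;
}

# Hypothetical function with the signature [x str], [y int], [z ANY].
sub foo {
    my ($x, $y, $z) = rearrange([qw(x y z)], @_);
    return "x=$x y=$y z=$z";
}

my $positional = foo("this is x", 55, 'ref');
my $named      = foo(-z => 'ref', -x => "this is x", -y => 55);
print "$positional\n$named\n";
```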
617
618 INITIALIZATION METHODS
619
620 new
621
622 Title: new
623
624 Args: element str, data STAG-DATA
625 Returns: Data::Stag node
626 Example: $node = stag_new();
627 Example: $node = Data::Stag->new;
628 Example: $node = Data::Stag->new(person => [[name=>$n], [phone=>$p]]);
629
630 creates a new instance of a Data::Stag node
631
632 stagify (nodify)
633
634 Title: stagify
635 Synonym: nodify
636 Args: data ARRAY-REF
637 Returns: Data::Stag node
638 Example: $node = stag_stagify([person => [[name=>$n], [phone=>$p]]]);
639
640 turns a perl array reference into a Data::Stag node.
641
642 similar to new
643
644 parse
645
646 Title: parse
647
648 Args: [file str], [format str], [handler obj], [fh FileHandle]
649 Returns: Data::Stag node
650 Example: $node = stag_parse($fn);
651 Example: $node = stag_parse(-fh=>$fh, -handler=>$h, -errhandler=>$eh);
652 Example: $node = Data::Stag->parse(-file=>$fn, -handler=>$myhandler);
653
654 slurps a file or string into a Data::Stag node structure. Will guess
655 the format (xml, sxpr, itext, indent) from the suffix if it is not
656 given.
657
658 The format can also be the name of a parsing module, or an actual
659 parser object;
660
661 The handler is any object that can take nested Stag events
662 (start_event, end_event, evbody) which are generated from the parse. If
663 the handler is omitted, all events will be cached and the resulting
664 tree will be returned.
665
666 See Data::Stag::BaseHandler for writing your own handlers
667
668 See Data::Stag::BaseGenerator for details on parser classes, and error
669 handling
670
671 parsestr
672
673 Title: parsestr
674
675 Args: [str str], [format str], [handler obj]
676 Returns: Data::Stag node
677 Example: $node = stag_parsestr('(a (b (c "1")))');
678 Example: $node = Data::Stag->parsestr(-str=>$str, -handler=>$myhandler);
679
680 Similar to parse(), except the first argument is a string
681
682 from
683
684 Title: from
685
686 Args: format str, source str
687 Returns: Data::Stag node
688 Example: $node = stag_from('xml', $fn);
689 Example: $node = stag_from('xmlstr', q[<top><x>1</x></top>]);
690 Example: $node = Data::Stag->from($parser, $fn);
691
692 Similar to parse
693
694 slurps a file or string into a Data::Stag node structure.
695
696 The format can also be the name of a parsing module, or an actual
697 parser object
698
699 unflatten
700
701 Title: unflatten
702
703 Args: data array
704 Returns: Data::Stag node
705 Example: $node = stag_unflatten(person=>[name=>$n, phone=>$p, address=>[street=>$s, city=>$c]]);
706
707 Creates a node structure from a semi-flattened representation, in which
708 children of a node are represented as a flat list of data rather than a
709 list of array references.
710
711 This means a structure can be specified as:
712
713 person=>[name=>$n,
714 phone=>$p,
715 address=>[street=>$s,
716 city=>$c]]
717
718 Instead of:
719
720 [person=>[ [name=>$n],
721 [phone=>$p],
722 [address=>[ [street=>$s],
723 [city=>$c] ] ]
724 ]
725 ]
726
727 The former gets converted into the latter for the internal representa‐
728 tion
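
The conversion can be sketched as a short recursive function. The version
below is a hypothetical re-implementation for illustration, not the module's
own code; it assumes every array reference in the input is a flat list of
name/data pairs.

```perl
use strict;
use warnings;

# Hypothetical re-implementation of the conversion, for illustration only.
sub unflatten {
    my ($name, $data) = @_;
    # A non-array value is a leaf: wrap it as a terminal node.
    return [ $name => $data ] unless ref($data) eq 'ARRAY';
    my @flat = @$data;
    my @kids;
    # Consume the flat list two items at a time: (name, data).
    while (my ($kid_name, $kid_data) = splice @flat, 0, 2) {
        push @kids, unflatten($kid_name, $kid_data);
    }
    return [ $name => \@kids ];
}

my $node = unflatten(person => [ name    => 'fred',
                                 phone   => '555-1111',
                                 address => [ street => '1 Main St',
                                              city   => 'Springfield' ] ]);

print "$node->[0] has ", scalar @{ $node->[1] }, " children\n";
```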
729
730 makehandler
731
732 Title: makehandler
733
734 Args: hash of CODEREFs keyed by element name
735 OR a string containing the name of a module
736 Returns: L<Data::Stag::BaseHandler>
737 Example: $h = Data::Stag->makehandler(%subs);
738 Example: $h = Data::Stag->makehandler("My::FooHandler");
739 Example: $h = Data::Stag->makehandler('xml');
740
This creates a Stag event handler. The argument is a hash of
subroutines keyed by element/node name. After each node is fired by the
parser/generator, the subroutine is called, passing the handler object
and the stag node as arguments. Whatever the subroutine returns is
placed back into the tree.

For example, for a parser/generator that fires events with the
following tree form
749
750 <person>
751 <name>foo</name>
752 ...
753 </person>
754
755 we can create a handler that writes person/name like this:
756
    $h = Data::Stag->makehandler(
        person => sub { my ($self,$stag) = @_;
                        print $stag->get_name;
                        return $stag; # don't change tree
                      });
    $stag = Data::Stag->parse(-str=>"(...)", -handler=>$h);
763
764 See Data::Stag::BaseHandler for details on handlers
765
766 getformathandler
767
768 Title: getformathandler
769
770 Args: format str OR L<Data::Stag::BaseHandler>
771 Returns: L<Data::Stag::BaseHandler>
772 Example: $h = Data::Stag->getformathandler('xml');
773 $h->file("my.xml");
774 Data::Stag->parse(-fn=>$fn, -handler=>$h);
775
776 Creates a Stag event handler - this handler can be passed to an event
777 generator / parser. Built in handlers include:
778
779 xml Generates xml tags from events
780
781 sxpr
782 Generates S-Expressions from events
783
784 itext
785 Generates itext format from events
786
787 indent
788 Generates indent format from events
789
790 All the above are kinds of Data::Stag::Writer
791
792 chainhandler
793
794 Title: chainhandler
795
796 Args: blocked events - str or str[]
797 initial handler - handler object
798 final handler - handler object
799 Returns:
800 Example: $h = Data::Stag->chainhandler('foo', $processor, 'xml')
801
chains handlers together. For example, you may want to make transforms
on an event stream, and then pass the event stream to another handler,
such as an xml handler:
805
806 $processor = Data::Stag->makehandler(
807 a => sub { my ($self,$stag) = @_;
808 $stag->set_foo("bar");
809 return $stag
810 },
811 b => sub { my ($self,$stag) = @_;
812 $stag->set_blah("eek");
813 return $stag
814 },
815 );
816 $chainh = Data::Stag->chainhandler(['a', 'b'], $processor, 'xml');
817 $stag = Data::Stag->parse(-str=>"(...)", -handler=>$chainh)
818
819 If the inner handler has a method CONSUMES(), this method will deter‐
820 mine the blocked events if none are specified.
821
822 see also the script stag-handle.pl
823
824 RECURSIVE SEARCHING
825
826 find (f)
827
828 Title: find
829 Synonym: f
830
831 Args: element str
832 Returns: node[] or ANY
833 Example: @persons = stag_find($struct, 'person');
834 Example: @persons = $struct->find('person');
835
836 recursively searches tree for all elements of the given type, and
837 returns all nodes or data elements found.
838
if the element found is a non-terminal node, will return the node; if
the element found is a terminal (leaf) node, will return the data value
841
842 the element argument can be a path
843
844 @names = $struct->find('department/person/name');
845
846 will find name in the nested structure below:
847
848 (department
849 (person
850 (name "foo")))
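
The basic recursive search (without path support) can be sketched on plain
nested arrays. The function find_all is hypothetical and only mirrors the
behaviour described above: non-terminal matches yield the node, terminal
matches yield the data value. Whether the real find descends into nodes that
already matched is not stated here; the sketch does.

```perl
use strict;
use warnings;

my $struct = [ department => [
    [ person => [ [ name => 'foo' ] ] ],
    [ person => [ [ name => 'bar' ] ] ] ] ];

# Hypothetical sketch of a recursive search by element name.
sub find_all {
    my ($node, $element) = @_;
    my ($name, $data) = @$node;
    my $is_nonterminal = ref($data) eq 'ARRAY';
    my @found;
    # A match contributes the node (non-terminal) or its value (terminal).
    push @found, ($is_nonterminal ? $node : $data) if $name eq $element;
    # Keep searching below non-terminal nodes.
    push @found, map { find_all($_, $element) } @$data if $is_nonterminal;
    return @found;
}

my @names   = find_all($struct, 'name');
my @persons = find_all($struct, 'person');
print "names: @names\n";
print "person nodes: ", scalar @persons, "\n";
```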
851
852 findnode (fn)
853
854 Title: findnode
855 Synonym: fn
856
857 Args: element str
858 Returns: node[]
859 Example: @persons = stag_findnode($struct, 'person');
860 Example: @persons = $struct->findnode('person');
861
862 recursively searches tree for all elements of the given type, and
863 returns all nodes found.
864
865 paths can also be used (see find)
866
867 findval (fv)
868
869 Title: findval
870 Synonym: fv
871
872 Args: element str
873 Returns: ANY[] or ANY
874 Example: @names = stag_findval($struct, 'name');
875 Example: @names = $struct->findval('name');
876 Example: $firstname = $struct->findval('name');
877
878 recursively searches tree for all elements of the given type, and
879 returns all data values found. the data values could be primitive
880 scalars or nodes.
881
882 paths can also be used (see find)
883
884 sfindval (sfv)
885
886 Title: sfindval
887 Synonym: sfv
888
889 Args: element str
890 Returns: ANY
891 Example: $name = stag_sfindval($struct, 'name');
892 Example: $name = $struct->sfindval('name');
893
894 as findval, but returns the first value found
895
896 paths can also be used (see find)
897
898 findvallist (fvl)
899
900 Title: findvallist
901 Synonym: fvl
902
903 Args: element str[]
904 Returns: ANY[]
905 Example: ($name, $phone) = stag_findvallist($personstruct, 'name', 'phone');
906 Example: ($name, $phone) = $personstruct->findvallist('name', 'phone');
907
908 recursively searches tree for all elements in the list
909
910 DEPRECATED
911
912 DATA ACCESSOR METHODS
913
914 these allow getting and setting of elements directly underneath the
915 current one
916
917 get (g)
918
919 Title: get
920 Synonym: g
921
922 Args: element str
923 Return: node[] or ANY
924 Example: $name = $person->get('name');
925 Example: @phone_nos = $person->get('phone_no');
926
927 gets the value of the named sub-element
928
if the sub-element is a non-terminal, will return a node(s); if the
sub-element is a terminal (leaf), it will return the data value(s)
931
932 the examples above would work on a data structure like this:
933
934 [person => [ [name => 'fred'],
935 [phone_no => '1-800-111-2222'],
936 [phone_no => '1-415-555-5555']]]
937
938 will return an array or single value depending on the context
939
940 [equivalent to findval(), except that only direct children (as opposed
941 to all descendents) are checked]
942
943 paths can also be used, like this:
944
    @phone_nos = $struct->get('person/phone_no')
946
947 sget (sg)
948
949 Title: sget
950 Synonym: sg
951
952 Args: element str
953 Return: ANY
954 Example: $name = $person->sget('name');
955 Example: $phone = $person->sget('phone_no');
  Example: $name = $struct->sget('department/person/name');
957
958 as get but always returns a single value
959
960 [equivalent to sfindval(), except that only direct children (as opposed
961 to all descendents) are checked]
962
963 getl (gl getlist)
964
  Title: getl
  Synonym: gl
  Synonym: getlist
968
969 Args: element str[]
970 Return: node[] or ANY[]
971 Example: ($name, @phone) = $person->getl('name', 'phone_no');
972
973 returns the data values for a list of sub-elements of a node
974
975 [equivalent to findvallist(), except that only direct children (as
976 opposed to all descendents) are checked]
977
978 getn (gn getnode)
979
980 Title: getn
981 Synonym: gn
982 Synonym: getnode
983
984 Args: element str
985 Return: node[]
986 Example: $namestruct = $person->getn('name');
987 Example: @pstructs = $person->getn('phone_no');
988
989 as get but returns the whole node rather than just the data value
990
991 [equivalent to findnode(), except that only direct children (as opposed
992 to all descendents) are checked]
993
994 sgetmap (sgm)
995
996 Title: sgetmap
997 Synonym: sgm
998
999 Args: hash
1000 Return: hash
1001 Example: %h = $person->sgetmap('social-security-no'=>'id',
1002 'name' =>'label',
1003 'job' =>0,
1004 'address' =>'location');
1005
returns a hash of key/val pairs built from the data values of the
subnodes in the current element; keys are renamed according to the
hash passed (a value of '' or 0 keeps the original key).
1009
1010 no multivalued data elements are allowed
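
The key mapping can be sketched as follows. The function sgetmap_sketch is a
hypothetical stand-in working on a plain nested array; in particular, dying
on a repeated element is this sketch's own reading of "no multivalued data
elements are allowed", not confirmed behaviour of the module.

```perl
use strict;
use warnings;

my $person = [ person => [
    [ 'social-security-no' => '123-45-6789' ],
    [ name    => 'fred' ],
    [ job     => 'bus driver' ],
    [ address => '1 Main St' ] ] ];

# Hypothetical sketch: build a hash from direct children, renaming keys.
sub sgetmap_sketch {
    my ($node, %map) = @_;
    my %out;
    for my $kid (@{ $node->[1] }) {
        my ($name, $val) = @$kid;
        next unless exists $map{$name};
        my $key = $map{$name} || $name;     # '' or 0 keeps the original name
        die "multivalued element: $name" if exists $out{$key};
        $out{$key} = $val;
    }
    return %out;
}

my %h = sgetmap_sketch($person,
                       'social-security-no' => 'id',
                       'name'               => 'label',
                       'job'                => 0,
                       'address'            => 'location');

print "$_ => $h{$_}\n" for sort keys %h;
```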
1011
1012 set (s)
1013
1014 Title: set
1015 Synonym: s
1016
1017 Args: element str, datavalue ANY (list)
1018 Return: ANY
1019 Example: $person->set('name', 'fred'); # single val
1020 Example: $person->set('phone_no', $cellphone, $homephone);
1021
1022 sets the data value of an element for any node. if the element is mul‐
1023 tivalued, all the old values will be replaced with the new ones speci‐
1024 fied.
1025
1026 ordering will be preserved, unless the element specified does not
1027 exist, in which case, the new tag/value pair will be placed at the end.
1028
1029 for example, if we have a stag node $person
1030
1031 person:
1032 name: shuggy
1033 job: bus driver
1034
1035 if we do this
1036
1037 $person->set('name', ());
1038
1039 we will end up with
1040
1041 person:
1042 job: bus driver
1043
1044 then if we do this
1045
1046 $person->set('name', 'shuggy');
1047
the 'name' node will be placed as the last child
1049
1050 person:
1051 job: bus driver
1052 name: shuggy
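
The ordering rules above can be sketched on a plain nested array. The
function set_vals is hypothetical; placing all new values in the slot of the
first old value is an assumption of the sketch, chosen to be consistent with
the examples in this section.

```perl
use strict;
use warnings;

my $person = [ person => [
    [ name => 'shuggy' ],
    [ job  => 'bus driver' ] ] ];

# Hypothetical sketch of the replace-in-place / append-at-end behaviour.
sub set_vals {
    my ($node, $element, @vals) = @_;
    my @new = map { [ $element => $_ ] } @vals;
    my @kids;
    my $placed = 0;
    for my $kid (@{ $node->[1] }) {
        if ($kid->[0] ne $element) { push @kids, $kid; next }
        push @kids, @new unless $placed++;   # new values take the first old slot
    }
    push @kids, @new unless $placed;         # element was absent: append at end
    $node->[1] = \@kids;
    return;
}

set_vals($person, 'name');                   # empty list removes the element
set_vals($person, 'name', 'shuggy');         # absent now, so appended last
my $order = join ',', map { $_->[0] } @{ $person->[1] };
print "$order\n";
```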
1053
1054 You can also use magic methods, for example
1055
1056 $person->set_name('shuggy');
1057 $person->set_job('bus driver', 'poet');
1058 print $person->itext;
1059
1060 will print
1061
1062 person:
1063 name: shuggy
1064 job: bus driver
1065 job: poet
1066
1067 note that if the datavalue is a non-terminal node as opposed to a prim‐
1068 itive value, then you have to do it like this:
1069
1070 $people = Data::Stag->new(people=>[
1071 [person=>[[name=>'Sherlock Holmes']]],
1072 [person=>[[name=>'Moriarty']]],
1073 ]);
1074 $address = Data::Stag->new(address=>[
1075 [address_line=>"221B Baker Street"],
1076 [city=>"London"],
1077 [country=>"Great Britain"]]);
1078 ($person) = $people->qmatch('person', (name => "Sherlock Holmes"));
1079 $person->set("address", $address->data);
1080
1081 If you are using XML data, you can set attributes like this:
1082
1083 $person->set('@'=>[[id=>$id],[foo=>$foo]]);
1084
1085 unset (u)
1086
1087 Title: unset
1088 Synonym: u
1089
1090 Args: element str, datavalue ANY
1091 Return: ANY
1092 Example: $person->unset('name');
1093 Example: $person->unset('phone_no');
1094
1095 prunes all nodes of the specified element from the current node
1096
1097 You can use magic methods, like this
1098
1099 $person->unset_name;
1100 $person->unset_phone_no;
1101
1102 free
1103
  Title: free
1106
1107 Args:
1108 Return:
1109 Example: $person->free;
1110
1111 removes all data from a node. If that node is a subnode of another
1112 node, it is removed altogether
1113
1114 for instance, if we had the data below:
1115
1116 <person>
1117 <name>fred</name>
1118 <address>
1119 ..
1120 </address>
1121 </person>
1122
1123 and called
1124
1125 $person->get_address->free
1126
1127 then the person node would look like this:
1128
1129 <person>
1130 <name>fred</name>
1131 </person>
1132
1133 add (a)
1134
1135 Title: add
1136 Synonym: a
1137
1138 Args: element str, datavalues ANY[]
1139 OR
1140 Data::Stag
1141 Return: ANY
1142 Example: $person->add('phone_no', $cellphone, $homephone);
1143 Example: $person->add_phone_no('1-555-555-5555');
1144 Example: $dataset->add($person)
1145
1146 adds a datavalue or list of datavalues. appends if already existing,
1147 creates new element value pairs if not already existing.
1148
1149 if the argument is a stag node, it will add this node under the current
1150 one.
1151
1152 For example, if we have the following node in $dataset
1153
1154 <dataset>
1155 <person>
1156 <name>jim</name>
1157 </person>
1158 </dataset>
1159
1160 And then we add data to it:
1161
1162 ($person) = $dataset->qmatch('person', name=>'jim');
1163 $person->add('phone_no', '555-1111', '555-2222');
1164
1165 We will be left with:
1166
1167 <dataset>
1168 <person>
1169 <name>jim</name>
1170 <phone_no>555-1111</phone_no>
1171 <phone_no>555-2222</phone_no>
1172 </person>
1173 </dataset>
1174
1175 The above call is equivalent to:
1176
1177 $person->add_phone_no('555-1111', '555-2222');
1178
1179 As well as adding data values, we can add whole nodes:
1180
1181 $dataset->add(person=>[[name=>"fred"],
1182 [phone_no=>"555-3333"]]);
1183
1184 Which is equivalent to
1185
1186 $dataset->add_person([[name=>"fred"],
1187 [phone_no=>"555-3333"]]);
1188
1189 Remember, the value has to be specified as an array reference of nodes.
1190 In general, you should use the addkid() method to add nodes and use
1191 add() to add values
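
The difference between add() for values and addkid() for nodes can be sketched as follows. This is a minimal sketch rather than a definitive recipe: it assumes Data::Stag is installed and that Data::Stag->new accepts an element name followed by a nested array reference. The names and phone numbers are illustrative only.

```perl
use strict;
use warnings;
use Data::Stag;

# Build the <dataset> tree from the example above as nested arrays.
my $dataset = Data::Stag->new(dataset => [
    [ person => [ [ name => 'jim' ] ] ],
]);

my ($person) = $dataset->qmatch('person', name => 'jim');

# add() with plain values: one phone_no element is created per value.
$person->add('phone_no', '555-1111', '555-2222');
my @phones = $person->get('phone_no');

# addkid() with a whole node: appended after the existing children.
$dataset->addkid(Data::Stag->new(person => [
    [ name     => 'fred' ],
    [ phone_no => '555-3333' ],
]));
my @names = map { $_->sget('name') } $dataset->get('person');

print "jim's phones: @phones\n";
print "people: @names\n";
```

Building the second person as a node first and passing it to addkid() keeps the two cases visibly separate: add() receives data values, addkid() receives a complete subtree.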
1192
1193 element (e name)
1194
1195 Title: element
1196 Synonym: e
1197 Synonym: name
1198
1199 Args:
1200 Return: element str
1201 Example: $element = $struct->element
1202
1203 returns the element name of the current node.
1204
1205 This is illustrated in the different representation formats below
1206
1207 sxpr
1208 (element "data")
1209
1210 or
1211
1212 (element
1213 (sub_element "..."))
1214
1215 xml
1216 <element>data</element>
1217
1218 or
1219
1220 <element>
1221 <sub_element>...</sub_element>
1222 </element>
1223
1224 perl
1225 [element => $data ]
1226
1227 or
1228
1229 [element => [
1230 [sub_element => "..." ]]]
1231
1232 itext
1233 element: data
1234
1235 or
1236
1237 element:
1238 sub_element: ...
1239
1240 indent
1241 element "data"
1242
1243 or
1244
1245 element
1246 sub_element "..."
1247
1248 kids (k children)
1249
1250 Title: kids
1251 Synonym: k
1252 Synonym: children
1253
1254 Args:
1255 Return: ANY or ANY[]
1256 Example: @nodes = $person->kids
1257 Example: $name = $namestruct->kids
1258
1259 returns the data value(s) of the current node; if it is a terminal
1260 node, returns a single value which is the data. if it is non-terminal,
1261 returns an array of nodes
1262
1263 addkid (ak addchild)
1264
1265 Title: addkid
1266 Synonym: ak
1267 Synonym: addchild
1268
1269 Args: kid node
1270 Return: ANY
1271 Example: $person->addkid($job);
1272
1273 adds a new child node to a non-terminal node, after all the existing
1274 child nodes
1275
1276 You can use this method/procedure to add XML attribute data to a node:
1277
1278 $person->addkid(['@'=>[[id=>$id]]]);
1279
1280 subnodes
1281
1282 Title: subnodes
1283
1284 Args:
1285 Return: ANY[]
1286 Example: @nodes = $person->subnodes
1287
1288 returns the child nodes; returns empty list if this is a terminal node
1289
1290 ntnodes
1291
1292 Title: ntnodes
1293
1294 Args:
1295 Return: ANY[]
1296 Example: @nodes = $person->ntnodes
1297
1298 returns all non-terminal children of current node
1299
1300 tnodes
1301
1302 Title: tnodes
1303
1304 Args:
1305 Return: ANY[]
1306 Example: @nodes = $person->tnodes
1307
1308 returns all terminal children of current node
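
The accessors element, kids, subnodes, tnodes and ntnodes can be compared side by side on one small tree. This is a sketch under assumptions: Data::Stag is installed, Data::Stag->new takes an element name plus nested array reference, and all data values are made up. Only the number of returned items is inspected here.

```perl
use strict;
use warnings;
use Data::Stag;

# Two terminal children (name, phone_no) and one non-terminal
# child (address).
my $person = Data::Stag->new(person => [
    [ name     => 'fred' ],
    [ phone_no => '555-3333' ],
    [ address  => [ [ city => 'Springfield' ] ] ],
]);

my $element  = $person->element;    # element name of the current node
my @kids     = $person->kids;       # non-terminal node: one entry per child
my @subnodes = $person->subnodes;   # the child nodes
my @tnodes   = $person->tnodes;     # terminal children only
my @ntnodes  = $person->ntnodes;    # non-terminal children only

# On a terminal node, kids returns the data value itself.
my $value = Data::Stag->new(name => 'fred')->kids;

printf "%s: %d kids, %d terminal, %d non-terminal\n",
    $element, scalar(@kids), scalar(@tnodes), scalar(@ntnodes);
print "terminal value: $value\n";
```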
1309
1310 QUERYING AND ADVANCED DATA MANIPULATION
1311
1312 ijoin (j)
1313
1314 Title: ijoin
1315 Synonym: j
1316 Synonym: ij
1317
1318 Args: element str, key str, data Node
1319 Return: undef
1320
1321 does a relational style inner join - see previous example in this doc
1322
1323 key can either be a single node name that must be shared (analogous to
1324 SQL INNER JOIN ... USING), or a key1=key2 equivalence relation
1325 (analogous to SQL INNER JOIN ... ON)
1326
1327 qmatch (qm)
1328
1329 Title: qmatch
1330 Synonym: qm
1331
1332 Args: return-element str, match-element str, match-value str
1333 Return: node[]
1334 Example: @persons = $s->qmatch('person', 'name', 'fred');
1335 Example: @persons = $s->qmatch('person', (job=>'bus driver'));
1336
1337 queries the node tree for all elements that satisfy the specified
1338 key=val match - see previous example in this doc
1339
1340 for those inclined to thinking relationally, this can be thought of as
1341 a query that returns a stag object:
1342
1343 SELECT <return-element> FROM <stag-node> WHERE <match-element> = <match-value>
1344
1345 this always returns an array; this means that calling in a scalar
1346 context will return the number of elements; for example
1347
1348 $n = $s->qmatch('person', (name=>'fred'));
1349
1350 the value of $n will be equal to the number of persons called fred
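
The list and scalar uses of qmatch described above can be tried on a small in-memory tree. This is a minimal sketch, assuming Data::Stag is installed and that Data::Stag->new accepts an element name plus nested array reference. The dataset is invented for illustration.

```perl
use strict;
use warnings;
use Data::Stag;

my $s = Data::Stag->new(dataset => [
    [ person => [ [ name => 'fred' ], [ job => 'bus driver' ] ] ],
    [ person => [ [ name => 'fred' ], [ job => 'teacher'    ] ] ],
    [ person => [ [ name => 'jim'  ], [ job => 'bus driver' ] ] ],
]);

# List context: the matching person nodes themselves.
my @freds   = $s->qmatch('person', name => 'fred');
my @drivers = map { $_->sget('name') }
              $s->qmatch('person', job => 'bus driver');

# Scalar context: per the text above, the number of matches.
my $n = $s->qmatch('person', name => 'fred');

print scalar(@freds), " persons called fred\n";
print "bus drivers: @drivers\n";
```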
1351
1352 tmatch (tm)
1353
1354 Title: tmatch
1355 Synonym: tm
1356
1357 Args: element str, value str
1358 Return: bool
1359 Example: @persons = grep {$_->tmatch('name', 'fred')} @persons
1360
1361 returns true if the value of the specified element matches - see
1362 previous example in this doc
1363
1364 tmatchhash (tmh)
1365
1366 Title: tmatchhash
1367 Synonym: tmh
1368
1369 Args: match hashref
1370 Return: bool
1371 Example: @persons = grep {$_->tmatchhash({name=>'fred', hair_colour=>'green'})} @persons
1372
1373 returns true if the node matches a set of constraints, specified as
1374 a hash.
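
Both tmatch and tmatchhash are boolean tests on a single node, so they combine naturally with grep. The following is a sketch under assumptions: Data::Stag is installed, Data::Stag->new takes an element name plus nested array reference, and the people and hair colours are made up.

```perl
use strict;
use warnings;
use Data::Stag;

my $dataset = Data::Stag->new(dataset => [
    [ person => [ [ name => 'fred' ], [ hair_colour => 'green' ] ] ],
    [ person => [ [ name => 'fred' ], [ hair_colour => 'brown' ] ] ],
    [ person => [ [ name => 'jim'  ], [ hair_colour => 'green' ] ] ],
]);
my @persons = $dataset->get('person');

# Single constraint: element name and value.
my @freds = grep { $_->tmatch('name', 'fred') } @persons;

# Several constraints at once, given as a hash reference.
my @green_freds = grep {
    $_->tmatchhash({ name => 'fred', hair_colour => 'green' })
} @persons;

printf "%d freds, %d of them with green hair\n",
    scalar(@freds), scalar(@green_freds);
```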
1375
1376 tmatchnode (tmn)
1377
1378 Title: tmatchnode
1379 Synonym: tmn
1380
1381 Args: match node
1382 Return: bool
1383 Example: @persons = grep {$_->tmatchnode([person=>[[name=>'fred'], [hair_colour=>'green']]])} @persons
1384
1385 returns true if the node matches a set of constraints, specified as
1386 a node
1387
1388 cmatch (cm)
1389
1390 Title: cmatch
1391 Synonym: cm
1392
1393 Args: element str, value str
1394 Return: int
1395 Example: $n_freds = $personset->cmatch('name', 'fred');
1396
1397 counts the number of matches
1398
1399 where (w)
1400
1401 Title: where
1402 Synonym: w
1403
1404 Args: element str, test CODE
1405 Return: Node[]
1406 Example: @rich_persons = $data->where('person', sub {shift->get_salary > 100000});
1407
1408 the tree is queried for all elements of the specified type that satisfy
1409 the coderef (must return a boolean)
1410
1411 my @rich_dog_or_cat_owners =
1412 $data->where('person',
1413 sub {my $p = shift;
1414 $p->get_salary > 100000 &&
1415 $p->where('pet',
1416 sub {shift->get_type =~ /(dog|cat)/})});
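
A runnable variant of the nested where query is sketched below. It assumes Data::Stag is installed and that Data::Stag->new accepts an element name plus nested array reference. The salaries and pets are invented, and sget is used in place of the magic get_ methods purely to keep the accessors explicit.

```perl
use strict;
use warnings;
use Data::Stag;

my $data = Data::Stag->new(dataset => [
    [ person => [ [ name => 'ann' ], [ salary => 150000 ],
                  [ pet  => [ [ type => 'dog'  ] ] ] ] ],
    [ person => [ [ name => 'bob' ], [ salary => 200000 ],
                  [ pet  => [ [ type => 'fish' ] ] ] ] ],
    [ person => [ [ name => 'cy'  ], [ salary => 50000 ],
                  [ pet  => [ [ type => 'cat'  ] ] ] ] ],
]);

# Outer test: salary. Inner test: at least one dog or cat among the pets.
my @owners = $data->where('person', sub {
    my $p    = shift;
    my @pets = $p->where('pet',
                         sub { shift->sget('type') =~ /^(dog|cat)$/ });
    return $p->sget('salary') > 100000 && @pets > 0;
});

my @owner_names = map { $_->sget('name') } @owners;
print "rich dog or cat owners: @owner_names\n";
```

Collecting the inner matches into an array before testing them avoids relying on how where behaves in scalar context.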
1417
1418 iterate (i)
1419
1420 Title: iterate
1421 Synonym: i
1422
1423 Args: CODE
1424 Return: Node[]
1425 Example: $data->iterate(sub {
1426 my $stag = shift;
1427 my $parent = shift;
1428 if ($stag->element eq 'pet') {
1429 $parent->set_pet_name($stag->get_name);
1430 }
1431 });
1432
1433 iterates through whole tree calling the specified subroutine.
1434
1435 the first arg passed to the subroutine is the stag node representing
1436 the tree at that point; the second arg is for the parent.
1437
1438 for instance, the example code above would turn this
1439
1440 (person
1441 (name "jim")
1442 (pet
1443 (name "fluffy")))
1444
1445 into this
1446
1447 (person
1448 (name "jim")
1449 (pet_name "fluffy")
1450 (pet
1451 (name "fluffy")))
1452
1453 MISCELLANEOUS METHODS
1454
1455 duplicate (d)
1456
1457 Title: duplicate
1458 Synonym: d
1459
1460 Args:
1461 Return: Node
1462 Example: $node2 = $node->duplicate;
1463
1464 does a deep copy of a stag structure
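
The effect of a deep copy can be checked by changing the copy and re-reading the original. This is a minimal sketch, assuming Data::Stag is installed and that Data::Stag->new accepts an element name plus nested array reference. The names are illustrative.

```perl
use strict;
use warnings;
use Data::Stag;

my $node = Data::Stag->new(person => [ [ name => 'fred' ] ]);

# duplicate() copies the whole structure, not just the top-level array.
my $copy = $node->duplicate;
$copy->set('name', 'jim');

my $original_name = $node->sget('name');
my $copy_name     = $copy->sget('name');

print "original: $original_name, copy: $copy_name\n";
```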
1465
1466 isanode
1467
1468 Title: isanode
1469
1470 Args:
1471 Return: bool
1472 Example: if (stag_isanode($node)) { ... }
1473
1474 hash
1475
1476 Title: hash
1477
1478 Args:
1479 Return: hash
1480 Example: $h = $node->hash;
1481
1482 turns a tree into a hash. all data values will be arrayrefs
1483
1484 pairs
1485
1486 Title: pairs
1487
1488 turns a tree into a hash. all data values will be scalar (IMPORTANT:
1489 this means duplicate values will be lost)
1490
1491 write
1492
1493 Title: write
1494
1495 Args: filename str, format str[optional]
1496 Return:
1497 Example: $node->write("myfile.xml");
1498 Example: $node->write("myfile", "itext");
1499
1500 tries to guess the format from the file extension if no format is specified
1501
1502 xml
1503
1504 Title: xml
1505
1511 Args:
1512 Return: xml str
1513 Example: print $node->xml;
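
Serialising a node built in memory can be sketched as below. The assumptions are that Data::Stag is installed and that Data::Stag->new accepts an element name plus nested array reference. The exact whitespace and any XML declaration in the output are not specified here, so the sketch only looks for the expected elements.

```perl
use strict;
use warnings;
use Data::Stag;

my $node = Data::Stag->new(person => [
    [ name     => 'fred' ],
    [ phone_no => '555-3333' ],
]);

# xml() returns the serialised tree as a string.
my $xml = $node->xml;
print $xml;

# Check for the expected elements without depending on indentation.
my $has_person = $xml =~ m{<person>}           ? 1 : 0;
my $has_name   = $xml =~ m{<name>fred</name>}  ? 1 : 0;
```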
1514
1515 XML METHODS
1516
1517 xslt
1518
1519 Title: xslt
1520
1521 Args: xslt_file str
1522 Return: Node
1523 Example: $new_stag = $stag->xslt('mytransform.xsl');
1524
1525 transforms a stag tree using XSLT
1526
1527 xsltstr
1528
1529 Title: xsltstr
1530
1531 Args: xslt_file str
1532 Return: str
1533 Example: print $stag->xsltstr('mytransform.xsl');
1534
1535 As above, but returns the string of the resulting transform, rather
1536 than a stag tree
1537
1538 sax
1539
1540 Title: sax
1541
1542 Args: saxhandler SAX-CLASS
1543 Return:
1544 Example: $node->sax($mysaxhandler);
1545
1546 turns a tree into a series of SAX events
1547
1548 xpath (xp tree2xpath)
1549
1550 Title: xpath
1551 Synonym: xp
1552 Synonym: tree2xpath
1553
1554 Args:
1555 Return: xpath object
1556 Example: $xp = $node->xpath; $q = $xp->find($xpathquerystr);
1557
1558 xpquery (xpq xpathquery)
1559
1560 Title: xpquery
1561 Synonym: xpq
1562 Synonym: xpathquery
1563
1564 Args: xpathquery str
1565 Return: Node[]
1566 Example: @nodes = $node->xpq($xpathquerystr);
1567
1569 The following scripts come with the stag module
1570
1571 stag-autoschema.pl
1572 writes the implicit stag-schema for a stag file
1573
1574 stag-db.pl
1575 persistent storage and retrieval for stag data (xml, sxpr, itext)
1576
1577 stag-diff.pl
1578 finds the difference between two stag files
1579
1580 stag-drawtree.pl
1581 draws a stag file (xml, itext, sxpr) as a PNG diagram
1582
1583 stag-filter.pl
1584 filters a stag file (xml, itext, sxpr) for nodes of interest
1585
1586 stag-findsubtree.pl
1587 finds nodes in a stag file
1588
1589 stag-flatten.pl
1590 turns stag data into a flat table
1591
1592 stag-grep.pl
1593 filters a stag file (xml, itext, sxpr) for nodes of interest
1594
1595 stag-handle.pl
1596 streams a stag file through a handler into a writer
1597
1598 stag-join.pl
1599 joins two stag files together based around common key
1600
1601 stag-mogrify.pl
1602 mangle stag files
1603
1604 stag-parse.pl
1605 parses a file and fires events (e.g. sxpr to xml)
1606
1607 stag-query.pl
1608 aggregate queries
1609
1610 stag-split.pl
1611 splits a stag file (xml, itext, sxpr) into multiple files
1612
1613 stag-splitter.pl
1614 splits a stag file into multiple files
1615
1616 stag-view.pl
1617 draws an expandable Tk tree diagram showing stag data
1618
1619 To get more documentation, type
1620
1621 stag-<script>.pl -h
1622
1624 none known so far, possibly quite a few undocumented features!
1625
1626 Not a bug, but the underlying default datastructure of nested arrays is
1627 more heavyweight than it needs to be. More lightweight implementations
1628 are possible. Some time I will write a C implementation.
1629
1631 <http://stag.sourceforge.net>
1632
1634 Chris Mungall <cjm AT fruitfly DOT org>
1635
1637 Copyright (c) 2004 Chris Mungall
1638
1639 This module is free software. You may distribute this module under the
1640 same terms as perl itself
1641
1642
1643
1644perl v5.8.8 2005-12-16 Data::Stag(3)