1treeql(n) Tree Query Language treeql(n)
2
3
4
5______________________________________________________________________________
6
8 treeql - Query tree objects
9
11 package require Tcl 8.2
12
13 package require snit
14
15 package require struct::list
16
17 package require struct::set
18
19 package require treeql ?1.3.1?
20
21 treeql objectname -tree tree ?-query query? ?-nodes nodes? ?args...?
22
23 qo query args...
24
25 qo result
26
27 qo discard
28
29______________________________________________________________________________
30
32 This package provides objects which can be used to query and transform
33 tree objects following the API of tree objects created by the package
34 struct::tree.
35
36 The tree query and manipulation language used here, TreeQL, is inspired
37 by Cost (See section References for more information).
38
39 treeql, the package, is a fairly thin query facility over tree-struc‐
40 tured data types. It implements an ordered set of nodes (really a
41 list) which are generated and filtered through the application of
42 TreeQL operators to each node in turn.
43
45 TREEQL CLASS API
46 The command treeql is a snit::type which implements the Treeql Query
47 Language. This means that it follows the API for class commands as
48 specified by the package snit. Its general syntax is
49
50 treeql objectname -tree tree ?-query query? ?-nodes nodes? ?args...?
51 The command creates a new tree query object and returns the
52 fully qualified name of the object command as its result. The
53 API the returned command is following is described in the sec‐
54 tion TreeQL OBJECT API
55
56 Each query object is associated with a single tree object. This
57 is the object all queries will be run against.
58
59 If the option -nodes was specified then its argument is treated
60 as a list of nodes. This list is used to initialize the node
61 set. It defaults to the empty list.
62
63 If the option -query was specified then its argument will be in‐
64 terpreted as an object, the parent query of this query. It de‐
65 faults to the object itself. All queries will be interpreted in
66 the environment of this object.
67
68 Any arguments coming after the options are treated as a query
69 and run immediately, after the node set has been initialized.
70 This uses the same syntax for the query as the method query.
71
72 The operations of the TreeQL available for this are explained in
73 the section about The Tree Query Language. This section also ex‐
74 plains the term node set used above.
75
76 TREEQL OBJECT API
77 As treeql has been implemented in snit all the standard methods of
78 snit-based classes are available to the user and therefore not listed
79 here. Please read the documentation for snit for what they are and what
80 functionality they provide
81
82 The methods provided by the package treeql itself are listed and ex‐
83 plained below.
84
85 qo query args...
86 This method interprets its arguments as a series of TreeQL oper‐
87 ators and interpretes them from the left to right (i.e. first to
88 last). Note that the first operator uses the node set currently
89 known to the object to perform its actions. In other words, the
90 node set is not cleared, or modified in other ways, before the
91 query is run. This allows the user to run several queries one
92 after the other and have each use the results of the last. Any
93 initialization has to be done by any query itself, using TreeQL
94 operators. The result of the method is the node set after the
95 last operator of the query has been executed.
96
97 Note that uncaught errors will leave the node set of the object
98 in an intermediate state, per the TreeQL operators which were
99 executed successfully before the error occurred.
100
101 The above means in detail that:
102
103 [1] The first argument is interpreted as the name of a query
104 operator, the number of arguments required by that opera‐
105 tor is then determined, and taken from the immediately
106 following arguments.
107
108 Because of this operators cannot have optional arguments,
109 all arguments have to be present as defined. Failure to
110 do this will, at least, confuse the query interpreter,
111 but more likely cause errors.
112
113 [2] The operator is applied to the current node set, yielding
114 a new node set, and/or manipulating the tree object the
115 query object is connected to.
116
117 [3] The arguments used (i.e. operator name and arguments) are
118 removed from the list of method arguments, and then the
119 whole process is repeated from step [1], until the list
120 of arguments is empty or an error occurred.
121
122
123
124 # q is the query object.
125
126 q query root children get data
127
128 # The above query
129 # - Resets the node set to the root node - root
130 # - Adds the children of root to the set - children
131 # - Replaces the node set with the - get data
132 # values for the attribute 'data',
133 # for all nodes in the set which
134 # have such an attribute.
135 # - And returns this information.
136
137 # Below we can see the same query, but rewritten
138 # to show the structure as it is seen by the query
139 # interpreter.
140
141 q query \
142 root \
143 children \
144 get data
145
146
147 The operators of the TreeQL language available for this are explained
148 in the section about The Tree Query Language. This section also ex‐
149 plains the term node set used above.
150
151 qo result
152 This method returns a list containing the current node set.
153
154 qo discard
155 This method returns the current node set (like method result),
156 but also destroys the query object (qo). This is useful when
157 constructing and using sub-queries (%AUTO% objects immediately
158 destroyed after use).
159
161 This and the following sections specify the Tree Query Language used by
162 the query objects of this package in detail.
163
164 First we explain the general concepts underneath the language which are
165 required to comprehend it. This is followed by the specifications for
166 all the available query operators. They fall into eight categories, and
167 each category has its own section.
168
169 [1] TreeQL Concepts
170
171 [2] Structural generators
172
173 [3] Attribute Filters
174
175 [4] Attribute Mutators
176
177 [5] Attribute String Accessors
178
179 [6] Sub-queries
180
181 [7] Node Set Operators
182
183 [8] Node Set Iterators
184
185 [9] Typed node support
186
187 TREEQL CONCEPTS
188 The main concept which has to be understood is that of the node set.
189 Each query object maintains exactly one such node set, and essentially
190 all operators use it and input argument and for their result. This
191 structure simply contains the handles of all nodes which are currently
192 of interest to the query object. To name it a set is a bit of a mis‐
193 nomer, because
194
195 [1] A node (handle) can occur in the structure more than once, and
196
197 [2] the order of nodes in the structure is important as well. When‐
198 ever an operator processes all nodes in the node set it will do
199 so in the order they occur in the structure.
200
201 Regarding the possible multiple occurrence of a node, consider a node
202 set containing two nodes A and B, both having node P as their immediate
203 parent. Application of the TreeQL operator "parent" will then add P to
204 the new node set twice, once per node it was parent of. I.e. the new
205 node set will then be {P P}.
206
207 STRUCTURAL GENERATORS
208 All tree-structural operators locate nodes in the tree based on a
209 structural relation ship to the nodes currently in the set and then re‐
210 place the current node set with the set of nodes found Nodes which ful‐
211 fill such a relationship multiple times are added to the result as of‐
212 ten as they fulfill the relationship.
213
214 It is important to note that the found nodes are collected in a sepa‐
215 rate storage area while processing the node set, and are added to (or
216 replacing) the current node set only after the current node set has
217 been processed completely. In other words, the new nodes are not pro‐
218 cessed by the operator as well and do not affect the iteration.
219
220 When describing an operator the variable N will be used to refer to any
221 node in the node set.
222
223 ancestors
224 Replaces the current node set with the ancestors for all nodes N
225 in the node set, should N have a parent. In other words, nodes
226 without a parent do not contribute to the new node set. In other
227 words, uses all nodes on the path from node N to root, in this
228 order (root last), for all nodes N in the node set. This in‐
229 cludes the root, but not the node itself.
230
231 rootpath
232 Replaces the current node set with the ancestors for all nodes N
233 in the node set, should N have a parent. In other words, nodes
234 without a parent do not contribute to the new node set. In con‐
235 trast to the operator ancestors the nodes are added in reverse
236 order however, i.e. the root node first.
237
238 parent Replaces the current node set with the parent of node N, for all
239 nodes N in the node set, should N have a parent. In other words,
240 nodes without a parent do not contribute to the new node set.
241
242 children
243 Replaces the current node set with the immediate children of
244 node N, for all nodes N in the node set, should N have children.
245 In other words, nodes without children do not contribute to the
246 new node set.
247
248 left Replaces the current node set with the previous/left sibling for
249 all nodes N in the node set, should N have siblings to the left.
250 In other words, nodes without left siblings do not contribute to
251 the new node set.
252
253 right Replaces the current node set with the next/right sibling for
254 all nodes N in the node set, should N have siblings to the
255 right. In other words, nodes without right siblings do not con‐
256 tribute to the new node set.
257
258 prev Replaces the current node set with all previous/left siblings of
259 node N, for all nodes N in the node set, should N have siblings
260 to the left. In other words, nodes without left siblings are ig‐
261 nored. The left sibling adjacent to the node is added first, and
262 the leftmost sibling last (reverse tree order).
263
264 esib Replaces the current node set with all previous/left siblings of
265 node N, for all nodes N in the node set, should N have siblings
266 to the left. In other words, nodes without left siblings are ig‐
267 nored. The leftmost sibling is added first, and the left sibling
268 adjacent to the node last (tree order).
269
270 The method name is a shorthand for Earlier SIBling.
271
272 next Replaces the current node set with all next/right siblings of
273 node N, for all nodes N in the node set, should N have siblings
274 to the right. In other words, nodes without right siblings do
275 not contribute to the new node set. The right sibling adjacent
276 to the node is added first, and the rightmost sibling last (tree
277 order).
278
279 root Replaces the current node set with a node set containing a sin‐
280 gle node, the root of the tree.
281
282 tree Replaces the current node set with a node set containing all
283 nodes found in the tree. The nodes are added in pre-order (par‐
284 ent first, then children, the latter from left to right, first
285 to last).
286
287 descendants
288 Replaces the current node set with the nodes in all subtrees
289 rooted at node N, for all nodes N in the node set, should N have
290 children. In other words, nodes without children do not contrib‐
291 ute to the new node set.
292
293 This is like the operator children, but covers the children of
294 children as well, i.e. all the proper descendants. "Rooted at N"
295 means that N itself is not added to the new set, which is also
296 implied by proper descendants.
297
298 subtree
299 Like operator descendants, but includes the node N. In other
300 words:
301
302 Replaces the current node set with the nodes of the subtree of
303 node N, for all nodes N in the node set, should N have children.
304 In other words, nodes without children do not contribute to the
305 new node set. I.e this is like the operator children, but covers
306 the children of children, etc. as well. "Of N" means that N it‐
307 self is added to the new set.
308
309 forward
310 Replaces the current node set with the nodes in the subtrees
311 rooted at the right siblings of node N, for all nodes N in the
312 node set, should N have right siblings, and they children. In
313 other words, nodes without right siblings, and them without
314 children are ignored.
315
316 This is equivalent to the operator sequence
317
318 next descendants
319
320 later This is an alias for the operator forward.
321
322 backward
323 Replaces the current node set with the nodes in the flattened
324 previous subtrees, in reverse tree order.
325
326 This is nearly equivalent to the operator sequence
327
328 prev descendants
329
330 The only difference is that this uses the nodes in reverse or‐
331 der.
332
333 earlier
334 Replaces the current node set with the nodes in the flattened
335 previous subtrees, in tree order.
336
337 This is equivalent to the operator sequence
338
339 prev subtree
340
341 ATTRIBUTE FILTERS
342 These operators filter the node set by reference to attributes of nodes
343 and their properties. Filter means that all nodes not fulfilling the
344 criteria are removed from the node set. In other words, the node set is
345 replaced by the set of nodes fulfilling the filter criteria.
346
347 hasatt attr
348 Reduces the node set to nodes which have an attribute named
349 attr.
350
351 withatt attr value
352 Reduces the node set to nodes which have an attribute named
353 attr, and where the value of that attribute is equal to value
354 (The "==" operator is string equal -nocase).
355
356 withatt! attr val
357 This is the same as withatt, but all nodes in the node set have
358 to have the attribute, and the "==" operator is string equal,
359 i.e. no -nocase. The operator will fail with an error if they
360 don't have the attribute.
361
362 attof attr vals
363 Reduces the node set to nodes which which have an attribute
364 named attr and where the value of that attribute is contained in
365 the list vals of legal values. The contained-in operator used
366 here does glob matching (using the attribute value as pattern)
367 and ignores the case of the attribute value, but not for the el‐
368 ements of vals.
369
370 attmatch attr match
371 Same as withatt, but string match is used as the "==" operator,
372 and match is the pattern checked for.
373
374 Note that match is a interpreted as a partial argument list for
375 string match. This means that it is interpreted as a list con‐
376 taining the pattern, and the pattern element can be preceded by
377 options understand by string match, like -nocase. This is espe‐
378 cially important should the pattern contain spaces. It has to be
379 wrapped into a list for correct interpretation by this operator
380
381 ATTRIBUTE MUTATORS
382 These operators change node attributes within the underlying tree. In
383 other words, all these operators have side effects.
384
385 set attr val
386 Sets the attribute attr to the value val, for all nodes N in the
387 node set. The operator will fail if a node does not have an at‐
388 tribute named attr. The tree will be left in a partially modi‐
389 fied state.
390
391 unset attr
392 Unsets the attribute attr, for all nodes N in the node set. The
393 operator will fail if a node does not have an attribute named
394 attr. The tree will be left in a partially modified state.
395
396 ATTRIBUTE STRING ACCESSORS
397 These operators retrieve the values of node attributes from the under‐
398 lying tree. The collected results are stored in the node set, but are
399 not actually nodes.
400
401 In other words, they redefine the semantics of the node set stored by
402 the query object to contain non-node data after their completion.
403
404 The query interpreter will terminate after it has finished processing
405 one of these operators, silently discarding any later query elements.
406 It also means that our talk about maintenance of a node set is not
407 quite true. It is a node set while the interpreter is processing com‐
408 mands, but can be left as an attribute value set at the end of query
409 processing.
410
411 string op attr
412 Applies the string operator op to the attribute named attr, for
413 all nodes N in the node set, collects the results of that appli‐
414 cation and places them into the node set.
415
416 The operator will fail if a node does not have an attribute
417 named attr.
418
419 The argument op is interpreted as partial argument list for the
420 builtin command string. Its first word has to be any of the
421 sub-commands understood by string. This has to be followed by
422 all arguments required for the subcommand, except the last.
423 that last argument is supplied by the attribute value.
424
425 get pattern
426 For all nodes N in the node set it determines all their at‐
427 tributes with names matching the glob pattern, then the values
428 of these attributes, at last it replaces the node set with the
429 list of these attribute values.
430
431 attlist
432 This is a convenience definition for the operator getvals *. In
433 other words, it replaces the node set with a list of the attri‐
434 bute values for all attributes for all nodes N in the node set.
435
436 attrs glob
437 Replaces the current node set with a list of attribute lists,
438 one attribute list per for all nodes N in the node set.
439
440 attval attname
441 Reduces the current node set with the operator hasatt, and then
442 replaces it with a list containing the values of the attribute
443 named attname for all nodes N in the node set.
444
445 SUB-QUERIES
446 Sub-queries yield node sets which are then used to augment, reduce or
447 replace the current node set.
448
449 andq query
450 Replaces the node set with the set-intersection of the node set
451 generated by the sub-query query and itself.
452
453 The execution of the sub-query uses the current node set as its
454 own initial node set.
455
456 orq query
457 Replaces the node set with the set-union of the node set gener‐
458 ated by the sub-query query and itself. Duplicate nodes are re‐
459 moved.
460
461 The execution of the sub-query uses the current node set as its
462 own initial node set.
463
464 notq query
465 Replaces the node set with the set of nodes generated by the
466 sub-query query which are also not in the current node set. In
467 other word the set difference of itself and the node set gener‐
468 ated by the sub-query.
469
470 The execution of the sub-query uses the current node set as its
471 own initial node set.
472
473 NODE SET OPERATORS
474 These operators change the node set directly, without referring to the
475 tree.
476
477 unique Removes duplicate nodes from the node set, preserving order. In
478 other words, the earliest occurrence of a node handle is pre‐
479 served, every other occurrence is removed.
480
481 select Replaces the current node set with a node set containing only
482 the first node from the current node set
483
484 transform query var body
485 First it interprets the sub-query query, using the current node
486 set as its initial node set. Then it iterates over the result
487 of that query, binding the handle of each node to the variable
488 named in var, and executing the script body. The collected re‐
489 sults of these executions is made the new node set, replacing
490 the current one.
491
492 The script body is executed in the context of the caller.
493
494 map var body
495 Iterates over the current node set, binding the handle of each
496 node to the variable named in var, and executing the script
497 body. The collected results of these executions is made the new
498 node set, replacing the current one.
499
500 The script body is executed in the context of the caller.
501
502 quote val
503 Appends the literal value val to the current node set.
504
505 replace val
506 Replaces the current node set with the literal list value val.
507
508 NODE SET ITERATORS
509 foreach query var body
510 Interprets the sub-query query, then performs the equivalent of
511 operator over on the nodes in the node set created by that
512 query. The current node set is not changed, except through side
513 effects from the script body.
514
515 The script body is executed in the context of the caller.
516
517 with query body
518 Interprets the query, then runs the script body on the node set
519 generated by the query. At last it restores the current node set
520 as it was before the execution of the query.
521
522 The script body is executed in the context of the caller.
523
524 over var body
525 Executes the script body for each node in the node set, with the
526 variable named by var bound to the name of the current node.
527 The script body is executed in the context of the caller.
528
529 This is like the builtin foreach, with the node set as the
530 source of the list to iterate over.
531
532 The results of executing the body are ignored.
533
534 delete Deletes all the nodes contained in the current node set from the
535 tree.
536
537 TYPED NODE SUPPORT
538 These filters and accessors assume the existence of an attribute called
539 @type, and are short-hand forms useful for cost-like tree query, html
540 tree editing, and so on.
541
542 nodetype
543 Returns the node type of nodes. Attribute string accessor. This
544 is equivalent to
545
546 get @type
547
548 oftype t
549 Reduces the node set to nodes whose type is equal to t, with
550 letter case ignored.
551
552 nottype t
553 Reduces the node set to nodes whose type is not equal to t, with
554 letter case ignored.
555
556 oftypes attrs
557 Reduces set to nodes whose @type is an element in the list attrs
558 of types. The value of @type is used as a glob pattern, and let‐
559 ter case is relevant.
560
562 ... TODO ...
563
565 [1] COST [http://wiki.tcl.tk/COST] on the Tcler's Wiki.
566
567 [2] TreeQL [http://wiki.tcl.tk/treeql] on the Tcler's Wiki. Discuss
568 this package there.
569
571 This document, and the package it describes, will undoubtedly contain
572 bugs and other problems. Please report such in the category treeql of
573 the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please
574 also report any ideas for enhancements you may have for either package
575 and/or documentation.
576
577 When proposing code changes, please provide unified diffs, i.e the out‐
578 put of diff -u.
579
580 Note further that attachments are strongly preferred over inlined
581 patches. Attachments can be made by going to the Edit form of the
582 ticket immediately after its creation, and then using the left-most
583 button in the secondary navigation bar.
584
586 Cost, DOM, TreeQL, XPath, XSLT, structured queries, tree, tree query
587 language
588
590 Data structures
591
593 Copyright (c) 2004 Colin McCormack <coldstore@users.sourceforge.net>
594 Copyright (c) 2004 Andreas Kupries <andreas_kupries@users.sourceforge.net>
595
596
597
598
599tcllib 1.3.1 treeql(n)