1XML::XQL(3) User Contributed Perl Documentation XML::XQL(3)
2
3
4
6 XML::XQL - A perl module for querying XML tree structures with XQL
7
9 use XML::XQL;
10 use XML::XQL::DOM;
11
12 $parser = new XML::DOM::Parser;
13 $doc = $parser->parsefile ("file.xml");
14
15 # Return all elements with tagName='title' under the root element 'book'
16 $query = new XML::XQL::Query (Expr => "book/title");
17 @result = $query->solve ($doc);
18 $query->dispose; # Avoid memory leaks - Remove circular references
19
20 # Or (to save some typing)
21 @result = XML::XQL::solve ("book/title", $doc);
22
23 # Or (to save even more typing)
24 @result = $doc->xql ("book/title");
25
27 The XML::XQL module implements the XQL (XML Query Language) proposal
28 submitted to the XSL Working Group in September 1998. The spec can be
29 found at: <http://www.w3.org/TandS/QL/QL98/pp/xql.html>. Most of the
30 contents related to the XQL syntax can also be found in the
31 XML::XQL::Tutorial that comes with this distribution. Note that XQL is
32 not the same as XML-QL!
33
34 The current implementation only works with the XML::DOM module, but
35 once the design is stable and the major bugs are flushed out, other
36 extensions might follow, e.g. for XML::Grove.
37
38 XQL was designed to be extensible and this implementation tries to
39 stick to that. Users can add their own functions, methods, comparison
40 operators and data types. Plugging in a new XML tree structure (like
41 XML::Grove) should be a piece of cake.
42
43 To use the XQL module, either
44
45 use XML::XQL;
46
47 or
48
49 use XML::XQL::Strict;
50
51 The Strict module only provides the core XQL functionality as found in
52 the XQL spec. By default (i.e. by using XML::XQL) you get 'XQL+', which
53 has some additional features.
54
55 See the section "Additional Features in XQL+" for the differences.
56
57 This module is still in development. See the To-do list in XQL.pm for
58 what still needs to be done. Any suggestions are welcome, the sooner
59 these implementation issues are resolved, the faster we can all use
60 this module.
61
62 If you find a bug, you would do me a great favor by sending it to me in
63 the form of a test case. See the file t/xql_template.t that comes with
64 this distribution.
65
66 If you have written a cool comparison operator, function, method or XQL
67 data type that you would like to share, send it to
68 tjmather@tjmather.com and I will add it to this module.
69
71 solve (QUERY_STRING, INPUT_LIST...)
72 @result = XML::XQL::solve ("doc//book", $doc);
73
74 This is provided as a shortcut for:
75
76 $query = new XML::XQL::Query (Expr => "doc//book");
77 @result = $query->solve ($doc);
78 $query->dispose;
79
80 Note that with XML::XQL::DOM, you can also write (see
81 XML::DOM::Node for details):
82
83 @result = $doc->xql ("doc//book");
84
85 setDocParser (PARSER)
86 Sets the XML::DOM::Parser that is used by the new XQL+ document()
87 method. By default it uses an XML::DOM::Parser that was created
88 without any arguments, i.e.
89
90 $PARSER = new XML::DOM::Parser;
91
92 defineFunction (NAME, FUNCREF, ARGCOUNT [, ALLOWED_OUTSIDE [, CONST
93 [, QUERY_ARG]]])
94 Defines the XQL function (at the global level, i.e. for all newly
95 created queries) with the specified NAME. The ARGCOUNT parameter
96 can either be a single number or a reference to a list with
97 numbers. A single number expands to [ARGCOUNT, ARGCOUNT]. The list
98 contains pairs of numbers, indicating the number of arguments that
99 the function allows. The value -1 means infinity. E.g. [2, 5, 7, 9,
100 12, -1] means that the function can have 2, 3, 4, 5, 7, 8, 9, 12 or
101 more arguments. The number of arguments is checked when parsing
102 the XQL query string.
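The pair-list rule for ARGCOUNT can be illustrated with a small stand-alone sketch. The helper name argcount_ok is hypothetical and is not part of XML::XQL; it only mirrors the rule described above (a single number N means [N, N], and -1 means no upper limit).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::XQL: returns 1 if $n arguments
# are allowed by an ARGCOUNT specification, else 0.
sub argcount_ok {
    my ($spec, $n) = @_;
    # A single number N expands to the pair [N, N]
    my @pairs = ref($spec) eq 'ARRAY' ? @$spec : ($spec, $spec);
    while (my ($min, $max) = splice(@pairs, 0, 2)) {
        # -1 as the upper bound means "no upper limit"
        return 1 if $n >= $min && ($max == -1 || $n <= $max);
    }
    return 0;
}

my $spec = [2, 5, 7, 9, 12, -1];
print join(" ", grep { argcount_ok($spec, $_) } 0 .. 14), "\n";
# prints: 2 3 4 5 7 8 9 12 13 14
```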
103
104 The second parameter must be a reference to a Perl function or an
105 anonymous sub. E.g. '\&my_func' or 'sub { ... code ... }'
106
107 If ALLOWED_OUTSIDE (default is 0) is set to 1, the function or
108 method may also be used outside subqueries in node queries. (See
109 NodeQuery parameter in Query constructor)
110
111 If CONST (default is 0) is set to 1, the function is considered to
112 be "constant". See "Constant Function Invocations" for details.
113
114 If QUERY_ARG (default is 0) is not -1, the argument with that index
115 is considered to be a 'query parameter'. If the query parameter is
116 a subquery that returns multiple values, the result list of the
117 function invocation will contain one result value for each value of
118 the subquery. E.g. 'length(book/author)' will return a list of
119 Numbers, denoting the string lengths of all the author elements
120 returned by 'book/author'.
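The effect of a query parameter can be pictured as "call the function once per subquery value". The following is a sketch in plain Perl under that assumption; invoke_per_value and the author strings are made up for illustration and do not reflect the module's internal calling convention.

```perl
use strict;
use warnings;

# Hypothetical illustration of the 'query parameter' rule: the argument at
# index $query_arg may hold several values (an array ref); the function is
# then invoked once per value and the results are collected in a list.
sub invoke_per_value {
    my ($func, $query_arg, @args) = @_;
    my $values = $args[$query_arg];
    my @values = ref($values) eq 'ARRAY' ? @$values : ($values);
    my @results;
    for my $value (@values) {
        my @call = @args;
        $call[$query_arg] = $value;    # substitute one value at a time
        push @results, $func->(@call);
    }
    return @results;
}

# Stand-in for the values that 'book/author' might return
my @authors = ("Herman Melville", "Jane Austen");
my @lengths = invoke_per_value(sub { length $_[0] }, 0, \@authors);
print "@lengths\n";    # prints: 15 11
```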
121
122 Note that only methods (not functions) may appear after a Bang "!"
123 operator. This is checked when parsing the XQL query string.
124
125 See also: defineMethod
126
127 generateFunction (NAME, FUNCNAME, RETURN_TYPE [, ARGCOUNT [,
128 ALLOWED_OUTSIDE [, CONST [, QUERY_ARG]]]])
129 Generates and defines an XQL function wrapper for the Perl function
130 with the name FUNCNAME. The function name will be NAME in XQL query
131 expressions. The return type should be one of the builtin XQL Data
132 Types or a class derived from XML::XQL::PrimitiveType (see "Adding
133 Data Types".) See defineFunction for the meaning of ARGCOUNT,
134 ALLOWED_OUTSIDE, CONST and QUERY_ARG.
135
136 Function values are always converted to Perl strings with
137 xql_toString before they are passed to the Perl function
138 implementation. The function return value is cast to an object of
139 type RETURN_TYPE, or to the empty list [] if the result is undef.
140 It uses expandType to expand XQL primitive type names. If
141 RETURN_TYPE is "*", it returns the function result as is, unless
142 the function result is undef, in which case it returns [].
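As a rough mental model of what a generated wrapper does, consider the sketch below. It is not the code that generateFunction produces: make_wrapper is a hypothetical name, and the two small closures merely stand in for xql_toString and for the cast to RETURN_TYPE.

```perl
use strict;
use warnings;

# Hypothetical sketch of a generated wrapper: arguments are stringified,
# the Perl function is called, and an undef result becomes the empty list [].
sub make_wrapper {
    my ($perl_func, $to_string, $make_result) = @_;
    return sub {
        my @strings = map { $to_string->($_) } @_;
        my $result  = $perl_func->(@strings);
        return defined $result ? $make_result->($result) : [];
    };
}

my $uc_wrapper = make_wrapper(
    sub { uc $_[0] },                                    # wrapped Perl function
    sub { "$_[0]" },                                     # stand-in for xql_toString
    sub { return +{ type => 'Text', value => $_[0] } },  # stand-in for the cast
);

print $uc_wrapper->("moby")->{value}, "\n";    # prints: MOBY
```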
143
144 defineMethod (NAME, FUNCREF, ARGCOUNT [, ALLOWED_OUTSIDE])
145 Defines the XQL method (at the global level, i.e. for all newly
146 created queries) with the specified NAME. The ARGCOUNT parameter
147 can either be a single number or a reference to a list with
148 numbers. A single number expands to [ARGCOUNT, ARGCOUNT]. The list
149 contains pairs of numbers, indicating the number of arguments that
150 the method allows. The value -1 means infinity. E.g. [2, 5, 7, 9,
151 12, -1] means that the method can have 2, 3, 4, 5, 7, 8, 9, 12 or
152 more arguments. The number of arguments is checked when parsing
153 the XQL query string.
154
155 The second parameter must be a reference to a Perl function or an
156 anonymous sub. E.g. '\&my_func' or 'sub { ... code ... }'
157
158 If ALLOWED_OUTSIDE (default is 0) is set to 1, the function or
159 method may also be used outside subqueries in node queries. (See
160 NodeQuery parameter in Query constructor)
161
162 Note that only methods (not functions) may appear after a Bang "!"
163 operator. This is checked when parsing the XQL query string.
164
165 See also: defineFunction
166
167 defineComparisonOperators (NAME => FUNCREF [, NAME => FUNCREF]*)
168 Defines XQL comparison operators at the global level. The FUNCREF
169 parameters must be references to Perl functions or anonymous
170 subs. E.g. '\&my_func' or 'sub { ... code ... }'
171
172 E.g. define the operators $my_op$ and $my_op2$:
173
174 defineComparisonOperators ('my_op' => \&my_op,
175 'my_op2' => sub { ... insert code here ... });
176
177 defineElementValueConvertor (TAG_NAME, FUNCREF)
178 Defines that the result of the value() call for Elements with the
179 specified TAG_NAME uses the specified function. The function will
180 receive two parameters. The second one is the TAG_NAME of the
181 Element node and the first parameter is the Element node itself.
182 FUNCREF should be a reference to a Perl function, e.g. \&my_sub, or
183 an anonymous sub.
184
185 E.g. to define that all Elements with tag name 'date-of-birth'
186 should return XML::XQL::Date objects:
187
188 defineElementValueConvertor ('date-of-birth', sub {
189 my $elem = shift;
190 # Always pass in the node as the second parameter. This is
191 # the reference node for the object, which is used when
192 # sorting values in document order.
193 new XML::XQL::Date ($elem->xql_text, $elem);
194 });
195
196 These convertors can only be specified at a global level, not on a
197 per query basis. To undefine a convertor, simply pass a FUNCREF of
198 undef.
199
200 defineAttrValueConvertor (ELEM_TAG_NAME, ATTR_NAME, FUNCREF)
201 Defines that the result of the value() call for Attributes with the
202 specified ATTR_NAME and a parent Element with the specified
203 ELEM_TAG_NAME uses the specified function. An ELEM_TAG_NAME of "*"
204 will match regardless of the tag name of the parent Element. The
205 function will receive 3 parameters. The third one is the tag name
206 of the parent Element (even if ELEM_TAG_NAME was "*"), the second
207 is the ATTR_NAME and the first is the Attribute node itself.
208 FUNCREF should be a reference to a Perl function, e.g. \&my_sub, or
209 an anonymous sub.
210
211 These convertors can only be specified at a global level, not on a
212 per query basis. To undefine a convertor, simply pass a FUNCREF of
213 undef.
214
215 defineTokenQ (Q)
216 Defines the token for the q// string delimiters at a global level.
217 The default value for XQL+ is 'q', for XML::XQL::Strict it is
218 undef. A value of undef will deactivate this feature.
219
220 defineTokenQQ (QQ)
221 Defines the token for the qq// string delimiters at a global level.
222 The default value for XQL+ is 'qq', for XML::XQL::Strict it is
223 undef. A value of undef will deactivate this feature.
224
225 expandType (TYPE)
226 Used internally to expand type names of XQL primitive types. E.g.
227 it expands "Number" to "XML::XQL::Number" and is not case-
228 sensitive, so "number" and "NuMbEr" will both expand correctly.
229
230 defineExpandedTypes (ALIAS, FULL_NAME [, ...])
231 For each pair of arguments it allows the class name FULL_NAME to be
232 abbreviated with ALIAS. The definitions are used by expandType().
233 (ALIAS is always converted to lowercase internally, because
234 expandType is case-insensitive.)
235
236 Overriding the ALIAS for "date", also affects the object type
237 returned by the date() function.
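The alias mechanism amounts to a case-insensitive lookup table. A minimal sketch, assuming a plain hash as the table: the names define_expanded_types, expand_type and My::Date are hypothetical, and returning unknown names unchanged is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical alias table; the real one lives inside XML::XQL.
my %expanded = (number => 'XML::XQL::Number', text => 'XML::XQL::Text');

sub define_expanded_types {
    my @pairs = @_;
    while (my ($alias, $full_name) = splice(@pairs, 0, 2)) {
        $expanded{lc $alias} = $full_name;    # aliases are stored in lowercase
    }
}

sub expand_type {
    my ($type) = @_;
    # Unknown names are returned unchanged (an assumption of this sketch)
    return $expanded{lc $type} // $type;
}

define_expanded_types(Date => 'My::Date');    # My::Date is a made-up class
print expand_type("NuMbEr"), " ", expand_type("DATE"), "\n";
# prints: XML::XQL::Number My::Date
```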
238
239 setErrorContextDelimiters (START, END, BOLD_ON, BOLD_OFF)
240 Sets the delimiters used when printing error messages during query
241 evaluation. The default delimiters on Unix are `tput smul`
242 (underline on) and `tput rmul` (underline off). On other systems
243 (that don't have tput), the delimiters are ">>" and "<<" resp.
244
245 When printing the error message, the subexpression that caused the
246 error will be enclosed by the delimiters, i.e. underlined on Unix.
247
248 For certain subexpressions the significant keyword, e.g. "$and$" is
249 enclosed in the bold delimiters BOLD_ON (default: `tput bold` on
250 Unix, "" elsewhere) and BOLD_OFF (default: (`tput rmul` . `tput
251 smul`) on Unix, "" elsewhere, see $BoldOff in XML::XQL::XQL.pm for
252 details.)
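To picture the effect of the START and END delimiters, here is a sketch that wraps a subexpression in the non-tput fallback delimiters. The helper mark_error_context and its offset/length arguments are hypothetical; the module's real error formatting may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: mark the failing subexpression inside a query string,
# defaulting to the ">>" and "<<" delimiters used on systems without tput.
sub mark_error_context {
    my ($query, $offset, $length, $start, $end) = @_;
    $start //= ">>";
    $end   //= "<<";
    my $marked = $query;
    substr($marked, $offset, $length) =
        $start . substr($query, $offset, $length) . $end;
    return $marked;
}

print mark_error_context('book[title $and$ 3]', 5, 13), "\n";
# prints: book[>>title $and$ 3<<]
```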
253
254 isEmptyList (VAR)
255 Returns 1 if VAR is [], else 0. Can be used in user defined
256 functions.
257
259 Parent operator '..'
260 The '..' operator returns the parent of the current node, where '.'
261 would return the current node. This is not part of any XQL
262 standard, because you would normally use return operators, which
263 are not implemented here.
264
265 Sequence operators ';' and ';;'
266 The sequence operators ';' (precedes) and ';;' (immediately
267 precedes) are not in the XQL spec, but are described in 'The Design
268 of XQL' by Jonathan Robie who is one of the designers of XQL. It
269 can be found at <http://www.texcel.no/whitepapers/xql-design.html>.
270 See also the XQL Tutorial for a description of what they mean.
271
272 q// and qq// String Tokens
273 String tokens a la q// and qq// are allowed. q// evaluates like
274 Perl's single quotes and qq// like Perl's double quotes. Note that
275 the default XQL strings do not allow escaping etc., so it's not
276 possible to define a string with both single and double quotes. If
277 'q' and 'qq' are not to your liking, you may redefine them to
278 something else or undefine them altogether, by assigning undef to
279 them. E.g:
280
281 # at a global level - shared by all queries (that don't (re)define 'q')
282 XML::XQL::defineTokenQ ('k');
283 XML::XQL::defineTokenQQ (undef);
284
285 # at a query level - only defined for this query
286 $query = new XML::XQL::Query (Expr => "book/title", q => 'k', qq => undef);
287
288 From now on k// works like q// did and qq// doesn't work at all
289 anymore.
290
291 Query strings can have embedded Comments
292 For example:
293
294 $queryExpr = "book/title # this comment is inside the query string
295 [. = 'Moby Dick']"; # this comment is outside
296
297 Optional dollar delimiters and case-insensitive XQL keywords
298 The following XQL keywords are case-insensitive and the dollar sign
299 delimiters may be omitted: $and$, $or$, $not$, $union$,
300 $intersect$, $to$, $any$, $all$, $eq$, $ne$, $lt$, $gt$, $ge$,
301 $le$, $ieq$, $ine$, $ilt$, $igt$, $ige$, $ile$.
302
303 E.g. $AND$, $And$, $aNd$, and, And, aNd are all valid replacements
304 for $and$.
305
306 Note that XQL+ comparison operators ($match$, $no_match$, $isa$,
307 $can$) still require dollar delimiters and are case-sensitive.
308
309 Comparison operator: $match$ or '=~'
310 E.g. "book/title[. =~ '/(Moby|Dick)/']" will return all book titles
311 containing Moby or Dick. Note that the match expression needs to be
312 quoted and should contain the // or m// delimiters for Perl.
313
314 When casting the values to be matched, both are converted to Text.
315
316 Comparison operator: $no_match$ or '!~'
317 E.g. "book/title[. !~ '/(Moby|Dick)/']" will return all book titles
318 that don't contain Moby or Dick. Note that the match expression
319 needs to be quoted and should contain the // or m// delimiters for
320 Perl.
321
322 When casting the values to be matched, both are converted to Text.
323
324 Comparison operator: $isa$
325 E.g. '//. $isa$ "XML::XQL::Date"' returns all elements for which
326 the value() function returns an XML::XQL::Date object. (Note that
327 the value() function can be overridden to return a specific object
328 type for certain elements and attributes.) It uses expandType to
329 expand XQL primitive type names.
330
331 Comparison operator: $can$
332 E.g. '//. $can$ "swim"' returns all elements for which the value()
333 function returns an object that implements the (Perl) swim()
334 method. (Note that the value() function can be overridden to
335 return a specific object type for certain elements and attributes.)
336
337 Function: once (QUERY)
338 E.g. 'once(id("foo"))' will evaluate the QUERY expression only once
339 per query. Certain query results (like the above example) will
340 always return the same value within a query. Using once() will
341 cache the QUERY result for the rest of the query.
342
343 Note that "constant" function invocations are always cached. See
344 also "Constant Function Invocations"
345
346 Function: subst (QUERY, EXPR, EXPR [, MODIFIERS [, MODE]])
347 E.g. 'subst(book/title, "[Mm]oby", "Dick", "g")' will replace Moby
348 or moby with Dick globally ("g") in all book title elements.
349 Underneath it uses Perl's substitute operator s///. Don't worry
350 about which delimiters are used underneath. The function returns
351 all the book/titles for which a substitution occurred. The default
352 MODIFIERS string is "" (empty.) The function name may be
353 abbreviated to "s".
354
355 For most Node types, it converts the value() to a string (with
356 xql_toString) to match the string and xql_setValue to set the new
357 value in case it matched. For XQL primitives (Boolean, Number,
358 Text) and other data types (e.g. Date) it uses xql_toString to
359 match the String and xql_setValue to set the result. Beware that
360 performing a substitution on a primitive that was found in the
361 original XQL query expression, changes the value of that constant.
362
363 If MODE is 0 (default), it treats Element nodes differently by
364 matching and replacing text blocks occurring in the Element node. A
365 text block is defined as the concatenation of the raw text of
366 subsequent Text, CDATASection and EntityReference nodes. In this
367 mode it skips embedded Element nodes. If a text block matches, it
368 is replaced by a single Text node, regardless of the original node
369 type(s).
370
371 If MODE is 1, it treats Element nodes like the other nodes, i.e. it
372 converts the value() to a string etc. Note that the default
373 implementation of value() calls text(), which normalizes whitespace
374 and includes embedded Element descendants (recursively.) This is
375 probably not what you want to use in most cases, but since I'm not
376 a professional psychic... :-)
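The "return only what changed" behaviour of subst() can be mimicked on plain strings. This is a sketch, not the module's implementation: subst_values is a hypothetical name, it works on an array of strings rather than on nodes, and it only understands the "g" and "i" modifiers.

```perl
use strict;
use warnings;

# Hypothetical stand-in for subst(QUERY, EXPR, EXPR, MODIFIERS): apply s///
# to each value in place and return only the values that changed.
sub subst_values {
    my ($values, $pattern, $replacement, $modifiers) = @_;
    $modifiers //= "";
    my $re = $modifiers =~ /i/ ? qr/$pattern/i : qr/$pattern/;
    my @changed;
    for my $value (@$values) {    # $value aliases the array element
        my $count = $modifiers =~ /g/
            ? ($value =~ s/$re/$replacement/g)
            : ($value =~ s/$re/$replacement/);
        push @changed, $value if $count;
    }
    return @changed;
}

my @titles  = ("Moby Dick", "The moby report", "Emma");
my @changed = subst_values(\@titles, "[Mm]oby", "Dick", "g");
print join(" | ", @changed), "\n";
# prints: Dick Dick | The Dick report
```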
377
378 Function: map (QUERY, CODE)
379 E.g. 'map(book/title, "s/[Mm]oby/Dick/g; $_")' will replace Moby
380 or moby with Dick globally ("g") in all book title elements.
381 Underneath it uses Perl's map operator. The function returns all
382 the book/titles for which a change occurred.
383
384 ??? add more specifics
385
386 Function: eval (EXPR [,TYPE])
387 Evaluates the Perl expression EXPR and returns an object of the
388 specified TYPE. It uses expandType to expand XQL primitive type
389 names. If the result of the eval was undef, the empty list [] is
390 returned.
391
392 E.g. 'eval("2 + 5", "Number")' returns a Number object with the
393 value 7, and
394 'eval("$ENV{USER}")' returns a Text object with the user name.
395
396 Consider using once() to cache the return value, when the
397 invocation will return the same result for each invocation within a
398 query.
399
400 ??? add more specifics
401
402 Function: new (TYPE [, QUERY [, PAR] *])
403 Creates a new object of the specified object TYPE. The constructor
404 may have any number of arguments. The first argument of the
405 constructor (the 2nd argument of the new() function) is considered
406 to be a 'query parameter'. See defineFunction for a definition of
407 query parameter. It uses expandType to expand XQL primitive type
408 names.
409
410 Function: document (QUERY) or doc (QUERY)
411 The document() function creates a new XML::DOM::Document for each
412 result of QUERY (QUERY may be a simple string expression, like
413 "/usr/enno/file.xml". See t/xql_document.t or below for an example
414 with a more complex QUERY.)
415
416 document() may be abbreviated to doc().
417
418 document() uses an XML::DOM::Parser underneath, which can be set
419 with XML::XQL::setDocParser(). By default it uses a parser that was
420 created without any arguments, i.e.
421
422 $PARSER = new XML::DOM::Parser;
423
424 Let's try a more complex example, assuming $doc contains:
425
426 <doc>
427 <file name="file1.xml"/>
428 <file name="file2.xml"/>
429 </doc>
430
431 Then the following query will return two XML::DOM::Documents, one
432 for file1.xml and one for file2.xml:
433
434 @result = XML::XQL::solve ("document(doc/file/@name)", $doc);
435
436 The resulting documents can be used as input for following queries,
437 e.g.
438
439 @result = XML::XQL::solve ("document(doc/file/@name)/root/bla", $doc);
440
441 will return all /root/bla elements from the documents returned by
442 document().
443
444 Method: DOM_nodeType ()
445 Returns the DOM node type. Note that these are mostly the same as
446 nodeType(), except for CDATASection and EntityReference nodes.
447 DOM_nodeType() returns 4 and 5 respectively, whereas nodeType()
448 returns 3, because they are considered text nodes.
449
450 Function wrappers for Perl builtin functions
451 XQL function wrappers have been provided for most Perl builtin
452 functions. When using a Perl builtin function like "substr" in an
453 XQL+ query, an XQL function wrapper will be generated on the fly.
454 The arguments to these functions may be regular XQL+ subqueries
455 (that return one or more values) for a query parameter (see
456 defineFunction for a definition.) Most wrappers of Perl builtin
457 functions have argument 0 for a query parameter, except for: chmod
458 (parameter 1 is the query parameter), chown (2) and utime (2). The
459 following functions have no query parameter, which means that all
460 parameters should be a single value: atan2, rand, srand, sprintf,
461 rename, unlink, system.
462
463 The function result is cast to the appropriate XQL primitive type
464 (Number, Text or Boolean), or to an empty list if the result was
465 undef.
466
467 XPath functions and methods
468 The following functions were found in the XPath specification:
469
470 Function: concat (STRING, STRING, STRING*)
471 The concat function returns the concatenation of its arguments.
472
473 Function: starts-with (STRING, STRING)
474 The starts-with function returns true if the first argument string
475 starts with the second argument string, and otherwise returns
476 false.
477
478 Function: contains (STRING, STRING)
479 The contains function returns true if the first argument string
480 contains the second argument string, and otherwise returns false.
481
482 Function: substring-before (STRING, STRING)
483 The substring-before function returns the substring of the first
484 argument string that precedes the first occurrence of the second
485 argument string in the first argument string, or the empty string
486 if the first argument string does not contain the second argument
487 string. For example,
488
489 substring-before("1999/04/01","/") returns 1999.
490
491 Function: substring-after (STRING, STRING)
492 The substring-after function returns the substring of the first
493 argument string that follows the first occurrence of the second
494 argument string in the first argument string, or the empty string
495 if the first argument string does not contain the second argument
496 string. For example,
497
498 substring-after("1999/04/01","/") returns 04/01,
499
500 and
501
502 substring-after("1999/04/01","19") returns 99/04/01.
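The behaviour of substring-before and substring-after can be reproduced with Perl's index() and substr(). The two subs below are illustrative re-implementations with made-up names, not the module's own code.

```perl
use strict;
use warnings;

# Everything before the first occurrence of $sep, or "" if $sep is absent
sub substring_before {
    my ($str, $sep) = @_;
    my $pos = index($str, $sep);
    return $pos < 0 ? "" : substr($str, 0, $pos);
}

# Everything after the first occurrence of $sep, or "" if $sep is absent
sub substring_after {
    my ($str, $sep) = @_;
    my $pos = index($str, $sep);
    return $pos < 0 ? "" : substr($str, $pos + length($sep));
}

print substring_before("1999/04/01", "/"), "\n";    # prints: 1999
print substring_after("1999/04/01", "19"), "\n";    # prints: 99/04/01
```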
503
504 Function: substring (STRING, NUMBER [, NUMBER] )
505 The substring function returns the substring of the first argument
506 starting at the position specified in the second argument with
507 length specified in the third argument. For example,
508
509 substring("12345",2,3) returns "234".
510
511 If the third argument is not specified, it returns the substring
512 starting at the position specified in the second argument and
513 continuing to the end of the string. For example,
514
515 substring("12345",2) returns "2345".
516
517 More precisely, each character in the string is considered to have
518 a numeric position: the position of the first character is 1, the
519 position of the second character is 2 and so on.
520
521 NOTE: This differs from the substr method, which treats the position
522 of the first character as 0.
523
524 The XPath spec says this about rounding, but that is not true in
525 this implementation: The returned substring contains those
526 characters for which the position of the character is greater than
527 or equal to the rounded value of the second argument and, if the
528 third argument is specified, less than the sum of the rounded value
529 of the second argument and the rounded value of the third argument;
530 the comparisons and addition used for the above follow the standard
531 IEEE 754 rules; rounding is done as if by a call to the round
532 function.
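The difference between the 1-based substring() and Perl's 0-based substr() comes down to an offset of one. A minimal sketch, assuming integer arguments that lie inside the string (no rounding and no out-of-range handling); xpath_substring is a hypothetical name.

```perl
use strict;
use warnings;

# 1-based substring on top of Perl's 0-based substr()
sub xpath_substring {
    my ($str, $start, $len) = @_;
    return defined $len ? substr($str, $start - 1, $len)
                        : substr($str, $start - 1);
}

print xpath_substring("12345", 2, 3), "\n";    # prints: 234
print xpath_substring("12345", 2), "\n";       # prints: 2345
print substr("12345", 2, 3), "\n";             # prints: 345 (0-based)
```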
533
534 Method: string-length ( [ QUERY ] )
535 The string-length method returns the number of characters in the string.
536 If the argument is omitted, it defaults to the context node
537 converted to a string, in other words the string-value of the
538 context node.
539
540 Note that the generated XQL wrapper for the Perl built-in length
541 does not allow the argument to be omitted.
542
543 Method: normalize-space ( [ QUERY ] )
544 The normalize-space function returns the argument string with
545 whitespace normalized by stripping leading and trailing whitespace
546 and replacing sequences of whitespace characters by a single space.
547 Whitespace characters are the same as those allowed by the S
548 production in XML. If the argument is omitted, it defaults to the
549 context node converted to a string, in other words the string-value
550 of the context node.
551
552 Function: translate (STRING, STRING, STRING)
553 The translate function returns the first argument string with
554 occurrences of characters in the second argument string replaced by
555 the character at the corresponding position in the third argument
556 string. For example,
557
558 translate("bar","abc","ABC") returns the string BAr.
559
560 If there is a character in the second argument string with no
561 character at a corresponding position in the third argument string
562 (because the second argument string is longer than the third
563 argument string), then occurrences of that character in the first
564 argument string are removed. For example,
565
566 translate("--aaa--","abc-","ABC") returns "AAA".
567
568 If a character occurs more than once in the second argument string,
569 then the first occurrence determines the replacement character. If
570 the third argument string is longer than the second argument
571 string, then excess characters are ignored.
572
573 NOTE: The translate function is not a sufficient solution for case
574 conversion in all languages. A future version may provide
575 additional functions for case conversion.
576
577 This function was implemented using tr///d.
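Since tr/// does not interpolate variables, a tr///d based translate needs a string eval. The following sketch shows one way to do that; xpath_translate is a hypothetical name and this is not necessarily how the module builds its own tr///d call.

```perl
use strict;
use warnings;

# translate() built on tr///d. quotemeta (via \Q...\E) keeps characters such
# as "-" and "/" from being read as a range or as the tr delimiter.
sub xpath_translate {
    my ($str, $from, $to) = @_;
    eval "\$str =~ tr/\Q$from\E/\Q$to\E/d; 1" or die $@;
    return $str;
}

print xpath_translate("bar", "abc", "ABC"), "\n";         # prints: BAr
print xpath_translate("--aaa--", "abc-", "ABC"), "\n";    # prints: AAA
```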
578
579 Function: sum ( QUERY )
580 The sum function returns the sum of the QUERY results, by
581 converting the string values of each result to a number.
582
583 Function: floor (NUMBER)
584 The floor function returns the largest (closest to positive
585 infinity) number that is not greater than the argument and that is
586 an integer.
587
588 Function: ceiling (NUMBER)
589 The ceiling function returns the smallest (closest to negative
590 infinity) number that is not less than the argument and that is an
591 integer.
592
593 Function: round (NUMBER)
594 The round function returns the number that is closest to the
595 argument and that is an integer. If there are two such numbers,
596 then the one that is closest to positive infinity is returned.
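The numeric functions above map closely onto POSIX and List::Util. The sketch below shows the tie-breaking rule of round() and the string-to-number conversion of sum(); xpath_round and xpath_sum are hypothetical names, not the module's own subs.

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);
use List::Util qw(sum0);

# round() as described: nearest integer, ties go toward positive infinity
sub xpath_round { return floor($_[0] + 0.5) }

# sum() as described: string values are converted to numbers and added
sub xpath_sum { return sum0(map { $_ + 0 } @_) }

print xpath_round(2.5), " ", xpath_round(-2.5), "\n";    # prints: 3 -2
print floor(-1.5), " ", ceil(-1.5), "\n";                # prints: -2 -1
print xpath_sum("1.5", "2", "3"), "\n";                  # prints: 6.5
```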
597
599 XQL Builtin Data Types
600 The XQL engine uses the following object classes internally. Only
601 Number, Boolean and Text are considered primitive XQL types:
602
603 • XML::XQL::Number
604
605 For integers and floating point numbers.
606
607 • XML::XQL::Boolean
608
609 For booleans, e.g. returned by true() and false().
610
611 • XML::XQL::Text
612
613 For string values.
614
615 • XML::XQL::Date
616
617 For date, time and date/time values. E.g. returned by the
618 date() function.
619
620 • XML::XQL::Node
621
622 Superclass of all XML node types. E.g. all subclasses of
623 XML::DOM::Node subclass from this.
624
625 • Perl list reference
626
627 Lists of values are passed by reference (i.e. using []
628 delimiters). The empty list [] has a double meaning. It also
629 means 'undef' in certain situations, e.g. when a function
630 invocation or comparison failed.
631
632 Type casting in comparisons
633 When two values are compared in an XQL comparison (e.g. $eq$) the
634 values are first cast to the same data type. Node values are
635 first replaced by their value() (i.e. the XQL value() function is
636 used, which returns a Text value by default, but may return any
637 data type if the user so chooses.) The resulting values are then
638 cast to the type of the object with the highest xql_primType()
639 value. The values are as follows: Node (0), Text (1), Number (2), Boolean
640 (3), Date (4), other data types (4 by default, but this may be
641 overridden by the user.)
642
643 E.g. if one value is a Text value and the other is a Number, the
644 Text value is cast to a Number and the resulting low-level (Perl)
645 comparison is (for $eq$):
646
647 $number->xql_toString == $text->xql_toString
648
649 If both were Text values, it would have been
650
651 $text1->xql_toString eq $text2->xql_toString
652
653 Note that the XQL spec is vague and even conflicting where it
654 concerns type casting. This implementation resulted after talking
655 to Joe Lapp, one of the spec writers.
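The casting rule can be tried out on a toy model in which each value is a [TYPE, STRING] pair. This is only a sketch of the rule for Text and Number operands; xql_eq and the %prim_type table are hypothetical and do not use the module's classes.

```perl
use strict;
use warnings;

# Ranks as listed above; values are modeled as [TYPE, STRING] pairs.
my %prim_type = (Node => 0, Text => 1, Number => 2, Boolean => 3, Date => 4);

# Hypothetical $eq$ for Text and Number operands only
sub xql_eq {
    my ($left, $right) = @_;
    # Both operands are cast to the type with the highest rank
    my ($target) = sort { $prim_type{$b} <=> $prim_type{$a} }
                   ($left->[0], $right->[0]);
    return ($left->[1] == $right->[1] ? 1 : 0) if $target eq 'Number';
    return ($left->[1] eq $right->[1] ? 1 : 0) if $target eq 'Text';
    die "this sketch only handles Text and Number\n";
}

print xql_eq([Text => "7.0"], [Number => "7"]), "\n";    # numeric: prints 1
print xql_eq([Text => "7.0"], [Text   => "7"]), "\n";    # string:  prints 0
```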
656
657 Adding Data Types
658 If you want to add your own data type, make sure it derives from
659 XML::XQL::PrimitiveType and implements the necessary methods.
660
661 I will add more stuff here to explain it all, but for now, look at
662 the code for the primitive XQL types or the Date class
663 (XML::XQL::Date in Date.pm.)
664
665 Document Order
666 The XQL spec states that query results always return their values
667 in document order, which means the order in which they appeared in
668 the original XML document. Values extracted from Nodes (e.g. with
669 value(), text(), rawText(), nodeName(), etc.) always have a pointer
670 to the reference node (i.e. the Node from which the value was
671 extracted.) These pointers are acknowledged when (intermediate)
672 result lists are sorted. Currently, the only place where a result
673 list is sorted is in a $union$ expression, which is the only place
674 where the result list can be unordered. (If you find that this is
675 not true, let me know.)
676
677 Non-node values that have no associated reference node, always end
678 up at the end of the result list in the order that they were added.
679 The XQL spec states that the reference node for an XML Attribute is
680 the Element to which it belongs, and that the order of values with
681 the same reference node is undefined. This means that the order of
682 an Element and its attributes would be undefined. But since the
683 XML::DOM module keeps track of the order of the attributes, the XQL
684 engine does the same, and therefore, the attributes of an Element
685 are sorted and appear after their parent Element in a sorted result
686 list.
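The ordering rule (values with a reference node first, in document order; values without one last, in insertion order) can be sketched as follows. The hash layout with a 'pos' key and the name sort_document_order are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Hypothetical model: each value is a hash with an optional 'pos' key, the
# document position of its reference node.
sub sort_document_order {
    my @values  = @_;
    my @with    = sort { $a->{pos} <=> $b->{pos} }
                  grep { defined $_->{pos} } @values;
    my @without = grep { !defined $_->{pos} } @values;    # insertion order kept
    return (@with, @without);
}

my @sorted = sort_document_order(
    { value => "constant" },                 # no reference node
    { value => "Emma",      pos => 9 },
    { value => "Moby Dick", pos => 4 },
);
print join(", ", map { $_->{value} } @sorted), "\n";
# prints: Moby Dick, Emma, constant
```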
687
688 Constant Function Invocations
689 If a function always returns the same value when given "constant"
690 arguments, the function is considered to be "constant". A
691 "constant" argument can be either an XQL primitive (Number,
692 Boolean, Text) or a "constant" function invocation. E.g.
693
694 date("12-03-1998")
695 true()
696 sin(0.3)
697 length("abc")
698 date(substr("12-03-1998 is the date", 0, 10))
699
700 are constant, but not:
701
702 length(book[2])
703
704 Results of constant function invocations are cached and calculated
705 only once for each query. See also the CONST parameter in
706 defineFunction. It is not necessary to wrap constant function
707 invocations in a once() call.
708
709 Constant XQL functions are: date, true, false and a lot of the XQL+
710 wrappers for Perl builtin functions. Function wrappers for certain
711 builtins are not made constant on purpose to force the invocation
712 to be evaluated every time, e.g. 'mkdir("/user/enno/my_dir",
713 "0644")' (although constant in appearance) may return different
714 results for multiple invocations. See %PerlFunc in Plus.pm for
715 details.
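Caching of constant invocations behaves like memoization keyed on the invocation itself. A minimal sketch, assuming a per-query hash as the cache; cached_invoke and the key format are hypothetical and say nothing about how the module stores its cache.

```perl
use strict;
use warnings;

# Hypothetical per-query cache keyed by the text of the invocation
my %cache;
my $calls = 0;

sub cached_invoke {
    my ($key, $func) = @_;
    $cache{$key} = $func->() unless exists $cache{$key};
    return $cache{$key};
}

# length("abc") has only constant arguments, so one evaluation suffices
cached_invoke('length("abc")', sub { $calls++; length "abc" }) for 1 .. 3;
print "evaluated $calls time(s)\n";    # prints: evaluated 1 time(s)
```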
716
717 Function: count ([QUERY])
718 The count() function has no parameters in the XQL spec. In this
719 implementation it will return the number of QUERY results when
720 passed a QUERY parameter.
721
722 Method: text ([RECURSE])
723 When expanding an Element node, the text() method adds the expanded
724 text() value of sub-Elements. When RECURSE is set to 0 (default is
725 1), it will not include sub-elements. This is useful e.g. when
726 using the $match$ operator in a recursive context (using the //
727 operator), so it won't return parent Elements when one of the
728 children matches.
729
730 Method: rawText ([RECURSE])
731 See text().
732
734 XML::XQL::Query, XML::XQL::DOM, XML::XQL::Date
735
736 The Japanese version of this document can be found on-line at
737 <http://member.nifty.ne.jp/hippo2000/perltips/xml/xql.htm>
738
739 The XML::XQL::Tutorial manual page. The Japanese version can be found
740 at <http://member.nifty.ne.jp/hippo2000/perltips/xml/xql/tutorial.htm>
741
742 The XQL spec at <http://www.w3.org/TandS/QL/QL98/pp/xql.html>
743
744 The Design of XQL at <http://www.texcel.no/whitepapers/xql-design.html>
745
746 The DOM Level 1 specification at <http://www.w3.org/TR/REC-DOM-Level-1>
747
748 The XML spec (Extensible Markup Language 1.0) at
749 <http://www.w3.org/TR/REC-xml>
750
751 The XML::Parser and XML::Parser::Expat manual pages.
752
754 Enno Derksen is the original author.
755
756 Please send bugs, comments and suggestions to T.J. Mather
757 <tjmather@tjmather.com>
758
759
760
761perl v5.34.0 2021-07-23 XML::XQL(3)