xmlsimple(3am)            GNU Awk Extension Modules            xmlsimple(3am)

xmlsimple - Provides facilities for writing simple one-line scripts
with the gawk-xml extension. Also provides higher-level functions that
simplify writing more complex scripts.

@include "xmlsimple"

parentpath = XmlParent(path)
test = XmlMatch(path)
scopepath = XmlMatchScope(path)
ancestorpath = XmlMatchAttr(path, name, value, mode)

XmlGrep()

The xmlsimple library facilitates writing scripts based on the gawk-xml
extension. It is an alternative to the xmllib library. A key difference
is that $0 is not changed, so xmlsimple is compatible with awk code
that relies on the gawk-xml core interface.

Short token variable names
To shorten simple scripts, xmlsimple provides two-letter named
variables that duplicate predefined token-related core variables:

XD      Equivalent to XMLDECLARATION.
SD      Equivalent to XMLSTARTDOCT.
ED      Equivalent to XMLENDDOCT.
PI      Equivalent to XMLPROCINST.
SE      Equivalent to XMLSTARTELEM.
EE      Equivalent to XMLENDELEM.
TX      Equivalent to XMLCHARDATA.
SC      Equivalent to XMLSTARTCDATA.
EC      Equivalent to XMLENDCDATA.
CM      Equivalent to XMLCOMMENT.
UP      Equivalent to XMLUNPARSED.
EOI     Equivalent to XMLENDDOCUMENT.
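
As an illustration of the short names, the following minimal sketch
prints the name of every element as its start tag is read. It assumes
that including xmlsimple is enough to make the gawk-xml extension
available (the library is documented as including xmlbase), and it
relies on XMLSTARTELEM holding the element name, as in the gawk-xml
core interface.

```awk
@include "xmlsimple"

# SE duplicates XMLSTARTELEM: it is non-null only at a start-element
# token, where it holds the element name.
SE { print SE }
```

Because gawk accepts library names with its -i option, the same rule
can also be given directly on the command line as a one-line script.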

Collecting character data
Character data items between element tags are automatically collected
in a single CHARDATA variable. This feature simplifies processing text
data interspersed with comments, processing instructions or CDATA
markup.

CHARDATA
    Available at every XMLSTARTELEM or XMLENDELEM token. Contains all
    the character data since the previous start- or end-element tag.
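
A minimal sketch of how CHARDATA might be used: print the text of
every title element when its end tag is reached. The element name
title is only an example. If the element contains child elements,
CHARDATA at the end tag holds only the text that follows the last
child, as described above.

```awk
@include "xmlsimple"

# At the end tag of <title>, CHARDATA holds the character data
# collected since the previous start- or end-element tag.
EE == "title" { print CHARDATA }
```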

Whitespace handling
The XMLTRIM mode variable controls whether whitespace in the CHARDATA
variable is automatically trimmed. Possible values are:

XMLTRIM = 0
    Keep all whitespace.

XMLTRIM = 1 (default)
    Discard leading and trailing whitespace, and collapse contiguous
    whitespace characters into a single space character.

XMLTRIM = -1
    Only collapse contiguous whitespace characters into a single space
    character. The collapsed leading or trailing whitespace is kept.
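
The effect of the three modes can be sketched in plain awk. The
function trim_sketch() below is a hypothetical helper written for this
illustration only; it is not part of xmlsimple and does not show how
the library implements trimming. It assumes that whitespace means
spaces, tabs, carriage returns and newlines.

```awk
# Hypothetical helper, not part of xmlsimple: mimics the documented
# behaviour of the three XMLTRIM modes on a string.
function trim_sketch(text, mode) {
    if (mode == 0)
        return text                     # keep all whitespace
    gsub(/[ \t\r\n]+/, " ", text)       # collapse each whitespace run
    if (mode == 1) {
        sub(/^ /, "", text)             # discard leading whitespace
        sub(/ $/, "", text)             # discard trailing whitespace
    }
    return text                         # mode -1 keeps the edge spaces
}

BEGIN {
    s = "  The \n  Hobbit  "
    printf "[%s]\n", trim_sketch(s, 1)  # prints [The Hobbit]
    printf "[%s]\n", trim_sketch(s, -1) # prints [ The Hobbit ]
}
```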

Recording ancestor information
The ATTR array variable automatically keeps the attributes of every
ancestor of the current element, and of the element itself.

ATTR[path@attribute]
    Contains the value of the specified attribute of the ancestor
    element at the given path.

Example

While processing a /books/book/title element,
ATTR["/books/book@on-loan"] contains the name of the book loaner.
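
The example above might be written as the following sketch. It builds
the array index from XmlParent(), so it works for a title element at
any depth. Whether an absent attribute is simply missing from ATTR is
an assumption here, which is why the index is tested with the in
operator before it is used.

```awk
@include "xmlsimple"

SE == "title" {
    # XmlParent() without an argument uses XMLPATH, so for
    # /books/book/title the key is "/books/book@on-loan".
    key = XmlParent() "@on-loan"
    if (key in ATTR)
        print "loaner:", ATTR[key]
}
```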

Path related functions
A fixed path is a slash-delimited list of direct child elements
(/name/name/...). A path expression also accepts an asterisk (*) to
match any name, and a double slash (//) to represent a descendant at
any level. An absolute path starts with a slash (a path from the root
element). A relative path, without a leading slash, can start at any
level (a path from some ancestor).

XmlParent(path)
    Returns the path of the parent element, that is, the path argument
    without its last /name part. The path argument is optional. If
    omitted, XMLPATH is used.

XmlMatch(path)
    Tests whether the current XMLPATH matches the path expression
    argument, anchored at the end.

XmlMatchScope(path)
    Returns the prefix of XMLPATH that is not matched by the path
    expression argument. Returns a null value if there is no match.

XmlMatchAttr(path, name, value, mode)
    Returns the path of the innermost ancestor that matches the path
    argument and also has a name attribute with the given value. The
    mode argument is optional. If it is non-null, the value is treated
    as a regular expression instead of a fixed value.
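
The following sketch shows how these functions might be combined. The
document structure (books, book, title) and the on-loan attribute are
illustrative assumptions. The SE guard restricts each rule to
start-element tokens, since XMLPATH can also match while other tokens
inside the element are being processed.

```awk
@include "xmlsimple"

# Absolute fixed path: count <title> children of <book> under <books>.
SE && XmlMatch("/books/book/title") { titles++ }

# Relative path expression with a wildcard: any child of any <book>.
SE && XmlMatch("book/*") {
    print "child:", SE, "of", XmlParent()
    # Per the description above, for /books/book/title the scope
    # should be the unmatched prefix /books.
    print "scope:", XmlMatchScope("book/*")
}

# Titles inside a <book> whose on-loan attribute starts with "A".
# The non-null fourth argument makes the value a regular expression.
SE == "title" && XmlMatchAttr("book", "on-loan", "^A", 1) {
    print "on loan, title under", XmlParent()
}

END { print titles + 0, "titles" }
```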

Grep-like facilities
XmlGrep()
    If invoked at the XMLSTARTELEM event, causes the whole element
    subtree to be copied to the output.
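
A minimal sketch of a grep-like filter: copy every book element that
carries an on-loan attribute. The element and attribute names are
examples only. XMLATTR is the gawk-xml core array holding the
attributes of the current start-element token.

```awk
@include "xmlsimple"

# XmlGrep() is called at the start-element token, so the whole
# <book> subtree is copied to the output.
SE == "book" && ("on-loan" in XMLATTR) { XmlGrep() }
```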

The xmlsimple library includes both the xmlbase and xmlcopy libraries.
Their functionality is implicitly available.

The path related functions only operate on elements. Comments,
processing instructions or CDATA sections are not taken into account.

XmlGrep() cannot be used to copy tokens outside the root element (XML
prologue or epilogue).

XML Processing With gawk, xmlbase(3am), xmlcopy(3am), xmltree(3am),
xmlwrite(3am).

Manuel Collado, m-collado@users.sourceforge.net.

Copyright (C) 2017, Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of this
manual page provided the copyright notice and this permission notice
are preserved on all copies.

Permission is granted to copy and distribute modified versions of this
manual page under the conditions for verbatim copying, provided that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

Permission is granted to copy and distribute translations of this
manual page into another language, under the above conditions for
modified versions, except that this permission notice may be stated in
a translation approved by the Foundation.

GAWK Extension Library (gawkextlib)    January 2017    xmlsimple(3am)