doctools::idx::parse(n)       Documentation tools      doctools::idx::parse(n)

______________________________________________________________________________

NAME
       doctools::idx::parse - Parsing text in docidx format

SYNOPSIS
       package require doctools::idx::parse ?0.1?

       package require Tcl 8.4

       package require doctools::idx::structure

       package require doctools::msgcat

       package require doctools::tcl::parse

       package require fileutil

       package require logger

       package require snit

       package require struct::list

       package require struct::stack

       ::doctools::idx::parse text text

       ::doctools::idx::parse file path

       ::doctools::idx::parse includes

       ::doctools::idx::parse include add path

       ::doctools::idx::parse include remove path

       ::doctools::idx::parse include clear

       ::doctools::idx::parse vars

       ::doctools::idx::parse var set name value

       ::doctools::idx::parse var unset name

       ::doctools::idx::parse var clear ?pattern?

______________________________________________________________________________

DESCRIPTION
       This package provides commands to parse text written in the docidx
       markup language and convert it into the canonical serialization of the
       keyword index encoded in the text. See the section Keyword index
       serialization format for the specification of that format.

       This is an internal package of doctools, for use by the higher level
       packages handling docidx documents.

API
       ::doctools::idx::parse text text
              The command takes the string contained in text and parses it
              under the assumption that it contains a document written using
              the docidx markup language. An error is thrown if this
              assumption is found to be false. The format of these errors is
              described in section Parse errors.

              When successful the command returns the canonical serialization
              of the keyword index which was encoded in the text. See the
              section Keyword index serialization format for the
              specification of that format.

       ::doctools::idx::parse file path
              The same as text, except that the text to parse is read from
              the file specified by path.

       ::doctools::idx::parse includes
              This method returns the current list of search paths used when
              looking for include files.

       ::doctools::idx::parse include add path
              This method adds the path to the list of paths searched when
              looking for an include file. The call is ignored if the path is
              already in the list of paths. The method returns the empty
              string as its result.

       ::doctools::idx::parse include remove path
              This method removes the path from the list of paths searched
              when looking for an include file. The call is ignored if the
              path is not contained in the list of paths. The method returns
              the empty string as its result.

       ::doctools::idx::parse include clear
              This method clears the list of search paths for include files.

       ::doctools::idx::parse vars
              This method returns a dictionary containing the current set of
              predefined variables known to the vset markup command during
              processing.

       ::doctools::idx::parse var set name value
              This method adds the variable name to the set of predefined
              variables known to the vset markup command during processing,
              and gives it the specified value. The method returns the empty
              string as its result.

       ::doctools::idx::parse var unset name
              This method removes the variable name from the set of
              predefined variables known to the vset markup command during
              processing. The method returns the empty string as its result.

       ::doctools::idx::parse var clear ?pattern?
              This method removes all variables matching the pattern from the
              set of predefined variables known to the vset markup command
              during processing. The method returns the empty string as its
              result.

              The pattern matching is done with string match. The default
              pattern, used when none is specified, is *.
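
       The commands above can be exercised with a short script. This is a
       minimal sketch, assuming Tcllib is installed and Tcl 8.5 or later is
       in use (for dict). The index text, the variable name VERSION, and the
       use of the current directory as include path are made-up illustration
       values, not part of the package.

```tcl
package require doctools::idx::parse

# Hypothetical docidx input: one keyword with two references.
set text [string trim {
[index_begin tools {Example Index}]
[key markup]
[manpage doctools_intro {Doctools Introduction}]
[url http://wiki.tcl.tk {Tcl Wiki}]
[index_end]
}]

# Parse the text; the result is the canonical serialization, a nested dict.
set serial [::doctools::idx::parse text $text]
set index  [dict get $serial doctools::idx]
puts "title:    [dict get $index title]"
puts "keywords: [dict keys [dict get $index keywords]]"

# Predefine a variable for the vset markup command, inspect it, remove it.
::doctools::idx::parse var set VERSION 1.0
set vars [::doctools::idx::parse vars]
puts "vars:     $vars"
::doctools::idx::parse var clear

# Manage the search paths used for include files.
::doctools::idx::parse include add [pwd]
set paths [::doctools::idx::parse includes]
puts "includes: $paths"
::doctools::idx::parse include clear
```

       The variables and include paths are cleared at the end so that the
       sketch leaves the parser configuration as it found it.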

PARSE ERRORS
       The format of the parse error messages thrown when encountering
       violations of the docidx markup syntax is human readable and not
       intended for processing by machines. As such it is not documented.

       However, the errorCode attached to the message is machine-readable and
       has the following format:

       [1]    The error code will be a list, each element describing a single
              error found in the input. The list has at least one element,
              possibly more.

       [2]    Each error element will be a list containing six strings
              describing an error in detail. The strings will be

              [1]    The path of the file the error occurred in. This may be
                     empty.

              [2]    The range of the token the error was found at. This
                     range is a two-element list containing the offset of
                     the first and last character in the range, counted from
                     the beginning of the input (file). Offsets are counted
                     from zero.

              [3]    The line the first character after the error is on.
                     Lines are counted from one.

              [4]    The column the first character after the error is at.
                     Columns are counted from zero.

              [5]    The message code of the error. This value can be used
                     as argument to msgcat::mc to obtain a localized error
                     message, assuming that the application had a suitable
                     call of doctools::msgcat::init to initialize the
                     necessary message catalogs (see package
                     doctools::msgcat).

              [6]    A list of details for the error, like the markup
                     command involved. In the case of message code
                     docidx/include/syntax this value is the set of errors
                     found in the included file, using the format described
                     here.
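
       The structure above can be walked with a small recursive procedure.
       The following is a sketch only, assuming Tcl 8.5 or later (for
       lassign). The procedure name describeParseErrors, the file names, and
       the message code docidx/example are hypothetical and used purely for
       illustration; only docidx/include/syntax is taken from the
       description above.

```tcl
# Render an error list in the format described above as indented lines.
proc describeParseErrors {errors {indent ""}} {
    set lines {}
    foreach err $errors {
        # Each element holds six strings, in this order.
        lassign $err path range line column code details
        lassign $range first last
        if {$path eq ""} { set path <input> }
        lappend lines "${indent}${path}:${line}:${column}: $code (chars ${first}-${last})"
        if {$code eq "docidx/include/syntax"} {
            # The details are the errors of the included file, same format.
            foreach sub [describeParseErrors $details "${indent}  "] {
                lappend lines $sub
            }
        } elseif {[llength $details]} {
            lappend lines "${indent}  details: $details"
        }
    }
    return $lines
}

# Hand-built sample value; docidx/example is a hypothetical message code.
set sample [list \
    [list main.idx {10 25} 2 0 docidx/include/syntax [list \
        [list part.inc {0 4} 1 5 docidx/example {key}]]]]

set report [describeParseErrors $sample]
puts [join $report \n]
```

       In an application the list would presumably come from the global
       errorCode variable after a failed call, for example inside the error
       branch of a catch around ::doctools::idx::parse text.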

[DOCIDX] NOTATION OF KEYWORD INDICES
       The docidx format for keyword indices, also called the docidx markup
       language, is too large to be covered in a single section. The
       interested reader should start with the document

       [1]    docidx language introduction

       and then proceed from there to the formal specifications, i.e. the
       documents

       [1]    docidx language syntax and

       [2]    docidx language command reference.

       to get a thorough understanding of the language.

KEYWORD INDEX SERIALIZATION FORMAT
       Here we specify the format used by the doctools v2 packages to
       serialize keyword indices as immutable values for transport,
       comparison, etc.

       We distinguish between regular and canonical serializations. While a
       keyword index may have more than one regular serialization, exactly
       one of them will be canonical.

       regular serialization

              [1]    An index serialization is a nested Tcl dictionary.

              [2]    This dictionary holds a single key, doctools::idx, and
                     its value. This value holds the contents of the index.

              [3]    The contents of the index are a Tcl dictionary holding
                     the title of the index, a label, and the keywords and
                     references. The relevant keys and their values are

                     title  The value is a string containing the title of
                            the index.

                     label  The value is a string containing a label for the
                            index.

                     keywords
                            The value is a Tcl dictionary, using the keywords
                            known to the index as keys. The associated values
                            are lists containing the identifiers of the
                            references associated with that particular
                            keyword.

                            Any reference identifier used in these lists has
                            to exist as a key in the references dictionary,
                            see the next item for its definition.

                     references
                            The value is a Tcl dictionary, using the
                            identifiers for the references known to the index
                            as keys. The associated values are 2-element
                            lists containing the type and label of the
                            reference, in this order.

                            Any key here has to be associated with at least
                            one keyword, i.e. occur in at least one of the
                            reference lists which are the values in the
                            keywords dictionary, see the previous item for
                            its definition.

              [4]    The type of a reference can be one of two values,

                     manpage
                            The identifier of the reference is interpreted as
                            a symbolic file name, referring to one of the
                            documents the index was made for.

                     url    The identifier of the reference is interpreted as
                            a URL, referring to some external location, like
                            a website, etc.

       canonical serialization
              The canonical serialization of a keyword index has the format
              as specified in the previous item, and then additionally
              satisfies the constraints below, which make it unique among all
              the possible serializations of the keyword index.

              [1]    The keys found in all the nested Tcl dictionaries are
                     sorted in ascending dictionary order, as generated by
                     Tcl's builtin command lsort -increasing -dict.

              [2]    The references listed for each keyword of the index, if
                     any, are listed in ascending dictionary order of their
                     labels, as generated by Tcl's builtin command
                     lsort -increasing -dict.
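
       To illustrate the two constraints, the following sketch turns a
       regular serialization into a canonical one using plain Tcl 8.5
       commands. It is not the implementation used by the doctools packages.
       The procedure names sortDict and canonicalIndex and the sample index
       are hypothetical.

```tcl
# Rebuild a dictionary with its keys in ascending dictionary order.
proc sortDict {d} {
    set out {}
    foreach k [lsort -increasing -dict [dict keys $d]] {
        lappend out $k [dict get $d $k]
    }
    return $out
}

# Convert a regular index serialization into its canonical form.
proc canonicalIndex {serial} {
    set index [dict get $serial doctools::idx]
    set refs  [dict get $index references]

    # Constraint [2]: order each keyword's references by their labels.
    set keywords {}
    dict for {kw ids} [dict get $index keywords] {
        set pairs {}
        foreach id $ids {
            # A reference value is the 2-element list {type label}.
            lappend pairs [list [lindex [dict get $refs $id] 1] $id]
        }
        set sorted {}
        foreach pair [lsort -increasing -dict -index 0 $pairs] {
            lappend sorted [lindex $pair 1]
        }
        dict set keywords $kw $sorted
    }

    # Constraint [1]: sort the keys of every nested dictionary.
    dict set index keywords   [sortDict $keywords]
    dict set index references [sortDict $refs]
    return [list doctools::idx [sortDict $index]]
}

# A regular, non-canonical serialization of a small made-up index.
set regular {doctools::idx {
    title {Example Index}
    label tools
    references {
        http://wiki.tcl.tk {url {Tcl Wiki}}
        doctools_intro     {manpage {Doctools Introduction}}
    }
    keywords {
        parser {http://wiki.tcl.tk doctools_intro}
        markup {doctools_intro}
    }
}}

set canonical [canonicalIndex $regular]
puts $canonical
```

       In the result the top-level keys appear in the order keywords, label,
       references, title, and the references of the keyword parser are
       reordered so that the one labeled Doctools Introduction precedes the
       one labeled Tcl Wiki.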

BUGS, IDEAS, FEEDBACK
       This document, and the package it describes, will undoubtedly contain
       bugs and other problems. Please report such in the category doctools
       of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please
       also report any ideas for enhancements you may have for either package
       and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the
       output of diff -u.

       Note further that attachments are strongly preferred over inlined
       patches. Attachments can be made by going to the Edit form of the
       ticket immediately after its creation, and then using the left-most
       button in the secondary navigation bar.

KEYWORDS
       docidx, doctools, lexer, parser

CATEGORY
       Documentation tools

COPYRIGHT
       Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>



tcllib                                 1               doctools::idx::parse(n)