grammar::me::tcl(n)       Grammar operations and usage       grammar::me::tcl(n)

______________________________________________________________________________

NAME
       grammar::me::tcl - Virtual machine implementation I for parsing token
       streams

SYNOPSIS
       package require Tcl 8.4

       package require grammar::me::tcl ?0.1?

       ::grammar::me::tcl cmd ...

       ::grammar::me::tcl init nextcmd ?tokmap?

       ::grammar::me::tcl lc location

       ::grammar::me::tcl tok from ?to?

       ::grammar::me::tcl tokens

       ::grammar::me::tcl sv

       ::grammar::me::tcl ast

       ::grammar::me::tcl astall

       ::grammar::me::tcl ctok

       ::grammar::me::tcl nc

       ::grammar::me::tcl next

       ::grammar::me::tcl ord

       ::grammar::me::tcl::ict_advance message

       ::grammar::me::tcl::ict_match_token tok message

       ::grammar::me::tcl::ict_match_tokrange tokbegin tokend message

       ::grammar::me::tcl::ict_match_tokclass code message

       ::grammar::me::tcl::inc_restore nt

       ::grammar::me::tcl::inc_save nt startlocation

       ::grammar::me::tcl::iok_ok

       ::grammar::me::tcl::iok_fail

       ::grammar::me::tcl::iok_negate

       ::grammar::me::tcl::icl_get

       ::grammar::me::tcl::icl_rewind oldlocation

       ::grammar::me::tcl::ier_get

       ::grammar::me::tcl::ier_clear

       ::grammar::me::tcl::ier_nonterminal message location

       ::grammar::me::tcl::ier_merge olderror

       ::grammar::me::tcl::isv_clear

       ::grammar::me::tcl::isv_terminal

       ::grammar::me::tcl::isv_nonterminal_leaf nt startlocation

       ::grammar::me::tcl::isv_nonterminal_range nt startlocation

       ::grammar::me::tcl::isv_nonterminal_reduce nt startlocation ?marker?

       ::grammar::me::tcl::ias_push

       ::grammar::me::tcl::ias_mark

       ::grammar::me::tcl::ias_pop2mark marker

______________________________________________________________________________

DESCRIPTION
       This package provides an implementation of the ME virtual machine.
       Please read the document grammar::me_intro first if you do not know
       what an ME virtual machine is.

       This implementation is tied very strongly to Tcl. All the stacks in
       the machine state are handled through the Tcl stack, all control flow
       is handled by Tcl commands, and the remaining machine instructions are
       mapped directly to Tcl commands. In particular, the matching of
       nonterminal symbols is handled by Tcl procedures as well, essentially
       extending the machine implementation with custom instructions.

       Furthermore, the implementation handles only a single machine, which
       is uninterruptible during execution and hardwired for pull operation,
       i.e. it explicitly requests each new token through a callback, pulling
       them into its state.

       A related package is grammar::peg::interp, which provides a generic
       interpreter / parser for parsing expression grammars (PEGs),
       implemented on top of this implementation of the ME virtual machine.

API
       The commands documented in this section do not implement any of the
       instructions of the ME virtual machine. They provide the facilities
       for the initialization of the machine and the retrieval of important
       information.

       ::grammar::me::tcl cmd ...
              This is an ensemble command providing access to the commands
              listed in this section. See the methods themselves for detailed
              specifications.

       ::grammar::me::tcl init nextcmd ?tokmap?
              This command (re)initializes the machine. It returns the empty
              string. This command has to be invoked before any other command
              of this package.

              The command prefix nextcmd represents the input stream of
              characters and is invoked by the machine whenever a new
              character from the stream is required. The instruction for
              handling this is ict_advance. The callback has to return either
              the empty list, or a list of 4 elements containing the token,
              its lexeme attribute, and its location as line number and
              column index, in this order. The empty list is the signal that
              the end of the input stream has been reached. The lexeme
              attribute is stored in the terminal cache, but otherwise not
              used by the machine.

              The optional dictionary tokmap maps from tokens to integer
              numbers. If present, the numbers impose an order on the tokens,
              which is subsequently used by ict_match_tokrange to determine
              whether a token is in the specified range or not. If no token
              map is specified, the lexicographic order of the token names is
              used instead. This choice is especially sensible when using
              characters as tokens.

       ::grammar::me::tcl lc location
              This command converts the location of a token, given as an
              offset in the input stream, into the associated line number and
              column index. The result of the command is a 2-element list
              containing the two values, in that order. This allows higher
              levels to convert the location information found in the error
              status and the generated AST into more human readable data.

              Note that the command is not able to convert locations which
              have not been reached by the machine yet. In other words, if
              the machine has read 7 tokens the command is able to convert
              the offsets 0 to 6, but nothing beyond that. This also shows
              that it is not possible to convert offsets which refer to
              locations before the beginning of the stream.

              After a call of init the state used for the conversion is
              cleared, making further conversions impossible until the
              machine has read tokens again.

       ::grammar::me::tcl tok from ?to?
              This command returns a Tcl list containing the part of the
              input stream between the locations from and to (both
              inclusive). If to is not specified it defaults to the value of
              from.

              Each element of the returned list is a list of four elements:
              the token, its associated lexeme, line number, and column
              index, in this order. In other words, each element has the same
              structure as the result of the nextcmd callback given to
              ::grammar::me::tcl::init.

              This command places the same restrictions on its location
              arguments as ::grammar::me::tcl::lc.

       ::grammar::me::tcl tokens
              This command returns the number of tokens currently known to
              the ME virtual machine.

       ::grammar::me::tcl sv
              This command returns the current semantic value SV stored in
              the machine. This is an abstract syntax tree as specified in
              the document grammar::me_ast, section AST VALUES.

       ::grammar::me::tcl ast
              This method returns the abstract syntax tree currently at the
              top of the AST stack of the ME virtual machine. This is an
              abstract syntax tree as specified in the document
              grammar::me_ast, section AST VALUES.

       ::grammar::me::tcl astall
              This method returns the whole stack of abstract syntax trees
              currently known to the ME virtual machine. Each element of the
              returned list is an abstract syntax tree as specified in the
              document grammar::me_ast, section AST VALUES. The top of the
              stack resides at the end of the list.

       ::grammar::me::tcl ctok
              This method returns the current token considered by the ME
              virtual machine.

       ::grammar::me::tcl nc
              This method returns the contents of the nonterminal cache as a
              dictionary mapping from "symbol,location" to match information.

       ::grammar::me::tcl next
              This method returns the next token callback as specified during
              initialization of the ME virtual machine.

       ::grammar::me::tcl ord
              This method returns a dictionary containing the tokmap
              specified during initialization of the ME virtual machine.

       ::grammar::me::tcl::ok
              This variable contains the current match status OK. It is
              provided as a variable instead of a command because that makes
              access to this information faster, and the speed of access is
              considered very important here as this information is used
              constantly to determine the control flow.
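
       As an illustration of the commands in this section, the following is a
       minimal sketch of a token callback and of the initialization and
       inspection of the machine. The names nextToken, input, pos, line, col,
       and tokmap, as well as the scheme of treating every character as one
       token, are assumptions made for this example and are not part of the
       package. The machine-related part runs only if the package can be
       loaded.

```tcl
# Hypothetical input source: every character of $input is one token.
set input "ab\ncd"
set pos   0    ;# index of the next unread character
set line  1
set col   0

# Token callback for 'init' (name and state variables are assumptions).
# Returns {token lexeme line column}, or the empty list once the
# input is exhausted.
proc nextToken {} {
    global input pos line col
    if {$pos >= [string length $input]} {
        return {}
    }
    set ch     [string index $input $pos]
    set result [list $ch $ch $line $col]
    incr pos
    if {$ch eq "\n"} {
        incr line
        set col 0
    } else {
        incr col
    }
    return $result
}

# Optional token map: imposes a numeric order on the tokens.
set tokmap [list "\n" 0 a 1 b 2 c 3 d 4]

# Run the machine part only if the package is available.
if {![catch {package require grammar::me::tcl}]} {
    ::grammar::me::tcl init nextToken $tokmap

    # Pull two tokens into the machine, then inspect its state.
    ::grammar::me::tcl::ict_advance "unexpected end of input"
    ::grammar::me::tcl::ict_advance "unexpected end of input"

    puts "match status : $::grammar::me::tcl::ok"
    puts "tokens read  : [::grammar::me::tcl tokens]"
    puts "current token: [::grammar::me::tcl ctok]"
    puts "location 1   : [::grammar::me::tcl lc 1]"
    puts "tokens 0..1  : [::grammar::me::tcl tok 0 1]"
}
```

       The callback keeps its position in global variables so that it needs
       no arguments. Any other command prefix which obeys the same return
       convention should be usable in its place.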

MACHINE STATE
       Please read the document grammar::me_vm first for a specification of
       the basic ME virtual machine and its state.

       This implementation manages the state described in that document,
       with the exception of all stacks other than the AST stack. In other
       words, the location stack, error stack, return stack, and AST marker
       stack are managed implicitly through standard Tcl scoping, i.e. Tcl
       variables in procedures, outside of this implementation.
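
       The implicit stack handling described above can be pictured as
       follows. This is a minimal sketch under stated assumptions, not code
       taken from the package: the procedure names matchAorB and
       nextChoiceToken and the variable pending are hypothetical, and the
       grammar (an ordered choice between the tokens "a" and "b") was chosen
       only for illustration. The sketch is executed only if the package can
       be loaded.

```tcl
# Ordered choice between the tokens "a" and "b" (hypothetical example).
# The saved location and error state are plain Tcl local variables. They
# take the place of the location and error stacks of the specification
# and vanish when the procedure returns.
proc matchAorB {} {
    set pos [::grammar::me::tcl::icl_get]      ;# instead of icl_push

    # First alternative: the token "a".
    ::grammar::me::tcl::ict_advance "expected a"
    if {$::grammar::me::tcl::ok} {
        ::grammar::me::tcl::ict_match_token a "expected a"
    }
    if {$::grammar::me::tcl::ok} {
        return
    }

    # Second alternative: remember the error, backtrack, try "b".
    set old [::grammar::me::tcl::ier_get]      ;# instead of ier_push
    ::grammar::me::tcl::icl_rewind $pos
    ::grammar::me::tcl::ict_advance "expected b"
    if {$::grammar::me::tcl::ok} {
        ::grammar::me::tcl::ict_match_token b "expected b"
    }
    ::grammar::me::tcl::ier_merge $old
    if {!$::grammar::me::tcl::ok} {
        ::grammar::me::tcl::icl_rewind $pos
    }
    return
}

# Run the sketch only if the package is available.
if {![catch {package require grammar::me::tcl}]} {
    set pending [list {b b 1 0}]   ;# a single token "b"

    # Hypothetical callback: hands out the pending tokens one by one.
    proc nextChoiceToken {} {
        global pending
        set tok     [lindex $pending 0]
        set pending [lrange $pending 1 end]
        return $tok                ;# empty list at end of input
    }

    ::grammar::me::tcl init nextChoiceToken
    matchAorB
    puts "match status: $::grammar::me::tcl::ok"
}
```

       Because pos and old are local variables, nested calls of such
       procedures each hold their own saved values, which is what the
       explicit stacks of the specification would otherwise provide.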

MACHINE INSTRUCTIONS
       Please read the document grammar::me_vm first for a specification of
       the basic ME virtual machine and its instruction set.

       This implementation maps all instructions to Tcl commands in the
       namespace "::grammar::me::tcl", except for the stack related commands,
       nonterminal symbols, and control flow. Here we simply list the
       commands and explain the differences to the specified instructions, if
       there are any. For their semantics see the aforementioned
       specification. The machine commands are not reachable through the
       ensemble command ::grammar::me::tcl.

       ::grammar::me::tcl::ict_advance message
              No changes.

       ::grammar::me::tcl::ict_match_token tok message
              No changes.

       ::grammar::me::tcl::ict_match_tokrange tokbegin tokend message
              If, and only if, a token map was specified during
              initialization, the arguments are the numeric representations
              of the smallest and largest tokens in the range. Otherwise they
              are the relevant tokens themselves and lexicographic comparison
              is used.

       ::grammar::me::tcl::ict_match_tokclass code message
              No changes.

       ::grammar::me::tcl::inc_restore nt
              Instead of taking a branchlabel the command returns a boolean
              value. The result is true if and only if cached information
              was found. The caller has to perform the appropriate branching.

       ::grammar::me::tcl::inc_save nt startlocation
              The command takes the start location as an additional argument,
              as it is managed on the Tcl stack, and not in the machine
              state.

       icf_ntcall branchlabel

       icf_ntreturn
              These two instructions are not mapped to commands. They are
              control flow instructions and handled in Tcl.

       ::grammar::me::tcl::iok_ok
              No changes.

       ::grammar::me::tcl::iok_fail
              No changes.

       ::grammar::me::tcl::iok_negate
              No changes.

       icf_jalways branchlabel

       icf_jok branchlabel

       icf_jfail branchlabel

       icf_halt
              These four instructions are not mapped to commands. They are
              control flow instructions and handled in Tcl.

       ::grammar::me::tcl::icl_get
              This command returns the current location CL in the input. It
              replaces icl_push.

       ::grammar::me::tcl::icl_rewind oldlocation
              The command takes the location as an argument, as it comes from
              the Tcl stack, not the machine state.

       icl_pop
              Not mapped, the stacks are not managed by the package.

       ::grammar::me::tcl::ier_get
              This command returns the current error state ER. It replaces
              ier_push.

       ::grammar::me::tcl::ier_clear
              No changes.

       ::grammar::me::tcl::ier_nonterminal message location
              The command takes the location as an argument, as it comes from
              the Tcl stack, not the machine state.

       ::grammar::me::tcl::ier_merge olderror
              The command takes the second error state to merge as an
              argument, as it comes from the Tcl stack, not the machine
              state.

       ::grammar::me::tcl::isv_clear
              No changes.

       ::grammar::me::tcl::isv_terminal
              No changes.

       ::grammar::me::tcl::isv_nonterminal_leaf nt startlocation
              The command takes the start location as an argument, as it
              comes from the Tcl stack, not the machine state.

       ::grammar::me::tcl::isv_nonterminal_range nt startlocation
              The command takes the start location as an argument, as it
              comes from the Tcl stack, not the machine state.

       ::grammar::me::tcl::isv_nonterminal_reduce nt startlocation ?marker?
              The command takes the start location and the marker as
              arguments, as they come from the Tcl stack, not the machine
              state.

       ::grammar::me::tcl::ias_push
              No changes.

       ::grammar::me::tcl::ias_mark
              This command returns a marker for the current state of the AST
              stack AS. The marker stack is not managed by the machine.

       ::grammar::me::tcl::ias_pop2mark marker
              The command takes the marker as an argument, as it comes from
              the Tcl stack, not the machine state. It replaces ias_mpop.
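
       To show how the commands listed above and ordinary Tcl control flow
       might fit together, here is a minimal sketch of a procedure matching a
       nonterminal symbol. It is written under stated assumptions and is not
       the output of any generator: the nonterminal Digit, the procedure
       names matchDigit and nextDigitToken, the variable pending, and the
       error messages are all hypothetical. No token map is given, so the
       range check relies on lexicographic comparison. The sketch is executed
       only if the package can be loaded.

```tcl
# Hypothetical nonterminal "Digit": one token in the range 0..9.
proc matchDigit {} {
    # inc_restore reports whether cached results were found. The branch
    # the specification encodes in the instruction is an ordinary 'if'.
    if {[::grammar::me::tcl::inc_restore Digit]} {
        return
    }

    # The start location lives in a Tcl local variable.
    set pos [::grammar::me::tcl::icl_get]

    ::grammar::me::tcl::ict_advance "expected digit"
    if {$::grammar::me::tcl::ok} {
        ::grammar::me::tcl::ict_match_tokrange 0 9 "expected digit"
    }

    # The start location is passed explicitly to each command.
    ::grammar::me::tcl::isv_nonterminal_leaf Digit $pos
    ::grammar::me::tcl::inc_save             Digit $pos
    ::grammar::me::tcl::ier_nonterminal "expected Digit" $pos
    return   ;# takes the place of icf_ntreturn
}

# Run the sketch only if the package is available.
if {![catch {package require grammar::me::tcl}]} {
    set pending [list {7 7 1 0}]   ;# a single token "7"

    # Hypothetical callback: hands out the pending tokens one by one.
    proc nextDigitToken {} {
        global pending
        set tok     [lindex $pending 0]
        set pending [lrange $pending 1 end]
        return $tok                ;# empty list at end of input
    }

    ::grammar::me::tcl init nextDigitToken
    matchDigit                     ;# takes the place of icf_ntcall
    puts "match status  : $::grammar::me::tcl::ok"
    puts "semantic value: [::grammar::me::tcl sv]"
}
```

       Calling the procedure stands in for icf_ntcall and returning from it
       for icf_ntreturn, so the return stack of the specification is simply
       the Tcl call stack.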

BUGS, IDEAS, FEEDBACK
       This document, and the package it describes, will undoubtedly contain
       bugs and other problems. Please report such in the category grammar_me
       of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please
       also report any ideas for enhancements you may have for either package
       and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the
       output of diff -u.

       Note further that attachments are strongly preferred over inlined
       patches. Attachments can be made by going to the Edit form of the
       ticket immediately after its creation, and then using the left-most
       button in the secondary navigation bar.

KEYWORDS
       grammar, parsing, virtual machine

CATEGORY
       Grammars and finite automata

COPYRIGHT
       Copyright (c) 2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>

tcllib                                0.1                  grammar::me::tcl(n)