1grammar::me::cpu::core(n)Grammar operations and usagegrammar::me::cpu::core(n)
2
3
4
5______________________________________________________________________________
6
8 grammar::me::cpu::core - ME virtual machine state manipulation
9
11 package require Tcl 8.4
12
13 package require grammar::me::cpu::core ?0.2?
14
15 ::grammar::me::cpu::core disasm asm
16
17 ::grammar::me::cpu::core asm asm
18
19 ::grammar::me::cpu::core new asm
20
21 ::grammar::me::cpu::core lc state location
22
23 ::grammar::me::cpu::core tok state ?from ?to??
24
25 ::grammar::me::cpu::core pc state
26
27 ::grammar::me::cpu::core iseof state
28
29 ::grammar::me::cpu::core at state
30
31 ::grammar::me::cpu::core cc state
32
33 ::grammar::me::cpu::core sv state
34
35 ::grammar::me::cpu::core ok state
36
37 ::grammar::me::cpu::core error state
38
39 ::grammar::me::cpu::core lstk state
40
41 ::grammar::me::cpu::core astk state
42
43 ::grammar::me::cpu::core mstk state
44
45 ::grammar::me::cpu::core estk state
46
47 ::grammar::me::cpu::core rstk state
48
49 ::grammar::me::cpu::core nc state
50
51 ::grammar::me::cpu::core ast state
52
53 ::grammar::me::cpu::core halted state
54
55 ::grammar::me::cpu::core code state
56
57 ::grammar::me::cpu::core eof statevar
58
59 ::grammar::me::cpu::core put statevar tok lex line col
60
61 ::grammar::me::cpu::core run statevar ?n?
62
63_________________________________________________________________
64
66 This package provides an implementation of the ME virtual machine.
67 Please go and read the document grammar::me_intro first if you do not
68 know what a ME virtual machine is.
69
70 This implementation represents each ME virtual machine as a Tcl value
71 and provides commands to manipulate and query such values to show the
72 effects of executing instructions, adding tokens, retrieving state,
73 etc.
74
75 The values fully follow the paradigm of Tcl that every value is a
76 string and while also allowing C implementations for a proper Tcl_Obj‐
77 Type to keep all the important data in native data structures. Because
78 of the latter it is recommended to access the state values only through
79 the commands of this package to ensure that internal representation is
80 not shimmered away.
81
82 The actual structure used by all state values is described in section
83 CPU STATE.
84
86 The package directly provides only a single command, and all the func‐
87 tionality is made available through its methods.
88
89 ::grammar::me::cpu::core disasm asm
90 This method returns a list containing a disassembly of the match
91 instructions in asm. The format of asm is specified in the sec‐
92 tion MATCH PROGRAM REPRESENTATION.
93
94 Each element of the result contains instruction label, instruc‐
95 tion name, and the instruction arguments, in this order. The
96 label can be the empty string. Jump destinations are shown as
97 labels, strings and tokens unencoded. Token names are prefixed
98 with their numeric id, if, and only if a tokmap is defined. The
99 two components are separated by a colon.
100
101 ::grammar::me::cpu::core asm asm
102 This method returns code in the format as specified in section
103 MATCH PROGRAM REPRESENTATION generated from ME assembly code
104 asm, which is in the format as returned by the method disasm.
105
106 ::grammar::me::cpu::core new asm
107 This method creates state value for a ME virtual machine in its
108 initial state and returns it as its result.
109
110 The argument matchcode contains a Tcl representation of the
111 match instructions the machine has to execute while parsing the
112 input stream. Its format is specified in the section MATCH PRO‐
113 GRAM REPRESENTATION.
114
115 The tokmap argument taken by the implementation provided by the
116 package grammar::me::tcl is here hidden inside of the match
117 instructions and therefore not needed.
118
119 ::grammar::me::cpu::core lc state location
120 This method takes the state value of a ME virtual machine and
121 uses it to convert a location in the input stream (as offset)
122 into a line number and column index. The result of the method is
123 a 2-element list containing the two pieces in the order men‐
124 tioned in the previous sentence.
125
126 Note that the method cannot convert locations which the machine
127 has not yet read from the input stream. In other words, if the
128 machine has read 7 characters so far it is possible to convert
129 the offsets 0 to 6, but nothing beyond that. This also shows
130 that it is not possible to convert offsets which refer to loca‐
131 tions before the beginning of the stream.
132
133 This utility allows higher levels to convert the location off‐
134 sets found in the error status and the AST into more human read‐
135 able data.
136
137 ::grammar::me::cpu::core tok state ?from ?to??
138 This method takes the state value of a ME virtual machine and
139 returns a Tcl list containing the part of the input stream
140 between the locations from and to (both inclusive). If to is not
141 specified it will default to the value of from. If from is not
142 specified either the whole input stream is returned.
143
144 This method places the same restrictions on its location argu‐
145 ments as the method lc.
146
147 ::grammar::me::cpu::core pc state
148 This method takes the state value of a ME virtual machine and
149 returns the current value of the stored program counter.
150
151 ::grammar::me::cpu::core iseof state
152 This method takes the state value of a ME virtual machine and
153 returns the current value of the stored eof flag.
154
155 ::grammar::me::cpu::core at state
156 This method takes the state value of a ME virtual machine and
157 returns the current location in the input stream.
158
159 ::grammar::me::cpu::core cc state
160 This method takes the state value of a ME virtual machine and
161 returns the current token.
162
163 ::grammar::me::cpu::core sv state
164 This method takes the state value of a ME virtual machine and
165 returns the current semantic value stored in it. This is an
166 abstract syntax tree as specified in the document gram‐
167 mar::me_ast, section AST VALUES.
168
169 ::grammar::me::cpu::core ok state
170 This method takes the state value of a ME virtual machine and
171 returns the match status stored in it.
172
173 ::grammar::me::cpu::core error state
174 This method takes the state value of a ME virtual machine and
175 returns the current error status stored in it.
176
177 ::grammar::me::cpu::core lstk state
178 This method takes the state value of a ME virtual machine and
179 returns the location stack.
180
181 ::grammar::me::cpu::core astk state
182 This method takes the state value of a ME virtual machine and
183 returns the AST stack.
184
185 ::grammar::me::cpu::core mstk state
186 This method takes the state value of a ME virtual machine and
187 returns the AST marker stack.
188
189 ::grammar::me::cpu::core estk state
190 This method takes the state value of a ME virtual machine and
191 returns the error stack.
192
193 ::grammar::me::cpu::core rstk state
194 This method takes the state value of a ME virtual machine and
195 returns the subroutine return stack.
196
197 ::grammar::me::cpu::core nc state
198 This method takes the state value of a ME virtual machine and
199 returns the nonterminal match cache as a dictionary.
200
201 ::grammar::me::cpu::core ast state
202 This method takes the state value of a ME virtual machine and
203 returns the abstract syntax tree currently at the top of the AST
204 stack stored in it. This is an abstract syntax tree as speci‐
205 fied in the document grammar::me_ast, section AST VALUES.
206
207 ::grammar::me::cpu::core halted state
208 This method takes the state value of a ME virtual machine and
209 returns the current halt status stored in it, i.e. if the
210 machine has stopped or not.
211
212 ::grammar::me::cpu::core code state
213 This method takes the state value of a ME virtual machine and
214 returns the code stored in it, i.e. the instructions executed by
215 the machine.
216
217 ::grammar::me::cpu::core eof statevar
218 This method takes the state value of a ME virtual machine as
219 stored in the variable named by statevar and modifies it so that
220 the eof flag inside is set. This signals to the machine that
221 whatever token are in the input queue are the last to be pro‐
222 cessed. There will be no more.
223
224 ::grammar::me::cpu::core put statevar tok lex line col
225 This method takes the state value of a ME virtual machine as
226 stored in the variable named by statevar and modifies it so that
227 the token tok is added to the end of the input queue, with asso‐
228 ciated lexeme data lex and line/column information.
229
230 The operation will fail with an error if the eof flag of the
231 machine has been set through the method eof.
232
233 ::grammar::me::cpu::core run statevar ?n?
234 This method takes the state value of a ME virtual machine as
235 stored in the variable named by statevar, executes a number of
236 instructions and stores the state resulting from their modifica‐
237 tions back into the variable.
238
239 The execution loop will run until either
240
241 · n instructions have been executed, or
242
243 · a halt instruction was executed, or
244
245 · the input queue is empty and the code is asking for more
246 tokens to process.
247
248 If no limit n was set only the last two conditions are checked for.
249
250 MATCH PROGRAM REPRESENTATION
251 A match program is represented by nested Tcl list. The first element,
252 asm, is a list of integer numbers, the instructions to execute, and
253 their arguments. The second element, pool, is a list of strings, refer‐
254 enced by the instructions, for error messages, token names, etc. The
255 third element, tokmap, provides ordering information for the tokens,
256 mapping their names to their numerical rank. This element can be empty,
257 forcing lexicographic comparison when matching ranges.
258
259 All ME instructions are encoded as integer numbers, with the mapping
260 given below. A number of the instructions, those which handle error
261 messages, have been given an additional argument to supply that message
262 explicitly instead of having it constructed from token names, etc. This
263 allows the machine state to store only the message ids instead of the
264 full strings.
265
266 Jump destination arguments are absolute indices into the asm element,
267 refering to the instruction to jump to. Any string arguments are abso‐
268 lute indices into the pool element. Tokens, characters, messages, and
269 token (actually character) classes to match are coded as references
270 into the pool as well.
271
272 [1] "ict_advance message"
273
274 [2] "ict_match_token tok message"
275
276 [3] "ict_match_tokrange tokbegin tokend message"
277
278 [4] "ict_match_tokclass code message"
279
280 [5] "inc_restore branchlabel nt"
281
282 [6] "inc_save nt"
283
284 [7] "icf_ntcall branchlabel"
285
286 [8] "icf_ntreturn"
287
288 [9] "iok_ok"
289
290 [10] "iok_fail"
291
292 [11] "iok_negate"
293
294 [12] "icf_jalways branchlabel"
295
296 [13] "icf_jok branchlabel"
297
298 [14] "icf_jfail branchlabel"
299
300 [15] "icf_halt"
301
302 [16] "icl_push"
303
304 [17] "icl_rewind"
305
306 [18] "icl_pop"
307
308 [19] "ier_push"
309
310 [20] "ier_clear"
311
312 [21] "ier_nonterminal message"
313
314 [22] "ier_merge"
315
316 [23] "isv_clear"
317
318 [24] "isv_terminal"
319
320 [25] "isv_nonterminal_leaf nt"
321
322 [26] "isv_nonterminal_range nt"
323
324 [27] "isv_nonterminal_reduce nt"
325
326 [28] "ias_push"
327
328 [29] "ias_mark"
329
330 [30] "ias_mrewind"
331
332 [31] "ias_mpop"
333
335 A state value is a list containing the following elements, in the order
336 listed below:
337
338 [1] code: Match instructions, see MATCH PROGRAM REPRESENTATION.
339
340 [2] pc: Program counter, int.
341
342 [3] halt: Halt flag, boolean.
343
344 [4] eof: Eof flag, boolean
345
346 [5] tc: Terminal cache, and input queue. Structure see below.
347
348 [6] cl: Current location, int.
349
350 [7] ct: Current token, string.
351
352 [8] ok: Match status, boolean.
353
354 [9] sv: Semantic value, list.
355
356 [10] er: Error status, list.
357
358 [11] ls: Location stack, list.
359
360 [12] as: AST stack, list.
361
362 [13] ms: AST marker stack, list.
363
364 [14] es: Error stack, list.
365
366 [15] rs: Return stack, list.
367
368 [16] nc: Nonterminal cache, dictionary.
369
370 tc, the input queue of tokens waiting for processing and the terminal
371 cache containing the tokens already processing are one unified data
372 structure simply holding all tokens and their information, with the
373 current location separating that which has been processed from that
374 which is waiting. Each element of the queue/cache is a list containing
375 the token, its lexeme information, line number, and column index, in
376 this order.
377
378 All stacks have their top element aat the end, i.e. pushing an item is
379 equivalent to appending to the list representing the stack, and popping
380 it removes the last element.
381
382 er, the error status is either empty or a list of two elements, a loca‐
383 tion in the input, and a list of messages, encoded as references into
384 the pool element of the code.
385
386 nc, the nonterminal cache is keyed by nonterminal name and location,
387 each value a four-element list containing current location, match sta‐
388 tus, semantic value, and error status, in this order.
389
391 grammar, parsing, virtual machine
392
394 Copyright (c) 2005-2006 Andreas Kupries <andreas_kupries@users.sourceforge.net>
395
396
397
398
399grammar_me 0.2 grammar::me::cpu::core(n)