1grammar::me::cpu::core(n)Grammar operations and usagegrammar::me::cpu::core(n)
2
3
4
5______________________________________________________________________________
6

NAME

8       grammar::me::cpu::core - ME virtual machine state manipulation
9

SYNOPSIS

11       package require Tcl  8.4
12
13       package require grammar::me::cpu::core  ?0.2?
14
15       ::grammar::me::cpu::core disasm asm
16
17       ::grammar::me::cpu::core asm asm
18
19       ::grammar::me::cpu::core new asm
20
21       ::grammar::me::cpu::core lc state location
22
23       ::grammar::me::cpu::core tok state ?from ?to??
24
25       ::grammar::me::cpu::core pc state
26
27       ::grammar::me::cpu::core iseof state
28
29       ::grammar::me::cpu::core at state
30
31       ::grammar::me::cpu::core cc state
32
33       ::grammar::me::cpu::core sv state
34
35       ::grammar::me::cpu::core ok state
36
37       ::grammar::me::cpu::core error state
38
39       ::grammar::me::cpu::core lstk state
40
41       ::grammar::me::cpu::core astk state
42
43       ::grammar::me::cpu::core mstk state
44
45       ::grammar::me::cpu::core estk state
46
47       ::grammar::me::cpu::core rstk state
48
49       ::grammar::me::cpu::core nc state
50
51       ::grammar::me::cpu::core ast state
52
53       ::grammar::me::cpu::core halted state
54
55       ::grammar::me::cpu::core code state
56
57       ::grammar::me::cpu::core eof statevar
58
59       ::grammar::me::cpu::core put statevar tok lex line col
60
61       ::grammar::me::cpu::core run statevar ?n?
62
63______________________________________________________________________________
64

DESCRIPTION

66       This  package  provides  an  implementation  of the ME virtual machine.
67       Please go and read the document grammar::me_intro first if you  do  not
68       know what a ME virtual machine is.
69
70       This  implementation  represents each ME virtual machine as a Tcl value
71       and provides commands to manipulate and query such values to  show  the
72       effects  of  executing  instructions,  adding tokens, retrieving state,
73       etc.
74
75       The values fully follow the paradigm of  Tcl  that  every  value  is  a
76       string  and while also allowing C implementations for a proper Tcl_Obj‐
77       Type to keep all the important data in native data structures.  Because
78       of the latter it is recommended to access the state values only through
79       the commands of this package to ensure that internal representation  is
80       not shimmered away.
81
82       The  actual  structure used by all state values is described in section
83       CPU STATE.
84

API

86       The package directly provides only a single command, and all the  func‐
87       tionality is made available through its methods.
88
89       ::grammar::me::cpu::core disasm asm
90              This method returns a list containing a disassembly of the match
91              instructions in asm. The format of asm is specified in the  sec‐
92              tion MATCH PROGRAM REPRESENTATION.
93
94              Each  element of the result contains instruction label, instruc‐
95              tion name, and the instruction arguments,  in  this  order.  The
96              label  can  be  the empty string. Jump destinations are shown as
97              labels, strings and tokens unencoded. Token names  are  prefixed
98              with  their numeric id, if, and only if a tokmap is defined. The
99              two components are separated by a colon.
100
101       ::grammar::me::cpu::core asm asm
102              This method returns code in the format as specified  in  section
103              MATCH  PROGRAM  REPRESENTATION  generated  from ME assembly code
104              asm, which is in the format as returned by the method disasm.
105
106       ::grammar::me::cpu::core new asm
107              This method creates state value for a ME virtual machine in  its
108              initial state and returns it as its result.
109
110              The  argument  matchcode  contains  a  Tcl representation of the
111              match instructions the machine has to execute while parsing  the
112              input  stream. Its format is specified in the section MATCH PRO‐
113              GRAM REPRESENTATION.
114
115              The tokmap argument taken by the implementation provided by  the
116              package  grammar::me::tcl  is  here  hidden  inside of the match
117              instructions and therefore not needed.
118
119       ::grammar::me::cpu::core lc state location
120              This method takes the state value of a ME  virtual  machine  and
121              uses  it  to  convert a location in the input stream (as offset)
122              into a line number and column index. The result of the method is
123              a  2-element  list  containing  the two pieces in the order men‐
124              tioned in the previous sentence.
125
126              Note that the method cannot convert locations which the  machine
127              has  not  yet read from the input stream. In other words, if the
128              machine has read 7 characters so far it is possible  to  convert
129              the  offsets  0  to  6, but nothing beyond that. This also shows
130              that it is not possible to convert offsets which refer to  loca‐
131              tions before the beginning of the stream.
132
133              This  utility  allows higher levels to convert the location off‐
134              sets found in the error status and the AST into more human read‐
135              able data.
136
137       ::grammar::me::cpu::core tok state ?from ?to??
138              This  method  takes  the state value of a ME virtual machine and
139              returns a Tcl list containing  the  part  of  the  input  stream
140              between the locations from and to (both inclusive). If to is not
141              specified it will default to the value of from. If from  is  not
142              specified either the whole input stream is returned.
143
144              This  method  places the same restrictions on its location argu‐
145              ments as the method lc.
146
147       ::grammar::me::cpu::core pc state
148              This method takes the state value of a ME  virtual  machine  and
149              returns the current value of the stored program counter.
150
151       ::grammar::me::cpu::core iseof state
152              This  method  takes  the state value of a ME virtual machine and
153              returns the current value of the stored eof flag.
154
155       ::grammar::me::cpu::core at state
156              This method takes the state value of a ME  virtual  machine  and
157              returns the current location in the input stream.
158
159       ::grammar::me::cpu::core cc state
160              This  method  takes  the state value of a ME virtual machine and
161              returns the current token.
162
163       ::grammar::me::cpu::core sv state
164              This method takes the state value of a ME  virtual  machine  and
165              returns  the  current  semantic  value stored in it.  This is an
166              abstract  syntax  tree  as  specified  in  the  document   gram‐
167              mar::me_ast, section AST VALUES.
168
169       ::grammar::me::cpu::core ok state
170              This  method  takes  the state value of a ME virtual machine and
171              returns the match status stored in it.
172
173       ::grammar::me::cpu::core error state
174              This method takes the state value of a ME  virtual  machine  and
175              returns the current error status stored in it.
176
177       ::grammar::me::cpu::core lstk state
178              This  method  takes  the state value of a ME virtual machine and
179              returns the location stack.
180
181       ::grammar::me::cpu::core astk state
182              This method takes the state value of a ME  virtual  machine  and
183              returns the AST stack.
184
185       ::grammar::me::cpu::core mstk state
186              This  method  takes  the state value of a ME virtual machine and
187              returns the AST marker stack.
188
189       ::grammar::me::cpu::core estk state
190              This method takes the state value of a ME  virtual  machine  and
191              returns the error stack.
192
193       ::grammar::me::cpu::core rstk state
194              This  method  takes  the state value of a ME virtual machine and
195              returns the subroutine return stack.
196
197       ::grammar::me::cpu::core nc state
198              This method takes the state value of a ME  virtual  machine  and
199              returns the nonterminal match cache as a dictionary.
200
201       ::grammar::me::cpu::core ast state
202              This  method  takes  the state value of a ME virtual machine and
203              returns the abstract syntax tree currently at the top of the AST
204              stack  stored  in it.  This is an abstract syntax tree as speci‐
205              fied in the document grammar::me_ast, section AST VALUES.
206
207       ::grammar::me::cpu::core halted state
208              This method takes the state value of a ME  virtual  machine  and
209              returns  the  current  halt  status  stored  in  it, i.e. if the
210              machine has stopped or not.
211
212       ::grammar::me::cpu::core code state
213              This method takes the state value of a ME  virtual  machine  and
214              returns the code stored in it, i.e. the instructions executed by
215              the machine.
216
217       ::grammar::me::cpu::core eof statevar
218              This method takes the state value of a  ME  virtual  machine  as
219              stored in the variable named by statevar and modifies it so that
220              the eof flag inside is set. This signals  to  the  machine  that
221              whatever  token  are  in the input queue are the last to be pro‐
222              cessed. There will be no more.
223
224       ::grammar::me::cpu::core put statevar tok lex line col
225              This method takes the state value of a  ME  virtual  machine  as
226              stored in the variable named by statevar and modifies it so that
227              the token tok is added to the end of the input queue, with asso‐
228              ciated lexeme data lex and line/column information.
229
230              The  operation  will  fail  with an error if the eof flag of the
231              machine has been set through the method eof.
232
233       ::grammar::me::cpu::core run statevar ?n?
234              This method takes the state value of a  ME  virtual  machine  as
235              stored  in  the variable named by statevar, executes a number of
236              instructions and stores the state resulting from their modifica‐
237              tions back into the variable.
238
239              The execution loop will run until either
240
241              ·      n instructions have been executed, or
242
243              ·      a halt instruction was executed, or
244
245              ·      the  input queue is empty and the code is asking for more
246                     tokens to process.
247
248       If no limit n was set only the last two conditions are checked for.
249
250   MATCH PROGRAM REPRESENTATION
251       A match program is represented by nested Tcl list. The  first  element,
252       asm,  is  a  list  of integer numbers, the instructions to execute, and
253       their arguments. The second element, pool, is a list of strings, refer‐
254       enced  by  the  instructions, for error messages, token names, etc. The
255       third element, tokmap, provides ordering information  for  the  tokens,
256       mapping their names to their numerical rank. This element can be empty,
257       forcing lexicographic comparison when matching ranges.
258
259       All ME instructions are encoded as integer numbers,  with  the  mapping
260       given  below.  A  number  of the instructions, those which handle error
261       messages, have been given an additional argument to supply that message
262       explicitly instead of having it constructed from token names, etc. This
263       allows the machine state to store only the message ids instead  of  the
264       full strings.
265
266       Jump  destination  arguments are absolute indices into the asm element,
267       refering to the instruction to jump to. Any string arguments are  abso‐
268       lute  indices  into the pool element. Tokens, characters, messages, and
269       token (actually character) classes to match  are  coded  as  references
270       into the pool as well.
271
272       [1]    "ict_advance message"
273
274       [2]    "ict_match_token tok message"
275
276       [3]    "ict_match_tokrange tokbegin tokend message"
277
278       [4]    "ict_match_tokclass code message"
279
280       [5]    "inc_restore branchlabel nt"
281
282       [6]    "inc_save nt"
283
284       [7]    "icf_ntcall branchlabel"
285
286       [8]    "icf_ntreturn"
287
288       [9]    "iok_ok"
289
290       [10]   "iok_fail"
291
292       [11]   "iok_negate"
293
294       [12]   "icf_jalways branchlabel"
295
296       [13]   "icf_jok branchlabel"
297
298       [14]   "icf_jfail branchlabel"
299
300       [15]   "icf_halt"
301
302       [16]   "icl_push"
303
304       [17]   "icl_rewind"
305
306       [18]   "icl_pop"
307
308       [19]   "ier_push"
309
310       [20]   "ier_clear"
311
312       [21]   "ier_nonterminal message"
313
314       [22]   "ier_merge"
315
316       [23]   "isv_clear"
317
318       [24]   "isv_terminal"
319
320       [25]   "isv_nonterminal_leaf nt"
321
322       [26]   "isv_nonterminal_range nt"
323
324       [27]   "isv_nonterminal_reduce nt"
325
326       [28]   "ias_push"
327
328       [29]   "ias_mark"
329
330       [30]   "ias_mrewind"
331
332       [31]   "ias_mpop"
333

CPU STATE

335       A state value is a list containing the following elements, in the order
336       listed below:
337
338       [1]    code: Match instructions, see MATCH PROGRAM REPRESENTATION.
339
340       [2]    pc:   Program counter, int.
341
342       [3]    halt: Halt flag, boolean.
343
344       [4]    eof:  Eof flag, boolean
345
346       [5]    tc:   Terminal cache, and input queue. Structure see below.
347
348       [6]    cl:   Current location, int.
349
350       [7]    ct:   Current token, string.
351
352       [8]    ok:   Match status, boolean.
353
354       [9]    sv:   Semantic value, list.
355
356       [10]   er:   Error status, list.
357
358       [11]   ls:   Location stack, list.
359
360       [12]   as:   AST stack, list.
361
362       [13]   ms:   AST marker stack, list.
363
364       [14]   es:   Error stack, list.
365
366       [15]   rs:   Return stack, list.
367
368       [16]   nc:   Nonterminal cache, dictionary.
369
370       tc, the input queue of tokens waiting for processing and  the  terminal
371       cache  containing  the  tokens  already processing are one unified data
372       structure simply holding all tokens and  their  information,  with  the
373       current  location  separating  that  which has been processed from that
374       which is waiting.  Each element of the queue/cache is a list containing
375       the  token,  its  lexeme information, line number, and column index, in
376       this order.
377
378       All stacks have their top element aat the end, i.e. pushing an item  is
379       equivalent to appending to the list representing the stack, and popping
380       it removes the last element.
381
382       er, the error status is either empty or a list of two elements, a loca‐
383       tion  in  the input, and a list of messages, encoded as references into
384       the pool element of the code.
385
386       nc, the nonterminal cache is keyed by nonterminal  name  and  location,
387       each  value a four-element list containing current location, match sta‐
388       tus, semantic value, and error status, in this order.
389

BUGS, IDEAS, FEEDBACK

391       This document, and the package it describes, will  undoubtedly  contain
392       bugs and other problems.  Please report such in the category grammar_me
393       of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist].   Please
394       also  report any ideas for enhancements you may have for either package
395       and/or documentation.
396
397       When proposing code changes, please provide unified diffs, i.e the out‐
398       put of diff -u.
399
400       Note  further  that  attachments  are  strongly  preferred over inlined
401       patches. Attachments can be made by going  to  the  Edit  form  of  the
402       ticket  immediately  after  its  creation, and then using the left-most
403       button in the secondary navigation bar.
404

KEYWORDS

406       grammar, parsing, virtual machine
407

CATEGORY

409       Grammars and finite automata
410
412       Copyright (c) 2005-2006 Andreas Kupries <andreas_kupries@users.sourceforge.net>
413
414
415
416
417tcllib                                0.2            grammar::me::cpu::core(n)
Impressum