pt::peg::to::tclparam(n)         Parser Tools         pt::peg::to::tclparam(n)

______________________________________________________________________________

NAME

       pt::peg::to::tclparam - PEG Conversion. Write TCLPARAM format

SYNOPSIS

       package require Tcl  8.5

       package require pt::peg::to::tclparam  ?1.0.3?

       pt::peg::to::tclparam reset

       pt::peg::to::tclparam configure

       pt::peg::to::tclparam configure option

       pt::peg::to::tclparam configure option value...

       pt::peg::to::tclparam convert serial

______________________________________________________________________________

DESCRIPTION

       Are you lost? Do you have trouble understanding this document? In that
       case please read the overview provided by the Introduction to Parser
       Tools. That document is the entry point to the whole system the
       current package is a part of.

       This package implements the converter from parsing expression grammars
       to TCLPARAM markup.

       It resides in the Export section of the Core Layer of Parser Tools, and
       can be used either directly with the other packages of this layer, or
       indirectly through the export manager provided by pt::peg::export. The
       latter is intended for use in untrusted environments and is done
       through the corresponding export plugin pt::peg::export::tclparam
       sitting between converter and export manager.

       IMAGE: arch_core_eplugins

API

       The API provided by this package satisfies the specification of the
       Converter API found in the Parser Tools Export API specification.

       pt::peg::to::tclparam reset
              This command resets the configuration of the package to its
              default settings.

       pt::peg::to::tclparam configure
              This command returns a dictionary containing the current
              configuration of the package.

       pt::peg::to::tclparam configure option
              This command returns the current value of the specified
              configuration option of the package. For the set of legal
              options, please read the section Options.

       pt::peg::to::tclparam configure option value...
              This command sets the given configuration options of the
              package to the specified values. For the set of legal options,
              please read the section Options.

       pt::peg::to::tclparam convert serial
              This command takes the canonical serialization of a parsing
              expression grammar, as specified in section PEG serialization
              format, and contained in serial, and generates TCLPARAM markup
              encoding the grammar, per the current package configuration.
              The created string is then returned as the result of the
              command.
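
       As a minimal sketch of how the commands above fit together, the helper
       below resets the package, sets two options, converts a grammar, and
       restores the defaults afterwards. The helper name peg_to_tclparam and
       the option values are illustrative assumptions, not part of the
       package. The package is loaded inside the helper so that defining it
       does not by itself require Tcllib.

```tcl
package require Tcl 8.5-

# Hypothetical helper (not part of the package): convert a canonical PEG
# serialization with a few options set, then restore the defaults.
proc peg_to_tclparam {serial grammarName} {
    # Loaded lazily; needs Tcllib's Parser Tools to be installed.
    package require pt::peg::to::tclparam

    pt::peg::to::tclparam reset
    pt::peg::to::tclparam configure -name $grammarName -file inline
    set code [pt::peg::to::tclparam convert $serial]

    # Leave the package-wide configuration as we found it by default.
    pt::peg::to::tclparam reset
    return $code
}
```

       With a canonical serialization stored in the variable serial, the call
       peg_to_tclparam $serial calculator is expected to return the generated
       code as a string. Because the configuration is package-wide, resetting
       before and after keeps one conversion from leaking settings into the
       next.
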

OPTIONS

       The converter to Tcl/PARAM markup recognizes the following
       configuration variables and changes its behaviour as they specify.

       -template string
              The value of this configuration variable is a string into which
              to put the generated text and the other configuration settings.
              The various locations for user-data are expected to be specified
              with the placeholders listed below. The default value is
              "@code@".

              @user@ To be replaced with the value of the configuration
                     variable -user.

              @format@
                     To be replaced with the constant Tcl/PARAM.

              @file@ To be replaced with the value of the configuration
                     variable -file.

              @name@ To be replaced with the value of the configuration
                     variable -name.

              @code@ To be replaced with the generated Tcl code.

              The following placeholders are special, in that they will occur
              within the generated code, and are replaced there as well.

              @runtime@
                     To be replaced with the value of the configuration
                     variable -runtime-command.

              @self@ To be replaced with the value of the configuration
                     variable -self-command.

              @def@  To be replaced with the value of the configuration
                     variable -proc-command.

              @ns@   To be replaced with the value of the configuration
                     variable -namespace.

              @main@ To be replaced with the value of the configuration
                     variable -main.

              @prelude@
                     To be replaced with the value of the configuration
                     variable -prelude.

       -name string
              The value of this configuration variable is the name of the
              grammar for which the conversion is run. The default value is
              a_pe_grammar.

       -user string
              The value of this configuration variable is the name of the user
              for which the conversion is run. The default value is unknown.

       -file string
              The value of this configuration variable is the name of the file
              or other entity from which the grammar came, for which the
              conversion is run. The default value is unknown.

       -runtime-command string
              A Tcl string representing the Tcl command or reference to it
              used to call PARAM instructions from parser procedures, per the
              chosen framework (template). The default value is the empty
              string.

       -self-command string
              A Tcl string representing the Tcl command or reference to it
              used to call the parser procedures (methods ...) from another
              parser procedure, per the chosen framework (template). The
              default value is the empty string.

       -proc-command string
              The name of the Tcl command used to define procedures (methods
              ...), per the chosen framework (template). The default value is
              proc.

       -namespace string
              The name of the namespace the parser procedures (methods, ...)
              shall reside in, including the trailing '::' needed to separate
              it from the actual procedure name. The default value is ::.

       -main string
              The name of the main procedure (method, ...) to be called by the
              chosen framework (template) to start parsing input. The default
              value is __main.

       -prelude string
              A snippet of code to be inserted at the head of each generated
              parsing command. The default value is the empty string.

       -indent integer
              The number of characters to indent each line of the generated
              code by. The default value is 0.

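
       To illustrate how the placeholders of -template are meant to be
       filled, the sketch below performs the same kind of substitution with
       Tcl's string map. It is not the package's own implementation; the
       helper name fill_template and all values are assumptions made for the
       example.

```tcl
# Hypothetical helper: replace each @key@ placeholder in a template with
# the corresponding value from a dictionary.
proc fill_template {template values} {
    set map {}
    dict for {name value} $values {
        lappend map "@$name@" $value
    }
    # string map makes a single pass, so substituted text is not rescanned.
    return [string map $map $template]
}

set template "# Grammar @name@ (@format@), from @file@, for @user@\n@code@"

set text [fill_template $template {
    name   calculator
    format Tcl/PARAM
    file   calc.peg
    user   demo
    code   {proc ::demo {} {}}
}]
puts $text
```

       A template consisting only of "@code@", the default, would reduce the
       result to the generated code alone.
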
       While the high parameterizability of this converter, as shown by the
       multitude of options it supports, is an advantage to the advanced user,
       allowing her to customize the output of the converter as needed, a
       novice user will likely not see the forest for the trees.

       To help these latter users, two adjunct packages are provided, each
       containing a canned configuration which will generate immediately
       useful full parsers. These are

       pt::tclparam::configuration::snit
              Generated parsers are classes based on the snit package, i.e.
              snit::type's.

       pt::tclparam::configuration::tcloo
              Generated parsers are classes based on the TclOO package.

TCL/PARAM CODE REPRESENTATION OF PARSING EXPRESSION GRAMMARS

       The Tcl/PARAM representation of parsing expression grammars is Tcl code
       whose execution will parse input per the grammar. The code is based on
       the virtual machine documented in the PackRat Machine Specification,
       using its instructions and a few more to handle control flow.

       Note that the generated code by itself is not functional. It expects to
       be embedded into a framework which provides services like the PARAM
       state, implementations for the PARAM instructions, etc. The bulk of
       such a framework has to be specified through the option -template. The
       additional options

       -indent integer

       -main string

       -namespace string

       -prelude string

       -proc-command string

       -runtime-command string

       -self-command string

       provide code snippets which help to glue framework and generated code
       together. Their placeholders are in the generated code.

PEG SERIALIZATION FORMAT

       Here we specify the format used by the Parser Tools to serialize
       Parsing Expression Grammars as immutable values for transport,
       comparison, etc.

       We distinguish between regular and canonical serializations. While a
       PEG may have more than one regular serialization, exactly one of them
       will be canonical.

       regular serialization

              [1]    The serialization of any PEG is a nested Tcl dictionary.

              [2]    This dictionary holds a single key, pt::grammar::peg, and
                     its value. This value holds the contents of the grammar.

              [3]    The contents of the grammar are a Tcl dictionary holding
                     the set of nonterminal symbols and the starting
                     expression. The relevant keys and their values are

                     rules  The value is a Tcl dictionary whose keys are the
                            names of the nonterminal symbols known to the
                            grammar.

                            [1]    Each nonterminal symbol may occur only
                                   once.

                            [2]    The empty string is not a legal nonterminal
                                   symbol.

                            [3]    The value for each symbol is a Tcl
                                   dictionary itself. The relevant keys and
                                   their values in this dictionary are

                                   is     The value is the serialization of
                                          the parsing expression describing
                                          the symbol's sentential structure,
                                          as specified in the section PE
                                          serialization format.

                                   mode   The value can be one of three values
                                          specifying how a parser should
                                          handle the semantic value produced
                                          by the symbol.

                                          value  The semantic value of the
                                                 nonterminal symbol is an
                                                 abstract syntax tree
                                                 consisting of a single node
                                                 for the nonterminal itself,
                                                 which has the ASTs of the
                                                 symbol's right hand side as
                                                 its children.

                                          leaf   The semantic value of the
                                                 nonterminal symbol is an
                                                 abstract syntax tree
                                                 consisting of a single node
                                                 for the nonterminal, without
                                                 any children. Any ASTs
                                                 generated by the symbol's
                                                 right hand side are
                                                 discarded.

                                          void   The nonterminal has no
                                                 semantic value. Any ASTs
                                                 generated by the symbol's
                                                 right hand side are discarded
                                                 (as well).

                     start  The value is the serialization of the start
                            parsing expression of the grammar, as specified in
                            the section PE serialization format.

              [4]    The terminal symbols of the grammar are specified
                     implicitly as the set of all terminal symbols used in the
                     start expression and on the RHS of the grammar rules.

       canonical serialization
              The canonical serialization of a grammar has the format as
              specified in the previous item, and then additionally satisfies
              the constraints below, which make it unique among all the
              possible serializations of this grammar.

              [1]    The keys found in all the nested Tcl dictionaries are
                     sorted in ascending dictionary order, as generated by
                     Tcl's builtin command lsort -increasing -dict.

              [2]    The string representation of the value is the canonical
                     representation of a Tcl dictionary. I.e. it does not
                     contain superfluous whitespace.

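
       The key-ordering constraint can be sketched in plain Tcl as shown
       below. The procedures canon_dict and canon_peg are hypothetical
       helpers written for this illustration; they only reorder dictionary
       keys and do not normalize the nested parsing expressions, so they are
       not a replacement for the facilities of Parser Tools.

```tcl
# Rebuild a dictionary with its keys in ascending dictionary order.
proc canon_dict {d} {
    set out {}
    foreach k [lsort -increasing -dictionary [dict keys $d]] {
        lappend out $k [dict get $d $k]
    }
    return $out
}

# Sort the keys of a PEG serialization at every dictionary level:
# the grammar contents (rules, start) and each rule (is, mode).
proc canon_peg {serial} {
    set grammar [dict get $serial pt::grammar::peg]
    set rules   [dict get $grammar rules]
    set sorted  {}
    foreach sym [lsort -increasing -dictionary [dict keys $rules]] {
        lappend sorted $sym [canon_dict [dict get $rules $sym]]
    }
    return [list pt::grammar::peg \
        [list rules $sorted start [dict get $grammar start]]]
}

# A regular, non-canonical serialization: keys out of order.
set messy {pt::grammar::peg {start {n B} rules {B {mode value is {t b}} A {mode void is {n B}}}}}
puts [canon_peg $messy]
```

       Building the result with list and lappend also yields a string
       representation without superfluous whitespace at the levels that are
       rebuilt.
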
   EXAMPLE
       Assuming the following PEG for simple mathematical expressions

              PEG calculator (Expression)
                  Digit      <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9'       ;
                  Sign       <- '-' / '+'                                     ;
                  Number     <- Sign? Digit+                                  ;
                  Expression <- Term (AddOp Term)*                            ;
                  MulOp      <- '*' / '/'                                     ;
                  Term       <- Factor (MulOp Factor)*                        ;
                  AddOp      <- '+'/'-'                                       ;
                  Factor     <- '(' Expression ')' / Number                   ;
              END;

       then its canonical serialization (except for whitespace) is

              pt::grammar::peg {
                  rules {
                      AddOp      {is {/ {t -} {t +}} mode value}
                      Digit      {is {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}} mode value}
                      Expression {is {x {n Term} {* {x {n AddOp} {n Term}}}} mode value}
                      Factor     {is {/ {x {t (} {n Expression} {t )}} {n Number}} mode value}
                      MulOp      {is {/ {t *} {t /}} mode value}
                      Number     {is {x {? {n Sign}} {+ {n Digit}}} mode value}
                      Sign       {is {/ {t -} {t +}} mode value}
                      Term       {is {x {n Factor} {* {x {n MulOp} {n Factor}}}} mode value}
                  }
                  start {n Expression}
              }
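
       Since a serialization is an ordinary nested Tcl dictionary, it can be
       inspected with the dict command. The sketch below uses a cut-down
       variant of the calculator grammar (an assumption made to keep the
       example short) and reads out the nonterminals, the start expression,
       and the definition of one rule.

```tcl
# Cut-down grammar for illustration; whitespace added for readability.
set serial {
    pt::grammar::peg {
        rules {
            Digit  {is {/ {t 0} {t 1}}                mode value}
            Number {is {x {? {n Sign}} {+ {n Digit}}} mode value}
            Sign   {is {/ {t -} {t +}}                mode value}
        }
        start {n Number}
    }
}

set grammar  [dict get $serial pt::grammar::peg]
set rules    [dict get $grammar rules]

set symbols  [dict keys $rules]            ;# names of all nonterminals
set startPE  [dict get $grammar start]     ;# start parsing expression
set numberPE [dict get $rules Number is]   ;# expression of rule Number

# Collect the semantic mode of every nonterminal.
set modes {}
dict for {sym def} $rules {
    lappend modes $sym [dict get $def mode]
}

puts "symbols: $symbols"
puts "start:   $startPE"
puts "Number:  $numberPE"
```
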

PE SERIALIZATION FORMAT

       Here we specify the format used by the Parser Tools to serialize
       Parsing Expressions as immutable values for transport, comparison, etc.

       We distinguish between regular and canonical serializations. While a
       parsing expression may have more than one regular serialization,
       exactly one of them will be canonical.

       Regular serialization

              Atomic Parsing Expressions

                     [1]    The string epsilon is an atomic parsing
                            expression. It matches the empty string.

                     [2]    The string dot is an atomic parsing expression. It
                            matches any character.

                     [3]    The string alnum is an atomic parsing expression.
                            It matches any Unicode alphabet or digit
                            character. This is a custom extension of PEs based
                            on Tcl's builtin command string is.

                     [4]    The string alpha is an atomic parsing expression.
                            It matches any Unicode alphabet character. This is
                            a custom extension of PEs based on Tcl's builtin
                            command string is.

                     [5]    The string ascii is an atomic parsing expression.
                            It matches any Unicode character below U+0080.
                            This is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [6]    The string control is an atomic parsing
                            expression. It matches any Unicode control
                            character. This is a custom extension of PEs based
                            on Tcl's builtin command string is.

                     [7]    The string digit is an atomic parsing expression.
                            It matches any Unicode digit character. Note that
                            this includes characters outside of the [0..9]
                            range. This is a custom extension of PEs based on
                            Tcl's builtin command string is.

                     [8]    The string graph is an atomic parsing expression.
                            It matches any Unicode printing character, except
                            for space. This is a custom extension of PEs based
                            on Tcl's builtin command string is.

                     [9]    The string lower is an atomic parsing expression.
                            It matches any Unicode lower-case alphabet
                            character. This is a custom extension of PEs based
                            on Tcl's builtin command string is.

                     [10]   The string print is an atomic parsing expression.
                            It matches any Unicode printing character,
                            including space. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [11]   The string punct is an atomic parsing expression.
                            It matches any Unicode punctuation character. This
                            is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [12]   The string space is an atomic parsing expression.
                            It matches any Unicode space character. This is a
                            custom extension of PEs based on Tcl's builtin
                            command string is.

                     [13]   The string upper is an atomic parsing expression.
                            It matches any Unicode upper-case alphabet
                            character. This is a custom extension of PEs based
                            on Tcl's builtin command string is.

                     [14]   The string wordchar is an atomic parsing
                            expression. It matches any Unicode word character.
                            This is any alphanumeric character (see alnum),
                            and any connector punctuation characters (e.g.
                            underscore). This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [15]   The string xdigit is an atomic parsing expression.
                            It matches any hexadecimal digit character. This
                            is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [16]   The string ddigit is an atomic parsing expression.
                            It matches any decimal digit character. This is a
                            custom extension of PEs based on Tcl's builtin
                            command regexp.

                     [17]   The expression [list t x] is an atomic parsing
                            expression. It matches the terminal string x.

                     [18]   The expression [list n A] is an atomic parsing
                            expression. It matches the nonterminal A.

              Combined Parsing Expressions

                     [1]    For parsing expressions e1, e2, ... the result of
                            [list / e1 e2 ... ] is a parsing expression as
                            well. This is the ordered choice, aka prioritized
                            choice.

                     [2]    For parsing expressions e1, e2, ... the result of
                            [list x e1 e2 ... ] is a parsing expression as
                            well. This is the sequence.

                     [3]    For a parsing expression e the result of [list *
                            e] is a parsing expression as well. This is the
                            kleene closure, describing zero or more
                            repetitions.

                     [4]    For a parsing expression e the result of [list +
                            e] is a parsing expression as well. This is the
                            positive kleene closure, describing one or more
                            repetitions.

                     [5]    For a parsing expression e the result of [list &
                            e] is a parsing expression as well. This is the
                            and lookahead predicate.

                     [6]    For a parsing expression e the result of [list !
                            e] is a parsing expression as well. This is the
                            not lookahead predicate.

                     [7]    For a parsing expression e the result of [list ?
                            e] is a parsing expression as well. This is the
                            optional input.

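
       The character-class atoms above are described in terms of Tcl's string
       is and regexp commands. The sketch below shows how such a test on a
       single character might look. The procedure atom_matches is a
       hypothetical helper for illustration and not how the generated
       parsers or the PARAM runtime are actually implemented.

```tcl
# Hypothetical helper: does the single character ch belong to the class
# named by the atomic parsing expression atom?
proc atom_matches {atom ch} {
    switch -exact -- $atom {
        dot {
            # Any one character.
            return [expr {[string length $ch] == 1}]
        }
        ddigit {
            # Decimal digits only, tested via regexp.
            return [regexp {^[0-9]$} $ch]
        }
        alnum - alpha - ascii - control - digit - graph - lower -
        print - punct - space - upper - wordchar - xdigit {
            # These names coincide with classes of "string is".
            # -strict makes the empty string fail the test.
            return [string is $atom -strict $ch]
        }
        default {
            error "not a character-class atom: $atom"
        }
    }
}

puts [atom_matches digit 7]
puts [atom_matches alpha 7]
puts [atom_matches xdigit f]
```

       The distinction between digit and ddigit matters for non-ASCII input:
       ddigit accepts only the characters 0 to 9, whereas digit is defined
       over Unicode.
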
       Canonical serialization
              The canonical serialization of a parsing expression has the
              format as specified in the previous item, and then additionally
              satisfies the constraints below, which make it unique among all
              the possible serializations of this parsing expression.

              [1]    The string representation of the value is the canonical
                     representation of a pure Tcl list. I.e. it does not
                     contain superfluous whitespace.

              [2]    Terminals are not encoded as ranges (where start and end
                     of the range are identical).

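
       Constraint [1] can be illustrated by rebuilding a parsing expression
       recursively with the list command, which produces a string
       representation free of superfluous whitespace. The procedure
       pe_normalize is a hypothetical helper; it covers only the operators
       listed in this document and does not address constraint [2].

```tcl
# Hypothetical helper: rebuild a PE serialization so that its string
# representation is that of a pure Tcl list.
proc pe_normalize {pe} {
    set op [lindex $pe 0]
    switch -exact -- $op {
        t - n {
            # Terminal or nonterminal: operator plus one argument.
            return [list $op [lindex $pe 1]]
        }
        / - x - * - + - & - ! - ? {
            # Combined expression: normalize every operand.
            set out [list $op]
            foreach sub [lrange $pe 1 end] {
                lappend out [pe_normalize $sub]
            }
            return $out
        }
        default {
            # epsilon, dot, character classes, and anything not
            # described here: rebuild the list as it is.
            return [list {*}$pe]
        }
    }
}

set messy "x   {n  Term}  {*  {x {n AddOp}   {n Term}}}"
puts [pe_normalize $messy]
```
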
   EXAMPLE
       Assuming the parsing expression shown on the right-hand side of the
       rule

                  Expression <- Term (AddOp Term)*

       then its canonical serialization (except for whitespace) is

                  {x {n Term} {* {x {n AddOp} {n Term}}}}
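
       The same serialization can be assembled programmatically from the
       [list ...] forms given in the regular serialization. The small
       constructor procedures below (pe_t, pe_n, pe_seq, pe_choice, pe_star)
       are hypothetical names introduced for this sketch.

```tcl
# Hypothetical constructors mirroring the [list ...] forms of the format.
proc pe_t      {str}  { list t $str }       ;# terminal
proc pe_n      {sym}  { list n $sym }       ;# nonterminal
proc pe_seq    {args} { list x {*}$args }   ;# sequence
proc pe_choice {args} { list / {*}$args }   ;# ordered choice
proc pe_star   {pe}   { list * $pe }        ;# kleene closure

# Expression <- Term (AddOp Term)*
set pe [pe_seq [pe_n Term] [pe_star [pe_seq [pe_n AddOp] [pe_n Term]]]]
puts $pe

# AddOp <- '-' / '+' in the order used by the serialized example
set addop [pe_choice [pe_t -] [pe_t +]]
puts $addop
```

       Because every level is built by list, the result carries no
       superfluous whitespace.
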

BUGS, IDEAS, FEEDBACK

       This document, and the package it describes, will undoubtedly contain
       bugs and other problems. Please report such in the category pt of the
       Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please also
       report any ideas for enhancements you may have for either package
       and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the
       output of diff -u.

       Note further that attachments are strongly preferred over inlined
       patches. Attachments can be made by going to the Edit form of the
       ticket immediately after its creation, and then using the left-most
       button in the secondary navigation bar.

KEYWORDS

       EBNF, LL(k), PEG, TCLPARAM, TDPL, context-free languages, conversion,
       expression, format conversion, grammar, matching, parser, parsing
       expression, parsing expression grammar, push down automaton, recursive
       descent, serialization, state, top-down parsing languages, transducer

CATEGORY

       Parsing and Grammars

COPYRIGHT

       Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>

tcllib                               1.0.3            pt::peg::to::tclparam(n)