pt::peg::to::tclparam(n)          Parser Tools          pt::peg::to::tclparam(n)

______________________________________________________________________________

pt::peg::to::tclparam - PEG Conversion. Write TCLPARAM format

package require Tcl 8.5

package require pt::peg::to::tclparam ?1.0.3?

pt::peg::to::tclparam reset

pt::peg::to::tclparam configure

pt::peg::to::tclparam configure option

pt::peg::to::tclparam configure option value...

pt::peg::to::tclparam convert serial

______________________________________________________________________________

Are you lost? Do you have trouble understanding this document? In that
case, please read the overview provided by the Introduction to Parser
Tools. That document is the entry point to the whole system of which the
current package is a part.

This package implements the converter from parsing expression grammars
to TCLPARAM markup.

It resides in the Export section of the Core Layer of Parser Tools, and
can be used either directly with the other packages of this layer, or
indirectly through the export manager provided by pt::peg::export. The
latter is intended for use in untrusted environments and is done through
the corresponding export plugin pt::peg::export::tclparam, which sits
between converter and export manager.

IMAGE: arch_core_eplugins

The API provided by this package satisfies the specification of the
Converter API found in the Parser Tools Export API specification.

pt::peg::to::tclparam reset
    This command resets the configuration of the package to its
    default settings.

pt::peg::to::tclparam configure
    This command returns a dictionary containing the current
    configuration of the package.

pt::peg::to::tclparam configure option
    This command returns the current value of the specified
    configuration option of the package. For the set of legal options,
    please read the section Options.

pt::peg::to::tclparam configure option value...
    This command sets the given configuration options of the package
    to the specified values. For the set of legal options, please read
    the section Options.

pt::peg::to::tclparam convert serial
    This command takes the canonical serialization of a parsing
    expression grammar, as specified in section PEG serialization
    format, and contained in serial, and generates TCLPARAM markup
    encoding the grammar, per the current package configuration. The
    created string is then returned as the result of the command.
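
The commands above might be combined as in the following minimal sketch.
It assumes that tcllib, including the Parser Tools packages, is
installed. The one-rule grammar and the option values tiny, demo and
inline are made up for illustration. The generated text depends on the
package version, so it is printed rather than shown here.

```tcl
# A tiny grammar, hand-written in canonical form:  A <- 'a'
set serial {pt::grammar::peg {rules {A {is {t a} mode value}} start {n A}}}

# Guard the package load so the script also runs where tcllib is missing.
if {[catch {package require pt::peg::to::tclparam} err]} {
    puts "pt::peg::to::tclparam is not available: $err"
} else {
    # Start from the default settings.
    pt::peg::to::tclparam reset

    # Set several options in one call (hypothetical values).
    pt::peg::to::tclparam configure -name tiny -user demo -file inline

    # Query a single option, then the whole configuration dictionary.
    puts [pt::peg::to::tclparam configure -name]
    puts [pt::peg::to::tclparam configure]

    # Convert the grammar; the result is the generated text.
    puts [pt::peg::to::tclparam convert $serial]
}
```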

The converter to Tcl/PARAM markup recognizes the following
configuration variables and changes its behaviour as they specify.

-template string
    The value of this configuration variable is a string into which to
    put the generated text and the other configuration settings. The
    various locations for user-data are expected to be specified with
    the placeholders listed below. The default value is "@code@".

    @user@
        To be replaced with the value of the configuration variable
        -user.

    @format@
        To be replaced with the constant Tcl/PARAM.

    @file@
        To be replaced with the value of the configuration variable
        -file.

    @name@
        To be replaced with the value of the configuration variable
        -name.

    @code@
        To be replaced with the generated Tcl code.

    The following placeholders are special, in that they occur within
    the generated code, and are replaced there as well.

    @runtime@
        To be replaced with the value of the configuration variable
        -runtime-command.

    @self@
        To be replaced with the value of the configuration variable
        -self-command.

    @def@
        To be replaced with the value of the configuration variable
        -proc-command.

    @ns@
        To be replaced with the value of the configuration variable
        -namespace.

    @main@
        To be replaced with the value of the configuration variable
        -main.

    @prelude@
        To be replaced with the value of the configuration variable
        -prelude.

-name string
    The value of this configuration variable is the name of the grammar
    for which the conversion is run. The default value is a_pe_grammar.

-user string
    The value of this configuration variable is the name of the user
    for which the conversion is run. The default value is unknown.

-file string
    The value of this configuration variable is the name of the file or
    other entity from which the grammar came, for which the conversion
    is run. The default value is unknown.

-runtime-command string
    A Tcl string representing the Tcl command, or a reference to it,
    used to call PARAM instructions from parser procedures, per the
    chosen framework (template). The default value is the empty string.

-self-command string
    A Tcl string representing the Tcl command, or a reference to it,
    used to call the parser procedures (methods, ...) from another
    parser procedure, per the chosen framework (template). The default
    value is the empty string.

-proc-command string
    The name of the Tcl command used to define procedures (methods,
    ...), per the chosen framework (template). The default value is
    proc.

-namespace string
    The name of the namespace the parser procedures (methods, ...)
    shall reside in, including the trailing '::' needed to separate it
    from the actual procedure name. The default value is ::.

-main string
    The name of the main procedure (method, ...) to be called by the
    chosen framework (template) to start parsing input. The default
    value is __main.

-prelude string
    A snippet of code to be inserted at the head of each generated
    parsing command. The default value is the empty string.

-indent integer
    The number of characters to indent each line of the generated code
    by. The default value is 0.
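
The placeholder mechanism of the -template option can be pictured as a
plain textual substitution. The sketch below is only an illustration
written with Tcl's builtin string map. The helper fill_template is
hypothetical and is not the package's implementation. It covers only
the template-level placeholders, not those inside the generated code.

```tcl
# Hypothetical helper, for illustration only. It substitutes the
# template-level placeholders with values from a configuration
# dictionary and with the (already generated) code.
proc fill_template {template config code} {
    set map [list \
        @user@   [dict get $config -user] \
        @format@ Tcl/PARAM \
        @file@   [dict get $config -file] \
        @name@   [dict get $config -name] \
        @code@   $code]
    return [string map $map $template]
}

# The documented default values of -user, -file and -name.
set config {-user unknown -file unknown -name a_pe_grammar}

# A made-up template with a header comment in front of the code.
set template "# Parser for @name@ (@format@), by @user@ from @file@\n@code@"

puts [fill_template $template $config "# generated code would go here"]
```

With the default template "@code@" the result is just the generated
code itself.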

While the high parameterizability of this converter, as shown by the
multitude of options it supports, is an advantage to the advanced user,
allowing her to customize the output of the converter as needed, a
novice user will likely not see the forest for the trees.

To help these latter users, two adjunct packages are provided, each
containing a canned configuration which will generate immediately
useful full parsers. These are

pt::tclparam::configuration::snit
    Generated parsers are classes based on the snit package, i.e.
    snit::type's.

pt::tclparam::configuration::tcloo
    Generated parsers are classes based on the TclOO package.
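
A canned configuration might be applied as sketched below. This is an
assumption-laden example. The def subcommand and its argument order
(class name, package name, version, command prefix) are taken from my
reading of the adjunct package's own manpage and should be verified
there. TinyParser, tinyparser and 1.0 are hypothetical names.

```tcl
# Tiny grammar in canonical form:  A <- 'a'
set serial {pt::grammar::peg {rules {A {is {t a} mode value}} start {n A}}}

# Guard the package loads so the script also runs where tcllib is missing.
if {[catch {
    package require pt::peg::to::tclparam
    package require pt::tclparam::configuration::snit
} err]} {
    puts "Parser Tools packages are not available: $err"
} else {
    pt::peg::to::tclparam reset

    # ASSUMED signature: def <class> <package> <version> <cmdprefix>.
    # The command prefix tells the configuration package how to set
    # options on the converter.
    pt::tclparam::configuration::snit def TinyParser tinyparser 1.0 \
        {pt::peg::to::tclparam configure}

    # The converter now emits a complete snit-based parser class.
    puts [pt::peg::to::tclparam convert $serial]
}
```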

The Tcl/PARAM representation of parsing expression grammars is Tcl code
whose execution will parse input per the grammar. The code is based on
the virtual machine documented in the PackRat Machine Specification,
using its instructions and a few more to handle control flow.

Note that the generated code by itself is not functional. It expects to
be embedded into a framework which provides services like the PARAM
state, implementations for the PARAM instructions, etc. The bulk of
such a framework has to be specified through the option -template. The
additional options

    -indent integer

    -main string

    -namespace string

    -prelude string

    -proc-command string

    -runtime-command string

    -self-command string

provide code snippets which help to glue framework and generated code
together. Their placeholders are in the generated code.
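
As a rough sketch of how the glue options are set together, the
following collects them in a dictionary and hands them to configure in
one call. All values are hypothetical and do not form a working
framework. A real one has to supply a matching -template as well.

```tcl
# Hypothetical glue settings, for illustration only.
set glue [dict create \
    -proc-command    method \
    -self-command    {$self} \
    -runtime-command {$self runtime} \
    -main            MAIN \
    -indent          4]

# Guard the package load so the script also runs where tcllib is missing.
if {[catch {package require pt::peg::to::tclparam} err]} {
    puts "pt::peg::to::tclparam is not available: $err"
} else {
    pt::peg::to::tclparam reset

    # 'configure option value...' accepts several pairs at once.
    pt::peg::to::tclparam configure {*}$glue

    # Print the resulting configuration, one option per line.
    dict for {option value} [pt::peg::to::tclparam configure] {
        puts [format "%-18s %s" $option $value]
    }
}
```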

Here we specify the format used by the Parser Tools to serialize
Parsing Expression Grammars as immutable values for transport,
comparison, etc.

We distinguish between regular and canonical serializations. While a
PEG may have more than one regular serialization, exactly one of them
will be canonical.

regular serialization

    [1] The serialization of any PEG is a nested Tcl dictionary.

    [2] This dictionary holds a single key, pt::grammar::peg, and its
        value. This value holds the contents of the grammar.

    [3] The contents of the grammar are a Tcl dictionary holding the
        set of nonterminal symbols and the starting expression. The
        relevant keys and their values are

        rules
            The value is a Tcl dictionary whose keys are the names of
            the nonterminal symbols known to the grammar.

            [1] Each nonterminal symbol may occur only once.

            [2] The empty string is not a legal nonterminal symbol.

            [3] The value for each symbol is a Tcl dictionary itself.
                The relevant keys and their values in this dictionary
                are

                is
                    The value is the serialization of the parsing
                    expression describing the symbol's sentential
                    structure, as specified in the section PE
                    serialization format.

                mode
                    The value can be one of three values specifying
                    how a parser should handle the semantic value
                    produced by the symbol.

                    value
                        The semantic value of the nonterminal symbol
                        is an abstract syntax tree consisting of a
                        single node for the nonterminal itself, which
                        has the ASTs of the symbol's right hand side
                        as its children.

                    leaf
                        The semantic value of the nonterminal symbol
                        is an abstract syntax tree consisting of a
                        single node for the nonterminal, without any
                        children. Any ASTs generated by the symbol's
                        right hand side are discarded.

                    void
                        The nonterminal has no semantic value. Any
                        ASTs generated by the symbol's right hand side
                        are discarded (as well).

        start
            The value is the serialization of the start parsing
            expression of the grammar, as specified in the section PE
            serialization format.

    [4] The terminal symbols of the grammar are specified implicitly
        as the set of all terminal symbols used in the start
        expression and on the RHS of the grammar rules.

canonical serialization
    The canonical serialization of a grammar has the format as
    specified in the previous item, and then additionally satisfies
    the constraints below, which make it unique among all the possible
    serializations of this grammar.

    [1] The keys found in all the nested Tcl dictionaries are sorted
        in ascending dictionary order, as generated by Tcl's builtin
        command lsort -increasing -dict.

    [2] The string representation of the value is the canonical
        representation of a Tcl dictionary. I.e. it does not contain
        superfluous whitespace.

EXAMPLE
Assuming the following PEG for simple mathematical expressions

    PEG calculator (Expression)
        Digit      <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9' ;
        Sign       <- '-' / '+'                               ;
        Number     <- Sign? Digit+                            ;
        Expression <- Term (AddOp Term)*                      ;
        MulOp      <- '*' / '/'                               ;
        Term       <- Factor (MulOp Factor)*                  ;
        AddOp      <- '+'/'-'                                 ;
        Factor     <- '(' Expression ')' / Number             ;
    END;

then its canonical serialization (except for whitespace) is

    pt::grammar::peg {
        rules {
            AddOp      {is {/ {t +} {t -}} mode value}
            Digit      {is {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}} mode value}
            Expression {is {x {n Term} {* {x {n AddOp} {n Term}}}} mode value}
            Factor     {is {/ {x {t (} {n Expression} {t )}} {n Number}} mode value}
            MulOp      {is {/ {t *} {t /}} mode value}
            Number     {is {x {? {n Sign}} {+ {n Digit}}} mode value}
            Sign       {is {/ {t -} {t +}} mode value}
            Term       {is {x {n Factor} {* {x {n MulOp} {n Factor}}}} mode value}
        }
        start {n Expression}
    }
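
The key-ordering constraint of the canonical form can be illustrated
with plain Tcl. The following sketch is not part of the package. The
procedures sort_dict and canonical_peg are hypothetical helpers, and
the two-rule grammar is made up. The sketch only reorders dictionary
keys and assumes that the parsing expressions stored under is are
already in canonical form.

```tcl
# Return dictionary d with its keys in ascending dictionary order.
proc sort_dict {d} {
    set out {}
    foreach key [lsort -increasing -dictionary [dict keys $d]] {
        lappend out $key [dict get $d $key]
    }
    return $out
}

# Reorder the keys of a regular PEG serialization at every level.
proc canonical_peg {serial} {
    set grammar [dict get $serial pt::grammar::peg]
    set rules {}
    foreach {symbol def} [sort_dict [dict get $grammar rules]] {
        lappend rules $symbol [sort_dict $def]
    }
    return [list pt::grammar::peg \
        [list rules $rules start [dict get $grammar start]]]
}

# A regular, non-canonical serialization: keys are in arbitrary order.
set regular {pt::grammar::peg {start {n Number} rules {Sign {mode value is {/ {t -} {t +}}} Number {mode value is {x {? {n Sign}} {+ digit}}}}}}

puts [canonical_peg $regular]
# Individual parts are reachable through nested dict access:
puts [dict get $regular pt::grammar::peg rules Sign is]
```

Building the result with list rather than by string concatenation keeps
the string representation free of superfluous whitespace.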


Here we specify the format used by the Parser Tools to serialize
Parsing Expressions as immutable values for transport, comparison, etc.

We distinguish between regular and canonical serializations. While a
parsing expression may have more than one regular serialization,
exactly one of them will be canonical.

Regular serialization

    Atomic Parsing Expressions

        [1] The string epsilon is an atomic parsing expression. It
            matches the empty string.

        [2] The string dot is an atomic parsing expression. It matches
            any character.

        [3] The string alnum is an atomic parsing expression. It
            matches any Unicode alphabet or digit character. This is a
            custom extension of PEs based on Tcl's builtin command
            string is.

        [4] The string alpha is an atomic parsing expression. It
            matches any Unicode alphabet character. This is a custom
            extension of PEs based on Tcl's builtin command string is.

        [5] The string ascii is an atomic parsing expression. It
            matches any Unicode character below U+0080. This is a
            custom extension of PEs based on Tcl's builtin command
            string is.

        [6] The string control is an atomic parsing expression. It
            matches any Unicode control character. This is a custom
            extension of PEs based on Tcl's builtin command string is.

        [7] The string digit is an atomic parsing expression. It
            matches any Unicode digit character. Note that this
            includes characters outside of the [0..9] range. This is a
            custom extension of PEs based on Tcl's builtin command
            string is.

        [8] The string graph is an atomic parsing expression. It
            matches any Unicode printing character, except for space.
            This is a custom extension of PEs based on Tcl's builtin
            command string is.

        [9] The string lower is an atomic parsing expression. It
            matches any Unicode lower-case alphabet character. This is
            a custom extension of PEs based on Tcl's builtin command
            string is.

        [10] The string print is an atomic parsing expression. It
            matches any Unicode printing character, including space.
            This is a custom extension of PEs based on Tcl's builtin
            command string is.

        [11] The string punct is an atomic parsing expression. It
            matches any Unicode punctuation character. This is a
            custom extension of PEs based on Tcl's builtin command
            string is.

        [12] The string space is an atomic parsing expression. It
            matches any Unicode space character. This is a custom
            extension of PEs based on Tcl's builtin command string is.

        [13] The string upper is an atomic parsing expression. It
            matches any Unicode upper-case alphabet character. This is
            a custom extension of PEs based on Tcl's builtin command
            string is.

        [14] The string wordchar is an atomic parsing expression. It
            matches any Unicode word character. This is any
            alphanumeric character (see alnum), and any connector
            punctuation characters (e.g. underscore). This is a custom
            extension of PEs based on Tcl's builtin command string is.

        [15] The string xdigit is an atomic parsing expression. It
            matches any hexadecimal digit character. This is a custom
            extension of PEs based on Tcl's builtin command string is.

        [16] The string ddigit is an atomic parsing expression. It
            matches any decimal digit character. This is a custom
            extension of PEs based on Tcl's builtin command regexp.

        [17] The expression [list t x] is an atomic parsing
            expression. It matches the terminal string x.

        [18] The expression [list n A] is an atomic parsing
            expression. It matches the nonterminal A.

    Combined Parsing Expressions

        [1] For parsing expressions e1, e2, ... the result of
            [list / e1 e2 ... ] is a parsing expression as well. This
            is the ordered choice, aka prioritized choice.

        [2] For parsing expressions e1, e2, ... the result of
            [list x e1 e2 ... ] is a parsing expression as well. This
            is the sequence.

        [3] For a parsing expression e the result of [list * e] is a
            parsing expression as well. This is the kleene closure,
            describing zero or more repetitions.

        [4] For a parsing expression e the result of [list + e] is a
            parsing expression as well. This is the positive kleene
            closure, describing one or more repetitions.

        [5] For a parsing expression e the result of [list & e] is a
            parsing expression as well. This is the and lookahead
            predicate.

        [6] For a parsing expression e the result of [list ! e] is a
            parsing expression as well. This is the not lookahead
            predicate.

        [7] For a parsing expression e the result of [list ? e] is a
            parsing expression as well. This is the optional input.

Canonical serialization
    The canonical serialization of a parsing expression has the format
    as specified in the previous item, and then additionally satisfies
    the constraints below, which make it unique among all the possible
    serializations of this parsing expression.

    [1] The string representation of the value is the canonical
        representation of a pure Tcl list. I.e. it does not contain
        superfluous whitespace.

    [2] Terminals are not encoded as ranges (where start and end of
        the range are identical).

EXAMPLE
Assuming the parsing expression shown on the right-hand side of the
rule

    Expression <- Term (AddOp Term)*

then its canonical serialization (except for whitespace) is

    {x {n Term} {* {x {n AddOp} {n Term}}}}
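
Because parsing expressions are ordinary Tcl lists, the expression
above can be built and inspected with builtin list commands alone. The
sketch below constructs it with list and walks it recursively. The
procedure pe_nonterminals is a hypothetical helper, not part of Parser
Tools, and it handles only the operators listed in this section.

```tcl
# Build the PE for:  Term (AddOp Term)*
set pe [list x [list n Term] [list * [list x [list n AddOp] [list n Term]]]]
puts $pe

# Collect the names of all nonterminals referenced by a PE.
# Atoms other than {n A} (terminals, epsilon, dot, alnum, ...) have
# no children and contribute nothing.
proc pe_nonterminals {pe} {
    set op [lindex $pe 0]
    switch -exact -- $op {
        n {
            return [list [lindex $pe 1]]
        }
        / - x - * - + - & - ! - ? {
            set result {}
            foreach child [lrange $pe 1 end] {
                lappend result {*}[pe_nonterminals $child]
            }
            return [lsort -unique $result]
        }
        default {
            return {}
        }
    }
}

puts [pe_nonterminals $pe]
```

Using list for construction yields a string representation without
superfluous whitespace, which is what the canonical form requires.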


This document, and the package it describes, will undoubtedly contain
bugs and other problems. Please report such in the category pt of the
Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please also
report any ideas for enhancements you may have for either package
and/or documentation.

When proposing code changes, please provide unified diffs, i.e. the
output of diff -u.

Note further that attachments are strongly preferred over inlined
patches. Attachments can be made by going to the Edit form of the
ticket immediately after its creation, and then using the left-most
button in the secondary navigation bar.

EBNF, LL(k), PEG, TCLPARAM, TDPL, context-free languages, conversion,
expression, format conversion, grammar, matching, parser, parsing
expression, parsing expression grammar, push down automaton, recursive
descent, serialization, state, top-down parsing languages, transducer

Parsing and Grammars

Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>

tcllib                               1.0.3            pt::peg::to::tclparam(n)