1pt::peg::to::tclparam(n) Parser Tools pt::peg::to::tclparam(n)
2
3
4
5______________________________________________________________________________
6
8 pt::peg::to::tclparam - PEG Conversion. Write TCLPARAM format
9
11 package require Tcl 8.5
12
13 package require pt::peg::to::tclparam ?1.0.3?
14
15 pt::peg::to::tclparam reset
16
17 pt::peg::to::tclparam configure
18
19 pt::peg::to::tclparam configure option
20
21 pt::peg::to::tclparam configure option value...
22
23 pt::peg::to::tclparam convert serial
24
25______________________________________________________________________________
26
28 Are you lost? Do you have trouble understanding this document? In
29 that case please read the overview provided by the Introduction to
30 Parser Tools. That document is the entry point to the whole system of
31 which the current package is a part.
32
33 This package implements the converter from parsing expression grammars
34 to TCLPARAM markup.
35
36 It resides in the Export section of the Core Layer of Parser Tools, and
37 can be used either directly with the other packages of this layer, or
38 indirectly through the export manager provided by pt::peg::export. The
39 latter is intended for use in untrusted environments and done through
40 the corresponding export plugin pt::peg::export::tclparam sitting be‐
41 tween converter and export manager.
42
43 IMAGE: arch_core_eplugins
44
46 The API provided by this package satisfies the specification of the
47 Converter API found in the Parser Tools Export API specification.
48
49 pt::peg::to::tclparam reset
50 This command resets the configuration of the package to its de‐
51 fault settings.
52
53 pt::peg::to::tclparam configure
54 This command returns a dictionary containing the current config‐
55 uration of the package.
56
57 pt::peg::to::tclparam configure option
58 This command returns the current value of the specified configu‐
59 ration option of the package. For the set of legal options,
60 please read the section Options.
61
62 pt::peg::to::tclparam configure option value...
63 This command sets the given configuration options of the pack‐
64 age, to the specified values. For the set of legal options,
65 please read the section Options.
66
67 pt::peg::to::tclparam convert serial
68 This command takes the canonical serialization of a parsing ex‐
69 pression grammar, as specified in section PEG serialization for‐
70 mat, and contained in serial, and generates TCLPARAM markup en‐
71 coding the grammar, per the current package configuration. The
72 created string is then returned as the result of the command.
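
The commands above can be combined as in the following minimal sketch. It assumes that Tcllib is installed so that the package can be loaded. The single-rule grammar and the values demo and alice are made up for illustration. The serialization is built with [list] so that its string representation carries no superfluous whitespace, as the canonical form requires.

```tcl
package require Tcl 8.5
package require pt::peg::to::tclparam

# Illustrative grammar (an assumption, not taken from the package):
# one nonterminal A matching the terminal "a", with A as start expression.
set serial [list pt::grammar::peg \
    [list rules [list A [list is [list t a] mode value]] \
          start [list n A]]]

# Start from the default settings, then change two options.
pt::peg::to::tclparam reset
pt::peg::to::tclparam configure -name demo -user alice

# Query a single option, then the full configuration dictionary.
puts [pt::peg::to::tclparam configure -name]
puts [pt::peg::to::tclparam configure]

# Generate the Tcl/PARAM code for the grammar.
set code [pt::peg::to::tclparam convert $serial]
puts $code
```

With the default -template value "@code@" the result consists of the generated parsing procedures only, without any surrounding framework.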
73
75 The converter to Tcl/PARAM markup recognizes the following configura‐
76 tion variables and changes its behaviour as they specify.
77
78 -template string
79 The value of this configuration variable is a string into which
80 to put the generated text and the other configuration settings.
81 The various locations for user-data are expected to be specified
82 with the placeholders listed below. The default value is
83 "@code@".
84
85 @user@ To be replaced with the value of the configuration vari‐
86 able -user.
87
88 @format@
89 To be replaced with the constant Tcl/PARAM.
90
91 @file@ To be replaced with the value of the configuration vari‐
92 able -file.
93
94 @name@ To be replaced with the value of the configuration vari‐
95 able -name.
96
97 @code@ To be replaced with the generated Tcl code.
98
99 The following placeholders are special, in that they will
100 occur within the generated code, and are replaced there as
101 well.
102
103 @runtime@
104 To be replaced with the value of the configuration vari‐
105 able -runtime-command.
106
107 @self@ To be replaced with the value of the configuration vari‐
108 able -self-command.
109
110 @def@ To be replaced with the value of the configuration vari‐
111 able -proc-command.
112
113 @ns@ To be replaced with the value of the configuration vari‐
114 able -namespace.
115
116 @main@ To be replaced with the value of the configuration vari‐
117 able -main.
118
119 @prelude@
120 To be replaced with the value of the configuration vari‐
121 able -prelude.
122
123 -name string
124 The value of this configuration variable is the name of the
125 grammar for which the conversion is run. The default value is
126 a_pe_grammar.
127
128 -user string
129 The value of this configuration variable is the name of the user
130 for which the conversion is run. The default value is unknown.
131
132 -file string
133 The value of this configuration variable is the name of the file
134 or other entity from which the grammar came, for which the con‐
135 version is run. The default value is unknown.
136
137 -runtime-command string
138 A Tcl string representing the Tcl command or reference to it
139 used to call PARAM instructions from parser procedures, per the
140 chosen framework (template). The default value is the empty
141 string.
142
143 -self-command string
144 A Tcl string representing the Tcl command or reference to it
145 used to call the parser procedures (methods ...) from another
146 parser procedure, per the chosen framework (template). The de‐
147 fault value is the empty string.
148
149 -proc-command string
150 The name of the Tcl command used to define procedures (methods
151 ...), per the chosen framework (template). The default value is
152 proc.
153
154 -namespace string
155 The name of the namespace the parser procedures (methods, ...)
156 shall reside in, including the trailing '::' needed to separate
157 it from the actual procedure name. The default value is ::.
158
159 -main string
160 The name of the main procedure (method, ...) to be called by the
161 chosen framework (template) to start parsing input. The default
162 value is __main.
163
164 -prelude string
165 A snippet of code to be inserted at the head of each generated
166 parsing command. The default value is the empty string.
167
168 -indent integer
169 The number of characters to indent each line of the generated
170 code by. The default value is 0.
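
The placeholder mechanism described for -template can be pictured with plain Tcl. The sketch below is an illustration only and not the code of the package: the helper fill_template and all sample values are hypothetical, and [string map] merely stands in for whatever substitution the converter performs.

```tcl
# Hypothetical helper, not part of the package: replace every placeholder
# listed in 'values' (a key/value list) within 'template'.
proc fill_template {template values} {
    return [string map $values $template]
}

# A template using the documented placeholders for user data.
set template "## @format@ parser '@name@' for @user@, from @file@\n@code@"

# Sample values (assumptions for this sketch).
set text [fill_template $template {
    @user@   alice
    @format@ Tcl/PARAM
    @file@   calc.peg
    @name@   calculator
    @code@   {# generated code would appear here}
}]

puts $text
```

A template of this kind would be handed to the converter through configure -template, with the values coming from the options -user, -file and -name.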
171
172 While the high parameterizability of this converter, as shown by the
173 multitude of options it supports, is an advantage to the advanced user,
174 allowing her to customize the output of the converter as needed, a
175 novice user will likely not see the forest for the trees.
176
177 To help these latter users two adjunct packages are provided, each con‐
178 taining a canned configuration which will generate immediately useful
179 full parsers. These are
180
181 pt::tclparam::configuration::snit
182 Generated parsers are classes based on the snit package, i.e.
183 snit::type's.
184
185 pt::tclparam::configuration::tcloo
186 Generated parsers are classes based on the TclOO package.
187
189 The Tcl/PARAM representation of parsing expression grammars is Tcl code
190 whose execution will parse input per the grammar. The code is based on
191 the virtual machine documented in the PackRat Machine Specification,
192 using its instructions and a few more to handle control flow.
193
194 Note that the generated code by itself is not functional. It expects to
195 be embedded into a framework which provides services like the PARAM
196 state, implementations for the PARAM instructions, etc. The bulk of
197 such a framework has to be specified through the option -template. The
198 additional options
199
200 -indent integer
201
202 -main string
203
204 -namespace string
205
206 -prelude string
207
208 -proc-command string
209
210 -runtime-command string
211
212 -self-command string
213
214 provide code snippets which help to glue framework and generated code
215 together. Their placeholders are in the generated code.
216
218 Here we specify the format used by the Parser Tools to serialize Pars‐
219 ing Expression Grammars as immutable values for transport, comparison,
220 etc.
221
222 We distinguish between regular and canonical serializations. While a
223 PEG may have more than one regular serialization only exactly one of
224 them will be canonical.
225
226 regular serialization
227
228 [1] The serialization of any PEG is a nested Tcl dictionary.
229
230 [2] This dictionary holds a single key, pt::grammar::peg, and
231 its value. This value holds the contents of the grammar.
232
233 [3] The contents of the grammar are a Tcl dictionary holding
234 the set of nonterminal symbols and the starting expres‐
235 sion. The relevant keys and their values are
236
237 rules The value is a Tcl dictionary whose keys are the
238 names of the nonterminal symbols known to the
239 grammar.
240
241 [1] Each nonterminal symbol may occur only
242 once.
243
244 [2] The empty string is not a legal nonterminal
245 symbol.
246
247 [3] The value for each symbol is a Tcl dictio‐
248 nary itself. The relevant keys and their
249 values in this dictionary are
250
251 is The value is the serialization of
252 the parsing expression describing
253 the symbol's sentential structure, as
254 specified in the section PE serial‐
255 ization format.
256
257 mode The value can be one of three values
258 specifying how a parser should han‐
259 dle the semantic value produced by
260 the symbol.
261
262 value The semantic value of the
263 nonterminal symbol is an ab‐
264 stract syntax tree consisting
265 of a single node for the
266 nonterminal itself, which has
267 the ASTs of the symbol's
268 right hand side as its chil‐
269 dren.
270
271 leaf The semantic value of the
272 nonterminal symbol is an ab‐
273 stract syntax tree consisting
274 of a single node for the
275 nonterminal, without any
276 children. Any ASTs generated
277 by the symbol's right hand
278 side are discarded.
279
280 void The nonterminal has no seman‐
281 tic value. Any ASTs generated
282 by the symbol's right hand
283 side are discarded (as well).
284
285 start The value is the serialization of the start pars‐
286 ing expression of the grammar, as specified in the
287 section PE serialization format.
288
289 [4] The terminal symbols of the grammar are specified implic‐
290 itly as the set of all terminal symbols used in the start
291 expression and on the RHS of the grammar rules.
292
293 canonical serialization
294 The canonical serialization of a grammar has the format as spec‐
295 ified in the previous item, and then additionally satisfies the
296 constraints below, which make it unique among all the possible
297 serializations of this grammar.
298
299 [1] The keys found in all the nested Tcl dictionaries are
300 sorted in ascending dictionary order, as generated by
301 Tcl's builtin command lsort -increasing -dict.
302
303 [2] The string representation of the value is the canonical
304 representation of a Tcl dictionary. I.e. it does not con‐
305 tain superfluous whitespace.
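
Constraint [1] can be checked mechanically. The following is a minimal sketch in plain Tcl. The helper keys_sorted is hypothetical and not part of Parser Tools, and it inspects a single dictionary level only, so a full check would have to apply it to each nested dictionary in turn.

```tcl
# Hypothetical helper, not part of Parser Tools: report whether the keys
# of one dictionary level appear in ascending dictionary order.
proc keys_sorted {dict} {
    set keys [dict keys $dict]
    return [expr {$keys eq [lsort -increasing -dict $keys]}]
}

puts [keys_sorted {rules {} start {n Expression}}]   ;# prints 1
puts [keys_sorted {start {n Expression} rules {}}]   ;# prints 0
```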
306
307 EXAMPLE
308 Assuming the following PEG for simple mathematical expressions
309
310 PEG calculator (Expression)
311 Digit <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9' ;
312 Sign <- '-' / '+' ;
313 Number <- Sign? Digit+ ;
314 Expression <- Term (AddOp Term)* ;
315 MulOp <- '*' / '/' ;
316 Term <- Factor (MulOp Factor)* ;
317 AddOp <- '+'/'-' ;
318 Factor <- '(' Expression ')' / Number ;
319 END;
320
321
322 then its canonical serialization (except for whitespace) is
323
324 pt::grammar::peg {
325 rules {
326 AddOp {is {/ {t +} {t -}} mode value}
327 Digit {is {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}} mode value}
328 Expression {is {x {n Term} {* {x {n AddOp} {n Term}}}} mode value}
329 Factor {is {/ {x {t (} {n Expression} {t )}} {n Number}} mode value}
330 MulOp {is {/ {t *} {t /}} mode value}
331 Number {is {x {? {n Sign}} {+ {n Digit}}} mode value}
332 Sign {is {/ {t -} {t +}} mode value}
333 Term {is {x {n Factor} {* {x {n MulOp} {n Factor}}}} mode value}
334 }
335 start {n Expression}
336 }
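
Because the serialization is a nested Tcl dictionary, it can be inspected with the builtin dict command. The sketch below stores a serialization of the calculator grammar, with extra whitespace for readability, in a variable and reads parts of it back. The variable names are chosen for this illustration only.

```tcl
# A serialization of the calculator grammar (whitespace added).
set serial {pt::grammar::peg {
    rules {
        AddOp      {is {/ {t +} {t -}} mode value}
        Digit      {is {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}} mode value}
        Expression {is {x {n Term} {* {x {n AddOp} {n Term}}}} mode value}
        Factor     {is {/ {x {t (} {n Expression} {t )}} {n Number}} mode value}
        MulOp      {is {/ {t *} {t /}} mode value}
        Number     {is {x {? {n Sign}} {+ {n Digit}}} mode value}
        Sign       {is {/ {t -} {t +}} mode value}
        Term       {is {x {n Factor} {* {x {n MulOp} {n Factor}}}} mode value}
    }
    start {n Expression}
}}

# The single top-level key holds the contents of the grammar.
set grammar [dict get $serial pt::grammar::peg]

puts [dict keys [dict get $grammar rules]]  ;# names of the nonterminal symbols
puts [dict get $grammar start]              ;# the start expression
puts [dict get $grammar rules Number is]    ;# parsing expression of one rule
puts [dict get $grammar rules Number mode]  ;# semantic mode of that rule
```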
337
338
340 Here we specify the format used by the Parser Tools to serialize Pars‐
341 ing Expressions as immutable values for transport, comparison, etc.
342
343 We distinguish between regular and canonical serializations. While a
344 parsing expression may have more than one regular serialization only
345 exactly one of them will be canonical.
346
347 Regular serialization
348
349 Atomic Parsing Expressions
350
351 [1] The string epsilon is an atomic parsing expres‐
352 sion. It matches the empty string.
353
354 [2] The string dot is an atomic parsing expression. It
355 matches any character.
356
357 [3] The string alnum is an atomic parsing expression.
358 It matches any Unicode alphabet or digit charac‐
359 ter. This is a custom extension of PEs based on
360 Tcl's builtin command string is.
361
362 [4] The string alpha is an atomic parsing expression.
363 It matches any Unicode alphabet character. This is
364 a custom extension of PEs based on Tcl's builtin
365 command string is.
366
367 [5] The string ascii is an atomic parsing expression.
368 It matches any Unicode character below U+0080. This
369 is a custom extension of PEs based on Tcl's
370 builtin command string is.
371
372 [6] The string control is an atomic parsing expres‐
373 sion. It matches any Unicode control character.
374 This is a custom extension of PEs based on Tcl's
375 builtin command string is.
376
377 [7] The string digit is an atomic parsing expression.
378 It matches any Unicode digit character. Note that
379 this includes characters outside of the [0..9]
380 range. This is a custom extension of PEs based on
381 Tcl's builtin command string is.
382
383 [8] The string graph is an atomic parsing expression.
384 It matches any Unicode printing character, except
385 for space. This is a custom extension of PEs based
386 on Tcl's builtin command string is.
387
388 [9] The string lower is an atomic parsing expression.
389 It matches any Unicode lower-case alphabet charac‐
390 ter. This is a custom extension of PEs based on
391 Tcl's builtin command string is.
392
393 [10] The string print is an atomic parsing expression.
394 It matches any Unicode printing character, includ‐
395 ing space. This is a custom extension of PEs based
396 on Tcl's builtin command string is.
397
398 [11] The string punct is an atomic parsing expression.
399 It matches any Unicode punctuation character. This
400 is a custom extension of PEs based on Tcl's
401 builtin command string is.
402
403 [12] The string space is an atomic parsing expression.
404 It matches any Unicode space character. This is a
405 custom extension of PEs based on Tcl's builtin
406 command string is.
407
408 [13] The string upper is an atomic parsing expression.
409 It matches any Unicode upper-case alphabet charac‐
410 ter. This is a custom extension of PEs based on
411 Tcl's builtin command string is.
412
413 [14] The string wordchar is an atomic parsing expres‐
414 sion. It matches any Unicode word character. This
415 is any alphanumeric character (see alnum), and any
416 connector punctuation characters (e.g. under‐
417 score). This is a custom extension of PEs based on
418 Tcl's builtin command string is.
419
420 [15] The string xdigit is an atomic parsing expression.
421 It matches any hexadecimal digit character. This
422 is a custom extension of PEs based on Tcl's
423 builtin command string is.
424
425 [16] The string ddigit is an atomic parsing expression.
426 It matches any decimal digit character. This is a
427 custom extension of PEs based on Tcl's builtin
428 command regexp.
429
430 [17] The expression [list t x] is an atomic parsing ex‐
431 pression. It matches the terminal string x.
432
433 [18] The expression [list n A] is an atomic parsing ex‐
434 pression. It matches the nonterminal A.
435
436 Combined Parsing Expressions
437
438 [1] For parsing expressions e1, e2, ... the result of
439 [list / e1 e2 ... ] is a parsing expression as
440 well. This is the ordered choice, aka prioritized
441 choice.
442
443 [2] For parsing expressions e1, e2, ... the result of
444 [list x e1 e2 ... ] is a parsing expression as
445 well. This is the sequence.
446
447 [3] For a parsing expression e the result of [list *
448 e] is a parsing expression as well. This is the
449 Kleene closure, describing zero or more repeti‐
450 tions.
451
452 [4] For a parsing expression e the result of [list +
453 e] is a parsing expression as well. This is the
454 positive Kleene closure, describing one or more
455 repetitions.
456
457 [5] For a parsing expression e the result of [list &
458 e] is a parsing expression as well. This is the
459 and lookahead predicate.
460
461 [6] For a parsing expression e the result of [list !
462 e] is a parsing expression as well. This is the
463 not lookahead predicate.
464
465 [7] For a parsing expression e the result of [list ?
466 e] is a parsing expression as well. This is the
467 optional input.
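
Since every serialized parsing expression is a Tcl list whose first element names the operator, such expressions are easy to traverse recursively. The following sketch collects the nonterminal symbols referenced by an expression. The procedure pe_nonterminals is a hypothetical helper written for this illustration and is not part of Parser Tools.

```tcl
package require Tcl 8.5

# Hypothetical helper: return the list of nonterminal symbols referenced
# in the serialized parsing expression 'pe', in order of appearance.
proc pe_nonterminals {pe} {
    set op [lindex $pe 0]
    switch -exact -- $op {
        n {
            # Nonterminal: the second element is the symbol name.
            return [list [lindex $pe 1]]
        }
        t {
            # Terminal: references no nonterminal.
            return {}
        }
        / - x - * - + - & - ! - ? {
            # Combined expression: descend into all operands.
            set result {}
            foreach child [lrange $pe 1 end] {
                lappend result {*}[pe_nonterminals $child]
            }
            return $result
        }
        default {
            # Remaining atomic expressions (epsilon, dot, alnum, ...).
            return {}
        }
    }
}

puts [pe_nonterminals {x {n Term} {* {x {n AddOp} {n Term}}}}]
# prints: Term AddOp Term
```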
468
469 Canonical serialization
470 The canonical serialization of a parsing expression has the for‐
471 mat as specified in the previous item, and then additionally
472 satisfies the constraints below, which make it unique among all
473 the possible serializations of this parsing expression.
474
475 [1] The string representation of the value is the canonical
476 representation of a pure Tcl list. I.e. it does not con‐
477 tain superfluous whitespace.
478
479 [2] Terminals are not encoded as ranges (where start and end
480 of the range are identical).
481
482 EXAMPLE
483 Assuming the parsing expression shown on the right-hand side of the
484 rule
485
486 Expression <- Term (AddOp Term)*
487
488
489 then its canonical serialization (except for whitespace) is
490
491 {x {n Term} {* {x {n AddOp} {n Term}}}}
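
The same value can be produced with the [list] constructors named in the rules above, which also guarantees a string representation without superfluous whitespace. This is a small sketch using variable names chosen for the illustration. Note that the braces surrounding the whole expression in the display above belong to the enclosing context and are not part of the list value itself.

```tcl
# Build the expression for: Term (AddOp Term)*
set term  [list n Term]
set addop [list n AddOp]
set pe    [list x $term [list * [list x $addop $term]]]

puts $pe
# prints: x {n Term} {* {x {n AddOp} {n Term}}}
```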
492
493
495 This document, and the package it describes, will undoubtedly contain
496 bugs and other problems. Please report such in the category pt of the
497 Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please also
498 report any ideas for enhancements you may have for either package
499 and/or documentation.
500
501 When proposing code changes, please provide unified diffs, i.e. the out‐
502 put of diff -u.
503
504 Note further that attachments are strongly preferred over inlined
505 patches. Attachments can be made by going to the Edit form of the
506 ticket immediately after its creation, and then using the left-most
507 button in the secondary navigation bar.
508
510 EBNF, LL(k), PEG, TCLPARAM, TDPL, context-free languages, conversion,
511 expression, format conversion, grammar, matching, parser, parsing ex‐
512 pression, parsing expression grammar, push down automaton, recursive
513 descent, serialization, state, top-down parsing languages, transducer
514
516 Parsing and Grammars
517
519 Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>
520
521
522
523
524tcllib 1.0.3 pt::peg::to::tclparam(n)