pt::peg::to::param(n)            Parser Tools            pt::peg::to::param(n)

______________________________________________________________________________

NAME
       pt::peg::to::param - PEG Conversion. Write PARAM format

SYNOPSIS
       package require Tcl 8.5

       package require pt::peg::to::param ?1?

       package require pt::peg

       package require pt::pe

       pt::peg::to::param reset

       pt::peg::to::param configure

       pt::peg::to::param configure option

       pt::peg::to::param configure option value...

       pt::peg::to::param convert serial

______________________________________________________________________________

DESCRIPTION
       Are you lost? Do you have trouble understanding this document? In
       that case please read the overview provided by the Introduction to
       Parser Tools. That document is the entry point to the whole system
       of which the current package is a part.

       This package implements the converter from parsing expression grammars
       to PARAM markup.

       It resides in the Export section of the Core Layer of Parser Tools, and
       can be used either directly with the other packages of this layer, or
       indirectly through the export manager provided by pt::peg::export. The
       latter is intended for use in untrusted environments and is done
       through the corresponding export plugin pt::peg::export::param, which
       sits between the converter and the export manager.

       IMAGE: arch_core_eplugins

API
       The API provided by this package satisfies the specification of the
       Converter API found in the Parser Tools Export API specification.

       pt::peg::to::param reset
              This command resets the configuration of the package to its
              default settings.

       pt::peg::to::param configure
              This command returns a dictionary containing the current
              configuration of the package.

       pt::peg::to::param configure option
              This command returns the current value of the specified
              configuration option of the package. For the set of legal
              options, please read the section Options.

       pt::peg::to::param configure option value...
              This command sets the given configuration options of the
              package to the specified values. For the set of legal options,
              please read the section Options.

       pt::peg::to::param convert serial
              This command takes the canonical serialization of a parsing
              expression grammar, as specified in section PEG serialization
              format, and contained in serial, and generates PARAM markup
              encoding the grammar, per the current package configuration.
              The created string is then returned as the result of the
              command.
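
       The commands above might be combined as in the following minimal
       sketch. It assumes that Tcllib's Parser Tools are installed. The tiny
       one-rule grammar, the values demo, alice and demo.peg, and the
       template text are made up for illustration. The grammar is built
       with list and dict create on the assumption that this is enough to
       yield a canonical serialization.

```tcl
package require Tcl 8.5
package require pt::peg::to::param

# Illustrative grammar only. Building it with [list] and [dict create]
# should avoid superfluous whitespace; keys are given in sorted order
# (is < mode, rules < start).
set rules [dict create \
    Digit [dict create is [list / [list t 0] [list t 1]] mode value]]
set serial [list pt::grammar::peg \
    [dict create rules $rules start [list n Digit]]]

# Start from the defaults, then set the options used by the template.
pt::peg::to::param reset
pt::peg::to::param configure -name demo -user alice -file demo.peg
pt::peg::to::param configure -template "# @name@ (@format@)\n@code@"

# Generate the PARAM text for the grammar and show it.
set text [pt::peg::to::param convert $serial]
puts $text

# Leave the package in its default state for later callers.
pt::peg::to::param reset
```

       Calling reset before and after keeps the configuration, which is
       described as belonging to the package as a whole, from leaking
       between unrelated conversions.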

OPTIONS
       The converter to PARAM markup recognizes the following configuration
       variables and changes its behaviour as they specify.

       -template string
              The value of this configuration variable is a string into which
              to put the generated text and the other configuration settings.
              The various locations for user-data are expected to be specified
              with the placeholders listed below. The default value is
              "@code@".

              @user@ To be replaced with the value of the configuration
                     variable -user.

              @format@
                     To be replaced with the constant PARAM.

              @file@ To be replaced with the value of the configuration
                     variable -file.

              @name@ To be replaced with the value of the configuration
                     variable -name.

              @code@ To be replaced with the generated text.

       -name string
              The value of this configuration variable is the name of the
              grammar for which the conversion is run. The default value is
              a_pe_grammar.

       -user string
              The value of this configuration variable is the name of the user
              for which the conversion is run. The default value is unknown.

       -file string
              The value of this configuration variable is the name of the file
              or other entity from which the grammar came, for which the
              conversion is run. The default value is unknown.
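
       The effect of the placeholders can be pictured with plain Tcl. The
       sketch below is not the package's implementation; expand_template is
       a hypothetical helper that only mimics the substitution described
       above, using string map.

```tcl
package require Tcl 8.5

# Hypothetical helper, not part of pt::peg::to::param: replace the
# placeholders of a -template string as the option text describes.
proc expand_template {template code name user file} {
    return [string map [list \
        @user@   $user \
        @format@ PARAM \
        @file@   $file \
        @name@   $name \
        @code@   $code] $template]
}

# A template with a header comment in front of the generated text.
set template "# Grammar '@name@' in @format@ format.\n"
append template "# Generated for @user@, from file '@file@'\n\n@code@"

puts [expand_template $template "<<MAIN>>:\n    halt" calculator unknown calc.peg]
```

       Because string map makes a single pass over its input, text that was
       substituted for one placeholder is not scanned again for others.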

PARAM CODE REPRESENTATION OF PARSING EXPRESSION GRAMMARS
       The PARAM code representation of parsing expression grammars is
       assembler-like text using the instructions of the virtual machine
       documented in the PackRat Machine Specification, plus a few more for
       control flow (jump ok, jump fail, call symbol, return).

       It is not really useful, except possibly as a tool demonstrating how a
       grammar is compiled in general, without getting distracted by the
       incidentals of a framework, such as the supporting C and Tcl code
       generated by the other PARAM-derived formats.

       It has no direct formal specification beyond what was said above.

   EXAMPLE
       Assuming the following PEG for simple mathematical expressions

              PEG calculator (Expression)
                  Digit <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9' ;
                  Sign <- '-' / '+' ;
                  Number <- Sign? Digit+ ;
                  Expression <- Term (AddOp Term)* ;
                  MulOp <- '*' / '/' ;
                  Term <- Factor (MulOp Factor)* ;
                  AddOp <- '+'/'-' ;
                  Factor <- '(' Expression ')' / Number ;
              END;

       one possible PARAM serialization for it is
146
147 # -*- text -*-
148 # Parsing Expression Grammar 'TEMPLATE'.
149 # Generated for unknown, from file 'TEST'
150
151 #
152 # Grammar Start Expression
153 #
154
155 <<MAIN>>:
156 call sym_Expression
157 halt
158
159 #
160 # value Symbol 'AddOp'
161 #
162
163 sym_AddOp:
164 # /
165 # '-'
166 # '+'
167
168 symbol_restore AddOp
169 found! jump found_7
170 loc_push
171
172 call choice_5
173
174 fail! value_clear
175 ok! value_leaf AddOp
176 symbol_save AddOp
177 error_nonterminal AddOp
178 loc_pop_discard
179
180 found_7:
181 ok! ast_value_push
182 return
183
184 choice_5:
185 # /
186 # '-'
187 # '+'
188
189 error_clear
190
191 loc_push
192 error_push
193
194 input_next "t -"
195 ok! test_char "-"
196
197 error_pop_merge
198 ok! jump oknoast_4
199
200 loc_pop_rewind
201 loc_push
202 error_push
203
204 input_next "t +"
205 ok! test_char "+"
206
207 error_pop_merge
208 ok! jump oknoast_4
209
210 loc_pop_rewind
211 status_fail
212 return
213
214 oknoast_4:
215 loc_pop_discard
216 return
217 #
218 # value Symbol 'Digit'
219 #
220
221 sym_Digit:
222 # /
223 # '0'
224 # '1'
225 # '2'
226 # '3'
227 # '4'
228 # '5'
229 # '6'
230 # '7'
231 # '8'
232 # '9'
233
234 symbol_restore Digit
235 found! jump found_22
236 loc_push
237
238 call choice_20
239
240 fail! value_clear
241 ok! value_leaf Digit
242 symbol_save Digit
243 error_nonterminal Digit
244 loc_pop_discard
245
246 found_22:
247 ok! ast_value_push
248 return
249
250 choice_20:
251 # /
252 # '0'
253 # '1'
254 # '2'
255 # '3'
256 # '4'
257 # '5'
258 # '6'
259 # '7'
260 # '8'
261 # '9'
262
263 error_clear
264
265 loc_push
266 error_push
267
268 input_next "t 0"
269 ok! test_char "0"
270
271 error_pop_merge
272 ok! jump oknoast_19
273
274 loc_pop_rewind
275 loc_push
276 error_push
277
278 input_next "t 1"
279 ok! test_char "1"
280
281 error_pop_merge
282 ok! jump oknoast_19
283
284 loc_pop_rewind
285 loc_push
286 error_push
287
288 input_next "t 2"
289 ok! test_char "2"
290
291 error_pop_merge
292 ok! jump oknoast_19
293
294 loc_pop_rewind
295 loc_push
296 error_push
297
298 input_next "t 3"
299 ok! test_char "3"
300
301 error_pop_merge
302 ok! jump oknoast_19
303
304 loc_pop_rewind
305 loc_push
306 error_push
307
308 input_next "t 4"
309 ok! test_char "4"
310
311 error_pop_merge
312 ok! jump oknoast_19
313
314 loc_pop_rewind
315 loc_push
316 error_push
317
318 input_next "t 5"
319 ok! test_char "5"
320
321 error_pop_merge
322 ok! jump oknoast_19
323
324 loc_pop_rewind
325 loc_push
326 error_push
327
328 input_next "t 6"
329 ok! test_char "6"
330
331 error_pop_merge
332 ok! jump oknoast_19
333
334 loc_pop_rewind
335 loc_push
336 error_push
337
338 input_next "t 7"
339 ok! test_char "7"
340
341 error_pop_merge
342 ok! jump oknoast_19
343
344 loc_pop_rewind
345 loc_push
346 error_push
347
348 input_next "t 8"
349 ok! test_char "8"
350
351 error_pop_merge
352 ok! jump oknoast_19
353
354 loc_pop_rewind
355 loc_push
356 error_push
357
358 input_next "t 9"
359 ok! test_char "9"
360
361 error_pop_merge
362 ok! jump oknoast_19
363
364 loc_pop_rewind
365 status_fail
366 return
367
368 oknoast_19:
369 loc_pop_discard
370 return
371 #
372 # value Symbol 'Expression'
373 #
374
375 sym_Expression:
376 # /
377 # x
378 # '\('
379 # (Expression)
380 # '\)'
381 # x
382 # (Factor)
383 # *
384 # x
385 # (MulOp)
386 # (Factor)
387
388 symbol_restore Expression
389 found! jump found_46
390 loc_push
391 ast_push
392
393 call choice_44
394
395 fail! value_clear
396 ok! value_reduce Expression
397 symbol_save Expression
398 error_nonterminal Expression
399 ast_pop_rewind
400 loc_pop_discard
401
402 found_46:
403 ok! ast_value_push
404 return
405
406 choice_44:
407 # /
408 # x
409 # '\('
410 # (Expression)
411 # '\)'
412 # x
413 # (Factor)
414 # *
415 # x
416 # (MulOp)
417 # (Factor)
418
419 error_clear
420
421 ast_push
422 loc_push
423 error_push
424
425 call sequence_27
426
427 error_pop_merge
428 ok! jump ok_43
429
430 ast_pop_rewind
431 loc_pop_rewind
432 ast_push
433 loc_push
434 error_push
435
436 call sequence_40
437
438 error_pop_merge
439 ok! jump ok_43
440
441 ast_pop_rewind
442 loc_pop_rewind
443 status_fail
444 return
445
446 ok_43:
447 ast_pop_discard
448 loc_pop_discard
449 return
450
451 sequence_27:
452 # x
453 # '\('
454 # (Expression)
455 # '\)'
456
457 loc_push
458 error_clear
459
460 error_push
461
462 input_next "t ("
463 ok! test_char "("
464
465 error_pop_merge
466 fail! jump failednoast_29
467 ast_push
468 error_push
469
470 call sym_Expression
471
472 error_pop_merge
473 fail! jump failed_28
474 error_push
475
476 input_next "t )"
477 ok! test_char ")"
478
479 error_pop_merge
480 fail! jump failed_28
481
482 ast_pop_discard
483 loc_pop_discard
484 return
485
486 failed_28:
487 ast_pop_rewind
488
489 failednoast_29:
490 loc_pop_rewind
491 return
492
493 sequence_40:
494 # x
495 # (Factor)
496 # *
497 # x
498 # (MulOp)
499 # (Factor)
500
501 ast_push
502 loc_push
503 error_clear
504
505 error_push
506
507 call sym_Factor
508
509 error_pop_merge
510 fail! jump failed_41
511 error_push
512
513 call kleene_37
514
515 error_pop_merge
516 fail! jump failed_41
517
518 ast_pop_discard
519 loc_pop_discard
520 return
521
522 failed_41:
523 ast_pop_rewind
524 loc_pop_rewind
525 return
526
527 kleene_37:
528 # *
529 # x
530 # (MulOp)
531 # (Factor)
532
533 loc_push
534 error_push
535
536 call sequence_34
537
538 error_pop_merge
539 fail! jump failed_38
540 loc_pop_discard
541 jump kleene_37
542
543 failed_38:
544 loc_pop_rewind
545 status_ok
546 return
547
548 sequence_34:
549 # x
550 # (MulOp)
551 # (Factor)
552
553 ast_push
554 loc_push
555 error_clear
556
557 error_push
558
559 call sym_MulOp
560
561 error_pop_merge
562 fail! jump failed_35
563 error_push
564
565 call sym_Factor
566
567 error_pop_merge
568 fail! jump failed_35
569
570 ast_pop_discard
571 loc_pop_discard
572 return
573
574 failed_35:
575 ast_pop_rewind
576 loc_pop_rewind
577 return
578 #
579 # value Symbol 'Factor'
580 #
581
582 sym_Factor:
583 # x
584 # (Term)
585 # *
586 # x
587 # (AddOp)
588 # (Term)
589
590 symbol_restore Factor
591 found! jump found_60
592 loc_push
593 ast_push
594
595 call sequence_57
596
597 fail! value_clear
598 ok! value_reduce Factor
599 symbol_save Factor
600 error_nonterminal Factor
601 ast_pop_rewind
602 loc_pop_discard
603
604 found_60:
605 ok! ast_value_push
606 return
607
608 sequence_57:
609 # x
610 # (Term)
611 # *
612 # x
613 # (AddOp)
614 # (Term)
615
616 ast_push
617 loc_push
618 error_clear
619
620 error_push
621
622 call sym_Term
623
624 error_pop_merge
625 fail! jump failed_58
626 error_push
627
628 call kleene_54
629
630 error_pop_merge
631 fail! jump failed_58
632
633 ast_pop_discard
634 loc_pop_discard
635 return
636
637 failed_58:
638 ast_pop_rewind
639 loc_pop_rewind
640 return
641
642 kleene_54:
643 # *
644 # x
645 # (AddOp)
646 # (Term)
647
648 loc_push
649 error_push
650
651 call sequence_51
652
653 error_pop_merge
654 fail! jump failed_55
655 loc_pop_discard
656 jump kleene_54
657
658 failed_55:
659 loc_pop_rewind
660 status_ok
661 return
662
663 sequence_51:
664 # x
665 # (AddOp)
666 # (Term)
667
668 ast_push
669 loc_push
670 error_clear
671
672 error_push
673
674 call sym_AddOp
675
676 error_pop_merge
677 fail! jump failed_52
678 error_push
679
680 call sym_Term
681
682 error_pop_merge
683 fail! jump failed_52
684
685 ast_pop_discard
686 loc_pop_discard
687 return
688
689 failed_52:
690 ast_pop_rewind
691 loc_pop_rewind
692 return
693 #
694 # value Symbol 'MulOp'
695 #
696
697 sym_MulOp:
698 # /
699 # '*'
700 # '/'
701
702 symbol_restore MulOp
703 found! jump found_67
704 loc_push
705
706 call choice_65
707
708 fail! value_clear
709 ok! value_leaf MulOp
710 symbol_save MulOp
711 error_nonterminal MulOp
712 loc_pop_discard
713
714 found_67:
715 ok! ast_value_push
716 return
717
718 choice_65:
719 # /
720 # '*'
721 # '/'
722
723 error_clear
724
725 loc_push
726 error_push
727
728 input_next "t *"
729 ok! test_char "*"
730
731 error_pop_merge
732 ok! jump oknoast_64
733
734 loc_pop_rewind
735 loc_push
736 error_push
737
738 input_next "t /"
739 ok! test_char "/"
740
741 error_pop_merge
742 ok! jump oknoast_64
743
744 loc_pop_rewind
745 status_fail
746 return
747
748 oknoast_64:
749 loc_pop_discard
750 return
751 #
752 # value Symbol 'Number'
753 #
754
755 sym_Number:
756 # x
757 # ?
758 # (Sign)
759 # +
760 # (Digit)
761
762 symbol_restore Number
763 found! jump found_80
764 loc_push
765 ast_push
766
767 call sequence_77
768
769 fail! value_clear
770 ok! value_reduce Number
771 symbol_save Number
772 error_nonterminal Number
773 ast_pop_rewind
774 loc_pop_discard
775
776 found_80:
777 ok! ast_value_push
778 return
779
780 sequence_77:
781 # x
782 # ?
783 # (Sign)
784 # +
785 # (Digit)
786
787 ast_push
788 loc_push
789 error_clear
790
791 error_push
792
793 call optional_70
794
795 error_pop_merge
796 fail! jump failed_78
797 error_push
798
799 call poskleene_73
800
801 error_pop_merge
802 fail! jump failed_78
803
804 ast_pop_discard
805 loc_pop_discard
806 return
807
808 failed_78:
809 ast_pop_rewind
810 loc_pop_rewind
811 return
812
813 optional_70:
814 # ?
815 # (Sign)
816
817 loc_push
818 error_push
819
820 call sym_Sign
821
822 error_pop_merge
823 fail! loc_pop_rewind
824 ok! loc_pop_discard
825 status_ok
826 return
827
828 poskleene_73:
829 # +
830 # (Digit)
831
832 loc_push
833
834 call sym_Digit
835
836 fail! jump failed_74
837
838 loop_75:
839 loc_pop_discard
840 loc_push
841 error_push
842
843 call sym_Digit
844
845 error_pop_merge
846 ok! jump loop_75
847 status_ok
848
849 failed_74:
850 loc_pop_rewind
851 return
852 #
853 # value Symbol 'Sign'
854 #
855
856 sym_Sign:
857 # /
858 # '-'
859 # '+'
860
861 symbol_restore Sign
862 found! jump found_86
863 loc_push
864
865 call choice_5
866
867 fail! value_clear
868 ok! value_leaf Sign
869 symbol_save Sign
870 error_nonterminal Sign
871 loc_pop_discard
872
873 found_86:
874 ok! ast_value_push
875 return
876 #
877 # value Symbol 'Term'
878 #
879
880 sym_Term:
881 # (Number)
882
883 symbol_restore Term
884 found! jump found_89
885 loc_push
886 ast_push
887
888 call sym_Number
889
890 fail! value_clear
891 ok! value_reduce Term
892 symbol_save Term
893 error_nonterminal Term
894 ast_pop_rewind
895 loc_pop_discard
896
897 found_89:
898 ok! ast_value_push
899 return
900
901 #
902 #
903

PEG SERIALIZATION FORMAT
       Here we specify the format used by the Parser Tools to serialize
       Parsing Expression Grammars as immutable values for transport,
       comparison, etc.

       We distinguish between regular and canonical serializations. While a
       PEG may have more than one regular serialization, exactly one of them
       will be canonical.

       regular serialization

              [1]    The serialization of any PEG is a nested Tcl dictionary.

              [2]    This dictionary holds a single key, pt::grammar::peg, and
                     its value. This value holds the contents of the grammar.

              [3]    The contents of the grammar are a Tcl dictionary holding
                     the set of nonterminal symbols and the starting
                     expression. The relevant keys and their values are

                     rules  The value is a Tcl dictionary whose keys are the
                            names of the nonterminal symbols known to the
                            grammar.

                            [1]    Each nonterminal symbol may occur only
                                   once.

                            [2]    The empty string is not a legal
                                   nonterminal symbol.

                            [3]    The value for each symbol is a Tcl
                                   dictionary itself. The relevant keys and
                                   their values in this dictionary are

                                   is     The value is the serialization of
                                          the parsing expression describing
                                          the symbol's sentential structure,
                                          as specified in the section PE
                                          serialization format.

                                   mode   The value can be one of three
                                          values specifying how a parser
                                          should handle the semantic value
                                          produced by the symbol.

                                          value  The semantic value of the
                                                 nonterminal symbol is an
                                                 abstract syntax tree
                                                 consisting of a single node
                                                 for the nonterminal itself,
                                                 which has the ASTs of the
                                                 symbol's right hand side as
                                                 its children.

                                          leaf   The semantic value of the
                                                 nonterminal symbol is an
                                                 abstract syntax tree
                                                 consisting of a single node
                                                 for the nonterminal, without
                                                 any children. Any ASTs
                                                 generated by the symbol's
                                                 right hand side are
                                                 discarded.

                                          void   The nonterminal has no
                                                 semantic value. Any ASTs
                                                 generated by the symbol's
                                                 right hand side are
                                                 discarded (as well).

                     start  The value is the serialization of the start
                            parsing expression of the grammar, as specified
                            in the section PE serialization format.

              [4]    The terminal symbols of the grammar are specified
                     implicitly as the set of all terminal symbols used in
                     the start expression and on the RHS of the grammar
                     rules.
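
       Item [4] can be illustrated by walking a serialization and collecting
       every terminal it uses. The following is a minimal sketch in plain
       Tcl. The procedures collect_terminals and grammar_terminals are
       hypothetical helpers, and the two-rule grammar is an excerpt made up
       for the demonstration. Only expressions of the form {t x} are
       treated as terminals here.

```tcl
package require Tcl 8.5

# Hypothetical helper: walk one parsing expression and record each
# terminal string {t x} as a key of the array named by termVar.
proc collect_terminals {pe termVar} {
    upvar 1 $termVar terms
    switch -exact -- [lindex $pe 0] {
        t { set terms([lindex $pe 1]) 1 }
        / - x - * - + - & - ! - ? {
            # Operators: descend into all operand expressions.
            foreach sub [lrange $pe 1 end] {
                collect_terminals $sub terms
            }
        }
        default { }
    }
}

# Hypothetical helper: terminals used by the start expression and by
# the right hand sides of all rules, as a sorted list.
proc grammar_terminals {serial} {
    set g [dict get $serial pt::grammar::peg]
    array set terms {}
    collect_terminals [dict get $g start] terms
    dict for {sym def} [dict get $g rules] {
        collect_terminals [dict get $def is] terms
    }
    return [lsort [array names terms]]
}

set serial {pt::grammar::peg {
    rules {
        MulOp {is {/ {t *} {t /}} mode value}
        Sign  {is {/ {t -} {t +}} mode value}
    }
    start {n Sign}
}}
puts [grammar_terminals $serial]
```

       The walker ignores nonterminals and the named character classes,
       since neither contributes a literal terminal string.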

       canonical serialization
              The canonical serialization of a grammar has the format as
              specified in the previous item, and then additionally satisfies
              the constraints below, which make it unique among all the
              possible serializations of this grammar.

              [1]    The keys found in all the nested Tcl dictionaries are
                     sorted in ascending dictionary order, as generated by
                     Tcl's builtin command lsort -increasing -dict.

              [2]    The string representation of the value is the canonical
                     representation of a Tcl dictionary. I.e. it does not
                     contain superfluous whitespace.
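
       Constraint [1] can be checked with a few lines of plain Tcl. The
       sketch below covers only the key ordering, not the whitespace rule
       of constraint [2]. The procedures keys_sorted and
       grammar_keys_sorted are hypothetical helpers and are not part of
       Parser Tools.

```tcl
package require Tcl 8.5

# Hypothetical helper: true if the keys of dictionary d appear in the
# order produced by [lsort -increasing -dict].
proc keys_sorted {d} {
    set keys [dict keys $d]
    return [expr {$keys eq [lsort -increasing -dict $keys]}]
}

# Hypothetical helper: apply the check to the grammar dictionary, the
# rule dictionary, and the dictionary of every single rule.
proc grammar_keys_sorted {serial} {
    set g [dict get $serial pt::grammar::peg]
    if {![keys_sorted $g]} { return 0 }
    set rules [dict get $g rules]
    if {![keys_sorted $rules]} { return 0 }
    dict for {sym def} $rules {
        if {![keys_sorted $def]} { return 0 }
    }
    return 1
}

# Same kind of grammar twice: once with sorted keys, once with
# 'start' before 'rules' and 'mode' before 'is'.
set good {pt::grammar::peg {rules {AddOp {is {/ {t -} {t +}} mode value} Sign {is {/ {t -} {t +}} mode value}} start {n Sign}}}
set bad  {pt::grammar::peg {start {n Sign} rules {Sign {mode value is {/ {t -} {t +}}}}}}

puts [grammar_keys_sorted $good]
puts [grammar_keys_sorted $bad]
```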

   EXAMPLE
       Assuming the following PEG for simple mathematical expressions

              PEG calculator (Expression)
                  Digit <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9' ;
                  Sign <- '-' / '+' ;
                  Number <- Sign? Digit+ ;
                  Expression <- Term (AddOp Term)* ;
                  MulOp <- '*' / '/' ;
                  Term <- Factor (MulOp Factor)* ;
                  AddOp <- '+'/'-' ;
                  Factor <- '(' Expression ')' / Number ;
              END;

       then its canonical serialization (except for whitespace) is

              pt::grammar::peg {
                  rules {
                      AddOp {is {/ {t -} {t +}} mode value}
                      Digit {is {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}} mode value}
                      Expression {is {x {n Term} {* {x {n AddOp} {n Term}}}} mode value}
                      Factor {is {/ {x {t (} {n Expression} {t )}} {n Number}} mode value}
                      MulOp {is {/ {t *} {t /}} mode value}
                      Number {is {x {? {n Sign}} {+ {n Digit}}} mode value}
                      Sign {is {/ {t -} {t +}} mode value}
                      Term {is {x {n Factor} {* {x {n MulOp} {n Factor}}}} mode value}
                  }
                  start {n Expression}
              }

PE SERIALIZATION FORMAT
       Here we specify the format used by the Parser Tools to serialize
       Parsing Expressions as immutable values for transport, comparison,
       etc.

       We distinguish between regular and canonical serializations. While a
       parsing expression may have more than one regular serialization,
       exactly one of them will be canonical.

       Regular serialization

              Atomic Parsing Expressions

                     [1]    The string epsilon is an atomic parsing
                            expression. It matches the empty string.

                     [2]    The string dot is an atomic parsing expression.
                            It matches any character.

                     [3]    The string alnum is an atomic parsing expression.
                            It matches any Unicode alphabet or digit
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [4]    The string alpha is an atomic parsing expression.
                            It matches any Unicode alphabet character. This
                            is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [5]    The string ascii is an atomic parsing expression.
                            It matches any Unicode character below U+0080.
                            This is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [6]    The string control is an atomic parsing
                            expression. It matches any Unicode control
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [7]    The string digit is an atomic parsing expression.
                            It matches any Unicode digit character. Note that
                            this includes characters outside of the [0..9]
                            range. This is a custom extension of PEs based on
                            Tcl's builtin command string is.

                     [8]    The string graph is an atomic parsing expression.
                            It matches any Unicode printing character, except
                            for space. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [9]    The string lower is an atomic parsing expression.
                            It matches any Unicode lower-case alphabet
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [10]   The string print is an atomic parsing expression.
                            It matches any Unicode printing character,
                            including space. This is a custom extension of
                            PEs based on Tcl's builtin command string is.

                     [11]   The string punct is an atomic parsing expression.
                            It matches any Unicode punctuation character.
                            This is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [12]   The string space is an atomic parsing expression.
                            It matches any Unicode space character. This is a
                            custom extension of PEs based on Tcl's builtin
                            command string is.

                     [13]   The string upper is an atomic parsing expression.
                            It matches any Unicode upper-case alphabet
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [14]   The string wordchar is an atomic parsing
                            expression. It matches any Unicode word
                            character. This is any alphanumeric character
                            (see alnum), and any connector punctuation
                            characters (e.g. underscore). This is a custom
                            extension of PEs based on Tcl's builtin command
                            string is.

                     [15]   The string xdigit is an atomic parsing
                            expression. It matches any hexadecimal digit
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [16]   The string ddigit is an atomic parsing
                            expression. It matches any decimal digit
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command regexp.

                     [17]   The expression [list t x] is an atomic parsing
                            expression. It matches the terminal string x.

                     [18]   The expression [list n A] is an atomic parsing
                            expression. It matches the nonterminal A.

              Combined Parsing Expressions

                     [1]    For parsing expressions e1, e2, ... the result of
                            [list / e1 e2 ... ] is a parsing expression as
                            well. This is the ordered choice, aka prioritized
                            choice.

                     [2]    For parsing expressions e1, e2, ... the result of
                            [list x e1 e2 ... ] is a parsing expression as
                            well. This is the sequence.

                     [3]    For a parsing expression e the result of
                            [list * e] is a parsing expression as well. This
                            is the kleene closure, describing zero or more
                            repetitions.

                     [4]    For a parsing expression e the result of
                            [list + e] is a parsing expression as well. This
                            is the positive kleene closure, describing one or
                            more repetitions.

                     [5]    For a parsing expression e the result of
                            [list & e] is a parsing expression as well. This
                            is the and lookahead predicate.

                     [6]    For a parsing expression e the result of
                            [list ! e] is a parsing expression as well. This
                            is the not lookahead predicate.

                     [7]    For a parsing expression e the result of
                            [list ? e] is a parsing expression as well. This
                            is the optional input.

       Canonical serialization
              The canonical serialization of a parsing expression has the
              format as specified in the previous item, and then additionally
              satisfies the constraints below, which make it unique among all
              the possible serializations of this parsing expression.

              [1]    The string representation of the value is the canonical
                     representation of a pure Tcl list. I.e. it does not
                     contain superfluous whitespace.

              [2]    Terminals are not encoded as ranges (where start and end
                     of the range are identical).

   EXAMPLE
       Assuming the parsing expression shown on the right-hand side of the
       rule

              Expression <- Term (AddOp Term)*

       then its canonical serialization (except for whitespace) is

              {x {n Term} {* {x {n AddOp} {n Term}}}}
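
       The same expression can be assembled with Tcl's list command, whose
       results carry no superfluous whitespace. The sketch below also shows
       a hypothetical helper, normalize_pe, which rebuilds a hand-written
       expression in that form. It is an illustration in plain Tcl, not the
       canonicalization code used by Parser Tools.

```tcl
package require Tcl 8.5

# Build the serialization of  Term (AddOp Term)*  from nested lists.
set pe [list x [list n Term] [list * [list x [list n AddOp] [list n Term]]]]
puts $pe

# Hypothetical helper: rebuild a parsing expression with [list] so that
# extra whitespace in a hand-written serialization disappears.
proc normalize_pe {pe} {
    set op [lindex $pe 0]
    switch -exact -- $op {
        / - x - * - + - & - ! - ? {
            # Operators: normalize every operand recursively.
            set out [list $op]
            foreach sub [lrange $pe 1 end] {
                lappend out [normalize_pe $sub]
            }
            return $out
        }
        t - n { return [list $op [lindex $pe 1]] }
        default { return [lrange $pe 0 end] }
    }
}

set handwritten "x   {n Term}  {*  {x {n AddOp}   {n Term}}}"
puts [expr {[normalize_pe $handwritten] eq $pe}]
```

       The outer pair of braces in the example above appears only when the
       expression is embedded as one element of a larger list, for instance
       as the value of the key is in a grammar serialization.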

BUGS, IDEAS, FEEDBACK
       This document, and the package it describes, will undoubtedly contain
       bugs and other problems. Please report such in the category pt of the
       Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please also
       report any ideas for enhancements you may have for either package
       and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the
       output of diff -u.

       Note further that attachments are strongly preferred over inlined
       patches. Attachments can be made by going to the Edit form of the
       ticket immediately after its creation, and then using the left-most
       button in the secondary navigation bar.

KEYWORDS
       EBNF, LL(k), PARAM, PEG, TDPL, context-free languages, conversion,
       expression, format conversion, grammar, matching, parser, parsing
       expression, parsing expression grammar, push down automaton, recursive
       descent, serialization, state, top-down parsing languages, transducer

CATEGORY
       Parsing and Grammars

COPYRIGHT
       Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>

tcllib                                 1                 pt::peg::to::param(n)