pt::peg::to::param(n)            Parser Tools            pt::peg::to::param(n)

______________________________________________________________________________

NAME
       pt::peg::to::param - PEG Conversion. Write PARAM format

SYNOPSIS
       package require Tcl 8.5

       package require pt::peg::to::param ?1?

       package require pt::peg

       package require pt::pe

       pt::peg::to::param reset

       pt::peg::to::param configure

       pt::peg::to::param configure option

       pt::peg::to::param configure option value...

       pt::peg::to::param convert serial

______________________________________________________________________________

DESCRIPTION
       Are you lost? Do you have trouble understanding this document? In that
       case please read the overview provided by the Introduction to Parser
       Tools. That document is the entry point to the whole system the
       current package is a part of.

       This package implements the converter from parsing expression grammars
       to PARAM markup.

       It resides in the Export section of the Core Layer of Parser Tools, and
       can be used either directly with the other packages of this layer, or
       indirectly through the export manager provided by pt::peg::export. The
       latter is intended for use in untrusted environments and is done
       through the corresponding export plugin pt::peg::export::param sitting
       between converter and export manager.

       IMAGE: arch_core_eplugins

API
       The API provided by this package satisfies the specification of the
       Converter API found in the Parser Tools Export API specification.

       pt::peg::to::param reset
              This command resets the configuration of the package to its
              default settings.

       pt::peg::to::param configure
              This command returns a dictionary containing the current
              configuration of the package.

       pt::peg::to::param configure option
              This command returns the current value of the specified
              configuration option of the package. For the set of legal
              options, please read the section Options.

       pt::peg::to::param configure option value...
              This command sets the given configuration options of the
              package to the specified values. For the set of legal options,
              please read the section Options.

       pt::peg::to::param convert serial
              This command takes the canonical serialization of a parsing
              expression grammar, as specified in section PEG serialization
              format, and contained in serial, and generates PARAM markup
              encoding the grammar, per the current package configuration.
              The created string is then returned as the result of the
              command.
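
       The commands above can be combined as in the following minimal
       sketch, which assumes that Tcllib is installed. The one-rule grammar A
       and the values demo, alice and demo.peg are made up for illustration;
       the serialization is written by hand in the canonical form described
       in section PEG serialization format.

```tcl
package require Tcl 8.5
package require pt::peg::to::param

# Hypothetical one-rule grammar, hand-written in canonical form:
# keys sorted (is < mode, rules < start), no superfluous whitespace.
set serial {pt::grammar::peg {rules {A {is {t a} mode value}} start {n A}}}

# Start from the defaults, then set the options documented in
# the section Options.
pt::peg::to::param reset
pt::peg::to::param configure -name demo -user alice -file demo.peg

# convert returns the generated PARAM text as a string.
set code [pt::peg::to::param convert $serial]
puts $code
```

       Calling reset first keeps the result independent of any configuration
       left over from earlier conversions in the same interpreter.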

OPTIONS
       The converter to PARAM markup recognizes the following configuration
       variables and changes its behaviour as they specify.

       -template string
              The value of this configuration variable is a string into which
              to put the generated text and the other configuration settings.
              The various locations for user-data are expected to be specified
              with the placeholders listed below. The default value is
              "@code@".

              @user@ To be replaced with the value of the configuration
                     variable -user.

              @format@
                     To be replaced with the constant PARAM.

              @file@ To be replaced with the value of the configuration
                     variable -file.

              @name@ To be replaced with the value of the configuration
                     variable -name.

              @code@ To be replaced with the generated text.

       -name string
              The value of this configuration variable is the name of the
              grammar for which the conversion is run. The default value is
              a_pe_grammar.

       -user string
              The value of this configuration variable is the name of the user
              for which the conversion is run. The default value is unknown.

       -file string
              The value of this configuration variable is the name of the file
              or other entity from which the grammar came, for which the
              conversion is run. The default value is unknown.
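
       The placeholder semantics of -template can be pictured with a small
       stand-alone sketch. The helper fill_template is hypothetical and is
       not the package's own implementation; it only mimics the documented
       replacements with Tcl's builtin string map, and all values passed to
       it are made up.

```tcl
# Illustration only: replace the documented placeholders in a template.
# [string map] scans the template once, so text inserted for one
# placeholder is never re-examined for further placeholders.
proc fill_template {template code name user file} {
    return [string map [list \
        @user@   $user \
        @format@ PARAM \
        @file@   $file \
        @name@   $name \
        @code@   $code] $template]
}

set template "# Grammar @name@ (@format@), for @user@ from @file@\n@code@"
set result [fill_template $template "halt" calculator alice calc.peg]
puts $result
# prints:
# # Grammar calculator (PARAM), for alice from calc.peg
# halt
```

       With the default template "@code@" the result is just the generated
       text, without any surrounding header.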

PARAM CODE REPRESENTATION
       The PARAM code representation of parsing expression grammars is
       assembler-like text using the instructions of the virtual machine
       documented in the PackRat Machine Specification, plus a few more for
       control flow (jump ok, jump fail, call symbol, return).

       It is not really useful, except possibly as a tool demonstrating how a
       grammar is compiled in general, without getting distracted by the
       incidentals of a framework, such as the supporting C and Tcl code
       generated by the other PARAM-derived formats.

       It has no direct formal specification beyond what was said above.

   EXAMPLE
       Assuming the following PEG for simple mathematical expressions

              PEG calculator (Expression)
                  Digit      <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9' ;
                  Sign       <- '-' / '+' ;
                  Number     <- Sign? Digit+ ;
                  Expression <- Term (AddOp Term)* ;
                  MulOp      <- '*' / '/' ;
                  Term       <- Factor (MulOp Factor)* ;
                  AddOp      <- '+'/'-' ;
                  Factor     <- '(' Expression ')' / Number ;
              END;


       one possible PARAM serialization for it is

              # -*- text -*-
              # Parsing Expression Grammar 'TEMPLATE'.
              # Generated for unknown, from file 'TEST'

              #
              # Grammar Start Expression
              #

              <<MAIN>>:
                       call sym_Expression
                       halt

              #
              # value Symbol 'AddOp'
              #

              sym_AddOp:
              # /
              #     '-'
              #     '+'

                       symbol_restore AddOp
                found! jump found_7
                       loc_push

                       call choice_5

                 fail! value_clear
                   ok! value_leaf AddOp
                       symbol_save AddOp
                       error_nonterminal AddOp
                       loc_pop_discard

              found_7:
                   ok! ast_value_push
                       return

              choice_5:
              # /
              #     '-'
              #     '+'

                       error_clear

                       loc_push
                       error_push

                       input_next "t -"
                   ok! test_char "-"

                       error_pop_merge
                   ok! jump oknoast_4

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t +"
                   ok! test_char "+"

                       error_pop_merge
                   ok! jump oknoast_4

                       loc_pop_rewind
                       status_fail
                       return

              oknoast_4:
                       loc_pop_discard
                       return
              #
              # value Symbol 'Digit'
              #

              sym_Digit:
              # /
              #     '0'
              #     '1'
              #     '2'
              #     '3'
              #     '4'
              #     '5'
              #     '6'
              #     '7'
              #     '8'
              #     '9'

                       symbol_restore Digit
                found! jump found_22
                       loc_push

                       call choice_20

                 fail! value_clear
                   ok! value_leaf Digit
                       symbol_save Digit
                       error_nonterminal Digit
                       loc_pop_discard

              found_22:
                   ok! ast_value_push
                       return

              choice_20:
              # /
              #     '0'
              #     '1'
              #     '2'
              #     '3'
              #     '4'
              #     '5'
              #     '6'
              #     '7'
              #     '8'
              #     '9'

                       error_clear

                       loc_push
                       error_push

                       input_next "t 0"
                   ok! test_char "0"

                       error_pop_merge
                   ok! jump oknoast_19

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t 1"
                   ok! test_char "1"

                       error_pop_merge
                   ok! jump oknoast_19

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t 2"
                   ok! test_char "2"

                       error_pop_merge
                   ok! jump oknoast_19

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t 3"
                   ok! test_char "3"

                       error_pop_merge
                   ok! jump oknoast_19

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t 4"
                   ok! test_char "4"

                       error_pop_merge
                   ok! jump oknoast_19

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t 5"
                   ok! test_char "5"

                       error_pop_merge
                   ok! jump oknoast_19

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t 6"
                   ok! test_char "6"

                       error_pop_merge
                   ok! jump oknoast_19

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t 7"
                   ok! test_char "7"

                       error_pop_merge
                   ok! jump oknoast_19

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t 8"
                   ok! test_char "8"

                       error_pop_merge
                   ok! jump oknoast_19

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t 9"
                   ok! test_char "9"

                       error_pop_merge
                   ok! jump oknoast_19

                       loc_pop_rewind
                       status_fail
                       return

              oknoast_19:
                       loc_pop_discard
                       return
              #
              # value Symbol 'Expression'
              #

              sym_Expression:
              # /
              #     x
              #         '\('
              #         (Expression)
              #         '\)'
              #     x
              #         (Factor)
              #         *
              #             x
              #                 (MulOp)
              #                 (Factor)

                       symbol_restore Expression
                found! jump found_46
                       loc_push
                       ast_push

                       call choice_44

                 fail! value_clear
                   ok! value_reduce Expression
                       symbol_save Expression
                       error_nonterminal Expression
                       ast_pop_rewind
                       loc_pop_discard

              found_46:
                   ok! ast_value_push
                       return

              choice_44:
              # /
              #     x
              #         '\('
              #         (Expression)
              #         '\)'
              #     x
              #         (Factor)
              #         *
              #             x
              #                 (MulOp)
              #                 (Factor)

                       error_clear

                       ast_push
                       loc_push
                       error_push

                       call sequence_27

                       error_pop_merge
                   ok! jump ok_43

                       ast_pop_rewind
                       loc_pop_rewind
                       ast_push
                       loc_push
                       error_push

                       call sequence_40

                       error_pop_merge
                   ok! jump ok_43

                       ast_pop_rewind
                       loc_pop_rewind
                       status_fail
                       return

              ok_43:
                       ast_pop_discard
                       loc_pop_discard
                       return

              sequence_27:
              # x
              #     '\('
              #     (Expression)
              #     '\)'

                       loc_push
                       error_clear

                       error_push

                       input_next "t ("
                   ok! test_char "("

                       error_pop_merge
                 fail! jump failednoast_29
                       ast_push
                       error_push

                       call sym_Expression

                       error_pop_merge
                 fail! jump failed_28
                       error_push

                       input_next "t )"
                   ok! test_char ")"

                       error_pop_merge
                 fail! jump failed_28

                       ast_pop_discard
                       loc_pop_discard
                       return

              failed_28:
                       ast_pop_rewind

              failednoast_29:
                       loc_pop_rewind
                       return

              sequence_40:
              # x
              #     (Factor)
              #     *
              #         x
              #             (MulOp)
              #             (Factor)

                       ast_push
                       loc_push
                       error_clear

                       error_push

                       call sym_Factor

                       error_pop_merge
                 fail! jump failed_41
                       error_push

                       call kleene_37

                       error_pop_merge
                 fail! jump failed_41

                       ast_pop_discard
                       loc_pop_discard
                       return

              failed_41:
                       ast_pop_rewind
                       loc_pop_rewind
                       return

              kleene_37:
              # *
              #     x
              #         (MulOp)
              #         (Factor)

                       loc_push
                       error_push

                       call sequence_34

                       error_pop_merge
                 fail! jump failed_38
                       loc_pop_discard
                       jump kleene_37

              failed_38:
                       loc_pop_rewind
                       status_ok
                       return

              sequence_34:
              # x
              #     (MulOp)
              #     (Factor)

                       ast_push
                       loc_push
                       error_clear

                       error_push

                       call sym_MulOp

                       error_pop_merge
                 fail! jump failed_35
                       error_push

                       call sym_Factor

                       error_pop_merge
                 fail! jump failed_35

                       ast_pop_discard
                       loc_pop_discard
                       return

              failed_35:
                       ast_pop_rewind
                       loc_pop_rewind
                       return
              #
              # value Symbol 'Factor'
              #

              sym_Factor:
              # x
              #     (Term)
              #     *
              #         x
              #             (AddOp)
              #             (Term)

                       symbol_restore Factor
                found! jump found_60
                       loc_push
                       ast_push

                       call sequence_57

                 fail! value_clear
                   ok! value_reduce Factor
                       symbol_save Factor
                       error_nonterminal Factor
                       ast_pop_rewind
                       loc_pop_discard

              found_60:
                   ok! ast_value_push
                       return

              sequence_57:
              # x
              #     (Term)
              #     *
              #         x
              #             (AddOp)
              #             (Term)

                       ast_push
                       loc_push
                       error_clear

                       error_push

                       call sym_Term

                       error_pop_merge
                 fail! jump failed_58
                       error_push

                       call kleene_54

                       error_pop_merge
                 fail! jump failed_58

                       ast_pop_discard
                       loc_pop_discard
                       return

              failed_58:
                       ast_pop_rewind
                       loc_pop_rewind
                       return

              kleene_54:
              # *
              #     x
              #         (AddOp)
              #         (Term)

                       loc_push
                       error_push

                       call sequence_51

                       error_pop_merge
                 fail! jump failed_55
                       loc_pop_discard
                       jump kleene_54

              failed_55:
                       loc_pop_rewind
                       status_ok
                       return

              sequence_51:
              # x
              #     (AddOp)
              #     (Term)

                       ast_push
                       loc_push
                       error_clear

                       error_push

                       call sym_AddOp

                       error_pop_merge
                 fail! jump failed_52
                       error_push

                       call sym_Term

                       error_pop_merge
                 fail! jump failed_52

                       ast_pop_discard
                       loc_pop_discard
                       return

              failed_52:
                       ast_pop_rewind
                       loc_pop_rewind
                       return
              #
              # value Symbol 'MulOp'
              #

              sym_MulOp:
              # /
              #     '*'
              #     '/'

                       symbol_restore MulOp
                found! jump found_67
                       loc_push

                       call choice_65

                 fail! value_clear
                   ok! value_leaf MulOp
                       symbol_save MulOp
                       error_nonterminal MulOp
                       loc_pop_discard

              found_67:
                   ok! ast_value_push
                       return

              choice_65:
              # /
              #     '*'
              #     '/'

                       error_clear

                       loc_push
                       error_push

                       input_next "t *"
                   ok! test_char "*"

                       error_pop_merge
                   ok! jump oknoast_64

                       loc_pop_rewind
                       loc_push
                       error_push

                       input_next "t /"
                   ok! test_char "/"

                       error_pop_merge
                   ok! jump oknoast_64

                       loc_pop_rewind
                       status_fail
                       return

              oknoast_64:
                       loc_pop_discard
                       return
              #
              # value Symbol 'Number'
              #

              sym_Number:
              # x
              #     ?
              #         (Sign)
              #     +
              #         (Digit)

                       symbol_restore Number
                found! jump found_80
                       loc_push
                       ast_push

                       call sequence_77

                 fail! value_clear
                   ok! value_reduce Number
                       symbol_save Number
                       error_nonterminal Number
                       ast_pop_rewind
                       loc_pop_discard

              found_80:
                   ok! ast_value_push
                       return

              sequence_77:
              # x
              #     ?
              #         (Sign)
              #     +
              #         (Digit)

                       ast_push
                       loc_push
                       error_clear

                       error_push

                       call optional_70

                       error_pop_merge
                 fail! jump failed_78
                       error_push

                       call poskleene_73

                       error_pop_merge
                 fail! jump failed_78

                       ast_pop_discard
                       loc_pop_discard
                       return

              failed_78:
                       ast_pop_rewind
                       loc_pop_rewind
                       return

              optional_70:
              # ?
              #     (Sign)

                       loc_push
                       error_push

                       call sym_Sign

                       error_pop_merge
                 fail! loc_pop_rewind
                   ok! loc_pop_discard
                       status_ok
                       return

              poskleene_73:
              # +
              #     (Digit)

                       loc_push

                       call sym_Digit

                 fail! jump failed_74

              loop_75:
                       loc_pop_discard
                       loc_push
                       error_push

                       call sym_Digit

                       error_pop_merge
                   ok! jump loop_75
                       status_ok

              failed_74:
                       loc_pop_rewind
                       return
              #
              # value Symbol 'Sign'
              #

              sym_Sign:
              # /
              #     '-'
              #     '+'

                       symbol_restore Sign
                found! jump found_86
                       loc_push

                       call choice_5

                 fail! value_clear
                   ok! value_leaf Sign
                       symbol_save Sign
                       error_nonterminal Sign
                       loc_pop_discard

              found_86:
                   ok! ast_value_push
                       return
              #
              # value Symbol 'Term'
              #

              sym_Term:
              # (Number)

                       symbol_restore Term
                found! jump found_89
                       loc_push
                       ast_push

                       call sym_Number

                 fail! value_clear
                   ok! value_reduce Term
                       symbol_save Term
                       error_nonterminal Term
                       ast_pop_rewind
                       loc_pop_discard

              found_89:
                   ok! ast_value_push
                       return

              #
              #


PEG SERIALIZATION FORMAT
       Here we specify the format used by the Parser Tools to serialize
       Parsing Expression Grammars as immutable values for transport,
       comparison, etc.

       We distinguish between regular and canonical serializations. While a
       PEG may have more than one regular serialization, exactly one of them
       will be canonical.

       regular serialization

              [1]    The serialization of any PEG is a nested Tcl dictionary.

              [2]    This dictionary holds a single key, pt::grammar::peg, and
                     its value. This value holds the contents of the grammar.

              [3]    The contents of the grammar are a Tcl dictionary holding
                     the set of nonterminal symbols and the starting
                     expression. The relevant keys and their values are

                     rules  The value is a Tcl dictionary whose keys are the
                            names of the nonterminal symbols known to the
                            grammar.

                            [1]    Each nonterminal symbol may occur only
                                   once.

                            [2]    The empty string is not a legal
                                   nonterminal symbol.

                            [3]    The value for each symbol is a Tcl
                                   dictionary itself. The relevant keys and
                                   their values in this dictionary are

                                   is     The value is the serialization of
                                          the parsing expression describing
                                          the symbol's sentential structure,
                                          as specified in the section PE
                                          serialization format.

                                   mode   The value can be one of three
                                          values specifying how a parser
                                          should handle the semantic value
                                          produced by the symbol.

                                          value  The semantic value of the
                                                 nonterminal symbol is an
                                                 abstract syntax tree
                                                 consisting of a single node
                                                 for the nonterminal itself,
                                                 which has the ASTs of the
                                                 symbol's right hand side as
                                                 its children.

                                          leaf   The semantic value of the
                                                 nonterminal symbol is an
                                                 abstract syntax tree
                                                 consisting of a single node
                                                 for the nonterminal, without
                                                 any children. Any ASTs
                                                 generated by the symbol's
                                                 right hand side are
                                                 discarded.

                                          void   The nonterminal has no
                                                 semantic value. Any ASTs
                                                 generated by the symbol's
                                                 right hand side are
                                                 discarded (as well).

                     start  The value is the serialization of the start
                            parsing expression of the grammar, as specified
                            in the section PE serialization format.

              [4]    The terminal symbols of the grammar are specified
                     implicitly as the set of all terminal symbols used in
                     the start expression and on the RHS of the grammar
                     rules.

       canonical serialization
              The canonical serialization of a grammar has the format as
              specified in the previous item, and then additionally satisfies
              the constraints below, which make it unique among all the
              possible serializations of this grammar.

              [1]    The keys found in all the nested Tcl dictionaries are
                     sorted in ascending dictionary order, as generated by
                     Tcl's builtin command lsort -increasing -dict.

              [2]    The string representation of the value is the canonical
                     representation of a Tcl dictionary. I.e. it does not
                     contain superfluous whitespace.

   EXAMPLE
       Assuming the following PEG for simple mathematical expressions

              PEG calculator (Expression)
                  Digit      <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9' ;
                  Sign       <- '-' / '+' ;
                  Number     <- Sign? Digit+ ;
                  Expression <- Term (AddOp Term)* ;
                  MulOp      <- '*' / '/' ;
                  Term       <- Factor (MulOp Factor)* ;
                  AddOp      <- '+'/'-' ;
                  Factor     <- '(' Expression ')' / Number ;
              END;


       then its canonical serialization (except for whitespace) is

              pt::grammar::peg {
                  rules {
                      AddOp      {is {/ {t -} {t +}} mode value}
                      Digit      {is {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}} mode value}
                      Expression {is {x {n Term} {* {x {n AddOp} {n Term}}}} mode value}
                      Factor     {is {/ {x {t (} {n Expression} {t )}} {n Number}} mode value}
                      MulOp      {is {/ {t *} {t /}} mode value}
                      Number     {is {x {? {n Sign}} {+ {n Digit}}} mode value}
                      Sign       {is {/ {t -} {t +}} mode value}
                      Term       {is {x {n Factor} {* {x {n MulOp} {n Factor}}}} mode value}
                  }
                  start {n Expression}
              }
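
       The key-ordering constraint of the canonical form can be checked with
       a short stand-alone sketch. The helper keys_sorted and the two-rule
       grammar are hypothetical and only illustrate constraint [1]; the
       literal is laid out with extra whitespace for readability, so it is
       canonical "except for whitespace" in the same sense as the example
       above.

```tcl
# Sketch: are the keys of ONE dictionary level in the order produced by
# [lsort -increasing -dict], as the canonical form requires?
proc keys_sorted {dict} {
    set keys [dict keys $dict]
    return [expr {$keys eq [lsort -increasing -dict $keys]}]
}

# Hypothetical grammar contents (the value stored under pt::grammar::peg).
set grammar {
    rules {
        AddOp {is {/ {t -} {t +}} mode value}
        Sign  {is {/ {t -} {t +}} mode value}
    }
    start {n AddOp}
}

puts [keys_sorted $grammar]                    ;# 1  (rules, start)
puts [keys_sorted [dict get $grammar rules]]   ;# 1  (AddOp, Sign)
puts [keys_sorted {start {n AddOp} rules {}}]  ;# 0  (start before rules)
```

       A complete check would apply the same test to every nested dictionary,
       including the is/mode dictionary of each symbol.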


PE SERIALIZATION FORMAT
       Here we specify the format used by the Parser Tools to serialize
       Parsing Expressions as immutable values for transport, comparison,
       etc.

       We distinguish between regular and canonical serializations. While a
       parsing expression may have more than one regular serialization,
       exactly one of them will be canonical.

       Regular serialization

              Atomic Parsing Expressions

                     [1]    The string epsilon is an atomic parsing
                            expression. It matches the empty string.

                     [2]    The string dot is an atomic parsing expression.
                            It matches any character.

                     [3]    The string alnum is an atomic parsing expression.
                            It matches any Unicode alphabet or digit
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [4]    The string alpha is an atomic parsing expression.
                            It matches any Unicode alphabet character. This
                            is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [5]    The string ascii is an atomic parsing expression.
                            It matches any Unicode character below U+0080.
                            This is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [6]    The string control is an atomic parsing
                            expression. It matches any Unicode control
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [7]    The string digit is an atomic parsing expression.
                            It matches any Unicode digit character. Note that
                            this includes characters outside of the [0..9]
                            range. This is a custom extension of PEs based on
                            Tcl's builtin command string is.

                     [8]    The string graph is an atomic parsing expression.
                            It matches any Unicode printing character, except
                            for space. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [9]    The string lower is an atomic parsing expression.
                            It matches any Unicode lower-case alphabet
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [10]   The string print is an atomic parsing expression.
                            It matches any Unicode printing character,
                            including space. This is a custom extension of
                            PEs based on Tcl's builtin command string is.

                     [11]   The string punct is an atomic parsing expression.
                            It matches any Unicode punctuation character.
                            This is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [12]   The string space is an atomic parsing expression.
                            It matches any Unicode space character. This is a
                            custom extension of PEs based on Tcl's builtin
                            command string is.

                     [13]   The string upper is an atomic parsing expression.
                            It matches any Unicode upper-case alphabet
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [14]   The string wordchar is an atomic parsing
                            expression. It matches any Unicode word
                            character. This is any alphanumeric character
                            (see alnum), and any connector punctuation
                            characters (e.g. underscore). This is a custom
                            extension of PEs based on Tcl's builtin command
                            string is.

                     [15]   The string xdigit is an atomic parsing
                            expression. It matches any hexadecimal digit
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [16]   The string ddigit is an atomic parsing
                            expression. It matches any decimal digit
                            character. This is a custom extension of PEs
                            based on Tcl's builtin command regexp.

                     [17]   The expression [list t x] is an atomic parsing
                            expression. It matches the terminal string x.

                     [18]   The expression [list n A] is an atomic parsing
                            expression. It matches the nonterminal A.

              Combined Parsing Expressions

                     [1]    For parsing expressions e1, e2, ... the result of
                            [list / e1 e2 ... ] is a parsing expression as
                            well. This is the ordered choice, aka prioritized
                            choice.

                     [2]    For parsing expressions e1, e2, ... the result of
                            [list x e1 e2 ... ] is a parsing expression as
                            well. This is the sequence.

                     [3]    For a parsing expression e the result of [list *
                            e] is a parsing expression as well. This is the
                            kleene closure, describing zero or more
                            repetitions.

                     [4]    For a parsing expression e the result of [list +
                            e] is a parsing expression as well. This is the
                            positive kleene closure, describing one or more
                            repetitions.

                     [5]    For a parsing expression e the result of [list &
                            e] is a parsing expression as well. This is the
                            and lookahead predicate.

                     [6]    For a parsing expression e the result of [list !
                            e] is a parsing expression as well. This is the
                            not lookahead predicate.

                     [7]    For a parsing expression e the result of [list ?
                            e] is a parsing expression as well. This is the
                            optional input.

       Canonical serialization
              The canonical serialization of a parsing expression has the
              format as specified in the previous item, and then additionally
              satisfies the constraints below, which make it unique among all
              the possible serializations of this parsing expression.

              [1]    The string representation of the value is the canonical
                     representation of a pure Tcl list. I.e. it does not
                     contain superfluous whitespace.

              [2]    Terminals are not encoded as ranges (where start and end
                     of the range are identical).

   EXAMPLE
       Assuming the parsing expression shown on the right-hand side of the
       rule

              Expression <- Term (AddOp Term)*


       then its canonical serialization (except for whitespace) is

              {x {n Term} {* {x {n AddOp} {n Term}}}}
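
       Because the rules above are phrased in terms of Tcl's list command,
       such a serialization can be assembled directly with it. The sketch
       below is only an illustration that uses the builtin list and lindex
       commands; the variable names are arbitrary.

```tcl
# Build the PE for:  Term (AddOp Term)*
set term  [list n Term]                 ;# nonterminal Term
set addop [list n AddOp]                ;# nonterminal AddOp
set inner [list x $addop $term]         ;# sequence: AddOp Term
set pe    [list x $term [list * $inner]]

puts $pe
# prints: x {n Term} {* {x {n AddOp} {n Term}}}

# The first element of every combined expression is its operator.
puts [lindex $pe 0]     ;# x
puts [lindex $pe 2 0]   ;# *
```

       Using list instead of string concatenation leaves quoting to Tcl and
       yields a pure list without superfluous whitespace.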


BUGS, IDEAS, FEEDBACK
       This document, and the package it describes, will undoubtedly contain
       bugs and other problems. Please report such in the category pt of the
       Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please also
       report any ideas for enhancements you may have for either package
       and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the
       output of diff -u.

       Note further that attachments are strongly preferred over inlined
       patches. Attachments can be made by going to the Edit form of the
       ticket immediately after its creation, and then using the left-most
       button in the secondary navigation bar.

KEYWORDS
       EBNF, LL(k), PARAM, PEG, TDPL, context-free languages, conversion,
       expression, format conversion, grammar, matching, parser, parsing
       expression, parsing expression grammar, push down automaton, recursive
       descent, serialization, state, top-down parsing languages, transducer

CATEGORY
       Parsing and Grammars

COPYRIGHT
       Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>




tcllib                                 1                 pt::peg::to::param(n)