pt::peg::export(n)               Parser Tools               pt::peg::export(n)

______________________________________________________________________________

NAME

       pt::peg::export - PEG Export

SYNOPSIS

       package require Tcl 8.5

       package require snit

       package require configuration

       package require pt::peg

       package require pluginmgr

       package require pt::peg::export ?1?

       ::pt::peg::export objectName

       objectName method ?arg arg ...?

       objectName destroy

       objectName export serial serial ?format?

       objectName export object object ?format?

       objectName configuration names

       objectName configuration get

       objectName configuration set name ?value?

       objectName configuration unset pattern...

______________________________________________________________________________

DESCRIPTION

       Are you lost? Do you have trouble understanding this document? In that
       case please read the overview provided by the Introduction to Parser
       Tools. That document is the entry point to the whole system the current
       package is a part of.

       This package provides a manager for parsing expression grammars. Each
       manager instance handles a set of plugins which export a grammar to
       other formats, i.e. convert it to, for example, nroff, HTML, etc.

       It resides in the Export section of the Core Layer of Parser Tools, and
       is one of the three pillars the management of parsing expression
       grammars rests on.

       IMAGE: arch_core_export

       The other two pillars are, as shown above,

       [1]    PEG Import, and

       [2]    PEG Storage

       For information about the data structure which is the major input to
       the manager objects provided by this package see the section PEG
       serialization format.

       The plugin system of this class is based on the package pluginmgr, and
       configured to look for plugins using

       [1]    the environment variable GRAMMAR_PEG_EXPORT_PLUGINS,

       [2]    the environment variable GRAMMAR_PEG_PLUGINS,

       [3]    the environment variable GRAMMAR_PLUGINS,

       [4]    the path "~/.grammar/peg/export/plugin"

       [5]    the path "~/.grammar/peg/plugin"

       [6]    the path "~/.grammar/plugin"

       [7]    the path "~/.grammar/peg/export/plugins"

       [8]    the path "~/.grammar/peg/plugins"

       [9]    the path "~/.grammar/plugins"

       [10]   the registry entry
              "HKEY_CURRENT_USER\SOFTWARE\GRAMMAR\PEG\EXPORT\PLUGINS"

       [11]   the registry entry
              "HKEY_CURRENT_USER\SOFTWARE\GRAMMAR\PEG\PLUGINS"

       [12]   the registry entry "HKEY_CURRENT_USER\SOFTWARE\GRAMMAR\PLUGINS"

       The last three are used only when the package is run on a machine using
       the Windows(tm) operating system.

       The whole system is delivered with three predefined export plugins,
       namely

       container
              See the document "PEG Export Plugin. To CONTAINER format" for
              details.

       json   See the document "PEG Export Plugin. To JSON format" for
              details.

       peg    See the document "PEG Export Plugin. To PEG format" for details.

       For readers wishing to write their own export plugin for some format,
       i.e. plugin writers, reading and understanding the Parser Tools Export
       API specification is an absolute necessity, as it documents the
       interaction between this package and its plugins in detail.

API

   PACKAGE COMMANDS
       ::pt::peg::export objectName
              This command creates a new export manager object with an
              associated Tcl command whose name is objectName. This object
              command is explained in full detail in the sections Object
              command and Object methods. The object command will be created
              under the current namespace if the objectName is not fully
              qualified, and in the specified namespace otherwise.

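       The naming rule above can be sketched as follows. This is only an
       illustration, assuming Tcllib with pt::peg::export is installed; if it
       is not, the script merely prints a notice. The namespace ::demo and the
       object names exporter and other are hypothetical.

```tcl
# Sketch only: requires Tcllib's pt::peg::export on the package path.
if {[catch {package require pt::peg::export} msg]} {
    puts "pt::peg::export not available: $msg"
} else {
    namespace eval ::demo {
        # Relative name: the object command is created in the
        # current namespace, i.e. as ::demo::exporter.
        ::pt::peg::export exporter
    }
    puts [namespace which -command ::demo::exporter]

    # Fully qualified name: created exactly where specified.
    ::pt::peg::export ::demo::other

    ::demo::exporter destroy
    ::demo::other destroy
}
```

       Both objects are destroyed at the end, so no object command is left
       behind in ::demo.
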
   OBJECT COMMAND
       All objects created by the ::pt::peg::export command have the following
       general form:

       objectName method ?arg arg ...?
              The method method and its arguments determine the exact behavior
              of the command. See section Object methods for the detailed
              specifications.

   OBJECT METHODS
       objectName destroy
              This method destroys the object it is invoked for.

       objectName export serial serial ?format?
              This method takes the canonical serialization of a parsing
              expression grammar stored in serial and converts it to the
              specified format, using the export plugin for the format. This
              will fail with an error if no plugin could be found for the
              format. The string generated by the conversion process is
              returned as the result of this method.

              If no format is specified the method defaults to text.

              The specification of what a canonical serialization is can be
              found in the section PEG serialization format.

              The plugin has to conform to the interface documented in the
              Parser Tools Export API specification.

       objectName export object object ?format?
              This method is a convenient wrapper around the export serial
              method described by the previous item. It expects that object is
              an object command supporting a serialize method returning the
              canonical serialization of a parsing expression grammar. It
              invokes that method, feeds the result into export serial and
              returns the resulting string as its own result.

       objectName configuration names
              This method returns a list containing the names of all
              configuration options currently known to the object.

       objectName configuration get
              This method returns a dictionary containing the names and values
              of all configuration options currently known to the object.

       objectName configuration set name ?value?
              This method sets the configuration option name to the specified
              value and returns the new value of the option.

              If no value is specified it simply returns the current value,
              without changing it.

              Note that these configuration options and their values are
              simply passed to a plugin when the actual export is performed.
              It is the plugin which checks the validity, not the manager.

       objectName configuration unset pattern...
              This method unsets all configuration options matching the
              specified glob patterns. If no pattern is specified it will
              unset all currently defined configuration options.
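
       A possible session with these methods is sketched below. It assumes
       Tcllib with pt::peg::export and the predefined peg plugin is installed,
       and prints a notice otherwise. The object name E, the one-rule grammar,
       and the option name -name are assumptions made for illustration; which
       options a plugin accepts is defined by the plugin, not by this package.

```tcl
# Canonical serialization of a one-rule grammar. Building it with
# dict/list commands yields a string without superfluous whitespace.
set serial [dict create pt::grammar::peg \
    [dict create \
        rules [dict create A [dict create is [list t a] mode value]] \
        start [list n A]]]

if {[catch {package require pt::peg::export} msg]} {
    puts "pt::peg::export not available: $msg"
} else {
    ::pt::peg::export E

    set status [catch {
        # Assumed option name. The manager does not validate it, the
        # plugin does when the export is performed.
        E configuration set -name demo
        puts "options: [E configuration names]"

        # Convert using the plugin for the format "peg".
        E export serial $serial peg
    } result]
    puts [expr {$status ? "export failed: $result" : $result}]

    E configuration unset *
    E destroy
}
```

       Wrapping the export in catch reflects that the method raises an error
       when no plugin is found for the requested format.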

PEG SERIALIZATION FORMAT

       Here we specify the format used by the Parser Tools to serialize
       Parsing Expression Grammars as immutable values for transport,
       comparison, etc.

       We distinguish between regular and canonical serializations. While a
       PEG may have more than one regular serialization, exactly one of them
       will be canonical.

       regular serialization

              [1]    The serialization of any PEG is a nested Tcl dictionary.

              [2]    This dictionary holds a single key, pt::grammar::peg, and
                     its value. This value holds the contents of the grammar.

              [3]    The contents of the grammar are a Tcl dictionary holding
                     the set of nonterminal symbols and the starting
                     expression. The relevant keys and their values are

                     rules  The value is a Tcl dictionary whose keys are the
                            names of the nonterminal symbols known to the
                            grammar.

                            [1]    Each nonterminal symbol may occur only
                                   once.

                            [2]    The empty string is not a legal nonterminal
                                   symbol.

                            [3]    The value for each symbol is a Tcl
                                   dictionary itself. The relevant keys and
                                   their values in this dictionary are

                                   is     The value is the serialization of
                                          the parsing expression describing
                                          the symbol's sentential structure,
                                          as specified in the section PE
                                          serialization format.

                                   mode   The value can be one of three values
                                          specifying how a parser should
                                          handle the semantic value produced
                                          by the symbol.

                                          value  The semantic value of the
                                                 nonterminal symbol is an
                                                 abstract syntax tree
                                                 consisting of a single node
                                                 for the nonterminal itself,
                                                 which has the ASTs of the
                                                 symbol's right hand side as
                                                 its children.

                                          leaf   The semantic value of the
                                                 nonterminal symbol is an
                                                 abstract syntax tree
                                                 consisting of a single node
                                                 for the nonterminal, without
                                                 any children. Any ASTs
                                                 generated by the symbol's
                                                 right hand side are
                                                 discarded.

                                          void   The nonterminal has no
                                                 semantic value. Any ASTs
                                                 generated by the symbol's
                                                 right hand side are discarded
                                                 (as well).

                     start  The value is the serialization of the start
                            parsing expression of the grammar, as specified in
                            the section PE serialization format.

              [4]    The terminal symbols of the grammar are specified
                     implicitly as the set of all terminal symbols used in the
                     start expression and on the RHS of the grammar rules.

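       Item [4] can be illustrated with plain Tcl, without any Parser Tools
       package. The sketch below walks a serialization and collects the
       terminal symbols it uses implicitly. The procedure names pe_terminals
       and peg_terminals and the small sample grammar are made up for this
       example; the operator tags are those of the section PE serialization
       format.

```tcl
# Return the terminals used by one serialized parsing expression.
proc pe_terminals {pe} {
    set rest [lassign $pe op]
    switch -exact -- $op {
        t { return $rest }
        n { return {} }
        / - x - * - + - & - ! - ? {
            set result {}
            foreach child $rest {
                lappend result {*}[pe_terminals $child]
            }
            return $result
        }
        default { return {} }
    }
}

# Return the sorted set of terminals of a whole serialized grammar:
# those of the start expression plus those of every rule's "is" value.
proc peg_terminals {serial} {
    set grammar [dict get $serial pt::grammar::peg]
    set found [pe_terminals [dict get $grammar start]]
    dict for {symbol def} [dict get $grammar rules] {
        lappend found {*}[pe_terminals [dict get $def is]]
    }
    return [lsort -unique $found]
}

# A regular (not canonical) serialization of a tiny grammar.
set serial {pt::grammar::peg {
    rules {
        Sign   {is {/ {t -} {t +}} mode value}
        Number {is {x {? {n Sign}} {+ {n Digit}}} mode value}
        Digit  {is {/ {t 0} {t 1}} mode leaf}
    }
    start {n Number}
}}

puts [peg_terminals $serial]   ;# prints: + - 0 1
```

       Character classes such as alnum fall into the default branch, since
       they do not name an individual terminal symbol.
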
       canonical serialization
              The canonical serialization of a grammar has the format as
              specified in the previous item, and then additionally satisfies
              the constraints below, which make it unique among all the
              possible serializations of this grammar.

              [1]    The keys found in all the nested Tcl dictionaries are
                     sorted in ascending dictionary order, as generated by
                     Tcl's builtin command lsort -increasing -dict.

              [2]    The string representation of the value is the canonical
                     representation of a Tcl dictionary. I.e. it does not
                     contain superfluous whitespace.

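       The two constraints can be sketched in plain Tcl as shown below. This
       is an illustration of the rules, not the code used by Parser Tools. The
       procedure names sorted_dict and canonical_peg are hypothetical, and the
       sketch assumes that the parsing expressions inside the grammar are
       already in canonical form, as it copies them unchanged.

```tcl
# Rebuild a dictionary with its keys in ascending dictionary order.
# Using lappend produces a list without superfluous whitespace.
proc sorted_dict {d} {
    set out {}
    foreach key [lsort -increasing -dict [dict keys $d]] {
        lappend out $key [dict get $d $key]
    }
    return $out
}

# Sort the keys on every dictionary level of a PEG serialization:
# the grammar (rules, start), the rules, and each rule (is, mode).
proc canonical_peg {serial} {
    set grammar [dict get $serial pt::grammar::peg]
    set rules {}
    foreach {symbol def} [sorted_dict [dict get $grammar rules]] {
        lappend rules $symbol [sorted_dict $def]
    }
    return [list pt::grammar::peg \
        [list rules $rules start [dict get $grammar start]]]
}

# A regular serialization: keys out of order, extra whitespace.
set regular {pt::grammar::peg {
    start {n B}
    rules {
        B {mode value is {x {n A} {t b}}}
        A {mode leaf  is {t a}}
    }
}}

puts [canonical_peg $regular]
```

       For the sample input this prints the single line
       pt::grammar::peg {rules {A {is {t a} mode leaf} B {is {x {n A} {t b}} mode value}} start {n B}}
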
   EXAMPLE
       Assuming the following PEG for simple mathematical expressions

              PEG calculator (Expression)
                  Digit      <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9'       ;
                  Sign       <- '-' / '+'                                     ;
                  Number     <- Sign? Digit+                                  ;
                  Expression <- Term (AddOp Term)*                            ;
                  MulOp      <- '*' / '/'                                     ;
                  Term       <- Factor (MulOp Factor)*                        ;
                  AddOp      <- '+'/'-'                                       ;
                  Factor     <- '(' Expression ')' / Number                   ;
              END;

       then its canonical serialization (except for whitespace) is

              pt::grammar::peg {
                  rules {
                      AddOp      {is {/ {t -} {t +}} mode value}
                      Digit      {is {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}} mode value}
                      Expression {is {x {n Term} {* {x {n AddOp} {n Term}}}} mode value}
                      Factor     {is {/ {x {t (} {n Expression} {t )}} {n Number}} mode value}
                      MulOp      {is {/ {t *} {t /}} mode value}
                      Number     {is {x {? {n Sign}} {+ {n Digit}}} mode value}
                      Sign       {is {/ {t -} {t +}} mode value}
                      Term       {is {x {n Factor} {* {x {n MulOp} {n Factor}}}} mode value}
                  }
                  start {n Expression}
              }
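
       Because the serialization is a nested Tcl dictionary, its parts can be
       read with the builtin dict command. The following sketch repeats the
       serialization of the calculator grammar and prints one line per rule.
       The variable names serial and grammar are chosen for the example only.

```tcl
set serial {pt::grammar::peg {
    rules {
        AddOp      {is {/ {t -} {t +}} mode value}
        Digit      {is {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}} mode value}
        Expression {is {x {n Term} {* {x {n AddOp} {n Term}}}} mode value}
        Factor     {is {/ {x {t (} {n Expression} {t )}} {n Number}} mode value}
        MulOp      {is {/ {t *} {t /}} mode value}
        Number     {is {x {? {n Sign}} {+ {n Digit}}} mode value}
        Sign       {is {/ {t -} {t +}} mode value}
        Term       {is {x {n Factor} {* {x {n MulOp} {n Factor}}}} mode value}
    }
    start {n Expression}
}}

# The single top-level key holds the contents of the grammar.
set grammar [dict get $serial pt::grammar::peg]

puts "start: [dict get $grammar start]"

# One line per nonterminal: name, mode, and serialized expression.
dict for {symbol def} [dict get $grammar rules] {
    puts [format "%-10s %-5s %s" $symbol \
        [dict get $def mode] [dict get $def is]]
}
```

       Nested keys can also be given in one call, for example
       dict get $grammar rules Factor is.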

PE SERIALIZATION FORMAT

       Here we specify the format used by the Parser Tools to serialize
       Parsing Expressions as immutable values for transport, comparison, etc.

       We distinguish between regular and canonical serializations. While a
       parsing expression may have more than one regular serialization,
       exactly one of them will be canonical.

       Regular serialization

              Atomic Parsing Expressions

                     [1]    The string epsilon is an atomic parsing
                            expression. It matches the empty string.

                     [2]    The string dot is an atomic parsing expression. It
                            matches any character.

                     [3]    The string alnum is an atomic parsing expression.
                            It matches any Unicode alphabet or digit
                            character. This is a custom extension of PEs based
                            on Tcl's builtin command string is.

                     [4]    The string alpha is an atomic parsing expression.
                            It matches any Unicode alphabet character. This is
                            a custom extension of PEs based on Tcl's builtin
                            command string is.

                     [5]    The string ascii is an atomic parsing expression.
                            It matches any Unicode character below U+0080.
                            This is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [6]    The string control is an atomic parsing
                            expression. It matches any Unicode control
                            character. This is a custom extension of PEs based
                            on Tcl's builtin command string is.

                     [7]    The string digit is an atomic parsing expression.
                            It matches any Unicode digit character. Note that
                            this includes characters outside of the [0..9]
                            range. This is a custom extension of PEs based on
                            Tcl's builtin command string is.

                     [8]    The string graph is an atomic parsing expression.
                            It matches any Unicode printing character, except
                            for space. This is a custom extension of PEs based
                            on Tcl's builtin command string is.

                     [9]    The string lower is an atomic parsing expression.
                            It matches any Unicode lower-case alphabet
                            character. This is a custom extension of PEs based
                            on Tcl's builtin command string is.

                     [10]   The string print is an atomic parsing expression.
                            It matches any Unicode printing character,
                            including space. This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [11]   The string punct is an atomic parsing expression.
                            It matches any Unicode punctuation character. This
                            is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [12]   The string space is an atomic parsing expression.
                            It matches any Unicode space character. This is a
                            custom extension of PEs based on Tcl's builtin
                            command string is.

                     [13]   The string upper is an atomic parsing expression.
                            It matches any Unicode upper-case alphabet
                            character. This is a custom extension of PEs based
                            on Tcl's builtin command string is.

                     [14]   The string wordchar is an atomic parsing
                            expression. It matches any Unicode word character.
                            This is any alphanumeric character (see alnum),
                            and any connector punctuation characters (e.g.
                            underscore). This is a custom extension of PEs
                            based on Tcl's builtin command string is.

                     [15]   The string xdigit is an atomic parsing expression.
                            It matches any hexadecimal digit character. This
                            is a custom extension of PEs based on Tcl's
                            builtin command string is.

                     [16]   The string ddigit is an atomic parsing expression.
                            It matches any decimal digit character. This is a
                            custom extension of PEs based on Tcl's builtin
                            command regexp.

                     [17]   The expression [list t x] is an atomic parsing
                            expression. It matches the terminal string x.

                     [18]   The expression [list n A] is an atomic parsing
                            expression. It matches the nonterminal A.

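       The relation between the character classes above and Tcl's builtin
       commands can be sketched as follows. The procedure atom_matches is
       hypothetical and not part of Parser Tools. It tests one character (or
       the empty string) against an atomic parsing expression, and leaves out
       nonterminals, which cannot be decided without a grammar.

```tcl
# Return 1 if the atomic parsing expression accepts char, else 0.
proc atom_matches {atom char} {
    set op [lindex $atom 0]
    switch -exact -- $op {
        epsilon { return [expr {$char eq ""}] }
        dot     { return [expr {[string length $char] == 1}] }
        ddigit  { return [regexp {^[0-9]$} $char] }
        alnum - alpha - ascii - control - digit - graph - lower -
        print - punct - space - upper - wordchar - xdigit {
            # The class names double as "string is" classes.
            return [expr {[string length $char] == 1
                          && [string is $op -strict $char]}]
        }
        t       { return [expr {$char eq [lindex $atom 1]}] }
        default { error "not handled by this sketch: $atom" }
    }
}

puts [atom_matches digit 7]          ;# prints: 1
puts [atom_matches digit \u0663]     ;# prints: 1
puts [atom_matches ddigit \u0663]    ;# prints: 0
puts [atom_matches {t +} +]          ;# prints: 1
puts [atom_matches upper a]          ;# prints: 0
```

       The character \u0663 (ARABIC-INDIC DIGIT THREE) shows the difference
       between digit and ddigit noted in items [7] and [16].
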
              Combined Parsing Expressions

                     [1]    For parsing expressions e1, e2, ... the result of
                            [list / e1 e2 ... ] is a parsing expression as
                            well. This is the ordered choice, aka prioritized
                            choice.

                     [2]    For parsing expressions e1, e2, ... the result of
                            [list x e1 e2 ... ] is a parsing expression as
                            well. This is the sequence.

                     [3]    For a parsing expression e the result of [list *
                            e] is a parsing expression as well. This is the
                            Kleene closure, describing zero or more
                            repetitions.

                     [4]    For a parsing expression e the result of [list +
                            e] is a parsing expression as well. This is the
                            positive Kleene closure, describing one or more
                            repetitions.

                     [5]    For a parsing expression e the result of [list &
                            e] is a parsing expression as well. This is the
                            and lookahead predicate.

                     [6]    For a parsing expression e the result of [list !
                            e] is a parsing expression as well. This is the
                            not lookahead predicate.

                     [7]    For a parsing expression e the result of [list ?
                            e] is a parsing expression as well. This is the
                            optional input.

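       The constructors above compose directly with Tcl's list command. The
       sketch below builds an expression and renders it back into PEG
       notation. The procedures pe_to_text and pe_group are hypothetical
       helpers written for this example, and the rendering of character
       classes as <name> is an arbitrary choice of the sketch.

```tcl
# Render a serialized parsing expression in PEG notation.
proc pe_to_text {pe} {
    set args [lassign $pe op]
    switch -exact -- $op {
        t { return "'[lindex $args 0]'" }
        n { return [lindex $args 0] }
        / {
            set parts {}
            foreach e $args { lappend parts [pe_to_text $e] }
            return [join $parts " / "]
        }
        x {
            set parts {}
            foreach e $args { lappend parts [pe_group $e] }
            return [join $parts " "]
        }
        * - + - ? { return "[pe_group [lindex $args 0]]$op" }
        & - !     { return "$op[pe_group [lindex $args 0]]" }
        default   { return "<$op>" }
    }
}

# Parenthesize choices and sequences used as operands.
proc pe_group {pe} {
    set text [pe_to_text $pe]
    if {[lindex $pe 0] in {/ x}} { return "($text)" }
    return $text
}

# Term (AddOp Term)*
set pe [list x [list n Term] [list * [list x [list n AddOp] [list n Term]]]]

puts $pe                ;# prints: x {n Term} {* {x {n AddOp} {n Term}}}
puts [pe_to_text $pe]   ;# prints: Term (AddOp Term)*
```

       Building the value with list, instead of writing the string by hand,
       also avoids superfluous whitespace in the result.
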
       Canonical serialization
              The canonical serialization of a parsing expression has the
              format as specified in the previous item, and then additionally
              satisfies the constraints below, which make it unique among all
              the possible serializations of this parsing expression.

              [1]    The string representation of the value is the canonical
                     representation of a pure Tcl list. I.e. it does not
                     contain superfluous whitespace.

              [2]    Terminals are not encoded as ranges (where start and end
                     of the range are identical).

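       Constraint [1] can be illustrated by rebuilding an expression with the
       list command, which generates the canonical string representation. The
       procedure pe_canonical is a hypothetical helper for this example only.
       It covers the operators described in this document and does not
       address constraint [2].

```tcl
# Rebuild a parsing expression so that every nested list is
# regenerated by Tcl, dropping superfluous whitespace.
proc pe_canonical {pe} {
    set args [lassign $pe op]
    switch -exact -- $op {
        t - n { return [list $op [lindex $args 0]] }
        / - x - * - + - & - ! - ? {
            set out [list $op]
            foreach child $args {
                lappend out [pe_canonical $child]
            }
            return $out
        }
        default { return $op }
    }
}

# A regular serialization with extra whitespace.
set regular "x   {n  Term}  {*  {x {n AddOp}   {n Term}}}"

puts [pe_canonical $regular]   ;# prints: x {n Term} {* {x {n AddOp} {n Term}}}
```

       The input and the output denote the same Tcl list. Only the string
       representation differs.
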
   EXAMPLE
       Assuming the parsing expression shown on the right-hand side of the
       rule

                  Expression <- Term (AddOp Term)*

       then its canonical serialization (except for whitespace) is

                  {x {n Term} {* {x {n AddOp} {n Term}}}}

BUGS, IDEAS, FEEDBACK

       This document, and the package it describes, will undoubtedly contain
       bugs and other problems. Please report such in the category pt of the
       Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please also
       report any ideas for enhancements you may have for either package
       and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the
       output of diff -u.

       Note further that attachments are strongly preferred over inlined
       patches. Attachments can be made by going to the Edit form of the
       ticket immediately after its creation, and then using the left-most
       button in the secondary navigation bar.

KEYWORDS

       EBNF, LL(k), PEG, TDPL, context-free languages, expression, grammar,
       matching, parser, parsing expression, parsing expression grammar, push
       down automaton, recursive descent, state, top-down parsing languages,
       transducer

CATEGORY

       Parsing and Grammars

       Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>



tcllib                                 1                    pt::peg::export(n)