page(n)                                                                page(n)

______________________________________________________________________________

NAME

       page - Parser Generator

SYNOPSIS

       page ?options...? ?input ?output??

______________________________________________________________________________

DESCRIPTION

       The application described by this document, page, is actually not just
       a parser generator, as the name implies, but a generic tool for the
       execution of arbitrary transformations on texts.

       Its genericity comes through the use of plugins for reading,
       transforming, and writing data, and the predefined set of plugins
       provided by Tcllib is for the generation of memoizing recursive descent
       parsers (aka packrat parsers) from grammar specifications (Parsing
       Expression Grammars).

       page is written on top of the package page::pluginmgr, wrapping its
       functionality into a command line based application. All the other
       page::* packages are plugin and/or supporting packages for the
       generation of parsers. The parsers themselves are based on the packages
       grammar::peg, grammar::peg::interp, and grammar::mengine.

   COMMAND LINE
       page ?options...? ?input ?output??
              This is the general form for calling page. The application will
              read the contents of the file input, process them under the
              control of the specified options, and then write the result to
              the file output.

              If input is the string - the data to process will be read from
              stdin instead of a file. Analogously, the result will be written
              to stdout instead of a file if output is the string -. A missing
              output or input specification causes the application to assume
              -.

              The detailed specifications of the recognized options are
              provided in section OPTIONS.

              path input (in)
                     This argument specifies the path to the file to be
                     processed by the application, or -. The latter value
                     causes the application to read the text from stdin.
                     Otherwise the file has to exist and be readable. If the
                     argument is missing, - is assumed.

              path output (in)
                     This argument specifies where to write the generated
                     text. It can be the path to a file, or -. The latter
                     value causes the application to write the generated text
                     to stdout.

                     If the file output does not exist then [file dirname
                     $output] has to exist and must be a writable directory,
                     as the application will create the file to write to.

                     If the argument is missing, - is assumed.

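       The defaulting rule for input and output described above can be
       modelled in a few lines. The following is only an illustrative sketch
       in Python, not code taken from page itself; the function names
       resolve_io and describe and the file names grammar.peg and parser.tcl
       are hypothetical.

```python
from typing import List, Tuple


def resolve_io(positional: List[str]) -> Tuple[str, str]:
    """Return (input, output); a missing argument defaults to "-"."""
    if len(positional) > 2:
        raise ValueError("expected at most two arguments: input and output")
    # Pad with "-" so that zero, one, or two arguments all work.
    padded = list(positional) + ["-"] * (2 - len(positional))
    return padded[0], padded[1]


def describe(spec: str, stream: str) -> str:
    """Name the standard stream for "-", otherwise the file path."""
    return stream if spec == "-" else "file " + spec


# Hypothetical file names, used only for illustration.
for args in ([], ["grammar.peg"], ["grammar.peg", "parser.tcl"]):
    source, target = resolve_io(args)
    print(describe(source, "stdin"), "->", describe(target, "stdout"))
```

       With no arguments both sides fall back to the standard streams; with
       one argument only the output does.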
   OPERATION
   OPTIONS
       This section describes all the options available to the user of the
       application. Options are always processed in order, i.e. if both --help
       and --version are specified, the option encountered first has
       precedence.

       Unknown options specified before any of the options -rd, -wr, or -tr
       will cause processing to abort with an error. Unknown options coming in
       between these options, or after the last of them, are assumed to always
       take a single argument and are associated with the last plugin option
       coming before them. They will be checked after all the relevant
       plugins, and thus the options they understand, are known. I.e. such
       unknown options cause an error if and only if the plugin option they
       are associated with does not understand them, and was not superseded by
       a plugin option coming after.

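       The association of unknown options with the preceding plugin option can
       be pictured with the following Python sketch. It is a simplified model
       under stated assumptions, not the actual implementation: it treats all
       spellings of the reader, writer, and transform options as plugin
       options, skips over -c without expanding it, and leaves out the
       deferred check whether a plugin understands its options. The function
       name associate and the options --foo and --bar are hypothetical.

```python
from typing import List, Tuple

# Option spellings taken from this manpage.
PLUGIN_OPTIONS = {"-r", "-rd", "--reader", "-w", "-wr", "--writer",
                  "-t", "-tr", "--transform"}
CONFIG_OPTIONS = {"-c", "--configuration"}
FLAGS = {"--help", "-h", "-?", "--version", "-V", "-P", "-T", "-D",
         "-a", "--append", "-p", "--prepend", "--reset"}

# (plugin option, plugin name, [(unknown option, its argument), ...])
Plugin = Tuple[str, str, List[Tuple[str, str]]]


def associate(argv: List[str]) -> List[Plugin]:
    """Attach each unknown option to the last plugin option before it."""
    plugins: List[Plugin] = []
    i = 0
    while i < len(argv):
        word = argv[i]
        if word in FLAGS:
            i += 1
            continue
        if word == "-" or not word.startswith("-"):
            break  # the input/output arguments start here
        if i + 1 >= len(argv):
            raise ValueError("option %s lacks its argument" % word)
        value = argv[i + 1]
        if word in PLUGIN_OPTIONS:
            plugins.append((word, value, []))
        elif word in CONFIG_OPTIONS:
            pass  # assumption: configurations are not expanded in this sketch
        elif not plugins:
            raise ValueError("unknown option %s before any plugin option" % word)
        else:
            # Unknown options are assumed to take exactly one argument.
            plugins[-1][2].append((word, value))
        i += 2
    return plugins


# --foo and --bar are hypothetical plugin-specific options.
print(associate(["-r", "peg", "--foo", "1", "-w", "me", "--bar", "x"]))
```

       In this model --foo belongs to the reader peg and --bar to the writer
       me, while an unknown option placed before the first plugin option is
       rejected immediately.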
       Default options are used if and only if the command line did not
       contain any options at all. They will set the application up as a
       PEG-based parser generator. The exact list of options is

       -c peg

       And now the recognized options and their arguments, if they have any:

       --help

       -h

       -?     When one of these options is found on the command line all
              arguments coming before or after are ignored. The application
              will print a short description of the recognized options and
              exit.

       --version

       -V     When one of these options is found on the command line all
              arguments coming before or after are ignored. The application
              will print its own revision and exit.

       -P     This option signals the application to activate visual feedback
              while reading the input.

       -T     This option signals the application to collect statistics while
              reading the input and to print them after reading has completed,
              before processing starts.

       -D     This option signals the application to activate logging in the
              Safe base, for the debugging of problems with plugins.

       -r parser

       -rd parser

       --reader parser
              These options specify the plugin the application has to use for
              reading the input. If the options are used multiple times the
              last one will be used.

       -w generator

       -wr generator

       --writer generator
              These options specify the plugin the application has to use for
              generating and writing the final output. If the options are used
              multiple times the last one will be used.

       -t process

       -tr process

       --transform process
              These options specify a plugin to run on the input. In contrast
              to readers and writers, each use will not supersede previous
              uses, but add the chosen plugin to a list of transformations,
              either at the front or at the end, per the last seen use of
              either option -p or -a. The initial default is to append the new
              transformations.

       -a

       --append
              These options signal the application that all following
              transformations should be added at the end of the list of
              transformations.

       -p

       --prepend
              These options signal the application that all following
              transformations should be added at the beginning of the list of
              transformations.

       --reset
              This option signals the application to clear the list of
              transformations. This is necessary to wipe out the default
              transformations used.

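       The interplay of the reader, writer, and transformation options above
       can be summarized in a small Python model. This is a sketch of the
       documented rules only, not the application's code. The function name
       select_plugins is hypothetical, and it is an assumption that each
       prepended transformation is inserted at the very front, so that several
       prepended plugins end up in reverse order of their appearance.

```python
from typing import List, Optional, Tuple


def select_plugins(options: List[Tuple[str, Optional[str]]]):
    """Process (option, argument) pairs in command line order.

    Returns (reader, transformations, writer).
    """
    reader = None
    writer = None
    transforms: List[str] = []
    append = True  # the initial default is to append
    for opt, arg in options:
        if opt in ("-r", "-rd", "--reader"):
            reader = arg  # the last reader wins
        elif opt in ("-w", "-wr", "--writer"):
            writer = arg  # the last writer wins
        elif opt in ("-t", "-tr", "--transform"):
            if append:
                transforms.append(arg)
            else:
                transforms.insert(0, arg)  # assumption: insert at the front
        elif opt in ("-a", "--append"):
            append = True
        elif opt in ("-p", "--prepend"):
            append = False
        elif opt == "--reset":
            transforms.clear()
    return reader, transforms, writer


# The option list this manpage documents for the peg configuration.
peg = [("--reset", None), ("--append", None), ("--reader", "peg"),
       ("--transform", "reach"), ("--transform", "use"), ("--writer", "me")]
print(select_plugins(peg))
```

       Applied to the options listed for the predefined peg configuration in
       section PLUGINS, the model selects the reader peg, the transformations
       reach and use in that order, and the writer me.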
       -c file

       --configuration file
              This option causes the application to load a configuration file
              and/or plugin. This is a plugin which in essence provides a
              predefined set of command line options. They are processed
              exactly as if they had been specified in place of the option and
              its argument. This means that unknown options found at the
              beginning of the configuration file are associated with the last
              plugin, even if that plugin was specified before the
              configuration file itself. Conversely, unknown options coming
              after the configuration file can be associated with a plugin
              specified in the file.

              If the argument is a file which cannot be loaded as a plugin the
              application will assume that its contents are a list of options
              and their arguments, separated by spaces, tabs, and newlines.
              Options and arguments containing spaces can be quoted via double
              quotes (") and single quotes ('). The quote character can be
              specified within a quoted string by doubling it. Newlines in a
              quoted string are accepted as is.

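       The quoting rules for plain configuration files can be illustrated with
       a small tokenizer. The Python sketch below follows the description
       above and is not the parser used by page; the function name
       split_config is hypothetical. It assumes that a quote character only
       starts a quoted string at the beginning of a word, and that an
       unterminated quoted string is an error.

```python
from typing import List

SEPARATORS = " \t\n"
QUOTES = "\"'"


def split_config(text: str) -> List[str]:
    """Split configuration text into options and arguments."""
    words: List[str] = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch in SEPARATORS:
            i += 1
        elif ch in QUOTES:
            quote = ch
            i += 1
            buf = []
            while True:
                if i >= n:
                    raise ValueError("unterminated quoted string")
                if text[i] == quote:
                    if i + 1 < n and text[i + 1] == quote:
                        buf.append(quote)  # doubled quote: literal quote
                        i += 2
                        continue
                    i += 1  # closing quote
                    break
                buf.append(text[i])  # newlines are kept as is
                i += 1
            words.append("".join(buf))
        else:
            start = i
            while i < n and text[i] not in SEPARATORS:
                i += 1
            words.append(text[start:i])
    return words


print(split_config('--reader peg\n--writer "my ""odd"" name"\n'))
```

       The example prints four words, the last one being the string
       my "odd" name with the doubled quotes reduced to single ones.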
   PLUGINS
       page makes use of four different types of plugins, namely: readers,
       writers, transformations, and configurations. Here we provide only a
       basic introduction on how to use them from page. The exact APIs
       provided to and expected from the plugins can be found in the
       documentation for page::pluginmgr, for those who wish to write their
       own plugins.

       Plugins are specified as arguments to the options -r, -w, -t, -c, and
       their equivalent longer forms. See the section OPTIONS for reference.

       Each such argument will first be treated as the name of a file, and
       this file is loaded as the plugin. If however there is no file with
       that name, then it will be translated into the name of a package, and
       this package is then loaded. For each type of plugin the package
       management searches not only the regular paths, but a set of
       application- and type-specific paths as well. Please see the section
       PLUGIN LOCATIONS for a listing of all paths and their sources.

       -c name
              Configurations. The name of the package for the plugin name is
              "page::config::name".

              We have one predefined plugin:

              peg    It sets the application up as a parser generator
                     accepting parsing expression grammars and writing a
                     packrat parser in Tcl. The actual arguments it specifies
                     are:

                          --reset
                          --append
                          --reader    peg
                          --transform reach
                          --transform use
                          --writer    me

       -r name
              Readers. The name of the package for the plugin name is
              "page::reader::name".

              We have five predefined plugins:

              peg    Interprets the input as a parsing expression grammar
                     (PEG) and generates a tree representation for it. Both
                     the syntax of PEGs and the structure of the tree
                     representation are explained in their own manpages.

              hb     Interprets the input as Tcl code as generated by the
                     writer plugin hb and generates its tree representation.

              ser    Interprets the input as the serialization of a PEG, as
                     generated by the writer plugin ser, using the package
                     grammar::peg.

              lemon  Interprets the input as a grammar specification as
                     understood by Richard Hipp's LEMON parser generator and
                     generates a tree representation for it. Both the input
                     syntax and the structure of the tree representation are
                     explained in their own manpages.

              treeser
                     Interprets the input as the serialization of a
                     struct::tree. It is validated as such, but nothing else.
                     It is not assumed to be the tree representation of a
                     grammar.

       -w name
              Writers. The name of the package for the plugin name is
              "page::writer::name".

              We have eight predefined plugins:

              identity
                     Simply writes the incoming data as it is, without making
                     any changes. This is good for inspecting the raw result
                     of a reader or transformation.

              null   Generates nothing, and ignores the incoming data
                     structure.

              tree   Assumes that the incoming data structure is a
                     struct::tree and generates an indented textual
                     representation of all nodes, their parental
                     relationships, and their attribute information.

              peg    Assumes that the incoming data structure is a tree
                     representation of a PEG or other grammar and writes it
                     out as a PEG. The result is nicely formatted and
                     partially simplified (strings as sequences of
                     characters). A pretty printer in essence, but it can also
                     be used to obtain a canonical representation of the input
                     grammar.

              tpc    Assumes that the incoming data structure is a tree
                     representation of a PEG or other grammar and writes out
                     Tcl code defining a package which defines a grammar::peg
                     object containing the grammar when it is loaded into an
                     interpreter.

              hb     This is like the writer plugin tpc, but it writes only
                     the statements which define the start expression and the
                     grammar rules. The code making the result a package is
                     left out.

              ser    Assumes that the incoming data structure is a tree
                     representation of a PEG or other grammar, transforms it
                     internally into a grammar::peg object and writes out its
                     serialization.

              me     Assumes that the incoming data structure is a tree
                     representation of a PEG or other grammar and writes out
                     Tcl code defining a package which implements a memoizing
                     recursive descent parser based on the match engine (ME)
                     provided by the package grammar::mengine.

       -t name
              Transformers. The name of the package for the plugin name is
              "page::transform::name".

              We have two predefined plugins:

              reach  Assumes that the incoming data structure is a tree
                     representation of a PEG or other grammar. It determines
                     which nonterminal symbols and rules are reachable from
                     the start symbol/expression. All nonterminal symbols
                     which were not reached are removed.

              use    Assumes that the incoming data structure is a tree
                     representation of a PEG or other grammar. It determines
                     which nonterminal symbols and rules are able to generate
                     a finite sequence of terminal symbols (in the sense of a
                     Context Free Grammar). All nonterminal symbols which were
                     not deemed useful in this sense are removed.

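       The lookup rule for plugin arguments (a file first, otherwise a package
       whose name is derived from the plugin type) can be sketched as follows.
       This Python model is illustrative only. The function name
       resolve_plugin and its file_exists parameter are hypothetical, and the
       search of the package paths is not modelled.

```python
import os
from typing import Callable, Tuple

# Package name prefixes as documented for each plugin type.
PACKAGE_PREFIX = {
    "config": "page::config::",
    "reader": "page::reader::",
    "writer": "page::writer::",
    "transform": "page::transform::",
}


def resolve_plugin(kind: str, name: str,
                   file_exists: Callable[[str], bool] = os.path.isfile
                   ) -> Tuple[str, str]:
    """Return ("file", path) or ("package", package_name) for a plugin."""
    if file_exists(name):
        return ("file", name)  # an existing file takes precedence
    return ("package", PACKAGE_PREFIX[kind] + name)


# Pretend that no file called "peg" exists in the current directory.
print(resolve_plugin("reader", "peg", file_exists=lambda path: False))
```

       The file_exists parameter only exists so that the rule can be tried out
       without touching the file system.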
   PLUGIN LOCATIONS
       The application-specific paths searched by page either are, or come
       from:

       [1]    The directory            "~/.page/plugin"

       [2]    The environment variable PAGE_PLUGINS

       [3]    The registry entry       HKEY_LOCAL_MACHINE\SOFTWARE\PAGE\PLUGINS

       [4]    The registry entry       HKEY_CURRENT_USER\SOFTWARE\PAGE\PLUGINS

       The type-specific paths searched by page either are, or come from:

       [1]    The directory            "~/.page/plugin/<TYPE>"

       [2]    The environment variable PAGE_<TYPE>_PLUGINS

       [3]    The registry entry
              HKEY_LOCAL_MACHINE\SOFTWARE\PAGE\<TYPE>\PLUGINS

       [4]    The registry entry
              HKEY_CURRENT_USER\SOFTWARE\PAGE\<TYPE>\PLUGINS

       Where the placeholder <TYPE> is always one of the values below,
       properly capitalized.

       [1]    reader

       [2]    writer

       [3]    transform

       [4]    config

       The registry entries are specific to the Windows(tm) platform; all
       other platforms will ignore them.

       The contents of both environment variables and registry entries are
       interpreted as a list of paths, with the elements separated by either
       colon (Unix), or semicolon (Windows).

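       As an illustration, the directory and environment variable sources
       listed above can be combined as in the following Python sketch. Several
       points are assumptions and not stated by this manpage: the order in
       which the sources are consulted, the use of the lower-case type name in
       the directory and the upper-case type name in the variable, and the
       function name plugin_search_paths itself. The registry entries are left
       out, and "~" is not expanded.

```python
from typing import List, Mapping


def plugin_search_paths(kind: str, environ: Mapping[str, str],
                        windows: bool = False) -> List[str]:
    """Collect application- and type-specific plugin paths for one type."""
    separator = ";" if windows else ":"

    def from_env(variable: str) -> List[str]:
        # An unset or empty variable contributes no paths.
        return [p for p in environ.get(variable, "").split(separator) if p]

    paths = ["~/.page/plugin"]
    paths += from_env("PAGE_PLUGINS")
    # Assumption: lower case in the directory, upper case in the variable.
    paths.append("~/.page/plugin/" + kind)
    paths += from_env("PAGE_" + kind.upper() + "_PLUGINS")
    return paths


# Hypothetical environment, used only for illustration.
env = {"PAGE_PLUGINS": "/opt/page:/usr/local/page",
       "PAGE_READER_PLUGINS": "/opt/readers"}
print(plugin_search_paths("reader", env))
```

       Passing the environment as a mapping keeps the sketch independent of
       the real process environment.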

BUGS, IDEAS, FEEDBACK

       This document, and the application it describes, will undoubtedly
       contain bugs and other problems. Please report such in the category
       page of the Tcllib SF Trackers
       [http://sourceforge.net/tracker/?group_id=12883]. Please also report
       any ideas for enhancements you may have for either the application or
       the documentation.

SEE ALSO

       page::pluginmgr

KEYWORDS

       parser generator, text processing

COPYRIGHT

       Copyright (c) 2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>

Development Tools                     1.0                              page(n)