Parse::Yapp(3)        User Contributed Perl Documentation       Parse::Yapp(3)

NAME
       Parse::Yapp - Perl extension for generating and using LALR parsers.

SYNOPSIS
         yapp -m MyParser grammar_file.yp

         ...

         use MyParser;

         $parser=new MyParser();
         $value=$parser->YYParse(yylex => \&lexer_sub, yyerror => \&error_sub);

         $nberr=$parser->YYNberr();

         $parser->YYData->{DATA}= [ 'Anything', 'You Want' ];

         $data=$parser->YYData->{DATA}[0];

DESCRIPTION
       Parse::Yapp (Yet Another Perl Parser compiler) is a collection of
       modules that let you generate and use yacc-like, thread-safe
       (reentrant) parsers with a Perl object-oriented interface.

       The yapp script is a front end to the Parse::Yapp module and lets you
       easily create a Perl OO parser from an input grammar file.

   The Grammar file
       "Comments"
           Throughout your grammar files, comments are either Perl style,
           introduced by # and running to the end of the line, or C style,
           enclosed between /* and */.

       "Tokens and string literals"
           Throughout the grammar files, two kinds of symbols may appear:
           non-terminal symbols, also called left-hand-side symbols, which
           are the names of your rules, and terminal symbols, also called
           tokens.

           Tokens are the symbols your lexer function will feed your parser
           with (see below). They come in two flavours: symbolic tokens and
           string literals.

           Non-terminals and symbolic tokens share the same identifier syntax:

                           [A-Za-z][A-Za-z0-9_]*

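           As a rough illustration of that identifier pattern, the following
           Perl sketch checks candidate names against it. The helper name
           is_symbol_name is hypothetical and is not part of Parse::Yapp.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Parse::Yapp): true when $name follows
# the identifier syntax shared by non-terminals and symbolic tokens.
sub is_symbol_name {
    my ($name) = @_;
    return $name =~ /\A[A-Za-z][A-Za-z0-9_]*\z/ ? 1 : 0;
}

# Names starting with a digit or an underscore are rejected.
for my $name (qw(NUM exp line_2 2nd _tmp)) {
    printf "%-8s %s\n", $name, is_symbol_name($name) ? 'valid' : 'invalid';
}
```
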
           String literals are enclosed in single quotes and can contain
           almost anything. They are written to your parser file
           double-quoted, so special characters are interpreted as such.
           '"', '$' and '@' are automatically escaped with '\', which makes
           them more natural to write. On the other hand, if you need a
           single quote inside your literal, escape it with '\'.

           You cannot have a literal 'error' in your grammar, as the driver
           would confuse it with the error token. Use a symbolic token
           instead. If you use it inadvertently, yapp issues a warning
           telling you that you should have written it error, and treats it
           as if it were the error token, which is certainly NOT what you
           meant.

       "Grammar file syntax"
           It is very close to yacc syntax (in fact, Parse::Yapp should
           compile a clean yacc grammar without any modification, whereas the
           opposite is not true).

           This file is divided into three sections, separated by "%%":

                   header section
                   %%
                   rules section
                   %%
                   footer section

       The Header Section may optionally contain:
           *   One or more code blocks enclosed inside "%{" and "%}", just
               like in yacc. They may contain any valid Perl code and will be
               copied verbatim at the very beginning of the parser module.
               They are not as useful as they are in yacc, but you can use
               them, for example, for global variable declarations, though you
               will notice later that such global variables can be avoided to
               make a reentrant parser module.

           *   Precedence declarations, introduced by %left, %right and
               %nonassoc specifying associativity, followed by the list of
               tokens or literals having the same precedence and
               associativity. The later a precedence is declared, the higher
               its level. (See the yacc or bison manuals for a full
               explanation of how they work, as they are implemented exactly
               the same way in Parse::Yapp.)

           *   %start followed by a rule's left-hand side, declaring this rule
               to be the starting rule of your grammar. The default, when
               %start is not used, is the first rule in your rules section.

           *   %token followed by a list of symbols, forcing them to be
               recognized as tokens and generating a syntax error if they are
               used in the left-hand side of a rule declaration. Note that in
               Parse::Yapp, you don't need to declare tokens as in yacc: any
               symbol not appearing as a left-hand side of a rule is
               considered to be a token. Other yacc declarations or
               constructs such as %type and %union are parsed but (almost)
               ignored.

           *   %expect followed by a number, which suppresses warnings about
               the number of shift/reduce conflicts when both numbers match,
               a la bison.

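           Put together, a header section using these declarations might look
           like the sketch below. The token names, the precedence levels and
           the $nesting variable are illustrative assumptions only.

```
%{
# Perl code copied verbatim to the top of the generated module.
# $nesting is an illustrative global; see YYData for a reentrant option.
my $nesting = 0;
%}

/* optional: force NUM and VAR to be recognized as tokens */
%token NUM VAR

/* lowest precedence first, highest precedence last */
%left   '+' '-'
%left   '*' '/'
%left   NEG
%right  '^'

%start  input

%%
```

           Here NEG is not returned by the lexer; it only names a precedence
           level that a rule can refer to with %prec.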
       The Rule Section contains your grammar rules:
           A rule is made of a left-hand-side symbol, followed by a ':' and
           one or more right-hand sides separated by '|' and terminated by a
           ';':

               exp:    exp '+' exp
                   |   exp '-' exp
                   ;

           A right-hand side may be empty:

               input:  #empty
                   |   input line
                   ;

           (If you have more than one empty rhs, Parse::Yapp will issue a
           warning, as this is usually a mistake, and you will certainly have
           a reduce/reduce conflict.)

           A rhs may be followed by an optional %prec directive, followed by a
           token, giving the rule an explicit precedence (see the yacc manuals
           for its precise meaning), and by an optional semantic action code
           block (see below).

               exp:   '-' exp %prec NEG { -$_[2] }
                   |  exp '+' exp       { $_[1] + $_[3] }
                   |  NUM
                   ;

           Note that in Parse::Yapp, a lhs cannot appear more than once as a
           rule name (this differs from yacc).

       "The footer section"
           may contain any valid Perl code and will be appended at the very
           end of your parser module. Here you can write your lexer, error
           reporting subs and anything else relevant to your parser.

       "Semantic actions"
           Semantic actions are run every time a reduction occurs in the
           parsing flow, and they must return a semantic value.

           They are (usually, but see "In rule actions" below) written at the
           very end of the rhs, enclosed in "{ }", and are copied verbatim
           to your parser file, inside the rules table.

           Be aware that matching braces in Perl is much more difficult than
           in C: inside strings they don't need to match. While in C it is
           very easy to detect the beginning of a string construct or a
           single character, it is much more difficult in Perl, as there are
           so many ways of writing such literals. So there is no check for
           that today. If you need a brace in a double-quoted string, just
           escape it ("\{" or "\}"). For single-quoted strings, you will need
           to add a comment that matches it in the right order. Sorry for the
           inconvenience.

               {
                   "{ My string block }".
                   "\{ My other string block \}".
                   qq/ My unmatched brace \} /.
                   # Force the match: {
                   q/ for my closing brace } /.
                   q/ My opening brace { /
                   # must be closed: }
               }

           All of these constructs should work.

           In Parse::Yapp, semantic actions are called like normal Perl sub
           calls, with their arguments passed in @_, and their semantic
           values are their return values.

           $_[1] to $_[n] are the parameters, just like $1 to $n in yacc,
           while $_[0] is the parser object itself.

           Having $_[0] be the parser object itself allows you to call
           parser methods. That's how the yacc macros are implemented:

                   yyerrok is done by calling $_[0]->YYErrok
                   YYERROR is done by calling $_[0]->YYError
                   YYACCEPT is done by calling $_[0]->YYAccept
                   YYABORT is done by calling $_[0]->YYAbort

           All those methods explicitly return undef, for convenience.

               YYRECOVERING is done by calling $_[0]->YYRecovering

           Four methods are useful in an error recovery sub:

               $_[0]->YYCurtok
               $_[0]->YYCurval
               $_[0]->YYExpect
               $_[0]->YYLexer

           They return, respectively, the current input token that made the
           parse fail, its semantic value (both can also be used to modify
           these values, but know what you are doing! See the "Error
           reporting routine" section for an example), a list containing the
           tokens the parser expected when the failure occurred, and a
           reference to the lexer routine.

           Note that if "$_[0]->YYCurtok" is declared as a %nonassoc token, it
           can be included in the "$_[0]->YYExpect" list whenever the input
           tries to use it in an associative way. This is not a bug: the
           token IS expected, so that an error is reported if it is
           encountered.

           To detect such a case in your error reporting sub, the following
           example should do the trick:

                   grep { $_[0]->YYCurtok eq $_ } $_[0]->YYExpect
               and do {
                   #Non-associative token used in an associative expression
               };

           Accessing semantic values on the left of your reducing rule is
           done through the method

               $_[0]->YYSemval( index )

           where index is an integer. An index of 1 .. n returns the same
           values as $_[1] .. $_[n], while -n .. 0 returns values on the left
           of the rule being reduced. (This is related to $-n .. $0 .. $n in
           yacc, but you cannot use $_[0] or $_[-n] constructs in Parse::Yapp
           for obvious reasons.)

           There is also a provision for a user data area in the parser
           object, accessed by the method:

               $_[0]->YYData

           which returns a reference to an anonymous hash, letting you keep
           all of your parsing data inside the object (see the Calc.yp or
           ParseYapp.yp files in the distribution for some examples). That's
           how you can make your parser module reentrant: all of your module
           states and variables are held inside the parser object.

           Note: unfortunately, method calls in Perl have a lot of overhead,
                 and when YYData is used, it may be called a huge number
                 of times. If you are not a *real* purist and efficiency
                 is your concern, you may directly access the user space
                 in the object, $parser->{USER}, which is a reference to
                 an anonymous hash, and then benchmark.

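           For instance, rule actions can keep a symbol table in the user
           data area rather than in a global variable. This is only a sketch:
           the rules, the token names and the VARS key are assumptions chosen
           for illustration.

```
assignment:
        VAR '=' exp     { $_[0]->YYData->{VARS}{$_[1]} = $_[3] }
    ;

exp:    NUM
    |   VAR             { $_[0]->YYData->{VARS}{$_[1]} }
    ;
```

           Because the table lives in the parser object, two parser objects
           built from the same module do not share variables.
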
           If no action is specified for a rule, the equivalent of a default
           action is run, which returns the first parameter:

              { $_[1] }

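           The calling convention can be mimicked in plain Perl, outside any
           generated parser. In this sketch the anonymous subs stand in for
           action blocks and the empty hash stands in for the parser object;
           both are assumptions made for illustration, not driver code.

```perl
use strict;
use warnings;

# Stand-in for the parser object that the driver passes as $_[0].
my $parser = {};

# What the action { $_[1] + $_[3] } of "exp: exp '+' exp" amounts to.
my $add = sub { $_[1] + $_[3] };

# The default action, used when a rule has no action block.
my $default = sub { $_[1] };

# An action receives the parser object followed by the semantic values
# of the right-hand-side symbols, in order.
my $sum   = $add->($parser, 2, '+', 3);
my $first = $default->($parser, 'a', 'b');

print "sum=$sum first=$first\n";
```
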
       "In rule actions"
           It is also possible to embed semantic actions inside a rule:

               typedef:    TYPE { $type = $_[1] } identlist { ... } ;

           When Parse::Yapp's parser encounters such an embedded action, it
           modifies the grammar as if you had written (although @x-1 is not a
           legal lhs value):

               @x-1:   /* empty */ { $type = $_[1] };
               typedef:    TYPE @x-1 identlist { ... } ;

           where x is a sequential number incremented for each "in rule"
           action, and -1 represents the "dot position" in the rule where the
           action arises.

           In such actions, you can use the $_[1]..$_[n] variables, which are
           the semantic values on the left of your action.

           Be aware that the way Parse::Yapp modifies your grammar because of
           in rule actions can produce, in some cases, spurious conflicts that
           wouldn't happen otherwise.

       "Generating the Parser Module"
           Now that your grammar file is written, you can run yapp on it to
           generate your parser module:

               yapp -v Calc.yp

           will create two files: Calc.pm, your parser module, and
           Calc.output, a verbose output of your parser rules, conflicts,
           warnings, states and summary.

           What you are missing now is a lexer routine.

       "The Lexer sub"
           is called each time the parser needs to read the next token.

           It is called with only one argument, the parser object itself, so
           you can access its methods, especially the

               $_[0]->YYData

           data area.

           Its duty is to return the next token and value to the parser.
           They "must" be returned as a list of two values: the first one
           is the token known by the parser (symbolic or literal), the second
           one is anything you want (usually the content of the token, or
           the literal value), from a simple scalar value to any complex
           reference, as the parsing driver never uses it except to pass it
           to semantic actions:

               ( 'NUMBER', $num )
           or
               ( '>=', '>=' )
           or
               ( 'ARRAY', [ @values ] )

           When the lexer reaches the end of input, it must return the ''
           empty token with an undef value:

                ( '', undef )

           Note that your lexer should never return 'error' as a token:
           for the driver, this is the error token used for error recovery,
           and it would lead to odd reactions.

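           A minimal lexer following these rules might look like the sketch
           below. The MockParser package is a hypothetical stand-in that only
           provides YYData, so the example runs without a generated parser;
           the INPUT key and the NUM and VAR token names are assumptions too.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a generated parser: it only provides YYData,
# the one method this lexer uses.
package MockParser;
sub new    { return bless({ USER => {} }, shift) }
sub YYData { return $_[0]{USER} }

package main;

sub lexer {
    my ($parser) = @_;
    my $data = $parser->YYData;

    # INPUT is an assumed key holding the text that is still unread.
    defined $data->{INPUT} or return ('', undef);
    $data->{INPUT} =~ s/^\s+//;
    length $data->{INPUT}  or return ('', undef);    # end of input

    for ($data->{INPUT}) {
        s/^([0-9]+(?:\.[0-9]+)?)//   and return ('NUM', $1);
        s/^([A-Za-z][A-Za-z0-9_]*)// and return ('VAR', $1);
        s/^(.)//s                    and return ($1, $1);   # literal token
    }
}

my $parser = MockParser->new;
$parser->YYData->{INPUT} = '12 + x';

# Drain the lexer the way the driver would, one token at a time.
my @tokens;
while (1) {
    my ($tok, $val) = lexer($parser);
    last if $tok eq '';
    push @tokens, [$tok, $val];
}
print join(' ', map { "$_->[0]($_->[1])" } @tokens), "\n";
```

           With a real generated parser, a reference to such a sub would be
           passed to YYParse as yylex => \&lexer.
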
           Now that you have your lexer written, you may need to output
           meaningful error messages instead of the default, which is to
           print 'Parse error.' on STDERR.

           So you will need an error reporting sub.

       "Error reporting routine"
           If you want one, write it knowing that it is passed the parser
           object as its parameter, so you can share information with the
           lexer routine quite easily.

           You can also use the "$_[0]->YYErrok" method in it, which will
           resume parsing as if no error had occurred. Of course, since the
           invalid token is still invalid, you're supposed to fix the problem
           yourself.

           The method "$_[0]->YYLexer" may help you, as it returns a reference
           to the lexer routine, and can be called as

               ($tok,$val)=&{$_[0]->YYLexer}

           to get the next token and semantic value from the input stream. To
           make them current for the parser, use:

               ($_[0]->YYCurtok, $_[0]->YYCurval) = ($tok, $val)

           and know what you're doing...

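           As an illustration, an error reporting sub could build its message
           from YYCurtok, YYCurval and YYExpect. In this sketch MockParser is
           a hypothetical stand-in returning canned values, and the sub names
           and message wording are assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser object, returning canned values
# for the three accessors used below.
package MockParser;
sub new      { return bless({}, shift) }
sub YYCurtok { return 'NUM' }
sub YYCurval { return 42 }
sub YYExpect { return ('+', '-', ';') }

package main;

# Build a message from the offending token and the expected tokens.
sub error_message {
    my ($parser) = @_;
    my $tok      = $parser->YYCurtok;
    my $val      = $parser->YYCurval;
    my @expected = $parser->YYExpect;

    my $found = $tok eq ''   ? 'end of input'
              : defined $val ? "$tok ($val)"
              :                $tok;
    return "Syntax error: found $found, expected one of: @expected";
}

# The sub whose reference would be given to YYParse as yyerror.
sub error_sub { print STDERR error_message($_[0]), "\n" }

my $msg = error_message(MockParser->new);
print "$msg\n";
```
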
       "Parsing"
           Now you've got everything you need to do the parsing.

           First, use the parser module:

               use Calc;

           Then create the parser object:

               $parser=new Calc;

           Now, call the YYParse method, telling it where to find the lexer
           and error reporting subs:

               $result=$parser->YYParse(yylex => \&Lexer,
                                      yyerror => \&ErrorReport);

           (assuming the Lexer and ErrorReport subs have been written in your
           current package)

           The order in which parameters appear is unimportant.

           Et voila.

           The YYParse method will do the parse, then return the last semantic
           value returned, or undef if error recovery cannot recover.

           If you need to be sure the parse has been successful (in case your
           last returned semantic value is undef), make a call to:

               $parser->YYNberr()

           which returns the total number of times the error reporting sub
           has been called.

       "Error Recovery"
           in Parse::Yapp is implemented the same way it is in yacc.

       "Debugging Parser"
           To debug your parser, you can call the YYParse method with a debug
           parameter:

               $parser->YYParse( ... , yydebug => value, ... )

           where value is a bitfield, each bit representing a specific debug
           output:

               Bit Value    Outputs
               0x01         Token reading (useful for Lexer debugging)
               0x02         States information
               0x04         Driver actions (shifts, reduces, accept...)
               0x08         Parse Stack dump
               0x10         Error Recovery tracing

           To get full debugging output, use

               yydebug => 0x1F

           Debugging output is sent to STDERR; be aware that it can produce
           "huge" outputs.

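           Since the value is a bitfield, individual outputs can be combined
           with a bitwise OR. The constant names in this sketch are
           hypothetical; only the numeric values come from the table above.

```perl
use strict;
use warnings;

# Hypothetical names for the documented yydebug bit values.
use constant {
    DEBUG_TOKENS   => 0x01,   # token reading
    DEBUG_STATES   => 0x02,   # states information
    DEBUG_ACTIONS  => 0x04,   # driver actions
    DEBUG_STACK    => 0x08,   # parse stack dump
    DEBUG_RECOVERY => 0x10,   # error recovery tracing
};

# Trace only what the lexer returns and how error recovery proceeds.
my $lexer_and_recovery = DEBUG_TOKENS | DEBUG_RECOVERY;

# All five bits together give the full-output value.
my $everything = DEBUG_TOKENS | DEBUG_STATES | DEBUG_ACTIONS
               | DEBUG_STACK  | DEBUG_RECOVERY;

printf "lexer+recovery: 0x%02X, everything: 0x%02X\n",
    $lexer_and_recovery, $everything;
```

           The chosen value would then be passed in the YYParse call, for
           example as yydebug => $lexer_and_recovery.
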
       "Standalone Parsers"
           By default, the generated parser modules need the Parse::Yapp
           module installed on the system to run. They use
           Parse::Yapp::Driver, which can be safely shared between parsers in
           the same script.

           If you'd prefer to have a standalone module generated, use the
           "-s" switch with yapp: this will automagically copy the driver
           code into your module so you can use and distribute it without
           needing the Parse::Yapp module, making it really a "Standalone
           Parser".

           If you do so, please remember to include Parse::Yapp's copyright
           notice in your main module copyright, so others can know about
           the Parse::Yapp module.

       "Source file line numbers"
           are included in the generated parser module by default, which
           helps you find the guilty line in your source file in case of a
           syntax error. You can disable this feature by compiling your
           grammar with yapp using the "-n" switch.

BUGS AND SUGGESTIONS
       If you find bugs, think of anything that could improve Parse::Yapp or
       have any questions related to it, feel free to contact the author.

AUTHOR
       Francois Desarmenien  <francois@fdesar.net>

SEE ALSO
       yapp(1) perl(1) yacc(1) bison(1).

COPYRIGHT
       The Parse::Yapp module and its related modules and shell scripts are
       copyright (c) 1998-2001 Francois Desarmenien, France. All rights
       reserved.

       You may use and distribute them under the terms of either the GNU
       General Public License or the Artistic License, as specified in the
       Perl README file.

       If you use the "standalone parser" option so people don't need to
       install Parse::Yapp on their systems in order to run your software,
       this copyright notice should be included in your software copyright
       too, and the copyright notice in the embedded driver should be left
       untouched.

perl v5.16.3                      2014-06-10                    Parse::Yapp(3)