Parse::RecDescent(3)  User Contributed Perl Documentation Parse::RecDescent(3)

NAME

       Parse::RecDescent - Generate Recursive-Descent Parsers


VERSION

       This document describes version 1.94 of Parse::RecDescent, released
       April 9, 2003.


SYNOPSIS

13        use Parse::RecDescent;
14
15        # Generate a parser from the specification in $grammar:
16
17            $parser = new Parse::RecDescent ($grammar);
18
19        # Generate a parser from the specification in $othergrammar
20
21            $anotherparser = new Parse::RecDescent ($othergrammar);
22
23
24        # Parse $text using rule 'startrule' (which must be
25        # defined in $grammar):
26
27           $parser->startrule($text);
28
29
30        # Parse $text using rule 'otherrule' (which must also
31        # be defined in $grammar):
32
33            $parser->otherrule($text);
34
35
36        # Change the universal token prefix pattern
37        # (the default is: '\s*'):
38
39           $Parse::RecDescent::skip = '[ \t]+';
40
41
42        # Replace productions of existing rules (or create new ones)
43        # with the productions defined in $newgrammar:
44
45           $parser->Replace($newgrammar);
46
47
48        # Extend existing rules (or create new ones)
49        # by adding extra productions defined in $moregrammar:
50
51           $parser->Extend($moregrammar);
52
53
54        # Global flags (useful as command line arguments under -s):
55
56           $::RD_ERRORS       # unless undefined, report fatal errors
57           $::RD_WARN         # unless undefined, also report non-fatal problems
           $::RD_HINT         # if defined, also suggest remedies
59           $::RD_TRACE        # if defined, also trace parsers' behaviour
60           $::RD_AUTOSTUB     # if defined, generates "stubs" for undefined rules
61           $::RD_AUTOACTION   # if defined, appends specified action to productions
62

DESCRIPTION

   Overview
65       Parse::RecDescent incrementally generates top-down recursive-descent
66       text parsers from simple yacc-like grammar specifications. It provides:
67
68       ·   Regular expressions or literal strings as terminals (tokens),
69
70       ·   Multiple (non-contiguous) productions for any rule,
71
72       ·   Repeated and optional subrules within productions,
73
74       ·   Full access to Perl within actions specified as part of the
75           grammar,
76
77       ·   Simple automated error reporting during parser generation and
78           parsing,
79
80       ·   The ability to commit to, uncommit to, or reject particular
81           productions during a parse,
82
83       ·   The ability to pass data up and down the parse tree ("down" via
84           subrule argument lists, "up" via subrule return values)
85
86       ·   Incremental extension of the parsing grammar (even during a parse),
87
88       ·   Precompilation of parser objects,
89
90       ·   User-definable reduce-reduce conflict resolution via "scoring" of
91           matching productions.
92
   Using "Parse::RecDescent"
94       Parser objects are created by calling "Parse::RecDescent::new", passing
95       in a grammar specification (see the following subsections). If the
96       grammar is correct, "new" returns a blessed reference which can then be
97       used to initiate parsing through any rule specified in the original
98       grammar. A typical sequence looks like this:
99
100           $grammar = q {
101               # GRAMMAR SPECIFICATION HERE
102                };
103
104           $parser = new Parse::RecDescent ($grammar) or die "Bad grammar!\n";
105
106           # acquire $text
107
108           defined $parser->startrule($text) or print "Bad text!\n";
109
       The rule through which parsing is initiated must be explicitly defined
       in the grammar (i.e. for the above example, the grammar must include a
       rule of the form: "startrule: <subrules>").
113
114       If the starting rule succeeds, its value (see below) is returned.
115       Failure to generate the original parser or failure to match a text is
116       indicated by returning "undef". Note that it's easy to set up grammars
117       that can succeed, but which return a value of 0, "0", or "".  So don't
118       be tempted to write:
119
120           $parser->startrule($text) or print "Bad text!\n";
121
122       Normally, the parser has no effect on the original text. So in the
123       previous example the value of $text would be unchanged after having
124       been parsed.
125
126       If, however, the text to be matched is passed by reference:
127
128           $parser->startrule(\$text)
129
130       then any text which was consumed during the match will be removed from
131       the start of $text.
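
       The sequence described above can be sketched as a small script. This
       is only an illustration: the one-rule grammar and the rule name
       "number" are assumptions made for the example, not part of the module.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical one-rule grammar; "number" is an illustrative rule name.
my $grammar = q{
    number: /[0-9]+/
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar!\n";

# A successful parse can return a false value such as "0",
# so the result is tested with defined() rather than for truth.
my $value = $parser->number("0");
print defined $value ? "matched '$value'\n" : "Bad text!\n";

# Passing a reference lets the parser remove the consumed text.
my $text  = "42 remaining";
my $first = $parser->number(\$text);
print "value '$first', unparsed text '$text'\n";
```

       Testing the result with defined() keeps the "0" case from being
       reported as a failure, and the by-reference call shows what is left
       of the input once the leading digits have been consumed.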
132
   Rules
134       In the grammar from which the parser is built, rules are specified by
135       giving an identifier (which must satisfy /[A-Za-z]\w*/), followed by a
136       colon on the same line, followed by one or more productions, separated
137       by single vertical bars. The layout of the productions is entirely
138       free-format:
139
140           rule1:  production1
141            |  production2 |
142           production3 | production4
143
144       At any point in the grammar previously defined rules may be extended
145       with additional productions. This is achieved by redeclaring the rule
146       with the new productions. Thus:
147
148           rule1: a | b | c
149           rule2: d | e | f
150           rule1: g | h
151
152       is exactly equivalent to:
153
154           rule1: a | b | c | g | h
155           rule2: d | e | f
156
157       Each production in a rule consists of zero or more items, each of which
158       may be either: the name of another rule to be matched (a "subrule"), a
159       pattern or string literal to be matched directly (a "token"), a block
160       of Perl code to be executed (an "action"), a special instruction to the
161       parser (a "directive"), or a standard Perl comment (which is ignored).
162
       A rule matches a text if one of its productions matches. A production
       matches if each of its items matches consecutive substrings of the text.
165       The productions of a rule being matched are tried in the same order
166       that they appear in the original grammar, and the first matching
167       production terminates the match attempt (successfully). If all
168       productions are tried and none matches, the match attempt fails.
169
170       Note that this behaviour is quite different from the "prefer the longer
171       match" behaviour of yacc. For example, if yacc were parsing the rule:
172
173           seq : 'A' 'B'
174           | 'A' 'B' 'C'
175
       upon matching "AB" it would look ahead to see if a 'C' is next and, if
       so, would match the second production in preference to the first. In
178       other words, yacc effectively tries all the productions of a rule
179       breadth-first in parallel, and selects the "best" match, where "best"
180       means longest (note that this is a gross simplification of the true
181       behaviour of yacc but it will do for our purposes).
182
183       In contrast, "Parse::RecDescent" tries each production depth-first in
184       sequence, and selects the "best" match, where "best" means first. This
185       is the fundamental difference between "bottom-up" and "recursive
186       descent" parsing.
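
       The "first match wins" behaviour can be observed directly. The
       following sketch reuses the "seq" rule from above and passes the text
       by reference so that the unparsed remainder is visible; the variable
       names are illustrative only.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# The grammar from the text: the shorter production is listed first.
my $parser = Parse::RecDescent->new(q{
    seq: 'A' 'B'
       | 'A' 'B' 'C'
}) or die "Bad grammar!\n";

# Pass the text by reference so the unconsumed remainder can be inspected.
my $text   = "ABC";
my $result = $parser->seq(\$text);

# The first production matches "AB" and the match attempt ends there,
# so the trailing "C" is left unparsed.
print "result '$result', unparsed text '$text'\n";
```

       Listing the longer production first would let it be tried before the
       shorter one.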
187
188       Each successfully matched item in a production is assigned a value,
189       which can be accessed in subsequent actions within the same production
190       (or, in some cases, as the return value of a successful subrule call).
191       Unsuccessful items don't have an associated value, since the failure of
192       an item causes the entire surrounding production to immediately fail.
193       The following sections describe the various types of items and their
194       success values.
195
   Subrules
197       A subrule which appears in a production is an instruction to the parser
198       to attempt to match the named rule at that point in the text being
199       parsed. If the named subrule is not defined when requested the
200       production containing it immediately fails (unless it was "autostubbed"
201       - see Autostubbing).
202
203       A rule may (recursively) call itself as a subrule, but not as the left-
204       most item in any of its productions (since such recursions are usually
205       non-terminating).
206
207       The value associated with a subrule is the value associated with its
208       $return variable (see "Actions" below), or with the last successfully
209       matched item in the subrule match.
210
211       Subrules may also be specified with a trailing repetition specifier,
212       indicating that they are to be (greedily) matched the specified number
213       of times. The available specifiers are:
214
215           subrule(?)  # Match one-or-zero times
216           subrule(s)  # Match one-or-more times
217           subrule(s?) # Match zero-or-more times
218           subrule(N)  # Match exactly N times for integer N > 0
219           subrule(N..M)   # Match between N and M times
220           subrule(..M)    # Match between 1 and M times
221           subrule(N..)    # Match at least N times
222
223       Repeated subrules keep matching until either the subrule fails to
224       match, or it has matched the minimal number of times but fails to
225       consume any of the parsed text (this second condition prevents the
226       subrule matching forever in some cases).
227
228       Since a repeated subrule may match many instances of the subrule
229       itself, the value associated with it is not a simple scalar, but rather
230       a reference to a list of scalars, each of which is the value associated
231       with one of the individual subrule matches. In other words in the rule:
232
233           program: statement(s)
234
235       the value associated with the repeated subrule "statement(s)" is a
236       reference to an array containing the values matched by each call to the
237       individual subrule "statement".
238
239       Repetition modifiers may include a separator pattern:
240
241           program: statement(s /;/)
242
243       specifying some sequence of characters to be skipped between each
244       repetition.  This is really just a shorthand for the <leftop:...>
245       directive (see below).
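
       As a rough sketch of how a repeated subrule with a separator behaves,
       the following hypothetical grammar (the rule names "list" and "number"
       are assumptions made for this example) collects comma-separated
       numbers into an array reference:

```perl
use strict;
use warnings;
use Parse::RecDescent;

# "list" and "number" are illustrative rule names.
my $parser = Parse::RecDescent->new(q{
    list:   number(s /,/)
    number: /[0-9]+/
}) or die "Bad grammar!\n";

# The value of the repeated subrule is a reference to an array holding
# the value of each individual "number" match.
my $numbers = $parser->list("1, 22, 333");
defined $numbers or die "Bad text!\n";

print scalar(@$numbers), " numbers: @$numbers\n";
```

       Because the separator here is a plain pattern without capturing
       parentheses, only the values of the "number" matches are expected in
       the resulting array.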
246
   Tokens
248       If a quote-delimited string or a Perl regex appears in a production,
249       the parser attempts to match that string or pattern at that point in
250       the text. For example:
251
252           typedef: "typedef" typename identifier ';'
253
254           identifier: /[A-Za-z_][A-Za-z0-9_]*/
255
256       As in regular Perl, a single quoted string is uninterpolated, whilst a
257       double-quoted string or a pattern is interpolated (at the time of
258       matching, not when the parser is constructed). Hence, it is possible to
259       define rules in which tokens can be set at run-time:
260
261           typedef: "$::typedefkeyword" typename identifier ';'
262
263           identifier: /$::identpat/
264
       Note that, since each rule is implemented inside a special namespace
       belonging to its parser, it is necessary to explicitly qualify
       variables from the main package.
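
       A minimal sketch of a run-time token follows. The rule names "decl"
       and "ident" and the package variable $keyword are assumptions made
       for this example; the point is only that the double-quoted token is
       interpolated each time a match is attempted.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Package variable in main, referred to as $::keyword inside the grammar.
our $keyword = 'let';

my $parser = Parse::RecDescent->new(q{
    decl:  "$::keyword" ident
    ident: /[a-z]+/
}) or die "Bad grammar!\n";

my $first = $parser->decl("let x");     # matches while $keyword is 'let'

# Changing the variable changes the token; no new parser is built.
$keyword = 'var';
my $second = $parser->decl("var y");
my $stale  = $parser->decl("let z");    # the old keyword no longer matches

print "first: $first, second: $second, stale: ",
      (defined $stale ? $stale : 'undef'), "\n";
```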
268
269       Regex tokens can be specified using just slashes as delimiters or with
270       the explicit "m<delimiter>......<delimiter>" syntax:
271
272           typedef: "typedef" typename identifier ';'
273
274           typename: /[A-Za-z_][A-Za-z0-9_]*/
275
276           identifier: m{[A-Za-z_][A-Za-z0-9_]*}
277
278       A regex of either type can also have any valid trailing parameter(s)
279       (that is, any of [cgimsox]):
280
281           typedef: "typedef" typename identifier ';'
282
283           identifier: / [a-z_]        # LEADING ALPHA OR UNDERSCORE
284                 [a-z0-9_]*    # THEN DIGITS ALSO ALLOWED
285               /ix     # CASE/SPACE/COMMENT INSENSITIVE
286
287       The value associated with any successfully matched token is a string
288       containing the actual text which was matched by the token.
289
290       It is important to remember that, since each grammar is specified in a
291       Perl string, all instances of the universal escape character '\' within
292       a grammar must be "doubled", so that they interpolate to single '\'s
293       when the string is compiled. For example, to use the grammar:
294
295           word:       /\S+/ | backslash
296           line:       prefix word(s) "\n"
297           backslash:  '\\'
298
299       the following code is required:
300
301           $parser = new Parse::RecDescent (q{
302
303               word:   /\\S+/ | backslash
304               line:   prefix word(s) "\\n"
305               backslash:  '\\\\'
306
307           });
308
   Anonymous subrules
       Parentheses introduce a nested scope that is very like a call to an
       anonymous subrule. Hence they are useful for "in-lining" subrule
       calls, and other kinds of grouping behaviour. For example, instead of:
313
314           word:       /\S+/ | backslash
315           line:       prefix word(s) "\n"
316
317       you could write:
318
319           line:       prefix ( /\S+/ | backslash )(s) "\n"
320
321       and get exactly the same effects.
322
       Parentheses are also used for collecting unrepeated alternations within
       a single production:
325
326           secret_identity: "Mr" ("Incredible"|"Fantastic"|"Sheen") ", Esq."
327
   Terminal Separators
329       For the purpose of matching, each terminal in a production is
330       considered to be preceded by a "prefix" - a pattern which must be
331       matched before a token match is attempted. By default, the prefix is
332       optional whitespace (which always matches, at least trivially), but
333       this default may be reset in any production.
334
335       The variable $Parse::RecDescent::skip stores the universal prefix,
336       which is the default for all terminal matches in all parsers built with
337       "Parse::RecDescent".
338
339       The prefix for an individual production can be altered by using the
340       "<skip:...>" directive (see below).
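
       To illustrate the effect of a per-production prefix, here is a small
       sketch in which newlines are made significant by skipping only spaces
       and tabs. The rule name "pair" is an assumption made for the example,
       and the backslashes are doubled as described under "Tokens".

```perl
use strict;
use warnings;
use Parse::RecDescent;

# "pair" is an illustrative rule: two words on one line, then a newline.
# The <skip:...> directive replaces the default prefix for the terminals
# that follow it in this production, so newlines are no longer skipped.
my $parser = Parse::RecDescent->new(q{
    pair: <skip: '[ \\t]*'> /[a-z]+/ /[a-z]+/ "\\n"
}) or die "Bad grammar!\n";

my $same_line  = $parser->pair("ab cd\n");    # both words on one line
my $split_line = $parser->pair("ab\ncd\n");   # newline between the words

print "same line:  ", (defined $same_line  ? "matched" : "failed"), "\n";
print "split line: ", (defined $split_line ? "matched" : "failed"), "\n";
```

       With the default prefix the second text would also be accepted up to
       the second word, since the intervening newline would be skipped as
       ordinary whitespace.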
341
   Actions
343       An action is a block of Perl code which is to be executed (as the block
344       of a "do" statement) when the parser reaches that point in a
345       production. The action executes within a special namespace belonging to
346       the active parser, so care must be taken in correctly qualifying
347       variable names (see also "Start-up Actions" below).
348
349       The action is considered to succeed if the final value of the block is
350       defined (that is, if the implied "do" statement evaluates to a defined
351       value - even one which would be treated as "false"). Note that the
352       value associated with a successful action is also the final value in
353       the block.
354
355       An action will fail if its last evaluated value is "undef". This is
356       surprisingly easy to accomplish by accident. For instance, here's an
357       infuriating case of an action that makes its production fail, but only
358       when debugging isn't activated:
359
360           description: name rank serial_number
361               { print "Got $item[2] $item[1] ($item[3])\n"
362               if $::debugging
363               }
364
       If $::debugging is false, no statement in the block is executed, so the
       final value is "undef", and the entire production fails. The solution
       is:
368
369           description: name rank serial_number
370               { print "Got $item[2] $item[1] ($item[3])\n"
371               if $::debugging;
372                 1;
373               }
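
       The rule that an action succeeds whenever its final value is defined
       can also be used deliberately. The sketch below is illustrative only:
       the rule name "byte" and the range check are assumptions. It uses an
       action as a validity test that returns "undef" to reject a match.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# "byte" is an illustrative rule: the action yields the number when it is
# in range and undef otherwise, which makes the production fail.
my $parser = Parse::RecDescent->new(q{
    byte: /[0-9]+/
        { $item[1] <= 255 ? $item[1] + 0 : undef }
}) or die "Bad grammar!\n";

my $zero = $parser->byte("0");      # value 0: false, but defined
my $big  = $parser->byte("300");    # action returns undef

print "0   => ", (defined $zero ? "matched ($zero)" : "failed"), "\n";
print "300 => ", (defined $big  ? "matched ($big)"  : "failed"), "\n";
```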
374
375       Within an action, a number of useful parse-time variables are available
376       in the special parser namespace (there are other variables also
377       accessible, but meddling with them will probably just break your
378       parser. As a general rule, if you avoid referring to unqualified
379       variables - especially those starting with an underscore - inside an
380       action, things should be okay):
381
382       @item and %item
383           The array slice @item[1..$#item] stores the value associated with
384           each item (that is, each subrule, token, or action) in the current
385           production. The analogy is to $1, $2, etc. in a yacc grammar.  Note
386           that, for obvious reasons, @item only contains the values of items
387           before the current point in the production.
388
389           The first element ($item[0]) stores the name of the current rule
390           being matched.
391
392           @item is a standard Perl array, so it can also be indexed with
393           negative numbers, representing the number of items back from the
394           current position in the parse:
395
396               stuff: /various/ bits 'and' pieces "then" data 'end'
397                   { print $item[-2] }  # PRINTS data
398                        # (EASIER THAN: $item[6])
399
           The %item hash complements the @item array, providing named
           access to the same item values:

               stuff: /various/ bits 'and' pieces "then" data 'end'
                   { print $item{data} }  # PRINTS data
                        # (EVEN EASIER THAN USING @item)
406
407           The results of named subrules are stored in the hash under each
408           subrule's name (including the repetition specifier, if any), whilst
           all other items are stored under a "named positional" key that
           indicates their ordinal position within their item type:
411           __STRINGn__, __PATTERNn__, __DIRECTIVEn__, __ACTIONn__:
412
413               stuff: /various/ bits 'and' pieces "then" data 'end' { save }
414                   { print $item{__PATTERN1__}, # PRINTS 'various'
415                   $item{__STRING2__},  # PRINTS 'then'
416                   $item{__ACTION1__},  # PRINTS RETURN
417                            # VALUE OF save
418                   }
419
420           If you want proper named access to patterns or literals, you need
421           to turn them into separate rules:
422
423               stuff: various bits 'and' pieces "then" data 'end'
424                   { print $item{various}  # PRINTS various
425                   }
426
427               various: /various/
428
           The special entry $item{__RULE__} stores the name of the current
           rule (i.e. the same value as $item[0]).

           The advantage of using %item instead of @item is that it removes
           the need to track item positions that may change as a grammar
           evolves. For example, adding an interim "<skip>" directive or
           action can silently ruin a trailing action, by moving an @item
           element "down" the array one place. In contrast, the named entry of
           %item is unaffected by such an insertion.
438
439           A limitation of the %item hash is that it only records the last
440           value of a particular subrule. For example:
441
               range: '(' number '..' number ')'
                   { $return = $item{number} }
444
445           will return only the value corresponding to the second match of the
446           "number" subrule. In other words, successive calls to a subrule
447           overwrite the corresponding entry in %item. Once again, the
448           solution is to rename each subrule in its own rule:
449
               range: '(' from_num '..' to_num ')'
                   { $return = $item{from_num} }

               from_num: number
               to_num:   number
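
           A runnable sketch of this renaming technique follows. The
           "number" rule and the decision to return both bounds as an array
           reference are assumptions added for the example.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# from_num and to_num wrap the same "number" rule so that each match
# gets its own entry in %item.
my $parser = Parse::RecDescent->new(q{
    range:    '(' from_num '..' to_num ')'
                  { $return = [ $item{from_num}, $item{to_num} ] }
    from_num: number
    to_num:   number
    number:   /[0-9]+/
}) or die "Bad grammar!\n";

my $bounds = $parser->range("(3..17)");
defined $bounds or die "Bad text!\n";

print "from $bounds->[0] to $bounds->[1]\n";
```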
455
456       @arg and %arg
457           The array @arg and the hash %arg store any arguments passed to the
           rule from some other rule (see "Subrule argument lists"). Changes
459           to the elements of either variable do not propagate back to the
460           calling rule (data can be passed back from a subrule via the
461           $return variable - see next item).
462
463       $return
464           If a value is assigned to $return within an action, that value is
465           returned if the production containing the action eventually matches
466           successfully. Note that setting $return doesn't cause the current
467           production to succeed. It merely tells it what to return if it does
468           succeed.  Hence $return is analogous to $$ in a yacc grammar.
469
470           If $return is not assigned within a production, the value of the
471           last component of the production (namely: $item[$#item]) is
472           returned if the production succeeds.
473
474       $commit
475           The current state of commitment to the current production (see
476           "Directives" below).
477
478       $skip
479           The current terminal prefix (see "Directives" below).
480
481       $text
482           The remaining (unparsed) text. Changes to $text do not propagate
483           out of unsuccessful productions, but do survive successful
484           productions. Hence it is possible to dynamically alter the text
485           being parsed - for example, to provide a "#include"-like facility:
486
487               hash_include: '#include' filename
488                   { $text = ::loadfile($item[2]) . $text }
489
490               filename: '<' /[a-z0-9._-]+/i '>'  { $return = $item[2] }
491               | '"' /[a-z0-9._-]+/i '"'  { $return = $item[2] }
492
493       $thisline and $prevline
494           $thisline stores the current line number within the current parse
495           (starting from 1). $prevline stores the line number for the last
496           character which was already successfully parsed (this will be
497           different from $thisline at the end of each line).
498
           For efficiency, $thisline and $prevline are actually tied scalars,
500           and only recompute the required line number when the variable's
501           value is used.
502
503           Assignment to $thisline adjusts the line number calculator, so that
504           it believes that the current line number is the value being
505           assigned. Note that this adjustment will be reflected in all
           subsequent line number calculations.
507
508           Modifying the value of the variable $text (as in the previous
509           "hash_include" example, for instance) will confuse the line
510           counting mechanism. To prevent this, you should call
511           "Parse::RecDescent::LineCounter::resync($thisline)" immediately
512           after any assignment to the variable $text (or, at least, before
513           the next attempt to use $thisline).
514
515           Note that if a production fails after assigning to or resync'ing
516           $thisline, the parser's line counter mechanism will usually be
517           corrupted.
518
519           Also see the entry for @itempos.
520
521           The line number can be set to values other than 1, by calling the
522           start rule with a second argument. For example:
523
524               $parser = new Parse::RecDescent ($grammar);
525
526               $parser->input($text, 10);  # START LINE NUMBERS AT 10
527
528       $thiscolumn and $prevcolumn
529           $thiscolumn stores the current column number within the current
530           line being parsed (starting from 1). $prevcolumn stores the column
531           number of the last character which was actually successfully
532           parsed. Usually "$prevcolumn == $thiscolumn-1", but not at the end
533           of lines.
534
           For efficiency, $thiscolumn and $prevcolumn are actually tied
           scalars, and only recompute the required column number when the
537           variable's value is used.
538
539           Assignment to $thiscolumn or $prevcolumn is a fatal error.
540
541           Modifying the value of the variable $text (as in the previous
542           "hash_include" example, for instance) may confuse the column
543           counting mechanism.
544
545           Note that $thiscolumn reports the column number before any
546           whitespace that might be skipped before reading a token. Hence if
547           you wish to know where a token started (and ended) use something
548           like this:
549
550               rule: token1 token2 startcol token3 endcol token4
551                   { print "token3: columns $item[3] to $item[5]"; }
552
553               startcol: '' { $thiscolumn }    # NEED THE '' TO STEP PAST TOKEN SEP
554               endcol:  { $prevcolumn }
555
556           Also see the entry for @itempos.
557
558       $thisoffset and $prevoffset
559           $thisoffset stores the offset of the current parsing position
560           within the complete text being parsed (starting from 0).
561           $prevoffset stores the offset of the last character which was
562           actually successfully parsed. In all cases "$prevoffset ==
563           $thisoffset-1".
564
           For efficiency, $thisoffset and $prevoffset are actually tied
           scalars, and only recompute the required offset when the variable's
567           value is used.
568
           Assignment to $thisoffset or $prevoffset is a fatal error.
570
571           Modifying the value of the variable $text will not affect the
572           offset counting mechanism.
573
574           Also see the entry for @itempos.
575
576       @itempos
577           The array @itempos stores a hash reference corresponding to each
578           element of @item. The elements of the hash provide the following:
579
580               $itempos[$n]{offset}{from}  # VALUE OF $thisoffset BEFORE $item[$n]
581               $itempos[$n]{offset}{to}    # VALUE OF $prevoffset AFTER $item[$n]
582               $itempos[$n]{line}{from}    # VALUE OF $thisline BEFORE $item[$n]
583               $itempos[$n]{line}{to}  # VALUE OF $prevline AFTER $item[$n]
584               $itempos[$n]{column}{from}  # VALUE OF $thiscolumn BEFORE $item[$n]
585               $itempos[$n]{column}{to}    # VALUE OF $prevcolumn AFTER $item[$n]
586
587           Note that the various "$itempos[$n]...{from}" values record the
588           appropriate value after any token prefix has been skipped.
589
590           Hence, instead of the somewhat tedious and error-prone:
591
592               rule: startcol token1 endcol
593                 startcol token2 endcol
594                 startcol token3 endcol
595                   { print "token1: columns $item[1]
596                         to $item[3]
597                    token2: columns $item[4]
598                         to $item[6]
599                    token3: columns $item[7]
600                         to $item[9]" }
601
602               startcol: '' { $thiscolumn }    # NEED THE '' TO STEP PAST TOKEN SEP
603               endcol:  { $prevcolumn }
604
605           it is possible to write:
606
607               rule: token1 token2 token3
608                   { print "token1: columns $itempos[1]{column}{from}
609                         to $itempos[1]{column}{to}
610                    token2: columns $itempos[2]{column}{from}
611                         to $itempos[2]{column}{to}
612                    token3: columns $itempos[3]{column}{from}
613                         to $itempos[3]{column}{to}" }
614
615           Note however that (in the current implementation) the use of
616           @itempos anywhere in a grammar implies that item positioning
617           information is collected everywhere during the parse. Depending on
618           the grammar and the size of the text to be parsed, this may be
619           prohibitively expensive and the explicit use of $thisline,
620           $thiscolumn, etc. may be a better choice.
621
622       $thisparser
623           A reference to the "Parse::RecDescent" object through which parsing
624           was initiated.
625
626           The value of $thisparser propagates down the subrules of a parse
627           but not back up. Hence, you can invoke subrules from another parser
628           for the scope of the current rule as follows:
629
630               rule: subrule1 subrule2
631               | { $thisparser = $::otherparser } <reject>
632               | subrule3 subrule4
633               | subrule5
634
           The result is that the first production calls "subrule1" and
           "subrule2" of the current parser, and the remaining productions
           call the named subrules from $::otherparser. Note, however, that
           "Bad Things" will happen if $::otherparser isn't a blessed
           reference and/or doesn't have methods with the same names as the
           required subrules!
640
641       $thisrule
642           A reference to the "Parse::RecDescent::Rule" object corresponding
643           to the rule currently being matched.
644
645       $thisprod
646           A reference to the "Parse::RecDescent::Production" object
647           corresponding to the production currently being matched.
648
649       $score and $score_return
650           $score stores the best production score to date, as specified by an
651           earlier "<score:...>" directive. $score_return stores the
652           corresponding return value for the successful production.
653
654           See "Scored productions".
655
656       Warning: the parser relies on the information in the various "this..."
657       objects in some non-obvious ways. Tinkering with the other members of
658       these objects will probably cause Bad Things to happen, unless you
659       really know what you're doing. The only exception to this advice is
660       that the use of "$this...->{local}" is always safe.
661
   Start-up Actions
663       Any actions which appear before the first rule definition in a grammar
664       are treated as "start-up" actions. Each such action is stripped of its
665       outermost brackets and then evaluated (in the parser's special
666       namespace) just before the rules of the grammar are first compiled.
667
668       The main use of start-up actions is to declare local variables within
669       the parser's special namespace:
670
           { my $lastitem = '???'; }

           list: item(s)   { $return = $lastitem }

           item: book      { $lastitem = 'book'; }
               | bell      { $lastitem = 'bell'; }
               | candle    { $lastitem = 'candle'; }
678
679       but start-up actions can be used to execute any valid Perl code within
680       a parser's special namespace.
681
682       Start-up actions can appear within a grammar extension or replacement
683       (that is, a partial grammar installed via "Parse::RecDescent::Extend()"
684       or "Parse::RecDescent::Replace()" - see "Incremental Parsing"), and
685       will be executed before the new grammar is installed. Note, however,
686       that a particular start-up action is only ever executed once.
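
       As a self-contained sketch of a start-up action, the following
       hypothetical grammar declares a counter that the actions of its rules
       then share. The names "list", "item" and $count are assumptions made
       for this example.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# The first block is a start-up action: it declares a lexical variable
# that the actions in the rules below can see.
my $parser = Parse::RecDescent->new(q{
    { my $count = 0; }

    list: item(s)    { $return = $count }
    item: /[a-z]+/   { $count++; 1 }
}) or die "Bad grammar!\n";

my $total = $parser->list("book bell candle");
defined $total or die "Bad text!\n";

print "matched $total items\n";
```

       Since the start-up action runs only once, the counter in this sketch
       is not reset between parses; a second call to "list" on the same
       parser would keep counting from the previous total.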
687
   Autoactions
689       It is sometimes desirable to be able to specify a default action to be
690       taken at the end of every production (for example, in order to easily
691       build a parse tree). If the variable $::RD_AUTOACTION is defined when
692       "Parse::RecDescent::new()" is called, the contents of that variable are
       treated as a specification of an action which is to be appended to each
694       production in the corresponding grammar.
695
696       Alternatively, you can hard-code the autoaction within a grammar, using
697       the "<autoaction:...>" directive.
698
699       So, for example, to construct a simple parse tree you could write:
700
701           $::RD_AUTOACTION = q { [@item] };
702
           $parser = Parse::RecDescent->new(q{
704           expression: and_expr '||' expression | and_expr
705           and_expr:   not_expr '&&' and_expr   | not_expr
706           not_expr:   '!' brack_expr       | brack_expr
707           brack_expr: '(' expression ')'       | identifier
708           identifier: /[a-z]+/i
709           });
710
711       or:
712
           $parser = Parse::RecDescent->new(q{
714           <autoaction: { [@item] } >
715
716           expression: and_expr '||' expression | and_expr
717           and_expr:   not_expr '&&' and_expr   | not_expr
718           not_expr:   '!' brack_expr       | brack_expr
719           brack_expr: '(' expression ')'       | identifier
720           identifier: /[a-z]+/i
721           });
722
723       Either of these is equivalent to:
724
           $parser = new Parse::RecDescent (q{
726           expression: and_expr '||' expression
727               { [@item] }
728             | and_expr
729               { [@item] }
730
731           and_expr:   not_expr '&&' and_expr
732               { [@item] }
733           |   not_expr
734               { [@item] }
735
736           not_expr:   '!' brack_expr
737               { [@item] }
738           |   brack_expr
739               { [@item] }
740
741           brack_expr: '(' expression ')'
742               { [@item] }
743             | identifier
744               { [@item] }
745
746           identifier: /[a-z]+/i
747               { [@item] }
748           });
749
       Alternatively, we could take an object-oriented approach, using
       different classes for each node (and also eliminating redundant
       intermediate nodes):
753
754           $::RD_AUTOACTION = q
755             { $#item==1 ? $item[1] : "$item[0]_node"->new(@item[1..$#item]) };
756
           $parser = Parse::RecDescent->new(q{
758               expression: and_expr '||' expression | and_expr
759               and_expr:   not_expr '&&' and_expr   | not_expr
760               not_expr:   '!' brack_expr           | brack_expr
761               brack_expr: '(' expression ')'       | identifier
762               identifier: /[a-z]+/i
763           });
764
765       or:
766
           $parser = Parse::RecDescent->new(q{
768               <autoaction:
769                 $#item==1 ? $item[1] : "$item[0]_node"->new(@item[1..$#item])
770               >
771
772               expression: and_expr '||' expression | and_expr
773               and_expr:   not_expr '&&' and_expr   | not_expr
774               not_expr:   '!' brack_expr           | brack_expr
775               brack_expr: '(' expression ')'       | identifier
776               identifier: /[a-z]+/i
777           });
778
779       which are equivalent to:
780
           $parser = Parse::RecDescent->new(q{
782               expression: and_expr '||' expression
783                   { "expression_node"->new(@item[1..3]) }
784               | and_expr
785
786               and_expr:   not_expr '&&' and_expr
787                   { "and_expr_node"->new(@item[1..3]) }
788               |   not_expr
789
790               not_expr:   '!' brack_expr
791                   { "not_expr_node"->new(@item[1..2]) }
792               |   brack_expr
793
794               brack_expr: '(' expression ')'
795                   { "brack_expr_node"->new(@item[1..3]) }
796               | identifier
797
798               identifier: /[a-z]+/i
                   { "identifier_node"->new($item[1]) }
800           });
801
802       Note that, if a production already ends in an action, no autoaction is
803       appended to it. For example, in this version:
804
805           $::RD_AUTOACTION = q
806             { $#item==1 ? $item[1] : "$item[0]_node"->new(@item[1..$#item]) };
807
           $parser = Parse::RecDescent->new(q{
               expression: and_expr '||' expression | and_expr
810               and_expr:   not_expr '&&' and_expr   | not_expr
811               not_expr:   '!' brack_expr           | brack_expr
812               brack_expr: '(' expression ')'       | identifier
813               identifier: /[a-z]+/i
814                   { 'terminal_node'->new($item[1]) }
815           });
816
817       each "identifier" match produces a "terminal_node" object, not an
818       "identifier_node" object.
819
820       A level 1 warning is issued each time an "autoaction" is added to some
821       production.
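
       As a rough end-to-end sketch of the "<autoaction:...>" form, the
       following builds the array-based tree for one illustrative input and
       dumps it. The input string and the use of Data::Dumper are choices
       made for this example only:

```perl
use strict;
use warnings;
use Parse::RecDescent;
use Data::Dumper;

my $parser = Parse::RecDescent->new(q{
    <autoaction: { [@item] } >

    expression: and_expr '||' expression | and_expr
    and_expr:   not_expr '&&' and_expr   | not_expr
    not_expr:   '!' brack_expr           | brack_expr
    brack_expr: '(' expression ')'       | identifier
    identifier: /[a-z]+/i
}) or die "Bad grammar!\n";

my $tree = $parser->expression('a && !b');
die "Parse failed\n" unless defined $tree;

# Every node has the shape [ rule_name, matched_item, ... ],
# because @item starts with the name of the rule.
local $Data::Dumper::Indent = 1;
print Dumper($tree);
```

       Since no production here ends in an explicit action, every one of
       them receives the autoaction.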
822
823   Autotrees
824       A commonly needed autoaction is one that builds a parse-tree. It is
825       moderately tricky to set up such an action (which must treat terminals
826       differently from non-terminals), so Parse::RecDescent simplifies the
827       process by providing the "<autotree>" directive.
828
       If this directive appears at the start of a grammar, it causes
       Parse::RecDescent to insert autoactions at the end of any production
       except those which already end in an action. The action inserted
       depends on whether the production is an intermediate rule (two or more
       items), or a terminal of the grammar (i.e. a single pattern or string
       item).
834
835       So, for example, the following grammar:
836
837           <autotree>
838
839           file    : command(s)
840           command : get | set | vet
841           get : 'get' ident ';'
842           set : 'set' ident 'to' value ';'
843           vet : 'check' ident 'is' value ';'
844           ident   : /\w+/
845           value   : /\d+/
846
847       is equivalent to:
848
           file    : command(s)                   { bless \%item, $item[0] }
           command : get                          { bless \%item, $item[0] }
                   | set                          { bless \%item, $item[0] }
                   | vet                          { bless \%item, $item[0] }
           get     : 'get' ident ';'              { bless \%item, $item[0] }
           set     : 'set' ident 'to' value ';'   { bless \%item, $item[0] }
           vet     : 'check' ident 'is' value ';' { bless \%item, $item[0] }

           ident   : /\w+/  { bless {__VALUE__=>$item[1]}, $item[0] }
           value   : /\d+/  { bless {__VALUE__=>$item[1]}, $item[0] }
859
860       Note that each node in the tree is blessed into a class of the same
861       name as the rule itself. This makes it easy to build object-oriented
862       processors for the parse-trees that the grammar produces. Note too that
863       the last two rules produce special objects with the single attribute
864       '__VALUE__'. This is because they consist solely of a single terminal.
865
866       This autoaction-ed grammar would then produce a parse tree in a data
867       structure like this:
868
           {
             file => {
               command => [
                 { get => {
                     ident => { __VALUE__ => 'a' },
                   },
                 },
                 { set => {
                     ident => { __VALUE__ => 'b' },
                     value => { __VALUE__ => '7' },
                   },
                 },
                 { vet => {
                     ident => { __VALUE__ => 'b' },
                     value => { __VALUE__ => '7' },
                   },
                 },
               ],
             }
           }
887
888       (except, of course, that each nested hash would also be blessed into
889       the appropriate class).
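
       A short sketch of walking such a tree, using an illustrative input.
       It assumes that node hashes are keyed exactly as %item is, so that
       the repeated subrule is stored under 'command(s)' and each node also
       carries bookkeeping entries such as "__RULE__":

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    <autotree>

    file    : command(s)
    command : get | set | vet
    get     : 'get' ident ';'
    set     : 'set' ident 'to' value ';'
    vet     : 'check' ident 'is' value ';'
    ident   : /\w+/
    value   : /\d+/
}) or die "Bad grammar!\n";

my $tree = $parser->file('get a; set b to 7; check b is 7;');
die "Parse failed\n" unless defined $tree;

print "Root node class: ", ref($tree), "\n";

for my $command (@{ $tree->{'command(s)'} }) {
    # Each command node holds exactly one of get/set/vet.
    my ($kind) = grep { exists $command->{$_} } qw(get set vet);
    my $node   = $command->{$kind};
    my $ident  = $node->{ident}{__VALUE__};
    my $value  = exists $node->{value} ? $node->{value}{__VALUE__} : '(none)';
    print "$kind: ident=$ident value=$value\n";
}
```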
890
891   Autostubbing
892       Normally, if a subrule appears in some production, but no rule of that
893       name is ever defined in the grammar, the production which refers to the
894       non-existent subrule fails immediately. This typically occurs as a
       result of misspellings, and is a sufficiently common occurrence that a
896       warning is generated for such situations.
897
898       However, when prototyping a grammar it is sometimes useful to be able
899       to use subrules before a proper specification of them is really
900       possible.  For example, a grammar might include a section like:
901
902           function_call: identifier '(' arg(s?) ')'
903
904           identifier: /[a-z]\w*/i
905
906       where the possible format of an argument is sufficiently complex that
907       it is not worth specifying in full until the general function call
908       syntax has been debugged. In this situation it is convenient to leave
909       the real rule "arg" undefined and just slip in a placeholder (or
910       "stub"):
911
912           arg: 'arg'
913
914       so that the function call syntax can be tested with dummy input such
915       as:
916
917           f0()
918           f1(arg)
919           f2(arg arg)
920           f3(arg arg arg)
921
922       et cetera.
923
924       Early in prototyping, many such "stubs" may be required, so
925       "Parse::RecDescent" provides a means of automating their definition.
926       If the variable $::RD_AUTOSTUB is defined when a parser is built, a
927       subrule reference to any non-existent rule (say, "sr"), causes a "stub"
928       rule of the form:
929
930           sr: 'sr'
931
932       to be automatically defined in the generated parser.  A level 1 warning
933       is issued for each such "autostubbed" rule.
934
       Hence, with $::RD_AUTOSTUB defined, it is possible to only partially
936       specify a grammar, and then "fake" matches of the unspecified
937       (sub)rules by just typing in their name.
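
       The function-call prototype above might be exercised roughly as
       follows. This is a sketch only: the test inputs are illustrative,
       and $::RD_AUTOSTUB is set to 1 simply to make it defined:

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Must be defined before the parser is built.
$::RD_AUTOSTUB = 1;

# "arg" is deliberately left undefined, so a stub (arg: 'arg') is
# generated for it.
my $parser = Parse::RecDescent->new(q{
    function_call: identifier '(' arg(s?) ')'

    identifier: /[a-z]\w*/i
}) or die "Bad grammar!\n";

for my $input ('f0()', 'f1(arg)', 'f2(arg arg)', 'f1(other)') {
    my $ok = defined $parser->function_call($input);
    print "$input: ", ($ok ? "matched" : "failed"), "\n";
}
```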
938
939   Look-ahead
940       If a subrule, token, or action is prefixed by "...", then it is treated
941       as a "look-ahead" request. That means that the current production can
942       (as usual) only succeed if the specified item is matched, but that the
943       matching does not consume any of the text being parsed. This is very
944       similar to the "/(?=...)/" look-ahead construct in Perl patterns. Thus,
945       the rule:
946
947           inner_word: word ...word
948
949       will match whatever the subrule "word" matches, provided that match is
950       followed by some more text which subrule "word" would also match
951       (although this second substring is not actually consumed by
       "inner_word").
953
       Likewise, a "...!" prefix causes the following item to succeed
955       (without consuming any text) if and only if it would normally fail.
956       Hence, a rule such as:
957
958           identifier: ...!keyword ...!'_' /[A-Za-z_]\w*/
959
960       matches a string of characters which satisfies the pattern
961       "/[A-Za-z_]\w*/", but only if the same sequence of characters would not
962       match either subrule "keyword" or the literal token '_'.
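
       A small sketch of negative look-ahead in use. The "keyword" rule and
       its word list are hypothetical, chosen only to show an identifier
       being refused when it is also a keyword:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    # An identifier is a word that the keyword rule would not match.
    identifier: ...!keyword /[A-Za-z_]\w*/

    # Hypothetical keyword list; \b keeps it from matching word prefixes.
    keyword: /(?:if|else|while)\b/
}) or die "Bad grammar!\n";

for my $word (qw(while whilst)) {
    my $result = $parser->identifier($word);
    print "$word: ", (defined $result ? "identifier" : "rejected"), "\n";
}
```

       The look-ahead consumes nothing, so on success the pattern that
       follows it still sees the whole word.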
963
964       Sequences of look-ahead prefixes accumulate, multiplying their positive
965       and/or negative senses. Hence:
966
967           inner_word: word ...!......!word
968
       is exactly equivalent to the original example above (a warning is
970       issued in cases like these, since they often indicate something left
971       out, or misunderstood).
972
973       Note that actions can also be treated as look-aheads. In such cases,
974       the state of the parser text (in the local variable $text) after the
975       look-ahead action is guaranteed to be identical to its state before the
976       action, regardless of how it's changed within the action (unless you
977       actually undefine $text, in which case you get the disaster you deserve
978       :-).
979
980   Directives
       Directives are special pre-defined actions which may be used to alter
       the behaviour of the parser. There are currently twenty-two
       directives: "<commit>", "<uncommit>", "<reject>", "<score>",
       "<autoscore>", "<skip>", "<resync>", "<error>", "<warn>", "<hint>",
       "<trace_build>", "<trace_parse>", "<nocheck>", "<rulevar>",
       "<matchrule>", "<leftop>", "<rightop>", "<defer>",
       "<perl_quotelike>", "<perl_codeblock>", "<perl_variable>", and
       "<token>".
989
990       Committing and uncommitting
991           The "<commit>" and "<uncommit>" directives permit the recursive
992           descent of the parse tree to be pruned (or "cut") for efficiency.
993           Within a rule, a "<commit>" directive instructs the rule to ignore
994           subsequent productions if the current production fails. For
995           example:
996
997               command: 'find' <commit> filename
998                  | 'open' <commit> filename
999                  | 'move' filename filename
1000
1001           Clearly, if the leading token 'find' is matched in the first
1002           production but that production fails for some other reason, then
1003           the remaining productions cannot possibly match. The presence of
1004           the "<commit>" causes the "command" rule to fail immediately if an
1005           invalid "find" command is found, and likewise if an invalid "open"
1006           command is encountered.
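
           The effect of a commitment can be observed with a sketch like
           the one below. The "lenient" rule, the catch-all "word(s)"
           production and the ".txt" filename pattern are all hypothetical,
           introduced only to make the difference visible:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    # Once 'find' is seen, no later production is tried.
    command: 'find' <commit> filename   { "find $item{filename}" }
           | word(s)                    { "other command" }

    # The same rule without the commitment.
    lenient: 'find' filename            { "find $item{filename}" }
           | word(s)                    { "other command" }

    filename: /[\w.]+\.txt/
    word:     /\w+/
}) or die "Bad grammar!\n";

for my $rule (qw(command lenient)) {
    for my $input ('find notes.txt', 'find', 'list all') {
        my $result = $parser->$rule($input);
        printf "%-8s %-16s => %s\n", $rule, "'$input'",
            defined $result ? $result : '(failed)';
    }
}
```

           With the commitment in place, a bare 'find' makes the whole rule
           fail, whereas the uncommitted version falls through to the
           catch-all production.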
1007
1008           It is also possible to revoke a previous commitment. For example:
1009
1010               if_statement: 'if' <commit> condition
1011                   'then' block <uncommit>
1012                   'else' block
1013                   | 'if' <commit> condition
1014                   'then' block
1015
1016           In this case, a failure to find an "else" block in the first
1017           production shouldn't preclude trying the second production, but a
1018           failure to find a "condition" certainly should.
1019
1020           As a special case, any production in which the first item is an
1021           "<uncommit>" immediately revokes a preceding "<commit>" (even
1022           though the production would not otherwise have been tried). For
1023           example, in the rule:
1024
1025               request: 'explain' expression
1026                  | 'explain' <commit> keyword
1027                  | 'save'
1028                  | 'quit'
1029                  | <uncommit> term '?'
1030
1031           if the text being matched was "explain?", and the first two
1032           productions failed, then the "<commit>" in production two would
1033           cause productions three and four to be skipped, but the leading
           "<uncommit>" in production five would allow that production to
1035           attempt a match.
1036
           Note, in the preceding example, that the "<commit>" was only placed
1038           in production two. If production one had been:
1039
1040               request: 'explain' <commit> expression
1041
1042           then production two would be (inappropriately) skipped if a leading
1043           "explain..." was encountered.
1044
1045           Both "<commit>" and "<uncommit>" directives always succeed, and
1046           their value is always 1.
1047
1048       Rejecting a production
1049           The "<reject>" directive immediately causes the current production
1050           to fail (it is exactly equivalent to, but more obvious than, the
1051           action "{undef}"). A "<reject>" is useful when it is desirable to
1052           get the side effects of the actions in one production, without
1053           prejudicing a match by some other production later in the rule. For
1054           example, to insert tracing code into the parse:
1055
1056               complex_rule: { print "In complex rule...\n"; } <reject>
1057
1058               complex_rule: simple_rule '+' 'i' '*' simple_rule
1059                   | 'i' '*' simple_rule
1060                   | simple_rule
1061
1062           It is also possible to specify a conditional rejection, using the
1063           form "<reject:condition>", which only rejects if the specified
1064           condition is true. This form of rejection is exactly equivalent to
           the action "{(condition)?undef:1}".  For example:
1066
1067               command: save_command
1068                  | restore_command
1069                  | <reject: defined $::tolerant> { exit }
1070                  | <error: Unknown command. Ignored.>
1071
1072           A "<reject>" directive never succeeds (and hence has no associated
1073           value). A conditional rejection may succeed (if its condition is
1074           not satisfied), in which case its value is 1.
1075
1076           As an extra optimization, "Parse::RecDescent" ignores any
1077           production which begins with an unconditional "<reject>" directive,
1078           since any such production can never successfully match or have any
1079           useful side-effects. A level 1 warning is issued in all such cases.
1080
1081           Note that productions beginning with conditional "<reject:...>"
1082           directives are never "optimized away" in this manner, even if they
           are always guaranteed to fail (for example: "<reject:1>").
1084
1085           Due to the way grammars are parsed, there is a minor restriction on
1086           the condition of a conditional "<reject:...>": it cannot contain
1087           any raw '<' or '>' characters. For example:
1088
1089               line: cmd <reject: $thiscolumn > max> data
1090
           results in an error when a parser is built from this grammar (since
           the grammar parser has no way of knowing whether the first > is a
           "greater than" or the end of the "<reject:...>").
1094
1095           To overcome this problem, put the condition inside a do{} block:
1096
1097               line: cmd <reject: do{$thiscolumn > max}> data
1098
1099           Note that the same problem may occur in other directives that take
1100           arguments. The same solution will work in all cases.
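
           As an illustration, a conditional rejection with a do{} block
           can be used to range-check a matched value. The "byte" rule and
           the limit of 255 are hypothetical:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    # The do{} block hides the '>' from the grammar parser.
    byte: /\d+/ <reject: do { $item[1] > 255 }> { $item[1] }
}) or die "Bad grammar!\n";

for my $input (qw(200 300)) {
    my $value = $parser->byte($input);
    print "$input: ", (defined $value ? "accepted" : "rejected"), "\n";
}
```

           The trailing action is needed because the value of a successful
           conditional rejection is 1, which would otherwise become the
           value of the production.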
1101
1102       Skipping between terminals
1103           The "<skip>" directive enables the terminal prefix used in a
1104           production to be changed. For example:
1105
1106               OneLiner: Command <skip:'[ \t]*'> Arg(s) /;/
1107
1108           causes only blanks and tabs to be skipped before terminals in the
           "Arg" subrule (and any of its subrules), and also before the final
1110           "/;/" terminal.  Once the production is complete, the previous
1111           terminal prefix is reinstated. Note that this implies that distinct
1112           productions of a rule must reset their terminal prefixes
1113           individually.
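
           A brief sketch of the difference a changed prefix makes. The
           rule names and the two-line input are made up for the example:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    # Default prefix: newlines are skipped like any other whitespace.
    all_words:  word(s)

    # Only blanks and tabs are skipped, so the repetition stops at
    # the first newline.
    first_line: <skip: '[ \t]*'> word(s) /\n/  { $item{'word(s)'} }

    word: /\w+/
}) or die "Bad grammar!\n";

my $input = "alpha beta\ngamma\n";

my $all   = $parser->all_words($input);
my $first = $parser->first_line($input);

print "all_words:  @$all\n";
print "first_line: @$first\n";
```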
1114
1115           The "<skip>" directive evaluates to the previous terminal prefix,
1116           so it's easy to reinstate a prefix later in a production:
1117
1118               Command: <skip:","> CSV(s) <skip:$item[1]> Modifier
1119
1120           The value specified after the colon is interpolated into a pattern,
1121           so all of the following are equivalent (though their efficiency
1122           increases down the list):
1123
1124               <skip: "$colon|$comma">   # ASSUMING THE VARS HOLD THE OBVIOUS VALUES
1125
1126               <skip: ':|,'>
1127
1128               <skip: q{[:,]}>
1129
1130               <skip: qr/[:,]/>
1131
1132           There is no way of directly setting the prefix for an entire rule,
1133           except as follows:
1134
1135               Rule: <skip: '[ \t]*'> Prod1
1136               | <skip: '[ \t]*'> Prod2a Prod2b
1137               | <skip: '[ \t]*'> Prod3
1138
1139           or, better:
1140
1141               Rule: <skip: '[ \t]*'>
1142               (
1143               Prod1
1144                 | Prod2a Prod2b
1145                 | Prod3
1146               )
1147
1148           Note: Up to release 1.51 of Parse::RecDescent, an entirely
1149           different mechanism was used for specifying terminal prefixes. The
1150           current method is not backwards-compatible with that early
           approach. The current approach is stable and will not change
1152           again.
1153
1154       Resynchronization
1155           The "<resync>" directive provides a visually distinctive means of
1156           consuming some of the text being parsed, usually to skip an
1157           erroneous input. In its simplest form "<resync>" simply consumes
1158           text up to and including the next newline ("\n") character,
1159           succeeding only if the newline is found, in which case it causes
1160           its surrounding rule to return zero on success.
1161
1162           In other words, a "<resync>" is exactly equivalent to the token
1163           "/[^\n]*\n/" followed by the action "{ $return = 0 }" (except that
1164           productions beginning with a "<resync>" are ignored when generating
1165           error messages). A typical use might be:
1166
1167               script : command(s)
1168
1169               command: save_command
1170                  | restore_command
1171                  | <resync> # TRY NEXT LINE, IF POSSIBLE
1172
1173           It is also possible to explicitly specify a resynchronization
1174           pattern, using the "<resync:pattern>" variant. This version
1175           succeeds only if the specified pattern matches (and consumes) the
1176           parsed text. In other words, "<resync:pattern>" is exactly
1177           equivalent to the token "/pattern/" (followed by a
1178           "{ $return = 0 }" action). For example, if commands were terminated
1179           by newlines or semi-colons:
1180
1181               command: save_command
1182                  | restore_command
1183                  | <resync:[^;\n]*[;\n]>
1184
1185           The value of a successfully matched "<resync>" directive (of either
1186           type) is the text that it consumed. Note, however, that since the
1187           directive also sets $return, a production consisting of a lone
1188           "<resync>" succeeds but returns the value zero (which a calling
1189           rule may find useful to distinguish between "true" matches and
1190           "tolerant" matches).  Remember that returning a zero value
1191           indicates that the rule succeeded (since only an "undef" denotes
           failure within "Parse::RecDescent" parsers).
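
           The following sketch shows how a calling rule might tell "true"
           matches from "tolerant" ones. The one-production 'save' command
           and the input are hypothetical:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    script : command(s)

    command: 'save' /\w+/ ';'      { "save $item[2]" }
           | <resync:[^;\n]*[;\n]>
}) or die "Bad grammar!\n";

my $commands = $parser->script('save a; bogus; save b;');
die "Parse failed\n" unless defined $commands;

for my $command (@$commands) {
    # A resynchronized command has the value zero: defined, but false.
    print $command ? "ok:      $command\n" : "skipped: (unparsable command)\n";
}
```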
1193
1194       Error handling
1195           The "<error>" directive provides automatic or user-defined
1196           generation of error messages during a parse. In its simplest form
1197           "<error>" prepares an error message based on the mismatch between
           the last item expected and the text which caused it to fail. For
1199           example, given the rule:
1200
               McCoy: curse ',' name ", I'm a doctor, not a" a_profession '!'
                    | pronoun 'dead,' name '!'
                    | <error>
1204
1205           the following strings would produce the following messages:
1206
1207           "Amen, Jim!"
1208                      ERROR (line 1): Invalid McCoy: Expected curse or pronoun
1209                          not found
1210
1211           "Dammit, Jim, I'm a doctor!"
1212                      ERROR (line 1): Invalid McCoy: Expected ", I'm a doctor, not a"
1213                          but found ", I'm a doctor!" instead
1214
1215           "He's dead,\n"
1216                      ERROR (line 2): Invalid McCoy: Expected name not found
1217
1218           "He's alive!"
1219                      ERROR (line 1): Invalid McCoy: Expected 'dead,' but found
1220                          "alive!" instead
1221
1222           "Dammit, Jim, I'm a doctor, not a pointy-eared Vulcan!"
1223                      ERROR (line 1): Invalid McCoy: Expected a profession but found
1224                          "pointy-eared Vulcan!" instead
1225
1226           Note that, when autogenerating error messages, all underscores in
1227           any rule name used in a message are replaced by single spaces (for
1228           example "a_production" becomes "a production"). Judicious choice of
1229           rule names can therefore considerably improve the readability of
1230           automatic error messages (as well as the maintainability of the
1231           original grammar).
1232
1233           If the automatically generated error is not sufficient, it is
1234           possible to provide an explicit message as part of the error
1235           directive. For example:
1236
               Spock: "Fascinating" ',' (name | 'Captain') '.'
                    | "Highly illogical, doctor."
                    | <error: He never said that!>
1240
1241           which would result in all failures to parse a "Spock" subrule
1242           printing the following message:
1243
1244                  ERROR (line <N>): Invalid Spock:  He never said that!
1245
1246           The error message is treated as a "qq{...}" string and interpolated
1247           when the error is generated (not when the directive is specified!).
1248           Hence:
1249
1250               <error: Mystical error near "$text">
1251
1252           would correctly insert the ambient text string which caused the
1253           error.
1254
1255           There are two other forms of error directive: "<error?>" and
1256           "<error?: msg>". These behave just like "<error>" and
1257           "<error: msg>" respectively, except that they are only triggered if
1258           the rule is "committed" at the time they are encountered. For
1259           example:
1260
               Scotty: "Ya kenna change the Laws o' Phusics," <commit> name
                     | name <commit> ',' "she's goanta blaw!"
                     | <error?>
1264
1265           will only generate an error for a string beginning with "Ya kenna
1266           change the Laws o' Phusics," or a valid name, but which still fails
1267           to match the corresponding production. That is,
1268           "$parser->Scotty("Aye, Cap'ain")" will fail silently (since neither
1269           production will "commit" the rule on that input), whereas
1270           "$parser->Scotty("Mr Spock, ah jest kenna do'ut!")"  will fail with
1271           the error message:
1272
1273                  ERROR (line 1): Invalid Scotty: expected 'she's goanta blaw!'
                      but found 'ah jest kenna do'ut!' instead.
1275
1276           since in that case the second production would commit after
1277           matching the leading name.
1278
1279           Note that to allow this behaviour, all "<error>" directives which
1280           are the first item in a production automatically uncommit the rule
1281           just long enough to allow their production to be attempted (that
1282           is, when their production fails, the commitment is reinstated so
1283           that subsequent productions are skipped).
1284
1285           In order to permanently uncommit the rule before an error message,
1286           it is necessary to put an explicit "<uncommit>" before the
1287           "<error>". For example:
1288
1289               line: 'Kirk:'  <commit> Kirk
1290               | 'Spock:' <commit> Spock
1291               | 'McCoy:' <commit> McCoy
1292               | <uncommit> <error?> <reject>
1293               | <resync>
1294
1295           Error messages generated by the various "<error...>" directives are
1296           not displayed immediately. Instead, they are "queued" in a buffer
1297           and are only displayed once parsing ultimately fails. Moreover,
1298           "<error...>" directives that cause one production of a rule to fail
1299           are automatically removed from the message queue if another
1300           production subsequently causes the entire rule to succeed.  This
1301           means that you can put "<error...>" directives wherever useful
1302           diagnosis can be done, and only those associated with actual parser
1303           failure will ever be displayed. Also see "Gotchas".
1304
1305           As a general rule, the most useful diagnostics are usually
1306           generated either at the very lowest level within the grammar, or at
1307           the very highest. A good rule of thumb is to identify those
1308           subrules which consist mainly (or entirely) of terminals, and then
1309           put an "<error...>" directive at the end of any other rule which
1310           calls one or more of those subrules.
1311
1312           There is one other situation in which the output of the various
1313           types of error directive is suppressed; namely, when the rule
1314           containing them is being parsed as part of a "look-ahead" (see
1315           "Look-ahead"). In this case, the error directive will still cause
1316           the rule to fail, but will do so silently.
1317
1318           An unconditional "<error>" directive always fails (and hence has no
1319           associated value). This means that encountering such a directive
1320           always causes the production containing it to fail. Hence an
1321           "<error>" directive will inevitably be the last (useful) item of a
1322           rule (a level 3 warning is issued if a production contains items
1323           after an unconditional "<error>" directive).
1324
1325           An "<error?>" directive will succeed (that is: fail to fail :-), if
1326           the current rule is uncommitted when the directive is encountered.
1327           In that case the directive's associated value is zero. Hence, this
1328           type of error directive can be used before the end of a production.
1329           For example:
1330
1331               command: 'do' <commit> something
1332                  | 'report' <commit> something
1333                  | <error?: Syntax error> <error: Unknown command>
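
           A runnable sketch of the rule above. The "something" rule is a
           hypothetical stand-in, and the inputs are illustrative. Failed
           parses return "undef", and the queued messages are written to
           STDERR:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    command: 'do' <commit> something
           | 'report' <commit> something
           | <error?: Syntax error> <error: Unknown command>

    # Hypothetical stand-in for a real subrule.
    something: /\w+/
}) or die "Bad grammar!\n";

# 'do homework' matches; 'do' fails after committing, so the
# <error?:...> message applies; 'dance' never commits, so the parse
# reaches the plain <error:...> instead.
for my $input ('do homework', 'do', 'dance') {
    my $result = $parser->command($input);
    print "'$input': ", (defined $result ? "matched" : "failed"), "\n";
}
```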
1334
1335           Warning: The "<error?>" directive does not mean "always fail (but
1336           do so silently unless committed)". It actually means "only fail
1337           (and report) if committed, otherwise succeed". To achieve the "fail
1338           silently if uncommitted" semantics, it is necessary to use:
1339
1340               rule: item <commit> item(s)
1341               | <error?> <reject>  # FAIL SILENTLY UNLESS COMMITTED
1342
1343           However, because people seem to expect a lone "<error?>" directive
1344           to work like this:
1345
1346               rule: item <commit> item(s)
1347               | <error?: Error message if committed>
1348               | <error:  Error message if uncommitted>
1349
1350           Parse::RecDescent automatically appends a "<reject>" directive if
1351           the "<error?>" directive is the only item in a production. A level
1352           2 warning (see below) is issued when this happens.
1353
1354           The level of error reporting during both parser construction and
1355           parsing is controlled by the presence or absence of four global
           variables: $::RD_ERRORS, $::RD_WARN, $::RD_HINT, and $::RD_TRACE.
1357           If $::RD_ERRORS is defined (and, by default, it is) then fatal
1358           errors are reported.
1359
1360           Whenever $::RD_WARN is defined, certain non-fatal problems are also
1361           reported.
1362
1363           Warnings have an associated "level": 1, 2, or 3. The higher the
1364           level, the more serious the warning. The value of the corresponding
1365           global variable ($::RD_WARN) determines the lowest level of warning
1366           to be displayed. Hence, to see all warnings, set $::RD_WARN to 1.
1367           To see only the most serious warnings set $::RD_WARN to 3.  By
1368           default $::RD_WARN is initialized to 3, ensuring that serious but
1369           non-fatal errors are automatically reported.
1370
1371           There is also a grammar directive to turn on warnings from within
1372           the grammar: "<warn>". It takes an optional argument, which
1373           specifies the warning level: "<warn: 2>".
1374
           See "DIAGNOSTICS" for a list of the various error and warning
1376           messages that Parse::RecDescent generates when these two variables
1377           are defined.
1378
1379           Defining any of the remaining variables (which are not defined by
1380           default) further increases the amount of information reported.
1381           Defining $::RD_HINT causes the parser generator to offer more
1382           detailed analyses and hints on both errors and warnings.  Note that
1383           setting $::RD_HINT at any point automagically sets $::RD_WARN to 1.
1384           There is also a "<hint>" directive, which can be hard-coded into a
1385           grammar.
1386
1387           Defining $::RD_TRACE causes the parser generator and the parser to
1388           report their progress to STDERR in excruciating detail (although,
1389           without hints unless $::RD_HINT is separately defined). This detail
           can be moderated in only one respect: if $::RD_TRACE has an integer
           value (N) greater than 1, only N characters of the "current
           parsing context" (that is, where in the input string we are at any
           point in the parse) are reported at any time.

           $::RD_TRACE is mainly useful for debugging a grammar that isn't
           behaving as you expected it to. To this end, if $::RD_TRACE is
           defined when a parser is built, any actual parser code which is
           generated is also written to a file named "RD_TRACE" in the local
           directory.
1399
1400           There are two directives associated with the $::RD_TRACE variable.
1401           If a grammar contains a "<trace_build>" directive anywhere in its
1402           specification, $::RD_TRACE is turned on during the parser
1403           construction phase.  If a grammar contains a "<trace_parse>"
1404           directive anywhere in its specification, $::RD_TRACE is turned on
1405           during any parse the parser performs.
1406
1407           Note that the four variables belong to the "main" package, which
1408           makes them easier to refer to in the code controlling the parser,
1409           and also makes it easy to turn them into command line flags
1410           ("-RD_ERRORS", "-RD_WARN", "-RD_HINT", "-RD_TRACE") under perl -s.
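
           As a rough sketch of the command-line usage just described, a
           script run under perl -s receives the flags as main-package
           variables before the parser is built. The script name check.pl
           and the toy "greeting" grammar are invented for illustration:

```perl
# Hypothetical script check.pl. Run it, for example, as:
#     perl -s check.pl -RD_WARN=1 -RD_HINT
# Under -s, each "-NAME" or "-NAME=value" switch that follows the script
# name sets the variable $main::NAME (to 1 if no value is given).
use strict;
use warnings;
use Parse::RecDescent;

# Toy grammar, invented for this example.
my $grammar = q{
    greeting: 'hello' /\w+/
};

# Any $::RD_* flags set on the command line are already in effect here,
# so they influence both parser construction and the parse itself.
my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

my $name = $parser->greeting("hello world");
print defined $name ? "matched: $name\n" : "no match\n";
```

           Run without any switches, the script should report only fatal
           errors and the most serious warnings, as the defaults described
           above imply.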
1411
1412           The corresponding directives are useful to "hardwire" the various
1413           debugging features into a particular grammar (rather than having to
1414           set and reset external variables).
1415
1416       Consistency checks
           Whenever a parser is built, Parse::RecDescent carries out a number
1418           of (potentially expensive) consistency checks. These include:
1419           verifying that the grammar is not left-recursive and that no rules
1420           have been left undefined.
1421
1422           These checks are important safeguards during development, but
1423           unnecessary overheads when the grammar is stable and ready to be
1424           deployed. So Parse::RecDescent provides a directive to disable
1425           them: "<nocheck>".
1426
1427           If a grammar contains a "<nocheck>" directive anywhere in its
1428           specification, the extra compile-time checks are by-passed.
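
           A minimal sketch of a grammar that opts out of the checks. The
           rule names are invented for illustration, and the directive is
           placed before the first rule so that it covers the whole grammar:

```perl
use strict;
use warnings;
use Parse::RecDescent;

# <nocheck> comes first, so the consistency checks are skipped for
# every rule that follows. Only do this once the grammar is stable.
my $parser = Parse::RecDescent->new(q{
    <nocheck>

    greeting: 'hello' name
    name:     /\w+/
}) or die "Bad grammar\n";

my $who = $parser->greeting("hello world");
print "matched: $who\n" if defined $who;
```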
1429
1430       Specifying local variables
1431           It is occasionally convenient to specify variables which are local
1432           to a single rule. This may be achieved by including a
1433           "<rulevar:...>" directive anywhere in the rule. For example:
1434
1435               markup: <rulevar: $tag>
1436
1437               markup: tag {($tag=$item[1]) =~ s/^<|>$//g} body[$tag]
1438
1439           The example "<rulevar: $tag>" directive causes a "my" variable
1440           named $tag to be declared at the start of the subroutine
1441           implementing the "markup" rule (that is, before the first
1442           production, regardless of where in the rule it is specified).
1443
1444           Specifically, any directive of the form: "<rulevar:text>" causes a
1445           line of the form "my text;" to be added at the beginning of the
1446           rule subroutine, immediately after the definitions of the following
1447           local variables:
1448
               $thisparser   $commit
               $thisrule     @item
               $thisline     @arg
               $text         %arg
1453
1454           This means that the following "<rulevar>" directives work as
1455           expected:
1456
1457               <rulevar: $count = 0 >
1458
1459               <rulevar: $firstarg = $arg[0] || '' >
1460
1461               <rulevar: $myItems = \@item >
1462
1463               <rulevar: @context = ( $thisline, $text, @arg ) >
1464
               <rulevar: ($name,$age) = @arg{"name","age"} >
1466
1467           If a variable that is also visible to subrules is required, it
1468           needs to be "local"'d, not "my"'d. "rulevar" defaults to "my", but
1469           if "local" is explicitly specified:
1470
1471               <rulevar: local $count = 0 >
1472
1473           then a "local"-ized variable is declared instead, and will be
1474           available within subrules.
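
           The following is a small sketch of a "local"-ized rule variable
           shared with a subrule. The "words" and "word" rules are invented
           for illustration, and the counter is only meaningful here because
           this toy grammar never backtracks over a matched word:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    # The rulevar sits in its own production at the start of the rule.
    words: <rulevar: local $count = 0>
    words: word(s) { $return = $count }

    # Because $count was declared with local, this subrule can update it.
    word:  /\w+/ { $count++; 1 }
}) or die "Bad grammar\n";

my $n = $parser->words("alpha beta gamma");
print "counted $n words\n" if defined $n;
```

           Had the directive been written as <rulevar: $count = 0>, the
           variable would have been a "my" variable private to "words", and
           the action in "word" would not have been able to see it.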
1475
1476           Note however that, because all such variables are "my" variables,
1477           their values do not persist between match attempts on a given rule.
1478           To preserve values between match attempts, values can be stored
1479           within the "local" member of the $thisrule object:
1480
1481               countedrule: { $thisrule->{"local"}{"count"}++ }
1482                    <reject>
1483                  | subrule1
1484                  | subrule2
1485                  | <reject: $thisrule->{"local"}{"count"} == 1>
1486                    subrule3
1487
1488           When matching a rule, each "<rulevar>" directive is matched as if
1489           it were an unconditional "<reject>" directive (that is, it causes
1490           any production in which it appears to immediately fail to match).
1491           For this reason (and to improve readability) it is usual to specify
1492           any "<rulevar>" directive in a separate production at the start of
1493           the rule (this has the added advantage that it enables
1494           "Parse::RecDescent" to optimize away such productions, just as it
1495           does for the "<reject>" directive).
1496
1497       Dynamically matched rules
1498           Because regexes and double-quoted strings are interpolated, it is
1499           relatively easy to specify productions with "context sensitive"
1500           tokens. For example:
1501
1502               command:  keyword  body  "end $item[1]"
1503
1504           which ensures that a command block is bounded by a "<keyword>...end
1505           <same keyword>" pair.
1506
1507           Building productions in which subrules are context sensitive is
1508           also possible, via the "<matchrule:...>" directive. This directive
1509           behaves identically to a subrule item, except that the rule which
1510           is invoked to match it is determined by the string specified after
1511           the colon. For example, we could rewrite the "command" rule like
1512           this:
1513
1514               command:  keyword  <matchrule:body>  "end $item[1]"
1515
1516           Whatever appears after the colon in the directive is treated as an
1517           interpolated string (that is, as if it appeared in "qq{...}"
1518           operator) and the value of that interpolated string is the name of
1519           the subrule to be matched.
1520
1521           Of course, just putting a constant string like "body" in a
1522           "<matchrule:...>" directive is of little interest or benefit.  The
           power of the directive is seen when we use a string that interpolates
1524           to something interesting. For example:
1525
1526               command:    keyword <matchrule:$item[1]_body> "end $item[1]"
1527
1528               keyword:    'while' | 'if' | 'function'
1529
1530               while_body: condition block
1531
1532               if_body:    condition block ('else' block)(?)
1533
1534               function_body:  arglist block
1535
1536           Now the "command" rule selects how to proceed on the basis of the
1537           keyword that is found. It is as if "command" were declared:
1538
1539               command:    'while'    while_body    "end while"
1540                  |    'if'       if_body   "end if"
1541                  |    'function' function_body "end function"
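
           A cut-down, runnable sketch of the same idea follows. The two body
           rules are simplified stand-ins (a number for "while", a lower-case
           word for "if") chosen only to make the dispatch visible:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    command:    keyword <matchrule:$item[1]_body> "end $item[1]"
                { $return = [ $item[1], $item[2] ] }

    keyword:    'while' | 'if'

    while_body: /\d+/
    if_body:    /[a-z]+/
}) or die "Bad grammar\n";

# "while" selects while_body, which accepts the digits.
my $ok  = $parser->command("while 42 end while");

# "if" selects if_body, which does not accept digits, so the parse fails.
my $bad = $parser->command("if 42 end if");

print "while: ", ($ok  ? "matched body '$ok->[1]'" : "no match"), "\n";
print "if:    ", ($bad ? "matched body '$bad->[1]'" : "no match"), "\n";
```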
1542
1543           When a "<matchrule:...>" directive is used as a repeated subrule,
1544           the rule name expression is "late-bound". That is, the name of the
1545           rule to be called is re-evaluated each time a match attempt is
1546           made. Hence, the following grammar:
1547
1548               { $::species = 'dogs' }
1549
1550               pair:   'two' <matchrule:$::species>(s)
1551
1552               dogs:   /dogs/ { $::species = 'cats' }
1553
1554               cats:   /cats/
1555
1556           will match the string "two dogs cats cats" completely, whereas it
1557           will only match the string "two dogs dogs dogs" up to the eighth
1558           letter. If the rule name were "early bound" (that is, evaluated
1559           only the first time the directive is encountered in a production),
1560           the reverse behaviour would be expected.
1561
1562           Note that the "matchrule" directive takes a string that is to be
1563           treated as a rule name, not as a rule invocation. That is, it's
1564           like a Perl symbolic reference, not an "eval". Just as you can say:
1565
1566               $subname = 'foo';
1567
1568               # and later...
1569
               &{$subname}(@args);
1571
1572           but not:
1573
1574               $subname = 'foo(@args)';
1575
1576               # and later...
1577
               &{$subname};
1579
1580           likewise you can say:
1581
1582               $rulename = 'foo';
1583
1584               # and in the grammar...
1585
1586               <matchrule:$rulename>[@args]
1587
1588           but not:
1589
1590               $rulename = 'foo[@args]';
1591
1592               # and in the grammar...
1593
1594               <matchrule:$rulename>
1595
1596       Deferred actions
1597           The "<defer:...>" directive is used to specify an action to be
1598           performed when (and only if!) the current production ultimately
1599           succeeds.
1600
1601           Whenever a "<defer:...>" directive appears, the code it specifies
1602           is converted to a closure (an anonymous subroutine reference) which
1603           is queued within the active parser object. Note that, because the
1604           deferred code is converted to a closure, the values of any "local"
           variable (such as $text, @item, etc.) are preserved until the
1606           deferred code is actually executed.
1607
1608           If the parse ultimately succeeds and the production in which the
1609           "<defer:...>" directive was evaluated formed part of the successful
1610           parse, then the deferred code is executed immediately before the
1611           parse returns. If however the production which queued a deferred
1612           action fails, or one of the higher-level rules which called that
1613           production fails, then the deferred action is removed from the
1614           queue, and hence is never executed.
1615
1616           For example, given the grammar:
1617
               sentence: noun trans noun
                       | noun intrans

               noun:     'the dog'
                             { print "$item[1]\t(noun)\n" }
                       | 'the meat'
                             { print "$item[1]\t(noun)\n" }

               trans:    'ate'
                             { print "$item[1]\t(transitive)\n" }

               intrans:  'ate'
                             { print "$item[1]\t(intransitive)\n" }
                       | 'barked'
                             { print "$item[1]\t(intransitive)\n" }
1633
1634           then parsing the sentence "the dog ate" would produce the output:
1635
1636               the dog  (noun)
1637               ate  (transitive)
1638               the dog  (noun)
1639               ate  (intransitive)
1640
1641           This is because, even though the first production of "sentence"
1642           ultimately fails, its initial subrules "noun" and "trans" do match,
1643           and hence they execute their associated actions.  Then the second
1644           production of "sentence" succeeds, causing the actions of the
1645           subrules "noun" and "intrans" to be executed as well.
1646
1647           On the other hand, if the actions were replaced by "<defer:...>"
1648           directives:
1649
               sentence: noun trans noun
                       | noun intrans

               noun:     'the dog'
                             <defer: print "$item[1]\t(noun)\n" >
                       | 'the meat'
                             <defer: print "$item[1]\t(noun)\n" >

               trans:    'ate'
                             <defer: print "$item[1]\t(transitive)\n" >

               intrans:  'ate'
                             <defer: print "$item[1]\t(intransitive)\n" >
                       | 'barked'
                             <defer: print "$item[1]\t(intransitive)\n" >
1665
1666           the output would be:
1667
1668               the dog  (noun)
1669               ate  (intransitive)
1670
1671           since deferred actions are only executed if they were evaluated in
1672           a production which ultimately contributes to the successful parse.
1673
1674           In this case, even though the first production of "sentence" caused
1675           the subrules "noun" and "trans" to match, that production
1676           ultimately failed and so the deferred actions queued by those
           subrules were subsequently discarded. The second production then
1678           succeeded, causing the entire parse to succeed, and so the deferred
1679           actions queued by the (second) match of the "noun" subrule and the
1680           subsequent match of "intrans" are preserved and eventually
1681           executed.
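
           The two "sentence" grammars above can be compared side by side
           with a sketch along the following lines. To make the difference
           easy to inspect, the actions push onto a global array (@::log, a
           name chosen for this example) instead of printing:

```perl
use strict;
use warnings;
use Parse::RecDescent;

our @log;   # filled in by the grammars below via @::log

# The same sentence grammar twice: ordinary actions vs. <defer:...>.
my %grammar = (
    immediate => q{
        sentence: noun trans noun | noun intrans
        noun:     'the dog'  { push @::log, "$item[1] (noun)" }
                | 'the meat' { push @::log, "$item[1] (noun)" }
        trans:    'ate'      { push @::log, "$item[1] (transitive)" }
        intrans:  'ate'      { push @::log, "$item[1] (intransitive)" }
                | 'barked'   { push @::log, "$item[1] (intransitive)" }
    },
    deferred => q{
        sentence: noun trans noun | noun intrans
        noun:     'the dog'  <defer: push @::log, "$item[1] (noun)" >
                | 'the meat' <defer: push @::log, "$item[1] (noun)" >
        trans:    'ate'      <defer: push @::log, "$item[1] (transitive)" >
        intrans:  'ate'      <defer: push @::log, "$item[1] (intransitive)" >
                | 'barked'   <defer: push @::log, "$item[1] (intransitive)" >
    },
);

my %seen;
for my $kind (sort keys %grammar) {
    @log = ();
    my $parser = Parse::RecDescent->new($grammar{$kind})
        or die "Bad grammar ($kind)\n";
    defined $parser->sentence("the dog ate")
        or die "Parse failed ($kind)\n";
    $seen{$kind} = [@log];
    print "$kind:\n", map { "    $_\n" } @log;
}
```

           If the behaviour is as described above, the "immediate" run logs
           four entries (including those from the failed first production),
           whereas the "deferred" run logs only the two that belong to the
           successful parse.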
1682
1683           Deferred actions provide a means of improving the performance of a
1684           parser, by only executing those actions which are part of the final
1685           parse-tree for the input data.
1686
1687           Alternatively, deferred actions can be viewed as a mechanism for
1688           building (and executing) a customized subroutine corresponding to
1689           the given input data, much in the same way that autoactions (see
1690           "Autoactions") can be used to build a customized data structure for
1691           specific input.
1692
1693           Whether or not the action it specifies is ever executed, a
1694           "<defer:...>" directive always succeeds, returning the number of
1695           deferred actions currently queued at that point.
1696
1697       Parsing Perl
1698           Parse::RecDescent provides limited support for parsing subsets of
1699           Perl, namely: quote-like operators, Perl variables, and complete
1700           code blocks.
1701
1702           The "<perl_quotelike>" directive can be used to parse any Perl
1703           quote-like operator: 'a string', "m/a pattern/", "tr{ans}{lation}",
           etc.  It does this by calling Text::Balanced::extract_quotelike().
1705
1706           If a quote-like operator is found, a reference to an array of eight
1707           elements is returned. Those elements are identical to the last
1708           eight elements returned by Text::Balanced::extract_quotelike() in
1709           an array context, namely:
1710
1711           [0] the name of the quotelike operator -- 'q', 'qq', 'm', 's', 'tr'
1712               -- if the operator was named; otherwise "undef",
1713
1714           [1] the left delimiter of the first block of the operation,
1715
1716           [2] the text of the first block of the operation (that is, the
1717               contents of a quote, the regex of a match, or substitution or
1718               the target list of a translation),
1719
1720           [3] the right delimiter of the first block of the operation,
1721
1722           [4] the left delimiter of the second block of the operation if
1723               there is one (that is, if it is a "s", "tr", or "y"); otherwise
1724               "undef",
1725
1726           [5] the text of the second block of the operation if there is one
1727               (that is, the replacement of a substitution or the translation
1728               list of a translation); otherwise "undef",
1729
1730           [6] the right delimiter of the second block of the operation (if
1731               any); otherwise "undef",
1732
1733           [7] the trailing modifiers on the operation (if any); otherwise
1734               "undef".
1735
1736           If a quote-like expression is not found, the directive fails with
1737           the usual "undef" value.
1738
1739           The "<perl_variable>" directive can be used to parse any Perl
1740           variable: $scalar, @array, %hash, $ref->{field}[$index], etc.  It
1741           does this by calling Text::Balanced::extract_variable().
1742
1743           If the directive matches text representing a valid Perl variable
1744           specification, it returns that text. Otherwise it fails with the
1745           usual "undef" value.
1746
           The "<perl_codeblock>" directive can be used to parse a
           curly-brace-delimited block of Perl code, such as:
           { $a = 1; f() =~ m/pat/; }. It does this by calling
           Text::Balanced::extract_codeblock().
1750
1751           If the directive matches text representing a valid Perl code block,
1752           it returns that text. Otherwise it fails with the usual "undef"
1753           value.
1754
1755           You can also tell it what kind of brackets to use as the outermost
1756           delimiters. For example:
1757
1758               arglist: <perl_codeblock ()>
1759
1760           causes an arglist to match a perl code block whose outermost
1761           delimiters are "(...)" (rather than the default "{...}").
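
           As an illustrative sketch, the "assignment" rule below (invented
           for this example) combines "<perl_variable>" and
           "<perl_quotelike>" and picks the operator name and contents out of
           the eight-element array described above:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    assignment: <perl_variable> '=' <perl_quotelike> ';'
                { $return = { variable => $item[1],
                              operator => $item[3][0],
                              contents => $item[3][2] } }
}) or die "Bad grammar\n";

# Element [0] of the quotelike result is the operator name and
# element [2] is the text of its first block.
my $parsed = $parser->assignment(q{$name = q{Larry};});

if ($parsed) {
    print "variable: $parsed->{variable}\n";
    print "operator: $parsed->{operator}\n";
    print "contents: $parsed->{contents}\n";
}
```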
1762
1763       Constructing tokens
1764           Eventually, Parse::RecDescent will be able to parse tokenized
1765           input, as well as ordinary strings. In preparation for this joyous
1766           day, the "<token:...>" directive has been provided.  This directive
1767           creates a token which will be suitable for input to a
1768           Parse::RecDescent parser (when it eventually supports tokenized
1769           input).
1770
1771           The text of the token is the value of the immediately preceding
1772           item in the production. A "<token:...>" directive always succeeds
1773           with a return value which is the hash reference that is the new
1774           token. It also sets the return value for the production to that
1775           hash ref.
1776
1777           The "<token:...>" directive makes it easy to build a
1778           Parse::RecDescent-compatible lexer in Parse::RecDescent:
1779
1780               my $lexer = new Parse::RecDescent q
1781               {
1782               lex:    token(s)
1783
1784               token:  /a\b/          <token:INDEF>
1785                    |  /the\b/        <token:DEF>
1786                    |  /fly\b/        <token:NOUN,VERB>
1787                    |  /[a-z]+/i { lc $item[1] }  <token:ALPHA>
1788                    |  <error: Unknown token>
1789
1790               };
1791
1792           which will eventually be able to be used with a regular
1793           Parse::RecDescent grammar:
1794
1795               my $parser = new Parse::RecDescent q
1796               {
               startrule: subrule1 subrule2
1798
1799               # ETC...
1800               };
1801
1802           either with a pre-lexing phase:
1803
1804               $parser->startrule( $lexer->lex($data) );
1805
1806           or with a lex-on-demand approach:
1807
1808               $parser->startrule( sub{$lexer->token(\$data)} );
1809
1810           But at present, only the "<token:...>" directive is actually
1811           implemented. The rest is vapourware.
1812
1813       Specifying operations
1814           One of the commonest requirements when building a parser is to
1815           specify binary operators. Unfortunately, in a normal grammar, the
1816           rules for such things are awkward:
1817
1818               disjunction:    conjunction ('or' conjunction)(s?)
1819                   { $return = [ $item[1], @{$item[2]} ] }
1820
1821               conjunction:    atom ('and' atom)(s?)
1822                   { $return = [ $item[1], @{$item[2]} ] }
1823
1824           or inefficient:
1825
               disjunction:    conjunction 'or' disjunction
                   { $return = [ $item[1], @{$item[3]} ] }
                  |    conjunction
                   { $return = [ $item[1] ] }

               conjunction:    atom 'and' conjunction
                   { $return = [ $item[1], @{$item[3]} ] }
                  |    atom
                   { $return = [ $item[1] ] }
1835
1836           and either way is ugly and hard to get right.
1837
1838           The "<leftop:...>" and "<rightop:...>" directives provide an easier
1839           way of specifying such operations. Using "<leftop:...>" the above
1840           examples become:
1841
1842               disjunction:    <leftop: conjunction 'or' conjunction>
1843               conjunction:    <leftop: atom 'and' atom>
1844
1845           The "<leftop:...>" directive specifies a left-associative binary
1846           operator.  It is specified around three other grammar elements
1847           (typically subrules or terminals), which match the left operand,
1848           the operator itself, and the right operand respectively.
1849
1850           A "<leftop:...>" directive such as:
1851
1852               disjunction:    <leftop: conjunction 'or' conjunction>
1853
1854           is converted to the following:
1855
1856               disjunction:    ( conjunction ('or' conjunction)(s?)
1857                   { $return = [ $item[1], @{$item[2]} ] } )
1858
1859           In other words, a "<leftop:...>" directive matches the left operand
1860           followed by zero or more repetitions of both the operator and the
1861           right operand. It then flattens the matched items into an anonymous
1862           array which becomes the (single) value of the entire "<leftop:...>"
1863           directive.
1864
           For example, a "<leftop:...>" directive such as:
1866
1867               output:  <leftop: ident '<<' expr >
1868
1869           when given a string such as:
1870
1871               cout << var << "str" << 3
1872
1873           would match, and $item[1] would be set to:
1874
1875               [ 'cout', 'var', '"str"', '3' ]
1876
1877           In other words:
1878
1879               output:  <leftop: ident '<<' expr >
1880
1881           is equivalent to a left-associative operator:
1882
1883               output:  ident          { $return = [$item[1]]   }
1884                 |  ident '<<' expr        { $return = [@item[1,3]]     }
1885                 |  ident '<<' expr '<<' expr      { $return = [@item[1,3,5]]   }
1886                 |  ident '<<' expr '<<' expr '<<' expr    { $return = [@item[1,3,5,7]] }
1887                 #  ...etc...
1888
1889           Similarly, the "<rightop:...>" directive takes a left operand, an
1890           operator, and a right operand:
1891
1892               assign:  <rightop: var '=' expr >
1893
1894           and converts them to:
1895
1896               assign:  ( (var '=' {$return=$item[1]})(s?) expr
1897                   { $return = [ @{$item[1]}, $item[2] ] } )
1898
1899           which is equivalent to a right-associative operator:
1900
1901               assign:  var        { $return = [$item[1]]       }
1902                 |  var '=' expr       { $return = [@item[1,3]]     }
1903                 |  var '=' var '=' expr   { $return = [@item[1,3,5]]   }
1904                 |  var '=' var '=' var '=' expr   { $return = [@item[1,3,5,7]] }
1905                 #  ...etc...
1906
1907           Note that for both the "<leftop:...>" and "<rightop:...>"
1908           directives, the directive does not normally return the operator
1909           itself, just a list of the operands involved. This is particularly
1910           handy for specifying lists:
1911
1912               list: '(' <leftop: list_item ',' list_item> ')'
1913                   { $return = $item[2] }
1914
1915           There is, however, a problem: sometimes the operator is itself
1916           significant.  For example, in a Perl list a comma and a "=>" are
1917           both valid separators, but the "=>" has additional stringification
1918           semantics.  Hence it's important to know which was used in each
1919           case.
1920
1921           To solve this problem the "<leftop:...>" and "<rightop:...>"
1922           directives do return the operator(s) as well, under two
1923           circumstances.  The first case is where the operator is specified
1924           as a subrule. In that instance, whatever the operator matches is
1925           returned (on the assumption that if the operator is important
1926           enough to have its own subrule, then it's important enough to
1927           return).
1928
1929           The second case is where the operator is specified as a regular
1930           expression. In that case, if the first bracketed subpattern of the
1931           regular expression matches, that matching value is returned (this
1932           is analogous to the behaviour of the Perl "split" function, except
1933           that only the first subpattern is returned).
1934
1935           In other words, given the input:
1936
1937               ( a=>1, b=>2 )
1938
1939           the specifications:
1940
1941               list:      '('  <leftop: list_item separator list_item>  ')'
1942
1943               separator: ',' | '=>'
1944
1945           or:
1946
1947               list:      '('  <leftop: list_item /(,|=>)/ list_item>  ')'
1948
1949           cause the list separators to be interleaved with the operands in
1950           the anonymous array in $item[2]:
1951
1952               [ 'a', '=>', '1', ',', 'b', '=>', '2' ]
1953
1954           But the following version:
1955
1956               list:      '('  <leftop: list_item /,|=>/ list_item>  ')'
1957
           returns only the operands:
1959
1960               [ 'a', '1', 'b', '2' ]
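
           Keeping the operators is what makes it possible to evaluate the
           result afterwards. The sketch below (a toy "sum" grammar invented
           for illustration) captures the operator with a bracketed
           subpattern and then folds the flattened list from the left:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    sum:    <leftop: number /([+-])/ number>
    number: /\d+/
}) or die "Bad grammar\n";

# The bracketed subpattern makes the directive return the operators too,
# interleaved with the operands.
my $items = $parser->sum("10 - 4 + 1")
    or die "Parse failed\n";
my @flat = @$items;

# Fold the list from the left: ((10 - 4) + 1).
my $total = shift @$items;
while (my ($op, $n) = splice(@$items, 0, 2)) {
    $total = $op eq '+' ? $total + $n : $total - $n;
}

print "@flat = $total\n";
```

           With the pattern written as /[+-]/ instead, only the operands
           would be returned and the signs would be lost.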
1961
1962           Of course, none of the above specifications handle the case of an
1963           empty list, since the "<leftop:...>" and "<rightop:...>" directives
1964           require at least a single right or left operand to match. To
1965           specify that the operator can match "trivially", it's necessary to
1966           add a "(s?)" qualifier to the directive:
1967
1968               list:      '('  <leftop: list_item /(,|=>)/ list_item>(s?)  ')'
1969
1970           Note that in almost all the above examples, the first and third
1971           arguments of the "<leftop:...>" directive were the same subrule.
1972           That is because "<leftop:...>"'s are frequently used to specify
1973           "separated" lists of the same type of item. To make such lists
1974           easier to specify, the following syntax:
1975
1976               list:   element(s /,/)
1977
1978           is exactly equivalent to:
1979
1980               list:   <leftop: element /,/ element>
1981
1982           Note that the separator must be specified as a raw pattern (i.e.
1983           not a string or subrule).
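
           A short sketch of the separated-list shorthand in use. The "list"
           and "number" rules are made up for this example:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    list:   '(' number(s /,/) ')' { $return = $item[2] }
    number: /\d+/
}) or die "Bad grammar\n";

# The separator pattern has no bracketed subpattern, so only the
# numbers themselves end up in the returned array.
my $numbers = $parser->list("(1, 2, 3)");
print join(' ', @$numbers), "\n" if $numbers;
```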
1984
1985       Scored productions
1986           By default, Parse::RecDescent grammar rules always accept the first
1987           production that matches the input. But if two or more productions
1988           may potentially match the same input, choosing the first that does
1989           so may not be optimal.
1990
1991           For example, if you were parsing the sentence "time flies like an
1992           arrow", you might use a rule like this:
1993
1994               sentence: verb noun preposition article noun { [@item] }
1995               | adjective noun verb article noun   { [@item] }
1996               | noun verb preposition article noun { [@item] }
1997
1998           Each of these productions matches the sentence, but the third one
1999           is the most likely interpretation. However, if the sentence had
2000           been "fruit flies like a banana", then the second production is
2001           probably the right match.
2002
           To cater for such situations, the "<score:...>" directive can be
           used. The directive is equivalent to an unconditional "<reject>",
           except that it allows you to specify a "score" for the current
           production. If that score is numerically greater than the best
           score of any preceding production, the current production is cached
           for later consideration. If no later production matches, then the
           cached production is treated as having matched, and the value of
           the item immediately before its "<score:...>" directive is returned
           as the result.
2012
2013           In other words, by putting a "<score:...>" directive at the end of
2014           each production, you can select which production matches using
2015           criteria other than specification order. For example:
2016
2017               sentence: verb noun preposition article noun { [@item] } <score: sensible(@item)>
2018               | adjective noun verb article noun   { [@item] } <score: sensible(@item)>
2019               | noun verb preposition article noun { [@item] } <score: sensible(@item)>
2020
2021           Now, when each production reaches its respective "<score:...>"
2022           directive, the subroutine "sensible" will be called to evaluate the
2023           matched items (somehow). Once all productions have been tried, the
2024           one which "sensible" scored most highly will be the one that is
2025           accepted as a match for the rule.
2026
2027           The variable $score always holds the current best score of any
2028           production, and the variable $score_return holds the corresponding
2029           return value.
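
           Since "sensible" is left undefined above, here is a smaller,
           self-contained sketch of scoring. The toy "number" rule (invented
           for this example) scores each production by the length of the
           text it matched, so the longest match wins regardless of the
           order in which the productions are written:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{
    number: /\d+/       <score: length $item[1]>
          | /\d+\.\d+/  <score: length $item[1]>
}) or die "Bad grammar\n";

# Without the <score:...> directives the first production would win
# and only the leading digits would be returned.
my $match = $parser->number("3.14");
print "matched: $match\n" if defined $match;
```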
2030
           As another example, the following grammar matches lines whose
           fields may be separated by commas, colons, or spaces. This can be
           tricky if a colon-separated line also contains commas, or vice
           versa. The grammar resolves the ambiguity by selecting the
           production that results in the fewest fields:
2036
2037               line: seplist[sep=>',']  <score: -@{$item[1]}>
2038               | seplist[sep=>':']  <score: -@{$item[1]}>
2039               | seplist[sep=>" "]  <score: -@{$item[1]}>
2040
2041               seplist: <skip:""> <leftop: /[^$arg{sep}]*/ "$arg{sep}" /[^$arg{sep}]*/>
2042
2043           Note the use of negation within the "<score:...>" directive to
2044           ensure that the seplist with the most items gets the lowest score.
2045
2046           As the above examples indicate, it is often the case that all
2047           productions in a rule use exactly the same "<score:...>" directive.
2048           It is tedious to have to repeat this identical directive in every
2049           production, so Parse::RecDescent also provides the
2050           "<autoscore:...>" directive.
2051
2052           If an "<autoscore:...>" directive appears in any production of a
2053           rule, the code it specifies is used as the scoring code for every
2054           production of that rule, except productions that already end with
2055           an explicit "<score:...>" directive. Thus the rules above could be
2056           rewritten:
2057
2058               line: <autoscore: -@{$item[1]}>
2059               line: seplist[sep=>',']
2060               | seplist[sep=>':']
2061               | seplist[sep=>" "]
2062
2063
2064               sentence: <autoscore: sensible(@item)>
2065               | verb noun preposition article noun { [@item] }
2066               | adjective noun verb article noun   { [@item] }
2067               | noun verb preposition article noun { [@item] }
2068
2069           Note that the "<autoscore:...>" directive itself acts as an
2070           unconditional "<reject>", and (like the "<rulevar:...>" directive)
2071           is pruned at compile-time wherever possible.
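
           As a small sketch of "<autoscore:...>" in a complete program (the
           rule name "number" and its patterns are assumptions chosen for
           illustration), the directive is written once and applies to both
           of the remaining productions:

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical grammar: the first production holds only the
# <autoscore:...> directive; its code becomes the scoring code for
# every other production of the rule.
my $grammar = q{
    number: <autoscore: length($item[1])>
          | /\d+/
          | /\d+\.\d+/
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

my $result = $parser->number("3.14");
print defined $result ? "best match: $result\n" : "no match\n";
```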
2072
2073       Dispensing with grammar checks
2074           During the compilation phase of parser construction,
2075           Parse::RecDescent performs a small number of checks on the grammar
2076           it's given. Specifically it checks that the grammar is not left-
2077           recursive, that there are no "insatiable" constructs of the form:
2078
2079               rule: subrule(s) subrule
2080
2081           and that there are no rules missing (i.e. referred to, but never
2082           defined).
2083
2084           These checks are important during development, but can slow down
2085           parser construction in stable code. So Parse::RecDescent provides
2086           the <nocheck> directive to turn them off. The directive can only
2087           appear before the first rule definition, and switches off checking
2088           throughout the rest of the current grammar.
2089
2090           Typically, this directive would be added when a parser has been
2091           thoroughly tested and is ready for release.
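
           A minimal sketch of where the directive goes (the two rules are
           illustrative assumptions): "<nocheck>" is placed ahead of the first
           rule definition, and the grammar is otherwise unchanged.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical, already-tested grammar with checking switched off.
my $grammar = q{
    <nocheck>

    greeting: 'hello' name
    name:     /\w+/
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

# With no explicit action, a rule returns the value of its last item.
my $name = $parser->greeting("hello world");
print defined $name ? "greeted: $name\n" : "no match\n";
```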
2092
2093   Subrule argument lists
2094       It is occasionally useful to pass data to a subrule which is being
2095       invoked. For example, consider the following grammar fragment:
2096
2097           classdecl: keyword decl
2098
2099           keyword:   'struct' | 'class';
2100
2101           decl:      # WHATEVER
2102
2103       The "decl" rule might wish to know which of the two keywords was used
2104       (since it may affect some aspect of the way the subsequent declaration
2105       is interpreted). "Parse::RecDescent" allows the grammar designer to
2106       pass data into a rule, by placing that data in an argument list (that
2107       is, in square brackets) immediately after any subrule item in a
2108       production. Hence, we could pass the keyword to "decl" as follows:
2109
2110           classdecl: keyword decl[ $item[1] ]
2111
2112           keyword:   'struct' | 'class';
2113
2114           decl:      # WHATEVER
2115
2116       The argument list can consist of any number (including zero!) of comma-
2117       separated Perl expressions. In other words, it looks exactly like a
2118       Perl anonymous array reference. For example, we could pass the keyword,
2119       the name of the surrounding rule, and the literal 'keyword' to "decl"
2120       like so:
2121
2122           classdecl: keyword decl[$item[1],$item[0],'keyword']
2123
2124           keyword:   'struct' | 'class';
2125
2126           decl:      # WHATEVER
2127
2128       Within the rule to which the data is passed ("decl" in the above
2129       examples) that data is available as the elements of a local variable
2130       @arg. Hence "decl" might report its intentions as follows:
2131
2132           classdecl: keyword decl[$item[1],$item[0],'keyword']
2133
2134           keyword:   'struct' | 'class';
2135
2136           decl:      { print "Declaring $arg[0] (a $arg[2])\n";
2137                print "(this rule called by $arg[1])" }
2138
2139       Subrule argument lists can also be interpreted as hashes, simply by
2140       using the local variable %arg instead of @arg. Hence we could rewrite
2141       the previous example:
2142
           classdecl: keyword decl[keyword => $item[1],
                                   caller  => $item[0],
                                   type    => 'keyword']
2146
2147           keyword:   'struct' | 'class';
2148
2149           decl:      { print "Declaring $arg{keyword} (a $arg{type})\n";
2150                print "(this rule called by $arg{caller})" }
2151
2152       Both @arg and %arg are always available, so the grammar designer may
2153       choose whichever convention (or combination of conventions) suits best.
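
       The following self-contained sketch shows an argument reaching a
       subrule. The body of "decl" (a single identifier) and the string it
       builds are assumptions made for the sake of a runnable example:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $grammar = q{
    classdecl: keyword decl[ keyword => $item[1] ]
               { $return = $item[2] }

    keyword:   'struct' | 'class'

    # Inside decl, the argument list is visible as %arg (and @arg).
    decl:      /\w+/
               { $return = "$arg{keyword} $item[1]" }
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

my $decl = $parser->classdecl("class Widget");
print defined $decl ? "declared: $decl\n" : "no match\n";
```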
2154
2155       Subrule argument lists are also useful for creating "rule templates"
2156       (especially when used in conjunction with the "<matchrule:...>"
2157       directive). For example, the subrule:
2158
2159           list:     <matchrule:$arg{rule}> /$arg{sep}/ list[%arg]
2160               { $return = [ $item[1], @{$item[3]} ] }
2161           |     <matchrule:$arg{rule}>
2162               { $return = [ $item[1]] }
2163
2164       is a handy template for the common problem of matching a separated
2165       list.  For example:
2166
2167           function: 'func' name '(' list[rule=>'param',sep=>';'] ')'
2168
2169           param:    list[rule=>'name',sep=>','] ':' typename
2170
2171           name:     /\w+/
2172
2173           typename: name
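
       To try the template in isolation, it can be wrapped in a small
       program. This is only a sketch: the start rule "names" and the input
       string are assumptions added for illustration.

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $grammar = q{
    # The template from the text: match $arg{rule} items separated
    # by the pattern $arg{sep}, and collect them into one array.
    list: <matchrule:$arg{rule}> /$arg{sep}/ list[%arg]
              { $return = [ $item[1], @{$item[3]} ] }
        | <matchrule:$arg{rule}>
              { $return = [ $item[1] ] }

    # Hypothetical start rule that instantiates the template.
    names: list[rule => 'name', sep => ',']

    name: /\w+/
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

my $names = $parser->names("alice, bob, carol");
print defined $names ? join('|', @$names) . "\n" : "no match\n";
```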
2174
2175       When a subrule argument list is used with a repeated subrule, the
2176       argument list goes before the repetition specifier:
2177
2178           list:   /some|many/ thing[ $item[1] ](s)
2179
2180       The argument list is "late bound". That is, it is re-evaluated for
2181       every repetition of the repeated subrule.  This means that each
2182       repeated attempt to match the subrule may be passed a completely
2183       different set of arguments if the value of the expression in the
2184       argument list changes between attempts. So, for example, the grammar:
2185
2186           { $::species = 'dogs' }
2187
2188           pair:   'two' animal[$::species](s)
2189
2190           animal: /$arg[0]/ { $::species = 'cats' }
2191
2192       will match the string "two dogs cats cats" completely, whereas it will
2193       only match the string "two dogs dogs dogs" up to the eighth letter. If
2194       the value of the argument list were "early bound" (that is, evaluated
2195       only the first time a repeated subrule match is attempted), one would
2196       expect the matching behaviours to be reversed.
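
       The late-binding behaviour can be observed with a short script. This
       is a sketch under two assumptions: the global is reset by hand before
       each parse (instead of in a start-up action), and the text is passed
       by reference so that whatever the parser did not consume is left in
       $text afterwards.

```perl
use strict;
use warnings;
use Parse::RecDescent;

our $species;    # the grammar below refers to this as $::species

my $grammar = q{
    pair:   'two' animal[$::species](s)

    animal: /$arg[0]/ { $::species = 'cats' }
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

my %unparsed;
for my $input ("two dogs cats cats", "two dogs dogs dogs") {
    $species = 'dogs';           # reset the global before each parse
    my $text = $input;
    $parser->pair(\$text);       # a reference lets the parser consume $text
    $unparsed{$input} = $text;
    print "'$input' leaves '$text' unparsed\n";
}
```

       The first input should be consumed entirely, while the second should
       stop after "two dogs", because the argument is re-evaluated (and is by
       then 'cats') on the second repetition.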
2197
2198       Of course, it is possible to effectively "early bind" such argument
2199       lists by passing them a value which does not change on each repetition.
2200       For example:
2201
2202           { $::species = 'dogs' }
2203
2204           pair:   'two' { $::species } animal[$item[2]](s)
2205
2206           animal: /$arg[0]/ { $::species = 'cats' }
2207
2208       Arguments can also be passed to the start rule, simply by appending
2209       them to the argument list with which the start rule is called (after
2210       the "line number" parameter). For example, given:
2211
2212           $parser = new Parse::RecDescent ( $grammar );
2213
           $parser->data($text, 1, "str", 2, \@arr);

           #             ^^^^^  ^  ^^^^^^^^^^^^^^^
           #               |    |         |
           # TEXT TO BE PARSED  |         |
           # STARTING LINE NUMBER         |
           # ELEMENTS OF @arg WHICH IS PASSED TO RULE data
2221
2222       then within the productions of the rule "data", the array @arg will
2223       contain "("str", 2, \@arr)".
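
       A minimal sketch of the same call in a complete program (the body of
       the "data" rule is an assumption made so that the arguments can be
       inspected from outside):

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical rule: return the matched word followed by whatever
# arrived in @arg.
my $grammar = q{
    data: /\w+/ { $return = [ $item[1], @arg ] }
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

my @arr    = (10, 20);
my $result = $parser->data("payload", 1, "str", 2, \@arr);

# Everything after the text and the line number ends up in @arg.
print "matched:   $result->[0]\n";
print "arguments: ", scalar(@$result) - 1, "\n";
```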
2224
2225   Alternations
2226       Alternations are implicit (unnamed) rules defined as part of a
2227       production. An alternation is defined as a series of '|'-separated
2228       productions inside a pair of round brackets. For example:
2229
2230           character: 'the' ( good | bad | ugly ) /dude/
2231
       Every alternation implicitly defines a new subrule, whose
       automatically-generated name indicates its origin:
       "_alternation_<I>_of_production_<P>_of_rule_<R>" for the appropriate
       values of <I>, <P>, and <R>. A call to this implicit subrule is then
       inserted in place of the brackets. Hence the above example is merely a
       convenient short-hand for:
2238
2239           character: 'the'
2240              _alternation_1_of_production_1_of_rule_character
2241              /dude/
2242
2243           _alternation_1_of_production_1_of_rule_character:
2244              good | bad | ugly
2245
2246       Since alternations are parsed by recursively calling the parser
2247       generator, any type(s) of item can appear in an alternation. For
2248       example:
2249
           character: 'the' ( 'high' "plains"      # Silent, with poncho
                            | /no[- ]name/         # Silent, no poncho
                            | vengeance_seeking    # Poncho-optional
                            | <error>
                            ) drifter
2255
2256       In this case, if an error occurred, the automatically generated message
2257       would be:
2258
2259           ERROR (line <N>): Invalid implicit subrule: Expected
2260                 'high' or /no[- ]name/ or generic,
2261                 but found "pacifist" instead
2262
2263       Since every alternation actually has a name, it's even possible to
2264       extend or replace them:
2265
           $parser->Replace(
               "_alternation_1_of_production_1_of_rule_character:
                   'generic Eastwood'"
           );
2270
2271       More importantly, since alternations are a form of subrule, they can be
2272       given repetition specifiers:
2273
2274           character: 'the' ( good | bad | ugly )(?) /dude/
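
       A runnable variation on this example is sketched below. The literals
       inside the alternation and the action are assumptions; they stand in
       for the "good", "bad" and "ugly" subrules, which are not defined
       here. An optional alternation yields an array reference with zero or
       one element:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $grammar = q{
    character: 'the' ( 'good' | 'bad' | 'ugly' )(?) /dude/
               { $return = $item[2] }
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

my $with    = $parser->character("the bad dude");
my $without = $parser->character("the dude");

print "with adjective:    ", scalar(@$with),    " item(s)\n";
print "without adjective: ", scalar(@$without), " item(s)\n";
```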
2275
2276   Incremental Parsing
2277       "Parse::RecDescent" provides two methods - "Extend" and "Replace" -
2278       which can be used to alter the grammar matched by a parser. Both
2279       methods take the same argument as "Parse::RecDescent::new", namely a
       grammar specification string.
2281
2282       "Parse::RecDescent::Extend" interprets the grammar specification and
2283       adds any productions it finds to the end of the rules for which they
2284       are specified. For example:
2285
           $add = "name: 'Jimmy-Bob' | 'Bobby-Jim'\ndesc: colour /necks?/";
           $parser->Extend($add);
2288
2289       adds two productions to the rule "name" (creating it if necessary) and
2290       one production to the rule "desc".
2291
       "Parse::RecDescent::Replace" is identical, except that it first resets
       any rule specified in the additional grammar, removing any existing
       productions.  Hence after:

           $add = "name: 'Jimmy-Bob' | 'Bobby-Jim'\ndesc: colour /necks?/";
           $parser->Replace($add);

       the productions in $add are the only valid "name"s and the one
       possible description.
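
       The difference between the two methods can be seen in a short
       script. This is a sketch only; the one-rule grammar and the names in
       it are assumptions chosen to keep the example small.

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $parser = Parse::RecDescent->new(q{ name: 'Alice' })
    or die "Bad grammar\n";

# Report which of the given strings the "name" rule currently accepts.
sub accepted {
    return join ',', grep { defined $parser->name($_) } @_;
}

# Extend adds a production: both names are now valid.
$parser->Extend(q{ name: 'Bob' });
my $after_extend = accepted(qw(Alice Bob Carol));

# Replace discards the existing productions of "name" first.
$parser->Replace(q{ name: 'Carol' });
my $after_replace = accepted(qw(Alice Bob Carol));

print "after Extend:  $after_extend\n";
print "after Replace: $after_replace\n";
```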
2300
2301       A more interesting use of the "Extend" and "Replace" methods is to call
2302       them inside the action of an executing parser. For example:
2303
2304           typedef: 'typedef' type_name identifier ';'
2305                  { $thisparser->Extend("type_name: '$item[3]'") }
2306              | <error>
2307
           identifier: ...!type_name /[A-Za-z_]\w*/
2309
2310       which automatically prevents type names from being typedef'd, or:
2311
2312           command: 'map' key_name 'to' abort_key
2313                  { $thisparser->Replace("abort_key: '$item[2]'") }
2314              | 'map' key_name 'to' key_name
2315                  { map_key($item[2],$item[4]) }
2316              | abort_key
2317                  { exit if confirm("abort?") }
2318
2319           abort_key: 'q'
2320
2321           key_name: ...!abort_key /[A-Za-z]/
2322
2323       which allows the user to change the abort key binding, but not to
2324       unbind it.
2325
       The careful use of such constructs makes it possible to reconfigure a
2327       running parser, eliminating the need for semantic feedback by providing
2328       syntactic feedback instead. However, as currently implemented,
2329       "Replace()" and "Extend()" have to regenerate and re-"eval" the entire
2330       parser whenever they are called. This makes them quite slow for large
2331       grammars.
2332
2333       In such cases, the judicious use of an interpolated regex is likely to
2334       be far more efficient:
2335
           typedef: 'typedef' type_name identifier ';'
                  { $thisparser->{local}{type_name} .= "|$item[3]" }
              | <error>

           identifier: ...!type_name /[A-Za-z_]\w*/
2341
2342           type_name: /$thisparser->{local}{type_name}/
2343
2344   Precompiling parsers
2345       Normally Parse::RecDescent builds a parser from a grammar at run-time.
2346       That approach simplifies the design and implementation of parsing code,
2347       but has the disadvantage that it slows the parsing process down - you
2348       have to wait for Parse::RecDescent to build the parser every time the
2349       program runs. Long or complex grammars can be particularly slow to
2350       build, leading to unacceptable delays at start-up.
2351
2352       To overcome this, the module provides a way of "pre-building" a parser
2353       object and saving it in a separate module. That module can then be used
2354       to create clones of the original parser.
2355
2356       A grammar may be precompiled using the "Precompile" class method.  For
2357       example, to precompile a grammar stored in the scalar $grammar, and
2358       produce a class named PreGrammar in a module file named PreGrammar.pm,
2359       you could use:
2360
2361           use Parse::RecDescent;
2362
2363           Parse::RecDescent->Precompile($grammar, "PreGrammar");
2364
2365       The first argument is the grammar string, the second is the name of the
2366       class to be built. The name of the module file is generated
2367       automatically by appending ".pm" to the last element of the class name.
2368       Thus
2369
2370           Parse::RecDescent->Precompile($grammar, "My::New::Parser");
2371
2372       would produce a module file named Parser.pm.
2373
2374       It is somewhat tedious to have to write a small Perl program just to
2375       generate a precompiled grammar class, so Parse::RecDescent has some
2376       special magic that allows you to do the job directly from the command-
2377       line.
2378
2379       If your grammar is specified in a file named grammar, you can generate
2380       a class named Yet::Another::Grammar like so:
2381
2382           > perl -MParse::RecDescent - grammar Yet::Another::Grammar
2383
2384       This would produce a file named Grammar.pm containing the full
2385       definition of a class called Yet::Another::Grammar. Of course, to use
2386       that class, you would need to put the Grammar.pm file in a directory
2387       named Yet/Another, somewhere in your Perl include path.
2388
2389       Having created the new class, it's very easy to use it to build a
2390       parser. You simply "use" the new module, and then call its "new" method
2391       to create a parser object. For example:
2392
2393           use Yet::Another::Grammar;
2394           my $parser = Yet::Another::Grammar->new();
2395
2396       The effect of these two lines is exactly the same as:
2397
2398           use Parse::RecDescent;
2399
2400           open GRAMMAR_FILE, "grammar" or die;
2401           local $/;
2402           my $grammar = <GRAMMAR_FILE>;
2403
2404           my $parser = Parse::RecDescent->new($grammar);
2405
2406       only considerably faster.
2407
2408       Note however that the parsers produced by either approach are exactly
2409       the same, so whilst precompilation has an effect on set-up speed, it
2410       has no effect on parsing speed. RecDescent 2.0 will address that
2411       problem.
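
       The whole round trip can be sketched in one script. The grammar and
       the use of a temporary directory are assumptions made for
       illustration; in practice the generated module would be installed
       somewhere in the include path.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use Parse::RecDescent;

# Hypothetical grammar to precompile.
my $grammar = q{
    greeting: 'hello' /\w+/
};

# Work in a scratch directory, since the module file is written
# to the current directory.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "Cannot chdir to $dir: $!\n";

Parse::RecDescent->Precompile($grammar, "PreGrammar");

# Load the generated PreGrammar.pm and build a parser from it.
require "$dir/PreGrammar.pm";
my $parser = PreGrammar->new();

my $name = $parser->greeting("hello world");
print defined $name ? "matched: $name\n" : "no match\n";

chdir $start or die "Cannot chdir back to $start: $!\n";
```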
2412

GOTCHAS

2414       This section describes common mistakes that grammar writers seem to
2415       make on a regular basis.
2416
2417   1. Expecting an error to always invalidate a parse
2418       A common mistake when using error messages is to write the grammar like
2419       this:
2420
2421           file: line(s)
2422
2423           line: line_type_1
2424           | line_type_2
2425           | line_type_3
2426           | <error>
2427
2428       The expectation seems to be that any line that is not of type 1, 2 or 3
2429       will invoke the "<error>" directive and thereby cause the parse to
2430       fail.
2431
2432       Unfortunately, that only happens if the error occurs in the very first
2433       line.  The first rule states that a "file" is matched by one or more
2434       lines, so if even a single line succeeds, the first rule is completely
2435       satisfied and the parse as a whole succeeds. That means that any error
2436       messages generated by subsequent failures in the "line" rule are
2437       quietly ignored.
2438
2439       Typically what's really needed is this:
2440
2441           file: line(s) eofile    { $return = $item[1] }
2442
2443           line: line_type_1
2444           | line_type_2
2445           | line_type_3
2446           | <error>
2447
2448           eofile: /^\Z/
2449
2450       The addition of the "eofile" subrule  to the first production means
2451       that a file only matches a series of successful "line" matches that
2452       consume the complete input text. If any input text remains after the
2453       lines are matched, there must have been an error in the last "line". In
2454       that case the "eofile" rule will fail, causing the entire "file" rule
2455       to fail too.
2456
2457       Note too that "eofile" must match "/^\Z/" (end-of-text), not "/^\cZ/"
2458       or "/^\cD/" (end-of-file).
2459
2460       And don't forget the action at the end of the production. If you just
2461       write:
2462
2463           file: line(s) eofile
2464
2465       then the value returned by the "file" rule will be the value of its
2466       last item: "eofile". Since "eofile" always returns an empty string on
2467       success, that will cause the "file" rule to return that empty string.
2468       Apart from returning the wrong value, returning an empty string will
2469       trip up code such as:
2470
2471           $parser->file($filetext) || die;
2472
2473       (since "" is false).
2474
2475       Remember that Parse::RecDescent returns undef on failure, so the only
2476       safe test for failure is:
2477
2478           defined($parser->file($filetext)) || die;
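
       The corrected grammar can be exercised with the following sketch. The
       "line" rule is reduced to a single number here, and the "<error>"
       production is left out to keep the output quiet; both are assumptions
       made for brevity.

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $grammar = q{
    file:   line(s) eofile    { $return = $item[1] }

    line:   /\d+/

    eofile: /^\Z/
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

my %outcome;
for my $text ("1 2 3", "1 2 x") {
    my $lines = $parser->file($text);

    # Test with defined(), since a successful parse may return a false value.
    $outcome{$text} = defined $lines ? "ok (@$lines)" : "failed";
    print "'$text': $outcome{$text}\n";
}
```

       Without "eofile", the second input would be reported as a success,
       because two of its lines match.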
2479
2480   2. Using a "return" in an action
2481       An action is like a "do" block inside the subroutine implementing the
2482       surrounding rule. So if you put a "return" statement in an action:
2483
           range: '(' start '..' end ')'
2485               { return $item{end} }
2486              /\s+/
2487
2488       that subroutine will immediately return, without checking the rest of
2489       the items in the current production (e.g. the "/\s+/") and without
2490       setting up the necessary data structures to tell the parser that the
2491       rule has succeeded.
2492
2493       The correct way to set a return value in an action is to set the
2494       $return variable:
2495
           range: '(' start '..' end ')'
2497               { $return = $item{end} }
2498              /\s+/
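
       A complete version of this pattern might look as follows. The subrule
       names "from" and "to" and their definitions are assumptions
       introduced for the example, and the trailing whitespace item is
       omitted:

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $grammar = q{
    range: '(' from '..' to ')'
           { $return = $item{to} }

    from:  /\d+/

    to:    /\d+/
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

# The closing ')' is still checked, because the action sets $return
# instead of leaving the rule early with "return".
my $upper = $parser->range("(1..10)");
my $bad   = $parser->range("(1..10");

print "upper bound: ", (defined $upper ? $upper : "no match"), "\n";
print "unclosed:    ", (defined $bad   ? $bad   : "no match"), "\n";
```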
2499

DIAGNOSTICS

2501       Diagnostics are intended to be self-explanatory (particularly if you
2502       use -RD_HINT (under perl -s) or define $::RD_HINT inside the program).
2503
2504       "Parse::RecDescent" currently diagnoses the following:
2505
2506       ·   Invalid regular expressions used as pattern terminals (fatal
2507           error).
2508
2509       ·   Invalid Perl code in code blocks (fatal error).
2510
2511       ·   Lookahead used in the wrong place or in a nonsensical way (fatal
2512           error).
2513
2514       ·   "Obvious" cases of left-recursion (fatal error).
2515
2516       ·   Missing or extra components in a "<leftop>" or "<rightop>"
2517           directive.
2518
2519       ·   Unrecognisable components in the grammar specification (fatal
2520           error).
2521
2522       ·   "Orphaned" rule components specified before the first rule (fatal
2523           error) or after an "<error>" directive (level 3 warning).
2524
2525       ·   Missing rule definitions (this only generates a level 3 warning,
2526           since you may be providing them later via
2527           "Parse::RecDescent::Extend()").
2528
2529       ·   Instances where greedy repetition behaviour will almost certainly
2530           cause the failure of a production (a level 3 warning - see "ON-
2531           GOING ISSUES AND FUTURE DIRECTIONS" below).
2532
2533       ·   Attempts to define rules named 'Replace' or 'Extend', which cannot
2534           be called directly through the parser object because of the
2535           predefined meaning of "Parse::RecDescent::Replace" and
2536           "Parse::RecDescent::Extend". (Only a level 2 warning is generated,
2537           since such rules can still be used as subrules).
2538
2539       ·   Productions which consist of a single "<error?>" directive, and
2540           which therefore may succeed unexpectedly (a level 2 warning, since
2541           this might conceivably be the desired effect).
2542
2543       ·   Multiple consecutive lookahead specifiers (a level 1 warning only,
2544           since their effects simply accumulate).
2545
2546       ·   Productions which start with a "<reject>" or "<rulevar:...>"
2547           directive. Such productions are optimized away (a level 1 warning).
2548
       ·   Rules which are autogenerated under $::RD_AUTOSTUB (a level 1
           warning).
2551

AUTHOR

2553       Damian Conway (damian@conway.org)
2554

BUGS AND IRRITATIONS

2556       There are undoubtedly serious bugs lurking somewhere in this much code
2557       :-) Bug reports and other feedback are most welcome.
2558
2559       Ongoing annoyances include:
2560
2561       ·   There's no support for parsing directly from an input stream.  If
2562           and when the Perl Gods give us regular expressions on streams, this
2563           should be trivial (ahem!) to implement.
2564
2565       ·   The parser generator can get confused if actions aren't properly
2566           closed or if they contain particularly nasty Perl syntax errors
2567           (especially unmatched curly brackets).
2568
2569       ·   The generator only detects the most obvious form of left recursion
2570           (potential recursion on the first subrule in a rule). More subtle
2571           forms of left recursion (for example, through the second item in a
2572           rule after a "zero" match of a preceding "zero-or-more" repetition,
2573           or after a match of a subrule with an empty production) are not
2574           found.
2575
2576       ·   Instead of complaining about left-recursion, the generator should
2577           silently transform the grammar to remove it. Don't expect this
2578           feature any time soon as it would require a more sophisticated
2579           approach to parser generation than is currently used.
2580
2581       ·   The generated parsers don't always run as fast as might be wished.
2582
2583       ·   The meta-parser should be bootstrapped using "Parse::RecDescent"
2584           :-)
2585

ON-GOING ISSUES AND FUTURE DIRECTIONS

2587       1.  Repetitions are "incorrigibly greedy" in that they will eat
2588           everything they can and won't backtrack if that behaviour causes a
2589           production to fail needlessly.  So, for example:
2590
2591               rule: subrule(s) subrule
2592
2593           will never succeed, because the repetition will eat all the
2594           subrules it finds, leaving none to match the second item. Such
2595           constructions are relatively rare (and "Parse::RecDescent::new"
2596           generates a warning whenever they occur) so this may not be a
2597           problem, especially since the insatiable behaviour can be overcome
2598           "manually" by writing:
2599
2600               rule: penultimate_subrule(s) subrule
2601
2602               penultimate_subrule: subrule ...subrule
2603
2604           The issue is that this construction is exactly twice as expensive
2605           as the original, whereas backtracking would add only 1/N to the
2606           cost (for matching N repetitions of "subrule"). I would welcome
2607           feedback on the need for backtracking; particularly on cases where
2608           the lack of it makes parsing performance problematical.
2609
2610       2.  Having opened that can of worms, it's also necessary to consider
2611           whether there is a need for non-greedy repetition specifiers.
2612           Again, it's possible (at some cost) to manually provide the
2613           required functionality:
2614
2615               rule: nongreedy_subrule(s) othersubrule
2616
2617               nongreedy_subrule: subrule ...!othersubrule
2618
2619           Overall, the issue is whether the benefit of this extra
2620           functionality outweighs the drawbacks of further complicating the
2621           (currently minimalist) grammar specification syntax, and (worse)
2622           introducing more overhead into the generated parsers.
2623
2624       3.  An "<autocommit>" directive would be nice. That is, it would be
2625           useful to be able to say:
2626
2627               command: <autocommit>
2628               command: 'find' name
2629                  | 'find' address
2630                  | 'do' command 'at' time 'if' condition
2631                  | 'do' command 'at' time
2632                  | 'do' command
2633                  | unusual_command
2634
2635           and have the generator work out that this should be "pruned" thus:
2636
2637               command: 'find' name
2638                  | 'find' <commit> address
2639                  | 'do' <commit> command <uncommit>
2640                   'at' time
2641                   'if' <commit> condition
2642                  | 'do' <commit> command <uncommit>
2643                   'at' <commit> time
2644                  | 'do' <commit> command
2645                  | unusual_command
2646
2647           There are several issues here. Firstly, should the "<autocommit>"
2648           automatically install an "<uncommit>" at the start of the last
2649           production (on the grounds that the "command" rule doesn't know
2650           whether an "unusual_command" might start with "find" or "do") or
2651           should the "unusual_command" subgraph be analysed (to see if it
2652           might be viable after a "find" or "do")?
2653
2654           The second issue is how regular expressions should be treated. The
2655           simplest approach would be simply to uncommit before them (on the
2656           grounds that they might match). Better efficiency would be obtained
2657           by analyzing all preceding literal tokens to determine whether the
2658           pattern would match them.
2659
2660           Overall, the issues are: can such automated "pruning" approach a
2661           hand-tuned version sufficiently closely to warrant the extra set-up
2662           expense, and (more importantly) is the problem important enough to
2663           even warrant the non-trivial effort of building an automated
2664           solution?
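
       The manual workaround described in item 1 above can be tried with a
       small grammar. This is a sketch only: the rule names "words",
       "penultimate_word" and "word" are assumptions standing in for "rule",
       "penultimate_subrule" and "subrule".

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $grammar = q{
    words: penultimate_word(s) word
           { $return = [ @{$item[1]}, $item[2] ] }

    # A word only counts as "penultimate" if another word follows it;
    # the lookahead checks this without consuming any text.
    penultimate_word: word ...word
           { $return = $item[1] }

    word: /\w+/
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

my $words = $parser->words("a b c");
print defined $words ? "matched: @$words\n" : "no match\n";
```

       The repetition stops one word early, which leaves the final word for
       the last item of the production.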
2665

SUPPORT

2667   Mailing List
2668       Visit <http://www.perlfoundation.org/perl5/index.cgi?parse_recdescent>
2669       to sign up for the mailing list.
2670
2671       <http://www.PerlMonks.org> is also a good place to ask questions.
2672
2673   FAQ
2674       Visit Parse::RecDescent::FAQ for answers to frequently (and not so
       frequently) asked questions about Parse::RecDescent.
2676

SEE ALSO

2678       Regexp::Grammars provides Parse::RecDescent style parsing using native
2679       Perl 5.10 regular expressions.
2680

LICENCE AND COPYRIGHT

2682       Copyright (c) 1997-2007, Damian Conway "<DCONWAY@CPAN.org>". All rights
2683       reserved.
2684
2685       This module is free software; you can redistribute it and/or modify it
2686       under the same terms as Perl itself. See perlartistic.
2687

DISCLAIMER OF WARRANTY

2689       BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
2690       FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT
2691       WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER
2692       PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND,
2693       EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
2694       WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE
2695       ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH
2696       YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
2697       NECESSARY SERVICING, REPAIR, OR CORRECTION.
2698
2699       IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
2700       WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
2701       REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE
2702       TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR
2703       CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
2704       SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
2705       RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
2706       FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
2707       SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
2708       DAMAGES.
2709
2710
2711
2712perl v5.10.1                      2009-08-28              Parse::RecDescent(3)