1Marpa::XS(3)          User Contributed Perl Documentation         Marpa::XS(3)
2
3
4

NAME

6       Marpa::XS - XS version of Marpa
7

SYNOPSIS

9           use Marpa::XS;
10
11           my $grammar = Marpa::XS::Grammar->new(
12               {   start   => 'Expression',
13                   actions => 'My_Actions',
14                   default_action => 'first_arg',
15                   rules   => [
16                       { lhs => 'Expression', rhs => [qw/Term/] },
17                       { lhs => 'Term', rhs => [qw/Factor/] },
18                       { lhs => 'Factor', rhs => [qw/Number/] },
19                       { lhs => 'Term', rhs => [qw/Term Add Term/], action => 'do_add' },
20                       {   lhs    => 'Factor',
21                           rhs    => [qw/Factor Multiply Factor/],
22                           action => 'do_multiply'
23                       },
24                   ],
25               }
26           );
27
28           $grammar->precompute();
29
30           my $recce = Marpa::XS::Recognizer->new( { grammar => $grammar } );
31
32           $recce->read( 'Number', 42 );
33           $recce->read( 'Multiply', );
34           $recce->read( 'Number', 1 );
35           $recce->read( 'Add', );
36           $recce->read( 'Number', 7 );
37
38           sub My_Actions::do_add {
39               my ( undef, $t1, undef, $t2 ) = @_;
40               return $t1 + $t2;
41           }
42
43           sub My_Actions::do_multiply {
44               my ( undef, $t1, undef, $t2 ) = @_;
45               return $t1 * $t2;
46           }
47
48           sub My_Actions::first_arg { shift; return shift; }
49
50           my $value_ref = $recce->value;
51           my $value = $value_ref ? ${$value_ref} : 'No Parse';
52

DESCRIPTION

54   Overview
55       Marpa parses any language whose grammar can be written in BNF.  That
56       includes recursive grammars, ambiguous grammars, infinitely ambiguous
57       grammars and grammars with useless or empty productions.
58
59       This document contains a top-level overview of the API for the Marpa
60       parse engine.  The two examples in this document show the typical flows
61       of Marpa method calls.  This document will use these examples to
62       describe the basic features of Marpa in semi-tutorial fashion.  Marpa's
63       advanced features, and full reference details of all features, can be
64       found in the other Marpa API documents.
65
66   The three phases
67       A parser needs to:
68
69       •   Accept a grammar.
70
71       •   Read input.
72
73       •   Return values from the parses, according to a semantics.
74
75       In Marpa these three tasks are, for the most part, distinct phases.
76       Grammars are "Marpa::XS::Grammar" objects.  The reading of input and
77       the evaluation of the parse according to the semantics is performed by
78       "Marpa::XS::Recognizer" objects.
79

EXAMPLE 1: A SIMPLE CALCULATOR

81       The synopsis shows the code for a very simple calculator.  It handles
82       only addition and multiplication of integers.  This section explains,
83       line by line, how it works.
84
85   Marpa::XS::Grammar::new
86           my $grammar = Marpa::XS::Grammar->new(
87               {   start   => 'Expression',
88                   actions => 'My_Actions',
89                   default_action => 'first_arg',
90                   rules   => [
91                       { lhs => 'Expression', rhs => [qw/Term/] },
92                       { lhs => 'Term', rhs => [qw/Factor/] },
93                       { lhs => 'Factor', rhs => [qw/Number/] },
94                       { lhs => 'Term', rhs => [qw/Term Add Term/], action => 'do_add' },
95                       {   lhs    => 'Factor',
96                           rhs    => [qw/Factor Multiply Factor/],
97                           action => 'do_multiply'
98                       },
99                   ],
100               }
101           );
102
103       Marpa grammars are "Marpa::XS::Grammar" objects.  They are created with
104       the Marpa::XS::Grammar::new constructor.  The arguments to
105       Marpa::XS::Grammar::new are references to hashes of named arguments.
106       In the key/value pairs of these hashes, the hash key is the name of the
107       argument, and the hash value is the value of the named argument.
108
109       The start named argument
110
111           start => 'Expression',
112
113       The "start" named argument is required.  Its value is a string
114       containing the name of the grammar's start symbol.
115
116       Named arguments for the semantics
117
118                   actions => 'My_Actions',
119                   default_action => 'first_arg',
120
121       The "actions" and "default_action" named arguments specify semantics.
122       Their argument values are strings, which acquire their semantics during
123       evaluation.
124
125       Evaluation will be described later.  Peeking ahead, the
126       "default_action" named argument will be interpreted as an action name.
127       This action name will resolve to an action -- a Perl closure that
128       implements semantics.  The action specified by "default_action" is used
129       as the action for rules with no action of their own.  "actions"
130       provides the name of a Perl package where Marpa will look for its
131       actions.
132
133       The rules named argument
134
135           rules => [
136               { lhs => 'Expression', rhs => [qw/Term/] },
137               { lhs => 'Term',       rhs => [qw/Factor/] },
138               { lhs => 'Factor',     rhs => [qw/Number/] },
139               { lhs => 'Term', rhs => [qw/Term Add Term/], action => 'do_add' },
140               {   lhs    => 'Factor',
141                   rhs    => [qw/Factor Multiply Factor/],
142                   action => 'do_multiply'
143               },
144           ],
145
146       The value of the "rules" named argument is a reference to an array of
147       rule descriptors.  In this example, all the rule descriptors are in the
148       "long" form -- they are references to hashes of rule properties.  In
149       each key/value pair of a rule descriptor hash, the key is the name of a
150       rule property, and the hash value is the value of that rule property.
151
152       The lhs property
153
154       The value of the "lhs" rule property must be a string containing the
155       name of the rule's left hand side symbol.  Every Marpa rule must have a
156       left hand side symbol.
157
158       The rhs property
159
160       The value of the "rhs" property is a reference to an array of strings
161       containing names of the rule's right hand symbols, in order.  This
162       array may be zero length, in which case this is an empty rule -- a rule
163       with no symbols on the right hand side.  There are no empty rules in
164       this example.
165
166       The action property
167
168       The value of the "action" rule property is a string.  Peeking ahead,
169       each "action" property string will be interpreted as an action name.
170       This action name will be resolved to a Perl closure that implements the
171       rule's semantics.
172
173   Marpa::XS::Grammar::precompute
174           $grammar->precompute();
175
176       Before a Marpa grammar object can be used by a Marpa recognizer, it
177       must be precomputed.  Precomputation compiles data structures that the
178       recognizer will need.
179
180   Marpa::XS::Recognizer::new
181           my $recce = Marpa::XS::Recognizer->new( { grammar => $grammar } );
182
183       "Marpa::XS::Recognizer::new" creates a new recognizer.  Its arguments
184       are references to hashes of named arguments.  In this example the only
185       named argument is the required argument: ""grammar"".  The value of the
186       "grammar" named argument must be a precomputed Marpa grammar.
187
188   Marpa::XS::Recognizer::read
189           $recce->read( 'Number', 42 );
190           $recce->read( 'Multiply', );
191           $recce->read( 'Number', 1 );
192           $recce->read( 'Add', );
193           $recce->read( 'Number', 7 );
194
195       The "Marpa::XS::Recognizer::read" method takes two arguments, a token
196       name and a token value.  The token name must be the name of a valid
197       terminal symbol in the grammar.  By default all symbols are valid as
198       terminal symbols, unless a grammar contains empty rules.
199
200       The grammars in the examples in this document do not contain empty
201       rules, and therefore are free to use any symbol in the grammar as a
202       token name.  For more on terminals, including how to explicitly mark
203       terminal symbols, see "Terminals" in Marpa::XS::Grammar.
204
205       The token value must be a Perl scalar, but otherwise its form and
206       semantics are entirely up to the application.  If the token value is
207       omitted, it is a Perl "undef".  In the calculator example, the values
208       of the ""Add"" and ""Multiply"" tokens are never used, so they are
209       allowed to default to "undef".
210
211   Marpa::XS::Recognizer::value
212           my $value_ref = $recce->value;
213           my $value = $value_ref ? ${$value_ref} : 'No Parse';
214
215       The "Marpa::XS::Recognizer::value" method returns a reference to the
216       parse result's value, if there was a parse result.  If there was no
217       parse result, "Marpa::XS::Recognizer::value" returns "undef".
218
219   Resolving the semantics
220       The first thing "Marpa::XS::Recognizer::value" needs to do is to
221       resolve the semantics.  Resolving the semantics means mapping the
222       action names into actions.  Actions are Perl closures which directly
223       implement semantics.  In this example, the "actions" named argument is
224       specified.  "actions" is a Perl package name.  Marpa will look for
225       actions in that package.
226
227           actions => 'My_Actions',
228
229           { lhs => 'Factor', rhs => [qw/Factor Multiply Factor/], action => 'do_multiply' },
230
231       For example, the "action" property for the above rule is
232       ""do_multiply"" and the "actions" named argument to the grammar was
233       ""My_Actions"".  So Marpa looks for a closure whose fully qualified
234       name is "My_Actions::do_multiply", which it finds:
235
236           sub My_Actions::do_multiply {
237               my ( undef, $t1, undef, $t2 ) = @_;
238               return $t1 * $t2;
239           }
240
241       Rules do not always have "action" properties.  That is the case with
242       these rules in this example:
243
244           { lhs => 'Expression', rhs => [qw/Term/] },
245           { lhs => 'Term', rhs => [qw/Factor/] },
246           { lhs => 'Factor', rhs => [qw/Number/] },
247
248       Where there is no "action" rule property, Marpa tries to use the "lhs"
249       property as an action name.  When Marpa cannot resolve the "lhs"
250       property as an action name, it will fall back to using the default
251       action for the grammar.
252
253       For example, in the first rule in the above display, Marpa will look
254       for a Perl closure with the fully qualified name
255       ""My_Actions::Expression"".  When Marpa does not find a closure by that
256       name, Marpa will fall back to trying to use the default action.
257
258       The other two rules in the above display work similarly.  Marpa will
259       look for Perl closures named ""My_Actions::Term"" and
260       ""My_Actions::Factor"".  It will not find them and will fall back to
261       trying to use the default action, as described next.
262
263           default_action => 'first_arg',
264
265       The "default_action" named argument is resolved in the same way as are
266       the "action" properties of the rules.  In this example, default_action
267       is specified as ""first_arg"" and resolves to "My_Actions::first_arg".
268
269   Actions
270           sub My_Actions::first_arg { shift; return shift; }
271
272           sub My_Actions::do_add {
273               my ( undef, $t1, undef, $t2 ) = @_;
274               return $t1 + $t2;
275           }
276
277       Value actions are Perl closures used as callbacks.  Value actions are
278       called when nodes in a parse tree are evaluated.  A value action
279       receives one or more arguments.  The first argument to a value action
280       is always a per-parse-tree object, which the callbacks can use as a
281       scratchpad.  In these examples, the per-parse-tree object is not used.
282
283       For a non-empty rule, the second and any subsequent arguments to the
284       callback are the values, in lexical order, of the symbols on the right
285       hand side of the rule.  If the action is for an empty rule, the per-
286       parse-tree object will be its only argument.
287
288       Every value action is expected to return a value.  With one exception,
289       this value is passed up to a parent node as an argument.  The exception
290       is the value for the start rule.  The return value for the start rule
291       becomes the parse result.
292
293       Rules with no action specified for them take their semantics from the
294       "default_action" named argument.  If there is no default action for a
295       grammar, rules with no action specified for them return a Perl "undef".
296

EXAMPLE 2: AN AMBIGUOUS PARSE

298       This is the same calculator as before, rewritten to be ambiguous.
299       Rather than give multiplication precedence over addition, the rewritten
300       calculator allows any order of operations.  In this example, the
301       actions ("My_Actions::do_add", etc.)  and the @tokens array remain the
302       same as before.
303
304       Eliminating precedence makes the grammar shorter, but it also means
305       there can be multiple parse trees, and that the different parse trees
306       can have different parse results.  In this application we decide, for
307       each input, to return every one of the parse results.
308
309           use Marpa::XS;
310
311           my $ambiguous_grammar = Marpa::XS::Grammar->new(
312               {   start   => 'E',
313                   actions => 'My_Actions',
314                   rules   => [
315                       [ 'E', [qw/E Add E/],      'do_add' ],
316                       [ 'E', [qw/E Multiply E/], 'do_multiply' ],
317                       [ 'E', [qw/Number/],       ],
318                   ],
319                   default_action => 'first_arg',
320               }
321           );
322
323           $ambiguous_grammar->precompute();
324
325           my $ambiguous_recce =
326               Marpa::XS::Recognizer->new( { grammar => $ambiguous_grammar } );
327
328           $ambiguous_recce->read( 'Number', 42 );
329           $ambiguous_recce->read( 'Multiply', );
330           $ambiguous_recce->read( 'Number', 1 );
331           $ambiguous_recce->read( 'Add', );
332           $ambiguous_recce->read( 'Number', 7 );
333
334           my @values = ();
335           while ( defined( my $ambiguous_value_ref = $ambiguous_recce->value() ) ) {
336               push @values, ${$ambiguous_value_ref};
337           }
338
339   Short form rule descriptors
340           rules => [
341               [ 'E', [qw/E Add E/],      'do_add' ],
342               [ 'E', [qw/E Multiply E/], 'do_multiply' ],
343               [ 'E', [qw/Number/], ],
344           ],
345
346       The rule descriptors in the ambiguous example demonstrate the "short"
347       or array form of rule descriptors.  Array form rule descriptors are
348       references to arrays.  Here the elements are, in order, the "lhs"
349       property, the "rhs" property, and the "action" property.
350
351   Marpa::XS::Recognizer::value
352           my @values = ();
353           while ( defined( my $ambiguous_value_ref = $ambiguous_recce->value() ) ) {
354               push @values, ${$ambiguous_value_ref};
355           }
356
357       When called more than once, the "Marpa::XS::Recognizer::value" method
358       iterates through the parse results.  For each call, it returns a
359       reference to the parse result.  At the end of the iteration, after all
360       parse results have been returned, "Marpa::XS::Recognizer::value"
361       returns "undef".  If there were no parse results,
362       "Marpa::XS::Recognizer::value" returns "undef" the first time that it
363       is called.
364

ERRORS AND EXCEPTIONS

366       Methods in the Marpa API do not return errors.  When there are errors,
367       Marpa API methods throw an exception.
368

INHERITANCE

370       Classes in the Marpa API are not designed to be inherited.
371

OTHER DOCUMENTS

373       This document gives a semi-tutorial overview of the entire Marpa API.
374       For full details on Marpa's grammar objects and their methods, see the
375       Marpa::XS::Grammar document.  For full details on Marpa's recognizer
376       objects and their methods, see the Marpa::XS::Recognizer document.
377
378       Marpa::XS::Vocabulary is intended as a quick refresher in parsing
379       terminology, emphasizing how the standard terms are used in the Marpa
380       context.  Marpa's standard semantics are fully described in the
381       Marpa::XS::Semantics document.  Techniques for tracing and for
382       debugging your Marpa grammars are described in the Marpa::XS::Tracing
383       document and the Marpa::XS::Debug document.  For those with a
384       theoretical bent, my sources, and other useful references, are
385       described in Marpa::XS::Advanced::Bibliography.
386

AUTHOR

388       Jeffrey Kegler
389
390   Why is it called "Marpa"?
391       Marpa is the name of the greatest of the Tibetan "translators".  In his
392       time (the 11th century AD) Indian Buddhism was at its height.  Marpa's
393       generation of scholars was devoted to producing Tibetan versions of
394       Buddhism's Sanskrit scriptures.  Marpa became the greatest of them, and
395       today is known as Marpa Lotsawa: "Marpa the Translator".
396
397   Blatant plug
398       Marpa is a character in my novel, The God Proof.  The God Proof centers
399       around Kurt Goedel's proof of God's existence.  Yes, that Kurt Goedel,
400       and yes, he really did work out a God Proof (it's in his Collected
401       Works, Vol. 3, pp. 403-404).  The God Proof is available as a free
402       download (<http://www.lulu.com/content/933192>).  It can be purchased
403       in print form at Amazon.com:
404       <http://www.amazon.com/God-Proof-Jeffrey-Kegler/dp/1434807355>.
405

ACKNOWLEDGMENTS

407       Marpa is directly derived from two other parsers.  The first was
408       discovered by John Aycock and R.  Nigel Horspool and is described in
409       their Aycock and Horspool 2002.  The second was described by Joop Leo
410       and is described in Leo 1991.  Aycock, Horspool, and Leo, in turn,
411       based their algorithms on the algorithm discovered by Jay Earley.  I
412       combined the Aycock-Horspool algorithm with the Leo algorithm, and
413       added significant changes of my own.
414
415       I'm grateful to Randal Schwartz for his support over the years that
416       I've been working on Marpa.  My chats with Larry Wall have been few and
417       brief, but his openness to new ideas has been a major encouragement and
418       his insight into the relationship between "natural language" and
419       computer language has been a major influence.  More recently, Allison
420       Randal and Patrick Michaud have been generous with their very valuable
421       time.  They might have preferred that I volunteered as a Parrot cage-
422       cleaner, but if so, they were too polite to say.
423
424       Many at perlmonks.org answered questions for me.  I used answers from
425       chromatic, Corion, dragonchild, jdporter, samtregar and Juerd, among
426       others, in writing this module.  I'm just as grateful to those whose
427       answers I didn't use.  My inquiries were made while I was thinking out
428       the code and it wasn't always 100% clear what I was after.  If the butt
429       is moved after the round, it shouldn't count against the archer.
430
431       In writing the Pure Perl version of Marpa, I benefited from studying
432       the work of Francois Desarmenien ("Parse::Yapp"), Damian Conway
433       ("Parse::RecDescent") and Graham Barr ("Scalar::Util").  Adam Kennedy
434       patiently instructed me in module writing, both on the finer points and
435       on issues about which I really should have know better.
436

SUPPORT

438       Marpa::XS comes without warranty.  Support is provided on a volunteer
439       basis through the standard mechanisms for CPAN modules.  The Support
440       document has details.
441
443         Copyright 2012 Jeffrey Kegler
444         This file is part of Marpa::XS.  Marpa::XS is free software: you can
445         redistribute it and/or modify it under the terms of the GNU Lesser
446         General Public License as published by the Free Software Foundation,
447         either version 3 of the License, or (at your option) any later version.
448
449         Marpa::XS is distributed in the hope that it will be useful,
450         but WITHOUT ANY WARRANTY; without even the implied warranty of
451         MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
452         Lesser General Public License for more details.
453
454         You should have received a copy of the GNU Lesser
455         General Public License along with Marpa::XS.  If not, see
456         http://www.gnu.org/licenses/.
457
458
459
460perl v5.32.1                      2021-01-27                      Marpa::XS(3)
Impressum