1Marpa::XS::Semantics(3)User Contributed Perl DocumentatioMnarpa::XS::Semantics(3)
2
3
4

NAME

6       Marpa::XS::Semantics - How Marpa evaluates parses
7

SYNOPSIS

9           my $grammar = Marpa::XS::Grammar->new(
10               {   start          => 'Expression',
11                   actions        => 'My_Actions',
12                   default_action => 'first_arg',
13                   rules          => [
14                       { lhs => 'Expression', rhs => [qw/Term/] },
15                       { lhs => 'Term',       rhs => [qw/Factor/] },
16                       { lhs => 'Factor',     rhs => [qw/Number/] },
17                       { lhs => 'Term', rhs => [qw/Term Add Term/], action => 'do_add' },
18                       {   lhs    => 'Factor',
19                           rhs    => [qw/Factor Multiply Factor/],
20                           action => 'do_multiply'
21                       },
22                   ],
23               }
24           );
25
26           sub My_Actions::do_add {
27               my ( undef, $t1, undef, $t2 ) = @_;
28               return $t1 + $t2;
29           }
30
31           sub My_Actions::do_multiply {
32               my ( undef, $t1, undef, $t2 ) = @_;
33               return $t1 * $t2;
34           }
35
36           sub My_Actions::first_arg { shift; return shift; }
37
38           my $value_ref = $recce->value;
39           my $value = $value_ref ? ${$value_ref} : 'No Parse';
40

OVERVIEW

42       Marpa's semantics will be familiar to those who have used traditional
43       methods to evaluate parses.  A parse is seen as a parse tree.  Nodes on
44       the tree are evaluated recursively, bottom-up.  Once the values of all
45       its child nodes are known, a parent node is ready to be evaluated.  The
46       value of a parse is the value of the top node of the parse tree.
47
48       Marpa's semantics operate on the application's original rules and
49       symbols.  The internals of the parser are hidden from the semantics.
50       Nodes in Marpa's virtual parse trees are of three kinds:
51
52       •   Token Nodes, which have a constant semantics, initialized when the
53           token is read.
54
55       •   Rule Nodes, which have a dynamic semantics, based on action names
56           and actions, as described below.
57
58       •   Null Nodes, which have a constant semantics, initialized when the
59           recognizer is created.
60
61   Action names and actions
62       When a Marpa grammar is created, its dynamic semantics is specified
63       indirectly, as action names.  To implement its semantics of action
64       names and actions, Marpa must do three things:
65
66       •   Determine the action name.
67
68       •   Resolve the action name to an action.  An action is a Perl closure.
69
70       •   Call the Perl closure to produce the actual result.
71
72       An action name and action is also used to create the per-parse-tree
73       variable, as described below.  The per-parse-tree variable is a special
74       case, but it is intended to be used as part of the semantics.
75

NODES

77   Token nodes
78       For every input token, there is an associated token node.  Token nodes
79       are leaf nodes in the parse tree.  Tokens always have a token symbol.
80       At lexing time, they can be assigned a token value.  If no token value
81       is assigned at lex time, the token value defaults to a Perl "undef".
82
83   Rule nodes
84       Nodes which are ancestors of token nodes are called rule nodes.  Rule
85       nodes are always associated with a rule.  The value of a rule node is
86       computed at Node Evaluation Time.  Applications can specify, on a per-
87       rule basis, Perl closures to evaluate rule nodes.  The Perl closures
88       which produce the value of rule nodes are called value actions.
89
90       A value action's arguments will be a per-parse-tree variable followed
91       by the values of its child nodes in lexical order.  The return value of
92       a value action becomes the value of the node.  A value action is always
93       evaluated in scalar context.  If there is no value action for a rule
94       node, the value of the rule node is a Perl "undef".
95
96   Sequence rule nodes
97       Some rules are sequence rules.  Sequence rule nodes are also rule
98       nodes.  Everything said above about rule nodes applies to sequence rule
99       nodes.  Specifically, the arguments to the value actions for sequence
100       rules are the per-parse-tree variable followed by the values of the
101       child nodes in lexical order.
102
103       The difference (and it is a big one) is that in an ordinary rule, the
104       right hand side is fixed in length, and that length is known when you
105       are writing the code for the value action.  In a sequence rule, the
106       number of right hand side symbols is not known until node evaluation
107       time.  The value action of a sequence rule node must be a Perl closure
108       capable of dealing with a variable number of arguments.
109
110       Sequence semantics work best when every child node in the sequence has
111       the same semantics.  When that is not the case, writing the sequence
112       using ordinary non-sequence rules should be considered as an
113       alternative.
114
115       By default, if a sequence rule has separators, the separators are
116       thrown away before the value action is called.  This means that
117       separators do not appear in the @_ array of the Perl closure which is
118       the value action.  If the value of the "keep" rule property is a Perl
119       true value, separators are kept, and do appear in the value action's @_
120       array.
121
122   Null nodes
123       A null node is a node which derives the zero-length, or empty string.
124       By default, the value of a null node is a Perl "undef".  This is
125       adequate for many applications, but Marpa allows other values to be
126       specified for null nodes.  In fact, Marpa allows allows an arbitrarily
127       complex null semantics.  Readers interested in null nodes with values
128       other than "undef", or who would like to read a more detailed account
129       of how Marpa's null semantics works, should turn to the document on
130       null semantics.
131

OVERVIEW OF THE SEMANTIC PHASES

133       Most applications will find that the order in which Marpa executes its
134       semantics "just works".  This section describes that order in detail.
135       These details can matter in some applications, for example, those which
136       exploit side effects.
137
138   Parse trees, parse results and parse series
139       When the semantics are applied to a parse tree, it produces a value
140       called a parse result.  Because Marpa allows ambiguous parsing, each
141       parse can produce a parse series -- a series of zero or more parse
142       trees, each with its own parse result.  The first call to the the
143       recognizer's "value" method after the recognizer is created is the
144       start of the first parse series.  The first parse series continues
145       until there is a call to the the "reset_evaluation" method or until the
146       recognizer is destroyed.  Usually, an application is only interested in
147       a single parse series.
148
149       When the "reset_evaluation" method is called for a recognizer, it
150       begins a new parse series.  The new parse series continues until there
151       is another call to the the "reset_evaluation" method, or until the
152       recognizer is destroyed.
153
154   Summary of the phases
155       While processing a recognizer, we have
156
157       •   A Recognizer Setup Phase, which occurs during the call of a
158           recognizer's "new" method.  It is followed by
159
160       •   the processing of zero or more parse series.
161
162       While processing a parse series, we have:
163
164       •   A Series Setup Phase, which occurs during the first call of the
165           recognizer's "value" method for that series.  It is followed by
166
167       •   the processing of zero or more parse trees.
168
169       While processing a parse tree, we have:
170
171       •   A Tree Setup Phase, which occurs during the call of the
172           recognizer's "value" method for that parse tree.  It is followed by
173
174       •   a Tree Traveral Phase.
175
176       Node Evaluation Time is the Tree Traversal Phase, as seen from the
177       point of view of each rule node.  It is not a separate phase.
178

THE SEMANTIC PHASES IN DETAIL

180   Recognizer Setup Phase
181       In the Recognizer Setup Phase, the null values of the symbols are
182       determined.  The null values of symbols never change -- they are in
183       effect properties of the grammar.
184
185   Series Setup Phase
186       During the Series Setup Phase all value action names are resolved to
187       Perl closures -- the value actions.  The value actions are never called
188       in the Series Setup Phase.  Value actions are called in the Tree
189       Traversal Phase.  Also during the Series Setup Phase, the logic which
190       ranks parse trees is executed.
191
192   Tree Setup Phase
193       In the Tree Setup Phase, the per-parse-tree variable is created.  If a
194       constructor was found for the "action_object", it is run at this point,
195       and the per-parse-tree variable is its return value.  Exactly one Tree
196       Setup Phase occurs for each parse tree.
197
198   Tree Traversal Phase
199       During the Tree Traversal Phase, the value actions are called.  Node
200       Evaluation Time is the Tree Traversal Phase, as seen from the point of
201       view of the individual nodes of the parse tree.  If a node has a value
202       action, it is called at Node Evaluation Time.
203

FINDING THE ACTION FOR A RULE

205       Marpa finds the action for each rule based on rule and symbol
206       properties and on the grammar named arguments.  Specifically, Marpa
207       attempts the following, in order:
208
209       •   Resolving an action based on the "action" property of the rule, if
210           one is defined.
211
212       •   Resolving an action based on the "lhs" property of the rule.
213
214       •   Resolving an action based on the "default_action" named argument of
215           the grammar, if one is defined.
216
217       •   Defaulting to a virtual action, one which returns a Perl "undef".
218
219       Resolution of action names is described below.  If the "action"
220       property or the "default_action" named argument is defined, but does
221       not resolve successfully, Marpa throws an exception.  Marpa prefers to
222       "fast fail" in these cases, because they often indicate a mistake in
223       writing the application.
224

RESOLVING ACTION NAMES

226       Action names come from these sources:
227
228       •   The "default_action" named argument of Marpa's grammar.
229
230       •   The "action" property of Marpa's rules.
231
232       •   The "new" constructor in the package specified by the
233           "action_object" named argument of the Marpa grammar.
234
235       •   The "lhs" property of Marpa's rules.
236
237   Explicit resolution
238       The recognizer's "closures" named argument allows the user to directly
239       control the mapping from action names to actions.  The value of the
240       "closures" named argument is a reference to a hash whose keys are
241       action names and whose hash values are CODE refs.
242
243       If an action name is the key of an entry in the "closures" hash, it
244       resolves to the closure referenced by the value part of that hash
245       entry.  Resolution via the "closures" named argument is called explicit
246       resolution.
247
248       When explicit resolution is the only kind of resolution that is wanted,
249       it is best to pick a name that is very unlikely to be the name of a
250       Perl closure.  Many of Marpa::HTML's action names are intended for
251       explicit resolution only.  In Marpa::HTML those action names begin with
252       an exclamation mark ("!"), and that convention is recommended.
253
254   Fully qualified action names
255       If explicit resolution fails, Marpa transforms the action name into a
256       fully qualified Perl name.  An action name that contains a double colon
257       (""::"") or a single quote (""'"") is considered to be a fully
258       qualified name.  Any other action name is considered to be a bare
259       action name.
260
261       If the action name to be resolved is already a fully qualified name, it
262       is not further transformed.  It will be resolved in the form it was
263       received, or not at all.
264
265       For bare action names, Marpa tries to qualify them by adding a package
266       name.  If the "actions" grammar named argument is defined, Marpa uses
267       it as the package name.  Otherwise, if the "action_object" grammar
268       named argument is defined, Marpa uses it as the package name.  Once
269       Marpa has fully qualified the action name, Marpa looks for a Perl
270       closure with that name.
271
272       Marpa will not attempt to resolve an action name that it cannot fully
273       qualify.  This implies that, for an action name to resolve
274       successfully, one of these four things must be the case:
275
276       •   The action name resolves explicitly.
277
278       •   The action name is fully qualified to begin with.
279
280       •   The "actions" named argument is defined.
281
282       •   The "action_object" named argument is defined.
283
284       In all but one circumstance, failure to resolve an action name is
285       thrown as an exception.  Marpa is more lenient when a rule attempts to
286       use the "lhs" rule property as an action name.  That is the one case in
287       which Marpa will look at other alternatives.
288
289       Marpa's philosophy is to require that the programmer be specific about
290       action names.  This can be an inconvenience, but Marpa prefers this to
291       silently executing unintended code.
292
293       Generally it is a good practice to keep the semantic Perl closures in
294       their own namespace.  But if, for example, the user wants to leave the
295       semantic closures in the "main" namespace, she can specify "main" as
296       the value of the "actions" named argument.
297

THE PER-PARSE-TREE VARIABLE

299       In the Tree Setup Phase, Marpa creates a per-parse-tree variable.  This
300       becomes the first argument of the semantic Perl closures for the rule
301       nodes.  If the grammar's "action_object" named argument is not defined,
302       the per-parse-tree variable is initialized to an empty hash ref.
303
304       Most data for the value actions of the rules will be passed up the
305       parse tree.  The actions will see the values of the rule node's child
306       nodes as arguments, and will return their own value to be seen as an
307       argument by their parent node.  The per-parse-tree variable can be used
308       for data which does not conveniently fit this model.
309
310       The lifetime of the per-parse-tree variable extends into the Tree
311       Traversal Phase.  During the Tree Traversal Phase, Marpa's internals
312       never alter the per-parse-tree variable -- it is reserved for use by
313       the application.
314
315   Action object constructor
316       If the grammar's "action_object" named argument has a defined value,
317       that value is treated as the name of a class.  The action object
318       constructor is the "new" method in the "action_object" class.
319
320       The action object constructor is called in the Tree Setup Phase.  The
321       return value of the action object constructor becomes the per-parse-
322       tree variable.  It is a fatal error if the grammar's "action_object"
323       named argument is defined, but does not name a class with a "new"
324       method.
325
326       Resolution of the action object constructor is resolution of the action
327       object constructor name.  The action object constructor name is formed
328       by concatenating the literal string ""::new"" to the value of the
329       "action_object" named argument.
330
331       All standard rules apply when resolving the action object constructor
332       name.  In particular, bypass via explicit resolution applies to the
333       action object constructor.  If the action object constructor name is a
334       hash key in the evaluator's "closures" named argument, then the value
335       referred to by that hash entry becomes the action object constructor.
336
337       If a grammar has both the "action" and the "action_object" named
338       arguments defined, all action names except for the action object
339       constructor will be resolved in the "action" package or not at all.
340       The action object constructor is always in the "action_object" class,
341       if it is anywhere.
342

PARSE ORDER

344       If a parse is ambiguous, all parses are returned, with no duplication.
345       By default, the order is arbitrary, but it is also possible to control
346       the order.  Details are in the document on parse order.
347

INFINITE LOOPS

349       Marpa allows grammars with infinite loops.  These are universally
350       considered useless in practical applications.  Marpa supports all the
351       semantics for these that should be necessary and more.  Because it can
352       handle infinite loops, Marpa can accurately claim to support every
353       grammar that can be written in BNF.
354
355       Marpa applies semantics to infinite loops by breaking the loop at some
356       convenient point, so that an infinite regress is prevented.  The exact
357       location at which the loop is broken varies among Marpa
358       implementations.  If an infinite loop is given a null semantics, the
359       location of the break will not matter.  A null semantics is the
360       default.
361
362       More could be done.  In particular, a precise definition of where loops
363       are broken would allow applications to depend on the details of the
364       structure of infinite loops.  But practical interest in this seems non-
365       existent.  For those who want to know more, the details are in a
366       separate document.
367
369         Copyright 2012 Jeffrey Kegler
370         This file is part of Marpa::XS.  Marpa::XS is free software: you can
371         redistribute it and/or modify it under the terms of the GNU Lesser
372         General Public License as published by the Free Software Foundation,
373         either version 3 of the License, or (at your option) any later version.
374
375         Marpa::XS is distributed in the hope that it will be useful,
376         but WITHOUT ANY WARRANTY; without even the implied warranty of
377         MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
378         Lesser General Public License for more details.
379
380         You should have received a copy of the GNU Lesser
381         General Public License along with Marpa::XS.  If not, see
382         http://www.gnu.org/licenses/.
383
384
385
386perl v5.34.0                      2022-01-21           Marpa::XS::Semantics(3)
Impressum