1Marpa::XS::Semantics(3)User Contributed Perl DocumentatioMnarpa::XS::Semantics(3)
2
3
4
6 Marpa::XS::Semantics - How Marpa evaluates parses
7
9 my $grammar = Marpa::XS::Grammar->new(
10 { start => 'Expression',
11 actions => 'My_Actions',
12 default_action => 'first_arg',
13 rules => [
14 { lhs => 'Expression', rhs => [qw/Term/] },
15 { lhs => 'Term', rhs => [qw/Factor/] },
16 { lhs => 'Factor', rhs => [qw/Number/] },
17 { lhs => 'Term', rhs => [qw/Term Add Term/], action => 'do_add' },
18 { lhs => 'Factor',
19 rhs => [qw/Factor Multiply Factor/],
20 action => 'do_multiply'
21 },
22 ],
23 }
24 );
25
26 sub My_Actions::do_add {
27 my ( undef, $t1, undef, $t2 ) = @_;
28 return $t1 + $t2;
29 }
30
31 sub My_Actions::do_multiply {
32 my ( undef, $t1, undef, $t2 ) = @_;
33 return $t1 * $t2;
34 }
35
36 sub My_Actions::first_arg { shift; return shift; }
37
38 my $value_ref = $recce->value;
39 my $value = $value_ref ? ${$value_ref} : 'No Parse';
40
42 Marpa's semantics will be familiar to those who have used traditional
43 methods to evaluate parses. A parse is seen as a parse tree. Nodes on
44 the tree are evaluated recursively, bottom-up. Once the values of all
45 its child nodes are known, a parent node is ready to be evaluated. The
46 value of a parse is the value of the top node of the parse tree.
47
48 Marpa's semantics operate on the application's original rules and
49 symbols. The internals of the parser are hidden from the semantics.
50 Nodes in Marpa's virtual parse trees are of three kinds:
51
52 • Token Nodes, which have a constant semantics, initialized when the
53 token is read.
54
55 • Rule Nodes, which have a dynamic semantics, based on action names
56 and actions, as described below.
57
58 • Null Nodes, which have a constant semantics, initialized when the
59 recognizer is created.
60
61 Action names and actions
62 When a Marpa grammar is created, its dynamic semantics is specified
63 indirectly, as action names. To implement its semantics of action
64 names and actions, Marpa must do three things:
65
66 • Determine the action name.
67
68 • Resolve the action name to an action. An action is a Perl closure.
69
70 • Call the Perl closure to produce the actual result.
71
72 An action name and action is also used to create the per-parse-tree
73 variable, as described below. The per-parse-tree variable is a special
74 case, but it is intended to be used as part of the semantics.
75
77 Token nodes
78 For every input token, there is an associated token node. Token nodes
79 are leaf nodes in the parse tree. Tokens always have a token symbol.
80 At lexing time, they can be assigned a token value. If no token value
81 is assigned at lex time, the token value defaults to a Perl "undef".
82
83 Rule nodes
84 Nodes which are ancestors of token nodes are called rule nodes. Rule
85 nodes are always associated with a rule. The value of a rule node is
86 computed at Node Evaluation Time. Applications can specify, on a per-
87 rule basis, Perl closures to evaluate rule nodes. The Perl closures
88 which produce the value of rule nodes are called value actions.
89
90 A value action's arguments will be a per-parse-tree variable followed
91 by the values of its child nodes in lexical order. The return value of
92 a value action becomes the value of the node. A value action is always
93 evaluated in scalar context. If there is no value action for a rule
94 node, the value of the rule node is a Perl "undef".
95
96 Sequence rule nodes
97 Some rules are sequence rules. Sequence rule nodes are also rule
98 nodes. Everything said above about rule nodes applies to sequence rule
99 nodes. Specifically, the arguments to the value actions for sequence
100 rules are the per-parse-tree variable followed by the values of the
101 child nodes in lexical order.
102
103 The difference (and it is a big one) is that in an ordinary rule, the
104 right hand side is fixed in length, and that length is known when you
105 are writing the code for the value action. In a sequence rule, the
106 number of right hand side symbols is not known until node evaluation
107 time. The value action of a sequence rule node must be a Perl closure
108 capable of dealing with a variable number of arguments.
109
110 Sequence semantics work best when every child node in the sequence has
111 the same semantics. When that is not the case, writing the sequence
112 using ordinary non-sequence rules should be considered as an
113 alternative.
114
115 By default, if a sequence rule has separators, the separators are
116 thrown away before the value action is called. This means that
117 separators do not appear in the @_ array of the Perl closure which is
118 the value action. If the value of the "keep" rule property is a Perl
119 true value, separators are kept, and do appear in the value action's @_
120 array.
121
122 Null nodes
123 A null node is a node which derives the zero-length, or empty string.
124 By default, the value of a null node is a Perl "undef". This is
125 adequate for many applications, but Marpa allows other values to be
126 specified for null nodes. In fact, Marpa allows allows an arbitrarily
127 complex null semantics. Readers interested in null nodes with values
128 other than "undef", or who would like to read a more detailed account
129 of how Marpa's null semantics works, should turn to the document on
130 null semantics.
131
133 Most applications will find that the order in which Marpa executes its
134 semantics "just works". This section describes that order in detail.
135 These details can matter in some applications, for example, those which
136 exploit side effects.
137
138 Parse trees, parse results and parse series
139 When the semantics are applied to a parse tree, it produces a value
140 called a parse result. Because Marpa allows ambiguous parsing, each
141 parse can produce a parse series -- a series of zero or more parse
142 trees, each with its own parse result. The first call to the the
143 recognizer's "value" method after the recognizer is created is the
144 start of the first parse series. The first parse series continues
145 until there is a call to the the "reset_evaluation" method or until the
146 recognizer is destroyed. Usually, an application is only interested in
147 a single parse series.
148
149 When the "reset_evaluation" method is called for a recognizer, it
150 begins a new parse series. The new parse series continues until there
151 is another call to the the "reset_evaluation" method, or until the
152 recognizer is destroyed.
153
154 Summary of the phases
155 While processing a recognizer, we have
156
157 • A Recognizer Setup Phase, which occurs during the call of a
158 recognizer's "new" method. It is followed by
159
160 • the processing of zero or more parse series.
161
162 While processing a parse series, we have:
163
164 • A Series Setup Phase, which occurs during the first call of the
165 recognizer's "value" method for that series. It is followed by
166
167 • the processing of zero or more parse trees.
168
169 While processing a parse tree, we have:
170
171 • A Tree Setup Phase, which occurs during the call of the
172 recognizer's "value" method for that parse tree. It is followed by
173
174 • a Tree Traveral Phase.
175
176 Node Evaluation Time is the Tree Traversal Phase, as seen from the
177 point of view of each rule node. It is not a separate phase.
178
180 Recognizer Setup Phase
181 In the Recognizer Setup Phase, the null values of the symbols are
182 determined. The null values of symbols never change -- they are in
183 effect properties of the grammar.
184
185 Series Setup Phase
186 During the Series Setup Phase all value action names are resolved to
187 Perl closures -- the value actions. The value actions are never called
188 in the Series Setup Phase. Value actions are called in the Tree
189 Traversal Phase. Also during the Series Setup Phase, the logic which
190 ranks parse trees is executed.
191
192 Tree Setup Phase
193 In the Tree Setup Phase, the per-parse-tree variable is created. If a
194 constructor was found for the "action_object", it is run at this point,
195 and the per-parse-tree variable is its return value. Exactly one Tree
196 Setup Phase occurs for each parse tree.
197
198 Tree Traversal Phase
199 During the Tree Traversal Phase, the value actions are called. Node
200 Evaluation Time is the Tree Traversal Phase, as seen from the point of
201 view of the individual nodes of the parse tree. If a node has a value
202 action, it is called at Node Evaluation Time.
203
205 Marpa finds the action for each rule based on rule and symbol
206 properties and on the grammar named arguments. Specifically, Marpa
207 attempts the following, in order:
208
209 • Resolving an action based on the "action" property of the rule, if
210 one is defined.
211
212 • Resolving an action based on the "lhs" property of the rule.
213
214 • Resolving an action based on the "default_action" named argument of
215 the grammar, if one is defined.
216
217 • Defaulting to a virtual action, one which returns a Perl "undef".
218
219 Resolution of action names is described below. If the "action"
220 property or the "default_action" named argument is defined, but does
221 not resolve successfully, Marpa throws an exception. Marpa prefers to
222 "fast fail" in these cases, because they often indicate a mistake in
223 writing the application.
224
226 Action names come from these sources:
227
228 • The "default_action" named argument of Marpa's grammar.
229
230 • The "action" property of Marpa's rules.
231
232 • The "new" constructor in the package specified by the
233 "action_object" named argument of the Marpa grammar.
234
235 • The "lhs" property of Marpa's rules.
236
237 Explicit resolution
238 The recognizer's "closures" named argument allows the user to directly
239 control the mapping from action names to actions. The value of the
240 "closures" named argument is a reference to a hash whose keys are
241 action names and whose hash values are CODE refs.
242
243 If an action name is the key of an entry in the "closures" hash, it
244 resolves to the closure referenced by the value part of that hash
245 entry. Resolution via the "closures" named argument is called explicit
246 resolution.
247
248 When explicit resolution is the only kind of resolution that is wanted,
249 it is best to pick a name that is very unlikely to be the name of a
250 Perl closure. Many of Marpa::HTML's action names are intended for
251 explicit resolution only. In Marpa::HTML those action names begin with
252 an exclamation mark ("!"), and that convention is recommended.
253
254 Fully qualified action names
255 If explicit resolution fails, Marpa transforms the action name into a
256 fully qualified Perl name. An action name that contains a double colon
257 (""::"") or a single quote (""'"") is considered to be a fully
258 qualified name. Any other action name is considered to be a bare
259 action name.
260
261 If the action name to be resolved is already a fully qualified name, it
262 is not further transformed. It will be resolved in the form it was
263 received, or not at all.
264
265 For bare action names, Marpa tries to qualify them by adding a package
266 name. If the "actions" grammar named argument is defined, Marpa uses
267 it as the package name. Otherwise, if the "action_object" grammar
268 named argument is defined, Marpa uses it as the package name. Once
269 Marpa has fully qualified the action name, Marpa looks for a Perl
270 closure with that name.
271
272 Marpa will not attempt to resolve an action name that it cannot fully
273 qualify. This implies that, for an action name to resolve
274 successfully, one of these four things must be the case:
275
276 • The action name resolves explicitly.
277
278 • The action name is fully qualified to begin with.
279
280 • The "actions" named argument is defined.
281
282 • The "action_object" named argument is defined.
283
284 In all but one circumstance, failure to resolve an action name is
285 thrown as an exception. Marpa is more lenient when a rule attempts to
286 use the "lhs" rule property as an action name. That is the one case in
287 which Marpa will look at other alternatives.
288
289 Marpa's philosophy is to require that the programmer be specific about
290 action names. This can be an inconvenience, but Marpa prefers this to
291 silently executing unintended code.
292
293 Generally it is a good practice to keep the semantic Perl closures in
294 their own namespace. But if, for example, the user wants to leave the
295 semantic closures in the "main" namespace, she can specify "main" as
296 the value of the "actions" named argument.
297
299 In the Tree Setup Phase, Marpa creates a per-parse-tree variable. This
300 becomes the first argument of the semantic Perl closures for the rule
301 nodes. If the grammar's "action_object" named argument is not defined,
302 the per-parse-tree variable is initialized to an empty hash ref.
303
304 Most data for the value actions of the rules will be passed up the
305 parse tree. The actions will see the values of the rule node's child
306 nodes as arguments, and will return their own value to be seen as an
307 argument by their parent node. The per-parse-tree variable can be used
308 for data which does not conveniently fit this model.
309
310 The lifetime of the per-parse-tree variable extends into the Tree
311 Traversal Phase. During the Tree Traversal Phase, Marpa's internals
312 never alter the per-parse-tree variable -- it is reserved for use by
313 the application.
314
315 Action object constructor
316 If the grammar's "action_object" named argument has a defined value,
317 that value is treated as the name of a class. The action object
318 constructor is the "new" method in the "action_object" class.
319
320 The action object constructor is called in the Tree Setup Phase. The
321 return value of the action object constructor becomes the per-parse-
322 tree variable. It is a fatal error if the grammar's "action_object"
323 named argument is defined, but does not name a class with a "new"
324 method.
325
326 Resolution of the action object constructor is resolution of the action
327 object constructor name. The action object constructor name is formed
328 by concatenating the literal string ""::new"" to the value of the
329 "action_object" named argument.
330
331 All standard rules apply when resolving the action object constructor
332 name. In particular, bypass via explicit resolution applies to the
333 action object constructor. If the action object constructor name is a
334 hash key in the evaluator's "closures" named argument, then the value
335 referred to by that hash entry becomes the action object constructor.
336
337 If a grammar has both the "action" and the "action_object" named
338 arguments defined, all action names except for the action object
339 constructor will be resolved in the "action" package or not at all.
340 The action object constructor is always in the "action_object" class,
341 if it is anywhere.
342
344 If a parse is ambiguous, all parses are returned, with no duplication.
345 By default, the order is arbitrary, but it is also possible to control
346 the order. Details are in the document on parse order.
347
349 Marpa allows grammars with infinite loops. These are universally
350 considered useless in practical applications. Marpa supports all the
351 semantics for these that should be necessary and more. Because it can
352 handle infinite loops, Marpa can accurately claim to support every
353 grammar that can be written in BNF.
354
355 Marpa applies semantics to infinite loops by breaking the loop at some
356 convenient point, so that an infinite regress is prevented. The exact
357 location at which the loop is broken varies among Marpa
358 implementations. If an infinite loop is given a null semantics, the
359 location of the break will not matter. A null semantics is the
360 default.
361
362 More could be done. In particular, a precise definition of where loops
363 are broken would allow applications to depend on the details of the
364 structure of infinite loops. But practical interest in this seems non-
365 existent. For those who want to know more, the details are in a
366 separate document.
367
369 Copyright 2012 Jeffrey Kegler
370 This file is part of Marpa::XS. Marpa::XS is free software: you can
371 redistribute it and/or modify it under the terms of the GNU Lesser
372 General Public License as published by the Free Software Foundation,
373 either version 3 of the License, or (at your option) any later version.
374
375 Marpa::XS is distributed in the hope that it will be useful,
376 but WITHOUT ANY WARRANTY; without even the implied warranty of
377 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
378 Lesser General Public License for more details.
379
380 You should have received a copy of the GNU Lesser
381 General Public License along with Marpa::XS. If not, see
382 http://www.gnu.org/licenses/.
383
384
385
386perl v5.34.0 2022-01-21 Marpa::XS::Semantics(3)