PARSLEY(1)                          Parsley                         PARSLEY(1)

NAME
       parsley - Parsley Documentation

PARSLEY TUTORIAL PART I: BASICS AND SYNTAX

   From Regular Expressions To Grammars
       Parsley is a pattern matching and parsing tool for Python programmers.

       Most Python programmers are familiar with regular expressions, as
       provided by Python’s re module. To use it, you provide a string that
       describes the pattern you want to match, and your input.

       For example:

          >>> import re
          >>> x = re.compile("a(b|c)d+e")
          >>> x.match("abddde")
          <_sre.SRE_Match object at 0x7f587af54af8>

       You can do exactly the same sort of thing in Parsley:

          >>> import parsley
          >>> x = parsley.makeGrammar("foo = 'a' ('b' | 'c') 'd'+ 'e'", {})
          >>> x("abdde").foo()
          'e'

       From this small example, a couple of differences between regular
       expressions and Parsley grammars can be seen:

   Parsley Grammars Have Named Rules
       A Parsley grammar can have many rules, and each has a name. The example
       above has a single rule named foo. Rules can call each other; calling
       rules in Parsley works like calling functions in Python. Here is
       another way to write the grammar above:

          foo = 'a' baz 'd'+ 'e'
          baz = 'b' | 'c'

   Parsley Grammars Are Expressions
       Calling match for a regular expression returns a match object if the
       match succeeds or None if it fails. Parsley parsers return the value of
       the last expression in the rule. Behind the scenes, Parsley turns each
       rule in your grammar into a Python method. In pseudo-Python code, it
       looks something like this:

          def foo(self):
              match('a')
              self.baz()
              match_one_or_more('d')
              return match('e')

          def baz(self):
              return match('b') or match('c')

       The value of the last expression in the rule is what the rule returns.
       This is why our example returns ‘e’.
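
       The pseudo-Python above can be made concrete with a small hand-written
       matcher. This is only a sketch of the idea, not the code Parsley
       actually generates; the TinyParser class, its helper methods, and the
       ParseError exception are hypothetical names chosen for illustration.

```python
class ParseError(Exception):
    """Raised when the input does not match what a rule expects."""


class TinyParser(object):
    def __init__(self, data):
        self.data = data
        self.pos = 0

    def match(self, ch):
        # Consume one expected character and return it.
        # On failure the position is left unchanged.
        if self.data[self.pos:self.pos + 1] != ch:
            raise ParseError("expected %r at position %d" % (ch, self.pos))
        self.pos += 1
        return ch

    def match_one_or_more(self, ch):
        # 'd'+ : one mandatory match, then as many more as are available.
        matches = [self.match(ch)]
        while self.data[self.pos:self.pos + 1] == ch:
            matches.append(self.match(ch))
        return matches

    def baz(self):
        # 'b' | 'c' : try the first alternative, fall back to the second.
        try:
            return self.match('b')
        except ParseError:
            return self.match('c')

    def foo(self):
        self.match('a')
        self.baz()
        self.match_one_or_more('d')
        # The value of the last expression is what the rule returns.
        return self.match('e')


result = TinyParser("abdde").foo()
print(result)
```

       As in the Parsley example, foo returns the value of its last
       expression, the matched ‘e’.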

       The similarities to regular expressions pretty much end here, though.
       Having multiple named rules composed of expressions makes for a much
       more powerful tool, and now we’re going to look at some more features
       that go even further.

   Rules Can Embed Python Expressions
       Since these rules just turn into Python code eventually, we can stick
       some Python code into them ourselves. This is particularly useful for
       changing the return value of a rule. The Parsley expression for this is
       ->. We can also bind the results of expressions to variable names and
       use them in Python code. So things like this are possible:

          x = parsley.makeGrammar("""
          foo = 'a':one baz:two 'd'+ 'e' -> (one, two)
          baz = 'b' | 'c'
          """, {})
          print(x("abdde").foo())

          ('a', 'b')

       Literal match expressions like ‘a’ return the character they match.
       Using a colon and a variable name after an expression is like
       assignment in Python. As a result, we can use those names in a Python
       expression - in this case, creating a tuple.

       Another way to use Python code in a rule is to write custom tests for
       matching. Sometimes it’s more convenient to write some Python that
       determines if a rule matches than to stick to Parsley expressions
       alone. For those cases, we can use ?(). Here, we use the builtin rule
       anything to match a single character, then a Python predicate to decide
       if it’s the one we want:

          digit = anything:x ?(x in '0123456789') -> x

       This rule digit will match any decimal digit. We need the -> x on the
       end to return the character rather than the value of the predicate
       expression, which is just True.

   Repeated Matches Make Lists
       Like regular expressions, Parsley supports repeating matches. You can
       match an expression zero or more times with ‘*’, one or more times with
       ‘+’, and a specific number of times with ‘{n, m}’ or just ‘{n}’. Since
       all expressions in Parsley return a value, these repetition operators
       return a list containing each match they made.

          x = parsley.makeGrammar("""
          digit = anything:x ?(x in '0123456789') -> x
          number = digit+
          """, {})
          print(x("314159").number())

          ['3', '1', '4', '1', '5', '9']

       The number rule repeatedly matches digit and collects the matches into
       a list. This gets us part way to turning a string like 314159 into an
       integer. All we need now is to turn the list back into a string and
       call int():

          x = parsley.makeGrammar("""
          digit = anything:x ?(x in '0123456789') -> x
          number = digit+:ds -> int(''.join(ds))
          """, {})
          print(x("8675309").number())

          8675309

   Collecting Chunks Of Input
       If it seemed kind of strange to break our input string up into a list
       and then reassemble it into a string using join, you’re not alone.
       Parsley has a shortcut for this since it’s a common case: you can use
       <> around a rule to make it return the slice of input it consumes,
       ignoring the actual return value of the rule. For example:

          x = parsley.makeGrammar("""
          digit = anything:x ?(x in '0123456789')
          number = <digit+>:ds -> int(ds)
          """, {})
          print(x("11235").number())

          11235

       Here, <digit+> returns the string “11235”, since that’s the portion of
       the input that digit+ matched. (In this case it’s the entire input, but
       we’ll see some more complex cases soon.) Since it ignores the list
       returned by digit+, leaving the -> x out of digit doesn’t change the
       result.

   Building A Calculator
       Now let’s look at using these rules in a more complicated parser. We
       have support for parsing numbers; let’s do addition, as well.

          x = parsley.makeGrammar("""
          digit = anything:x ?(x in '0123456789')
          number = <digit+>:ds -> int(ds)
          expr = number:left ( '+' number:right -> left + right
                             | -> left)
          """, {})
          print(x("17+34").expr())
          print(x("18").expr())

          51
          18

       Parentheses group expressions just like in Python. The ‘|’ operator is
       like or in Python - it short-circuits. It tries each expression until
       it finds one that matches. For “17+34”, the number rule matches “17”,
       then Parsley tries to match + followed by another number. Since “+” and
       “34” are the next things in the input, those match, and it then runs
       the Python expression left + right and returns its value. For the input
       “18” it does the same, but + does not match, so Parsley tries the next
       thing after |. Since this is just a Python expression, the match
       succeeds and the number 18 is returned.

       Now let’s add subtraction:

          digit = anything:x ?(x in '0123456789')
          number = <digit+>:ds -> int(ds)
          expr = number:left ( '+' number:right -> left + right
                             | '-' number:right -> left - right
                             | -> left)

       This will accept things like ‘5-4’ now.

       Since parsing numbers is so common and useful, Parsley actually has
       ‘digit’ as a builtin rule, so we don’t even need to define it
       ourselves. We’ll leave it out in further examples and rely on the
       version Parsley provides.

       Normally we like to allow whitespace in our expressions, so let’s add
       some support for spaces:

          number = <digit+>:ds -> int(ds)
          ws = ' '*
          expr = number:left ws ('+' ws number:right -> left + right
                                |'-' ws number:right -> left - right
                                | -> left)

       Now we can handle “17 +34”, “2  - 1”, etc.

       We could go ahead and add multiplication and division here (and
       hopefully it’s obvious how that would work), but let’s complicate
       things further and allow multiple operations in our expressions –
       things like “1 - 2 + 3”.

       There are a couple of different ways to do this. Possibly the easiest
       is to build a list of numbers and operations, then do the math:

          x = parsley.makeGrammar("""
          number = <digit+>:ds -> int(ds)
          ws = ' '*
          add = '+' ws number:n -> ('+', n)
          sub = '-' ws number:n -> ('-', n)
          addsub = ws (add | sub)
          expr = number:left (addsub+:right -> right
                             | -> left)
          """, {})
          print(x("1 + 2 - 3").expr())

          [('+', 2), ('-', 3)]

       Oops, this is only half the job done. We’re collecting the operators
       and values, but now we need to do the actual calculation. The easiest
       way to do it is probably to write a Python function and call it from
       inside the grammar.

       So far we have been passing an empty dict as the second argument to
       makeGrammar. This is a dict of variable bindings that can be used in
       Python expressions in the grammar. So we can pass Python objects, such
       as functions, this way:

          def calculate(start, pairs):
              result = start
              for op, value in pairs:
                  if op == '+':
                      result += value
                  elif op == '-':
                      result -= value
              return result
          x = parsley.makeGrammar("""
          number = <digit+>:ds -> int(ds)
          ws = ' '*
          add = '+' ws number:n -> ('+', n)
          sub = '-' ws number:n -> ('-', n)
          addsub = ws (add | sub)
          expr = number:left (addsub+:right -> calculate(left, right)
                             | -> left)
          """, {"calculate": calculate})
          print(x("4 + 5 - 6").expr())

          3

       Introducing this function lets us simplify even further: instead of
       using addsub+, we can use addsub*, since calculate(left, []) will
       return left – so now expr becomes:

          expr = number:left addsub*:right -> calculate(left, right)
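
       The claim that calculate(left, []) returns left is easy to check in
       plain Python, without Parsley. The snippet below repeats the
       two-operator calculate from above and calls it directly; the pair
       lists are written by hand in the shape the grammar rules would
       produce, which is an assumption of this sketch.

```python
def calculate(start, pairs):
    # Apply each (operator, value) pair to the running result, left to right.
    result = start
    for op, value in pairs:
        if op == '+':
            result += value
        elif op == '-':
            result -= value
    return result


# No addsub matched: the loop body never runs, so the start value comes back.
no_ops = calculate(4, [])

# The pairs that addsub* would collect for "4 + 5 - 6".
with_ops = calculate(4, [('+', 5), ('-', 6)])

print(no_ops, with_ops)
```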

       So now let’s look at adding multiplication and division. Here, we run
       into precedence rules: should “4 * 5 + 6” give us 26, or 44? The
       traditional choice is for multiplication and division to take
       precedence over addition and subtraction, so the answer should be 26.
       We’ll resolve this by making sure multiplication and division happen
       before addition and subtraction are considered:

          def calculate(start, pairs):
              result = start
              for op, value in pairs:
                  if op == '+':
                      result += value
                  elif op == '-':
                      result -= value
                  elif op == '*':
                      result *= value
                  elif op == '/':
                      result /= value
              return result
          x = parsley.makeGrammar("""
          number = <digit+>:ds -> int(ds)
          ws = ' '*
          add = '+' ws expr2:n -> ('+', n)
          sub = '-' ws expr2:n -> ('-', n)
          mul = '*' ws number:n -> ('*', n)
          div = '/' ws number:n -> ('/', n)

          addsub = ws (add | sub)
          muldiv = ws (mul | div)

          expr = expr2:left addsub*:right -> calculate(left, right)
          expr2 = number:left muldiv*:right -> calculate(left, right)
          """, {"calculate": calculate})
          print(x("4 * 5 + 6").expr())

          26

       Notice particularly that add, sub, and expr all call the expr2 rule now
       where they called number before. This means that all the places where a
       number was expected previously, a multiplication or division expression
       can appear instead.
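
       To see why routing add, sub, and expr through expr2 yields the
       conventional precedence, the nesting can be traced by hand in plain
       Python. This is only an illustration of the evaluation order; the
       hand-written pair lists stand in for what the grammar rules would
       return and are assumptions of this sketch.

```python
def calculate(start, pairs):
    # Apply each (operator, value) pair to the running result, left to right.
    result = start
    for op, value in pairs:
        if op == '+':
            result += value
        elif op == '-':
            result -= value
        elif op == '*':
            result *= value
        elif op == '/':
            result /= value
    return result


# "4 * 5 + 6": expr first calls expr2, which consumes "4 * 5" ...
left = calculate(4, [('*', 5)])
# ... and only then applies the addsub pairs to that result.
first = calculate(left, [('+', 6)])

# "4 + 5 * 6": add calls expr2 for its operand, so "5 * 6" is
# evaluated before the addition sees it.
operand = calculate(5, [('*', 6)])
second = calculate(4, [('+', operand)])

print(first, second)
```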

       Finally let’s add parentheses, so you can override the precedence and
       write “4 * (5 + 6)” when you do want 44. We’ll do this by adding a
       value rule that accepts either a number or an expression in
       parentheses, and replace existing calls to number with calls to value.

          def calculate(start, pairs):
              result = start
              for op, value in pairs:
                  if op == '+':
                      result += value
                  elif op == '-':
                      result -= value
                  elif op == '*':
                      result *= value
                  elif op == '/':
                      result /= value
              return result
          x = parsley.makeGrammar("""
          number = <digit+>:ds -> int(ds)
          parens = '(' ws expr:e ws ')' -> e
          value = number | parens
          ws = ' '*
          add = '+' ws expr2:n -> ('+', n)
          sub = '-' ws expr2:n -> ('-', n)
          mul = '*' ws value:n -> ('*', n)
          div = '/' ws value:n -> ('/', n)

          addsub = ws (add | sub)
          muldiv = ws (mul | div)

          expr = expr2:left addsub*:right -> calculate(left, right)
          expr2 = value:left muldiv*:right -> calculate(left, right)
          """, {"calculate": calculate})

          print(x("4 * (5 + 6) + 1").expr())

          45

       And there you have it: a four-function calculator with precedence and
       parentheses.

PARSLEY TUTORIAL PART II: PARSING STRUCTURED DATA

       Now that you are familiar with the basics of Parsley syntax, let’s look
       at a more realistic example: a JSON parser.

       The JSON spec on http://json.org/ describes the format, and we can
       adapt its description to a parser. We’ll write the Parsley rules in the
       same order as the grammar rules in the right sidebar on the JSON site,
       starting with the top-level rule, ‘object’.

          object = ws '{' members:m ws '}' -> dict(m)

       Parsley defines a builtin rule ws which consumes any spaces, tabs, or
       newlines it can.

       Since JSON objects are represented in Python as dicts, and dict takes a
       list of pairs, we need a rule to collect name/value pairs inside an
       object expression.

          members = (pair:first (ws ',' pair)*:rest -> [first] + rest)
                    | -> []

       This handles the three cases for object contents: one, multiple, or
       zero pairs. A name/value pair is separated by a colon. We use the
       builtin rule ws to consume any whitespace around the name; whitespace
       after the colon is consumed by the value rule, which begins with ws:

          pair = ws string:k ws ':' value:v -> (k, v)

       Arrays, similarly, are sequences of array elements, and are represented
       as Python lists.

          array = '[' elements:xs ws ']' -> xs
          elements = (value:first (ws ',' value)*:rest -> [first] + rest) | -> []

       Values can be any JSON expression.

          value = ws (string | number | object | array
                     | 'true'  -> True
                     | 'false' -> False
                     | 'null'  -> None)

       Strings are sequences of zero or more characters between double quotes.
       Of course, we need to deal with escaped characters as well. This rule
       introduces the operator ~, which does negative lookahead; if the
       expression following it succeeds, its parse will fail. If the
       expression fails, the rest of the parse continues. Either way, no input
       will be consumed.

          string = '"' (escapedChar | ~'"' anything)*:c '"' -> ''.join(c)

       This is a common pattern, so let’s examine it step by step. This will
       match a double quote character; any leading whitespace has already been
       consumed by the calling rule (value or pair). It then matches zero or
       more characters. If it’s not an escapedChar (which will start with a
       backslash), we check to see if it’s a double quote, in which case we
       want to end the loop. If it’s not a double quote, we match it using the
       rule anything, which accepts a single character of any kind, and
       continue. Finally, we match the ending double quote and return the
       characters in the string. We cannot use the <> syntax in this case
       because we don’t want a literal slice of the input – we want escape
       sequences to be replaced with the character they represent.

       It’s very common to use ~ for “match until” situations where you want
       to keep parsing only until an end marker is found. Similarly, ~~ is
       positive lookahead: it succeeds if its expression succeeds but does not
       consume any input.

       The escapedChar rule should not be too surprising: we match a backslash
       then whatever escape code is given.

          escapedChar = '\\' (('"' -> '"')    |('\\' -> '\\')
                             |('/' -> '/')    |('b' -> '\b')
                             |('f' -> '\f')   |('n' -> '\n')
                             |('r' -> '\r')   |('t' -> '\t')
                             |('\'' -> '\'')  | escapedUnicode)
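
       The control flow of the string and escapedChar rules can be spelled
       out as an ordinary Python loop. This is a hand-written sketch of the
       same "match until the closing quote" pattern, not Parsley’s
       implementation; parse_string and SIMPLE_ESCAPES are hypothetical
       names, and Unicode escapes and the \' escape are left out for brevity.

```python
# Escape codes handled by this sketch (Unicode escapes are left out).
SIMPLE_ESCAPES = {
    '"': '"', '\\': '\\', '/': '/', 'b': '\b',
    'f': '\f', 'n': '\n', 'r': '\r', 't': '\t',
}


def parse_string(data, pos=0):
    """Parse a double-quoted string at data[pos]; return (value, next_pos)."""
    if data[pos:pos + 1] != '"':
        raise ValueError("expected a double quote at position %d" % pos)
    pos += 1
    chars = []
    while pos < len(data):
        ch = data[pos]
        code = data[pos + 1:pos + 2]
        if ch == '\\' and code in SIMPLE_ESCAPES:
            # escapedChar: replace the two-character escape with one character.
            chars.append(SIMPLE_ESCAPES[code])
            pos += 2
        elif ch == '"':
            # ~'"' fails here. The quote was only peeked at, not consumed,
            # so the closing-quote check below still sees it.
            break
        else:
            # anything: accept a single ordinary character.
            chars.append(ch)
            pos += 1
    if data[pos:pos + 1] != '"':
        raise ValueError("unterminated string")
    return ''.join(chars), pos + 1


value, next_pos = parse_string('"a\\"b\\n" rest')
print(repr(value), next_pos)
```

       The returned value contains the decoded characters rather than a slice
       of the input, which is the reason the grammar rule cannot use <>.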

       Unicode escapes (of the form \u2603) require matching four hex digits,
       so we use the repetition operator {}, which works like + or * except
       taking either a {min, max} pair or simply a {number} indicating the
       exact number of repetitions.

          hexdigit = :x ?(x in '0123456789abcdefABCDEF') -> x
          escapedUnicode = 'u' <hexdigit{4}>:hs -> unichr(int(hs, 16))

       With strings out of the way, we advance to numbers, both integer and
       floating-point.

          number = spaces ('-' | -> ''):sign (intPart:ds (floatPart(sign ds)
                                                         | -> int(sign + ds)))

       Here we vary from the json.org description a little and move sign
       handling up into the number rule. We match either an intPart followed
       by a floatPart or just an intPart by itself.

          digit = :x ?(x in '0123456789') -> x
          digits = <digit*>
          digit1_9 = :x ?(x in '123456789') -> x

          intPart = (digit1_9:first digits:rest -> first + rest) | digit
          floatPart :sign :ds = <('.' digits exponent?) | exponent>:tail
                               -> float(sign + ds + tail)
          exponent = ('e' | 'E') ('+' | '-')? digits

       In JSON, multi-digit numbers cannot start with 0 (since that is
       JavaScript’s syntax for octal numbers), so intPart uses digit1_9 to
       exclude it in the first position.

       The floatPart rule takes two parameters, sign and ds. Our number rule
       passes values for these when it invokes floatPart, letting us avoid
       duplication of work within the rule. Note that pattern matching on
       arguments to rules works the same as on the string input to the parser.
       In this case, we provide no pattern, just a name: :ds is the same as
       anything:ds.

       (Also note that our floatPart rule cheats a little: it does not really
       parse floating-point numbers, it merely recognizes them and passes them
       to Python’s float builtin to actually produce the value.)
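
       The same "recognize, then hand off to int or float" approach can be
       sketched in plain Python. Here a regular expression is swapped in for
       the grammar rules; parse_number and NUMBER are hypothetical names, and
       unlike the digits rule above this sketch assumes at least one digit
       after a decimal point or exponent marker.

```python
import re

# Groups: sign, intPart, and the optional fraction/exponent tail.
NUMBER = re.compile(
    r'(-?)(0|[1-9][0-9]*)((?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)\Z')


def parse_number(text):
    m = NUMBER.match(text)
    if m is None:
        raise ValueError("not a JSON number: %r" % text)
    sign, ds, tail = m.groups()
    if tail:
        # Like floatPart: let float() produce the value from the
        # recognized text instead of computing it by hand.
        return float(sign + ds + tail)
    # No fraction or exponent: plain integer.
    return int(sign + ds)


print(parse_number("42"), parse_number("-12.5e1"), parse_number("3e2"))
```

       As with intPart, a multi-digit number with a leading zero such as
       "012" is rejected.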

       The full version of this parser and its test cases can be found in the
       examples directory in the Parsley distribution.

PARSLEY TUTORIAL PART III: PARSING NETWORK DATA

       This tutorial assumes basic knowledge of writing Twisted TCP clients or
       servers.

   Basic parsing
       Parsing data that comes in over the network can be difficult because
       there is no guarantee of receiving whole messages. Buffering is often
       complicated by protocols switching between using fixed-width messages
       and delimiters for framing. Fortunately, Parsley can remove all of this
       tedium.

       With parsley.makeProtocol(), Parsley can generate a Twisted
       IProtocol-implementing class which will match incoming network data
       using Parsley grammar rules. Before getting started with
       makeProtocol(), let’s build a grammar for netstrings. The netstrings
       protocol is very simple:

          4:spam,4:eggs,

       This stream contains two netstrings: spam, and eggs. The data is
       prefixed with one or more ASCII digits followed by a :, and suffixed
       with a ,. So, a Parsley grammar to match a netstring would look like:

          nonzeroDigit = digit:x ?(x != '0')
          digits = <'0' | nonzeroDigit digit*>:i -> int(i)

          netstring = digits:length ':' <anything{length}>:string ',' -> string
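
       For comparison, here is roughly what the netstring framing and the
       buffering of incomplete messages look like when written by hand. This
       is a plain-Python sketch under the framing rules described above, not
       Parsley or Twisted code; parse_netstrings is a hypothetical helper
       that returns the complete messages and any unconsumed remainder.

```python
def parse_netstrings(buffer):
    """Return (messages, remainder) for the netstrings found in buffer."""
    messages = []
    pos = 0
    while True:
        colon = buffer.find(':', pos)
        if colon == -1:
            break  # the length prefix has not fully arrived yet
        length_text = buffer[pos:colon]
        # Mirror the digits rule: '0' alone, or no leading zero.
        if not length_text.isdigit() or (
                len(length_text) > 1 and length_text[0] == '0'):
            raise ValueError("bad length prefix: %r" % length_text)
        end = colon + 1 + int(length_text)
        if len(buffer) < end + 1:
            break  # payload or trailing comma not received yet
        if buffer[end] != ',':
            raise ValueError("missing trailing comma")
        messages.append(buffer[colon + 1:end])
        pos = end + 1
    return messages, buffer[pos:]


complete = parse_netstrings("4:spam,4:eggs,")
partial = parse_netstrings("4:spam,4:eg")
print(complete, partial)
```

       The second call shows the buffering problem: the unfinished "4:eg"
       must be kept and retried once more data arrives.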

       makeProtocol() takes, in addition to a grammar, a factory for a
       “sender” and a factory for a “receiver”. In the system of objects
       managed by the ParserProtocol, the sender is in charge of writing data
       to the wire, and the receiver has methods called on it by the Parsley
       rules. To demonstrate it, here is the final piece needed in the Parsley
       grammar for netstrings:

          receiveNetstring = netstring:string -> receiver.netstringReceived(string)

       The receiver is always available in Parsley rules with the name
       receiver, allowing Parsley rules to call methods on it.

       When data is received over the wire, the ParserProtocol tries to match
       the received data against the current rule. If the current rule
       requires more data to finish matching, the ParserProtocol stops and
       waits until more data comes in, then tries to continue matching. This
       repeats until the current rule is completely matched, and then the
       ParserProtocol starts matching any leftover data against the current
       rule again.

       One specifies the current rule by setting a currentRule attribute on
       the receiver, which the ParserProtocol looks at before doing any
       parsing. Changing the current rule is addressed in the Switching rules
       section.

       Since the ParserProtocol will never modify the currentRule attribute
       itself, the default behavior is to keep using the same rule. Parsing
       netstrings doesn’t require any rule changing, so the default behavior
       of continuing to use the same rule is fine.

       Both the sender and the receiver are constructed when the
       ParserProtocol’s connection is established. The sender factory is a
       one-argument callable which will be passed the ParserProtocol’s
       Transport. This allows the sender to send data over the transport. For
       example:

          class NetstringSender(object):
              def __init__(self, transport):
                  self.transport = transport

              def sendNetstring(self, string):
                  self.transport.write('%d:%s,' % (len(string), string))

       The receiver factory is another one-argument callable which is passed
       the constructed sender. The returned object must at least have
       prepareParsing() and finishParsing() methods. prepareParsing() is
       called with the ParserProtocol instance when a connection is
       established (i.e. in the connectionMade of the ParserProtocol) and
       finishParsing() is called when a connection is closed (i.e. in the
       connectionLost of the ParserProtocol).

       NOTE:
          Both the receiver factory and its returned object’s prepareParsing()
          are called in the ParserProtocol’s connectionMade method; this
          separation is for ease of testing receivers.

       To demonstrate a receiver, here is a simple receiver that receives
       netstrings and echoes the same netstrings back:

          class NetstringReceiver(object):
              currentRule = 'receiveNetstring'

              def __init__(self, sender):
                  self.sender = sender

              def prepareParsing(self, parser):
                  pass

              def finishParsing(self, reason):
                  pass

              def netstringReceived(self, string):
                  self.sender.sendNetstring(string)

       Putting it all together, the Protocol is constructed using the grammar,
       sender factory, and receiver factory:

          NetstringProtocol = makeProtocol(
              grammar, NetstringSender, NetstringReceiver)
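
       The order in which the two factories are called can be imitated
       without Twisted. The sketch below wires the sender and receiver to a
       stand-in transport by hand, roughly as the text describes the
       ParserProtocol doing when a connection is made. FakeTransport and the
       direct call to netstringReceived are illustrative assumptions, not
       part of Parsley or Twisted.

```python
class FakeTransport(object):
    """Hypothetical stand-in for a Twisted transport; records writes."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


class NetstringSender(object):
    def __init__(self, transport):
        self.transport = transport

    def sendNetstring(self, string):
        self.transport.write('%d:%s,' % (len(string), string))


class NetstringReceiver(object):
    currentRule = 'receiveNetstring'

    def __init__(self, sender):
        self.sender = sender

    def prepareParsing(self, parser):
        pass

    def finishParsing(self, reason):
        pass

    def netstringReceived(self, string):
        self.sender.sendNetstring(string)


transport = FakeTransport()
sender = NetstringSender(transport)    # sender factory receives the transport
receiver = NetstringReceiver(sender)   # receiver factory receives the sender
receiver.prepareParsing(None)          # the real call passes the protocol

# Simulate the grammar rule invoking the receiver for one parsed netstring.
receiver.netstringReceived('spam')
print(transport.written)
```

       Because neither class depends on Twisted directly, both can be
       exercised in isolation like this.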

       The complete script is also available for download.

   Intermezzo: error reporting
       If an exception is raised from within Parsley during parsing, whether
       it’s due to input not matching the current rule or an exception being
       raised from code the grammar calls, the connection will be immediately
       closed. The traceback will be captured as a Failure and passed to the
       finishParsing() method of the receiver.

       At present, there is no way to recover from failure.

   Composing senders and receivers
       The design of senders and receivers is intended to make composition
       easy: no subclassing is required. While the composition is easy enough
       to do on your own, Parsley provides a function for it: stack(). It
       takes zero or more wrappers followed by a base factory.

       Its use is extremely simple: stack(x, y, z) will return a callable
       suitable either as a sender or receiver factory which will, when called
       with an argument, return x(y(z(argument))).
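
       The composition described here is small enough to sketch in a few
       lines. The function below is named stack_sketch to make clear that it
       is an illustration of the stated behavior, x(y(z(argument))), and not
       Parsley’s own stack(); the three tagging factories are hypothetical.

```python
def stack_sketch(*wrappers):
    """Compose factories so stack_sketch(x, y, z)(arg) == x(y(z(arg)))."""
    if not wrappers:
        raise TypeError("at least one factory is required")

    def factory(arg):
        result = arg
        # The last factory is applied first, the first one last.
        for wrapper in reversed(wrappers):
            result = wrapper(result)
        return result
    return factory


# Hypothetical factories that tag their argument so the order is visible.
def x(value):
    return ('x', value)


def y(value):
    return ('y', value)


def z(value):
    return ('z', value)


composed = stack_sketch(x, y, z)('argument')
print(composed)
```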

       An example of wrapping a sender factory:

          class NetstringReversalWrapper(object):
              def __init__(self, wrapped):
                  self.wrapped = wrapped

              def sendNetstring(self, string):
                  self.wrapped.sendNetstring(string[::-1])

       And then, constructing the Protocol:

          NetstringProtocol = makeProtocol(
              grammar,
              stack(NetstringReversalWrapper, NetstringSender),
              NetstringReceiver)

       A wrapper doesn’t need to call the same methods on the thing it’s
       wrapping. Also note that in most cases, it’s important to forward
       unknown methods on to the wrapped object. An example of wrapping a
       receiver:

          class NetstringSplittingWrapper(object):
              def __init__(self, wrapped):
                  self.wrapped = wrapped

              def netstringReceived(self, string):
                  splitpoint = len(string) // 2
                  self.wrapped.netstringFirstHalfReceived(string[:splitpoint])
                  self.wrapped.netstringSecondHalfReceived(string[splitpoint:])

              def __getattr__(self, attr):
                  return getattr(self.wrapped, attr)

       The corresponding receiver and, again, the construction of the
       Protocol:

          class SplitNetstringReceiver(object):
              currentRule = 'receiveNetstring'

              def __init__(self, sender):
                  self.sender = sender

              def prepareParsing(self, parser):
                  pass

              def finishParsing(self, reason):
                  pass

              def netstringFirstHalfReceived(self, string):
                  self.sender.sendNetstring(string)

              def netstringSecondHalfReceived(self, string):
                  pass

          NetstringProtocol = makeProtocol(
              grammar,
              stack(NetstringReversalWrapper, NetstringSender),
              stack(NetstringSplittingWrapper, SplitNetstringReceiver))

       The complete script is also available for download.

   Switching rules
       As mentioned before, it’s possible to change the current rule. Imagine
       a “netstrings2” protocol that looks like this:

          3:foo,3;bar,4:spam,4;eggs,

       That is, the protocol alternates between using : and using ; to
       separate the data length from the data. The amended grammar would look
       something like this:

          nonzeroDigit = digit:x ?(x != '0')
          digits = <'0' | nonzeroDigit digit*>:i -> int(i)
          netstring :delimiter = digits:length delimiter <anything{length}>:string ',' -> string

          colon = digits:length ':' <anything{length}>:string ',' -> receiver.netstringReceived(':', string)
          semicolon = digits:length ';' <anything{length}>:string ',' -> receiver.netstringReceived(';', string)

       Changing the current rule is as simple as changing the currentRule
       attribute on the receiver, so the netstringReceived method can switch
       rules by assigning a new rule name to it.
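
       One possible shape for such a receiver is sketched below. The
       two-argument netstringReceived signature follows the grammar’s calls
       to receiver.netstringReceived(':', string); the class name
       SwitchingReceiver, the RecordingSender stub, and the choice to echo
       each string back are assumptions made for illustration.

```python
class RecordingSender(object):
    """Hypothetical sender stub that records what it is asked to send."""

    def __init__(self):
        self.sent = []

    def sendNetstring(self, string):
        self.sent.append(string)


class SwitchingReceiver(object):
    # The first message in the example stream uses ':' as its delimiter.
    currentRule = 'colon'

    def __init__(self, sender):
        self.sender = sender

    def prepareParsing(self, parser):
        pass

    def finishParsing(self, reason):
        pass

    def netstringReceived(self, delimiter, string):
        self.sender.sendNetstring(string)
        # Alternate: after a ':' message expect ';' next, and vice versa.
        if delimiter == ':':
            self.currentRule = 'semicolon'
        else:
            self.currentRule = 'colon'


receiver = SwitchingReceiver(RecordingSender())
receiver.netstringReceived(':', 'foo')
after_colon = receiver.currentRule
receiver.netstringReceived(';', 'bar')
after_semicolon = receiver.currentRule
print(after_colon, after_semicolon)
```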

       While changing the currentRule attribute can be done at any time, the
       ParserProtocol only examines the currentRule at the beginning of
       parsing and after a rule has finished matching. As a result, if the
       currentRule changes, the ParserProtocol will wait until the current
       rule is completely matched before switching rules.

       The complete script is also available for download.

EXTENDING GRAMMARS AND INHERITANCE

       WARNING:
          This section is unfinished.

       Another feature taken from OMeta is grammar inheritance. We can write a
       grammar with rules that override ones in a parent. If we load the
       grammar from our calculator tutorial as Calc, we can extend it with
       some constants:

          from parsley import makeGrammar
          import math
          import calc
          calcGrammarEx = """
          value = super | constant
          constant = 'pi' -> math.pi
                   | 'e' -> math.e
          """
          CalcEx = makeGrammar(calcGrammarEx, {"math": math}, extends=calc.Calc)

       Invoking the rule super calls the rule value in Calc. If it fails to
       match, our new value rule attempts to match a constant name.

TERML

       TermL (“term-ell”) is the Term Language, a small expression-based
       language for representing arbitrary data in a simple structured format.
       It is ideal for expressing abstract syntax trees (ASTs) and other kinds
       of primitive data trees.

   Creating Terms
          >>> from terml.nodes import termMaker as t
          >>> t.Term()
          term('Term')

       That’s it! We’ve created an empty term, Term, with nothing inside.

          >>> t.Num(1)
          term('Num(1)')
          >>> t.Outer(t.Inner())
          term('Outer(Inner)')

       We can see that terms are not just namedtuple lookalikes. They have
       their own internals and store data in a slightly different and more
       structured way than a normal tuple.

   Parsing Terms
       Parsley can parse terms from streams. Terms can contain any kind of
       parseable data, including other terms. Returning to the ubiquitous
       calculator example:

          add = Add(:x, :y) -> x + y

       This rule matches a term called Add that has two components, binds
       those components to the names x and y, and returns their sum. If this
       rule were applied to a term like Add(3, 5), it would return 8.

       Terms can be nested, too. Here’s an example that performs a slightly
       contrived match on a negated term inside an addition:

          add_negate = Add(:x, Negate(:y)) -> x - y
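
       As a rough plain-Python analogy (not TermL itself), the two rules
       above can be read as functions that check a term's shape, bind its
       components, and compute a result. The Add and Negate dataclasses and
       the MatchError exception are hypothetical stand-ins for real terms:

```python
from dataclasses import dataclass


class MatchError(Exception):
    """Raised when a term does not have the expected shape (sketch only)."""


@dataclass
class Add:
    left: object
    right: object


@dataclass
class Negate:
    operand: object


def add(term):
    """Analogue of: add = Add(:x, :y) -> x + y"""
    if isinstance(term, Add):
        x, y = term.left, term.right
        return x + y
    raise MatchError(term)


def add_negate(term):
    """Analogue of: add_negate = Add(:x, Negate(:y)) -> x - y"""
    if isinstance(term, Add) and isinstance(term.right, Negate):
        x, y = term.left, term.right.operand
        return x - y
    raise MatchError(term)
```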

PARSLEY REFERENCE

   Basic syntax
       foo = ....:
              Define a rule named foo.

       expr1 expr2:
              Match expr1, and then match expr2 if it succeeds, returning the
              value of expr2. Like Python's and.

       expr1 | expr2:
              Try to match expr1; if it fails, match expr2 instead. Like
              Python's or.

       expr*: Match expr zero or more times, returning a list of matches.

       expr+: Match expr one or more times, returning a list of matches.

       expr?: Try to match expr. Returns None if it fails to match.

       expr{n, m}:
              Match expr at least n times, and no more than m times.

       expr{n}:
              Match expr n times exactly.

       ~expr: Negative lookahead. Fails if the next item in the input matches
              expr. Consumes no input.

       ~~expr:
              Positive lookahead. Fails if the next item in the input does not
              match expr. Consumes no input.

       ruleName or ruleName(arg1 arg2 etc):
              Call the rule ruleName, possibly with args.

       'x':   Match the literal character 'x'.

       <expr>:
              Returns the string consumed by matching expr. Good for
              tokenizing rules.

       expr:name:
              Bind the result of expr to the local variable name.

       -> pythonExpression:
              Evaluate the given Python expression and return its result. Can
              be used inside parentheses too!

       !(pythonExpression):
              Invoke a Python expression as an action.

       ?(pythonExpression):
              Fail if the Python expression is false. Returns True otherwise.

       expr ^(CustomLabel):
              If the expr fails, the exception raised will contain
              CustomLabel. Good for providing more context when a rule is
              broken. CustomLabel can contain any character other than "("
              and ")".

       Comments like Python comments are supported as well, starting with #
       and extending to the end of the line.

   Python API
   Protocol parsing API
       class ometa.protocol.ParserProtocol
              The Twisted Protocol subclass used for parsing stream protocols
              using Parsley. It has two public attributes:

              sender After the connection is established, this attribute will
                     refer to the sender created by the sender factory of the
                     ParserProtocol.

              receiver
                     After the connection is established, this attribute will
                     refer to the receiver created by the receiver factory of
                     the ParserProtocol.

              It's common to also add a factory attribute to the
              ParserProtocol from its factory's buildProtocol method, but this
              isn't strictly required or guaranteed to be present.

              Subclassing or instantiating ParserProtocol is not necessary;
              makeProtocol() is sufficient and requires less boilerplate.

       class ometa.protocol.Receiver
              Receiver is not a real class but is used here for demonstration
              purposes to indicate the required API.

              currentRule
                     ParserProtocol examines the currentRule attribute at the
                     beginning of parsing and each time a rule has completely
                     matched. At these times, the rule with the same name as
                     the value of currentRule will be selected to start
                     parsing the incoming stream of data.

              prepareParsing(parserProtocol)
                     prepareParsing() is called after the ParserProtocol has
                     established a connection, and is passed the
                     ParserProtocol instance itself.

                     Parameters
                            parserProtocol -- An instance of ParserProtocol.

              finishParsing(reason)
                     finishParsing() is called if an exception was raised
                     during parsing, or when the ParserProtocol has lost its
                     connection, whichever comes first. It will only be called
                     once.

                     An exception raised during parsing can be due to incoming
                     data that doesn't match the current rule or an exception
                     raised calling Python code during matching.

                     Parameters
                            reason -- A Failure encapsulating the reason
                            parsing has ended.

       Senders do not have any required API as ParserProtocol will never call
       methods on a sender.
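
       The receiver API described above can be sketched as a plain class. The
       class name, the 'colon' starting rule, the constructor taking the
       sender, and the bookkeeping attributes are assumptions made for
       illustration:

```python
class SketchReceiver:
    """Hypothetical receiver with the attribute and two methods ParserProtocol uses."""

    # Name of the grammar rule to start with; 'colon' is only an example.
    currentRule = 'colon'

    def __init__(self, sender):
        # Assumption: the receiver factory is handed the sender.
        self.sender = sender
        self.protocol = None
        self.endReason = None

    def prepareParsing(self, parserProtocol):
        # Called after the connection is established.
        self.protocol = parserProtocol

    def finishParsing(self, reason):
        # Called once, on a parsing exception or a lost connection.
        self.endReason = reason
```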

   Built-in Parsley Rules
       anything:
              Matches a single character from the input.

       letter:
              Matches a single ASCII letter.

       digit: Matches a decimal digit.

       letterOrDigit:
              Matches either a letter or a digit, combining the two rules
              above.

       end:   Matches the end of input.

       ws:    Matches zero or more spaces, tabs, or newlines.

       exactly(char):
              Matches the character char.

AUTHOR

       Allen Short

       2022, Allen Short

1.3                              Jan 21, 2022                       PARSLEY(1)