RE2C(1)                     General Commands Manual                    RE2C(1)

NAME

       re2c - convert regular-expressions to C/C++

SYNOPSIS

       re2c [-bdDefFghisuvVw1] [-o output] [-c [-t header]] file

DESCRIPTION

       re2c is a preprocessor that generates C-based recognizers from regular
       expressions.  The input to re2c consists of C/C++ source interleaved
       with comments of the form /*!re2c ... */ which contain scanner
       specifications.  In the output these comments are replaced with code
       that, when executed, will find the next input token and then execute
       some user-supplied token-specific code.

       For example, given the following code

          char *scan(char *p)
          {
          /*!re2c
                  re2c:define:YYCTYPE  = "unsigned char";
                  re2c:define:YYCURSOR = p;
                  re2c:yyfill:enable   = 0;
                  re2c:yych:conversion = 1;
                  re2c:indent:top      = 1;
                  [0-9]+          {return p;}
                  [^]             {return (char*)0;}
          */
          }

       re2c -is will generate

          /* Generated by re2c on Sat Apr 16 11:40:58 1994 */
          char *scan(char *p)
          {
              {
                  unsigned char yych;

                  yych = (unsigned char)*p;
                  if(yych <= '/') goto yy4;
                  if(yych >= ':') goto yy4;
                  ++p;
                  yych = (unsigned char)*p;
                  goto yy7;
          yy3:
                  {return p;}
          yy4:
                  ++p;
                  yych = (unsigned char)*p;
                  {return (char*)0;}
          yy6:
                  ++p;
                  yych = (unsigned char)*p;
          yy7:
                  if(yych <= '/') goto yy3;
                  if(yych <= '9') goto yy6;
                  goto yy3;
              }

          }
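
       For readers who find the goto-based output hard to follow, here is a
       hand-written C sketch of the same behaviour.  It is not re2c output,
       and the name scan_digits is made up for illustration.  It returns a
       pointer just past a leading run of decimal digits, or a null pointer
       if the input does not start with a digit.  Like the generated code it
       has no YYFILL, so it assumes the input is NUL-terminated.

```c
#include <stddef.h>

/* Hand-written equivalent of the generated scanner above (illustrative
 * only; scan_digits is a hypothetical name, not something re2c emits). */
char *scan_digits(char *p)
{
    /* Read each symbol as unsigned char, as re2c:yych:conversion does. */
    unsigned char yych = (unsigned char)*p;

    if (yych < '0' || yych > '9')
        return NULL;            /* rule [^]: input does not start with a digit */

    do {                        /* rule [0-9]+: consume the whole digit run */
        ++p;
        yych = (unsigned char)*p;
    } while (yych >= '0' && yych <= '9');

    return p;                   /* points at the first non-digit */
}
```

       Called on the string "123abc", scan_digits returns a pointer to the
       'a'; called on "abc", it returns a null pointer.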

       You can place one /*!max:re2c */ comment that outputs a "#define
       YYMAXFILL <n>" line holding the maximum number of characters required
       to parse the input, that is, the maximum value YYFILL(n) will receive.
       If -1 is in effect then YYMAXFILL can only be triggered once, after the
       last /*!re2c */ block.

       You can also use /*!ignore:re2c */ blocks, which let you document the
       scanner code and are not part of the output.

OPTIONS

       re2c provides the following options:

       -?, -h Show a short help message.

       -b     Implies -s.  Use bit vectors as well in the attempt to coax
              better code out of the compiler.  Most useful for specifications
              with more than a few keywords (e.g. for most programming
              languages).

       -c     Enable (f)lex-like condition support.

       -d     Creates a parser that dumps information about the current
              position and the state the parser is in while parsing the input.
              This is useful for debugging parser issues and states.  If you
              use this switch you need to define a macro YYDEBUG that is
              called like a function with two parameters: void YYDEBUG(int
              state, char current).  The first parameter receives the state
              or -1 and the second parameter receives the input at the current
              cursor.

       -D     Emit Graphviz dot data.  It can then be processed with e.g. "dot
              -Tpng input.dot > output.png".  Please note that scanners with
              many states may crash dot.

       -e     Cross-compile from an ASCII platform to an EBCDIC one.

       -f     Generate a scanner with support for storable state.  For details
              see below at SCANNER WITH STORABLE STATES.

       -F     Partial support for flex syntax.  When this flag is active,
              named definitions must be surrounded by curly braces and can be
              defined without an equal sign and the terminating semicolon.
              Instead names are treated as direct double quoted strings.

       -g     Generate a scanner that utilizes GCC's computed goto feature.
              That is, re2c generates jump tables whenever a decision is of a
              certain complexity (e.g. a lot of if conditions would otherwise
              be necessary).  This is only usable with GCC and produces output
              that cannot be compiled with any other compiler.  Note that this
              implies -b and that the complexity threshold can be configured
              using the inplace configuration "cgoto:threshold".

       -i     Do not output #line information.  This is useful when you want
              to use a CMS tool with the re2c output, which you might want if
              you do not require your users to have re2c themselves when
              building from your source.

       -o output
              Specify the output file.

       -r     Allows reuse of scanner definitions with '/*!use:re2c' after
              '/*!rules:re2c'.  The saved rules are used by every
              '/*!use:re2c' block that follows.  These blocks can contain
              inplace configurations, especially 're2c:flags:w' and
              're2c:flags:u'.  That way it is possible to create the same
              scanner multiple times for different character types, different
              input mechanisms or different output mechanisms.  The
              '/*!use:re2c' blocks can also contain additional rules that will
              be appended to the set of rules in '/*!rules:re2c'.

       -s     Generate nested ifs for some switches.  Many compilers need this
              assist to generate better code.

       -t     Create a header file that contains types for the (f)lex-like
              condition support.  This can only be activated when -c is in
              use.

       -u     Generate a parser that supports Unicode chars (UTF-32).  This
              means the generated code can deal with any valid Unicode
              character up to 0x10FFFF.  When UTF-8 or UTF-16 needs to be
              supported you need to convert the incoming stream to UTF-32
              upon input yourself.

       -v     Show version information.

       -V     Show the version as a number XXYYZZ.

       -w     Create a parser that supports wide chars (UCS-2).  This implies
              -s and cannot be used together with the -e switch.

       -1     Force single pass generation.  This cannot be combined with -f
              and disables YYMAXFILL generation prior to the last re2c block.

       --no-generation-date
              Suppress date output in the generated output so that it only
              shows the re2c version.

       --case-insensitive
              All strings are case insensitive, so all "-expressions are
              treated in the same way '-expressions are.

       --case-inverted
              Invert the meaning of single and double quoted strings.  With
              this switch single quotes are case sensitive and double quotes
              are case insensitive.
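
       Since -u expects UTF-32 input, UTF-8 data has to be decoded before the
       scanner sees it.  The following is a minimal sketch of such a
       conversion.  The helper names utf8_decode and utf8_to_utf32 are
       hypothetical and not part of re2c, and stopping at the first
       malformed sequence is just one possible error policy.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence from s (n bytes available) into *out.
 * Returns the number of bytes consumed, or 0 on malformed or truncated
 * input. utf8_decode is a hypothetical helper, not part of re2c. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *out)
{
    uint32_t cp, min;
    size_t len, i;

    if (n == 0) return 0;
    if (s[0] < 0x80)                { cp = s[0];        len = 1; min = 0; }
    else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; len = 2; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; len = 3; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; len = 4; min = 0x10000; }
    else return 0;                  /* stray continuation or invalid lead byte */

    if (len > n) return 0;          /* truncated sequence */
    for (i = 1; i < len; ++i) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, UTF-16 surrogates and values past 0x10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;

    *out = cp;
    return len;
}

/* Convert a whole UTF-8 buffer to UTF-32. Returns the number of code
 * points written, or (size_t)-1 on malformed input or a full dst. */
size_t utf8_to_utf32(const unsigned char *src, size_t n,
                     uint32_t *dst, size_t cap)
{
    size_t i = 0, count = 0;

    while (i < n) {
        size_t used;
        if (count == cap) return (size_t)-1;
        used = utf8_decode(src + i, n - i, &dst[count]);
        if (used == 0) return (size_t)-1;
        i += used;
        ++count;
    }
    return count;
}
```

       The resulting array of 32-bit values can then serve as the scanner
       input, with YYCTYPE set to a 32-bit unsigned type.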

INTERFACE CODE

171       Unlike  other scanner generators, re2c does not generate complete scan‐
172       ners: the user must supply some interface  code.   In  particular,  the
173       user  must define the following macros or use the corresponding inplace
174       configurations:
175
176       YYCONDTYPE
177              In -c mode you can use -t to generate a file that  contains  the
178              enumeration  used  as conditions. Each of the values refers to a
179              condition of a rule set.
180
181       YYCTXMARKER
182              l-expression of type *YYCTYPE.  The generated code saves  trail‐
183              ing  context  backtracking information in YYCTXMARKER.  The user
184              only needs to define this macro if a scanner specification  uses
185              trailing context in one or more of its regular-expressions.
186
187       YYCTYPE
188              Type  used  to  hold  an input symbol.  Usually char or unsigned
189              char.
190
191       YYCURSOR
192              l-expression of type *YYCTYPE that points to the  current  input
193              symbol.   The  generated  code  advances YYCURSOR as symbols are
194              matched.  On entry, YYCURSOR is assumed to point  to  the  first
195              character of the current token.  On exit, YYCURSOR will point to
196              the first character of the following token.
197
       YYDEBUG(state,current)
              This is only needed if the -d flag was specified.  It makes it
              easy to debug the generated parser by calling a user-defined
              function for every state.  The function should have the
              following signature: void YYDEBUG(int state, char current).
              The first parameter receives the state or -1 and the second
              parameter receives the input at the current cursor.

       YYFILL(n)
              The generated code "calls" YYFILL(n) when the buffer needs
              (re)filling: at least n additional characters should be
              provided.  YYFILL(n) should adjust YYCURSOR, YYLIMIT, YYMARKER
              and YYCTXMARKER as needed.  Note that for typical programming
              languages n will be the length of the longest keyword plus one.
              The user can place a comment of the form /*!max:re2c */ once to
              insert a "#define YYMAXFILL <n>" line whose value is the maximum
              length value.  If the -1 switch is used then YYMAXFILL can be
              triggered only once, after the last /*!re2c */ block.

217       YYGETCONDITION()
218              This  define  is used to get the condition prior to entering the
219              scanner code when using -c switch. The value must be initialized
220              with a value from the enumeration YYCONDTYPE type.
221
222       YYGETSTATE()
223              The  user  only  needs  to  define this macro if the -f flag was
224              specified.  In that case,  the  generated  code  "calls"  YYGET‐
225              STATE()  at the very beginning of the scanner in order to obtain
226              the saved state. YYGETSTATE() must return a signed integer.  The
227              value  must be either -1, indicating that the scanner is entered
228              for the first time,  or  a  value  previously  saved  by  YYSET‐
229              STATE(s).   In  the  second case, the scanner will resume opera‐
230              tions right after where the last YYFILL(n) was called.
231
232       YYLIMIT
233              Expression of type *YYCTYPE that marks the  end  of  the  buffer
234              (YYLIMIT[-1]  is  the last character in the buffer).  The gener‐
235              ated code repeatedly compares YYCURSOR to YYLIMIT  to  determine
236              when the buffer needs (re)filling.
237
       YYMARKER
              l-expression of type *YYCTYPE.  The generated code saves
              backtracking information in YYMARKER.  Some simple scanners
              might not use this.

243       YYMAXFILL
244              This  will  be automatically defined by /*!max:re2c */ blocks as
245              explained above.
246
247       YYSETCONDITION(c)
248              This define is used to set the condition  in  transition  rules.
249              This  is  only being used when -c is active and transition rules
250              are being used.
251
       YYSETSTATE(s)
              The user only needs to define this macro if the -f flag was
              specified.  In that case, the generated code "calls"
              YYSETSTATE(s) just before calling YYFILL(n).  The parameter to
              YYSETSTATE is a signed integer that uniquely identifies the
              specific instance of YYFILL(n) that is about to be called.
              Should the user wish to save the state of the scanner and have
              YYFILL(n) return to the caller, all they have to do is store
              that unique identifier in a variable.  Later, when the scanner
              is called again, it will call YYGETSTATE() and resume execution
              right where it left off.  The generated code will contain both
              YYSETSTATE(s) and YYGETSTATE() even if YYFILL(n) is disabled.
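
       To make the YYFILL(n) contract described above more concrete, here is
       a sketch of a refill routine in plain C.  The struct input, its field
       names, BUFSIZE and the functions input_init and input_fill are all
       assumptions made for this illustration, and the data source is a
       block of memory rather than a file.  YYCTXMARKER is left out; it
       would be adjusted in the same way as the marker.

```c
#include <stddef.h>
#include <string.h>

#define BUFSIZE 16   /* deliberately tiny, for illustration */

/* Hypothetical scanner state; in a real scanner YYCURSOR, YYLIMIT and
 * YYMARKER would be defined as macros referring to these fields. */
struct input {
    char buf[BUFSIZE];
    char *tok;       /* start of the current token */
    char *cursor;    /* YYCURSOR */
    char *marker;    /* YYMARKER */
    char *limit;     /* YYLIMIT: one past the last valid character */
    const char *src; /* stand-in data source (memory instead of a file) */
    size_t src_left; /* characters remaining in src */
};

void input_init(struct input *in, const char *src, size_t len)
{
    in->tok = in->cursor = in->marker = in->limit = in->buf;
    in->src = src;
    in->src_left = len;
}

/* Candidate body for YYFILL(n): make at least `need` characters
 * available at the cursor. Returns 0 on success, -1 if the source
 * cannot supply them or the token no longer fits in the buffer. */
int input_fill(struct input *in, size_t need)
{
    size_t shift = (size_t)(in->tok - in->buf);     /* consumed prefix */
    size_t kept  = (size_t)(in->limit - in->tok);   /* data to preserve */
    size_t room, take;

    /* Discard everything before the current token and slide the rest
     * down, moving every pointer into the buffer by the same amount. */
    memmove(in->buf, in->tok, kept);
    /* A marker left over from an earlier token is stale; reset it. */
    in->marker = in->marker >= in->tok ? in->marker - shift : in->buf;
    in->cursor -= shift;
    in->limit  -= shift;
    in->tok     = in->buf;

    /* Top the buffer up from the source. */
    room = BUFSIZE - kept;
    take = in->src_left < room ? in->src_left : room;
    memcpy(in->limit, in->src, take);
    in->limit    += take;
    in->src      += take;
    in->src_left -= take;

    return (size_t)(in->limit - in->cursor) >= need ? 0 : -1;
}
```

       A scanner built on this sketch could define YYFILL(n) as something
       like "if (input_fill(in, n) != 0) return EOF_TOKEN;", where EOF_TOKEN
       is again a hypothetical name.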

SCANNER WITH STORABLE STATES

       When the -f flag is specified, re2c generates a scanner that can store
       its current state, return to the caller, and later resume operations
       exactly where it left off.

       The default operation of re2c is a "pull" model, where the scanner asks
       for extra input whenever it needs it.  However, this mode of operation
       assumes that the scanner is the "owner" of the parsing loop, and that
       may not always be convenient.

       Typically, if there is a preprocessor ahead of the scanner in the
       stream, or for that matter any other procedural source of data, the
       scanner cannot "ask" for more data unless both scanner and source live
       in separate threads.

       The -f flag is useful for just this situation: it lets users design
       scanners that work in a "push" model, i.e. where data is fed to the
       scanner chunk by chunk.  When the scanner runs out of data to consume,
       it just stores its state and returns to the caller.  When more input
       data is fed to the scanner, it resumes operations exactly where it left
       off.

       When using the -f option re2c does not accept stdin, because it has to
       do the full generation process twice, which means it has to read the
       input twice.  re2c would therefore fail if it cannot open the input
       twice or if reading the input for the first time influences the second
       read attempt.

       Changes needed compared to the "pull" model:

       1. The user has to supply the macros YYSETSTATE(state) and
       YYGETSTATE().

       2. The -f option inhibits declaration of yych and yyaccept, so the user
       has to declare these.  The user also has to save and restore them.  In
       the example examples/push.re these are declared as fields of the (C++)
       class of which the scanner is a method, so they do not need to be
       saved/restored explicitly.  For C they could e.g. be made macros that
       select fields from a structure passed in as a parameter.
       Alternatively, they could be declared as local variables, saved by
       YYFILL(n) when it decides to return and restored at entry to the
       function.  Also, it could be more efficient to save the state from
       YYFILL(n) because YYSETSTATE(state) is called unconditionally.
       YYFILL(n) however does not get state as a parameter, so we would have
       to store state in a local variable by YYSETSTATE(state).

       3. Modify YYFILL(n) to return (from the function calling it) if more
       input is needed.

       4. Modify the caller to recognise "more input is needed" and respond
       appropriately.

       5. The generated code will contain a switch block that is used to
       restore the last state by jumping to just after the corresponding
       YYFILL(n) call.  This code is automatically generated in the epilog of
       the first "/*!re2c */" block.  It is possible to trigger generation of
       the YYGETSTATE() block earlier by placing a "/*!getstate:re2c */"
       comment.  This is especially useful when the scanner code should be
       wrapped inside a loop.

       Please see examples/push.re for a push-model scanner.  The generated
       code can be tweaked using the inplace configurations "state:abort" and
       "state:nextlabel".
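
       The control flow described above can be illustrated without running
       re2c at all.  The sketch below is a hand-written push-model scanner
       that counts runs of decimal digits.  It only mimics the shape of -f
       output (a state switch at the top, a state store before each refill
       point) and is not generated code.  The struct scanner and the
       functions scanner_init, scanner_feed and scanner_run are hypothetical
       names.

```c
#include <stddef.h>

/* Hypothetical push-model scanner that counts runs of decimal digits.
 * It is written by hand to mimic the shape of -f output; it is not
 * code produced by re2c. */
struct scanner {
    int state;            /* saved state; -1 means "entered for the first time" */
    const char *cursor;   /* plays the role of YYCURSOR */
    const char *limit;    /* plays the role of YYLIMIT */
    unsigned numbers;     /* digit runs seen so far */
};

#define YYGETSTATE()   (s->state)
#define YYSETSTATE(x)  (s->state = (x))

static int is_digit(char c) { return c >= '0' && c <= '9'; }

void scanner_init(struct scanner *s)
{
    s->state = -1;
    s->cursor = s->limit = NULL;
    s->numbers = 0;
}

/* Hand the scanner its next chunk of input. */
void scanner_feed(struct scanner *s, const char *chunk, size_t len)
{
    s->cursor = chunk;
    s->limit = chunk + len;
}

/* Consume the current chunk, then return to the caller for more. */
void scanner_run(struct scanner *s)
{
    /* Dispatch on the saved state: jump back to the refill point at
     * which the previous call returned. */
    switch (YYGETSTATE()) {
    case 0:  goto fill0;
    case 1:  goto fill1;
    default: break;                  /* -1: start from the top */
    }

between:                             /* not inside a number */
    YYSETSTATE(0);
fill0:
    if (s->cursor == s->limit) return;   /* out of data: return to caller */
    if (is_digit(*s->cursor++)) { s->numbers++; goto inside; }
    goto between;

inside:                              /* inside a run of digits */
    YYSETSTATE(1);
fill1:
    if (s->cursor == s->limit) return;   /* out of data: return to caller */
    if (is_digit(*s->cursor++)) goto inside;
    goto between;
}
```

       Feeding the chunks "12 3" and "4 abc 5" in turn yields a count of
       three, because the run "34" is recognised as one number even though
       it straddles the chunk boundary.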

SCANNER WITH CONDITION SUPPORT

       You can precede regular-expressions with a list of condition names when
       using the -c switch.  In this case re2c generates a scanner block for
       each condition, and each of the generated blocks has its own
       precondition.  The precondition is given by the interface define
       YYGETCONDITION and must be of type YYCONDTYPE.

       There are two special rule types.  First, the rules of the condition
       '*' are merged into all conditions.  Second, the empty condition list
       allows you to provide a code block that does not have a scanner part,
       meaning it does not allow any regular expression.  The condition value
       referring to this special block is always the one with the enumeration
       value 0.  This way the code of this special rule can be used to
       initialize a scanner.  These rules are in no way required, but
       sometimes it is helpful to have a dedicated uninitialized condition
       state.

       Non-empty rules can specify the new condition, which makes them
       transition rules.  Besides generating calls to the define
       YYSETCONDITION, no other special code is generated.

       There is another kind of special rule that allows you to prepend code
       to the code blocks of all rules of a certain set of conditions, or to
       the code blocks of all rules.  This can be helpful when some operation
       is common among rules.  For instance, this can be used to store the
       length of the scanned string.  These special setup rules start with an
       exclamation mark followed by either a list of conditions <! condition,
       ... > or a star <!*>.  When re2c generates the code for a rule whose
       state does not have a setup rule and a starred setup rule is present,
       then that code will be used as setup code.
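
       As a rough picture of how the condition interface fits together, the
       following hand-written C sketch dispatches on a precondition and
       performs transitions through YYGETCONDITION and YYSETCONDITION.  It
       is not re2c output.  The condition names NORMAL and STRING, the
       struct lexer and the function lex_step are made up for this example,
       and the enumeration only imitates what -t might write to the header.

```c
#include <stddef.h>

/* The kind of enumeration -t would write to the header; the yyc prefix
 * follows the re2c:condenumprefix default. Condition names are made up. */
enum YYCONDTYPE { yycNORMAL, yycSTRING };

struct lexer {
    enum YYCONDTYPE cond;   /* current condition */
    const char *cursor;     /* plays the role of YYCURSOR */
    size_t quoted;          /* characters seen inside double quotes */
};

#define YYGETCONDITION()   (lx->cond)
#define YYSETCONDITION(c)  (lx->cond = (c))

/* Hand-written stand-in for a -c scanner with two conditions: a double
 * quote switches between NORMAL and STRING, and every other character
 * seen in STRING is counted. Consumes one character per call and
 * returns 0 once the terminating NUL is reached. */
int lex_step(struct lexer *lx)
{
    char ch = *lx->cursor;

    if (ch == '\0') return 0;
    ++lx->cursor;

    switch (YYGETCONDITION()) {         /* dispatch on the precondition */
    case yycNORMAL:
        if (ch == '"') YYSETCONDITION(yycSTRING);   /* transition rule */
        break;
    case yycSTRING:
        if (ch == '"') YYSETCONDITION(yycNORMAL);   /* transition rule */
        else ++lx->quoted;
        break;
    }
    return 1;
}
```

       Stepping this lexer over the text ab "cde" f "g" counts four quoted
       characters and leaves it in the NORMAL condition.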

SCANNER SPECIFICATIONS

       Each scanner specification consists of a set of rules, named
       definitions and configurations.

       Rules consist of a regular-expression along with a block of C/C++ code
       that is to be executed when the associated regular-expression is
       matched.  You can start the code with either an opening curly brace or
       the sequence ':='.  When the code starts with a curly brace, re2c
       counts the brace depth and stops looking for code automatically.
       Otherwise curly braces are not allowed and re2c stops looking for code
       at the first line that does not begin with whitespace.
372
373              regular-expression { C/C++ code }
374
375              regular-expression := C/C++ code
376
       If -c is active then each regular-expression is preceded by a list of
       comma-separated condition names.  Besides normal naming rules there are
       two special cases: a rule may contain the single condition name '*', or
       no condition name at all.  In the latter case the rule cannot have a
       regular-expression.  Non-empty rules may furthermore specify the new
       condition.  In that case re2c will generate the necessary code to
       change the condition automatically.  Just as above, code can be started
       with a curly brace or the sequence ':='.  Furthermore, rules can use
       ':=>' as a shortcut to automatically generate code that not only sets
       the new condition state but also continues execution with the new
       state.  A shortcut rule should not be used in a loop where there is
       code between the start of the loop and the re2c block unless
       re2c:cond:goto is changed to 'continue;'.  If code is necessary before
       all rules (though not simple jumps) you can do so by using <!
       pseudo-rules.
391
392              <condition-list> regular-expression { C/C++ code }
393
394              <condition-list> regular-expression := C/C++ code
395
396              <condition-list> regular-expression => condition { C/C++ code }
397
398              <condition-list> regular-expression => condition := C/C++ code
399
400              <condition-list> regular-expression :=> condition
401
402              <*> regular-expression { C/C++ code }
403
404              <*> regular-expression := C/C++ code
405
406              <*> regular-expression => condition { C/C++ code }
407
408              <*> regular-expression => condition := C/C++ code
409
410              <*> regular-expression :=> condition
411
412              <> { C/C++ code }
413
414              <> := C/C++ code
415
416              <> => condition { C/C++ code }
417
418              <> => condition := C/C++ code
419
420              <> :=> condition
421
422              <!condition-list> { C/C++ code }
423
424              <!condition-list> := C/C++ code
425
426              <!*> { C/C++ code }
427
428              <!*> := C/C++ code
429
       Named definitions are of the form:

              name = regular-expression;

       If -F is active, then named definitions are also of the form:

              name regular-expression

       Configurations look like named definitions whose names start with
       "re2c:":

              re2c:name = value;
              re2c:name = "value";
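
       As an illustration of how the three kinds of entries fit together,
       here is a small hypothetical specification without conditions.  The
       token names NUMBER, IDENT and ERROR are assumed to be defined by the
       surrounding program, as are YYCURSOR and YYMARKER; none of them is
       provided by re2c.

          /*!re2c
                  re2c:define:YYCTYPE = "unsigned char";
                  re2c:yyfill:enable  = 0;

                  digit  = [0-9];
                  letter = [a-zA-Z_];

                  digit+                  { return NUMBER; }
                  letter (letter|digit)*  { return IDENT; }
                  [^]                     { return ERROR; }
          */

       The first two lines are configurations, the next two are named
       definitions, and the last three are rules whose code starts with a
       curly brace.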

SUMMARY OF RE2C REGULAR-EXPRESSIONS

446       "foo"  the literal string foo.  ANSI-C escape sequences can be used.
447
448       'foo'  the literal string foo (characters [a-zA-Z] treated  case-insen‐
449              sitive).  ANSI-C escape sequences can be used.
450
451       [xyz]  a  "character  class";  in  this  case,  the  regular-expression
452              matches either an 'x', a 'y', or a 'z'.
453
454       [abj-oZ]
455              a "character class" with a range in it; matches an 'a',  a  'b',
456              any letter from 'j' through 'o', or a 'Z'.
457
458       [^class]
459              an inverted "character class".
460
461       r\s    match  any  r  which isn't an s. r and s must be regular-expres‐
462              sions which can be expressed as character classes.
463
464       r*     zero or more r's, where r is any regular-expression
465
466       r+     one or more r's
467
468       r?     zero or one r's (that is, "an optional r")
469
470       name   the expansion of the "named definition" (see above)
471
472       (r)    an r; parentheses are used to override precedence (see below)
473
474       rs     an r followed by an s ("concatenation")
475
476       r|s    either an r or an s
477
478       r/s    an r but only if it is followed by an s. The s is  not  part  of
479              the  matched  text.  This  type  of regular-expression is called
480              "trailing context". A trailing context can only be the end of  a
481              rule and not part of a named definition.
482
483       r{n}   matches r exactly n times.
484
485       r{n,}  matches r at least n times.
486
487       r{n,m} matches r at least n but not more than m times.
488
489       .      match any character except newline (\n).
490
491       def    matches  named definition as specified by def only if -F is off.
492              If the switch -F  is  active  then  this  behaves  like  it  was
493              enclosed in double quotes and matches the string def.
494
       Character classes and string literals may contain octal or hexadecimal
       character definitions and the following set of escape sequences: \n,
       \t, \v, \b, \r, \f, \a, \\.  An octal character is defined by a
       backslash followed by its three octal digits, and a hexadecimal
       character is defined by a backslash, a lowercase x and its two
       hexadecimal digits, or a backslash, an uppercase X and its four
       hexadecimal digits.

       re2c furthermore supports the C/C++ Unicode notation, that is, a
       backslash followed by either a lowercase u and its four hexadecimal
       digits or an uppercase U and its eight hexadecimal digits.  However,
       only in -u mode can the generated code deal with any valid Unicode
       character up to 0x10FFFF.

       Since characters greater than \X00FF are not allowed in non-Unicode
       mode, the only portable "any" rules are (.|"\n") and [^].

       The regular-expressions listed above are grouped according to
       precedence, from highest precedence at the top to lowest at the bottom.
       Those grouped together have equal precedence.
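
       The following named definitions sketch how the operators above
       combine.  The names are arbitrary, and -F is assumed to be off so
       that bare names refer to named definitions.

                  vowel     = [aeiou];
                  consonant = [a-z] \ vowel;
                  ident     = [a-zA-Z_] [a-zA-Z0-9_]*;
                  hexnum    = "0" [xX] [0-9a-fA-F]{1,8};
                  keyword   = 'select' | 'from';

       Because repetition binds more tightly than concatenation, ident reads
       as one letter or underscore followed by any number of letters, digits
       or underscores.  Because concatenation binds more tightly than
       alternation, a pattern such as "a" "b"* | "c" groups as
       ("a" ("b"*)) | "c".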

INPLACE CONFIGURATION

517       It is possible to configure code generation  inside  re2c  blocks.  The
518       following lists the available configurations:
519
       re2c:condprefix = yyc_ ;
              Specifies the prefix used for condition labels.  This text is
              prepended to every condition label in the generated output
              file.

       re2c:condenumprefix = yyc ;
              Specifies the prefix used for condition values.  This text is
              prepended to every condition enum value in the generated output
              file.

       re2c:cond:divider = "/* *********************************** */" ;
              Customizes the divider for condition blocks.  You can use '@@'
              to insert the name of the condition, or customize the
              placeholder using re2c:cond:divider@cond.

       re2c:cond:divider@cond = @@ ;
              Specifies the placeholder that will be replaced with the
              condition name in re2c:cond:divider.

       re2c:cond:goto = "goto @@;" ;
              Customizes the condition goto statements used with ':=>' style
              rules.  You can use '@@' to insert the name of the condition,
              or customize the placeholder using re2c:cond:goto@cond.  You
              can also change this to 'continue;', which would allow you to
              continue with the next loop cycle including any code between
              loop start and re2c block.

       re2c:cond:goto@cond = @@ ;
              Specifies the placeholder that will be replaced with the
              condition label in re2c:cond:goto.

       re2c:indent:top = 0 ;
              Specifies the minimum amount of indentation to use.  Requires a
              numeric value greater than or equal to zero.

       re2c:indent:string = "\t" ;
              Specifies the string to use for indentation.  Requires a string
              that should contain only whitespace unless you need this for
              external tools.  The easiest way to specify spaces is to
              enclose them in single or double quotes.  If you do not want
              any indentation at all you can simply set this to "".
561
562       re2c:yych:conversion = 0 ;
563              When this setting is non zero, then re2c automatically generates
564              conversion code whenever yych gets read. In this case  the  type
565              must be defined using re2c:define:YYCTYPE.
566
567       re2c:yych:emit = 1 ;
568              Generation of yych can be suppressed by setting this to 0.
569
570       re2c:yybm:hex = 0 ;
571              If  set  to zero then a decimal table is being used else a hexa‐
572              decimal table will be generated.
573
       re2c:yyfill:enable = 1 ;
              Set this to zero to suppress generation of YYFILL(n).  When
              using this, be sure to verify that the generated scanner does
              not read past the end of the input.  Allowing this behavior
              might introduce severe security issues into your programs.

       re2c:yyfill:check = 1 ;
              This can be set to 0 to suppress output of the precondition
              using YYCURSOR and YYLIMIT, which becomes useful when YYLIMIT +
              max(YYFILL) is always accessible.

       re2c:yyfill:parameter = 1 ;
              Controls parameter passing to YYFILL calls.  If set to zero
              then no parameter is passed to YYFILL.  However,
              re2c:define:YYFILL@len allows you to specify a replacement
              string for the actual length value.  If set to a non-zero value
              then YYFILL usage will be followed by the number of requested
              characters in parentheses unless re2c:define:YYFILL:naked is
              set.  Also look at re2c:define:YYFILL:naked and
              re2c:define:YYFILL@len.
593
594       re2c:startlabel = 0 ;
595              If  set  to  a non zero integer then the start label of the next
596              scanner blocks will be generated even if not used by the scanner
597              itself.  Otherwise the normal yy0 like start label is only being
598              generated if needed. If set to a text value then  a  label  with
599              that  text  will  be  generated regardless of whether the normal
600              start label is being used or not. This setting is being reset to
601              0 after a start label has been generated.
602
       re2c:labelprefix = yy ;
              Changes the prefix of numbered labels.  The default is yy; it
              can be set to any string that is a valid label.
606
607       re2c:state:abort = 0 ;
608              When not zero and switch -f is active then the YYGETSTATE  block
609              will  contain  a  default case that aborts and a -1 case is used
610              for initialization.
611
612       re2c:state:nextlabel = 0 ;
613              Used when -f is active to control whether the  YYGETSTATE  block
614              is followed by a yyNext: label line. Instead of using yyNext you
615              can usually also use configuration startlabel to  force  a  spe‐
616              cific  start  label or default to yy0 as start label. Instead of
617              using a dedicated label it  is  often  better  to  separate  the
618              YYGETSTATE  code  from  the  actual  scanner  code  by placing a
619              "/*!getstate:re2c */" comment.
620
621       re2c:cgoto:threshold = 9 ;
622              When -g is active this value specifies the complexity  threshold
623              that triggers generation of jump tables rather than using nested
624              if's and decision bitfields.  The threshold is compared  against
625              a  calculated  estimation of if-s needed where every used bitmap
626              divides the threshold by 2.
627
628       re2c:yych:conversion = 0 ;
629              When the input uses signed characters and the -s or -b switches
630              are in effect, re2c can automatically convert to the unsigned
631              character type that is then necessary for its internal single
632              character. When this setting is zero or an empty string, the
633              conversion is disabled. With a nonzero number, the conversion is
634              taken from YYCTYPE. If that is given by an inplace configuration,
635              that value is used. Otherwise it will be (YYCTYPE) and
636              changes to that configuration are no longer possible. When this
637              setting is a string, the parentheses must be specified. Assuming
638              your input is a char* buffer and you are using the above-mentioned
639              switches, you can set YYCTYPE to unsigned char and this setting
640              to either 1 or "(unsigned char)".
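
              Why the conversion matters can be seen in plain C, independent
              of re2c. The sketch below is not generated code: the helpers
              load_signed(), load_converted() and is_high_byte() are
              hypothetical names chosen for illustration. It assumes a
              platform where signed char is 8 bits wide and shows how a range
              test of the kind produced with -s misclassifies bytes above
              0x7F unless the byte is first converted to unsigned char.

```c
/* Hypothetical helper: reads one input byte as a signed character, which
 * is what happens without the conversion when the input type is signed.
 * A byte such as 0xE9 comes back as a negative value. */
static int load_signed(const char *p)
{
    return (signed char)*p;
}

/* Hypothetical helper: reads the same byte with the (unsigned char)
 * conversion applied, as in "yych = (unsigned char)*p;". */
static int load_converted(const char *p)
{
    return (unsigned char)*p;
}

/* A range test of the kind nested ifs perform: is the byte in 0x80..0xFF? */
static int is_high_byte(int yych)
{
    return yych >= 0x80;
}
```

              With the input "\xE9", is_high_byte(load_converted(p)) is true
              while is_high_byte(load_signed(p)) is false, so a rule for high
              bytes would be missed without the conversion.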
641
642       re2c:define:YYCONDTYPE = YYCONDTYPE ;
643              Enumeration used for condition support with -c mode.
644
645       re2c:define:YYCTXMARKER = YYCTXMARKER ;
646              Overrides the YYCTXMARKER define. Setting the value to the
647              actual code needed makes the define unnecessary.
648
649       re2c:define:YYCTYPE = YYCTYPE ;
650              Overrides the YYCTYPE define. Setting the value to the actual
651              code needed makes the define unnecessary.
652
653       re2c:define:YYCURSOR = YYCURSOR ;
654              Overrides the YYCURSOR define. Setting the value to the actual
655              code needed makes the define unnecessary.
656
657       re2c:define:YYDEBUG = YYDEBUG ;
658              Overrides the YYDEBUG define. Setting the value to the actual
659              code needed makes the define unnecessary.
660
661       re2c:define:YYFILL = YYFILL ;
662              Overrides the YYFILL define. Setting the value to the actual
663              code needed makes the define unnecessary.
664
665       re2c:define:YYFILL:naked = 0 ;
666              When set to 1, neither parentheses, parameter, nor semicolon is
667              emitted.
668
669       re2c:define:YYFILL@len = @@ ;
670              When re2c:define:YYFILL is used and re2c:yyfill:parameter is 0,
671              any occurrence of this text inside YYFILL is replaced with the
672              actual length value.
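
              As an illustration, the fragment below sketches how these
              settings might be combined in a scanner specification. The
              fill() function is a hypothetical user-supplied routine, not
              something re2c provides, and the exact text emitted around the
              substituted call should be checked against the generated output.

```c
/* Sketch only: fill() is a hypothetical user-supplied function that makes
 * at least the requested number of characters available. */
/*!re2c
        re2c:yyfill:parameter = 0;
        re2c:define:YYFILL    = "fill(@@)";
*/
```

              Under these assumptions, wherever the scanner needs more input
              the @@ inside the YYFILL text is replaced by the required
              length, so fill() receives that number as its argument.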
673
674       re2c:define:YYGETCONDITION = YYGETCONDITION ;
675              Overrides the YYGETCONDITION define.
676
677       re2c:define:YYGETCONDITION:naked = 0 ;
678              When set to 1, neither parentheses, parameter, nor semicolon is
679              emitted.
680
681       re2c:define:YYGETSTATE = YYGETSTATE ;
682              Overrides the YYGETSTATE define. Setting the value to the actual
683              code needed makes the define unnecessary.
684
685       re2c:define:YYGETSTATE:naked = 0 ;
686              When set to 1, neither parentheses, parameter, nor semicolon is
687              emitted.
688
689       re2c:define:YYLIMIT = YYLIMIT ;
690              Overrides the YYLIMIT define. Setting the value to the actual
691              code needed makes the define unnecessary.
692
693       re2c:define:YYMARKER = YYMARKER ;
694              Overrides the YYMARKER define. Setting the value to the actual
695              code needed makes the define unnecessary.
696
697       re2c:define:YYSETCONDITION = YYSETCONDITION ;
698              Overrides the YYSETCONDITION define.
699
700       re2c:define:YYSETCONDITION@cond = @@ ;
701              When re2c:define:YYSETCONDITION is used, any occurrence of this
702              text inside YYSETCONDITION is replaced with the actual new
703              condition value.
704
705       re2c:define:YYSETSTATE = YYSETSTATE ;
706              Overrides the YYSETSTATE define. Setting the value to the actual
707              code needed makes the define unnecessary.
708
709       re2c:define:YYSETSTATE:naked = 0 ;
710              When set to 1, neither parentheses, parameter, nor semicolon is
711              emitted.
712
713       re2c:define:YYSETSTATE@state = @@ ;
714              When re2c:define:YYSETSTATE is used, any occurrence of this
715              text inside YYSETSTATE is replaced with the actual new state
716              value.
717
718       re2c:label:yyFillLabel = yyFillLabel ;
719              Overrides the name of the label yyFillLabel.
720
721       re2c:label:yyNext = yyNext ;
722              Overrides the name of the label yyNext.
723
724       re2c:variable:yyaccept = yyaccept ;
725              Overrides the name of the variable yyaccept.
726
727       re2c:variable:yybm = yybm ;
728              Overrides the name of the variable yybm.
729
730       re2c:variable:yych = yych ;
731              Overrides the name of the variable yych.
732
733       re2c:variable:yyctable = yyctable ;
734              When both -c and -g are active, re2c uses this variable to
735              generate a static jump table for YYGETCONDITION.
736
737       re2c:variable:yystable = yystable ;
738              When both -f and -g are active, re2c uses this variable to
739              generate a static jump table for YYGETSTATE.
740
741       re2c:variable:yytarget = yytarget ;
742              Overrides the name of the variable yytarget.
743
744

UNDERSTANDING RE2C

746       The lessons subdirectory of the re2c distribution contains a few
747       step-by-step lessons to get you started with re2c. All examples in
748       the lessons subdirectory compile and work as written.
749
750

FEATURES

752       re2c does not provide a default action: the generated code assumes that
753       the  input will consist of a sequence of tokens.  Typically this can be
754       dealt with by adding a rule such as the one for  unexpected  characters
755       in the example above.
756
757       The  user  must  arrange  for  a sentinel token to appear at the end of
758       input (and provide a rule for matching it): re2c does  not  provide  an
759       <<EOF>>  expression.   If  the  source  is  from a null-byte terminated
760       string, a rule matching a null character will suffice.  If  the  source
761       is  from  a  file  then you could pad the input with a newline (or some
762       other character that cannot appear within another token);  upon  recog‐
763       nizing  such  a  character  check  to see if it is the sentinel and act
764       accordingly. You can also use YYFILL(n) to end the scanner when not
765       enough characters are available, which amounts to detecting the end
766       of data/file.
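
       To make the sentinel idea concrete, here is a hand-written C sketch of
       the control flow such a scanner follows on a null-terminated string.
       It is not re2c output: count_numbers() and its token classes are
       hypothetical and only illustrate that the terminating null byte is
       matched like any other token and ends the scan without a separate
       end-of-input test.

```c
#include <stddef.h>

/* Hypothetical hand-written scanner (not generated by re2c): counts runs
 * of decimal digits in a null-terminated string. The terminating '\0' is
 * the sentinel; it is treated as one more token whose action stops the
 * scan, so no separate length check is needed. */
static size_t count_numbers(const char *p)
{
    size_t count = 0;

    for (;;) {
        unsigned char yych = (unsigned char)*p;

        if (yych == '\0') {                /* sentinel rule: end of input */
            return count;
        }
        if (yych >= '0' && yych <= '9') {  /* rule [0-9]+ */
            do {
                yych = (unsigned char)*++p;
            } while (yych >= '0' && yych <= '9');
            ++count;
            continue;
        }
        ++p;                               /* rule [^]: skip any other byte */
    }
}
```

       Because the null byte can never be part of a digit run, the inner loop
       always stops at or before the sentinel, which is the property the
       sentinel character must have in any real specification.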
767
768

BUGS

770       Difference only works for character sets.
771
772       The re2c internal algorithms need documentation.
773
774

SEE ALSO

776       flex(1), lex(1).
777
778       More information on re2c can be found here:
779       http://re2c.org/
780
781

AUTHORS

783       Peter Bumbulis <peter@csg.uwaterloo.ca>
784       Brian Young <bayoung@acm.org>
785       Dan Nuffer <nuffer@users.sourceforge.net>
786       Marcus Boerger <helly@users.sourceforge.net>
787       Hartmut Kaiser <hkaiser@users.sourceforge.net>
788       Emmanuel Mogenet <mgix@mgix.com> added storable state

VERSION INFORMATION

790       This manpage describes re2c, version 0.13.5.
791
792
793
794
795Version 0.13.5                    12 Jul 2010                          RE2C(1)