lex(1) - osol9

lex(1)                           User Commands                          lex(1)

NAME

6       lex - generate programs for lexical tasks
7

SYNOPSIS

9       lex [-cntv] [-e | -w] [-V -Q [y | n]] [file]...
10
11

DESCRIPTION

13       The  lex  utility generates C programs to be used in lexical processing
14       of character input, and that can be used as an interface to yacc. The C
15       programs  are  generated  from lex source code and conform to the ISO C
16       standard. Usually, the lex utility writes the program it  generates  to
17       the  file  lex.yy.c. The state of this file is unspecified if lex exits
18       with a non-zero exit status. See EXTENDED DESCRIPTION  for  a  complete
19       description of the lex input language.
20

OPTIONS

22       The following options are supported:
23
24       -c          Indicates C-language action (default option).
25
26
27       -e          Generates  a program that can handle EUC characters (cannot
28                   be used with the -w option). yytext[] is of  type  unsigned
29                   char[].
30
31
32       -n          Suppresses  the  summary of statistics usually written with
33                   the -v option. If no table sizes are specified in  the  lex
34                   source  code and the -v option is not specified, then -n is
35                   implied.
36
37
38       -t          Writes the resulting program to standard output instead  of
39                   lex.yy.c.
40
41
42       -v          Writes  a  summary of lex statistics to the standard error.
43                   (See the discussion of lex table sizes  under  the  heading
44                   Definitions  in  lex.)  If table sizes are specified in the
45                   lex source code, and if the -n option is not specified, the
46                   -v option may be enabled.
47
48
49       -w          Generates  a program that can handle EUC characters (cannot
50                   be used with the -e option). Unlike the -e option, yytext[]
51                   is of type wchar_t[].
52
53
54       -V          Prints out version information on standard error.
55
56
57       -Q[y|n]     Prints  out  version information to output file lex.yy.c by
58                   using -Qy. The -Qn option does not print out version infor‐
59                   mation and is the default.
60
61

OPERANDS

63       The following operand is supported:
64
       file     A pathname of an input file. If more than one such file is
                specified, all files will be concatenated to produce a single
                lex program. If no file operands are specified, or if a file
                operand is -, the standard input will be used.
69
70

OUTPUT

72       The lex output files are described below.
73
74   Stdout
75       If the -t option is specified, the text file of C source code output of
76       lex will be written to standard output.
77
78   Stderr
       If the -t option is specified, informational, error, and warning
       messages concerning the contents of lex source code input will be
       written to the standard error.
82
83
84       If the -t option is not specified:
85
           1.     Informational, error, and warning messages concerning the
                  contents of lex source code input will be written to either
                  the standard output or standard error.
89
90           2.     If the -v option is specified and the -n option is not spec‐
91                  ified, lex statistics  will  also  be  written  to  standard
92                  error. These statistics may also be generated if table sizes
93                  are specified with a % operator in the  Definitions  in  lex
94                  section (see EXTENDED DESCRIPTION), as long as the -n option
95                  is not specified.
96
97   Output Files
98       A text file containing C source code will be written to lex.yy.c, or to
99       the standard output if the -t option is present.
100

EXTENDED DESCRIPTION

102       Each  input  file contains lex source code, which is a table of regular
103       expressions with corresponding actions in the form of C  program  frag‐
104       ments.
105
106
107       When lex.yy.c is compiled and linked with the lex library (using the -l
108       l operand with c89 or cc), the resulting program reads character  input
109       from  the  standard input and partitions it into strings that match the
110       given expressions.
111
112
113       When an expression is matched, these actions will occur:
114
115           o      The input string that was matched is left  in  yytext  as  a
116                  null-terminated string; yytext is either an external charac‐
117                  ter array or a pointer to a character string.  As  explained
118                  in  Definitions  in lex, the type can be explicitly selected
119                  using the %array or %pointer declarations, but  the  default
120                  is %array.
121
122           o      The external int yyleng is set to the length of the matching
123                  string.
124
125           o      The expression's corresponding program fragment, or  action,
126                  is executed.
127
128
129       During  pattern matching, lex searches the set of patterns for the sin‐
130       gle longest possible match. Among rules that match the same  number  of
131       characters, the rule given first will be chosen.
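
       The selection rule above (longest match wins, ties go to the rule
       given first) can be sketched in C. This is only an illustration, not
       code that lex generates: pick_rule and its match_len array are
       hypothetical names, and the sketch assumes that a length of 0 means
       the rule does not match at the current position.

```c
#include <stddef.h>

/* Hypothetical helper, not part of lex. match_len[i] is the number of
 * characters rule i would match at the current input position
 * (assumption: 0 means "no match"). Returns the index of the rule that
 * would be selected, or -1 if no rule matches. */
int pick_rule(const size_t match_len[], size_t nrules)
{
    int best = -1;
    size_t best_len = 0;

    for (size_t i = 0; i < nrules; i++) {
        /* Strictly greater: on a tie, the earlier rule is kept. */
        if (match_len[i] > best_len) {
            best_len = match_len[i];
            best = (int)i;
        }
    }
    return best;
}
```

       With candidate lengths 2, 5, 5 and 3, the second rule is selected: it
       ties with the third for the longest match and appears first.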
132
133
134       The general format of lex source is:
135
136         Definitions
137         %%
138         Rules
139         %%
140         User Subroutines
141
142
143
144       The  first  %%  is required to mark the beginning of the rules (regular
145       expressions and actions); the second %% is required only if  user  sub‐
146       routines follow.
147
148
149       Any line in the Definitions in lex section beginning with a blank char‐
150       acter will be assumed to be a C program fragment and will be copied  to
151       the  external definition area of the lex.yy.c file. Similarly, anything
152       in the Definitions in lex section included between delimiter lines con‐
153       taining  only  %{  and %} will also be copied unchanged to the external
154       definition area of the lex.yy.c file.
155
156
157       Any such input (beginning with a blank character or within  %{  and  %}
158       delimiter lines) appearing at the beginning of the Rules section before
159       any rules are specified will be written to lex.yy.c after the  declara‐
160       tions  of variables for the yylex function and before the first line of
161       code in yylex. Thus, user variables local  to  yylex  can  be  declared
162       here, as well as application code to execute upon entry to yylex.
163
164
165       The  action  taken  by lex when encountering any input beginning with a
166       blank character or within %{ and %} delimiter lines  appearing  in  the
167       Rules  section  but  coming  after  one or more rules is undefined. The
168       presence of such input may result in an  erroneous  definition  of  the
169       yylex function.
170
171   Definitions in lex
172       Definitions  in  lex  appear before the first %% delimiter. Any line in
173       this section not contained between %{ and %} lines  and  not  beginning
174       with  a blank character is assumed to define a lex substitution string.
175       The format of these lines is:
176
177         name   substitute
178
179
180
181
182       If a name does not meet the requirements for identifiers in the  ISO  C
183       standard,  the  result is undefined. The string substitute will replace
184       the string { name } when it is used in a rule. The name string is  rec‐
185       ognized  in  this context only when the braces are provided and when it
186       does not appear within a bracket expression or within double-quotes.
187
188
189       In the Definitions in lex section, any line beginning with a % (percent
190       sign)  character  and  followed  by an alphanumeric word beginning with
191       either s or S defines a set of start  conditions.  Any  line  beginning
192       with  a % followed by a word beginning with either x or X defines a set
193       of exclusive start conditions. When the generated scanner is  in  a  %s
194       state,  patterns  with  no state specified will be also active; in a %x
195       state, such patterns will not be active. The rest of  the  line,  after
196       the  first  word, is considered to be one or more blank-character-sepa‐
197       rated names of start conditions. Start condition names are  constructed
198       in  the  same  way as definition names. Start conditions can be used to
199       restrict the matching of regular expressions to one or more  states  as
200       described in Regular expressions in lex.
201
202
203       Implementations  accept  either of the following two mutually exclusive
204       declarations in the Definitions in lex section:
205
206       %array       Declare the type of yytext to be a null-terminated charac‐
207                    ter array.
208
209
210       %pointer     Declare  the type of yytext to be a pointer to a null-ter‐
211                    minated character string.
212
213
214
215       Note: When using the %pointer option, you may not also use  the  yyless
216       function to alter yytext.
217
218
       %array is the default. If %array is specified (or neither %array nor
       %pointer is specified), then the correct way to make an external
       reference to yytext is with a declaration of the form:


       extern char yytext[];
225
226
227       If %pointer is specified, then the correct external reference is of the
228       form:
229
230
231       extern char *yytext;
232
233
234       lex will accept declarations in the Definitions in lex section for set‐
235       ting  certain  internal  table sizes. The declarations are shown in the
236       following table.
237
238
239       Table Size Declaration in lex
240
241
242
243
       ┌───────────────────────────────────────────────────────────────────┐
       │ Declaration    Description                          Default       │
       ├───────────────────────────────────────────────────────────────────┤
       │ %p n           Number of positions                  2500          │
       │ %n n           Number of states                     500           │
       │ %a n           Number of transitions                2000          │
       │ %e n           Number of parse tree nodes           1000          │
       │ %k n           Number of packed character classes   10000         │
       │ %o n           Size of the output array             3000          │
       └───────────────────────────────────────────────────────────────────┘
254
255
256       Programs generated by lex need either the -e or  -w  option  to  handle
257       input that contains EUC characters from supplementary codesets. If nei‐
258       ther of these options is specified, yytext is of the type  char[],  and
259       the generated program can handle only ASCII characters.
260
261
262       When  the  -e option is used, yytext is of the type unsigned char[] and
263       yyleng gives the total number of bytes in the matched string. With this
264       option,  the  macros input(), unput(c), and output(c) should do a byte-
265       based I/O in the same way as with the regular ASCII lex. Two more vari‐
266       ables  are  available  with  the  -e option, yywtext and yywleng, which
267       behave the same as yytext and yyleng would under the -w option.
268
269
270       When the -w option is used, yytext is of the type wchar_t[] and  yyleng
271       gives  the  total  number  of characters in the matched string.  If you
272       supply your own  input(),  unput(c),  or  output(c)  macros  with  this
273       option,  they  must return or accept EUC characters in the form of wide
274       character (wchar_t). This allows a  different  interface  between  your
275       program and the lex internals, to expedite some programs.
276
277   Rules in lex
278       The Rules in lex source files are a table in which the left column con‐
279       tains regular expressions and the right column contains actions (C pro‐
280       gram fragments) to be executed when the expressions are recognized.
281
282         ERE action
283         ERE action
284         ...
285
286
287
288       The  extended  regular  expression (ERE) portion of a row will be sepa‐
289       rated from action by one or more blank characters. A regular expression
290       containing  blank  characters  is recognized under one of the following
291       conditions:
292
293           o      The entire expression appears within double-quotes.
294
295           o      The blank characters appear within double-quotes  or  square
296                  brackets.
297
298           o      Each blank character is preceded by a backslash character.
299
300   User Subroutines in lex
301       Anything  in  the  user  subroutines section will be copied to lex.yy.c
302       following yylex.
303
   Regular Expressions in lex
       The lex utility supports the set of Extended Regular Expressions (EREs)
       described in regex(5), with the following additions and exceptions to
       the syntax:
308
       "..."         Any string enclosed in double-quotes will represent the
                     characters within the double-quotes as themselves, except
                     that backslash escapes (which appear in the following
                     table) are recognized. Any backslash-escape sequence is
                     terminated by the closing quote. For example, "\01""1"
                     represents a single string: the octal value 1 followed by
                     the character 1.
316
317
318
319       <state>r
320
321       <state1, state2, ...>r
322
323           The regular expression r will be matched only when the  program  is
324           in  one  of the start conditions indicated by state, state1, and so
325           forth. For more information, see Actions in lex. As an exception to
326           the typographical conventions of the rest of this document, in this
327           case <state> does not represent a  metavariable,  but  the  literal
328           angle-bracket  characters surrounding a symbol. The start condition
329           is recognized as such only at the beginning of  a  regular  expres‐
330           sion.
331
332
333       r/x
334
335           The  regular expression r will be matched only if it is followed by
336           an occurrence of regular expression x. The token returned in yytext
337           will  only match r. If the trailing portion of r matches the begin‐
338           ning of x, the result  is  unspecified.  The  r  expression  cannot
339           include further trailing context or the $ (match-end-of-line) oper‐
340           ator; x cannot include the  ^  (match-beginning-of-line)  operator,
341           nor  trailing context, nor the $ operator. That is, only one occur‐
342           rence of trailing context is allowed in a lex  regular  expression,
343           and  the  ^  operator  only can be used at the beginning of such an
344           expression. A further  restriction  is  that  the  trailing-context
345           operator / (slash) cannot be grouped within parentheses.
346
347
348       {name}
349
350           When  name  is one of the substitution symbols from the Definitions
351           section, the  string,  including  the  enclosing  braces,  will  be
352           replaced  by  the  substitute  value.  The substitute value will be
353           treated in the extended regular expression as if it  were  enclosed
354           in  parentheses. No substitution will occur if {name} occurs within
355           a bracket expression or within double-quotes.
356
357
358
359       Within an ERE, a backslash character (\\, \a, \b, \f, \n, \r,  \t,  \v)
360       is  considered  to  begin  an  escape sequence. In addition, the escape
361       sequences in the following table will be recognized.
362
363
364       A literal newline character cannot occur  within  an  ERE;  the  escape
365       sequence  \n  can  be  used to represent a newline character. A newline
366       character cannot be matched by a period operator.
367
368
       ┌──────────────────────────────────────────────────────────────────────────────┐
       │Escape Sequences in lex                                                       │
       ├──────────────────────────────────────────────────────────────────────────────┤
       │  Sequence    Description                       Meaning                       │
       ├──────────────────────────────────────────────────────────────────────────────┤
       │  \digits     A backslash character followed    The character whose encoding  │
       │              by the longest sequence of one,   is represented by the one-,   │
       │              two, or three octal-digit         two-, or three-digit octal    │
       │              characters (01234567). If all of  integer. Multi-byte characters│
       │              the digits are 0 (that is,        require multiple, concatenated│
       │              representation of the NUL         escape sequences of this type,│
       │              character), the behavior is       including the leading \ for   │
       │              undefined.                        each byte.                    │
       ├──────────────────────────────────────────────────────────────────────────────┤
       │  \xdigits    A backslash character followed    The character whose encoding  │
       │              by the longest sequence of        is represented by the         │
       │              hexadecimal-digit characters      hexadecimal integer.          │
       │              (0123456789abcdefABCDEF). If all                                │
       │              of the digits are 0 (that is,                                   │
       │              representation of the NUL                                       │
       │              character), the behavior is                                     │
       │              undefined.                                                      │
       ├──────────────────────────────────────────────────────────────────────────────┤
       │  \c          A backslash character followed    The character c, unchanged.   │
       │              by any character not described                                  │
       │              in this table or among the                                      │
       │              escapes \\, \a, \b, \f, \n, \r,                                 │
       │              \t, \v.                                                         │
       └──────────────────────────────────────────────────────────────────────────────┘
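
       The "longest sequence of one, two or three octal digits" rule for
       \digits can be sketched in C as follows. parse_octal_escape is a
       hypothetical helper written for this illustration; it is not a lex
       function, and it makes no special case of the all-zero escape, whose
       behavior the table leaves undefined.

```c
#include <stddef.h>

/* Hypothetical helper, not part of lex. s points just past the backslash.
 * Consumes the longest run of one to three octal digits, stores the
 * resulting value in *value, and returns the number of digits consumed.
 * Returns 0 (and leaves *value untouched) if s does not start with an
 * octal digit. */
size_t parse_octal_escape(const char *s, unsigned *value)
{
    size_t n = 0;
    unsigned v = 0;

    while (n < 3 && s[n] >= '0' && s[n] <= '7') {
        v = v * 8 + (unsigned)(s[n] - '0');
        n++;
    }
    if (n > 0)
        *value = v;
    return n;
}
```

       Given the text 0011 after the backslash, the sketch consumes 001
       (value 1) and leaves the final 1 as an ordinary character, because at
       most three digits belong to the escape.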
404
405
406       The order of precedence given to extended regular expressions  for  lex
407       is as shown in the following table, from high to low.
408
409       Note:     The escaped characters entry is not meant to imply that these
410                 are operators, but they are included in  the  table  to  show
411                 their  relationships  to the true operators. The start condi‐
412                 tion, trailing context  and  anchoring  notations  have  been
413                 omitted  from the table because of the placement restrictions
414                 described in this section; they can only appear at the begin‐
415                 ning or ending of an ERE.
416
417
418
419
420
421       ┌────────────────────────────────────────────────────────────────┐
422       │      ERE Precedence in lex                                     │
423       ├────────────────────────────────────────────────────────────────┤
424       │collation-related bracket symbols   [= =]  [: :]  [. .]         │
425       │escaped characters                  \<special character>        │
426       │bracket expression                  [ ]                         │
427       │quoting                             "..."                       │
428       │grouping                            ()                          │
429       │definition                          {name}                      │
430       │single-character RE duplication     * + ?                       │
431       │concatenation                                                   │
432       │interval expression                 {m,n}                       │
433       │alternation                         |                           │
434       └────────────────────────────────────────────────────────────────┘
435
436
437       The  ERE anchoring operators (^ and $) do not appear in the table. With
438       lex regular expressions, these operators are restricted in  their  use:
439       the  ^  operator can only be used at the beginning of an entire regular
440       expression, and the $ operator only at the end. The operators apply  to
441       the   entire   regular  expression.  Thus,  for  example,  the  pattern
442       (^abc)|(def$) is undefined; it can instead be written as  two  separate
443       rules,  one  with  the regular expression ^abc and one with def$, which
444       share a common action via the special | action (see below). If the pat‐
445       tern  were  written ^abc|def$, it would match either of abc or def on a
446       line by itself.
447
448
449       Unlike the general ERE rules, embedded anchoring is not allowed by most
450       historical  lex implementations. An example of embedded anchoring would
451       be for patterns such as (^)foo($) to match foo when it exists as a com‐
452       plete  word. This functionality can be obtained using existing lex fea‐
453       tures:
454
         ^foo/[ \n]      |
         " foo"/[ \n]    /* found foo as a separate word */
457
458
459
       Notice also that $ is a form of trailing context (it is equivalent to
       /\n) and as such cannot be used with regular expressions containing
       another instance of the operator (see the preceding discussion of
       trailing context).
464
465
       The trailing-context operator / (slash), which lex adds to the ERE
       syntax, can be used as an ordinary character if presented within
       double-quotes, "/"; preceded by a backslash, \/; or within a bracket
       expression, [/]. The start-condition < and > operators are special only
       in a start condition at the beginning of a regular expression;
       elsewhere in the regular expression they are treated as ordinary
       characters.
472
473
474       The following examples clarify  the  differences  between  lex  regular
475       expressions  and  regular expressions appearing elsewhere in this docu‐
476       ment. For regular expressions of the form r/x, the string matching r is
477       always  returned;  confusion  may arise when the beginning of x matches
478       the trailing portion of r. For example, given  the  regular  expression
479       a*b/cc  and  the  input aaabcc, yytext would contain the string aaab on
480       this match. But given the regular expression x*/xy and the input  xxxy,
481       the  token xxx, not xx, is returned by some implementations because xxx
482       matches x*.
483
484
485       In the rule ab*/bc, the b* at the end of r will extend r's  match  into
486       the beginning of the trailing context, so the result is unspecified. If
487       this rule were ab/bc, however, the rule matches the text ab when it  is
488       followed  by the text bc. In this latter case, the matching of r cannot
489       extend into the beginning of x, so the result is specified.
490
491   Actions in lex
492       The action to be taken when an ERE is matched can be a C program  frag‐
493       ment  or  the special actions described below; the program fragment can
494       contain one or more C statements, and can also include special actions.
495       The  empty  C statement ; is a valid action; any string in the lex.yy.c
496       input that matches the pattern portion of such a  rule  is  effectively
497       ignored or skipped. However, the absence of an action is not valid, and
498       the action lex takes in such a condition is undefined.
499
500
501       The specification for an action, including  C  statements  and  special
502       actions, can extend across several lines if enclosed in braces:
503
504         ERE <one or more blanks> { program statement
505         program statement }
506
507
508
509
510       The  default action when a string in the input to a lex.yy.c program is
511       not matched by any expression is to copy  the  string  to  the  output.
512       Because  the  default behavior of a program generated by lex is to read
513       the input and copy it to the output, a minimal lex source program  that
514       has  just  %% generates a C program that simply copies the input to the
515       output unchanged.
516
517
518       Four special actions are available:
519
520         |       ECHO;      REJECT;      BEGIN
521
522
523
524       |           The action | means that the action for the next rule is the
525                   action  for  this  rule.  Unlike the other three actions, |
526                   cannot be enclosed in braces or be semicolon-terminated. It
527                   must be specified alone, with no other actions.
528
529
530       ECHO;       Writes the contents of the string yytext on the output.
531
532
533       REJECT;     Usually  only  a  single  expression  is matched by a given
534                   string in the input. REJECT means  "continue  to  the  next
535                   expression  that  matches  the  current  input," and causes
536                   whatever rule was the second choice after the current  rule
537                   to be executed for the same input. Thus, multiple rules can
538                   be matched and executed for one input string or overlapping
539                   input  strings.  For example, given the regular expressions
540                   xyz and xy and the input  xyz,  usually  only  the  regular
541                   expression  xyz would match. The next attempted match would
                   start after z. If the last action in the xyz rule is REJECT,
                   both this rule and the xy rule would be executed. The
544                   REJECT action may be implemented in  such  a  fashion  that
545                   flow  of  control does not continue after it, as if it were
546                   equivalent to a goto to another part of yylex. The  use  of
547                   REJECT may result in somewhat larger and slower scanners.
548
549
550       BEGIN       The action:
551
552                   BEGIN newstate;
553
554                   switches  the  state  (start condition) to newstate. If the
555                   string newstate has not been declared previously as a start
556                   condition  in  the  Definitions in lex section, the results
557                   are unspecified. The initial  state  is  indicated  by  the
558                   digit 0 or the token INITIAL.
559
560
561
562       The  functions  or  macros  described below are accessible to user code
563       included in the lex input. It is unspecified whether they appear in the
564       C  code  output of lex, or are accessible only through the -l l operand
565       to c89 or cc (the lex library).
566
567       int yylex(void)      Performs lexical analysis on the  input;  this  is
568                            the primary function generated by the lex utility.
569                            The function returns zero when the end of input is
570                            reached;  otherwise  it  returns  non-zero  values
571                            (tokens)  determined  by  the  actions  that   are
572                            selected.
573
574
575       int yymore(void)     When  called,  indicates  that when the next input
576                            string is recognized, it is to be appended to  the
577                            current  value of yytext rather than replacing it;
578                            the value in yyleng is adjusted accordingly.
579
580
       int yyless(int n)    Retains n initial characters in yytext,
                            NUL-terminated, and treats the remaining characters
                            as if they had not been read; the value in yyleng
                            is adjusted accordingly.
585
586
587       int input(void)      Returns the next character from the input, or zero
588                            on end-of-file. It obtains input from  the  stream
589                            pointer  yyin, although possibly via an intermedi‐
590                            ate buffer. Thus, once  scanning  has  begun,  the
591                            effect of altering the value of yyin is undefined.
592                            The character  read  is  removed  from  the  input
593                            stream  of  the  scanner without any processing by
594                            the scanner.
595
596
597       int unput(int c)     Returns the character c to the input;  yytext  and
598                            yyleng  are undefined until the next expression is
599                            matched. The result of using unput for more  char‐
600                            acters than have been input is unspecified.
601
602
603
604       The  following  functions  appear  only  in  the lex library accessible
605       through the -l l operand; they can therefore be redefined by a portable
606       application:
607
608       int yywrap(void)
609
           Called by yylex at end-of-file; the default yywrap always returns
           1. If the application requires yylex to continue processing with
           another source of input, then the application can include a
           function yywrap, which associates another file with the external
           variable FILE *yyin and returns a value of zero.
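
           A replacement yywrap along these lines might look like the sketch
           below. Only yywrap and yyin come from this page; pending_files and
           set_pending_files are hypothetical names introduced for the
           example. yyin is defined here only so that the fragment compiles
           on its own; in a real scanner, lex.yy.c provides it and the
           definition would become an extern declaration.

```c
#include <stdio.h>

/* Assumption: defined here so the sketch is self-contained. In a real
 * scanner, lex.yy.c defines yyin; use "extern FILE *yyin;" instead. */
FILE *yyin;

/* Hypothetical: NULL-terminated list of files still to be scanned. */
static const char **pending_files;

void set_pending_files(const char **files)
{
    pending_files = files;
}

/* Called by yylex at end-of-file. Returns 0 after pointing yyin at the
 * next readable file, or 1 when no further input is available. */
int yywrap(void)
{
    while (pending_files != NULL && *pending_files != NULL) {
        FILE *next = fopen(*pending_files++, "r");

        if (next != NULL) {
            if (yyin != NULL && yyin != stdin)
                fclose(yyin);
            yyin = next;
            return 0;   /* more input: yylex keeps scanning */
        }
        /* Unreadable file: skip it and try the next one. */
    }
    return 1;           /* no more input: yylex returns zero */
}
```

           Skipping unreadable files silently is a design choice made for
           brevity; an application may prefer to report the failure.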
615
616
617       int main(int argc, char *argv[])
618
619           Calls  yylex to perform lexical analysis, then exits. The user code
620           can contain main to perform application-specific operations,  call‐
621           ing yylex as applicable.
622
623
624
625       The  reason  for  breaking  these functions into two lists is that only
626       those functions in libl.a can  be  reliably  redefined  by  a  portable
627       application.
628
629
630       Except  for input, unput and main, all external and static names gener‐
631       ated by lex begin with the prefix yy or YY.
632

USAGE

634       Portable applications are warned that in the Rules in lex  section,  an
635       ERE  without  an  action is not acceptable, but need not be detected as
636       erroneous by lex. This may result in compilation or run-time errors.
637
638
639       The purpose of input is to take characters off  the  input  stream  and
640       discard  them as far as the lexical analysis is concerned. A common use
641       is to discard the body of a comment once the beginning of a comment  is
642       recognized.
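
       The comment-skipping use described above can be sketched in C as
       follows. This is an illustration, not code generated by lex: inside
       a lex action the loop would call input() directly, whereas here the
       character source is passed in as a function pointer so that the
       fragment can stand alone. The names skip_c_comment, set_source,
       string_source, and remaining_text are hypothetical.

```c
#include <stdio.h>

/* Hypothetical stand-in for the scanner's input() routine. */
typedef int (*char_source)(void);

/* Discards characters up to and including the closing star-slash of a
   C-style comment whose opening has already been matched.
   Returns 0 on success, or -1 if the input ends first.
   Both 0 and EOF are treated as end of input. */
int skip_c_comment(char_source next)
{
    int c;
    int prev = 0;

    while ((c = next()) != EOF && c != 0) {
        if (prev == '*' && c == '/')
            return 0;
        prev = c;
    }
    return -1;
}

/* String-backed character source, for illustration only. */
static const char *cursor = "";

void set_source(const char *s)
{
    cursor = s;
}

int string_source(void)
{
    return *cursor != '\0' ? (unsigned char)*cursor++ : EOF;
}

const char *remaining_text(void)
{
    return cursor;
}
```

       After a successful call, scanning resumes with the first character
       that follows the comment.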
643
644
645       The lex utility is not fully internationalized in its treatment of reg‐
646       ular expressions in the lex source code or generated lexical  analyzer.
647       It would seem desirable to have the lexical analyzer interpret the reg‐
648       ular expressions given in the lex source according to  the  environment
649       specified when the lexical analyzer is executed, but this is not possi‐
650       ble with the current lex technology. Furthermore, the  very  nature  of
651       the lexical analyzers produced by lex must be closely tied to the lexi‐
652       cal requirements of the input language being described, which will fre‐
653       quently  be  locale-specific  anyway. (For example, writing an analyzer
654       that is used for French text will not automatically be useful for  pro‐
655       cessing other languages.)
656

EXAMPLES

658       Example 1 Using lex
659
660
661       The following is an example of a lex program that implements a rudimen‐
662       tary scanner for a Pascal-like syntax:
663
664
         %{
         /* needed for atoi() and atof() below */
         #include <stdlib.h>
         /* needed for printf(), fopen(), and stdin below */
         #include <stdio.h>
         %}

         DIGIT    [0-9]
         ID       [a-z][a-z0-9]*
         %%

         {DIGIT}+               {
                                printf("An integer: %s (%d)\n", yytext,
                                       atoi(yytext));
                                }

         {DIGIT}+"."{DIGIT}*    {
                                printf("A float: %s (%g)\n", yytext,
                                       atof(yytext));
                                }

         if|then|begin|end|procedure|function    {
                                printf("A keyword: %s\n", yytext);
                                }

         {ID}                   printf("An identifier: %s\n", yytext);

         "+"|"-"|"*"|"/"        printf("An operator: %s\n", yytext);

         "{"[^}\n]*"}"          /* eat up one-line comments */

         [ \t\n]+               /* eat up white space */

         .                      printf("Unrecognized character: %s\n", yytext);

         %%

         int main(int argc, char *argv[])
         {
                 ++argv, --argc;  /* skip over program name */
                 if (argc > 0) {
                         yyin = fopen(argv[0], "r");
                         if (yyin == NULL) {
                                 /* report the file that failed to open */
                                 perror(argv[0]);
                                 return 1;
                         }
                 } else {
                         yyin = stdin;
                 }

                 yylex();
                 return 0;
         }
712
713
714

ENVIRONMENT VARIABLES

716       See environ(5) for descriptions of the following environment  variables
717       that  affect  the execution of lex: LANG, LC_ALL, LC_COLLATE, LC_CTYPE,
718       LC_MESSAGES, and NLSPATH.
719

EXIT STATUS

721       The following exit values are returned:
722
723       0      Successful completion.
724
725
726       >0     An error occurred.
727
728

ATTRIBUTES

730       See attributes(5) for descriptions of the following attributes:
731
732
733
734
735       ┌─────────────────────────────┬─────────────────────────────┐
736       │      ATTRIBUTE TYPE         │      ATTRIBUTE VALUE        │
737       ├─────────────────────────────┼─────────────────────────────┤
738       │Availability                 │SUNWbtool                    │
739       ├─────────────────────────────┼─────────────────────────────┤
740       │Interface Stability          │Standard                     │
741       └─────────────────────────────┴─────────────────────────────┘
742

NOTES

       If routines such as yyback(), yywrap(), and yylook() in .l (ell) files
       are to be external C functions, the command line to compile a C++
       program must define the __EXTERN_C__ macro. For example:
750
751         example%  CC -D__EXTERN_C__ ... file
752
753
754
755
756
757SunOS 5.11                        22 Aug 1997                           lex(1)