Genlex(3)                        OCaml library                       Genlex(3)

NAME

       Genlex - A generic lexical analyzer.

Module

       Module   Genlex

Documentation

       Module Genlex
        : sig end

       A generic lexical analyzer.

       This module implements a simple ``standard'' lexical analyzer,
       presented as a function from character streams to token streams.
       It implements roughly the lexical conventions of Caml, but is
       parameterized by the set of keywords of your language.

       Example: a lexer suitable for a desk calculator is obtained by

       let lexer = make_lexer ["+";"-";"*";"/";"let";"="; "("; ")"]

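       As a rough illustration of the lexer above in use, the following
       sketch feeds it a string and collects the resulting tokens in a
       list. It assumes a compiler that still ships the Genlex and Stream
       modules (OCaml before 5.0) or an equivalent replacement library;
       tokens_of_string is a hypothetical helper, not part of Genlex.

```ocaml
(* Assumption: Genlex and Stream are available (OCaml < 5.0, or a
   replacement library providing the same modules). *)
open Genlex

let lexer = make_lexer ["+"; "-"; "*"; "/"; "let"; "="; "("; ")"]

(* Hypothetical helper: lex a string and drain the token stream into a list. *)
let tokens_of_string s =
  let toks = lexer (Stream.of_string s) in
  let rec loop acc =
    match Stream.peek toks with
    | None -> List.rev acc
    | Some t -> Stream.junk toks; loop (t :: acc)
  in
  loop []

let () =
  assert (tokens_of_string "let x = (1 + 2)"
          = [Kwd "let"; Ident "x"; Kwd "="; Kwd "("; Int 1; Kwd "+"; Int 2; Kwd ")"])
```

       Because the token stream is lazy, characters are only consumed
       from the input as tokens are requested.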
       The associated parser would be a function from token streams to,
       for instance, int, and would have rules such as:

       let rec parse_expr = parser
           [< 'Int n >] -> n
         | [< 'Kwd "("; n = parse_expr; 'Kwd ")" >] -> n
         | [< n1 = parse_expr; n2 = parse_remainder n1 >] -> n2
       and parse_remainder n1 = parser
           [< 'Kwd "+"; n2 = parse_expr >] -> n1+n2
         | ...

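       The parser notation in these rules is a Camlp4 stream-parser
       syntax extension rather than core OCaml. A minimal sketch of a
       comparable parser written with plain Stream functions might look
       as follows. It handles only integers, parentheses and +, and the
       grammar is restructured to avoid left recursion, so it is not a
       transcription of the rules above. The names parse_atom and eval
       are hypothetical, and the sketch assumes Genlex and Stream are
       available (OCaml before 5.0).

```ocaml
(* Assumption: Genlex and Stream are available (OCaml < 5.0). *)
open Genlex

let lexer = make_lexer ["+"; "("; ")"]

(* atom ::= Int n | "(" expr ")" *)
let rec parse_atom toks =
  match Stream.next toks with
  | Int n -> n
  | Kwd "(" ->
      let n = parse_expr toks in
      (match Stream.next toks with
       | Kwd ")" -> n
       | _ -> failwith "expected )")
  | _ -> failwith "expected an integer or ("

(* expr ::= atom [ "+" expr ] *)
and parse_expr toks =
  let n1 = parse_atom toks in
  match Stream.peek toks with
  | Some (Kwd "+") -> Stream.junk toks; n1 + parse_expr toks
  | _ -> n1

(* Hypothetical entry point: lex a string, then parse it to an int. *)
let eval s = parse_expr (lexer (Stream.of_string s))

let () = assert (eval "(1 + 2) + 39" = 42)
```

       Stream.peek inspects the next token without consuming it, which
       plays the role of the one-token lookahead that stream parsers
       perform implicitly.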
       type token =
        | Kwd of string
        | Ident of string
        | Int of int
        | Float of float
        | String of string
        | Char of char

       The type of tokens. The lexical classes are: Int and Float for
       integer and floating-point numbers; String for string literals,
       enclosed in double quotes; Char for character literals, enclosed
       in single quotes; Ident for identifiers (either sequences of
       letters, digits, underscores and quotes, or sequences of
       ``operator characters'' such as +, *, etc.); and Kwd for keywords
       (either identifiers or single ``special characters'' such as (, },
       etc.).

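       To make the lexical classes concrete, the sketch below lexes one
       sample of each class. The helper lex and the sample input are
       illustrative assumptions, and the code presumes Genlex and Stream
       are available (OCaml before 5.0).

```ocaml
(* Assumption: Genlex and Stream are available (OCaml < 5.0). *)
open Genlex

(* Hypothetical helper: lex s with keyword list kws, returning all tokens. *)
let lex kws s =
  let toks = make_lexer kws (Stream.of_string s) in
  let rec loop acc =
    match Stream.peek toks with
    | None -> List.rev acc
    | Some t -> Stream.junk toks; loop (t :: acc)
  in
  loop []

let () =
  (* Only "(" and "if" are declared as keywords here. *)
  let toks = lex ["("; "if"] "x_1' 42 2.5 \"hi\" 'c' <= ( if" in
  assert (toks = [ Ident "x_1'";   (* letters, digits, underscores, quotes *)
                   Int 42;
                   Float 2.5;
                   String "hi";
                   Char 'c';
                   Ident "<=";     (* operator characters, not a keyword *)
                   Kwd "(";        (* special character listed as keyword *)
                   Kwd "if" ])     (* identifier listed as keyword *)
```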
       val make_lexer : string list -> char Stream.t -> token Stream.t

       Construct the lexer function. The first argument is the list of
       keywords. An identifier s is returned as Kwd s if s belongs to
       this list, and as Ident s otherwise. A special character s is
       returned as Kwd s if s belongs to this list, and causes a lexical
       error (exception Parse_error) otherwise. Blanks and newlines are
       skipped. Comments delimited by (* and *) are skipped as well, and
       can be nested.

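       The behaviour described above can be checked with a short sketch.
       The helper lex is hypothetical, and Genlex and Stream are assumed
       to be available (OCaml before 5.0). Since the exact exception
       raised may depend on the library version, the last check catches
       any exception instead of naming one.

```ocaml
(* Assumption: Genlex and Stream are available (OCaml < 5.0). *)
open Genlex

(* Hypothetical helper: lex s with keyword list kws, returning all tokens. *)
let lex kws s =
  let toks = make_lexer kws (Stream.of_string s) in
  let rec loop acc =
    match Stream.peek toks with
    | None -> List.rev acc
    | Some t -> Stream.junk toks; loop (t :: acc)
  in
  loop []

let () =
  (* An identifier is a keyword only if it is in the list. *)
  assert (lex ["if"] "if iffy" = [Kwd "if"; Ident "iffy"]);
  (* Blanks, newlines and nested comments are skipped. *)
  assert (lex [] "a (* outer (* inner *) still outer *) b\n c"
          = [Ident "a"; Ident "b"; Ident "c"]);
  (* A special character is accepted when listed as a keyword... *)
  assert (lex [";"] "a;" = [Ident "a"; Kwd ";"]);
  (* ...and triggers a lexical error otherwise. *)
  let failed = try ignore (lex [] "a;"); false with _ -> true in
  assert failed
```

       Note that the error surfaces only when the offending token is
       requested, because the token stream is produced lazily.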
OCamldoc                          2010-01-29                         Genlex(3)