Stdlib.Lexing(3) OCaml library Stdlib.Lexing(3)

Stdlib.Lexing - no description

Module Stdlib.Lexing

Module Lexing
: (module Stdlib__Lexing)

Positions
type position = {
 pos_fname : string ;
 pos_lnum : int ;
 pos_bol : int ;
 pos_cnum : int ;
}

A value of type position describes a point in a source file. pos_fname
is the file name; pos_lnum is the line number; pos_bol is the offset of
the beginning of the line (number of characters between the beginning
of the lexbuf and the beginning of the line); pos_cnum is the offset of
the position (number of characters between the beginning of the lexbuf
and the position). The difference between pos_cnum and pos_bol is the
character offset within the line (i.e. the column number, assuming each
character is one column wide).
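
As an illustration of the column arithmetic described above, the following minimal sketch computes a column from a hand-built position. The helper names column and describe, and the file name demo.ml, are hypothetical and not part of the Lexing module.

```ocaml
(* Column of a position, assuming each character is one column wide. *)
let column (p : Lexing.position) : int =
  p.Lexing.pos_cnum - p.Lexing.pos_bol

(* Hypothetical helper: render a position as "file:line:column". *)
let describe (p : Lexing.position) : string =
  Printf.sprintf "%s:%d:%d" p.Lexing.pos_fname p.Lexing.pos_lnum (column p)

(* A hand-built position: line 3 starts at offset 40 and the point
   itself is at offset 47, so the column is 47 - 40 = 7. *)
let example : Lexing.position =
  { Lexing.pos_fname = "demo.ml"; pos_lnum = 3; pos_bol = 40; pos_cnum = 47 }

let () = print_endline (describe example)
```

The column obtained this way counts from 0; add 1 if a 1-based column is wanted in messages.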

See the documentation of type lexbuf for information about how the
lexing engine will manage positions.

val dummy_pos : position

A value of type position, guaranteed to be different from any valid
position.

Lexer buffers
type lexbuf = {
 refill_buff : lexbuf -> unit ;
 mutable lex_buffer : bytes ;
 mutable lex_buffer_len : int ;
 mutable lex_abs_pos : int ;
 mutable lex_start_pos : int ;
 mutable lex_curr_pos : int ;
 mutable lex_last_pos : int ;
 mutable lex_last_action : int ;
 mutable lex_eof_reached : bool ;
 mutable lex_mem : int array ;
 mutable lex_start_p : position ;
 mutable lex_curr_p : position ;
}

The type of lexer buffers. A lexer buffer is the argument passed to the
scanning functions defined by the generated scanners. The lexer buffer
holds the current state of the scanner, plus a function to refill the
buffer from the input.

Lexers can optionally maintain the lex_curr_p and lex_start_p position
fields. This "position tracking" mode is the default, and it corresponds
to passing ~with_positions:true to functions that create lexer buffers.
In this mode, the lexing engine and lexer actions are co-responsible for
properly updating the position fields, as described in the next
paragraph. When the mode is explicitly disabled (with
~with_positions:false), the lexing engine will not touch the position
fields and the lexer actions should be careful not to do so either; the
lex_curr_p and lex_start_p fields will then always hold the invalid
position dummy_pos. Not tracking positions avoids allocations and memory
writes and can significantly improve the performance of the lexer in
contexts where lex_start_p and lex_curr_p are not needed.

Position tracking mode works as follows. At each token, the lexing
engine will copy lex_curr_p to lex_start_p, then set the pos_cnum field
of lex_curr_p to the number of characters read since the start of the
lexbuf. The other fields are left unchanged by the lexing engine. In
order to keep them accurate, they must be initialised before the first
use of the lexbuf, and updated by the relevant lexer actions (i.e. at
each end of line; see also new_line).

val from_channel : ?with_positions:bool -> in_channel -> lexbuf

Create a lexer buffer on the given input channel. Lexing.from_channel
inchan returns a lexer buffer which reads from the input channel inchan,
at the current reading position.

val from_string : ?with_positions:bool -> string -> lexbuf

Create a lexer buffer which reads from the given string. Reading starts
from the first character in the string. An end-of-input condition is
generated when the end of the string is reached.

val from_function : ?with_positions:bool -> (bytes -> int -> int) ->
 lexbuf

Create a lexer buffer with the given function as its reading method.
When the scanner needs more characters, it will call the given function,
giving it a byte sequence s and a byte count n. The function should put
n bytes or fewer in s, starting at index 0, and return the number of
bytes provided. A return value of 0 means end of input.
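
The contract of the reading function can be sketched as follows. This is a minimal example that serves the contents of a string in chunks; the helper name reader_of_string is hypothetical and not part of the Lexing module.

```ocaml
(* Hypothetical helper: build a reading function that serves [src] in
   chunks of at most [n] bytes, following the from_function contract. *)
let reader_of_string (src : string) : bytes -> int -> int =
  let pos = ref 0 in
  fun buf n ->
    let len = min n (String.length src - !pos) in
    (* Copy the next chunk into buf, starting at index 0. *)
    Bytes.blit_string src !pos buf 0 len;
    pos := !pos + len;
    (* Return the number of bytes provided; 0 signals end of input. *)
    len

let lexbuf : Lexing.lexbuf =
  Lexing.from_function (reader_of_string "let x = 1")
```

For input that is already a complete string, from_string is simpler; a custom reading function is mainly useful when the input arrives incrementally.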

val set_position : lexbuf -> position -> unit

Set the initial tracked input position for lexbuf to a custom value.
Ignores pos_fname. See Lexing.set_filename for changing this field.

Since 4.11

val set_filename : lexbuf -> string -> unit

Lexing.set_filename lexbuf file sets the file name in the initial
tracked position of lexbuf to file.

Since 4.11
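
A minimal sketch of how the two functions combine, for instance when lexing a fragment extracted from a larger file. The file name and the offsets are made up for illustration, and reading lex_curr_p right after the calls assumes no token has been scanned yet.

```ocaml
let lexbuf = Lexing.from_string "y = 2"

let () =
  Lexing.set_filename lexbuf "main.ml";
  (* Pretend the fragment starts at line 10, offset 100 of main.ml.
     The pos_fname given here is ignored by set_position. *)
  Lexing.set_position lexbuf
    { Lexing.pos_fname = "ignored"; pos_lnum = 10; pos_bol = 100;
      pos_cnum = 100 }

(* The position from which tracking will start. *)
let start : Lexing.position = lexbuf.Lexing.lex_curr_p

let () =
  Printf.printf "%s, line %d\n" start.Lexing.pos_fname start.Lexing.pos_lnum
```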

val with_positions : lexbuf -> bool

Tell whether the lexer buffer keeps track of the position fields
lex_curr_p / lex_start_p, as determined by the corresponding optional
argument for functions that create lexer buffers (whose default value is
true).

When with_positions is false, lexer actions should not modify position
fields. Doing so nevertheless could re-enable position tracking and
degrade performance.
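
A short sketch contrasting the two modes on freshly created buffers. The input string is arbitrary.

```ocaml
(* Position tracking is on by default. *)
let tracked = Lexing.from_string "let x = 1"

(* Explicitly disable position tracking. *)
let untracked = Lexing.from_string ~with_positions:false "let x = 1"

let () =
  Printf.printf "tracked: %b, untracked: %b\n"
    (Lexing.with_positions tracked)
    (Lexing.with_positions untracked);
  (* With tracking disabled, the position fields hold dummy_pos. *)
  assert (untracked.Lexing.lex_curr_p = Lexing.dummy_pos)
```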

Functions for lexer semantic actions
The following functions can be called from the semantic actions of
lexer definitions (the ML code enclosed in braces that computes the
value returned by lexing functions). They give access to the character
string matched by the regular expression associated with the semantic
action. These functions must be applied to the argument lexbuf, which,
in the code generated by ocamllex, is bound to the lexer buffer passed
to the parsing function.

val lexeme : lexbuf -> string

Lexing.lexeme lexbuf returns the string matched by the regular
expression.

val lexeme_char : lexbuf -> int -> char

Lexing.lexeme_char lexbuf i returns character number i in the matched
string (the first character is number 0).
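
These functions are normally called from inside a semantic action, after the generated scanner has matched some input. The sketch below only imitates that situation: it assumes that from_string loads the whole string into the buffer, and it sets the match boundaries by hand, which real code should never do, since the lexing engine maintains these fields.

```ocaml
let lexbuf = Lexing.from_string "let x"

(* Assumption for illustration only: pretend the scanner has just
   matched the first three characters of the input. *)
let () =
  lexbuf.Lexing.lex_start_pos <- 0;
  lexbuf.Lexing.lex_curr_pos <- 3

(* The matched string, as a semantic action would see it. *)
let matched : string = Lexing.lexeme lexbuf

(* Character number 0 of the matched string. *)
let first : char = Lexing.lexeme_char lexbuf 0

let () = Printf.printf "matched %S, first character %C\n" matched first
```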

val lexeme_start : lexbuf -> int

Lexing.lexeme_start lexbuf returns the offset in the input stream of
the first character of the matched string. The first character of the
stream has offset 0.

val lexeme_end : lexbuf -> int

Lexing.lexeme_end lexbuf returns the offset in the input stream of the
character following the last character of the matched string. The first
character of the stream has offset 0.

val lexeme_start_p : lexbuf -> position

Like lexeme_start, but return a complete position instead of an offset.
When position tracking is disabled, the function returns dummy_pos.

val lexeme_end_p : lexbuf -> position

Like lexeme_end, but return a complete position instead of an offset.
When position tracking is disabled, the function returns dummy_pos.

val new_line : lexbuf -> unit

Update the lex_curr_p field of the lexbuf to reflect the start of a new
line. You can call this function in the semantic action of the rule
that matches the end-of-line character. The function does nothing when
position tracking is disabled.

Since 3.11.0
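
The effect of new_line on the position fields can be observed directly, as in the minimal sketch below. It calls new_line outside of any semantic action, so pos_cnum has not been advanced by the lexing engine; in a real lexer the call would follow a match of the end-of-line character.

```ocaml
let lexbuf = Lexing.from_string "a\nb"

let before : Lexing.position = lexbuf.Lexing.lex_curr_p
let () = Lexing.new_line lexbuf
let after : Lexing.position = lexbuf.Lexing.lex_curr_p

let () =
  (* The line number is incremented, and the new line is recorded as
     starting at the current offset. *)
  assert (after.Lexing.pos_lnum = before.Lexing.pos_lnum + 1);
  assert (after.Lexing.pos_bol = after.Lexing.pos_cnum)

(* With position tracking disabled, new_line leaves dummy_pos in place. *)
let untracked = Lexing.from_string ~with_positions:false "a\nb"
let () = Lexing.new_line untracked
```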

Miscellaneous functions
val flush_input : lexbuf -> unit

Discard the contents of the buffer and reset the current position to 0.
The next use of the lexbuf will trigger a refill.
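
A minimal sketch of the observable effect, using a string-backed buffer for simplicity. It assumes that from_string loads the whole string into the buffer up front; flush_input is more typically useful with from_channel or from_function, for example to drop the rest of a line after an error in an interactive loop.

```ocaml
let lexbuf = Lexing.from_string "stale input"

(* Number of buffered bytes before flushing. *)
let buffered_before : int = lexbuf.Lexing.lex_buffer_len

let () = Lexing.flush_input lexbuf

let () =
  (* The buffer is now empty and the current position is back at 0. *)
  Printf.printf "buffered: %d -> %d, current position: %d\n"
    buffered_before
    lexbuf.Lexing.lex_buffer_len
    lexbuf.Lexing.lex_curr_pos
```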

OCamldoc 2023-01-23 Stdlib.Lexing(3)