erl_scan(3)                Erlang Module Definition                erl_scan(3)



NAME
       erl_scan - The Erlang token scanner.

DESCRIPTION
       This module contains functions for tokenizing (scanning) characters
       into Erlang tokens.

DATA TYPES
       category() = atom()

       error_description() = term()

       error_info() =
           {erl_anno:location(), module(), error_description()}

       option() =
           return |
           return_white_spaces |
           return_comments |
           text |
           {reserved_word_fun, resword_fun()}

       options() = option() | [option()]

       symbol() = atom() | float() | integer() | string()

       resword_fun() = fun((atom()) -> boolean())

       token() =
           {category(), Anno :: erl_anno:anno(), symbol()} |
           {category(), Anno :: erl_anno:anno()}

       tokens() = [token()]

       tokens_result() =
           {ok, Tokens :: tokens(), EndLocation :: erl_anno:location()} |
           {eof, EndLocation :: erl_anno:location()} |
           {error,
            ErrorInfo :: error_info(),
            EndLocation :: erl_anno:location()}

EXPORTS
       category(Token) -> category()

              Types:

                 Token = token()

              Returns the category of Token.

       column(Token) -> erl_anno:column() | undefined

              Types:

                 Token = token()

              Returns the column of Token's collection of annotations.

       end_location(Token) -> erl_anno:location() | undefined

              Types:

                 Token = token()

              Returns the end location of the text of Token's collection of
              annotations. If there is no text, undefined is returned.

       format_error(ErrorDescriptor) -> string()

              Types:

                 ErrorDescriptor = error_description()

              Uses an ErrorDescriptor and returns a string that describes the
              error or warning. This function is usually called implicitly
              when an ErrorInfo structure is processed (see section Error
              Information).

       line(Token) -> erl_anno:line()

              Types:

                 Token = token()

              Returns the line of Token's collection of annotations.

       location(Token) -> erl_anno:location()

              Types:

                 Token = token()

              Returns the location of Token's collection of annotations.

       reserved_word(Atom :: atom()) -> boolean()

              Returns true if Atom is an Erlang reserved word, otherwise
              false.

       string(String) -> Return

       string(String, StartLocation) -> Return

       string(String, StartLocation, Options) -> Return

              Types:

                 String = string()
                 Options = options()
                 Return =
                     {ok, Tokens :: tokens(), EndLocation} |
                     {error, ErrorInfo :: error_info(), ErrorLocation}
                 StartLocation = EndLocation = ErrorLocation =
                     erl_anno:location()

              Takes the list of characters String and tries to scan (tokenize)
              them. Returns one of the following:

              {ok, Tokens, EndLocation}:
                  Tokens are the Erlang tokens from String. EndLocation is the
                  first location after the last token.

              {error, ErrorInfo, ErrorLocation}:
                  An error occurred. ErrorLocation is the first location after
                  the erroneous token.

              string(String) is equivalent to string(String, 1), and
              string(String, StartLocation) is equivalent to string(String,
              StartLocation, []).

              StartLocation indicates the initial location when scanning
              starts. If StartLocation is a line, Anno, EndLocation, and
              ErrorLocation are lines. If StartLocation is a pair of a line
              and a column, Anno takes the form of an opaque compound data
              type, and EndLocation and ErrorLocation are pairs of a line and
              a column. The token annotations contain information about the
              column and the line where the token begins, as well as the text
              of the token (if option text is specified), all of which can be
              accessed by calling column/1, line/1, location/1, and text/1.

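As an illustration of the two kinds of StartLocation described above, the
following minimal sketch scans the same input once with the default line-only
location and once with a line-and-column pair. The module name
scan_string_demo and the function run/0 are hypothetical, chosen only for
this example.

```erlang
-module(scan_string_demo).
-export([run/0]).

%% Hypothetical demo module: scans one clause with both kinds of
%% start location and reads the annotations through the accessors.
run() ->
    Src = "foo(X) -> ok.",
    %% Default start location is the line 1, so annotations are lines.
    {ok, Tokens, EndLine} = erl_scan:string(Src),
    %% A {Line, Column} start location yields a {Line, Column} end location.
    {ok, [First | _], EndLocation} = erl_scan:string(Src, {1, 1}),
    #{tokens => Tokens,
      end_line => EndLine,
      end_location => EndLocation,
      %% Use the accessors instead of inspecting the opaque annotation.
      first_line => erl_scan:line(First),
      first_column => erl_scan:column(First),
      first_location => erl_scan:location(First)}.
```

Going through line/1, column/1, and location/1 keeps the calling code
independent of how the annotation is represented.
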
              A token is a tuple containing information about syntactic
              category, the token annotations, and the terminal symbol. For
              punctuation characters (such as ; and |) and reserved words, the
              category and the symbol coincide, and the token is represented
              by a two-tuple. Three-tuples have one of the following forms:

                * {atom, Anno, atom()}

                * {char, Anno, char()}

                * {comment, Anno, string()}

                * {float, Anno, float()}

                * {integer, Anno, integer()}

                * {string, Anno, string()}

                * {var, Anno, atom()}

                * {white_space, Anno, string()}

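To make the difference between two-tuples and three-tuples concrete, here is
a small sketch that pairs the category and the symbol of every token in one
expression. The module name scan_token_demo and the function run/0 are
hypothetical, chosen only for this example.

```erlang
-module(scan_token_demo).
-export([run/0]).

%% Hypothetical demo module: lists {Category, Symbol} for each token.
run() ->
    Src = "case X of 42 -> \"s\"; _ -> $a end.",
    {ok, Tokens, _EndLocation} = erl_scan:string(Src),
    %% For reserved words and punctuation the two values coincide;
    %% for the other tokens the symbol carries the scanned value.
    [{erl_scan:category(T), erl_scan:symbol(T)} || T <- Tokens].
```

Reserved words and punctuation show up as pairs such as {'case', 'case'} and
{'->', '->'}, while the remaining tokens give pairs such as {var, 'X'} and
{integer, 42}.
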
              Valid options:

              {reserved_word_fun, resword_fun()}:
                  A callback function that is called when the scanner has
                  found an unquoted atom. If the function returns true, the
                  unquoted atom itself becomes the category of the token. If
                  the function returns false, atom becomes the category of the
                  unquoted atom.

              return_comments:
                  Return comment tokens.

              return_white_spaces:
                  Return white space tokens. By convention, a newline
                  character, if present, is always the first character of the
                  text (there cannot be more than one newline in a white space
                  token).

              return:
                  Short for [return_comments, return_white_spaces].

              text:
                  Include the token text in the token annotation. The text is
                  the part of the input corresponding to the token.

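The effect of the options can be compared side by side. The sketch below
scans one line with different option sets and collects the resulting
categories, the text of the first token, and the result of a custom
reserved-word callback. The module name scan_options_demo, the function
run/0, and the keyword unless are hypothetical, used only for illustration.

```erlang
-module(scan_options_demo).
-export([run/0]).

%% Hypothetical demo module: compares the scanner options.
run() ->
    Src = "X = 1. % set X\n",
    %% Helper: scan Src with Opts and keep only the token categories.
    Cats = fun(Opts) ->
                   {ok, Toks, _End} = erl_scan:string(Src, 1, Opts),
                   [erl_scan:category(T) || T <- Toks]
           end,
    %% With option text the token text is stored in the annotation.
    {ok, [Var | _], _} = erl_scan:string(Src, {1, 1}, [text]),
    %% 'unless' is a made-up keyword, not an Erlang reserved word.
    IsReserved = fun(A) ->
                         A =:= unless orelse erl_scan:reserved_word(A)
                 end,
    {ok, Custom, _} =
        erl_scan:string("unless foo", 1, [{reserved_word_fun, IsReserved}]),
    #{default => Cats([]),
      return_comments => Cats([return_comments]),
      return => Cats(return),
      var_text => erl_scan:text(Var),
      var_end => erl_scan:end_location(Var),
      custom => Custom}.
```

A single option can be passed without a list, as with return here, because
options() is defined as option() | [option()].
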
       symbol(Token) -> symbol()

              Types:

                 Token = token()

              Returns the symbol of Token.

       text(Token) -> erl_anno:text() | undefined

              Types:

                 Token = token()

              Returns the text of Token's collection of annotations. If there
              is no text, undefined is returned.

       tokens(Continuation, CharSpec, StartLocation) -> Return

       tokens(Continuation, CharSpec, StartLocation, Options) -> Return

              Types:

                 Continuation = return_cont() | []
                 CharSpec = char_spec()
                 StartLocation = erl_anno:location()
                 Options = options()
                 Return =
                     {done,
                      Result :: tokens_result(),
                      LeftOverChars :: char_spec()} |
                     {more, Continuation1 :: return_cont()}
                 char_spec() = string() | eof
                 return_cont()
                   An opaque continuation.

              This is the re-entrant scanner, which scans characters until
              either a dot ('.' followed by a white space) or eof is reached.
              It returns:

              {done, Result, LeftOverChars}:
                  Indicates that there is sufficient input data to get a
                  result. Result is:

                  {ok, Tokens, EndLocation}:
                      The scanning was successful. Tokens is the list of
                      tokens including dot.

                  {eof, EndLocation}:
                      End of file was encountered before any more tokens.

                  {error, ErrorInfo, EndLocation}:
                      An error occurred. LeftOverChars is the remaining
                      characters of the input data, starting from EndLocation.

              {more, Continuation1}:
                  More data is required for building a term. Continuation1
                  must be passed in a new call to tokens/3,4 when more data is
                  available.

              The CharSpec eof signals end of file. LeftOverChars then takes
              the value eof as well.

              tokens(Continuation, CharSpec, StartLocation) is equivalent to
              tokens(Continuation, CharSpec, StartLocation, []).

              For a description of the options, see string/3.

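The re-entrant protocol described above can be driven by a simple loop that
feeds input in pieces. The following is a minimal sketch, assuming the input
arrives as a list of strings. The module name scan_tokens_demo and the
functions scan_chunks/1 and run/0 are hypothetical, chosen only for this
example.

```erlang
-module(scan_tokens_demo).
-export([run/0, scan_chunks/1]).

%% Hypothetical helper: feeds Chunks to erl_scan:tokens/3 one at a time
%% until the scanner reports a complete result.
scan_chunks(Chunks) ->
    %% The continuation of the first call must be [].
    scan_chunks([], Chunks, 1).

scan_chunks(Cont, [Chunk | Rest], StartLocation) ->
    case erl_scan:tokens(Cont, Chunk, StartLocation) of
        {more, Cont1} ->
            %% Not enough input yet: pass the continuation along.
            scan_chunks(Cont1, Rest, StartLocation);
        {done, Result, LeftOverChars} ->
            {Result, LeftOverChars}
    end;
scan_chunks(Cont, [], StartLocation) ->
    %% No chunks left: signal end of file to the scanner.
    {done, Result, LeftOverChars} =
        erl_scan:tokens(Cont, eof, StartLocation),
    {Result, LeftOverChars}.

run() ->
    %% The dot in the third chunk is followed by a white space, which
    %% completes the scan; "bar" is handed back as leftover input.
    scan_chunks(["foo(X) ", "-> ok", ". bar"]).
```

A caller that wants to scan several forms would start the next round with
the continuation [] and the returned LeftOverChars placed in front of the
remaining input.
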
ERROR INFORMATION
       ErrorInfo is the standard ErrorInfo structure that is returned from all
       I/O modules. The format is as follows:

       {ErrorLocation, Module, ErrorDescriptor}

       A string describing the error is obtained with the following call:

       Module:format_error(ErrorDescriptor)

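As a sketch of how the ErrorInfo structure is typically consumed, the example
below provokes a scan error with an unterminated string literal and turns the
descriptor into a printable message. The module name scan_error_demo and the
function run/0 are hypothetical, chosen only for this example.

```erlang
-module(scan_error_demo).
-export([run/0]).

%% Hypothetical demo module: formats the error from a failed scan.
run() ->
    %% The input is a string literal that lacks its closing quote.
    {error, {ErrorLocation, Module, ErrorDescriptor}, _EndLocation} =
        erl_scan:string("\"abc"),
    %% Module names the module that knows how to format the descriptor.
    Message = lists:flatten(Module:format_error(ErrorDescriptor)),
    {ErrorLocation, Module, Message}.
```

Calling format_error/1 through Module, rather than naming erl_scan directly,
lets the same code handle ErrorInfo values produced by other modules.
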
NOTES
       The continuation of the first call to the re-entrant input functions
       must be []. For a complete description of how the re-entrant input
       scheme works, see Armstrong, Virding and Williams: 'Concurrent
       Programming in Erlang', Chapter 13.

SEE ALSO
       erl_anno(3), erl_parse(3), io(3)



Ericsson AB                      stdlib 3.4.5.1                    erl_scan(3)