erl_scan(3)                Erlang Module Definition                erl_scan(3)

NAME
       erl_scan - The Erlang token scanner.

DESCRIPTION
       This module contains functions for tokenizing (scanning) characters
       into Erlang tokens.

DATA TYPES
       category() = atom()

       error_description() = term()

       error_info() =
           {erl_anno:location(), module(), error_description()}

       option() =
           return | return_white_spaces | return_comments | text |
           {reserved_word_fun, resword_fun()}

       options() = option() | [option()]

       symbol() = atom() | float() | integer() | string()

       resword_fun() = fun((atom()) -> boolean())

       token() =
           {category(), Anno :: erl_anno:anno(), symbol()} |
           {category(), Anno :: erl_anno:anno()}

       tokens() = [token()]

       tokens_result() =
           {ok, Tokens :: tokens(), EndLocation :: erl_anno:location()} |
           {eof, EndLocation :: erl_anno:location()} |
           {error,
            ErrorInfo :: error_info(),
            EndLocation :: erl_anno:location()}

EXPORTS
       category(Token) -> category()

              Types:

                 Token = token()

              Returns the category of Token.

       column(Token) -> erl_anno:column() | undefined

              Types:

                 Token = token()

              Returns the column of Token's collection of annotations. If
              there is no column, undefined is returned.

       end_location(Token) -> erl_anno:location() | undefined

              Types:

                 Token = token()

              Returns the end location of the text of Token's collection of
              annotations. If there is no text, undefined is returned.

       format_error(ErrorDescriptor) -> string()

              Types:

                 ErrorDescriptor = error_description()

              Uses an ErrorDescriptor and returns a string that describes the
              error or warning. This function is usually called implicitly
              when an ErrorInfo structure is processed (see section Error
              Information).

       line(Token) -> erl_anno:line()

              Types:

                 Token = token()

              Returns the line of Token's collection of annotations.

       location(Token) -> erl_anno:location()

              Types:

                 Token = token()

              Returns the location of Token's collection of annotations.

       reserved_word(Atom :: atom()) -> boolean()

              Returns true if Atom is an Erlang reserved word, otherwise
              false.
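
              As a quick illustration, the sketch below partitions a list of
              atoms into reserved words and ordinary atoms. The module name
              resword_demo and the function split/1 are hypothetical names
              chosen for this example; only erl_scan:reserved_word/1 comes
              from this module.

```erlang
-module(resword_demo).
-export([split/1]).

%% Partition Atoms into {Reserved, Ordinary}, keeping the input order,
%% by asking erl_scan:reserved_word/1 about each atom.
split(Atoms) ->
    lists:partition(fun erl_scan:reserved_word/1, Atoms).
```

              For example, resword_demo:split(['case', foo, 'end', bar])
              returns {['case','end'],[foo,bar]}.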

       string(String) -> Return

       string(String, StartLocation) -> Return

       string(String, StartLocation, Options) -> Return

              Types:

                 String = string()
                 Options = options()
                 Return =
                     {ok, Tokens :: tokens(), EndLocation} |
                     {error, ErrorInfo :: error_info(), ErrorLocation}
                 StartLocation = EndLocation = ErrorLocation =
                     erl_anno:location()

              Takes the list of characters String and tries to scan (tokenize)
              them. Returns one of the following:

              {ok, Tokens, EndLocation}:
                  Tokens are the Erlang tokens from String. EndLocation is
                  the first location after the last token.

              {error, ErrorInfo, ErrorLocation}:
                  An error occurred. ErrorLocation is the first location
                  after the erroneous token.

              string(String) is equivalent to string(String, 1), and
              string(String, StartLocation) is equivalent to string(String,
              StartLocation, []).
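
              A minimal sketch of handling both return values follows. The
              module scan_demo and the function categories/1 are
              hypothetical; the sketch assumes the default start location
              (line 1) and no options.

```erlang
-module(scan_demo).
-export([categories/1]).

%% Scan String starting at line 1. On success, return the category of
%% every token; on failure, return the error_info() tuple unchanged.
categories(String) ->
    case erl_scan:string(String) of
        {ok, Tokens, _EndLocation} ->
            {ok, [erl_scan:category(T) || T <- Tokens]};
        {error, ErrorInfo, _ErrorLocation} ->
            {error, ErrorInfo}
    end.
```

              With this sketch, scan_demo:categories("foo(X) -> ok.")
              returns {ok,[atom,'(',var,')','->',atom,dot]}, while an
              unterminated string such as "\"abc" yields an {error, ...}
              tuple whose ErrorInfo names erl_scan as the module.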

              StartLocation indicates the initial location when scanning
              starts. If StartLocation is a line, Anno, EndLocation, and
              ErrorLocation are lines. If StartLocation is a pair of a line
              and a column, Anno takes the form of an opaque compound data
              type, and EndLocation and ErrorLocation are pairs of a line and
              a column. The token annotations contain information about the
              column and the line where the token begins, as well as the text
              of the token (if option text is specified), all of which can be
              accessed by calling column/1, line/1, location/1, and text/1.
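
              The difference between the two kinds of start location can be
              sketched as follows. The module scan_loc_demo and the function
              locations/2 are hypothetical names; the sketch assumes the
              input scans without errors.

```erlang
-module(scan_loc_demo).
-export([locations/2]).

%% Scan String from StartLocation and return
%% {[{Location, Column} per token], EndLocation}.
%% Crashes with badmatch if the scan fails (kept short on purpose).
locations(String, StartLocation) ->
    {ok, Tokens, EndLocation} = erl_scan:string(String, StartLocation),
    {[{erl_scan:location(T), erl_scan:column(T)} || T <- Tokens],
     EndLocation}.
```

              Called as scan_loc_demo:locations("foo bar", 1), every
              location is the line 1 and every column is undefined. Called
              as scan_loc_demo:locations("foo bar", {1,1}), the tokens are
              located at {1,1} and {1,5}, and EndLocation is {1,8}.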

              A token is a tuple containing information about syntactic
              category, the token annotations, and the terminal symbol. For
              punctuation characters (such as ; and |) and reserved words,
              the category and the symbol coincide, and the token is
              represented by a two-tuple. Three-tuples have one of the
              following forms:

                * {atom, Anno, atom()}

                * {char, Anno, char()}

                * {comment, Anno, string()}

                * {float, Anno, float()}

                * {integer, Anno, integer()}

                * {var, Anno, atom()}

                * {white_space, Anno, string()}
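
              To see how two-tuples and three-tuples differ in practice, the
              following sketch pairs each token's category with its symbol.
              The module scan_shape_demo and the function shapes/1 are
              hypothetical; category/1 and symbol/1 are the accessors
              documented on this page.

```erlang
-module(scan_shape_demo).
-export([shapes/1]).

%% Return {Category, Symbol} for every token in String.
%% For two-tuple tokens (punctuation, reserved words, dot) both
%% elements are the same atom.
shapes(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    [{erl_scan:category(T), erl_scan:symbol(T)} || T <- Tokens].
```

              For the input "case X of 42 -> $a end." this gives
              [{'case','case'},{var,'X'},{'of','of'},{integer,42},
              {'->','->'},{char,97},{'end','end'},{dot,dot}].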

              Valid options:

              {reserved_word_fun, resword_fun()}:
                  A callback function that is called when the scanner has
                  found an unquoted atom. If the function returns true, the
                  unquoted atom itself becomes the category of the token. If
                  the function returns false, atom becomes the category of
                  the unquoted atom.

              return_comments:
                  Return comment tokens.

              return_white_spaces:
                  Return white space tokens. By convention, a newline
                  character, if present, is always the first character of the
                  text (there cannot be more than one newline in a white
                  space token).

              return:
                  Short for [return_comments, return_white_spaces].

              text:
                  Include the token text in the token annotation. The text is
                  the part of the input corresponding to the token.
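
              The options above can be tried out with a sketch like the one
              below. The module scan_opts_demo and both function names are
              hypothetical. Because the callback is consulted for every
              unquoted atom, with_reserved/2 also calls reserved_word/1 so
              that the ordinary Erlang reserved words stay reserved; that is
              a design choice of this example, not a requirement.

```erlang
-module(scan_opts_demo).
-export([with_return/1, with_reserved/2]).

%% Keep comment and white space tokens by passing the return option,
%% and report the category of each token.
with_return(String) ->
    {ok, Tokens, _End} = erl_scan:string(String, 1, [return]),
    [erl_scan:category(T) || T <- Tokens].

%% Scan String treating the atoms in Words as additional reserved words.
with_reserved(String, Words) ->
    IsReserved = fun(Atom) ->
                         erl_scan:reserved_word(Atom)
                             orelse lists:member(Atom, Words)
                 end,
    {ok, Tokens, _End} =
        erl_scan:string(String, 1, [{reserved_word_fun, IsReserved}]),
    [erl_scan:category(T) || T <- Tokens].
```

              Here scan_opts_demo:with_return("a % c\nb") returns
              [atom,white_space,comment,white_space,atom], and
              scan_opts_demo:with_reserved("foo bar", [foo]) returns
              [foo,atom].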

       symbol(Token) -> symbol()

              Types:

                 Token = token()

              Returns the symbol of Token.

       text(Token) -> erl_anno:text() | undefined

              Types:

                 Token = token()

              Returns the text of Token's collection of annotations. If there
              is no text, undefined is returned.
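
              The following sketch shows text/1 together with end_location/1,
              both of which depend on option text having been passed to the
              scanner. The module scan_text_demo and the function texts/1
              are hypothetical; the start location {1,1} is an assumption
              made so that end locations include a column.

```erlang
-module(scan_text_demo).
-export([texts/1]).

%% Scan String with option text, starting at line 1, column 1, and
%% return {Symbol, Text, EndLocation} for every token.
texts(String) ->
    {ok, Tokens, _End} = erl_scan:string(String, {1, 1}, [text]),
    [{erl_scan:symbol(T), erl_scan:text(T), erl_scan:end_location(T)}
     || T <- Tokens].
```

              For the input "16#FF" the result is [{255,"16#FF",{1,6}}]: the
              symbol is the integer value, while the text preserves the
              original spelling.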

       tokens(Continuation, CharSpec, StartLocation) -> Return

       tokens(Continuation, CharSpec, StartLocation, Options) -> Return

              Types:

                 Continuation = return_cont() | []
                 CharSpec = char_spec()
                 StartLocation = erl_anno:location()
                 Options = options()
                 Return =
                     {done,
                      Result :: tokens_result(),
                      LeftOverChars :: char_spec()} |
                     {more, Continuation1 :: return_cont()}
                 char_spec() = string() | eof
                 return_cont()
                   An opaque continuation.

              This is the re-entrant scanner, which scans characters until
              either a dot ('.' followed by a white space) or eof is reached.
              It returns:

              {done, Result, LeftOverChars}:
                  Indicates that there is sufficient input data to get a
                  result. Result is:

                  {ok, Tokens, EndLocation}:
                      The scanning was successful. Tokens is the list of
                      tokens including dot.

                  {eof, EndLocation}:
                      End of file was encountered before any more tokens.

                  {error, ErrorInfo, EndLocation}:
                      An error occurred. LeftOverChars is the remaining
                      characters of the input data, starting from
                      EndLocation.

              {more, Continuation1}:
                  More data is required for building a term. Continuation1
                  must be passed in a new call to tokens/3,4 when more data
                  is available.

              The CharSpec eof signals end of file. LeftOverChars then takes
              the value eof as well.

              tokens(Continuation, CharSpec, StartLocation) is equivalent to
              tokens(Continuation, CharSpec, StartLocation, []).

              For a description of the options, see string/3.
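
              The protocol described above can be sketched as a small driver
              loop that feeds chunks of input to the scanner and collects one
              token list per dot-terminated form. The module scan_chunks and
              the functions forms/1 and loop/4 are hypothetical; the sketch
              assumes the input arrives as a list of strings and appends eof
              itself.

```erlang
-module(scan_chunks).
-export([forms/1]).

%% Feed Chunks (a list of strings) to the re-entrant scanner and return
%% a list with one token list per form, in input order.
forms(Chunks) ->
    %% The first continuation must be []; eof marks the end of input.
    loop([], Chunks ++ [eof], 1, []).

loop(_Cont, [], _Loc, Acc) ->
    lists:reverse(Acc);
loop(Cont, [Chunk | Rest], Loc, Acc) ->
    case erl_scan:tokens(Cont, Chunk, Loc) of
        {more, Cont1} ->
            %% Not enough input yet: pass the continuation along.
            loop(Cont1, Rest, Loc, Acc);
        {done, {ok, Tokens, EndLoc}, LeftOver} ->
            %% One form is complete: restart with [] on the leftovers.
            loop([], [LeftOver | Rest], EndLoc, [Tokens | Acc]);
        {done, {eof, _EndLoc}, _LeftOver} ->
            lists:reverse(Acc);
        {done, {error, ErrorInfo, _EndLoc}, _LeftOver} ->
            error({scan_error, ErrorInfo})
    end.
```

              Called as scan_chunks:forms(["foo(X) ", "-> ok. bar",
              "(1). "]), the loop returns two token lists, one for
              foo(X) -> ok. and one for bar(1)., even though neither form
              is contained in a single chunk.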

ERROR INFORMATION
       ErrorInfo is the standard ErrorInfo structure that is returned from
       all I/O modules. The format is as follows:

       {ErrorLocation, Module, ErrorDescriptor}

       A string describing the error is obtained with the following call:

       Module:format_error(ErrorDescriptor)
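
       A minimal sketch of turning a scan error into a readable message
       follows. The module scan_error_demo and the function describe/1 are
       hypothetical names; the call to lists:flatten/1 is a precaution taken
       by this example so that the result is always a flat string.

```erlang
-module(scan_error_demo).
-export([describe/1]).

%% Scan String and return ok, or {error, ErrorLocation, Message} where
%% Message is produced by the format_error/1 of the reporting module.
describe(String) ->
    case erl_scan:string(String) of
        {ok, _Tokens, _EndLocation} ->
            ok;
        {error, {ErrorLocation, Module, ErrorDescriptor}, _} ->
            Message = lists:flatten(Module:format_error(ErrorDescriptor)),
            {error, ErrorLocation, Message}
    end.
```

       For an unterminated string such as "\"abc", describe/1 returns
       {error, 1, Message}, where Message is the text supplied by
       erl_scan:format_error/1.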

NOTES
       The continuation of the first call to the re-entrant input functions
       must be []. For a complete description of how the re-entrant input
       scheme works, see Armstrong, Virding and Williams: 'Concurrent
       Programming in Erlang', Chapter 13.

SEE ALSO
       erl_anno(3), erl_parse(3), io(3)

Ericsson AB                     stdlib 3.14.2.1                     erl_scan(3)