erl_scan(3)                Erlang Module Definition                erl_scan(3)

NAME

       erl_scan - The Erlang token scanner.

DESCRIPTION

       This module contains functions for tokenizing (scanning) characters
       into Erlang tokens.

DATA TYPES

       category() = atom()

       error_description() = term()

       error_info() =
           {erl_anno:location(), module(), error_description()}

       option() =
           return |
           return_white_spaces |
           return_comments |
           text |
           {reserved_word_fun, resword_fun()}

       options() = option() | [option()]

       symbol() = atom() | float() | integer() | string()

       resword_fun() = fun((atom()) -> boolean())

       token() =
           {category(), Anno :: erl_anno:anno(), symbol()} |
           {category(), Anno :: erl_anno:anno()}

       tokens() = [token()]

       tokens_result() =
           {ok, Tokens :: tokens(), EndLocation :: erl_anno:location()} |
           {eof, EndLocation :: erl_anno:location()} |
           {error,
            ErrorInfo :: error_info(),
            EndLocation :: erl_anno:location()}

EXPORTS

       category(Token) -> category()

              Types:

                 Token = token()

              Returns the category of Token.

       column(Token) -> erl_anno:column() | undefined

              Types:

                 Token = token()

              Returns the column of Token's collection of annotations. If
              there is no column, undefined is returned.

       end_location(Token) -> erl_anno:location() | undefined

              Types:

                 Token = token()

              Returns the end location of the text of Token's collection of
              annotations. If there is no text, undefined is returned.

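              The following is a minimal sketch of how column/1 and
              end_location/1 depend on the start location and on option
              text. The module name token_span and the function spans/3
              are hypothetical names chosen for this illustration.

```erlang
-module(token_span).
-export([spans/3]).

%% Return {Symbol, Column, EndLocation} for each token scanned from String.
%% Column is undefined when StartLocation is a plain line number, and
%% EndLocation is undefined unless option text is given.
spans(String, StartLocation, Options) ->
    {ok, Tokens, _End} = erl_scan:string(String, StartLocation, Options),
    [{erl_scan:symbol(T), erl_scan:column(T), erl_scan:end_location(T)}
     || T <- Tokens].
```

              With the input "foo bar", a start location of {1,1}, and
              option text, the token foo should span columns 1 to 4 and
              bar columns 5 to 8. With a start location of 1 and no
              options, both accessors return undefined.
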
       format_error(ErrorDescriptor) -> string()

              Types:

                 ErrorDescriptor = error_description()

              Takes an ErrorDescriptor and returns a string that describes the
              error or warning. This function is usually called implicitly
              when an ErrorInfo structure is processed (see section Error
              Information).

       line(Token) -> erl_anno:line()

              Types:

                 Token = token()

              Returns the line of Token's collection of annotations.

       location(Token) -> erl_anno:location()

              Types:

                 Token = token()

              Returns the location of Token's collection of annotations.

       reserved_word(Atom :: atom()) -> boolean()

              Returns true if Atom is an Erlang reserved word, otherwise
              false.

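              As a small illustration, reserved_word/1 can be used as a
              predicate to separate reserved words from ordinary atoms.
              The module name resword_check and the function partition/1
              are hypothetical.

```erlang
-module(resword_check).
-export([partition/1]).

%% Split a list of atoms into {ReservedWords, OrdinaryAtoms},
%% keeping the original order within each group.
partition(Atoms) ->
    lists:partition(fun erl_scan:reserved_word/1, Atoms).
```

              For example, partition(['case', foo, 'end', bar]) places
              'case' and 'end' in the first list and foo and bar in the
              second.
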
       string(String) -> Return

       string(String, StartLocation) -> Return

       string(String, StartLocation, Options) -> Return

              Types:

                 String = string()
                 Options = options()
                 Return =
                     {ok, Tokens :: tokens(), EndLocation} |
                     {error, ErrorInfo :: error_info(), ErrorLocation}
                 StartLocation = EndLocation = ErrorLocation =
                     erl_anno:location()

              Takes the list of characters String and tries to scan (tokenize)
              them. Returns one of the following:

                {ok, Tokens, EndLocation}:
                  Tokens are the Erlang tokens from String. EndLocation is the
                  first location after the last token.

                {error, ErrorInfo, ErrorLocation}:
                  An error occurred. ErrorLocation is the first location after
                  the erroneous token.

              string(String) is equivalent to string(String, 1), and
              string(String, StartLocation) is equivalent to string(String,
              StartLocation, []).

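              The two return values can be handled as in the following
              sketch, which scans a string and passes the tokens on to
              erl_parse:parse_term/1 from erl_parse(3). The module name
              scan_term and the function from_string/1 are hypothetical,
              and the input is assumed to end with a dot.

```erlang
-module(scan_term).
-export([from_string/1]).

%% Scan String and parse the resulting tokens as a single Erlang term.
%% String is assumed to end with a dot, for example "{a, [1, 2]}.".
from_string(String) ->
    case erl_scan:string(String) of
        {ok, Tokens, _EndLocation} ->
            %% Returns {ok, Term} or {error, ErrorInfo}.
            erl_parse:parse_term(Tokens);
        {error, ErrorInfo, _ErrorLocation} ->
            {error, ErrorInfo}
    end.
```

              A scanner error, such as an unterminated string, is returned
              with erl_scan as the module in ErrorInfo, whereas a syntax
              error found later carries the parser's module.
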
              StartLocation indicates the initial location when scanning
              starts. If StartLocation is a line, Anno, EndLocation, and
              ErrorLocation are lines. If StartLocation is a pair of a line
              and a column, Anno takes the form of an opaque compound data
              type, and EndLocation and ErrorLocation are pairs of a line and
              a column. The token annotations contain information about the
              column and the line where the token begins, as well as the text
              of the token (if option text is specified), all of which can be
              accessed by calling column/1, line/1, location/1, and text/1.

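              The effect of the two forms of StartLocation can be compared
              with a sketch like the one below. The module name scan_start
              and the function demo/0 are hypothetical. The annotations are
              read through location/1 instead of being matched directly,
              because their representation is opaque.

```erlang
-module(scan_start).
-export([demo/0]).

%% Scan the same two-line source with a line start location and with a
%% {Line, Column} start location, and collect the token locations.
demo() ->
    Source = "a\n  b",
    %% Line-only start: token locations and EndLocation are line numbers.
    {ok, LineTokens, LineEnd} = erl_scan:string(Source, 1),
    %% {Line, Column} start: token locations and EndLocation are pairs.
    {ok, ColTokens, ColEnd} = erl_scan:string(Source, {1, 1}),
    {{[erl_scan:location(T) || T <- LineTokens], LineEnd},
     {[erl_scan:location(T) || T <- ColTokens], ColEnd}}.
```

              For this input the first element should be {[1,2],2} and the
              second {[{1,1},{2,3}],{2,4}}.
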
              A token is a tuple containing information about syntactic
              category, the token annotations, and the terminal symbol. For
              punctuation characters (such as ; and |) and reserved words, the
              category and the symbol coincide, and the token is represented
              by a two-tuple. Three-tuples have one of the following forms:

                * {atom, Anno, atom()}

                * {char, Anno, char()}

                * {comment, Anno, string()}

                * {float, Anno, float()}

                * {integer, Anno, integer()}

                * {string, Anno, string()}

                * {var, Anno, atom()}

                * {white_space, Anno, string()}

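              To make the two token shapes concrete, the following sketch
              scans a short expression that mixes reserved words,
              punctuation, and literals. The module name token_shapes and
              the function scan/0 are hypothetical. The start location is
              the default line 1, so each Anno is expected to be the
              integer 1.

```erlang
-module(token_shapes).
-export([scan/0]).

%% Reserved words and punctuation come back as two-tuples; atoms,
%% variables, and literals come back as three-tuples.
scan() ->
    {ok, Tokens, _EndLocation} =
        erl_scan:string("case X of 42 -> ok; $a -> 1.5 end"),
    Tokens.
```

              The result should start with {'case',1}, {var,1,'X'},
              {'of',1}, and {integer,1,42}. The character literal $a is
              returned as {char,1,97}.
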
              Valid options:

                {reserved_word_fun, resword_fun()}:
                  A callback function that is called when the scanner has
                  found an unquoted atom. If the function returns true, the
                  unquoted atom itself becomes the category of the token. If
                  the function returns false, atom becomes the category of the
                  unquoted atom.

                return_comments:
                  Return comment tokens.

                return_white_spaces:
                  Return white space tokens. By convention, a newline
                  character, if present, is always the first character of the
                  text (there cannot be more than one newline in a white space
                  token).

                return:
                  Short for [return_comments, return_white_spaces].

                text:
                  Include the token text in the token annotation. The text is
                  the part of the input corresponding to the token.

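              A quick way to see what each option changes is to list the
              token categories produced for the same input. This is a
              sketch only. The module name scan_options, the function
              categories/2, and the keyword unless used below are
              hypothetical.

```erlang
-module(scan_options).
-export([categories/2]).

%% Scan String with Options and return the category of every token.
categories(String, Options) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String, 1, Options),
    [erl_scan:category(T) || T <- Tokens].
```

              For the input "a % note", no options should give [atom],
              return_comments [atom,comment], return_white_spaces
              [atom,white_space], and return [atom,white_space,comment].
              Passing {reserved_word_fun, fun(A) -> A =:= unless end} with
              the input "unless x" should give [unless,atom], because the
              callback decides which unquoted atoms are treated as
              reserved words.
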
       symbol(Token) -> symbol()

              Types:

                 Token = token()

              Returns the symbol of Token.

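              The sketch below pairs category/1 with symbol/1 so that
              tokens can be inspected without matching on their tuple
              size. The module name token_pairs and the function pairs/1
              are hypothetical.

```erlang
-module(token_pairs).
-export([pairs/1]).

%% Return {Category, Symbol} for every token in String. For two-tuple
%% tokens the symbol is the same as the category.
pairs(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    [{erl_scan:category(T), erl_scan:symbol(T)} || T <- Tokens].
```

              For the input "X = 1." this should give {var,'X'}, {'=','='},
              {integer,1}, and {dot,dot}.
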
       text(Token) -> erl_anno:text() | undefined

              Types:

                 Token = token()

              Returns the text of Token's collection of annotations. If there
              is no text, undefined is returned.

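              As an illustration of text/1, the following sketch scans
              with options return and text and joins the text of all
              tokens. The module name token_text and the function
              roundtrip/1 are hypothetical. The sketch assumes that the
              input scans without errors.

```erlang
-module(token_text).
-export([roundtrip/1]).

%% Scan String keeping comments, white space, and token text, then
%% concatenate the text of every token in order.
roundtrip(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String, 1, [return, text]),
    lists:append([erl_scan:text(T) || T <- Tokens]).
```

              For a simple input such as "foo( Bar ) % note" the result
              should equal the input. Without option text, text/1 returns
              undefined for every token.
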
       tokens(Continuation, CharSpec, StartLocation) -> Return

       tokens(Continuation, CharSpec, StartLocation, Options) -> Return

              Types:

                 Continuation = return_cont() | []
                 CharSpec = char_spec()
                 StartLocation = erl_anno:location()
                 Options = options()
                 Return =
                     {done,
                      Result :: tokens_result(),
                      LeftOverChars :: char_spec()} |
                     {more, Continuation1 :: return_cont()}
                 char_spec() = string() | eof
                 return_cont()
                   An opaque continuation.

              This is the re-entrant scanner, which scans characters until
              either a dot ('.' followed by a white space) or eof is reached.
              It returns:

                {done, Result, LeftOverChars}:
                  Indicates that there is sufficient input data to get a
                  result. Result is:

                  {ok, Tokens, EndLocation}:
                    The scanning was successful. Tokens is the list of tokens
                    including dot.

                  {eof, EndLocation}:
                    End of file was encountered before any more tokens.

                  {error, ErrorInfo, EndLocation}:
                    An error occurred. LeftOverChars is the remaining
                    characters of the input data, starting from EndLocation.

                {more, Continuation1}:
                  More data is required for building a term. Continuation1
                  must be passed in a new call to tokens/3,4 when more data is
                  available.

              The CharSpec eof signals end of file. LeftOverChars then takes
              the value eof as well.

              tokens(Continuation, CharSpec, StartLocation) is equivalent to
              tokens(Continuation, CharSpec, StartLocation, []).

              For a description of the options, see string/3.
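
              The re-entrant protocol can be driven by a loop like the one
              sketched below, which feeds chunks of input until the
              scanner reports a result. The module name scan_chunks and
              the functions first_form/1 and loop/2 are hypothetical.
              Characters left over after the first dot are discarded here
              for brevity; a real caller would pass them to the next call.

```erlang
-module(scan_chunks).
-export([first_form/1]).

%% Feed Chunks (a list of strings) to the re-entrant scanner and return
%% the tokens_result() for the first form, error, or end of input.
first_form(Chunks) ->
    %% The continuation of the first call must be [].
    loop([], Chunks).

loop(Cont, [Chunk | Rest]) ->
    case erl_scan:tokens(Cont, Chunk, 1) of
        {more, Cont1} ->
            %% Not enough input yet: continue with the next chunk.
            loop(Cont1, Rest);
        {done, Result, _LeftOverChars} ->
            Result
    end;
loop(Cont, []) ->
    %% No chunks left: signal end of file to flush pending input.
    {done, Result, _LeftOverChars} = erl_scan:tokens(Cont, eof, 1),
    Result.
```

              Calling first_form(["fo", "o(1", "). rest"]) should return
              {ok, Tokens, 1}, where Tokens ends with {dot,1}. Calling
              first_form([]) should return {eof,1}.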

ERROR INFORMATION

       ErrorInfo is the standard ErrorInfo structure that is returned from all
       I/O modules. The format is as follows:

       {ErrorLocation, Module, ErrorDescriptor}

       A string describing the error is obtained with the following call:

       Module:format_error(ErrorDescriptor)
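
       A minimal sketch of processing an ErrorInfo structure follows. The
       module name scan_error and the function describe/1 are hypothetical,
       and the "Location: Message" layout is only one possible choice.

```erlang
-module(scan_error).
-export([describe/1]).

%% Turn an ErrorInfo tuple into a flat string of the form
%% "Location: Message", using the format_error/1 of the reporting module.
describe({ErrorLocation, Module, ErrorDescriptor}) ->
    Message = Module:format_error(ErrorDescriptor),
    lists:flatten(io_lib:format("~tp: ~ts", [ErrorLocation, Message])).
```

       Because the module is taken from the tuple, the same function can
       format errors from erl_scan and from other modules that return
       ErrorInfo structures, such as erl_parse.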

NOTES

       The continuation of the first call to the re-entrant input functions
       must be []. For a complete description of how the re-entrant input
       scheme works, see Armstrong, Virding and Williams: 'Concurrent
       Programming in Erlang', Chapter 13.

SEE ALSO

       erl_anno(3), erl_parse(3), io(3)

Ericsson AB                     stdlib 3.4.5.1                     erl_scan(3)