Scanf(3) OCaml library Scanf(3)

Scanf - Formatted input functions.

Module Scanf

Module Scanf
: sig end

Formatted input functions.


module Scanning : sig end

Scanning buffers.


exception Scan_failure of string

The exception that formatted input functions raise when the input
cannot be read according to the given format.


val bscanf : Scanning.scanbuf -> ('a, Scanning.scanbuf, 'b)
Pervasives.format -> 'a -> 'b

bscanf ib fmt f reads tokens from the scanning buffer ib according to
the format string fmt, converts these tokens to values, and applies
the function f to these values. The result of this application of f is
the result of the whole construct.

For instance, if p is the function fun s i -> i + 1, then
Scanf.sscanf "x = 1" "%s = %i" p returns 2.
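
The example above can be written out as a small program. This is a minimal sketch: the names p, via_string and via_buffer are illustrative only, and Scanning.from_string is assumed to be available for building a scanning buffer from a string.

```ocaml
(* p ignores the scanned string and increments the scanned integer. *)
let p = fun (_s : string) (i : int) -> i + 1

(* sscanf scans directly from a string. *)
let via_string = Scanf.sscanf "x = 1" "%s = %i" p

(* bscanf performs the same scan from an explicit scanning buffer. *)
let via_buffer =
  Scanf.bscanf (Scanf.Scanning.from_string "x = 1") "%s = %i" p

let () = Printf.printf "%d %d\n" via_string via_buffer
```

Both calls apply p to the string "x" and the integer 1.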

The format is a character string which contains three types of objects:

-plain characters, which are simply matched with the characters of the
input,

-conversion specifications, each of which causes reading and conversion
of one argument for f,

-scanning indications to specify boundaries of tokens.

Among plain characters the space character (ASCII code 32) has a
special meaning: it matches "whitespace", that is any number of tab,
space, newline and carriage return characters. Hence, a space in the
format matches any amount of whitespace in the input.
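
To illustrate the whitespace rule, here is a short sketch (the helper add and the input strings are made up for the example). A single space in the format absorbs a mixed run of blanks, tabs and newlines, and it also matches when no whitespace is present at all.

```ocaml
let add a b = a + b

(* One space in the format matches a run of space, tab and newline. *)
let spaced = Scanf.sscanf "3 \t\n  4" "%d %d" add

(* "Any number" includes zero: the space before ',' matches nothing here. *)
let tight = Scanf.sscanf "3,4" "%d ,%d" add

let () = Printf.printf "%d %d\n" spaced tight
```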

Conversion specifications consist of the % character, followed by an
optional flag, an optional field width, and one or two conversion
characters. The conversion characters and their meanings are:

- d : reads an optionally signed decimal integer.

- i : reads an optionally signed integer (usual input formats for
hexadecimal ( 0x[d]+ and 0X[d]+ ), octal ( 0o[d]+ ), and binary
( 0b[d]+ ) notations are understood).

- u : reads an unsigned decimal integer.

- x or X : reads an unsigned hexadecimal integer.

- o : reads an unsigned octal integer.

- s : reads a string argument that spreads as much as possible, until
the next white space, the next scanning indication, or the end of input
is reached. Hence, this conversion always succeeds: it returns an empty
string if the bounding condition holds when the scan begins.

- S : reads a delimited string argument (delimiters and special escaped
characters follow the lexical conventions of Caml).

- c : reads a single character. To test the current input character
without reading it, specify a null field width, i.e. use the
specification %0c . Raises Invalid_argument if the field width
specification is greater than 1.

- C : reads a single delimited character (delimiters and special
escaped characters follow the lexical conventions of Caml).

- f , e , E , g , G : reads an optionally signed floating-point number
in decimal notation, in the style dddd.ddd e/E+-dd .

- F : reads a floating-point number according to the lexical
conventions of Caml (hence the decimal point is mandatory if the
exponent part is not mentioned).

- B : reads a boolean argument ( true or false ).

- b : reads a boolean argument (for backward compatibility; do not use
in new programs).

- ld , li , lu , lx , lX , lo : reads an int32 argument in the format
specified by the second letter (decimal, hexadecimal, etc).

- nd , ni , nu , nx , nX , no : reads a nativeint argument in the
format specified by the second letter.

- Ld , Li , Lu , Lx , LX , Lo : reads an int64 argument in the format
specified by the second letter.

- [ range ] : reads characters that match one of the characters
mentioned in the range of characters range (or not mentioned in it, if
the range starts with ^ ). Reads a string that can be empty, if no
character in the input matches the range. The set of characters from c1
to c2 (inclusive) is denoted by c1-c2 . Hence, %[0-9] returns a string
representing a decimal number or an empty string if no decimal digit is
found; similarly, %[\048-\057\065-\070] returns a string of hexadecimal
digits. If a closing bracket appears in a range, it must occur as the
first character of the range (or just after the ^ in case of range
negation); hence []] matches a ] character and [^]] matches any
character that is not ] .

- { fmt %} : reads a format string argument according to the format
specified by the internal format fmt . The format string to be read
must have the same type as the internal format fmt . For instance,
"%{%i%}" reads any format string that can read a value of type int ;
hence Scanf.sscanf "fmt:\"number is %u\"" "fmt:%{%i%}" succeeds and
returns the format string "number is %u" .

- ( fmt %) : scanning format substitution. Reads a format string to
replace fmt . The format string read must have the same type as fmt .

- l : applies f to the number of lines read so far.

- n : applies f to the number of characters read so far.

- N or L : applies f to the number of tokens read so far.

- ! : matches the end of input condition.

- % : matches one % character in the input.
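
A few of the conversions listed above, sketched on made-up inputs. The binding names are illustrative, and the comments describe the behaviour documented in the list rather than every corner case.

```ocaml
let id x = x

let dec   = Scanf.sscanf "-42" "%d" id             (* signed decimal *)
let hex_i = Scanf.sscanf "0x1F" "%i" id            (* %i accepts a 0x prefix *)
let hex_x = Scanf.sscanf "ff" "%x" id              (* bare hexadecimal digits *)
let flt   = Scanf.sscanf "3.5e1" "%f" id           (* decimal float with exponent *)
let bool_ = Scanf.sscanf "true" "%B" id            (* boolean *)
let quoted = Scanf.sscanf "\"a b\"" "%S" id        (* delimited string *)
let digits = Scanf.sscanf "2024abc" "%[0-9]" id    (* range conversion *)
let big   = Scanf.sscanf "9000000000" "%Ld" id     (* int64 *)

let () =
  Printf.printf "%d %d %d %g %b %s %s %Ld\n"
    dec hex_i hex_x flt bool_ quoted digits big
```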

Following the % character introducing a conversion, there may be the
special flag _ : the conversion that follows occurs as usual, but the
resulting value is discarded.
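
As a small sketch of the _ flag (the input and the name middle are invented for the example): the first and third integers are scanned and checked but discarded, so the receiver function takes a single argument.

```ocaml
(* %_d reads an integer and throws it away; only the middle one reaches f. *)
let middle = Scanf.sscanf "10 20 30" "%_d %d %_d" (fun m -> m)

let () = Printf.printf "%d\n" middle
```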

The field widths are composed of an optional integer literal indicating
the maximal width of the token to read. For instance, %6d reads an
integer having at most 6 decimal digits; %4f reads a float with at
most 4 characters; and %8[\000-\255] returns the next 8 characters
(or all the characters still available, if fewer than 8 characters are
available in the input).
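
Field widths are convenient for fixed-column input. The following sketch uses hypothetical inputs and names to split a run of digits into three-digit groups and to cap the length of a string token.

```ocaml
(* Each %3d consumes at most three characters. *)
let groups =
  Scanf.sscanf "123456789" "%3d%3d%3d" (fun a b c -> (a, b, c))

(* %5s stops after five characters; the following %s reads the rest. *)
let head, tail =
  Scanf.sscanf "abcdefgh" "%5s%s" (fun h t -> (h, t))

let () =
  let a, b, c = groups in
  Printf.printf "%d %d %d %s %s\n" a b c head tail
```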

Scanning indications appear just after the string conversions s and
[ range ] to delimit the end of the token. A scanning indication is
introduced by a @ character, followed by some constant character c . It
means that the string token should end just before the next matching c
(which is skipped). If no c character is encountered, the string token
spreads as much as possible. For instance, %s@\t reads a string up to
the next tabulation character or to the end of input. If a scanning
indication @c does not follow a string conversion, it is treated as a
plain c character.
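
A brief sketch of a scanning indication, using an invented "key:value" input. The @: after %s ends the first token just before the colon and skips the colon itself; when no colon is present, the token runs to the end of input.

```ocaml
(* %s@: reads up to the next ':' and consumes it. *)
let key, value =
  Scanf.sscanf "key:value" "%s@:%s" (fun k v -> (k, v))

(* No ':' in the input: the token spreads as far as possible. *)
let whole = Scanf.sscanf "novalue" "%s@:" (fun k -> k)

let () = Printf.printf "%s %s %s\n" key value whole
```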

Raises Scanf.Scan_failure if the given input does not match the format.

Raises Failure if a conversion to a number is not possible.

Raises End_of_file if the end of input is encountered while some more
characters are needed to read the current conversion specification
(this means in particular that scanning a %s conversion never raises
exception End_of_file : if the end of input is reached the conversion
succeeds and simply returns the empty string "").
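
One way to handle these exceptions is to map them to a result value. This is only a sketch: the type outcome and the function classify are hypothetical names, and which exception a given malformed number triggers may differ between library versions, so all three cases are caught.

```ocaml
type outcome = Parsed of int | Mismatch | Bad_number | Eof

(* Try to read one decimal integer and report how the scan ended. *)
let classify input =
  try Scanf.sscanf input "%d" (fun n -> Parsed n) with
  | Scanf.Scan_failure _ -> Mismatch   (* input does not match the format *)
  | Failure _ -> Bad_number            (* numeric conversion not possible *)
  | End_of_file -> Eof                 (* input ended too early *)

(* %s never raises End_of_file: on empty input it yields "". *)
let empty = Scanf.sscanf "" "%s" (fun s -> s)

let () =
  match classify "12" with
  | Parsed n -> Printf.printf "parsed %d\n" n
  | Mismatch | Bad_number | Eof -> print_endline "scan failed"
```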

Notes:

-the scanning indications introduce slight differences in the syntax of
Scanf format strings compared to those used by the Printf module.
However, scanning indications are similar to those of the Format
module; hence, when producing formatted text to be scanned by
Scanf.bscanf , it is wise to use printing functions from Format (or, if
you need to use functions from Printf , avoid or carefully double-check
the format strings that contain '@' characters).

-in addition to relevant digits, '_' characters may appear inside
numbers (this is reminiscent of the usual Caml conventions). If
stricter scanning is desired, use the range conversion facility instead
of the number conversions.

-the scanf facility is not intended for heavy-duty lexical analysis and
parsing. If it appears not expressive enough for your needs, several
alternatives exist: regular expressions (module Str ), stream parsers,
ocamllex-generated lexers, ocamlyacc-generated parsers.
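
The note on '_' characters can be illustrated as follows. This is a sketch on an invented input: the number conversion accepts the underscore, while the range conversion stops at it, after which the digits can be converted explicitly.

```ocaml
(* %d tolerates '_' inside the digits, as in Caml integer literals. *)
let lenient = Scanf.sscanf "1_000" "%d" (fun n -> n)

(* A digit range is stricter: it stops at the first non-digit. *)
let strict_digits = Scanf.sscanf "1_000" "%[0-9]" (fun s -> s)
let strict = int_of_string strict_digits

let () = Printf.printf "%d %d\n" lenient strict
```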


val fscanf : Pervasives.in_channel -> ('a, Scanning.scanbuf, 'b)
Pervasives.format -> 'a -> 'b

Same as Scanf.bscanf , but inputs from the given channel.

Warning: since all scanning functions operate from a scanning buffer,
be aware that each fscanf invocation must allocate a new fresh scanning
buffer (unless partial evaluation is used carefully in the program).
Hence, some characters may seem to be skipped (in fact they are pending
in the previously used buffer). This happens in particular when calling
fscanf again after a scan involving a format that requires some
lookahead (such as a format that ends by skipping whitespace in the
input).

To avoid confusion, consider using bscanf with an explicitly created
scanning buffer. Use for instance Scanning.from_file f to allocate the
scanning buffer reading from file f .

This method is not only clearer, it is also faster, since scanning
buffers to files are optimized for fast buffered reading.
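
A sketch of this advice, under the assumption that a temporary file is an acceptable stand-in for real input. The file contents and the names path, ib, first and second are invented. A single buffer is created once and shared by successive bscanf calls, so no pending characters are lost between scans.

```ocaml
(* Create a small input file for the demonstration. *)
let path = Filename.temp_file "scanf_demo" ".txt"

let () =
  let oc = open_out path in
  output_string oc "1 2\n3 4\n";
  close_out oc

(* One scanning buffer, reused for every scan of the file. *)
let ib = Scanf.Scanning.from_file path

(* The leading space skips any whitespace left over from the previous scan. *)
let first = Scanf.bscanf ib " %d %d" (fun a b -> (a, b))
let second = Scanf.bscanf ib " %d %d" (fun a b -> (a, b))

(* Clean up the temporary file. *)
let () = Sys.remove path
```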


val sscanf : string -> ('a, Scanning.scanbuf, 'b) Pervasives.format ->
'a -> 'b

Same as Scanf.bscanf , but inputs from the given string.


val scanf : ('a, Scanning.scanbuf, 'b) Pervasives.format -> 'a -> 'b

Same as Scanf.bscanf , but reads from the predefined scanning buffer
Scanf.Scanning.stdib that is connected to stdin .


val kscanf : Scanning.scanbuf -> (Scanning.scanbuf -> exn -> 'a) ->
('b, Scanning.scanbuf, 'a) Pervasives.format -> 'b -> 'a

Same as Scanf.bscanf , but takes an additional function argument ef
that is called in case of error: if the scanning process or some
conversion fails, the scanning function aborts and applies the error
handling function ef to the scanning buffer and the exception that
aborted the scanning process.
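
As a sketch of kscanf, the hypothetical helper below returns a default value instead of letting a scanning exception escape. The name int_or_default and its inputs are illustrative; the error handler ignores both the buffer and the exception.

```ocaml
(* Read one integer from a string, falling back to default on any
   scanning error reported to the handler. *)
let int_or_default default input =
  Scanf.kscanf
    (Scanf.Scanning.from_string input)
    (fun _ib _exn -> default)
    "%d"
    (fun n -> n)

let () =
  Printf.printf "%d %d\n"
    (int_or_default 0 "42")
    (int_or_default 0 "abc")
```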


val bscanf_format : Scanning.scanbuf -> ('a, 'b, 'c, 'd) format4 ->
(('a, 'b, 'c, 'd) format4 -> 'e) -> 'e

bscanf_format ib fmt f reads a format string token in buffer ib,
according to the format string fmt, and applies the function f to the
resulting format string value. Raises Scan_failure if the format string
value read does not have the same type as fmt.
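
A possible use of bscanf_format, sketched under the assumption that the format token in the buffer is written as a delimited Caml string. The function render and its input are hypothetical, and in later library releases the format type carries more parameters than the format4 shown in the signature. The format read from the buffer must have the same type as "%i", so it can then be handed to Printf.sprintf with an integer argument.

```ocaml
(* Read a format string of the same type as "%i" and use it to print n. *)
let render n =
  let ib = Scanf.Scanning.from_string "\"%d items\"" in
  Scanf.bscanf_format ib "%i" (fun fmt -> Printf.sprintf fmt n)

let () = print_endline (render 3)
```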


val sscanf_format : string -> ('a, 'b, 'c, 'd) format4 -> ('a, 'b, 'c,
'd) format4

Same as Scanf.bscanf_format , but converts the given string to a format
string.


OCamldoc 2007-05-24 Scanf(3)