Stdlib.Scanf(3)                  OCaml library                 Stdlib.Scanf(3)

NAME

       Stdlib.Scanf - Formatted input functions

Module

       Module   Stdlib.Scanf

Documentation

       Module Scanf
        : (module Stdlib__scanf)

   Introduction
   Functional input with format strings
       The module Scanf provides formatted input functions or scanners.

       The formatted input functions can read from any kind of input,
       including strings, files, or anything that can return characters. The
       most general source of characters is named a formatted input channel
       (or scanning buffer) and has type Scanf.Scanning.in_channel. The most
       general formatted input function reads from any scanning buffer and is
       named bscanf.

       Generally speaking, the formatted input functions have 3 arguments:

       - the first argument is a source of characters for the input,

       - the second argument is a format string that specifies the values to
       read,

       - the third argument is a receiver function that is applied to the
       values read.

       Hence, a typical call to the formatted input function Scanf.bscanf is
       bscanf ic fmt f, where:

       - ic is a source of characters (typically a formatted input channel
       with type Scanf.Scanning.in_channel),

       - fmt is a format string (the same format strings as those used to
       print material with module Printf or Format),

       - f is a function that has as many arguments as the number of values to
       read in the input according to fmt.

   A simple example
       As suggested above, the expression bscanf ic "%d" f reads a decimal
       integer n from the source of characters ic and returns f n.

       For instance,

       - if we use stdin as the source of characters (Scanf.Scanning.stdin is
       the predefined formatted input channel that reads from standard input),

       - if we define the receiver f as let f x = x + 1,

       then bscanf Scanning.stdin "%d" f reads an integer n from the standard
       input and returns f n (that is n + 1). Thus, if we evaluate bscanf
       Scanning.stdin "%d" f, and then enter 41 at the keyboard, the result we
       get is 42.

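       A minimal sketch of the example above. To keep it non-interactive, it
       uses Scanf.sscanf on the string "41" as a stand-in for keyboard input;
       the name result is illustrative only.

```ocaml
(* Receiver from the text: adds one to the integer that was read. *)
let f x = x + 1

(* sscanf reads from a string; bscanf Scanf.Scanning.stdin "%d" f would
   read the same kind of token from standard input instead. *)
let result = Scanf.sscanf "41" "%d" f

let () = Printf.printf "%d\n" result
```
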
   Formatted input as a functional feature
       The OCaml scanning facility is reminiscent of the corresponding C
       feature. However, it is also largely different, simpler, and yet more
       powerful: the formatted input functions are higher-order functionals
       and the parameter passing mechanism is just regular function
       application, not the variable-assignment-based mechanism which is
       typical for formatted input in imperative languages; the OCaml format
       strings also feature useful additions to easily define complex tokens;
       as expected within a functional programming language, the formatted
       input functions also support polymorphism, in particular arbitrary
       interaction with polymorphic user-defined scanners. Furthermore, the
       OCaml formatted input facility is fully type-checked at compile time.

   Formatted input channel
       module Scanning : sig end

   Type of formatted input functions
       type ('a, 'b, 'c, 'd) scanner = ('a, Scanning.in_channel, 'b, 'c, 'a ->
       'd, 'd) format6 -> 'c

       The type of formatted input scanners: ('a, 'b, 'c, 'd) scanner is the
       type of a formatted input function that reads from some formatted input
       channel according to some format string; more precisely, if scan is
       some formatted input function, then scan ic fmt f applies f to all the
       arguments specified by format string fmt, when scan has read those
       arguments from the Scanf.Scanning.in_channel formatted input channel
       ic.

       For instance, the Scanf.scanf function below has type ('a, 'b, 'c, 'd)
       scanner, since it is a formatted input function that reads from
       Scanf.Scanning.stdin: scanf fmt f applies f to the arguments specified
       by fmt, reading those arguments from stdin as expected.

       If the format fmt has some %r indications, the corresponding formatted
       input functions must be provided before receiver function f. For
       instance, if read_elem is an input function for values of type t, then
       bscanf ic "%r;" read_elem f reads a value v of type t followed by a ';'
       character, and returns f v.

       Since 3.10.0

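       The %r mechanism described above can be sketched as follows. The record
       type t, its "x,y" textual form, and the helper sum_of_elem are
       assumptions made up for the illustration; only read_elem comes from
       the text.

```ocaml
(* Hypothetical element type: a pair of integers written as "x,y". *)
type t = { x : int; y : int }

(* A user-defined reader has type Scanf.Scanning.in_channel -> t. *)
let read_elem ib = Scanf.bscanf ib "%d,%d" (fun x y -> { x; y })

(* The reader is passed before the receiver, one reader per %r. *)
let sum_of_elem s = Scanf.sscanf s "%r;" read_elem (fun v -> v.x + v.y)

let () = Printf.printf "%d\n" (sum_of_elem "3,4;")
```
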
       exception Scan_failure of string

       When the input cannot be read according to the format string
       specification, formatted input functions typically raise exception
       Scan_failure.

   The general formatted input function
       val bscanf : Scanning.in_channel -> ('a, 'b, 'c, 'd) scanner

       bscanf ic fmt r1 ... rN f reads characters from the
       Scanf.Scanning.in_channel formatted input channel ic and converts them
       to values according to format string fmt. As a final step, receiver
       function f is applied to the values read and gives the result of the
       bscanf call.

       For instance, if f is the function fun s i -> i + 1, then Scanf.sscanf
       "x = 1" "%s = %i" f returns 2.

       Arguments r1 to rN are user-defined input functions that read the
       argument corresponding to the %r conversions specified in the format
       string.

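       A short sketch of the call described above, first through sscanf and
       then through bscanf on an explicit scanning buffer built with
       Scanf.Scanning.from_string. The names a, b and pairs, and the "a=1 b=2"
       input, are illustrative assumptions.

```ocaml
let f _s i = i + 1

(* The example from the text, reading from a string. *)
let a = Scanf.sscanf "x = 1" "%s = %i" f

(* The same scan through bscanf on an explicit scanning buffer. *)
let b =
  let ib = Scanf.Scanning.from_string "x = 1" in
  Scanf.bscanf ib "%s = %i" f

(* Successive bscanf calls on one buffer resume where the previous call
   stopped; the leading space in the format skips optional whitespace. *)
let pairs =
  let ib = Scanf.Scanning.from_string "a=1 b=2" in
  let read () = Scanf.bscanf ib " %[a-z]=%d" (fun k v -> (k, v)) in
  let first = read () in
  let second = read () in
  [ first; second ]
```
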
   Format string description
       The format string is a character string which contains three types of
       objects:

       - plain characters, which are simply matched with the characters of the
       input (with a special case for space and line feed, see Scanf.space),

       - conversion specifications, each of which causes reading and
       conversion of one argument for the function f (see Scanf.conversion),

       - scanning indications to specify boundaries of tokens (see
       Scanf.indication).

   The space character in format strings
       As mentioned above, a plain character in the format string is just
       matched with the next character of the input; however, two characters
       are special exceptions to this rule: the space character (' ' or ASCII
       code 32) and the line feed character ('\n' or ASCII code 10). A space
       does not match a single space character, but any amount of 'whitespace'
       in the input. More precisely, a space inside the format string matches
       any number of tab, space, line feed and carriage return characters.
       Similarly, a line feed character in the format string matches either a
       single line feed or a carriage return followed by a line feed.

       Since it matches any amount of whitespace, a space in the format string
       also matches no whitespace at all; hence, the call
       bscanf ib "Price = %d $" (fun p -> p) succeeds and returns 1 when
       reading an input with various whitespace in it, such as "Price = 1 $",
       "Price  =  1    $", or even "Price=1$".

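       The whitespace rule above can be checked with a small sketch; the
       helper name price and the sample inputs are assumptions chosen for the
       illustration.

```ocaml
(* One format, several spellings of the same input. *)
let price input = Scanf.sscanf input "Price = %d $" (fun p -> p)

let () =
  List.iter
    (fun input -> Printf.printf "%S -> %d\n" input (price input))
    [ "Price = 1 $"; "Price  =\t1    $"; "Price=1$" ]
```

       Each space in the format matches zero or more whitespace characters,
       so all three inputs yield the same value.
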
   Conversion specifications in format strings
       Conversion specifications consist of the % character, followed by an
       optional flag, an optional field width, and followed by one or two
       conversion characters.

       The conversion characters and their meanings are:

       - d : reads an optionally signed decimal integer ([0-9]+).

       - i : reads an optionally signed integer (usual input conventions for
       decimal ([0-9]+), hexadecimal (0x[0-9a-f]+ and 0X[0-9A-F]+), octal
       (0o[0-7]+), and binary (0b[0-1]+) notations are understood).

       - u : reads an unsigned decimal integer.

       - x or X : reads an unsigned hexadecimal integer ([0-9a-fA-F]+).

       - o : reads an unsigned octal integer ([0-7]+).

       - s : reads a string argument that spreads as much as possible, until
       one of the following bounding conditions holds:

       - a whitespace has been found (see Scanf.space),

       - a scanning indication (see Scanf.indication) has been encountered,

       - the end-of-input has been reached.

       Hence, this conversion always succeeds: it returns an empty string if
       the bounding condition holds when the scan begins.

       - S : reads a delimited string argument (delimiters and special escaped
       characters follow the lexical conventions of OCaml).

       - c : reads a single character. To test the current input character
       without reading it, specify a null field width, i.e. use specification
       %0c. Raises Invalid_argument if the field width specification is
       greater than 1.

       - C : reads a single delimited character (delimiters and special
       escaped characters follow the lexical conventions of OCaml).

       - f, e, E, g, G : reads an optionally signed floating-point number in
       decimal notation, in the style dddd.ddd e/E+-dd.

       - h, H : reads an optionally signed floating-point number in
       hexadecimal notation.

       - F : reads a floating point number according to the lexical
       conventions of OCaml (hence the decimal point is mandatory if the
       exponent part is not mentioned).

       - B : reads a boolean argument (true or false).

       - b : reads a boolean argument (for backward compatibility; do not use
       in new programs).

       - ld, li, lu, lx, lX, lo : reads an int32 argument in the format
       specified by the second letter for regular integers.

       - nd, ni, nu, nx, nX, no : reads a nativeint argument in the format
       specified by the second letter for regular integers.

       - Ld, Li, Lu, Lx, LX, Lo : reads an int64 argument in the format
       specified by the second letter for regular integers.

       - [ range ] : reads characters that match one of the characters
       mentioned in the range of characters range (or not mentioned in it, if
       the range starts with ^). Reads a string that can be empty, if the next
       input character does not match the range. The set of characters from c1
       to c2 (inclusively) is denoted by c1-c2. Hence, %[0-9] returns a
       string representing a decimal number or an empty string if no decimal
       digit is found; similarly, %[0-9a-f] returns a string of hexadecimal
       digits. If a closing bracket appears in a range, it must occur as the
       first character of the range (or just after the ^ in case of range
       negation); hence []] matches a ] character and [^]] matches any
       character that is not ]. Use %% and %@ to include a % or a @ in a
       range.

       - r : user-defined reader. Takes the next ri formatted input function
       and applies it to the scanning buffer ib to read the next argument. The
       input function ri must therefore have type Scanning.in_channel -> 'a
       and the argument read has type 'a.

       - { fmt %} : reads a format string argument. The format string read
       must have the same type as the format string specification fmt. For
       instance, "%{ %i %}" reads any format string that can read a value of
       type int; hence, if s is the string "fmt:\"number is %u\"", then
       Scanf.sscanf s "fmt: %{%i%}" succeeds and returns the format string
       "number is %u".

       - ( fmt %) : scanning sub-format substitution. Reads a format string
       rf in the input, then goes on scanning with rf instead of scanning with
       fmt. The format string rf must have the same type as the format string
       specification fmt that it replaces. For instance, "%( %i %)" reads any
       format string that can read a value of type int. The conversion
       returns the format string read rf, and then a value read using rf.
       Hence, if s is the string "\"%4d\"1234.00", then Scanf.sscanf s
       "%(%i%)" (fun fmt i -> fmt, i) evaluates to ("%4d", 1234). This
       behaviour is not mere format substitution, since the conversion returns
       the format string read as additional argument. If you need pure format
       substitution, use special flag _ to discard the extraneous argument:
       conversion %_( fmt %) reads a format string rf and then behaves the
       same as format string rf. Hence, if s is the string "\"%4d\"1234.00",
       then Scanf.sscanf s "%_(%i%)" is simply equivalent to Scanf.sscanf
       "1234.00" "%4d".

       - l : returns the number of lines read so far.

       - n : returns the number of characters read so far.

       - N or L : returns the number of tokens read so far.

       - ! : matches the end of input condition.

       - % : matches one % character in the input.

       - @ : matches one @ character in the input.

       - , : does nothing.

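       A few of the conversions listed above, exercised through Scanf.sscanf
       as a rough sketch. The sample inputs and the helper id are assumptions
       picked for the illustration, and only a subset of the conversions is
       covered.

```ocaml
(* Identity receiver, used to return the value that was read. *)
let id x = x

let () =
  (* %d: optionally signed decimal integer. *)
  assert (Scanf.sscanf "-12" "%d" id = -12);
  (* %i: also understands the 0x, 0o and 0b prefixes. *)
  assert (Scanf.sscanf "0x1F" "%i" id = 31);
  (* %x: unsigned hexadecimal integer, without prefix. *)
  assert (Scanf.sscanf "ff" "%x" id = 255);
  (* %s: stops at the first whitespace. *)
  assert (Scanf.sscanf "hello world" "%s" id = "hello");
  (* %S: delimited string following OCaml lexical conventions. *)
  assert (Scanf.sscanf "\"a b\"" "%S" id = "a b");
  (* %f: floating-point number in decimal notation. *)
  assert (Scanf.sscanf "3.5" "%f" id = 3.5);
  (* %B: boolean. *)
  assert (Scanf.sscanf "true" "%B" id = true);
  (* %[0-9]: longest prefix made of characters in the range. *)
  assert (Scanf.sscanf "2019-07" "%[0-9]" id = "2019");
  print_endline "all conversions behaved as described"
```
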
       Following the % character that introduces a conversion, there may be
       the special flag _: the conversion that follows occurs as usual, but
       the resulting value is discarded. For instance, if f is the function
       fun i -> i + 1, and s is the string "x = 1", then Scanf.sscanf s
       "%_s = %i" f returns 2.

       The field width is composed of an optional integer literal indicating
       the maximal width of the token to read. For instance, %6d reads an
       integer, having at most 6 decimal digits; %4f reads a float with at
       most 4 characters; and %8[\000-\255] returns the next 8 characters (or
       all the characters still available, if fewer than 8 characters are
       available in the input).

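       The _ flag and the field width can be sketched as follows. The names
       skipped, six_digits and date, and the date-like input, are
       illustrative assumptions.

```ocaml
let f i = i + 1

(* %_s reads a string token as usual but does not pass it to f. *)
let skipped = Scanf.sscanf "x = 1" "%_s = %i" f

(* %6d stops after at most 6 digits, even if more digits follow. *)
let six_digits = Scanf.sscanf "123456789" "%6d" (fun n -> n)

(* Field widths split a token that has no separators. *)
let date = Scanf.sscanf "20190730" "%4d%2d%2d" (fun y m d -> (y, m, d))

let () =
  let y, m, d = date in
  Printf.printf "%d %d %d-%d-%d\n" skipped six_digits y m d
```
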
       Notes:

       - as mentioned above, a %s conversion always succeeds, even if there is
       nothing to read in the input: in this case, it simply returns "".

       - in addition to the relevant digits, '_' characters may appear inside
       numbers (this is reminiscent of the usual OCaml lexical conventions).
       If stricter scanning is desired, use the range conversion facility
       instead of the number conversions.

       - the scanf facility is not intended for heavy duty lexical analysis
       and parsing. If it appears not expressive enough for your needs,
       several alternatives exist: regular expressions (module Str), stream
       parsers, ocamllex-generated lexers, ocamlyacc-generated parsers.

   Scanning indications in format strings
       Scanning indications appear just after the string conversions %s and
       %[ range ] to delimit the end of the token. A scanning indication is
       introduced by a @ character, followed by some plain character c. It
       means that the string token should end just before the next matching c
       (which is skipped). If no c character is encountered, the string token
       spreads as much as possible. For instance, "%s@\t" reads a string up to
       the next tab character or to the end of input. If a @ character appears
       anywhere else in the format string, it is treated as a plain character.

       Note:

       - As usual in format strings, % and @ characters must be escaped using
       %% and %@; this rule still holds within range specifications and
       scanning indications. For instance, format "%s@%%" reads a string up to
       the next % character, and format "%s@%@" reads a string up to the next
       @.

       - The scanning indications introduce slight differences in the syntax
       of Scanf format strings, compared to those used for the Printf module.
       However, the scanning indications are similar to those used in the
       Format module; hence, when producing formatted text to be scanned by
       Scanf.bscanf, it is wise to use printing functions from the Format
       module (or, if you need to use functions from Printf, banish or
       carefully double check the format strings that contain '@' characters).

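       A small sketch of scanning indications. The key/value inputs and the
       names key_value, tab_fields and no_stopper are assumptions made for the
       example.

```ocaml
(* "%s@:" reads up to the next ':' and skips it. *)
let key_value = Scanf.sscanf "key:value" "%s@:%s" (fun k v -> (k, v))

(* With an indication, the token ends at the stop character rather than
   at the first whitespace, so the first field may contain a space. *)
let tab_fields =
  Scanf.sscanf "first name\tJane" "%s@\t%s" (fun k v -> (k, v))

(* If the stop character never appears, the token runs to end of input. *)
let no_stopper = Scanf.sscanf "abc" "%s@:" (fun s -> s)

let () =
  Printf.printf "%s=%s %s=%s %s\n"
    (fst key_value) (snd key_value)
    (fst tab_fields) (snd tab_fields)
    no_stopper
```
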
   Exceptions during scanning
       Scanners may raise the following exceptions when the input cannot be
       read according to the format string:

       - Scanf.Scan_failure if the input does not match the format.

       - Failure if a conversion to a number is not possible.

       - End_of_file if the end of input is encountered while some more
       characters are needed to read the current conversion specification.

       - Invalid_argument if the format string is invalid.

       Note:

       - as a consequence, scanning a %s conversion never raises exception
       End_of_file: if the end of input is reached the conversion succeeds
       and simply returns the characters read so far, or "" if none were ever
       read.

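       One possible way to handle these exceptions is to turn them into a
       result value. This is only a sketch: the helper parse_int and its
       error messages are assumptions, not part of the module.

```ocaml
(* Read one decimal integer from a string, reporting failures as Error. *)
let parse_int s =
  try Ok (Scanf.sscanf s "%d" (fun n -> n)) with
  | Scanf.Scan_failure msg -> Error ("no match: " ^ msg)
  | Failure msg -> Error ("bad number: " ^ msg)
  | End_of_file -> Error "unexpected end of input"

let () =
  List.iter
    (fun s ->
      match parse_int s with
      | Ok n -> Printf.printf "%S -> %d\n" s n
      | Error msg -> Printf.printf "%S -> error (%s)\n" s msg)
    [ "42"; "abc"; "" ]
```

       By contrast, Scanf.sscanf "" "%s" (fun s -> s) succeeds and returns
       the empty string, as stated in the note above.
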
   Specialised formatted input functions
       val sscanf : string -> ('a, 'b, 'c, 'd) scanner

       Same as Scanf.bscanf, but reads from the given string.

       val scanf : ('a, 'b, 'c, 'd) scanner

       Same as Scanf.bscanf, but reads from the predefined formatted input
       channel Scanf.Scanning.stdin that is connected to stdin.

       val kscanf : Scanning.in_channel -> (Scanning.in_channel -> exn -> 'd)
       -> ('a, 'b, 'c, 'd) scanner

       Same as Scanf.bscanf, but takes an additional function argument ef
       that is called in case of error: if the scanning process or some
       conversion fails, the scanning function aborts and calls the error
       handling function ef with the formatted input channel and the exception
       that aborted the scanning process as arguments.

       val ksscanf : string -> (Scanning.in_channel -> exn -> 'd) -> ('a, 'b,
       'c, 'd) scanner

       Same as Scanf.kscanf, but reads from the given string.

       Since 4.02.0

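       A sketch of the error-handler variant, using ksscanf so that it runs
       without an input channel. The helper int_of_input and the choice of a
       result type are assumptions for the illustration.

```ocaml
(* The handler ef receives the scanning buffer and the exception, and
   must return the same type as the receiver function. *)
let int_of_input s =
  Scanf.ksscanf s
    (fun _ib exn -> Error exn)
    "%d"
    (fun n -> Ok n)

let () =
  match int_of_input "abc" with
  | Ok n -> Printf.printf "read %d\n" n
  | Error exn -> Printf.printf "failed: %s\n" (Printexc.to_string exn)
```
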
   Reading format strings from input
       val bscanf_format : Scanning.in_channel -> ('a, 'b, 'c, 'd, 'e, 'f)
       format6 -> (('a, 'b, 'c, 'd, 'e, 'f) format6 -> 'g) -> 'g

       bscanf_format ic fmt f reads a format string token from the formatted
       input channel ic, according to the given format string fmt, and
       applies f to the resulting format string value. Raises
       Scanf.Scan_failure if the format string value read does not have the
       same type as fmt.

       Since 3.09.0

       val sscanf_format : string -> ('a, 'b, 'c, 'd, 'e, 'f) format6 -> (('a,
       'b, 'c, 'd, 'e, 'f) format6 -> 'g) -> 'g

       Same as Scanf.bscanf_format, but reads from the given string.

       Since 3.09.0

       val format_from_string : string -> ('a, 'b, 'c, 'd, 'e, 'f) format6 ->
       ('a, 'b, 'c, 'd, 'e, 'f) format6

       format_from_string s fmt converts a string argument to a format string,
       according to the given format string fmt. Raises Scanf.Scan_failure if
       s, considered as a format string, does not have the same type as fmt.

       Since 3.10.0

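       A sketch of format_from_string, assuming a template string that is only
       known at run time. The helper render_count and the templates are
       hypothetical.

```ocaml
(* Turn a run-time string into a format that takes one int, then use it
   with Printf.sprintf. The "%d" argument only fixes the expected type. *)
let render_count template n =
  Printf.sprintf (Scanf.format_from_string template "%d") n

(* A template whose type differs from "%d" is rejected at run time. *)
let rejected =
  try
    ignore (render_count "name: %s" 3);
    false
  with Scanf.Scan_failure _ -> true

let () =
  print_endline (render_count "count: %d" 3);
  Printf.printf "mismatch rejected: %b\n" rejected
```
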
       val unescaped : string -> string

       unescaped s returns a copy of s with escape sequences (according to the
       lexical conventions of OCaml) replaced by their corresponding special
       characters. More precisely, Scanf.unescaped has the following
       property: for all strings s, Scanf.unescaped (String.escaped s) = s.

       Always returns a copy of the argument, even if there is no escape
       sequence in the argument. Raises Scanf.Scan_failure if s is not
       properly escaped (i.e. s has invalid escape sequences or special
       characters that are not properly escaped). For instance,
       Scanf.unescaped "\"" will fail.

       Since 4.00.0

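       The behaviour of unescaped can be sketched as follows; the sample
       strings and the names tabbed, round_trip and fails_on_quote are
       assumptions for the illustration.

```ocaml
(* The argument contains a backslash followed by 't'; the result
   contains a real tab character. *)
let tabbed = Scanf.unescaped "a\\tb"

(* Round trip through String.escaped, as in the stated property. *)
let round_trip =
  let s = "line1\nline2 \"quoted\"" in
  Scanf.unescaped (String.escaped s) = s

(* A lone double quote is not properly escaped. *)
let fails_on_quote =
  try
    ignore (Scanf.unescaped "\"");
    false
  with Scanf.Scan_failure _ -> true

let () =
  Printf.printf "%S %b %b\n" tabbed round_trip fails_on_quote
```
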
   Deprecated
       val fscanf : in_channel -> ('a, 'b, 'c, 'd) scanner

       Deprecated.

       Scanf.fscanf is error prone and deprecated since 4.03.0.

       This function violates the following invariant of the Scanf module: To
       preserve scanning semantics, all scanning functions defined in Scanf
       must read from a user defined Scanf.Scanning.in_channel formatted input
       channel.

       If you need to read from an in_channel input channel ic, simply define
       a Scanf.Scanning.in_channel formatted input channel as in let ib =
       Scanning.from_channel ic, then use Scanf.bscanf ib as usual.

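       The replacement suggested above might look like the following sketch.
       The helper sum_of_pair, the temporary file, and its contents are
       assumptions made so that the example is self-contained.

```ocaml
(* Sum the two integers stored in the file at [path], reading through a
   formatted input channel built from a regular in_channel. *)
let sum_of_pair path =
  let ic = open_in path in
  let ib = Scanf.Scanning.from_channel ic in
  match Scanf.bscanf ib " %d %d" (fun a b -> a + b) with
  | total -> close_in ic; total
  | exception e -> close_in_noerr ic; raise e

let () =
  (* Create a small input file for the demonstration. *)
  let path = Filename.temp_file "scanf_demo" ".txt" in
  let oc = open_out path in
  output_string oc "3 4\n";
  close_out oc;
  Printf.printf "%d\n" (sum_of_pair path);
  Sys.remove path
```
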
       val kfscanf : in_channel -> (Scanning.in_channel -> exn -> 'd) -> ('a,
       'b, 'c, 'd) scanner

       Deprecated.

       Scanf.kfscanf is error prone and deprecated since 4.03.0.

OCamldoc                          2019-07-30                   Stdlib.Scanf(3)