scan(n) - f7

scan(n)                      Tcl Built-In Commands                     scan(n)

______________________________________________________________________________

NAME

       scan - Parse string using conversion specifiers in the style of sscanf

SYNOPSIS

       scan string format ?varName varName ...?
_________________________________________________________________

INTRODUCTION

       This command parses fields from an input string in the same fashion as
       the ANSI C sscanf procedure and returns a count of the number of
       conversions performed, or -1 if the end of the input string is reached
       before any conversions have been performed.  String gives the input to
       be parsed and format indicates how to parse it, using % conversion
       specifiers as in sscanf.  Each varName gives the name of a variable;
       when a field is scanned from string the result is converted back into a
       string and assigned to the corresponding variable.  If no varName
       variables are specified, then scan works in an inline manner, returning
       the data that would otherwise be stored in the variables as a list.  In
       the inline case, an empty string is returned when the end of the input
       string is reached before any conversions have been performed.
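
       The two calling styles and their return values can be sketched as
       follows.  This is only an illustration; the variable names (count, n,
       word, values, status, inline, x) are made up for the example.

```tcl
# Variable mode: scan returns the number of conversions performed.
set count [scan "12 apples" "%d %s" n word]
puts "count=$count n=$n word=$word"

# Inline mode: with no varName arguments the converted values
# are returned as a list instead of being stored in variables.
set values [scan "12 apples" "%d %s"]
puts "values=$values"

# End of input reached before any conversion:
# -1 when variables are given, an empty string in the inline case.
set status [scan "" "%d" x]
set inline [scan "" "%d"]
puts "status=$status inline='$inline'"
```

       Comparing the return value against the expected number of fields is
       the usual way to detect a string that did not match the format.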

DETAILS ON SCANNING

       Scan operates by scanning string and format together.  If the next
       character in format is a blank or tab then it matches any number of
       white space characters in string (including zero).  Otherwise, if it
       isn't a % character then it must match the next character of string.
       When a % is encountered in format, it indicates the start of a
       conversion specifier.  A conversion specifier contains up to five fields
       after the %: a *, which indicates that the converted value is to be
       discarded instead of assigned to a variable; an XPG3 position specifier;
       a number indicating a maximum field width; a field size modifier; and a
       conversion character.  All of these fields are optional except for the
       conversion character.  The fields that are present must appear in the
       order given above.

       When scan finds a conversion specifier in format, it first skips any
       white-space characters in string (unless the specifier is [ or c).
       Then it converts the next input characters according to the conversion
       specifier and stores the result in the variable given by the next
       argument to scan.
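
       A short sketch of the matching rules above: blanks in the format,
       literal characters, the * flag, and a maximum field width.  The input
       strings and variable names are invented for illustration.

```tcl
# A blank in the format matches any amount of white space, including none.
scan "x=1" "x = %d" a
scan "x   =   2" "x = %d" b

# %*s reads a field and discards it, so only one variable is needed.
scan "width 640" "%*s %d" width

# A maximum field width splits a run of digits into fixed-size fields.
scan "20240115" "%4d%2d%2d" year month day

puts "$a $b $width $year $month $day"
```

       Because %d always reads decimal, the field "01" is stored as 1.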

       If the % is followed by a decimal number and a $, as in ``%2$d'', then
       the variable to use is not taken from the next sequential argument.
       Instead, it is taken from the argument indicated by the number, where 1
       corresponds to the first varName.  If there are any positional
       specifiers in format then all of the specifiers must be positional.
       Every varName on the argument list must correspond to exactly one
       conversion specifier or an error is generated.  In the inline case, any
       position can be specified at most once, and the empty positions will be
       filled in with empty strings.
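
       Positional specifiers might be used as in the sketch below.  The
       format is written in braces so that Tcl does not try to substitute $d
       as a variable; fmt, a, b, and pair are illustrative names.

```tcl
# Braces keep Tcl from treating $d as a variable reference.
set fmt {%2$d %1$d}

# The first field read is stored in the second variable, and vice versa.
scan "10 20" $fmt a b
puts "a=$a b=$b"

# Inline mode returns the values ordered by position.
set pair [scan "10 20" $fmt]
puts $pair
```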

       The following conversion characters are supported:

       d         The input field must be a decimal integer.  It is read in and
                 the value is stored in the variable as a decimal string.  If
                 the l or L field size modifier is given, the scanned value
                 will have an internal representation that is at least 64 bits
                 in size.

       o         The input field must be an octal integer.  It is read in and
                 the value is stored in the variable as a decimal string.  If
                 the l or L field size modifier is given, the scanned value
                 will have an internal representation that is at least 64 bits
                 in size.  If the value exceeds MAX_INT (017777777777 on
                 platforms using 32-bit integers when the l and L modifiers are
                 not given), it will be truncated to a signed integer.  Hence,
                 037777777777 will appear as -1 on a 32-bit machine by
                 default.

       x         The input field must be a hexadecimal integer.  It is read in
                 and the value is stored in the variable as a decimal string.
                 If the l or L field size modifier is given, the scanned value
                 will have an internal representation that is at least 64 bits
                 in size.  If the value exceeds MAX_INT (0x7FFFFFFF on
                 platforms using 32-bit integers when the l and L modifiers are
                 not given), it will be truncated to a signed integer.  Hence,
                 0xFFFFFFFF will appear as -1 on a 32-bit machine.

       u         The input field must be a decimal integer.  The value is
                 stored in the variable as an unsigned decimal integer string.
                 If the l or L field size modifier is given, the scanned value
                 will have an internal representation that is at least 64 bits
                 in size.

       i         The input field must be an integer.  The base (i.e. decimal,
                 octal, or hexadecimal) is determined in the same fashion as
                 described in expr.  The value is stored in the variable as a
                 decimal string.  If the l or L field size modifier is given,
                 the scanned value will have an internal representation that
                 is at least 64 bits in size.

       c         A single character is read in and its binary value is stored
                 in the variable as a decimal string.  Initial white space is
                 not skipped in this case, so the input field may be a
                 white-space character.  This conversion is different from the
                 ANSI standard in that the input field always consists of a
                 single character and no field width may be specified.

       s         The input field consists of all the characters up to the next
                 white-space character; the characters are copied to the
                 variable.

       e or f or g
                 The input field must be a floating-point number consisting of
                 an optional sign, a string of decimal digits possibly
                 containing a decimal point, and an optional exponent
                 consisting of an e or E followed by an optional sign and a
                 string of decimal digits.  It is read in and stored in the
                 variable as a floating-point string.

       [chars]   The input field consists of any number of characters in
                 chars.  The matching string is stored in the variable.  If
                 the first character between the brackets is a ] then it is
                 treated as part of chars rather than the closing bracket for
                 the set.  If chars contains a sequence of the form a-b then
                 any character between a and b (inclusive) will match.  If the
                 first or last character between the brackets is a -, then it
                 is treated as part of chars rather than indicating a range.

       [^chars]  The input field consists of any number of characters not in
                 chars.  The matching string is stored in the variable.  If
                 the character immediately following the ^ is a ] then it is
                 treated as part of the set rather than the closing bracket
                 for the set.  If chars contains a sequence of the form a-b
                 then any character between a and b (inclusive) will be
                 excluded from the set.  If the first or last character
                 between the brackets is a -, then it is treated as part of
                 chars rather than indicating a range.

       n         No input is consumed from the input string.  Instead, the
                 total number of characters scanned from the input string so
                 far is stored in the variable.

       The number of characters read from the input for a conversion is the
       largest number that makes sense for that particular conversion (e.g.
       as many decimal digits as possible for %d, as many octal digits as
       possible for %o, and so on).  The input field for a given conversion
       terminates either when a white-space character is encountered or when
       the maximum field width has been reached, whichever comes first.  If a
       * is present in the conversion specifier then no variable is assigned
       and the next scan argument is not consumed.
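
       The sketch below exercises several of the conversion characters
       described above.  The inputs and variable names are invented for the
       example; formats containing [ and ] are written in braces so that Tcl
       does not attempt command substitution.

```tcl
scan "255" %d dec           ;# decimal digits
scan "ff" %x hex            ;# hexadecimal digits, stored as a decimal string
scan "377" %o oct           ;# octal digits, stored as a decimal string
scan "A" %c code            ;# character code of a single character
scan "1.5e3" %f real        ;# floating-point value

# Character sets: a run of lowercase letters, then a decimal integer.
scan "abc123" {%[a-z]%d} letters digits

# Negated set: everything up to the first "=", then the rest as a word.
scan "key=value" {%[^=]=%s} key value

# %n stores the number of characters consumed so far.
scan "hello world" "%s%n" first used

puts "$dec $hex $oct $code $letters $digits $key $value $first $used"
```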

DIFFERENCES FROM ANSI SSCANF

       The behavior of the scan command is the same as the behavior of the
       ANSI C sscanf procedure except for the following differences:

       [1]    The %p conversion specifier is not currently supported.

       [2]    For %c conversions a single character value is converted to a
              decimal string, which is then assigned to the corresponding
              varName; no field width may be specified for this conversion.

       [3]    The h modifier is always ignored and the l and L modifiers are
              ignored when converting real values (i.e. type double is used
              for the internal representation).

       [4]    If the end of the input string is reached before any conversions
              have been performed and no variables are given, an empty string
              is returned.
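
       Difference [2] is the one most likely to surprise readers coming from
       C.  A minimal sketch, using the format command to turn the code back
       into a character (code, char, and blank are illustrative names):

```tcl
# scan %c yields the character code as a decimal string, not the character.
scan "A" %c code
puts $code

# format %c converts a code back into the character.
set char [format %c $code]
puts $char

# White space is not skipped before %c, so a leading blank is itself the field.
scan " A" %c blank
puts $blank
```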

EXAMPLES

       Parse a simple color specification of the form #RRGGBB using
       hexadecimal conversions with field sizes:
              set string "#08D03F"
              scan $string "#%2x%2x%2x" r g b
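
       The same color specification could also be parsed in the inline style
       and rebuilt with format.  This is only a sketch; the rgb and rebuilt
       variables are not part of the original example.

```tcl
set string "#08D03F"

# Inline form: the three components come back as a list of decimal strings.
set rgb [scan $string "#%2x%2x%2x"]
foreach {r g b} $rgb break
puts "r=$r g=$g b=$b"

# format can rebuild the original specification from the components.
set rebuilt [format "#%02X%02X%02X" $r $g $b]
puts $rebuilt
```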

       Parse a HH:MM time string, noting that this avoids problems with octal
       numbers by forcing interpretation as decimals (if we did not care, we
       would use the %i conversion instead):
              set string "08:08"   ;# *Not* octal!
              if {[scan $string "%d:%d" hours minutes] != 2} {
                 error "not a valid time string"
              }
              # We have to understand numeric ranges ourselves...
              if {$minutes < 0 || $minutes > 59} {
                 error "invalid number of minutes"
              }

       Break a string up into sequences of non-whitespace characters (note the
       use of the %n conversion so that we get skipping over leading
       whitespace correct):
              set string " a string {with braced words} + leading space "
              set words {}
              while {[scan $string %s%n word length] == 2} {
                 lappend words $word
                 set string [string range $string $length end]
              }

       Parse a simple coordinate string, checking that it is complete by
       looking for the terminating character explicitly:
              set string "(5.2,-4e-2)"
              # Note that the spaces before the literal parts of
              # the scan pattern are significant, and that ")" is
              # the Unicode character \u0029
              if {
                 [scan $string " (%f ,%f %c" x y last] != 3
                 || $last != 0x0029
              } then {
                 error "invalid coordinate string"
              }
              puts "X=$x, Y=$y"

KEYWORDS

       conversion specifier, parse, scan



Tcl                                   8.4                              scan(n)