scan(n)                      Tcl Built-In Commands                     scan(n)

______________________________________________________________________________

NAME

       scan - Parse string using conversion specifiers in the style of sscanf

SYNOPSIS

       scan string format ?varName varName ...?
_________________________________________________________________

INTRODUCTION

       This command parses substrings from an input string in a fashion
       similar to the ANSI C sscanf procedure and returns a count of the
       number of conversions performed, or -1 if the end of the input string
       is reached before any conversions have been performed. String gives
       the input to be parsed and format indicates how to parse it, using %
       conversion specifiers as in sscanf. Each varName gives the name of a
       variable; when a substring is scanned from string that matches a
       conversion specifier, the substring is assigned to the corresponding
       variable. If no varName variables are specified, then scan works in
       an inline manner, returning the data that would otherwise be stored
       in the variables as a list. In the inline case, an empty string is
       returned when the end of the input string is reached before any
       conversions have been performed.
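
       As a minimal sketch of the two calling styles described above (the
       input "42 apples" and the variable names qty, item and unused are
       illustrative assumptions, not part of the original page):

```tcl
# Variable form: the return value is the number of conversions performed.
set count [scan "42 apples" "%d %s" qty item]
# count is 2, qty is 42, item is "apples"

# Inline form (no varName arguments): the converted values are returned
# as a list instead of being stored in variables.
set fields [scan "42 apples" "%d %s"]
# fields is the two-element list {42 apples}

# End of input reached before any conversion has been performed.
set eofCount  [scan "" "%d" unused]   ;# -1 in the variable form
set eofInline [scan "" "%d"]          ;# empty string in the inline form
```
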

DETAILS ON SCANNING

       Scan operates by scanning string and format together. If the next
       character in format is a blank or tab then it matches any number of
       white space characters in string (including zero). Otherwise, if it
       is not a % character then it must match the next character of string.
       When a % is encountered in format, it indicates the start of a
       conversion specifier. A conversion specifier contains up to four
       fields after the %: an XPG3 position specifier (or a * to indicate
       the converted value is to be discarded instead of assigned to any
       variable); a number indicating a maximum substring width; a size
       modifier; and a conversion character. All of these fields are
       optional except for the conversion character. The fields that are
       present must appear in the order given above.

       When scan finds a conversion specifier in format, it first skips any
       white-space characters in string (unless the conversion character is
       [ or c). Then it converts the next input characters according to the
       conversion specifier and stores the result in the variable given by
       the next argument to scan.
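
       The matching rules above can be sketched as follows. The inputs and
       the variable names (a, b, c, d, ch) are hypothetical and chosen only
       for illustration:

```tcl
# A blank in format matches any amount of white space in string,
# including none at all.
set n1 [scan "x=5" "x = %d" a]         ;# 1 conversion, a is 5
set n2 [scan "x   =   5" "x = %d" b]   ;# 1 conversion, b is 5

# Any other non-% character must match the input exactly; scanning stops
# at the first mismatch, so no conversion is performed here.
set n3 [scan "y=5" "x = %d" c]         ;# 0 conversions, c is not set

# %d skips leading white space in string, but %c does not.
scan "   7" "%d" d                     ;# d is 7
scan " 7" "%c" ch                      ;# ch is 32, the code of the space
```
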
       If the % is followed by a decimal number and a $, as in “%2$d”, then
       the variable to use is not taken from the next sequential argument.
       Instead, it is taken from the argument indicated by the number, where
       1 corresponds to the first varName. If there are any positional
       specifiers in format then all of the specifiers must be positional.
       Every varName on the argument list must correspond to exactly one
       conversion specifier or an error is generated; in the inline case,
       any position can be specified at most once and the empty positions
       will be filled in with empty strings.
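
       A short sketch of positional specifiers, with made-up input and
       variable names (first, second). The format is braced so that Tcl
       does not treat the $ as a variable reference:

```tcl
# The first conversion is stored in the second variable and vice versa.
set count [scan "10 20" {%2$d %1$d} first second]
# count is 2, first is 20, second is 10

# Inline form: each value is placed at the list position it names.
set swapped [scan "10 20" {%2$d %1$d}]
# swapped is {20 10}

# Inline form with a gap: position 1 is never named, so it is filled
# with an empty string.
set gapped [scan "7" {%2$d}]
# gapped is a two-element list: an empty string followed by 7
```
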
       The size modifier field is used only when scanning a substring into
       one of Tcl's integer values. The size modifier field dictates the
       integer range acceptable to be stored in a variable, or, for the
       inline case, in a position in the result list. The syntactically
       valid values for the size modifier are h, L, l, and ll. The h size
       modifier value is equivalent to the absence of a size modifier in the
       conversion specifier. Either one indicates the integer range to be
       stored is limited to the same range produced by the int() function of
       the expr command. The L size modifier is equivalent to the l size
       modifier. Either one indicates the integer range to be stored is
       limited to the same range produced by the wide() function of the expr
       command. The ll size modifier indicates that the integer range to be
       stored is unlimited.

       The following conversion characters are supported:

       d         The input substring must be a decimal integer. It is read in
                 and the integer value is stored in the variable, truncated as
                 required by the size modifier value.

       o         The input substring must be an octal integer. It is read in
                 and the integer value is stored in the variable, truncated as
                 required by the size modifier value.

       x         The input substring must be a hexadecimal integer. It is
                 read in and the integer value is stored in the variable,
                 truncated as required by the size modifier value.

       u         The input substring must be a decimal integer. The integer
                 value is truncated as required by the size modifier value,
                 and the corresponding unsigned value for that truncated range
                 is computed and stored in the variable as a decimal string.
                 The conversion makes no sense without reference to a
                 truncation range, so the size modifier ll is not permitted in
                 combination with conversion character u.

       i         The input substring must be an integer. The base (i.e.
                 decimal, binary, octal, or hexadecimal) is determined in the
                 same fashion as described in expr. The integer value is
                 stored in the variable, truncated as required by the size
                 modifier value.

       c         A single character is read in and its Unicode value is stored
                 in the variable as an integer value. Initial white space is
                 not skipped in this case, so the input substring may be a
                 white-space character.

       s         The input substring consists of all the characters up to the
                 next white-space character; the characters are copied to the
                 variable.

       e or f or g
                 The input substring must be a floating-point number
                 consisting of an optional sign, a string of decimal digits
                 possibly containing a decimal point, and an optional exponent
                 consisting of an e or E followed by an optional sign and a
                 string of decimal digits. It is read in and stored in the
                 variable as a floating-point value.

       [chars]   The input substring consists of one or more characters in
                 chars. The matching string is stored in the variable. If
                 the first character between the brackets is a ] then it is
                 treated as part of chars rather than the closing bracket for
                 the set. If chars contains a sequence of the form a-b then
                 any character between a and b (inclusive) will match. If the
                 first or last character between the brackets is a -, then it
                 is treated as part of chars rather than indicating a range.

       [^chars]  The input substring consists of one or more characters not in
                 chars. The matching string is stored in the variable. If
                 the character immediately following the ^ is a ] then it is
                 treated as part of the set rather than the closing bracket
                 for the set. If chars contains a sequence of the form a-b
                 then any character between a and b (inclusive) will be
                 excluded from the set. If the first or last character
                 between the brackets is a -, then it is treated as part of
                 chars rather than indicating a range.

       n         No input is consumed from the input string. Instead, the
                 total number of characters scanned from the input string so
                 far is stored in the variable.
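
       The following sketch exercises several of the conversion characters
       listed above. All inputs and variable names are illustrative
       assumptions. Formats containing [ are braced so that Tcl does not
       attempt command substitution:

```tcl
scan "255" "%d" dec                      ;# dec is 255
scan "377" "%o" oct                      ;# oct is 255 (octal 377)
scan "ff"  "%x" hex                      ;# hex is 255 (hexadecimal ff)
scan "A"   "%c" code                     ;# code is 65
scan "hello world" "%s" word             ;# word is "hello"
scan "2.5e1" "%f" real                   ;# real is the floating-point value 25.0

# %[chars] collects a run of characters from the set.
scan "abc123" {%[a-z]%d} letters digits  ;# letters is "abc", digits is 123

# %[^chars] collects a run of characters NOT in the set.
scan "key=value" {%[^=]=%s} key val      ;# key is "key", val is "value"

# %n reports how many characters have been scanned so far.
scan "abc def" "%s%n" head consumed      ;# head is "abc", consumed is 3
```
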
       The number of characters read from the input for a conversion is the
       largest number that makes sense for that particular conversion (e.g.
       as many decimal digits as possible for %d, as many octal digits as
       possible for %o, and so on). The input substring for a given
       conversion terminates either when a white-space character is
       encountered or when the maximum substring width has been reached,
       whichever comes first. If a * is present in the conversion specifier
       then no variable is assigned and the next scan argument is not
       consumed.
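
       A brief sketch of these rules, using hypothetical inputs and variable
       names:

```tcl
# Each conversion takes as many characters as make sense for it, so %d
# stops at the first non-digit and %s picks up from there.
scan "123abc" "%d%s" num rest                 ;# num is 123, rest is "abc"

# A maximum substring width splits an unbroken run of digits.
scan "20240131" "%4d%2d%2d" year month day    ;# 2024, 1 and 31

# A * discards the converted value and does not consume a varName, so
# only two variables are needed for three conversion specifiers.
set count [scan "id 17 ok" "%*s %d %s" id status]
# count is 2, id is 17, status is "ok"
```

       Suppressed conversions do not contribute to the returned count in
       this sketch, since no variable is assigned for them.
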

DIFFERENCES FROM ANSI SSCANF

       The behavior of the scan command is the same as the behavior of the
       ANSI C sscanf procedure except for the following differences:

       [1]    The %p conversion specifier is not supported.

       [2]    For %c conversions a single character value is converted to a
              decimal string, which is then assigned to the corresponding
              varName; no substring width may be specified for this
              conversion.

       [3]    The h modifier is always ignored and the l and L modifiers are
              ignored when converting real values (i.e. type double is used
              for the internal representation). The ll modifier has no sscanf
              counterpart.

       [4]    If the end of the input string is reached before any conversions
              have been performed and no variables are given, an empty string
              is returned.

EXAMPLES

       Convert a Unicode character to its numeric value:
              set char "x"
              set value [scan $char %c]

       Parse a simple color specification of the form #RRGGBB using
       hexadecimal conversions with substring sizes:
              set string "#08D03F"
              scan $string "#%2x%2x%2x" r g b

       Parse a HH:MM time string, noting that this avoids problems with octal
       numbers by forcing interpretation as decimals (if we did not care, we
       would use the %i conversion instead):
              set string "08:08"   ;# *Not* octal!
              if {[scan $string "%d:%d" hours minutes] != 2} {
                 error "not a valid time string"
              }
              # We have to understand numeric ranges ourselves...
              if {$minutes < 0 || $minutes > 59} {
                 error "invalid number of minutes"
              }

       Break a string up into sequences of non-whitespace characters (note the
       use of the %n conversion so that we get skipping over leading
       whitespace correct):
              set string " a string {with braced words} + leading space "
              set words {}
              while {[scan $string %s%n word length] == 2} {
                 lappend words $word
                 set string [string range $string $length end]
              }

       Parse a simple coordinate string, checking that it is complete by
       looking for the terminating character explicitly:
              set string "(5.2,-4e-2)"
              # Note that the spaces before the literal parts of
              # the scan pattern are significant, and that ")" is
              # the Unicode character \u0029
              if {
                 [scan $string " (%f ,%f %c" x y last] != 3
                 || $last != 0x0029
              } then {
                 error "invalid coordinate string"
              }
              puts "X=$x, Y=$y"

       An interactive session demonstrating the truncation of integer values
       determined by size modifiers:
              % set tcl_platform(wordSize)
              4
              % scan 20000000000000000000 %d
              2147483647
              % scan 20000000000000000000 %ld
              9223372036854775807
              % scan 20000000000000000000 %lld
              20000000000000000000

KEYWORDS

       conversion specifier, parse, scan

Tcl                                   8.4                              scan(n)