string(n) Tcl Built-In Commands string(n)

______________________________________________________________________________

NAME
string - Manipulate strings

SYNOPSIS
string option arg ?arg ...?
______________________________________________________________________________

DESCRIPTION
Performs one of several string operations, depending on option. The
legal options (which may be abbreviated) are:
17
string cat ?string1? ?string2...?
Concatenate the given strings just like placing them directly
next to each other and return the resulting compound string. If
no strings are present, the result is an empty string.

This primitive is occasionally handier than juxtaposition of
strings when mixed quoting is wanted, or when the aim is to
return the result of a concatenation without resorting to return
-level 0, and is more efficient than building a list of arguments
and using join with an empty join string.
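
As an illustrative sketch of the two uses mentioned above (the variable names are arbitrary, and lmap assumes Tcl 8.6 or later):

```tcl
# Mixed quoting: a braced literal joined to a substituted, quoted string.
set name "world"
set greeting [string cat {$literal } "hello, $name"]
# greeting is: $literal hello, world

# Yield a concatenation as the value of each lmap iteration.
set tagged [lmap item {a b c} {string cat "<" $item ">"}]
# tagged is: <a> <b> <c>

# With no arguments, the result is the empty string.
set empty [string cat]
```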
28
29 string compare ?-nocase? ?-length length? string1 string2
30 Perform a character-by-character comparison of strings string1
31 and string2. Returns -1, 0, or 1, depending on whether string1
32 is lexicographically less than, equal to, or greater than
33 string2. If -length is specified, then only the first length
34 characters are used in the comparison. If -length is negative,
35 it is ignored. If -nocase is specified, then the strings are
36 compared in a case-insensitive manner.
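
A short sketch of the three-way result and the two options (the sample words are arbitrary). Because the result is -1, 0, or 1, the command can also be passed to lsort -command:

```tcl
set less   [string compare apple banana]            ;# -1
set more   [string compare banana apple]            ;# 1
set nocase [string compare -nocase Apple apple]     ;# 0
set prefix [string compare -length 3 foobar foobaz] ;# 0, only "foo" is compared

# Case-insensitive sort using string compare as the comparison command.
set sorted [lsort -command {string compare -nocase} {pear Apple banana}]
# sorted is: Apple banana pear
```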
37
38 string equal ?-nocase? ?-length length? string1 string2
39 Perform a character-by-character comparison of strings string1
40 and string2. Returns 1 if string1 and string2 are identical, or
41 0 when not. If -length is specified, then only the first length
42 characters are used in the comparison. If -length is negative,
43 it is ignored. If -nocase is specified, then the strings are
44 compared in a case-insensitive manner.
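
The following sketch (sample values are arbitrary) shows the options, and contrasts string equal with the == operator of expr, which compares numerically when both operands look like numbers:

```tcl
set same   [string equal -nocase "Tcl" "TCL"]          ;# 1
set prefix [string equal -length 4 "string" "strict"]  ;# 1, both start with "stri"

# expr compares "1" and "1.0" as numbers; string equal compares characters.
set numeric [expr {"1" == "1.0"}]      ;# 1
set textual [string equal "1" "1.0"]   ;# 0
```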
45
46 string first needleString haystackString ?startIndex?
47 Search haystackString for a sequence of characters that exactly
48 match the characters in needleString. If found, return the
49 index of the first character in the first such match within
50 haystackString. If not found, return -1. If startIndex is
51 specified (in any of the forms described in STRING INDICES),
52 then the search is constrained to start with the character in
53 haystackString specified by the index. For example,
54
    string first a 0a23456789abcdef 5

will return 10, but

    string first a 0123456789abcdef 11

will return -1.
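
The startIndex argument makes it straightforward to collect every match. The following is a minimal sketch; findAll is a hypothetical helper, not a built-in command:

```tcl
# Return the indices of all (possibly overlapping) occurrences of needle.
proc findAll {needle haystack} {
    set result {}
    set start 0
    while {[set idx [string first $needle $haystack $start]] >= 0} {
        lappend result $idx
        # Resume the search one character past the match just found.
        set start [expr {$idx + 1}]
    }
    return $result
}

set hits [findAll a 0a23456789abcdef]   ;# 1 10
```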
62
63 string index string charIndex
64 Returns the charIndex'th character of the string argument. A
65 charIndex of 0 corresponds to the first character of the string.
66 charIndex may be specified as described in the STRING INDICES
67 section.
68
69 If charIndex is less than 0 or greater than or equal to the
70 length of the string then this command returns an empty string.
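
For instance (an illustrative sketch with an arbitrary sample string):

```tcl
set s "abcdef"
set first  [string index $s 0]      ;# a
set last   [string index $s end]    ;# f
set penult [string index $s end-1]  ;# e
set beyond [string index $s 10]     ;# empty string, index is out of range
```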
71
72 string is class ?-strict? ?-failindex varname? string
73 Returns 1 if string is a valid member of the specified character
74 class, otherwise returns 0. If -strict is specified, then an
75 empty string returns 0, otherwise an empty string will return 1
76 on any class. If -failindex is specified, then if the function
77 returns 0, the index in the string where the class was no longer
valid will be stored in the variable named varname. The variable
is not set when the command returns 1. The following character
classes are recognized (the class name can be abbreviated):
81
82 alnum Any Unicode alphabet or digit character.
83
84 alpha Any Unicode alphabet character.
85
86 ascii Any character with a value less than \u0080 (those
87 that are in the 7-bit ascii range).
88
89 boolean Any of the forms allowed to Tcl_GetBoolean.
90
91 control Any Unicode control character.
92
93 digit Any Unicode digit character. Note that this
94 includes characters outside of the [0-9] range.
95
96 double Any of the valid forms for a double in Tcl, with
97 optional surrounding whitespace. In case of
98 under/overflow in the value, 0 is returned and the
99 varname will contain -1.
100
entier Any of the valid string formats for an integer value
of arbitrary size in Tcl, with optional surrounding
whitespace. The formats accepted are exactly those
accepted by the C routine Tcl_GetBignumFromObj.
105
106 false Any of the forms allowed to Tcl_GetBoolean where the
107 value is false.
108
109 graph Any Unicode printing character, except space.
110
111 integer Any of the valid string formats for a 32-bit integer
112 value in Tcl, with optional surrounding whitespace.
113 In case of under/overflow in the value, 0 is
114 returned and the varname will contain -1.
115
116 list Any proper list structure, with optional surrounding
117 whitespace. In case of improper list structure, 0 is
118 returned and the varname will contain the index of
119 the “element” where the list parsing fails, or -1 if
120 this cannot be determined.
121
122 lower Any Unicode lower case alphabet character.
123
124 print Any Unicode printing character, including space.
125
126 punct Any Unicode punctuation character.
127
128 space Any Unicode whitespace character, mongolian vowel
129 separator (U+180e), zero width space (U+200b), word
130 joiner (U+2060) or zero width no-break space
131 (U+feff) (=BOM).
132
133 true Any of the forms allowed to Tcl_GetBoolean where the
134 value is true.
135
136 upper Any upper case alphabet character in the Unicode
137 character set.
138
139 wideinteger Any of the valid forms for a wide integer in Tcl,
140 with optional surrounding whitespace. In case of
141 under/overflow in the value, 0 is returned and the
142 varname will contain -1.
143
144 wordchar Any Unicode word character. That is any alphanu‐
145 meric character, and any Unicode connector punctua‐
146 tion characters (e.g. underscore).
147
148 xdigit Any hexadecimal digit character ([0-9A-Fa-f]).
149
150 In the case of boolean, true and false, if the function will
151 return 0, then the varname will always be set to 0, due to the
152 varied nature of a valid boolean value.
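
A brief sketch of input validation with string is. The helper isPositiveInt is hypothetical and only illustrates why -strict usually belongs in such checks:

```tcl
# Without -strict, the empty string is accepted by every class.
set loose  [string is integer ""]           ;# 1
set strict [string is integer -strict ""]   ;# 0

# -failindex reports where a character-class check stopped matching.
set ok [string is alnum -failindex pos "user name"]
# ok is 0 and pos is 4, the index of the space

# Hypothetical helper: accept only non-empty, positive integer strings.
proc isPositiveInt {value} {
    expr {[string is integer -strict $value] && $value > 0}
}
```

With this definition, isPositiveInt 42 returns 1, while isPositiveInt -3, isPositiveInt abc, and isPositiveInt "" all return 0.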
153
154 string last needleString haystackString ?lastIndex?
155 Search haystackString for a sequence of characters that exactly
156 match the characters in needleString. If found, return the
157 index of the first character in the last such match within
158 haystackString. If there is no match, then return -1. If
159 lastIndex is specified (in any of the forms described in STRING
160 INDICES), then only the characters in haystackString at or
161 before the specified lastIndex will be considered by the search.
162 For example,
163
    string last a 0a23456789abcdef 15

will return 10, but

    string last a 0a23456789abcdef 9

will return 1.
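
One common use is splitting a name at its final separator. The sketch below assumes the sample name contains at least one dot; a real script would check for a result of -1 first:

```tcl
set name "archive.tar.gz"
set dot  [string last . $name]             ;# 11
# The index "$dot+1" uses the M+N form described in STRING INDICES.
set ext  [string range $name $dot+1 end]   ;# gz
set base [string range $name 0 $dot-1]     ;# archive.tar
```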
171
172 string length string
173 Returns a decimal string giving the number of characters in
174 string. Note that this is not necessarily the same as the num‐
175 ber of bytes used to store the string. If the value is a byte
176 array value (such as those returned from reading a binary
177 encoded channel), then this will return the actual byte length
178 of the value.
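
The difference between characters and bytes can be seen with a non-ASCII string. This sketch uses encoding convertto to obtain a byte array in a known encoding:

```tcl
set word "na\u00efve"    ;# five characters, the third is U+00EF
set chars [string length $word]                             ;# 5
# In UTF-8, U+00EF occupies two bytes, so the byte array is one longer.
set bytes [string length [encoding convertto utf-8 $word]]  ;# 6
```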
179
180 string map ?-nocase? mapping string
181 Replaces substrings in string based on the key-value pairs in
182 mapping. mapping is a list of key value key value ... as in
183 the form returned by array get. Each instance of a key in the
184 string will be replaced with its corresponding value. If
185 -nocase is specified, then matching is done without regard to
186 case differences. Both key and value may be multiple characters.
187 Replacement is done in an ordered manner, so the key appearing
188 first in the list will be checked first, and so on. string is
only iterated over once, so earlier key replacements will have
no effect on later key matches. For example,

    string map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc

will return the string 01321221.

Note that if an earlier key is a prefix of a later one, it will
completely mask the later one. So if the previous example is
reordered like this,

    string map {1 0 ab 2 a 3 abc 1} 1abcaababcabababc

it will return the string 02c322c222c.
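
The single-pass behaviour is what makes string map suitable for escaping. In this sketch (the entity mapping is just an example), the ampersands introduced by the replacements are not escaped a second time:

```tcl
set escaped [string map {& &amp; < &lt; > &gt;} "a < b && c > d"]
# escaped is: a &lt; b &amp;&amp; c &gt; d

# With -nocase, keys match regardless of case.
set folded [string map -nocase {foo bar} "Foo FOO foo"]
# folded is: bar bar bar
```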
203
204 string match ?-nocase? pattern string
205 See if pattern matches string; return 1 if it does, 0 if it does
206 not. If -nocase is specified, then the pattern attempts to
207 match against the string in a case insensitive manner. For the
208 two strings to match, their contents must be identical except
209 that the following special sequences may appear in pattern:
210
211 * Matches any sequence of characters in string, includ‐
212 ing a null string.
213
214 ? Matches any single character in string.
215
216 [chars] Matches any character in the set given by chars. If a
217 sequence of the form x-y appears in chars, then any
218 character between x and y, inclusive, will match.
219 When used with -nocase, the end points of the range
220 are converted to lower case first. Whereas {[A-z]}
221 matches “_” when matching case-sensitively (since “_”
222 falls between the “Z” and “a”), with -nocase this is
223 considered like {[A-Za-z]} (and probably what was
224 meant in the first place).
225
226 \x Matches the single character x. This provides a way
227 of avoiding the special interpretation of the charac‐
228 ters *?[]\ in pattern.
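
A few illustrative matches (the sample strings are arbitrary). Patterns containing brackets or backslashes are braced so that Tcl does not substitute them first, and the pattern must match the whole string:

```tcl
set m1 [string match *.tcl main.tcl]                ;# 1
set m2 [string match {[a-c]?t} bat]                 ;# 1
set m3 [string match -nocase HELLO* "hello world"]  ;# 1
set m4 [string match {\*} *]                        ;# 1, the backslash makes * literal
set m5 [string match b* abc]                        ;# 0, pattern is anchored at both ends
```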
229
230 string range string first last
231 Returns a range of consecutive characters from string, starting
232 with the character whose index is first and ending with the
233 character whose index is last. An index of 0 refers to the first
234 character of the string. first and last may be specified as for
235 the index method. If first is less than zero then it is treated
236 as if it were zero, and if last is greater than or equal to the
237 length of the string then it is treated as if it were end. If
238 first is greater than last then an empty string is returned.
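
For example (a sketch with an arbitrary sample string), including the clamping of out-of-range indices:

```tcl
set s "Hello, World"
set head    [string range $s 0 4]        ;# Hello
set tail    [string range $s end-4 end]  ;# World
set clamped [string range $s -3 100]     ;# Hello, World
set none    [string range $s 5 2]        ;# empty string, first is greater than last
```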
239
240 string repeat string count
241 Returns string repeated count number of times.
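
Typical uses are drawing rules and padding. In this sketch, padLeft is a hypothetical helper, not a built-in command:

```tcl
set rule [string repeat - 20]     ;# a line of 20 dashes
set abab [string repeat "ab" 3]   ;# ababab

# Hypothetical helper: right-align text in a field of the given width.
proc padLeft {text width} {
    set missing [expr {$width - [string length $text]}]
    if {$missing <= 0} {
        return $text
    }
    return [string repeat " " $missing]$text
}

set padded [padLeft 42 6]         ;# four spaces followed by 42
```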
242
243 string replace string first last ?newstring?
244 Removes a range of consecutive characters from string, starting
245 with the character whose index is first and ending with the
246 character whose index is last. An index of 0 refers to the
first character of the string. first and last may be specified
248 as for the index method. If newstring is specified, then it is
249 placed in the removed character range. If first is less than
250 zero then it is treated as if it were zero, and if last is
251 greater than or equal to the length of the string then it is
252 treated as if it were end. If first is greater than last or the
253 length of the initial string, or last is less than 0, then the
254 initial string is returned untouched.
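
An illustrative sketch of replacement, deletion, and the untouched case (sample string is arbitrary):

```tcl
set s "Hello, World"
set swapped [string replace $s 7 end "Tcl"]  ;# Hello, Tcl
set removed [string replace $s 5 5]          ;# Hello World, the comma is deleted
set same    [string replace $s 8 3 "x"]      ;# Hello, World, first is greater than last
```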
255
256 string reverse string
257 Returns a string that is the same length as string but with its
258 characters in the reverse order.
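
For example, a simple palindrome test can be sketched as follows (isPalindrome is a hypothetical helper that ignores case but not spacing or punctuation):

```tcl
set r [string reverse "stressed"]   ;# desserts

# Hypothetical helper: true if text reads the same in both directions.
proc isPalindrome {text} {
    string equal -nocase $text [string reverse $text]
}
```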
259
260 string tolower string ?first? ?last?
261 Returns a value equal to string except that all upper (or title)
262 case letters have been converted to lower case. If first is
263 specified, it refers to the first char index in the string to
264 start modifying. If last is specified, it refers to the char
265 index in the string to stop at (inclusive). first and last may
266 be specified using the forms described in STRING INDICES.
267
268 string totitle string ?first? ?last?
269 Returns a value equal to string except that the first character
270 in string is converted to its Unicode title case variant (or
271 upper case if there is no title case variant) and the rest of
272 the string is converted to lower case. If first is specified,
273 it refers to the first char index in the string to start modify‐
274 ing. If last is specified, it refers to the char index in the
275 string to stop at (inclusive). first and last may be specified
276 using the forms described in STRING INDICES.
277
278 string toupper string ?first? ?last?
279 Returns a value equal to string except that all lower (or title)
280 case letters have been converted to upper case. If first is
281 specified, it refers to the first char index in the string to
282 start modifying. If last is specified, it refers to the char
283 index in the string to stop at (inclusive). first and last may
284 be specified using the forms described in STRING INDICES.
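
The three case-conversion subcommands side by side, as a short sketch with arbitrary sample strings:

```tcl
set lower   [string tolower "Hello World"]       ;# hello world
set upper   [string toupper "Hello World"]       ;# HELLO WORLD
set title   [string totitle "hELLO wORLD"]       ;# Hello world
# first and last restrict the conversion to a range of characters.
set partial [string toupper "hello world" 0 4]   ;# HELLO world
```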
285
286 string trim string ?chars?
287 Returns a value equal to string except that any leading or
288 trailing characters present in the string given by chars are
removed. If chars is not specified then white space is removed
(any character for which string is space returns 1, and "\0").
291
292 string trimleft string ?chars?
Returns a value equal to string except that any leading characters
present in the string given by chars are removed. If chars
is not specified then white space is removed (any character for
which string is space returns 1, and "\0").
297
298 string trimright string ?chars?
Returns a value equal to string except that any trailing characters
present in the string given by chars are removed. If chars
is not specified then white space is removed (any character for
which string is space returns 1, and "\0").
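
A sketch of the three trimming subcommands (sample strings are arbitrary). Note that chars is treated as a set of characters, not as a substring:

```tcl
set t1 [string trim "  padded\t\n"]     ;# padded
set t2 [string trimleft "000123" 0]     ;# 123
set t3 [string trimright "path///" /]   ;# path
set t4 [string trim "xxhixyx" xy]       ;# hi, any mix of x and y is removed
```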
303
304 OBSOLETE SUBCOMMANDS
305 These subcommands are currently supported, but are likely to go away in
306 a future release as their functionality is either virtually never used
307 or highly misleading.
308
309 string bytelength string
310 Returns a decimal string giving the number of bytes used to rep‐
311 resent string in memory when encoded as Tcl's internal modified
312 UTF-8; Tcl may use other encodings for string as well, and does
313 not guarantee to only use a single encoding for a particular
314 string. Because UTF-8 uses a variable number of bytes to repre‐
315 sent Unicode characters, the byte length will not be the same as
316 the character length in general. The cases where a script cares
317 about the byte length are rare.
318
319 In almost all cases, you should use the string length operation
320 (including determining the length of a Tcl byte array value).
321 Refer to the Tcl_NumUtfChars manual entry for more details on
322 the UTF-8 representation.
323
324 Formally, the string bytelength operation returns the content of
325 the length field of the Tcl_Obj structure, after calling
326 Tcl_GetString to ensure that the bytes field is populated. This
327 is highly unlikely to be useful to Tcl scripts, as Tcl's inter‐
328 nal encoding is not strict UTF-8, but rather a modified CESU-8
329 with a denormalized NUL (identical to that used in a number of
330 places by Java's serialization mechanism) to enable basic pro‐
331 cessing with non-Unicode-aware C functions. As this representa‐
332 tion should only ever be used by Tcl's implementation, the num‐
333 ber of bytes used to store the representation is of very low
value (except to C extension code, which has direct access for
the purpose of memory management, etc.).
336
337 Compatibility note: it is likely that this subcommand will be
338 withdrawn in a future version of Tcl. It is better to use the
339 encoding convertto command to convert a string to a known encod‐
340 ing and then apply string length to that.
341
    string length [encoding convertto utf-8 $theString]
343
344 string wordend string charIndex
345 Returns the index of the character just after the last one in
346 the word containing character charIndex of string. charIndex
347 may be specified using the forms in STRING INDICES. A word is
348 considered to be any contiguous range of alphanumeric (Unicode
349 letters or decimal digits) or underscore (Unicode connector
350 punctuation) characters, or any single character other than
351 these.
352
353 string wordstart string charIndex
354 Returns the index of the first character in the word containing
355 character charIndex of string. charIndex may be specified using
356 the forms in STRING INDICES. A word is considered to be any
357 contiguous range of alphanumeric (Unicode letters or decimal
358 digits) or underscore (Unicode connector punctuation) charac‐
359 ters, or any single character other than these.
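
Used together, the two subcommands can extract the word around a given position. A minimal sketch with an arbitrary sample string:

```tcl
set s "foo bar_baz qux"
set start [string wordstart $s 6]   ;# 4, the "b" of bar_baz
set stop  [string wordend $s 6]     ;# 11, just past the "z"
set word  [string range $s $start [expr {$stop - 1}]]   ;# bar_baz

# A non-word character such as the space at index 3 is a word by itself.
set spaceStart [string wordstart $s 3]   ;# 3
set spaceStop  [string wordend $s 3]     ;# 4
```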
360
STRING INDICES
When referring to indices into a string (e.g., for string index or
string range) the following formats are supported:
364
365 integer For any index value that passes string is integer -strict,
366 the char specified at this integral index (e.g., 2 would
367 refer to the “c” in “abcd”).
368
369 end The last char of the string (e.g., end would refer to the “d”
370 in “abcd”).
371
372 end-N The last char of the string minus the specified integer off‐
373 set N (e.g., “end-1” would refer to the “c” in “abcd”).
374
375 end+N The last char of the string plus the specified integer offset
376 N (e.g., “end+-1” would refer to the “c” in “abcd”).
377
378 M+N The char specified at the integral index that is the sum of
379 integer values M and N (e.g., “1+1” would refer to the “c” in
380 “abcd”).
381
382 M-N The char specified at the integral index that is the differ‐
383 ence of integer values M and N (e.g., “2-1” would refer to
384 the “b” in “abcd”).
385
386 In the specifications above, the integer value M contains no trailing
387 whitespace and the integer value N contains no leading whitespace.
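
The index forms above can be tried directly with string index. In this sketch the string "abcd" matches the examples in the list, and the variable i is arbitrary:

```tcl
set s "abcd"
set byInt   [string index $s 2]       ;# c
set byEnd   [string index $s end]     ;# d
set endLess [string index $s end-1]   ;# c
set sum     [string index $s 1+1]     ;# c
set diff    [string index $s 2-1]     ;# b

# Variable substitution happens first, so this index is the string "1+1".
set i 1
set viaVar  [string index $s $i+1]    ;# c
```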
388
EXAMPLE
Test if the string in the variable string is a proper non-empty prefix
of the string foobar.

    set length [string length $string]
    if {$length == 0} {
        set isPrefix 0
    } else {
        set isPrefix [string equal -length $length $string "foobar"]
    }
399
SEE ALSO
expr(n), list(n)

KEYWORDS
case conversion, compare, index, match, pattern, string, word, equal,
ctype, character, reverse

Tcl 8.1 string(n)