string(n)                    Tcl Built-In Commands                   string(n)

______________________________________________________________________________

NAME
       string - Manipulate strings

SYNOPSIS
       string option arg ?arg ...?
______________________________________________________________________________

DESCRIPTION
       Performs one of several string operations, depending on option. The
       legal options (which may be abbreviated) are:

       string cat ?string1? ?string2...?
              Concatenate the given strings just like placing them directly
              next to each other and return the resulting compound string. If
              no strings are present, the result is an empty string.

              This primitive is occasionally handier than juxtaposition of
              strings when mixed quoting is wanted, or when the aim is to
              return the result of a concatenation without resorting to return
              -level 0, and is more efficient than building a list of
              arguments and using join with an empty join string.
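A minimal sketch of string cat with mixed quoting, assuming a Tcl release that provides this subcommand (8.6.2 or later). The variable names are illustrative only.

```tcl
# Mix a braced word, a quoted substitution and a bare word in one call.
set name "World"
set greeting [string cat {Hello, } "$name" !]

# With no arguments the result is the empty string.
set empty [string cat]
```

Here greeting holds "Hello, World!" and empty holds the empty string.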

       string compare ?-nocase? ?-length length? string1 string2
              Perform a character-by-character comparison of strings string1
              and string2. Returns -1, 0, or 1, depending on whether string1
              is lexicographically less than, equal to, or greater than
              string2. If -length is specified, then only the first length
              characters are used in the comparison. If -length is negative,
              it is ignored. If -nocase is specified, then the strings are
              compared in a case-insensitive manner.
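The following sketch illustrates the three return values and the two options. The sample words are arbitrary, and the use of lsort -command is only one possible application.

```tcl
set less   [string compare apple banana]             ;# apple sorts before banana
set same   [string compare -nocase APPLE apple]      ;# equal when case is ignored
set prefix [string compare -length 3 foobar foobaz]  ;# only "foo" is compared

# string compare can serve as a comparison command for lsort.
set sorted [lsort -command {string compare -nocase} {pear Apple banana}]
```

The first three results are -1, 0 and 0, and sorted is the list Apple banana pear.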

       string equal ?-nocase? ?-length length? string1 string2
              Perform a character-by-character comparison of strings string1
              and string2. Returns 1 if string1 and string2 are identical, or
              0 when not. If -length is specified, then only the first length
              characters are used in the comparison. If -length is negative,
              it is ignored. If -nocase is specified, then the strings are
              compared in a case-insensitive manner.

       string first needleString haystackString ?startIndex?
              Search haystackString for a sequence of characters that exactly
              match the characters in needleString. If found, return the
              index of the first character in the first such match within
              haystackString. If not found, return -1. If startIndex is
              specified (in any of the forms described in STRING INDICES),
              then the search is constrained to start with the character in
              haystackString specified by the index. For example,

                     string first a 0a23456789abcdef 5

              will return 10, but

                     string first a 0123456789abcdef 11

              will return -1.

       string index string charIndex
              Returns the charIndex'th character of the string argument. A
              charIndex of 0 corresponds to the first character of the string.
              charIndex may be specified as described in the STRING INDICES
              section.

              If charIndex is less than 0 or greater than or equal to the
              length of the string then this command returns an empty string.
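A short sketch of string index, including the out-of-range case described above. The sample string is arbitrary.

```tcl
set s "abcdef"
set firstChar [string index $s 0]     ;# first character
set lastChar  [string index $s end]   ;# last character
set beyond    [string index $s 10]    ;# index past the end of the string
```

Here firstChar is "a", lastChar is "f", and beyond is the empty string rather than an error.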

       string is class ?-strict? ?-failindex varname? string
              Returns 1 if string is a valid member of the specified character
              class, otherwise returns 0. If -strict is specified, then an
              empty string returns 0, otherwise an empty string will return 1
              on any class. If -failindex is specified, then if the function
              returns 0, the index in the string where the class was no longer
              valid will be stored in the variable named varname. The varname
              will not be set if string is returns 1. The following character
              classes are recognized (the class name can be abbreviated):

              alnum       Any Unicode alphabet or digit character.

              alpha       Any Unicode alphabet character.

              ascii       Any character with a value less than \u0080 (those
                          that are in the 7-bit ascii range).

              boolean     Any of the forms allowed to Tcl_GetBoolean.

              control     Any Unicode control character.

              digit       Any Unicode digit character. Note that this
                          includes characters outside of the [0-9] range.

              double      Any of the forms allowed to Tcl_GetDoubleFromObj.

              entier      Any of the valid string formats for an integer value
                          of arbitrary size in Tcl, with optional surrounding
                          whitespace. The formats accepted are exactly those
                          accepted by the C routine Tcl_GetBignumFromObj.

              false       Any of the forms allowed to Tcl_GetBoolean where the
                          value is false.

              graph       Any Unicode printing character, except space.

              integer     Any of the valid string formats for a 32-bit integer
                          value in Tcl, with optional surrounding whitespace.
                          In case of overflow in the value, 0 is returned and
                          the varname will contain -1.

              list        Any proper list structure, with optional surrounding
                          whitespace. In case of improper list structure, 0 is
                          returned and the varname will contain the index of
                          the “element” where the list parsing fails, or -1 if
                          this cannot be determined.

              lower       Any Unicode lower case alphabet character.

              print       Any Unicode printing character, including space.

              punct       Any Unicode punctuation character.

              space       Any Unicode whitespace character, mongolian vowel
                          separator (U+180e), zero width space (U+200b), word
                          joiner (U+2060) or zero width no-break space
                          (U+feff) (=BOM).

              true        Any of the forms allowed to Tcl_GetBoolean where the
                          value is true.

              upper       Any upper case alphabet character in the Unicode
                          character set.

              wideinteger Any of the valid forms for a wide integer in Tcl,
                          with optional surrounding whitespace. In case of
                          overflow in the value, 0 is returned and the varname
                          will contain -1.

              wordchar    Any Unicode word character. That is any
                          alphanumeric character, and any Unicode connector
                          punctuation characters (e.g. underscore).

              xdigit      Any hexadecimal digit character ([0-9A-Fa-f]).

              In the case of boolean, true and false, if the function will
              return 0, then the varname will always be set to 0, due to the
              varied nature of a valid boolean value.
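The sketch below shows the effect of -strict on an empty string and how -failindex reports the offending position. The isPort procedure is a hypothetical helper written for this illustration and is not part of Tcl.

```tcl
# An empty string passes any class unless -strict is given.
set emptyLoose  [string is integer ""]
set emptyStrict [string is integer -strict ""]

# -failindex stores the index of the first character outside the class.
set allDigits [string is digit -failindex pos "123x5"]

# Hypothetical validator: a non-empty integer in the range 1..65535.
proc isPort {value} {
    expr {[string is integer -strict $value] && $value >= 1 && $value <= 65535}
}
set goodPort [isPort 8080]
set badPort  [isPort ""]
```

Here emptyLoose is 1, emptyStrict is 0, allDigits is 0 with pos set to 3, goodPort is 1 and badPort is 0.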

       string last needleString haystackString ?lastIndex?
              Search haystackString for a sequence of characters that exactly
              match the characters in needleString. If found, return the
              index of the first character in the last such match within
              haystackString. If there is no match, then return -1. If
              lastIndex is specified (in any of the forms described in STRING
              INDICES), then only the characters in haystackString at or
              before the specified lastIndex will be considered by the search.
              For example,

                     string last a 0a23456789abcdef 15

              will return 10, but

                     string last a 0a23456789abcdef 9

              will return 1.

       string length string
              Returns a decimal string giving the number of characters in
              string. Note that this is not necessarily the same as the
              number of bytes used to store the string. If the value is a byte
              array value (such as those returned from reading a binary
              encoded channel), then this will return the actual byte length
              of the value.
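To illustrate the difference between character count and byte count, the following sketch measures a string containing one non-ASCII character. The choice of UTF-8 as the target encoding is an assumption made for the example.

```tcl
# "naive" spelled with U+00EF (i with diaeresis): five characters.
set s "na\u00efve"
set chars [string length $s]

# Converting to UTF-8 yields a byte array, so string length counts bytes.
set bytes [string length [encoding convertto utf-8 $s]]
```

Here chars is 5 and bytes is 6, because U+00EF occupies two bytes in UTF-8.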

       string map ?-nocase? mapping string
              Replaces substrings in string based on the key-value pairs in
              mapping. mapping is a list of key value key value ... as in
              the form returned by array get. Each instance of a key in the
              string will be replaced with its corresponding value. If
              -nocase is specified, then matching is done without regard to
              case differences. Both key and value may be multiple characters.
              Replacement is done in an ordered manner, so the key appearing
              first in the list will be checked first, and so on. string is
              only iterated over once, so earlier key replacements will have
              no effect on later key matches. For example,

                     string map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc

              will return the string 01321221.

              Note that if an earlier key is a prefix of a later one, it will
              completely mask the later one. So if the previous example is
              reordered like this,

                     string map {1 0 ab 2 a 3 abc 1} 1abcaababcabababc

              it will return the string 02c322c222c.

       string match ?-nocase? pattern string
              See if pattern matches string; return 1 if it does, 0 if it does
              not. If -nocase is specified, then the pattern attempts to
              match against the string in a case insensitive manner. For the
              two strings to match, their contents must be identical except
              that the following special sequences may appear in pattern:

              *         Matches any sequence of characters in string,
                        including a null string.

              ?         Matches any single character in string.

              [chars]   Matches any character in the set given by chars. If a
                        sequence of the form x-y appears in chars, then any
                        character between x and y, inclusive, will match.
                        When used with -nocase, the end points of the range
                        are converted to lower case first. Whereas {[A-z]}
                        matches “_” when matching case-sensitively (since “_”
                        falls between the “Z” and “a”), with -nocase this is
                        considered like {[A-Za-z]} (and probably what was
                        meant in the first place).

              \x        Matches the single character x. This provides a way
                        of avoiding the special interpretation of the
                        characters *?[]\ in pattern.
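The special sequences above can be sketched as follows. The file names are made up for the example, and the patterns are braced so that the Tcl parser does not treat the square brackets as command substitution.

```tcl
set star    [string match {*.tcl} main.tcl]              ;# * spans "main"
set quest   [string match {data-??.csv} data-07.csv]     ;# each ? is one character
set inSet   [string match {[a-c]*} delta]                ;# "d" is outside a-c
set nocase  [string match -nocase {README*} readme.md]   ;# case is ignored
set literal [string match {\*} *]                        ;# backslash makes * literal
```

The results are 1, 1, 0, 1 and 1 respectively.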

       string range string first last
              Returns a range of consecutive characters from string, starting
              with the character whose index is first and ending with the
              character whose index is last. An index of 0 refers to the first
              character of the string. first and last may be specified as for
              the index method. If first is less than zero then it is treated
              as if it were zero, and if last is greater than or equal to the
              length of the string then it is treated as if it were end. If
              first is greater than last then an empty string is returned.
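A brief sketch of string range, covering the clamping of out-of-range indices and the empty result for a reversed range. The sample string is arbitrary.

```tcl
set s "Hello, World"
set head    [string range $s 0 4]      ;# characters 0 through 4
set tail    [string range $s 7 end]    ;# from index 7 to the last character
set clamped [string range $s -5 100]   ;# both indices are clamped
set none    [string range $s 5 2]      ;# first is greater than last
```

Here head is "Hello", tail is "World", clamped is the whole string and none is empty.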

       string repeat string count
              Returns string repeated count number of times.

       string replace string first last ?newstring?
              Removes a range of consecutive characters from string, starting
              with the character whose index is first and ending with the
              character whose index is last. An index of 0 refers to the
              first character of the string. first and last may be specified
              as for the index method. If newstring is specified, then it is
              placed in the removed character range. If first is less than
              zero then it is treated as if it were zero, and if last is
              greater than or equal to the length of the string then it is
              treated as if it were end. The initial string is returned
              untouched if first is greater than last, or if first is equal
              to or greater than the length of the initial string, or if last
              is less than 0.
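The following sketch shows replacement, plain removal, and the case where the string is returned untouched. The sample string is arbitrary.

```tcl
set s "Hello, World"
set swapped [string replace $s 7 end "Tcl"]   ;# substitute the final word
set removed [string replace $s 5 end]         ;# no newstring: the range is deleted
set same    [string replace $s 8 3 "x"]       ;# first > last: nothing changes
```

Here swapped is "Hello, Tcl", removed is "Hello" and same equals the original string.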

       string reverse string
              Returns a string that is the same length as string but with its
              characters in the reverse order.

       string tolower string ?first? ?last?
              Returns a value equal to string except that all upper (or title)
              case letters have been converted to lower case. If first is
              specified, it refers to the first char index in the string to
              start modifying. If last is specified, it refers to the char
              index in the string to stop at (inclusive). first and last may
              be specified using the forms described in STRING INDICES.

       string totitle string ?first? ?last?
              Returns a value equal to string except that the first character
              in string is converted to its Unicode title case variant (or
              upper case if there is no title case variant) and the rest of
              the string is converted to lower case. If first is specified, it
              refers to the first char index in the string to start modifying.
              If last is specified, it refers to the char index in the string
              to stop at (inclusive). first and last may be specified using
              the forms described in STRING INDICES.

       string toupper string ?first? ?last?
              Returns a value equal to string except that all lower (or title)
              case letters have been converted to upper case. If first is
              specified, it refers to the first char index in the string to
              start modifying. If last is specified, it refers to the char
              index in the string to stop at (inclusive). first and last may
              be specified using the forms described in STRING INDICES.
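The three case conversion subcommands can be compared side by side. This sketch uses arbitrary sample text and restricts toupper to an index range.

```tcl
set lower [string tolower "Hello World"]        ;# whole string to lower case
set upper [string toupper "Hello World" 0 4]    ;# only indices 0 through 4
set title [string totitle "hELLO wORLD"]        ;# first char title case, rest lower
```

Here lower is "hello world", upper is "HELLO World" and title is "Hello world".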

       string trim string ?chars?
              Returns a value equal to string except that any leading or
              trailing characters present in the string given by chars are
              removed. If chars is not specified then white space is removed
              (any character for which string is space returns 1, and "\0").

       string trimleft string ?chars?
              Returns a value equal to string except that any leading
              characters present in the string given by chars are removed. If
              chars is not specified then white space is removed (any
              character for which string is space returns 1, and "\0").

       string trimright string ?chars?
              Returns a value equal to string except that any trailing
              characters present in the string given by chars are removed. If
              chars is not specified then white space is removed (any
              character for which string is space returns 1, and "\0").
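A small sketch of the trim family, first with the default white space set and then with an explicit chars argument. The inputs are invented for the example.

```tcl
set raw "  \tpadded value \n"
set both     [string trim $raw]               ;# default: strip white space
set left     [string trimleft "000123" 0]     ;# strip leading zeros only
set right    [string trimright "path///" /]   ;# strip trailing slashes only
set stripped [string trim "xxhixx" x]         ;# strip x from both ends
```

Here both is "padded value", left is "123", right is "path" and stripped is "hi".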

OBSOLETE SUBCOMMANDS
       These subcommands are currently supported, but are likely to go away in
       a future release as their functionality is either virtually never used
       or highly misleading.

       string bytelength string
              Returns a decimal string giving the number of bytes used to
              represent string in memory when encoded as Tcl's internal
              modified UTF-8; Tcl may use other encodings for string as well,
              and does not guarantee to only use a single encoding for a
              particular string. Because UTF-8 uses a variable number of bytes
              to represent Unicode characters, the byte length will not be the
              same as the character length in general. The cases where a
              script cares about the byte length are rare.

              In almost all cases, you should use the string length operation
              (including determining the length of a Tcl byte array value).
              Refer to the Tcl_NumUtfChars manual entry for more details on
              the UTF-8 representation.

              Formally, the string bytelength operation returns the content of
              the length field of the Tcl_Obj structure, after calling
              Tcl_GetString to ensure that the bytes field is populated. This
              is highly unlikely to be useful to Tcl scripts, as Tcl's
              internal encoding is not strict UTF-8, but rather a modified
              CESU-8 with a denormalized NUL (identical to that used in a
              number of places by Java's serialization mechanism) to enable
              basic processing with non-Unicode-aware C functions. As this
              representation should only ever be used by Tcl's implementation,
              the number of bytes used to store the representation is of very
              low value (except to C extension code, which has direct access
              for the purpose of memory management, etc.)

              Compatibility note: it is likely that this subcommand will be
              withdrawn in a future version of Tcl. It is better to use the
              encoding convertto command to convert a string to a known
              encoding and then apply string length to that.

                     string length [encoding convertto utf-8 $theString]

       string wordend string charIndex
              Returns the index of the character just after the last one in
              the word containing character charIndex of string. charIndex
              may be specified using the forms in STRING INDICES. A word is
              considered to be any contiguous range of alphanumeric (Unicode
              letters or decimal digits) or underscore (Unicode connector
              punctuation) characters, or any single character other than
              these.

       string wordstart string charIndex
              Returns the index of the first character in the word containing
              character charIndex of string. charIndex may be specified using
              the forms in STRING INDICES. A word is considered to be any
              contiguous range of alphanumeric (Unicode letters or decimal
              digits) or underscore (Unicode connector punctuation)
              characters, or any single character other than these.
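As a sketch of how the two word subcommands fit together, the following extracts the word surrounding a given index. The sample string and variable names are illustrative. Because these subcommands are listed as obsolete above, new scripts may prefer another approach.

```tcl
set s "foo bar_baz qux"

# Index 6 is the "r" inside "bar_baz"; the underscore counts as a word character.
set wordBegin [string wordstart $s 6]
set wordStop  [string wordend $s 6]

# wordend returns the index just after the word, so subtract one for string range.
set word [string range $s $wordBegin [expr {$wordStop - 1}]]
```

Here wordBegin is 4, wordStop is 11 and word is "bar_baz".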

STRING INDICES
       When referring to indices into a string (e.g., for string index or
       string range) the following formats are supported:

       integer   For any index value that passes string is integer -strict,
                 the char specified at this integral index (e.g., 2 would
                 refer to the “c” in “abcd”).

       end       The last char of the string (e.g., end would refer to the “d”
                 in “abcd”).

       end-N     The last char of the string minus the specified integer
                 offset N (e.g., “end-1” would refer to the “c” in “abcd”).

       end+N     The last char of the string plus the specified integer offset
                 N (e.g., “end+-1” would refer to the “c” in “abcd”).

       M+N       The char specified at the integral index that is the sum of
                 integer values M and N (e.g., “1+1” would refer to the “c” in
                 “abcd”).

       M-N       The char specified at the integral index that is the
                 difference of integer values M and N (e.g., “2-1” would refer
                 to the “b” in “abcd”).

       In the specifications above, the integer value M contains no trailing
       whitespace and the integer value N contains no leading whitespace.
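The index forms listed above can be exercised with string index on the same “abcd” string. The last line is a sketch of building an M+N index from a variable, with no white space around the operator.

```tcl
set s "abcd"
set byInt   [string index $s 2]       ;# integer form
set byEnd   [string index $s end]     ;# end form
set endLess [string index $s end-1]   ;# end-N form
set sum     [string index $s 1+1]     ;# M+N form
set diff    [string index $s 2-1]     ;# M-N form

# A variable can supply M; the substituted word becomes "1+1".
set i 1
set next [string index $s $i+1]
```

The results are "c", "d", "c", "c", "b" and "c" respectively.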

EXAMPLE
       Test if the string in the variable string is a proper non-empty prefix
       of the string foobar.

              set length [string length $string]
              if {$length == 0} {
                  set isPrefix 0
              } else {
                  set isPrefix [string equal -length $length $string "foobar"]
              }

SEE ALSO
       expr(n), list(n)

KEYWORDS
       case conversion, compare, index, match, pattern, string, word, equal,
       ctype, character, reverse

Tcl                                   8.1                             string(n)