String(3)                        OCaml library                        String(3)
2
3
4
6 String - Strings.
7
9 Module String
10
12 Module String
13 : sig end
14
15
16 Strings.
17
18 A string s of length n is an indexable and immutable sequence of n
19 bytes. For historical reasons these bytes are referred to as charac‐
20 ters.
21
22 The semantics of string functions is defined in terms of indices and
23 positions. These are depicted and described as follows.
24
positions  0   1   2   3   4    n-1    n
           +---+---+---+---+     +-----+
  indices  | 0 | 1 | 2 | 3 | ... | n-1 |
           +---+---+---+---+     +-----+
27
- An index i of s is an integer in the range [0; n-1]. It represents
  the i-th byte (character) of s, which can be accessed using the
  constant-time string indexing operator s.[i].

- A position i of s is an integer in the range [0; n]. It represents
  either the point at the beginning of the string, or the point between
  two indices, or the point at the end of the string. The i-th byte index
  is between positions i and i+1.
36
37
38 Two integers start and len are said to define a valid substring of s if
39 len >= 0 and start , start+len are positions of s .
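
The definition above can be turned into a small check. The following is a minimal sketch; the helper name is_valid_substring is hypothetical and not part of the String module.

```ocaml
(* Hypothetical helper, not part of String: true iff start and len
   define a valid substring of s, i.e. len >= 0 and both start and
   start+len are positions of s (integers in [0; n]). *)
let is_valid_substring s start len =
  let n = String.length s in
  (* start <= n - len is start + len <= n, written to avoid overflow. *)
  len >= 0 && start >= 0 && start <= n - len

let () =
  let s = "camel" in                        (* n = 5 *)
  assert (s.[0] = 'c' && s.[4] = 'l');      (* indices range over [0; 4] *)
  assert (is_valid_substring s 5 0);        (* position n itself is valid *)
  assert (not (is_valid_substring s 3 3));  (* 3+3 = 6 is not a position *)
  assert (String.sub s 1 3 = "ame")
```

Note that position n is valid while index n is not: an empty substring may start at the very end of the string.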
40
41 Unicode text. Strings being arbitrary sequences of bytes, they can hold
42 any kind of textual encoding. However the recommended encoding for
43 storing Unicode text in OCaml strings is UTF-8. This is the encoding
44 used by Unicode escapes in string literals. For example the string
45 "\u{1F42B}" is the UTF-8 encoding of the Unicode character U+1F42B.
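
As a short illustration of the byte-oriented view, the sketch below checks the length and the raw bytes of that literal. The name camel is chosen here for illustration only.

```ocaml
let camel = "\u{1F42B}"

let () =
  (* String.length counts bytes, not Unicode characters. *)
  assert (String.length camel = 4);
  (* The escape expands to the four UTF-8 bytes of U+1F42B. *)
  assert (camel = "\xF0\x9F\x90\xAB")
```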
46
47 Past mutability. OCaml strings used to be modifiable in place, for in‐
48 stance via the String.set and String.blit functions. This use is nowa‐
49 days only possible when the compiler is put in "unsafe-string" mode by
50 giving the -unsafe-string command-line option. This compatibility mode
51 makes the types string and bytes (see Bytes.t ) interchangeable so that
52 functions expecting byte sequences can also accept strings as arguments
53 and modify them.
54
55 The distinction between bytes and string was introduced in OCaml 4.02,
56 and the "unsafe-string" compatibility mode was the default until OCaml
57 4.05. Starting with 4.06, the compatibility mode is opt-in; we intend
58 to remove the option in the future.
59
60 The labeled version of this module can be used as described in the Std‐
61 Labels module.
62
63
64
65
66
67
68
69 Strings
70 type t = string
71
72
73 The type for strings.
74
75
76
77 val make : int -> char -> string
78
79
80 make n c is a string of length n with each index holding the character
81 c .
82
83
84 Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
85
86
87
88 val init : int -> (int -> char) -> string
89
90
91 init n f is a string of length n with index i holding the character f i
92 (called in increasing index order).
93
94
95 Since 4.02.0
96
97
98 Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
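
A brief usage sketch for make and init follows; the names dashes and alphabet are illustrative only.

```ocaml
let dashes = String.make 3 '-'

(* f is called with indices 0, 1, ..., 25 in that order. *)
let alphabet = String.init 26 (fun i -> Char.chr (Char.code 'a' + i))

let () =
  assert (dashes = "---");
  assert (String.length alphabet = 26);
  assert (String.sub alphabet 0 5 = "abcde");
  (* A negative length is rejected with Invalid_argument. *)
  assert (try ignore (String.make (-1) 'x'); false
          with Invalid_argument _ -> true)
```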
99
100
101
102 val length : string -> int
103
104
105 length s is the length (number of bytes/characters) of s .
106
107
108
109 val get : string -> int -> char
110
111
112 get s i is the character at index i in s . This is the same as writing
113 s.[i] .
114
115
Raises Invalid_argument if i is not an index of s .

117
118
119
120
121 Concatenating
122 Note. The (^) binary operator concatenates two strings.
123
124 val concat : string -> string list -> string
125
126
127 concat sep ss concatenates the list of strings ss , inserting the sepa‐
128 rator string sep between each.
129
130
131 Raises Invalid_argument if the result is longer than
132 Sys.max_string_length bytes.
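
For illustration, a minimal sketch of concat and (^); the names csv and greeting are illustrative only.

```ocaml
let csv = String.concat "," ["a"; "b"; "c"]
let greeting = "Hello, " ^ "world"

let () =
  assert (csv = "a,b,c");
  assert (greeting = "Hello, world");
  (* No separator is added for an empty or singleton list. *)
  assert (String.concat "," [] = "");
  assert (String.concat "," ["only"] = "only")
```

When joining many pieces, a single concat call is usually preferable to repeated (^), since each (^) allocates a new string.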
133
134
135
136
137 Predicates and comparisons
138 val equal : t -> t -> bool
139
140
141 equal s0 s1 is true if and only if s0 and s1 are character-wise equal.
142
143
144 Since 4.03.0 (4.05.0 in StringLabels)
145
146
147
148 val compare : t -> t -> int
149
150
compare s0 s1 sorts s0 and s1 in lexicographical order. compare behaves
like Stdlib.compare on strings but may be more efficient.
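
The following sketch shows equal and compare in use, including the effect of byte-wise ordering on mixed-case input. The name sorted is illustrative only.

```ocaml
let sorted = List.sort String.compare ["pear"; "apple"; "Zebra"; "app"]

let () =
  (* Comparison is byte-wise, so 'Z' (0x5A) sorts before 'a' (0x61),
     and a proper prefix sorts before the longer string. *)
  assert (sorted = ["Zebra"; "app"; "apple"; "pear"]);
  assert (String.equal "abc" "abc");
  assert (String.compare "abc" "abd" < 0);
  assert (String.compare "abc" "abc" = 0)
```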
153
154
155
156 val contains_from : string -> int -> char -> bool
157
158
159 contains_from s start c is true if and only if c appears in s after po‐
160 sition start .
161
162
163 Raises Invalid_argument if start is not a valid position in s .
164
165
166
167 val rcontains_from : string -> int -> char -> bool
168
169
170 rcontains_from s stop c is true if and only if c appears in s before
171 position stop+1 .
172
173
174 Raises Invalid_argument if stop < 0 or stop+1 is not a valid position
175 in s .
176
177
178
179 val contains : string -> char -> bool
180
181
182 contains s c is String.contains_from s 0 c .
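
The three predicates differ only in where the search starts and in which direction it runs. A minimal sketch, with the name s chosen for illustration:

```ocaml
let s = "hello"

let () =
  assert (String.contains s 'e');
  (* The search starts at index 2, so the 'e' at index 1 is not seen. *)
  assert (not (String.contains_from s 2 'e'));
  (* Position 5 = length s is valid and yields an empty search. *)
  assert (not (String.contains_from s 5 'o'));
  (* rcontains_from s 1 c only looks at indices 1 and 0. *)
  assert (not (String.rcontains_from s 1 'l'));
  assert (String.rcontains_from s 2 'l');
  (* 6 is not a position of s. *)
  assert (try ignore (String.contains_from s 6 'o'); false
          with Invalid_argument _ -> true)
```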
183
184
185
186
187 Extracting substrings
188 val sub : string -> int -> int -> string
189
190
191 sub s pos len is a string of length len , containing the substring of s
192 that starts at position pos and has length len .
193
194
195 Raises Invalid_argument if pos and len do not designate a valid sub‐
196 string of s .
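
A short sketch of sub, together with a non-raising wrapper. The wrapper sub_opt is hypothetical and not part of the String module.

```ocaml
(* Hypothetical wrapper: returns None instead of raising
   Invalid_argument when pos and len are not a valid substring. *)
let sub_opt s pos len =
  try Some (String.sub s pos len) with Invalid_argument _ -> None

let () =
  assert (String.sub "hello world" 6 5 = "world");
  (* An empty substring at the end position is valid. *)
  assert (sub_opt "hello" 5 0 = Some "");
  (* 3 + 3 = 6 is past the end of a 5-byte string. *)
  assert (sub_opt "hello" 3 3 = None)
```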
197
198
199
200 val split_on_char : char -> string -> string list
201
202
203 split_on_char sep s is the list of all (possibly empty) substrings of s
204 that are delimited by the character sep .
205
206 The function's result is specified by the following invariants:
207
- The list is not empty.

- Concatenating its elements using sep as a separator returns a string
  equal to the input (concat (make 1 sep) (split_on_char sep s) = s).

- No string in the result contains the sep character.
215
216
217
218 Since 4.04.0 (4.05.0 in StringLabels)
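
The invariants can be checked on a few inputs. This is a minimal sketch; the helper roundtrips is hypothetical and only restates the second invariant.

```ocaml
(* Hypothetical helper: checks that joining the pieces with sep
   gives back the input string. *)
let roundtrips sep s =
  String.concat (String.make 1 sep) (String.split_on_char sep s) = s

let () =
  (* Adjacent separators produce empty substrings. *)
  assert (String.split_on_char ',' "a,b,,c" = ["a"; "b"; ""; "c"]);
  (* The result is never the empty list, even for the empty string. *)
  assert (String.split_on_char ',' "" = [""]);
  (* Leading and trailing separators also yield empty substrings. *)
  assert (String.split_on_char '/' "/usr/" = [""; "usr"; ""]);
  assert (roundtrips ';' "x=1;y=2;")
```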
219
220
221
222
223 Transforming
224 val map : (char -> char) -> string -> string
225
226
227 map f s is the string resulting from applying f to all the characters
228 of s in increasing order.
229
230
231 Since 4.00.0
232
233
234
235 val mapi : (int -> char -> char) -> string -> string
236
237
238 mapi f s is like String.map but the index of the character is also
239 passed to f .
240
241
242 Since 4.02.0
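
A brief sketch of map and mapi; the names shout and zigzag are illustrative only.

```ocaml
let shout = String.map Char.uppercase_ascii "hello"

(* Uppercase only the characters at even indices. *)
let zigzag =
  String.mapi
    (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c)
    "hello"

let () =
  assert (shout = "HELLO");
  assert (zigzag = "HeLlO")
```

Both functions return a new string of the same length; the argument is left unchanged.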
243
244
245
246 val trim : string -> string
247
248
249 trim s is s without leading and trailing whitespace. Whitespace charac‐
250 ters are: ' ' , '\x0C' (form feed), '\n' , '\r' , and '\t' .
251
252
253 Since 4.00.0
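
For illustration, a few cases of trim; the name trimmed is illustrative only.

```ocaml
let trimmed = String.trim "  \t hello world \r\n"

let () =
  (* Only leading and trailing whitespace is removed. *)
  assert (trimmed = "hello world");
  (* A string made only of whitespace trims to the empty string. *)
  assert (String.trim " \n\t" = "");
  assert (String.trim "no-op" = "no-op")
```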
254
255
256
257 val escaped : string -> string
258
259
260 escaped s is s with special characters represented by escape sequences,
261 following the lexical conventions of OCaml.
262
All characters outside the US-ASCII printable range [0x20;0x7E] are
escaped, as well as backslash (0x5C) and double-quote (0x22).
265
266 The function Scanf.unescaped is a left inverse of escaped , i.e.
267 Scanf.unescaped (escaped s) = s for any string s (unless escaped s
268 fails).
269
270
271 Raises Invalid_argument if the result is longer than
272 Sys.max_string_length bytes.
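
The sketch below shows escaped on a string containing a tab, double quotes and a backslash, and checks the left-inverse property with Scanf.unescaped. The names raw and esc are illustrative only.

```ocaml
let raw = "tab\there \"quoted\" back\\slash"
let esc = String.escaped raw

let () =
  (* The tab, the double quotes and the backslash are each escaped. *)
  assert (esc = "tab\\there \\\"quoted\\\" back\\\\slash");
  (* Scanf.unescaped undoes the escaping. *)
  assert (Scanf.unescaped esc = raw);
  (* Printable ASCII without backslash or double quote is unchanged. *)
  assert (String.escaped "plain text" = "plain text")
```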
273
274
275
276 val uppercase_ascii : string -> string
277
278
279 uppercase_ascii s is s with all lowercase letters translated to upper‐
280 case, using the US-ASCII character set.
281
282
283 Since 4.03.0 (4.05.0 in StringLabels)
284
285
286
287 val lowercase_ascii : string -> string
288
289
290 lowercase_ascii s is s with all uppercase letters translated to lower‐
291 case, using the US-ASCII character set.
292
293
294 Since 4.03.0 (4.05.0 in StringLabels)
295
296
297
298 val capitalize_ascii : string -> string
299
300
301 capitalize_ascii s is s with the first character set to uppercase, us‐
302 ing the US-ASCII character set.
303
304
305 Since 4.03.0 (4.05.0 in StringLabels)
306
307
308
309 val uncapitalize_ascii : string -> string
310
311
312 uncapitalize_ascii s is s with the first character set to lowercase,
313 using the US-ASCII character set.
314
315
316 Since 4.03.0 (4.05.0 in StringLabels)
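
A minimal sketch of the four US-ASCII case functions. The last assertion assumes the input is UTF-8 text, which the functions themselves do not require.

```ocaml
let () =
  assert (String.uppercase_ascii "ocaml 4" = "OCAML 4");
  assert (String.lowercase_ascii "OCaml" = "ocaml");
  assert (String.capitalize_ascii "string" = "String");
  assert (String.uncapitalize_ascii "String" = "string");
  (* Bytes outside US-ASCII are untouched: the two UTF-8 bytes of
     U+00E9 (e with acute accent) pass through unchanged. *)
  assert (String.uppercase_ascii "caf\xC3\xA9" = "CAF\xC3\xA9")
```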
317
318
319
320
321 Traversing
322 val iter : (char -> unit) -> string -> unit
323
324
325 iter f s applies function f in turn to all the characters of s . It is
326 equivalent to f s.[0]; f s.[1]; ...; f s.[length s - 1]; () .
327
328
329
330 val iteri : (int -> char -> unit) -> string -> unit
331
332
333 iteri is like String.iter , but the function is also given the corre‐
334 sponding character index.
335
336
337 Since 4.00.0
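
Since iter and iteri return unit, results are typically accumulated in a reference or a buffer. A minimal sketch; both helpers are hypothetical and not part of the String module.

```ocaml
(* Hypothetical helper: number of occurrences of c in s. *)
let count_char c s =
  let n = ref 0 in
  String.iter (fun c' -> if c' = c then incr n) s;
  !n

(* Hypothetical helper: indices at which c occurs, in increasing order. *)
let indices_of c s =
  let acc = ref [] in
  String.iteri (fun i c' -> if c' = c then acc := i :: !acc) s;
  List.rev !acc

let () =
  assert (count_char 'l' "hello world" = 3);
  assert (indices_of 'o' "hello world" = [4; 7])
```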
338
339
340
341
342 Searching
343 val index_from : string -> int -> char -> int
344
345
346 index_from s i c is the index of the first occurrence of c in s after
347 position i .
348
349
350 Raises Not_found if c does not occur in s after position i .
351
352
353 Raises Invalid_argument if i is not a valid position in s .
354
355
356
357 val index_from_opt : string -> int -> char -> int option
358
359
360 index_from_opt s i c is the index of the first occurrence of c in s af‐
361 ter position i (if any).
362
363
364 Since 4.05
365
366
367 Raises Invalid_argument if i is not a valid position in s .
368
369
370
371 val rindex_from : string -> int -> char -> int
372
373
374 rindex_from s i c is the index of the last occurrence of c in s before
375 position i+1 .
376
377
378 Raises Not_found if c does not occur in s before position i+1 .
379
380
381 Raises Invalid_argument if i+1 is not a valid position in s .
382
383
384
385 val rindex_from_opt : string -> int -> char -> int option
386
387
388 rindex_from_opt s i c is the index of the last occurrence of c in s be‐
389 fore position i+1 (if any).
390
391
392 Since 4.05
393
394
395 Raises Invalid_argument if i+1 is not a valid position in s .
396
397
398
399 val index : string -> char -> int
400
401
402 index s c is String.index_from s 0 c .
403
404
405
406 val index_opt : string -> char -> int option
407
408
409 index_opt s c is String.index_from_opt s 0 c .
410
411
412 Since 4.05
413
414
415
416 val rindex : string -> char -> int
417
418
419 rindex s c is String.rindex_from s (length s - 1) c .
420
421
422
423 val rindex_opt : string -> char -> int option
424
425
426 rindex_opt s c is String.rindex_from_opt s (length s - 1) c .
427
428
429 Since 4.05
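
The search functions can be compared side by side. In this sketch, split_once is a hypothetical helper built on index_opt and sub; it is not part of the String module.

```ocaml
(* Hypothetical helper: split s at the first occurrence of sep,
   dropping the separator itself. *)
let split_once sep s =
  match String.index_opt s sep with
  | None -> None
  | Some i ->
    Some (String.sub s 0 i, String.sub s (i + 1) (String.length s - i - 1))

let () =
  let path = "a.b.c" in
  assert (String.index path '.' = 1);
  (* Searching forward from index 2 skips the first dot. *)
  assert (String.index_from path 2 '.' = 3);
  assert (String.rindex path '.' = 3);
  (* Searching backward from index 2 skips the last dot. *)
  assert (String.rindex_from path 2 '.' = 1);
  assert (String.index_opt path '/' = None);
  assert (try ignore (String.index path '/'); false with Not_found -> true);
  assert (split_once '=' "key=a=b" = Some ("key", "a=b"));
  assert (split_once '=' "novalue" = None)
```

The _opt variants avoid the Not_found exception, which makes them easier to combine with pattern matching.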
430
431
432
433
434 Converting
435 val to_seq : t -> char Seq.t
436
437
438 to_seq s is a sequence made of the string's characters in increasing
439 order. In "unsafe-string" mode, modifications of the string during it‐
440 eration will be reflected in the iterator.
441
442
443 Since 4.07
444
445
446
447 val to_seqi : t -> (int * char) Seq.t
448
449
450 to_seqi s is like String.to_seq but also tuples the corresponding in‐
451 dex.
452
453
454 Since 4.07
455
456
457
458 val of_seq : char Seq.t -> t
459
460
461 of_seq s is a string made of the sequence's characters.
462
463
464 Since 4.07
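
Sequences make it possible to reuse the Seq and List functions on strings. A minimal sketch; rev and keep_digits are hypothetical helpers, and rev reverses bytes rather than Unicode characters.

```ocaml
(* Hypothetical helper: reverse the bytes of s. *)
let rev s =
  String.to_seq s |> List.of_seq |> List.rev |> List.to_seq |> String.of_seq

(* Hypothetical helper: keep only the ASCII digits of s. *)
let keep_digits s =
  String.to_seq s
  |> Seq.filter (fun c -> c >= '0' && c <= '9')
  |> String.of_seq

let () =
  assert (rev "abc" = "cba");
  assert (keep_digits "tel: 555-0100" = "5550100");
  assert (List.of_seq (String.to_seqi "ab") = [(0, 'a'); (1, 'b')])
```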
465
466
467
468
469 Deprecated functions
470 val create : int -> bytes
471
472 Deprecated. This is a deprecated alias of Bytes.create / BytesLa‐
473 bels.create .
474
475
476
477 create n returns a fresh byte sequence of length n . The sequence is
478 uninitialized and contains arbitrary bytes.
479
480
481 Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
482
483
484
485 val set : bytes -> int -> char -> unit
486
487 Deprecated. This is a deprecated alias of Bytes.set / BytesLabels.set
488 .
489
490
491
492 set s n c modifies byte sequence s in place, replacing the byte at in‐
493 dex n with c . You can also write s.[n] <- c instead of set s n c .
494
495
496 Raises Invalid_argument if n is not a valid index in s .
497
498
499
500 val blit : string -> int -> bytes -> int -> int -> unit
501
502
blit src src_pos dst dst_pos len copies len bytes from the string src ,
starting at index src_pos , to byte sequence dst , starting at index
dst_pos .
506
507
508 Raises Invalid_argument if src_pos and len do not designate a valid
509 range of src , or if dst_pos and len do not designate a valid range of
510 dst .
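
A short sketch of blit copying part of a string into a byte sequence. The name filled is illustrative only, and the destination is created with Bytes.make from the Bytes module.

```ocaml
let filled =
  let dst = Bytes.make 8 '.' in
  (* Copy "cde" (3 bytes from index 2 of the source) to index 1 of dst. *)
  String.blit "abcdef" 2 dst 1 3;
  Bytes.to_string dst

let () =
  assert (filled = ".cde....");
  (* Index 1 plus length 3 runs past the end of the 3-byte source. *)
  assert (try String.blit "abc" 1 (Bytes.create 8) 0 3; false
          with Invalid_argument _ -> true)
```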
511
512
513
514 val copy : string -> string
515
516 Deprecated. Because strings are immutable, it doesn't make much sense
517 to make identical copies of them.
518
519
520 Return a copy of the given string.
521
522
523
524 val fill : bytes -> int -> int -> char -> unit
525
526 Deprecated. This is a deprecated alias of Bytes.fill / BytesLa‐
527 bels.fill .
528
529
530
531 fill s pos len c modifies byte sequence s in place, replacing len bytes
532 by c , starting at pos .
533
534
535 Raises Invalid_argument if pos and len do not designate a valid sub‐
536 string of s .
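
New code would normally call the Bytes functions directly instead of these deprecated aliases. A minimal sketch of that style, with the name padded chosen for illustration:

```ocaml
let padded =
  let b = Bytes.create 5 in   (* contents are arbitrary at this point *)
  Bytes.fill b 0 5 '*';       (* initialize every byte *)
  Bytes.set b 2 '-';          (* replace the byte at index 2 *)
  Bytes.to_string b           (* copy into an immutable string *)

let () = assert (padded = "**-**")
```

Building the value as bytes and converting once at the end keeps the mutable phase local, so the resulting string is never observed while it changes.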
537
538
539
540 val uppercase : string -> string
541
542 Deprecated. Functions operating on Latin-1 character set are depre‐
543 cated.
544
545
546 Return a copy of the argument, with all lowercase letters translated to
547 uppercase, including accented letters of the ISO Latin-1 (8859-1) char‐
548 acter set.
549
550
551
552 val lowercase : string -> string
553
554 Deprecated. Functions operating on Latin-1 character set are depre‐
555 cated.
556
557
558 Return a copy of the argument, with all uppercase letters translated to
559 lowercase, including accented letters of the ISO Latin-1 (8859-1) char‐
560 acter set.
561
562
563
564 val capitalize : string -> string
565
566 Deprecated. Functions operating on Latin-1 character set are depre‐
567 cated.
568
569
Return a copy of the argument, with the first character set to
uppercase, using the ISO Latin-1 (8859-1) character set.
572
573
574
575 val uncapitalize : string -> string
576
577 Deprecated. Functions operating on Latin-1 character set are depre‐
578 cated.
579
580
581 Return a copy of the argument, with the first character set to lower‐
582 case, using the ISO Latin-1 (8859-1) character set.
583
584
585
586
587
OCamldoc                          2021-07-22                          String(3)