StringLabels(3)                  OCaml library                 StringLabels(3)

StringLabels - Strings.

Module StringLabels

Module StringLabels
: sig end

Strings.

A string s of length n is an indexable and immutable sequence of n
bytes. For historical reasons these bytes are referred to as characters.

The semantics of string functions is defined in terms of indices and
positions. These are depicted and described as follows.

positions  0   1   2   3   4    n-1    n
           +---+---+---+---+     +-----+
  indices  | 0 | 1 | 2 | 3 | ... | n-1 |
           +---+---+---+---+     +-----+

-An index i of s is an integer in the range [0;n-1]. It represents
the i-th byte (character) of s, which can be accessed using the
constant-time string indexing operator s.[i].

-A position i of s is an integer in the range [0;n]. It represents
either the point at the beginning of the string, or the point between
two indices, or the point at the end of the string. The i-th byte index
is between position i and i+1.

Two integers start and len are said to define a valid substring of s if
len >= 0 and start, start+len are positions of s.

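The difference between indices and positions can be illustrated with a
small sketch. The helper is_valid_substring is hypothetical (it is not
part of this module); it only restates the definition above.

```ocaml
let s = "camel"
let n = StringLabels.length s            (* n = 5 *)

(* Indices range over [0;n-1]. *)
let first = s.[0]
let last = s.[n - 1]

(* Hypothetical helper: (pos, len) is a valid substring when len >= 0
   and both pos and pos+len are positions, i.e. lie in [0;n]. *)
let is_valid_substring s ~pos ~len =
  let n = StringLabels.length s in
  len >= 0 && pos >= 0 && pos <= n - len

(* Position n is valid, so an empty substring can be taken at the end. *)
let empty_at_end = StringLabels.sub s ~pos:n ~len:0
let middle = StringLabels.sub s ~pos:1 ~len:3
```

Here n is a position of s but not an index: s.[n] would raise
Invalid_argument, while sub s ~pos:n ~len:0 is the empty string.
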
Unicode text. Strings being arbitrary sequences of bytes, they can hold
any kind of textual encoding. However the recommended encoding for
storing Unicode text in OCaml strings is UTF-8. This is the encoding
used by Unicode escapes in string literals. For example the string
"\u{1F42B}" is the UTF-8 encoding of the Unicode character U+1F42B.

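As a short illustration, and assuming a compiler that accepts \u{...}
escapes in string literals, the functions of this module see the bytes
of the encoding rather than a single character:

```ocaml
let camel = "\u{1F42B}"

(* U+1F42B takes four bytes in UTF-8, so length counts 4, not 1. *)
let byte_length = StringLabels.length camel

(* The same string written byte by byte. *)
let same = StringLabels.equal camel "\xF0\x9F\x90\xAB"
```
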
Past mutability. OCaml strings used to be modifiable in place, for
instance via the String.set and String.blit functions. This use is
nowadays only possible when the compiler is put in "unsafe-string" mode
by giving the -unsafe-string command-line option. This compatibility
mode makes the types string and bytes (see Bytes.t) interchangeable so
that functions expecting byte sequences can also accept strings as
arguments and modify them.

The distinction between bytes and string was introduced in OCaml 4.02,
and the "unsafe-string" compatibility mode was the default until OCaml
4.05. Starting with 4.06, the compatibility mode is opt-in; we intend
to remove the option in the future.

The labeled version of this module can be used as described in the
StdLabels module.

Strings
type t = string

The type for strings.

val make : int -> char -> string

make n c is a string of length n with each index holding the character
c.

Raises Invalid_argument if n < 0 or n > Sys.max_string_length.

val init : int -> f:(int -> char) -> string

init n ~f is a string of length n with index i holding the character
f i (called in increasing index order).

Since 4.02.0

Raises Invalid_argument if n < 0 or n > Sys.max_string_length.

val length : string -> int

length s is the length (number of bytes/characters) of s.

val get : string -> int -> char

get s i is the character at index i in s. This is the same as writing
s.[i].

Raises Invalid_argument if i is not an index of s.

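A minimal sketch of the four functions above. The wrapper make_checked
is a hypothetical helper, shown only to illustrate handling of
Invalid_argument.

```ocaml
let dashes = StringLabels.make 3 '-'

(* f is called with 0, 1, ..., 25 in that order. *)
let alphabet =
  StringLabels.init 26 ~f:(fun i -> Char.chr (Char.code 'a' + i))

let third = StringLabels.get alphabet 2      (* same as alphabet.[2] *)

(* Hypothetical helper: turn the exception into an option. *)
let make_checked n c =
  match StringLabels.make n c with
  | s -> Some s
  | exception Invalid_argument _ -> None
```
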
Concatenating
Note. The (^) binary operator concatenates two strings.

val concat : sep:string -> string list -> string

concat ~sep ss concatenates the list of strings ss, inserting the
separator string sep between each.

Raises Invalid_argument if the result is longer than
Sys.max_string_length bytes.

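For instance, a sketch showing that the separator is only placed
between elements, never before the first or after the last:

```ocaml
let csv = StringLabels.concat ~sep:"," ["a"; "b"; "c"]
let single = StringLabels.concat ~sep:"," ["a"]
let none = StringLabels.concat ~sep:"," []

(* For two strings, (^) is usually the more direct choice. *)
let joined = "foo" ^ "bar"
```
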
Predicates and comparisons
val equal : t -> t -> bool

equal s0 s1 is true if and only if s0 and s1 are character-wise equal.

Since 4.05.0

val compare : t -> t -> int

compare s0 s1 sorts s0 and s1 in lexicographical order. compare
behaves like Stdlib.compare on strings but may be more efficient.

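The ordering is on bytes, not on any locale-aware collation. A small
sketch of what that means when sorting:

```ocaml
(* Uppercase ASCII letters sort before lowercase ones ('A' < 'a'). *)
let sorted =
  List.sort StringLabels.compare ["pear"; "Apple"; "apple"; "fig"]

(* A proper prefix sorts before the longer string. *)
let prefix_first = StringLabels.compare "ab" "abc" < 0
```
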
val contains_from : string -> int -> char -> bool

contains_from s start c is true if and only if c appears in s after
position start.

Raises Invalid_argument if start is not a valid position in s.

val rcontains_from : string -> int -> char -> bool

rcontains_from s stop c is true if and only if c appears in s before
position stop+1.

Raises Invalid_argument if stop < 0 or stop+1 is not a valid position
in s.

val contains : string -> char -> bool

contains s c is String.contains_from s 0 c.

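A sketch of the boundary cases, reading "after position start" as
"at index start or later" and "before position stop+1" as "at index
stop or earlier". The helper raises_invalid is hypothetical.

```ocaml
let s = "hello"

let from_start = StringLabels.contains_from s 0 'h'   (* index 0 is searched *)
let past_h = StringLabels.contains_from s 1 'h'       (* 'h' is before index 1 *)
let at_end = StringLabels.contains_from s 5 'o'       (* position 5 is valid *)

let up_to_3 = StringLabels.rcontains_from s 3 'o'     (* 'o' is at index 4 *)
let up_to_4 = StringLabels.rcontains_from s 4 'o'

(* Hypothetical helper: does the thunk raise Invalid_argument? *)
let raises_invalid f =
  match f () with
  | _ -> false
  | exception Invalid_argument _ -> true
```
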
Extracting substrings
val sub : string -> pos:int -> len:int -> string

sub s ~pos ~len is a string of length len, containing the substring of
s that starts at position pos and has length len.

Raises Invalid_argument if pos and len do not designate a valid
substring of s.

val split_on_char : sep:char -> string -> string list

split_on_char ~sep s is the list of all (possibly empty) substrings of
s that are delimited by the character sep.

The function's result is specified by the following invariants:

-The list is not empty.

-Concatenating its elements using sep as a separator returns a string
equal to the input ( concat ~sep:(make 1 sep)
(split_on_char ~sep s) = s ).

-No string in the result contains the sep character.

Since 4.05.0

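The invariants of split_on_char can be checked on small inputs. This
is only a sketch; roundtrip is a hypothetical helper that restates the
second invariant.

```ocaml
let word = StringLabels.sub "hello world" ~pos:6 ~len:5

(* Adjacent and trailing separators produce empty strings. *)
let parts = StringLabels.split_on_char ~sep:',' "a,,b,"

(* The result is never the empty list, even for the empty string. *)
let of_empty = StringLabels.split_on_char ~sep:',' ""

(* Hypothetical helper: joining the pieces gives back the input. *)
let roundtrip ~sep s =
  StringLabels.concat ~sep:(StringLabels.make 1 sep)
    (StringLabels.split_on_char ~sep s)
  = s
```
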
Transforming
val map : f:(char -> char) -> string -> string

map ~f s is the string resulting from applying f to all the characters
of s in increasing order.

Since 4.00.0

val mapi : f:(int -> char -> char) -> string -> string

mapi ~f s is like StringLabels.map but the index of the character is
also passed to f.

Since 4.02.0

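A brief sketch of both functions. The rot13 function is an
illustrative choice, not something provided by this module.

```ocaml
(* Rotate ASCII letters by 13 places; leave other bytes unchanged. *)
let rot13 c =
  match c with
  | 'a' .. 'z' -> Char.chr ((Char.code c - Char.code 'a' + 13) mod 26 + Char.code 'a')
  | 'A' .. 'Z' -> Char.chr ((Char.code c - Char.code 'A' + 13) mod 26 + Char.code 'A')
  | _ -> c

let secret = StringLabels.map ~f:rot13 "Hello"

(* mapi also receives the index: uppercase every even index. *)
let striped =
  StringLabels.mapi "camel"
    ~f:(fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c)
```
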
val trim : string -> string

trim s is s without leading and trailing whitespace. Whitespace
characters are: ' ', '\x0C' (form feed), '\n', '\r', and '\t'.

Since 4.00.0

val escaped : string -> string

escaped s is s with special characters represented by escape sequences,
following the lexical conventions of OCaml.

All characters outside the US-ASCII printable range [0x20;0x7E] are
escaped, as well as backslash (0x5C) and double-quote (0x22).

The function Scanf.unescaped is a left inverse of escaped, i.e.
Scanf.unescaped (escaped s) = s for any string s (unless escaped s
fails).

Raises Invalid_argument if the result is longer than
Sys.max_string_length bytes.

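A sketch of trim and escaped on small inputs. The literals are written
with OCaml escapes, so the escaped results look doubled in source form.

```ocaml
(* Only leading and trailing whitespace is removed. *)
let trimmed = StringLabels.trim "  \t hi there \n"

(* Six bytes: a, double quote, b, backslash, c, newline. *)
let raw = "a\"b\\c\n"
let esc = StringLabels.escaped raw

(* A byte outside [0x20;0x7E] becomes a decimal escape. *)
let high = StringLabels.escaped "\xE9"

(* Scanf.unescaped undoes escaped. *)
let back = Scanf.unescaped esc
```
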
val uppercase_ascii : string -> string

uppercase_ascii s is s with all lowercase letters translated to
uppercase, using the US-ASCII character set.

Since 4.05.0

val lowercase_ascii : string -> string

lowercase_ascii s is s with all uppercase letters translated to
lowercase, using the US-ASCII character set.

Since 4.05.0

val capitalize_ascii : string -> string

capitalize_ascii s is s with the first character set to uppercase,
using the US-ASCII character set.

Since 4.05.0

val uncapitalize_ascii : string -> string

uncapitalize_ascii s is s with the first character set to lowercase,
using the US-ASCII character set.

Since 4.05.0

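A short sketch of the four ASCII case functions. Bytes outside the
US-ASCII letters, such as those of a UTF-8 encoded accented letter, are
left untouched.

```ocaml
let upper = StringLabels.uppercase_ascii "hello, world"
let lower = StringLabels.lowercase_ascii "OCaml"
let cap = StringLabels.capitalize_ascii "ocaml"
let uncap = StringLabels.uncapitalize_ascii "OCaml"

(* "\xC3\xA9" is the UTF-8 encoding of U+00E9; it is not modified. *)
let accented = StringLabels.uppercase_ascii "caf\xC3\xA9"
```
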
Traversing
val iter : f:(char -> unit) -> string -> unit

iter ~f s applies function f in turn to all the characters of s. It
is equivalent to f s.[0]; f s.[1]; ...; f s.[length s - 1]; ().

val iteri : f:(int -> char -> unit) -> string -> unit

iteri is like StringLabels.iter, but the function is also given the
corresponding character index.

Since 4.00.0

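Since iter and iteri return unit, results are typically accumulated
through a reference. A sketch with two hypothetical helpers:

```ocaml
(* Hypothetical helper: number of occurrences of c in s. *)
let count_char s c =
  let n = ref 0 in
  StringLabels.iter s ~f:(fun x -> if x = c then incr n);
  !n

(* Hypothetical helper: indices of c in s, in increasing order. *)
let indices_of s c =
  let acc = ref [] in
  StringLabels.iteri s ~f:(fun i x -> if x = c then acc := i :: !acc);
  List.rev !acc
```
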
Searching
val index_from : string -> int -> char -> int

index_from s i c is the index of the first occurrence of c in s after
position i.

Raises Not_found if c does not occur in s after position i.

Raises Invalid_argument if i is not a valid position in s.

val index_from_opt : string -> int -> char -> int option

index_from_opt s i c is the index of the first occurrence of c in s
after position i (if any).

Since 4.05

Raises Invalid_argument if i is not a valid position in s.

val rindex_from : string -> int -> char -> int

rindex_from s i c is the index of the last occurrence of c in s before
position i+1.

Raises Not_found if c does not occur in s before position i+1.

Raises Invalid_argument if i+1 is not a valid position in s.

val rindex_from_opt : string -> int -> char -> int option

rindex_from_opt s i c is the index of the last occurrence of c in s
before position i+1 (if any).

Since 4.05

Raises Invalid_argument if i+1 is not a valid position in s.

val index : string -> char -> int

index s c is String.index_from s 0 c.

val index_opt : string -> char -> int option

index_opt s c is String.index_from_opt s 0 c.

Since 4.05

val rindex : string -> char -> int

rindex s c is String.rindex_from s (length s - 1) c.

val rindex_opt : string -> char -> int option

rindex_opt s c is String.rindex_from_opt s (length s - 1) c.

Since 4.05

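A sketch of the search functions on one string. The helper all_indices
is hypothetical; it relies on j+1 always being a valid position when j
is an index.

```ocaml
let s = "a,b,c"

let first = StringLabels.index s ','               (* forward from 0 *)
let second = StringLabels.index_from s 2 ','       (* forward from index 2 *)
let none_left = StringLabels.index_from_opt s 4 ','
let last = StringLabels.rindex s ','               (* backward from the end *)
let before_2 = StringLabels.rindex_from s 2 ','    (* backward from index 2 *)

(* Hypothetical helper: every index of c in s, in increasing order. *)
let all_indices s c =
  let rec go i acc =
    match StringLabels.index_from_opt s i c with
    | None -> List.rev acc
    | Some j -> go (j + 1) (j :: acc)
  in
  go 0 []
```

The _opt variants avoid the Not_found exception, but they still raise
Invalid_argument when the starting point is out of range.
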
Converting
val to_seq : t -> char Seq.t

to_seq s is a sequence made of the string's characters in increasing
order. In "unsafe-string" mode, modifications of the string during
iteration will be reflected in the iterator.

Since 4.07

val to_seqi : t -> (int * char) Seq.t

to_seqi s is like StringLabels.to_seq but also tuples the corresponding
index.

Since 4.07

val of_seq : char Seq.t -> t

of_seq s is a string made of the sequence's characters.

Since 4.07

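A sketch of round trips through Seq. The helpers reverse and
without_vowels are hypothetical, and they assume the Seq and List
conversion functions available since 4.07.

```ocaml
(* Hypothetical helper: reverse the bytes of a string via a list. *)
let reverse s =
  let chars = List.of_seq (StringLabels.to_seq s) in
  StringLabels.of_seq (List.to_seq (List.rev chars))

(* Hypothetical helper: drop lowercase ASCII vowels. *)
let without_vowels s =
  StringLabels.to_seq s
  |> Seq.filter (fun c -> not (StringLabels.contains "aeiou" c))
  |> StringLabels.of_seq

let indexed = List.of_seq (StringLabels.to_seqi "ab")
```
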
Deprecated functions
val create : int -> bytes

Deprecated. This is a deprecated alias of Bytes.create /
BytesLabels.create.

create n returns a fresh byte sequence of length n. The sequence is
uninitialized and contains arbitrary bytes.

Raises Invalid_argument if n < 0 or n > Sys.max_string_length.

val set : bytes -> int -> char -> unit

Deprecated. This is a deprecated alias of Bytes.set / BytesLabels.set.

set s n c modifies byte sequence s in place, replacing the byte at
index n with c. You can also write s.[n] <- c instead of set s n c.

Raises Invalid_argument if n is not a valid index in s.

val blit : src:string -> src_pos:int -> dst:bytes -> dst_pos:int ->
len:int -> unit

blit ~src ~src_pos ~dst ~dst_pos ~len copies len bytes from the string
src, starting at index src_pos, to byte sequence dst, starting at
index dst_pos.

Raises Invalid_argument if src_pos and len do not designate a valid
range of src, or if dst_pos and len do not designate a valid range of
dst.

val copy : string -> string

Deprecated. Because strings are immutable, it doesn't make much sense
to make identical copies of them.

Return a copy of the given string.

val fill : bytes -> pos:int -> len:int -> char -> unit

Deprecated. This is a deprecated alias of Bytes.fill /
BytesLabels.fill.

fill s ~pos ~len c modifies byte sequence s in place, replacing len
bytes by c, starting at pos.

Raises Invalid_argument if pos and len do not designate a valid
substring of s.

val uppercase : string -> string

Deprecated. Functions operating on Latin-1 character set are
deprecated.

Return a copy of the argument, with all lowercase letters translated to
uppercase, including accented letters of the ISO Latin-1 (8859-1)
character set.

val lowercase : string -> string

Deprecated. Functions operating on Latin-1 character set are
deprecated.

Return a copy of the argument, with all uppercase letters translated to
lowercase, including accented letters of the ISO Latin-1 (8859-1)
character set.

val capitalize : string -> string

Deprecated. Functions operating on Latin-1 character set are
deprecated.

Return a copy of the argument, with the first character set to
uppercase, using the ISO Latin-1 (8859-1) character set.

val uncapitalize : string -> string

Deprecated. Functions operating on Latin-1 character set are
deprecated.

Return a copy of the argument, with the first character set to
lowercase, using the ISO Latin-1 (8859-1) character set.

OCamldoc                          2021-07-22                   StringLabels(3)