String(3)                        OCaml library                       String(3)

NAME
       String - Strings.

Module
       Module   String

Documentation
       Module String
        : sig end

       Strings.

       A string s of length n is an indexable and immutable sequence of n
       bytes. For historical reasons these bytes are referred to as
       characters.

       The semantics of string functions is defined in terms of indices and
       positions. These are depicted and described as follows.

       positions  0   1   2   3   4    n-1    n
                  +---+---+---+---+     +-----+
         indices  | 0 | 1 | 2 | 3 | ... | n-1 |
                  +---+---+---+---+     +-----+

       - An index i of s is an integer in the range [0;n-1]. It represents
         the i-th byte (character) of s, which can be accessed using the
         constant-time string indexing operator s.[i].

       - A position i of s is an integer in the range [0;n]. It represents
         either the point at the beginning of the string, or the point
         between two indices, or the point at the end of the string. The
         i-th byte index is between positions i and i+1.

       Two integers start and len are said to define a valid substring of s
       if len >= 0 and start, start+len are positions of s.
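
       The definition above can be written as a small predicate. This is
       only a sketch: is_valid_substring is a hypothetical helper and not
       a function of the String module.

```ocaml
(* Hypothetical helper (not part of String): true if start and len
   define a valid substring of s, i.e. len >= 0 and both start and
   start+len are positions of s. *)
let is_valid_substring s start len =
  let n = String.length s in
  (* len <= n - start means start + len <= n, written to avoid overflow *)
  len >= 0 && start >= 0 && start <= n && len <= n - start

let () =
  assert (is_valid_substring "abc" 0 3);
  (* the empty substring at the end position is valid *)
  assert (is_valid_substring "abc" 3 0);
  (* start+len = 4 is not a position of a string of length 3 *)
  assert (not (is_valid_substring "abc" 1 3));
  assert (not (is_valid_substring "abc" (-1) 1))
```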

       Unicode text. Strings being arbitrary sequences of bytes, they can
       hold any kind of textual encoding. However, the recommended encoding
       for storing Unicode text in OCaml strings is UTF-8. This is the
       encoding used by Unicode escapes in string literals. For example, the
       string "\u{1F42B}" is the UTF-8 encoding of the Unicode character
       U+1F42B.
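
       As a brief illustration, the escape above denotes one Unicode
       character but occupies four bytes, so String.length counts bytes
       and not Unicode characters. The name camel is chosen here for the
       example only.

```ocaml
let camel = "\u{1F42B}"

let () =
  (* One Unicode character, four bytes once encoded as UTF-8. *)
  assert (String.length camel = 4);
  assert (camel = "\xF0\x9F\x90\xAB")
```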

       Past mutability. OCaml strings used to be modifiable in place, for
       instance via the String.set and String.blit functions. This use is
       nowadays only possible when the compiler is put in "unsafe-string"
       mode by giving the -unsafe-string command-line option. This
       compatibility mode makes the types string and bytes (see Bytes.t)
       interchangeable so that functions expecting byte sequences can also
       accept strings as arguments and modify them.

       The distinction between bytes and string was introduced in OCaml
       4.02, and the "unsafe-string" compatibility mode was the default
       until OCaml 4.05. Starting with 4.06, the compatibility mode is
       opt-in; we intend to remove the option in the future.

       The labeled version of this module can be used as described in the
       StdLabels module.

   Strings
       type t = string

       The type for strings.

       val make : int -> char -> string

       make n c is a string of length n with each index holding the
       character c.

       Raises Invalid_argument if n < 0 or n > Sys.max_string_length.

       val init : int -> (int -> char) -> string

       init n f is a string of length n with index i holding the character
       f i (called in increasing index order).

       Since 4.02.0

       Raises Invalid_argument if n < 0 or n > Sys.max_string_length.
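
       A short example of both constructors, given as a sketch. The names
       dashes and alphabet are illustrative.

```ocaml
let dashes = String.make 3 '-'

(* Index i holds the i-th lowercase letter. *)
let alphabet = String.init 26 (fun i -> Char.chr (Char.code 'a' + i))

let () =
  assert (dashes = "---");
  assert (alphabet = "abcdefghijklmnopqrstuvwxyz");
  (* A negative length is rejected. *)
  assert (match String.make (-1) 'x' with
          | _ -> false
          | exception Invalid_argument _ -> true)
```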

       val length : string -> int

       length s is the length (number of bytes/characters) of s.

       val get : string -> int -> char

       get s i is the character at index i in s. This is the same as writing
       s.[i].

       Raises Invalid_argument if i is not an index of s.
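
       The following sketch shows both access forms, together with one
       possible way to avoid the exception. get_opt is a hypothetical
       helper, not part of the String module.

```ocaml
(* Hypothetical helper: like String.get, but returns None instead of
   raising Invalid_argument when i is not an index of s. *)
let get_opt s i =
  if i >= 0 && i < String.length s then Some s.[i] else None

let () =
  assert (String.get "abc" 1 = 'b');
  assert ("abc".[1] = 'b');
  assert (get_opt "abc" 2 = Some 'c');
  (* 3 is a position of the string but not an index *)
  assert (get_opt "abc" 3 = None)
```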

   Concatenating
       Note. The (^) binary operator concatenates two strings.

       val concat : string -> string list -> string

       concat sep ss concatenates the list of strings ss, inserting the
       separator string sep between each.

       Raises Invalid_argument if the result is longer than
       Sys.max_string_length bytes.
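
       For instance, in a minimal example:

```ocaml
let () =
  assert ("foo" ^ "bar" = "foobar");
  assert (String.concat ", " ["a"; "b"; "c"] = "a, b, c");
  (* No separator is inserted for lists of zero or one element. *)
  assert (String.concat ", " ["a"] = "a");
  assert (String.concat ", " [] = "")
```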

   Predicates and comparisons
       val equal : t -> t -> bool

       equal s0 s1 is true if and only if s0 and s1 are character-wise
       equal.

       Since 4.03.0 (4.05.0 in StringLabels)

       val compare : t -> t -> int

       compare s0 s1 sorts s0 and s1 in lexicographical order. compare
       behaves like Stdlib.compare on strings but may be more efficient.
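
       A sketch of compare used as a sorting function. The order is
       byte-wise, so uppercase US-ASCII letters sort before lowercase
       ones. The name sorted is illustrative.

```ocaml
let sorted = List.sort String.compare ["b"; "ab"; "a"; "B"]

let () =
  (* Bytes are compared by code: 'B' (0x42) comes before 'a' (0x61). *)
  assert (sorted = ["B"; "a"; "ab"; "b"]);
  assert (String.equal "abc" "abc");
  assert (String.compare "abc" "abd" < 0)
```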

       val contains_from : string -> int -> char -> bool

       contains_from s start c is true if and only if c appears in s after
       position start.

       Raises Invalid_argument if start is not a valid position in s.

       val rcontains_from : string -> int -> char -> bool

       rcontains_from s stop c is true if and only if c appears in s before
       position stop+1.

       Raises Invalid_argument if stop < 0 or stop+1 is not a valid position
       in s.

       val contains : string -> char -> bool

       contains s c is String.contains_from s 0 c.
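
       The position arguments are easy to get wrong by one, so a few
       concrete cases may help. This is a sketch on an arbitrary example
       string.

```ocaml
let s = "hello"

let () =
  assert (String.contains s 'h');
  (* Only indices >= 1 are examined, so the leading 'h' is not seen. *)
  assert (not (String.contains_from s 1 'h'));
  (* length s is a valid position; nothing lies after it. *)
  assert (not (String.contains_from s 5 'o'));
  (* Only indices <= 1 are examined, and neither holds an 'l'. *)
  assert (not (String.rcontains_from s 1 'l'));
  assert (String.rcontains_from s 2 'l')
```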

   Extracting substrings
       val sub : string -> int -> int -> string

       sub s pos len is a string of length len, containing the substring of
       s that starts at position pos and has length len.

       Raises Invalid_argument if pos and len do not designate a valid
       substring of s.
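
       A minimal example, including one call that is rejected:

```ocaml
let s = "hello world"

let () =
  assert (String.sub s 6 5 = "world");
  (* An empty substring at the end position is valid. *)
  assert (String.sub s (String.length s) 0 = "");
  (* pos + len runs past the end of s, so this call is rejected. *)
  assert (match String.sub s 6 6 with
          | _ -> false
          | exception Invalid_argument _ -> true)
```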

       val split_on_char : char -> string -> string list

       split_on_char sep s is the list of all (possibly empty) substrings of
       s that are delimited by the character sep.

       The function's result is specified by the following invariants:

       - The list is not empty.

       - Concatenating its elements using sep as a separator returns a
         string equal to the input
         (concat (make 1 sep) (split_on_char sep s) = s).

       - No string in the result contains the sep character.

       Since 4.04.0 (4.05.0 in StringLabels)
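
       The invariants can be checked on a few inputs. The strings below
       are arbitrary examples.

```ocaml
let () =
  (* Adjacent separators produce empty substrings. *)
  assert (String.split_on_char ',' "a,,b" = ["a"; ""; "b"]);
  assert (String.split_on_char ',' ",a," = [""; "a"; ""]);
  (* The result is never empty, even for the empty string. *)
  assert (String.split_on_char ',' "" = [""]);
  (* Joining with the separator gives back the input. *)
  let s = "k1=v1;k2=v2;" in
  assert (String.concat ";" (String.split_on_char ';' s) = s)
```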

   Transforming
       val map : (char -> char) -> string -> string

       map f s is the string resulting from applying f to all the characters
       of s in increasing order.

       Since 4.00.0

       val mapi : (int -> char -> char) -> string -> string

       mapi f s is like String.map but the index of the character is also
       passed to f.

       Since 4.02.0
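
       A small sketch of both functions. The names shout and alternate
       are illustrative.

```ocaml
let shout = String.map Char.uppercase_ascii "abc"

(* Uppercase the characters at even indices only. *)
let alternate =
  String.mapi
    (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c)
    "hello"

let () =
  assert (shout = "ABC");
  assert (alternate = "HeLlO")
```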

       val trim : string -> string

       trim s is s without leading and trailing whitespace. Whitespace
       characters are: ' ', '\x0C' (form feed), '\n', '\r', and '\t'.

       Since 4.00.0
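
       For example:

```ocaml
let () =
  (* Only leading and trailing whitespace is removed. *)
  assert (String.trim "  \t hi there \r\n" = "hi there");
  (* A string made only of whitespace becomes empty. *)
  assert (String.trim " \n " = "")
```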

       val escaped : string -> string

       escaped s is s with special characters represented by escape
       sequences, following the lexical conventions of OCaml.

       All characters outside the US-ASCII printable range [0x20;0x7E] are
       escaped, as well as backslash (0x5C) and double-quote (0x22).

       The function Scanf.unescaped is a left inverse of escaped, i.e.
       Scanf.unescaped (escaped s) = s for any string s (unless escaped s
       fails).

       Raises Invalid_argument if the result is longer than
       Sys.max_string_length bytes.
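
       The sketch below escapes a string holding a double quote, a
       backslash and a newline, then checks the round trip through
       Scanf.unescaped. The input string is an arbitrary example.

```ocaml
(* Six bytes: a, double quote, b, backslash, c, newline. *)
let s = "a\"b\\c\n"

let () =
  (* Each special character becomes a two-character escape sequence. *)
  assert (String.escaped s = "a\\\"b\\\\c\\n");
  assert (String.length (String.escaped s) = 9);
  (* Bytes outside the printable range use a decimal escape. *)
  assert (String.escaped "\xC3\xA9" = "\\195\\169");
  (* Scanf.unescaped undoes the escaping. *)
  assert (Scanf.unescaped (String.escaped s) = s)
```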

       val uppercase_ascii : string -> string

       uppercase_ascii s is s with all lowercase letters translated to
       uppercase, using the US-ASCII character set.

       Since 4.03.0 (4.05.0 in StringLabels)

       val lowercase_ascii : string -> string

       lowercase_ascii s is s with all uppercase letters translated to
       lowercase, using the US-ASCII character set.

       Since 4.03.0 (4.05.0 in StringLabels)

       val capitalize_ascii : string -> string

       capitalize_ascii s is s with the first character set to uppercase,
       using the US-ASCII character set.

       Since 4.03.0 (4.05.0 in StringLabels)

       val uncapitalize_ascii : string -> string

       uncapitalize_ascii s is s with the first character set to lowercase,
       using the US-ASCII character set.

       Since 4.03.0 (4.05.0 in StringLabels)
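
       A short example of the four functions. The last assertion
       illustrates that bytes outside US-ASCII are left unchanged; the
       two escaped bytes are the UTF-8 encoding of U+00E9.

```ocaml
let () =
  assert (String.uppercase_ascii "hello" = "HELLO");
  assert (String.lowercase_ascii "HeLLo" = "hello");
  assert (String.capitalize_ascii "ocaml" = "Ocaml");
  assert (String.uncapitalize_ascii "OCaml" = "oCaml");
  (* Non-ASCII bytes are not translated. *)
  assert (String.uppercase_ascii "caf\xC3\xA9" = "CAF\xC3\xA9")
```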

   Traversing
       val iter : (char -> unit) -> string -> unit

       iter f s applies function f in turn to all the characters of s. It is
       equivalent to f s.[0]; f s.[1]; ...; f s.[length s - 1]; ().

       val iteri : (int -> char -> unit) -> string -> unit

       iteri is like String.iter, but the function is also given the
       corresponding character index.

       Since 4.00.0
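
       Since both functions return unit, results are usually accumulated
       in a reference or a buffer. A sketch follows; count_char and
       indices_of are hypothetical helpers, not part of the String
       module.

```ocaml
(* Hypothetical helper: number of occurrences of c in s. *)
let count_char c s =
  let n = ref 0 in
  String.iter (fun c' -> if c' = c then incr n) s;
  !n

(* Hypothetical helper: indices at which c occurs, in increasing order. *)
let indices_of c s =
  let acc = ref [] in
  String.iteri (fun i c' -> if c' = c then acc := i :: !acc) s;
  List.rev !acc

let () =
  assert (count_char 'a' "banana" = 3);
  assert (indices_of 'a' "banana" = [1; 3; 5])
```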

   Searching
       val index_from : string -> int -> char -> int

       index_from s i c is the index of the first occurrence of c in s after
       position i.

       Raises Not_found if c does not occur in s after position i.

       Raises Invalid_argument if i is not a valid position in s.

       val index_from_opt : string -> int -> char -> int option

       index_from_opt s i c is the index of the first occurrence of c in s
       after position i (if any).

       Since 4.05

       Raises Invalid_argument if i is not a valid position in s.

       val rindex_from : string -> int -> char -> int

       rindex_from s i c is the index of the last occurrence of c in s before
       position i+1.

       Raises Not_found if c does not occur in s before position i+1.

       Raises Invalid_argument if i+1 is not a valid position in s.

       val rindex_from_opt : string -> int -> char -> int option

       rindex_from_opt s i c is the index of the last occurrence of c in s
       before position i+1 (if any).

       Since 4.05

       Raises Invalid_argument if i+1 is not a valid position in s.

       val index : string -> char -> int

       index s c is String.index_from s 0 c.

       val index_opt : string -> char -> int option

       index_opt s c is String.index_from_opt s 0 c.

       Since 4.05

       val rindex : string -> char -> int

       rindex s c is String.rindex_from s (length s - 1) c.

       val rindex_opt : string -> char -> int option

       rindex_opt s c is String.rindex_from_opt s (length s - 1) c.

       Since 4.05
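
       The sketch below contrasts the raising and the option-returning
       variants, and combines index_opt with String.sub.
       split_key_value is a hypothetical helper, not part of the String
       module.

```ocaml
(* Hypothetical helper: split a key=value string at the first '='. *)
let split_key_value s =
  match String.index_opt s '=' with
  | None -> None
  | Some i ->
    let key = String.sub s 0 i in
    let value = String.sub s (i + 1) (String.length s - i - 1) in
    Some (key, value)

let () =
  assert (String.index "banana" 'n' = 2);
  assert (String.rindex "banana" 'n' = 4);
  (* The search starts at index 3, so the 'n' at index 2 is skipped. *)
  assert (String.index_from "banana" 3 'n' = 4);
  (* The backward search starts at index 3 and finds the 'n' at index 2. *)
  assert (String.rindex_from "banana" 3 'n' = 2);
  assert (String.index_opt "banana" 'z' = None);
  assert (match String.index "banana" 'z' with
          | _ -> false
          | exception Not_found -> true);
  assert (split_key_value "a=b=c" = Some ("a", "b=c"));
  assert (split_key_value "abc" = None)
```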

   Converting
       val to_seq : t -> char Seq.t

       to_seq s is a sequence made of the string's characters in increasing
       order. In "unsafe-string" mode, modifications of the string during
       iteration will be reflected in the iterator.

       Since 4.07

       val to_seqi : t -> (int * char) Seq.t

       to_seqi s is like String.to_seq but also tuples the corresponding
       index.

       Since 4.07

       val of_seq : char Seq.t -> t

       of_seq s is a string made of the sequence's characters.

       Since 4.07
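
       These conversions make it possible to process strings with the
       Seq module. A minimal sketch, assuming OCaml 4.07 or later for
       Seq.filter and List.of_seq; the names no_spaces and indexed are
       illustrative.

```ocaml
(* Drop the space characters by filtering the character sequence. *)
let no_spaces =
  String.to_seq "a b c"
  |> Seq.filter (fun c -> c <> ' ')
  |> String.of_seq

let indexed = List.of_seq (String.to_seqi "ab")

let () =
  assert (no_spaces = "abc");
  assert (indexed = [(0, 'a'); (1, 'b')])
```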

   Deprecated functions
       val create : int -> bytes

       Deprecated. This is a deprecated alias of Bytes.create /
       BytesLabels.create.

       create n returns a fresh byte sequence of length n. The sequence is
       uninitialized and contains arbitrary bytes.

       Raises Invalid_argument if n < 0 or n > Sys.max_string_length.

       val set : bytes -> int -> char -> unit

       Deprecated. This is a deprecated alias of Bytes.set / BytesLabels.set.

       set s n c modifies byte sequence s in place, replacing the byte at
       index n with c. You can also write s.[n] <- c instead of set s n c.

       Raises Invalid_argument if n is not a valid index in s.

       val blit : string -> int -> bytes -> int -> int -> unit

       blit src src_pos dst dst_pos len copies len bytes from the string src,
       starting at index src_pos, to byte sequence dst, starting at
       character number dst_pos.

       Raises Invalid_argument if src_pos and len do not designate a valid
       range of src, or if dst_pos and len do not designate a valid range of
       dst.
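
       In-place modification is done on values of type bytes. One
       possible pattern, sketched below, is to copy the string into a
       byte sequence with Bytes.of_string, modify the copy, and convert
       it back with Bytes.to_string. with_char_at is a hypothetical
       helper, not part of the String or Bytes modules.

```ocaml
(* Hypothetical helper: copy of s with the byte at index i replaced by c. *)
let with_char_at s i c =
  let b = Bytes.of_string s in   (* fresh, mutable copy of s *)
  Bytes.set b i c;
  Bytes.to_string b

let () =
  assert (with_char_at "cat" 0 'b' = "bat");
  (* String.blit copies from an immutable string into a bytes buffer. *)
  let buf = Bytes.make 5 '-' in
  String.blit "abc" 0 buf 1 3;
  assert (Bytes.to_string buf = "-abc-")
```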

       val copy : string -> string

       Deprecated. Because strings are immutable, it doesn't make much sense
       to make identical copies of them.

       Return a copy of the given string.

       val fill : bytes -> int -> int -> char -> unit

       Deprecated. This is a deprecated alias of Bytes.fill /
       BytesLabels.fill.

       fill s pos len c modifies byte sequence s in place, replacing len
       bytes by c, starting at pos.

       Raises Invalid_argument if pos and len do not designate a valid
       substring of s.

       val uppercase : string -> string

       Deprecated. Functions operating on the Latin-1 character set are
       deprecated.

       Return a copy of the argument, with all lowercase letters translated
       to uppercase, including accented letters of the ISO Latin-1 (8859-1)
       character set.

       val lowercase : string -> string

       Deprecated. Functions operating on the Latin-1 character set are
       deprecated.

       Return a copy of the argument, with all uppercase letters translated
       to lowercase, including accented letters of the ISO Latin-1 (8859-1)
       character set.

       val capitalize : string -> string

       Deprecated. Functions operating on the Latin-1 character set are
       deprecated.

       Return a copy of the argument, with the first character set to
       uppercase, using the ISO Latin-1 (8859-1) character set.

       val uncapitalize : string -> string

       Deprecated. Functions operating on the Latin-1 character set are
       deprecated.

       Return a copy of the argument, with the first character set to
       lowercase, using the ISO Latin-1 (8859-1) character set.

OCamldoc                          2021-07-22                         String(3)