String(3)                        OCaml library                       String(3)

NAME

6       String - Strings.
7

Module

9       Module   String
10

Documentation

12       Module String
13        : sig end
14
15
16       Strings.
17
18       A  string  s  of  length  n is an indexable and immutable sequence of n
19       bytes. For historical reasons these bytes are referred  to  as  charac‐
20       ters.
21
22       The  semantics  of  string functions is defined in terms of indices and
23       positions. These are depicted and described as follows.
24
       positions  0   1   2   3   4    n-1    n
                  +---+---+---+---+     +-----+
         indices  | 0 | 1 | 2 | 3 | ... | n-1 |
                  +---+---+---+---+     +-----+
27
       - An index i of s is an integer in the range [0;n-1]. It represents
       the i-th byte (character) of s, which can be accessed using the
       constant-time string indexing operator s.[i].

       - A position i of s is an integer in the range [0;n]. It represents
       either the point at the beginning of the string, or the point between
       two indices, or the point at the end of the string. The i-th byte
       index lies between positions i and i+1.
36
37
38       Two integers start and len are said to define a valid substring of s if
39       len >= 0 and start , start+len are positions of s .
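
       As a minimal sketch of the difference between indices and positions
       (the binding names below are illustrative and not part of the
       library): n is a valid position, so it may start an empty substring,
       but it is not a valid index.

```ocaml
let s = "hello"
let n = String.length s

(* Indices 0 .. n-1 designate bytes. *)
let first = String.get s 0
let last = String.get s (n - 1)

(* n is a position (the end of the string), so (n, 0) is a valid substring. *)
let empty_tail = String.sub s n 0
let middle = String.sub s 1 3

(* n is not an index: reading the byte at n is rejected. *)
let index_n_is_invalid =
  match String.get s n with
  | _ -> false
  | exception Invalid_argument _ -> true

(* start = 3, len = 3 gives start+len = 6, which is not a position of s. *)
let sub_past_end_is_invalid =
  match String.sub s 3 3 with
  | _ -> false
  | exception Invalid_argument _ -> true
```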
40
41       Unicode text. Strings being arbitrary sequences of bytes, they can hold
42       any kind of textual encoding.  However  the  recommended  encoding  for
43       storing  Unicode  text  in OCaml strings is UTF-8. This is the encoding
44       used by Unicode escapes in string  literals.  For  example  the  string
45       "\u{1F42B}" is the UTF-8 encoding of the Unicode character U+1F42B.
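
       A small illustration, assuming a compiler that accepts Unicode
       escapes in string literals: the escape expands to four UTF-8 bytes,
       and String.length counts bytes rather than Unicode characters.

```ocaml
let camel = "\u{1F42B}"

(* String.length counts bytes, not Unicode characters. *)
let byte_length = String.length camel

(* The same string written byte by byte. *)
let same_bytes = String.equal camel "\xF0\x9F\x90\xAB"
```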
46
47       Past  mutability.  Before  OCaml 4.02, strings used to be modifiable in
48       place like Bytes.t mutable sequences of bytes.   OCaml  4  had  various
49       compiler  flags and configuration options to support the transition pe‐
50       riod from mutable to immutable strings.  Those options  are  no  longer
51       available, and strings are now always immutable.
52
53       The labeled version of this module can be used as described in the Std‐
54       Labels module.
55
56
57
58
59
60
61
62   Strings
63       type t = string
64
65
66       The type for strings.
67
68
69
70       val make : int -> char -> string
71
72
73       make n c is a string of length n with each index holding the  character
74       c .
75
76
77       Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
78
79
80
81       val init : int -> (int -> char) -> string
82
83
84       init n f is a string of length n with index i holding the character f i
85       (called in increasing index order).
86
87
88       Since 4.02.0
89
90
91       Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
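
       A brief sketch of make and init; the binding names are illustrative
       only.

```ocaml
(* Three copies of the same character. *)
let dashes = String.make 3 '-'

(* Character at index i is computed from i. *)
let alphabet = String.init 26 (fun i -> Char.chr (Char.code 'a' + i))

(* A negative length is rejected. *)
let negative_length_rejected =
  match String.make (-1) 'x' with
  | _ -> false
  | exception Invalid_argument _ -> true
```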
92
93
94
95       val empty : string
96
97       The empty string.
98
99
100       Since 4.13.0
101
102
103
104       val of_bytes : bytes -> string
105
106       Return a new string that contains the same bytes as the given byte  se‐
107       quence.
108
109
110       Since 4.13.0
111
112
113
114       val to_bytes : string -> bytes
115
116       Return  a  new  byte sequence that contains the same bytes as the given
117       string.
118
119
120       Since 4.13.0
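
       Since both conversions return new values, later changes to the byte
       sequence should not be visible in the strings. A minimal sketch with
       illustrative names:

```ocaml
let s = "abc"

(* b is a fresh copy; mutating it leaves s untouched. *)
let b = String.to_bytes s
let () = Bytes.set b 0 'X'

(* s' is a fresh copy of b at this point. *)
let s' = String.of_bytes b

(* Mutating b again does not affect s'. *)
let () = Bytes.set b 1 'Y'
```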
121
122
123
124       val length : string -> int
125
126
127       length s is the length (number of bytes/characters) of s .
128
129
130
131       val get : string -> int -> char
132
133
134       get s i is the character at index i in s . This is the same as  writing
135       s.[i] .
136
137
138       Raises Invalid_argument if i not an index of s .

139
140
141
142
143   Concatenating
144       Note. The (^) binary operator concatenates two strings.
145
146       val concat : string -> string list -> string
147
148
149       concat sep ss concatenates the list of strings ss , inserting the sepa‐
150       rator string sep between each.
151
152
153       Raises   Invalid_argument   if    the    result    is    longer    than
154       Sys.max_string_length bytes.
155
156
157
158       val cat : string -> string -> string
159
160
161       cat s1 s2 concatenates s1 and s2 ( s1 ^ s2 ).
162
163
164       Since 4.13.0
165
166
167       Raises    Invalid_argument    if    the    result    is   longer   than
168       Sys.max_string_length bytes.
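
       A short sketch of the concatenation functions, assuming an OCaml
       version that provides cat (4.13.0 or later according to this page):

```ocaml
(* The separator appears only between elements. *)
let csv = String.concat ", " ["a"; "b"; "c"]
let single = String.concat ", " ["only"]
let none = String.concat ", " []

(* cat is the function form of the (^) operator. *)
let joined = String.cat "foo" "bar"
let same_as_operator = (joined = "foo" ^ "bar")
```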
169
170
171
172
173   Predicates and comparisons
174       val equal : t -> t -> bool
175
176
177       equal s0 s1 is true if and only if s0 and s1 are character-wise equal.
178
179
180       Since 4.03.0 (4.05.0 in StringLabels)
181
182
183
184       val compare : t -> t -> int
185
186
       compare s0 s1 sorts s0 and s1 in lexicographical order. compare behaves
       like Stdlib.compare on strings but may be more efficient.
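
       As an illustration, compare can be passed directly to List.sort. The
       ordering is on bytes, so in this sketch the capitalized word sorts
       before the lowercase ones.

```ocaml
(* Byte-wise lexicographical order: 'A' (0x41) sorts before 'a' (0x61). *)
let sorted = List.sort String.compare ["pear"; "Apple"; "apple"; "app"]

let equal_strings = String.equal "abc" ("ab" ^ "c")
```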
189
190
191
192       val starts_with : prefix:string -> string -> bool
193
194
195       starts_with ~prefix s is true if and only if s starts with prefix .
196
197
198       Since 4.13.0
199
200
201
202       val ends_with : suffix:string -> string -> bool
203
204
205       ends_with ~suffix s is true if and only if s ends with suffix .
206
207
208       Since 4.13.0
209
210
211
212       val contains_from : string -> int -> char -> bool
213
214
215       contains_from s start c is true if and only if c appears in s after po‐
216       sition start .
217
218
219       Raises Invalid_argument if start is not a valid position in s .
220
221
222
223       val rcontains_from : string -> int -> char -> bool
224
225
226       rcontains_from s stop c is true if and only if c appears  in  s  before
227       position stop+1 .
228
229
230       Raises  Invalid_argument  if stop < 0 or stop+1 is not a valid position
231       in s .
232
233
234
235       val contains : string -> char -> bool
236
237
238       contains s c is String.contains_from s 0 c .
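
       The predicates above can be sketched on a sample file name (the name
       and bindings are illustrative). starts_with and ends_with assume
       OCaml 4.13.0 or later.

```ocaml
(* Indices: "report" is 0..5, the dots are at 6 and 12, "txt" is 13..15. *)
let file = "report.final.txt"

let is_report = String.starts_with ~prefix:"report" file
let is_txt = String.ends_with ~suffix:".txt" file
let has_dot = String.contains file '.'

(* Only indices 13 and above are searched: no dot there. *)
let dot_from_13 = String.contains_from file 13 '.'

(* Only indices 0 to 5 are searched: no dot there either. *)
let dot_up_to_5 = String.rcontains_from file 5 '.'

(* Index 6 holds the first dot. *)
let dot_up_to_6 = String.rcontains_from file 6 '.'
```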
239
240
241
242
243   Extracting substrings
244       val sub : string -> int -> int -> string
245
246
247       sub s pos len is a string of length len , containing the substring of s
248       that starts at position pos and has length len .
249
250
251       Raises  Invalid_argument  if  pos and len do not designate a valid sub‐
252       string of s .
253
254
255
256       val split_on_char : char -> string -> string list
257
258
259       split_on_char sep s is the list of all (possibly empty) substrings of s
260       that are delimited by the character sep .
261
262       The function's result is specified by the following invariants:
263
264       -The list is not empty.
265
266       -Concatenating  its  elements using sep as a separator returns a string
267       equal to the input ( concat (make 1 sep)
268             (split_on_char sep s) = s ).
269
270       -No string in the result contains the sep character.
271
272
273
274       Since 4.04.0 (4.05.0 in StringLabels)
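
       The invariants can be checked on a few inputs. In this sketch,
       round_trip is a hypothetical helper, not a library function.

```ocaml
(* Adjacent and trailing separators produce empty strings. *)
let fields = String.split_on_char ',' "a,,b,"

(* Without any separator the input comes back as a single element. *)
let no_sep = String.split_on_char ',' "abc"

(* The result is never the empty list, even for the empty string. *)
let of_empty = String.split_on_char ',' ""

(* Second invariant: joining with the separator restores the input. *)
let round_trip s =
  String.concat (String.make 1 ',') (String.split_on_char ',' s) = s
```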
275
276
277
278
279   Transforming
280       val map : (char -> char) -> string -> string
281
282
283       map f s is the string resulting from applying f to all  the  characters
284       of s in increasing order.
285
286
287       Since 4.00.0
288
289
290
291       val mapi : (int -> char -> char) -> string -> string
292
293
294       mapi  f  s  is  like  String.map but the index of the character is also
295       passed to f .
296
297
298       Since 4.02.0
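
       A minimal sketch of map and mapi with illustrative bindings:

```ocaml
(* Apply the same function to every character. *)
let shout = String.map Char.uppercase_ascii "abc"

(* Uppercase only the characters at even indices. *)
let alternating =
  String.mapi
    (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c)
    "hello"
```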
299
300
301
302       val fold_left : ('a -> char -> 'a) -> 'a -> string -> 'a
303
304
305       fold_left f x s computes f (... (f (f x s.[0]) s.[1])  ...)  s.[n-1]  ,
306       where n is the length of the string s .
307
308
309       Since 4.13.0
310
311
312
313       val fold_right : (char -> 'a -> 'a) -> string -> 'a -> 'a
314
315
316       fold_right f s x computes f s.[0] (f s.[1] ( ... (f s.[n-1] x) ...))  ,
317       where n is the length of the string s .
318
319
320       Since 4.13.0
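
       The two folds can be illustrated as follows, assuming OCaml 4.13.0 or
       later. count_char and to_char_list are hypothetical helpers written
       for this example.

```ocaml
(* fold_left threads an accumulator from the first byte to the last. *)
let count_char c s =
  String.fold_left (fun acc x -> if x = c then acc + 1 else acc) 0 s

(* fold_right starts from the last byte, so consing preserves the order. *)
let to_char_list s = String.fold_right (fun c acc -> c :: acc) s []
```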
321
322
323
324       val for_all : (char -> bool) -> string -> bool
325
326
327       for_all p s checks if all characters in s satisfy the predicate p .
328
329
330       Since 4.13.0
331
332
333
334       val exists : (char -> bool) -> string -> bool
335
336
337       exists p s checks if at least one character of s satisfies  the  predi‐
338       cate p .
339
340
341       Since 4.13.0
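
       A short sketch of for_all and exists, assuming OCaml 4.13.0 or later.
       is_digit is a hypothetical predicate defined for the example.

```ocaml
let is_digit c = c >= '0' && c <= '9'

let all_digits = String.for_all is_digit "2023"
let not_all_digits = String.for_all is_digit "20x3"

(* for_all holds vacuously on the empty string; exists does not. *)
let for_all_empty = String.for_all is_digit ""
let exists_empty = String.exists is_digit ""

let some_digit = String.exists is_digit "a1"
```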
342
343
344
345       val trim : string -> string
346
347
348       trim s is s without leading and trailing whitespace. Whitespace charac‐
349       ters are: ' ' , '\x0C' (form feed), '\n' , '\r' , and '\t' .
350
351
352       Since 4.00.0
353
354
355
356       val escaped : string -> string
357
358
359       escaped s is s with special characters represented by escape sequences,
360       following the lexical conventions of OCaml.
361
       All characters outside the US-ASCII printable range [0x20;0x7E] are
       escaped, as well as backslash (0x5C) and double-quote (0x22).
364
365       The function Scanf.unescaped is  a  left  inverse  of  escaped  ,  i.e.
366       Scanf.unescaped  (escaped  s)  =  s  for any string s (unless escaped s
367       fails).
368
369
370       Raises   Invalid_argument   if    the    result    is    longer    than
371       Sys.max_string_length bytes.
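
       A minimal sketch of trim and escaped, including the round trip
       through Scanf.unescaped mentioned above. The sample strings are
       illustrative.

```ocaml
(* Leading and trailing whitespace is removed; inner whitespace is kept. *)
let trimmed = String.trim "  \t hello world \r\n"

(* raw holds six bytes: a, double-quote, b, backslash, c, newline. *)
let raw = "a\"b\\c\n"

(* The quote, the backslash and the newline each become two bytes. *)
let esc = String.escaped raw

(* Scanf.unescaped undoes the escaping. *)
let back = Scanf.unescaped esc
```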
372
373
374
375       val uppercase_ascii : string -> string
376
377
378       uppercase_ascii  s is s with all lowercase letters translated to upper‐
379       case, using the US-ASCII character set.
380
381
382       Since 4.03.0 (4.05.0 in StringLabels)
383
384
385
386       val lowercase_ascii : string -> string
387
388
389       lowercase_ascii s is s with all uppercase letters translated to  lower‐
390       case, using the US-ASCII character set.
391
392
393       Since 4.03.0 (4.05.0 in StringLabels)
394
395
396
397       val capitalize_ascii : string -> string
398
399
400       capitalize_ascii  s is s with the first character set to uppercase, us‐
401       ing the US-ASCII character set.
402
403
404       Since 4.03.0 (4.05.0 in StringLabels)
405
406
407
408       val uncapitalize_ascii : string -> string
409
410
411       uncapitalize_ascii s is s with the first character  set  to  lowercase,
412       using the US-ASCII character set.
413
414
415       Since 4.03.0 (4.05.0 in StringLabels)
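
       The four case functions only touch US-ASCII letters. In the sketch
       below, the two-byte UTF-8 encoding of U+00E9 is left unchanged.

```ocaml
let upper = String.uppercase_ascii "Hello, World"
let lower = String.lowercase_ascii "Hello, World"
let cap = String.capitalize_ascii "ocaml"
let uncap = String.uncapitalize_ascii "OCaml"

(* UTF-8 encoding of U+00E9; neither byte is a US-ASCII letter. *)
let e_acute = "\xC3\xA9"
let non_ascii_unchanged = String.uppercase_ascii e_acute = e_acute
```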
416
417
418
419
420   Traversing
421       val iter : (char -> unit) -> string -> unit
422
423
424       iter f s applies function f in turn to all the characters of s .  It is
425       equivalent to f s.[0]; f s.[1]; ...; f s.[length s - 1]; () .
426
427
428
429       val iteri : (int -> char -> unit) -> string -> unit
430
431
432       iteri is like String.iter , but the function is also given  the  corre‐
433       sponding character index.
434
435
436       Since 4.00.0
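
       A sketch of iter and iteri used for their side effects. describe and
       count_char are hypothetical helpers, not library functions.

```ocaml
(* Append "index:char " for every character, in increasing index order. *)
let describe s =
  let buf = Buffer.create 16 in
  String.iteri
    (fun i c -> Buffer.add_string buf (Printf.sprintf "%d:%c " i c))
    s;
  Buffer.contents buf

(* Count occurrences of c with a mutable counter. *)
let count_char c s =
  let n = ref 0 in
  String.iter (fun x -> if x = c then incr n) s;
  !n
```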
437
438
439
440
441   Searching
442       val index_from : string -> int -> char -> int
443
444
445       index_from  s  i c is the index of the first occurrence of c in s after
446       position i .
447
448
449       Raises Not_found if c does not occur in s after position i .
450
451
452       Raises Invalid_argument if i is not a valid position in s .
453
454
455
456       val index_from_opt : string -> int -> char -> int option
457
458
459       index_from_opt s i c is the index of the first occurrence of c in s af‐
460       ter position i (if any).
461
462
463       Since 4.05
464
465
466       Raises Invalid_argument if i is not a valid position in s .
467
468
469
470       val rindex_from : string -> int -> char -> int
471
472
473       rindex_from  s i c is the index of the last occurrence of c in s before
474       position i+1 .
475
476
477       Raises Not_found if c does not occur in s before position i+1 .
478
479
480       Raises Invalid_argument if i+1 is not a valid position in s .
481
482
483
484       val rindex_from_opt : string -> int -> char -> int option
485
486
487       rindex_from_opt s i c is the index of the last occurrence of c in s be‐
488       fore position i+1 (if any).
489
490
491       Since 4.05
492
493
494       Raises Invalid_argument if i+1 is not a valid position in s .
495
496
497
498       val index : string -> char -> int
499
500
501       index s c is String.index_from s 0 c .
502
503
504
505       val index_opt : string -> char -> int option
506
507
508       index_opt s c is String.index_from_opt s 0 c .
509
510
511       Since 4.05
512
513
514
515       val rindex : string -> char -> int
516
517
518       rindex s c is String.rindex_from s (length s - 1) c .
519
520
521
522       val rindex_opt : string -> char -> int option
523
524
525       rindex_opt s c is String.rindex_from_opt s (length s - 1) c .
526
527
528       Since 4.05
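
       The searching functions can be combined to enumerate every
       occurrence of a character. all_indices is a hypothetical helper
       written for this sketch; it relies on length s being a valid start
       position for index_from_opt.

```ocaml
let first_l = String.index "hello" 'l'
let last_l = String.rindex "hello" 'l'
let missing = String.index_opt "hello" 'z'

(* The search includes the index given as start. *)
let from_3 = String.index_from "hello" 3 'l'
let from_4 = String.index_from_opt "hello" 4 'l'

(* Collect every index of c in s, in increasing order. *)
let all_indices s c =
  let rec go i acc =
    match String.index_from_opt s i c with
    | None -> List.rev acc
    | Some j -> go (j + 1) (j :: acc)
  in
  go 0 []
```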
529
530
531
532
533   Strings and Sequences
534       val to_seq : t -> char Seq.t
535
536
537       to_seq  s  is  a sequence made of the string's characters in increasing
538       order. In "unsafe-string" mode, modifications of the string during  it‐
539       eration will be reflected in the sequence.
540
541
542       Since 4.07
543
544
545
546       val to_seqi : t -> (int * char) Seq.t
547
548
549       to_seqi  s  is like String.to_seq but also tuples the corresponding in‐
550       dex.
551
552
553       Since 4.07
554
555
556
557       val of_seq : char Seq.t -> t
558
559
560       of_seq s is a string made of the sequence's characters.
561
562
563       Since 4.07
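
       A minimal sketch of the sequence conversions. reverse is a
       hypothetical helper that goes through an intermediate list.

```ocaml
let chars = List.of_seq (String.to_seq "abc")
let indexed = List.of_seq (String.to_seqi "ab")

(* Reverse a string by way of its character sequence. *)
let reverse s =
  String.of_seq (List.to_seq (List.rev (List.of_seq (String.to_seq s))))
```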
564
565
566
567
568   UTF decoding and validations
569   UTF-8
570       val get_utf_8_uchar : t -> int -> Uchar.utf_decode
571
572
       get_utf_8_uchar b i decodes a UTF-8 character at index i in b .
574
575
576
577       val is_valid_utf_8 : t -> bool
578
579
580       is_valid_utf_8 b is true if and only if b contains valid UTF-8 data.
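
       As a sketch of how a decode result might be consumed, the
       hypothetical helper count_scalar_values below steps through a string
       one decode at a time. It assumes the Uchar functions
       utf_decode_length and utf_decode_uchar, which accompany
       Uchar.utf_decode in recent OCaml releases.

```ocaml
(* Count decodes; each step advances by the number of bytes consumed. *)
let count_scalar_values s =
  let n = String.length s in
  let rec go i count =
    if i >= n then count
    else
      let d = String.get_utf_8_uchar s i in
      go (i + Uchar.utf_decode_length d) (count + 1)
  in
  go 0 0

(* One ASCII byte followed by the four-byte encoding of U+1F42B. *)
let text = "a\u{1F42B}"
let decoded = String.get_utf_8_uchar text 1
let code_point = Uchar.to_int (Uchar.utf_decode_uchar decoded)

(* 0xFF never appears in valid UTF-8. *)
let invalid = String.is_valid_utf_8 "\xFF"
```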
581
582
583
584
585   UTF-16BE
586       val get_utf_16be_uchar : t -> int -> Uchar.utf_decode
587
588
       get_utf_16be_uchar b i decodes a UTF-16BE character at index i in b .
590
591
592
593       val is_valid_utf_16be : t -> bool
594
595
596       is_valid_utf_16be b is true if and only if b  contains  valid  UTF-16BE
597       data.
598
599
600
601
602   UTF-16LE
603       val get_utf_16le_uchar : t -> int -> Uchar.utf_decode
604
605
       get_utf_16le_uchar b i decodes a UTF-16LE character at index i in b .
607
608
609
610       val is_valid_utf_16le : t -> bool
611
612
613       is_valid_utf_16le  b  is  true if and only if b contains valid UTF-16LE
614       data.
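
       The two UTF-16 variants differ only in byte order. A minimal sketch
       with hand-written byte strings, assuming the same Uchar decode
       accessors as for UTF-8:

```ocaml
(* U+0041 as one 16-bit code unit, in each byte order. *)
let a_be = String.get_utf_16be_uchar "\x00\x41" 0
let a_le = String.get_utf_16le_uchar "\x41\x00" 0

(* U+1F42B as the surrogate pair D83D DC2B, big-endian. *)
let camel_be = "\xD8\x3D\xDC\x2B"
let camel = String.get_utf_16be_uchar camel_be 0

(* A high surrogate without its low surrogate is not valid UTF-16. *)
let lone_surrogate_valid = String.is_valid_utf_16be "\xD8\x3D"
```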
615
616
617
618       val blit : string -> int -> bytes -> int -> int -> unit
619
620
621       blit src src_pos dst dst_pos len copies len bytes from the string src ,
622       starting  at index src_pos , to byte sequence dst , starting at charac‐
623       ter number dst_pos .
624
625
626       Raises Invalid_argument if src_pos and len do  not  designate  a  valid
627       range  of src , or if dst_pos and len do not designate a valid range of
628       dst .
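
       A short sketch of blit copying part of a string into a byte
       sequence. The buffer size and offsets are arbitrary.

```ocaml
(* Eight bytes, all initially '.'. *)
let dst = Bytes.make 8 '.'

(* Copy the 3 bytes "ell" (indices 1..3 of the source) to dst at index 2. *)
let () = String.blit "hello" 1 dst 2 3

let result = Bytes.to_string dst

(* A range that runs past the end of the source is rejected. *)
let bad_range_rejected =
  match String.blit "hello" 3 dst 0 3 with
  | () -> false
  | exception Invalid_argument _ -> true
```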
629
630
631
632
633   Binary decoding of integers
634       The functions in this section binary decode integers from strings.
635
636       All following functions raise Invalid_argument if the characters needed
637       at index i to decode the integer are not available.
638
639       Little-endian (resp. big-endian) encoding means that least (resp. most)
640       significant bytes are stored first.  Big-endian is also known  as  net‐
641       work  byte  order.   Native-endian  encoding is either little-endian or
642       big-endian depending on Sys.big_endian .
643
644       32-bit and 64-bit integers are  represented  by  the  int32  and  int64
645       types, which can be interpreted either as signed or unsigned numbers.
646
       8-bit and 16-bit integers are represented by the int type, which has
       more bits than the binary encoding. These extra bits are sign-extended
       (or zero-extended) for functions which decode 8-bit or 16-bit integers
       and represent them with int values.
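
       The effect of endianness and of sign versus zero extension can be
       sketched on a four-byte string, using the decoding functions
       documented below. The sample bytes are arbitrary.

```ocaml
let b = "\x01\x02\xFF\xFE"

(* The byte 0xFF read as unsigned (zero-extended) and signed (sign-extended). *)
let u8 = String.get_uint8 b 2
let i8 = String.get_int8 b 2

(* The bytes 01 02 read most significant first, then least significant first. *)
let u16_be = String.get_uint16_be b 0
let u16_le = String.get_uint16_le b 0

(* The bytes FF FE read as a signed big-endian 16-bit integer. *)
let i16_be = String.get_int16_be b 2

(* All four bytes as a big-endian 32-bit integer. *)
let i32_be = String.get_int32_be b 0

(* Only three bytes are available from index 1, so decoding is rejected. *)
let too_short =
  match String.get_int32_be b 1 with
  | _ -> false
  | exception Invalid_argument _ -> true
```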
651
652       val get_uint8 : string -> int -> int
653
654
655       get_uint8 b i is b 's unsigned 8-bit integer starting at character  in‐
656       dex i .
657
658
659       Since 4.13.0
660
661
662
663       val get_int8 : string -> int -> int
664
665
666       get_int8 b i is b 's signed 8-bit integer starting at character index i
667       .
668
669
670       Since 4.13.0
671
672
673
674       val get_uint16_ne : string -> int -> int
675
676
677       get_uint16_ne b i is b 's native-endian unsigned 16-bit integer  start‐
678       ing at character index i .
679
680
681       Since 4.13.0
682
683
684
685       val get_uint16_be : string -> int -> int
686
687
688       get_uint16_be  b  i is b 's big-endian unsigned 16-bit integer starting
689       at character index i .
690
691
692       Since 4.13.0
693
694
695
696       val get_uint16_le : string -> int -> int
697
698
699       get_uint16_le b i is b 's little-endian unsigned 16-bit integer  start‐
700       ing at character index i .
701
702
703       Since 4.13.0
704
705
706
707       val get_int16_ne : string -> int -> int
708
709
710       get_int16_ne  b  i is b 's native-endian signed 16-bit integer starting
711       at character index i .
712
713
714       Since 4.13.0
715
716
717
718       val get_int16_be : string -> int -> int
719
720
721       get_int16_be b i is b 's big-endian signed 16-bit integer  starting  at
722       character index i .
723
724
725       Since 4.13.0
726
727
728
729       val get_int16_le : string -> int -> int
730
731
732       get_int16_le  b  i is b 's little-endian signed 16-bit integer starting
733       at character index i .
734
735
736       Since 4.13.0
737
738
739
740       val get_int32_ne : string -> int -> int32
741
742
743       get_int32_ne b i is b 's native-endian 32-bit integer starting at char‐
744       acter index i .
745
746
747       Since 4.13.0
748
749
750
751       val hash : t -> int
752
753       An  unseeded  hash  function for strings, with the same output value as
754       Hashtbl.hash . This function allows this module to be passed  as  argu‐
755       ment to the functor Hashtbl.Make .
756
757
758       Since 5.0.0
759
760
761
762       val seeded_hash : int -> t -> int
763
764       A  seeded  hash  function  for  strings,  with the same output value as
765       Hashtbl.seeded_hash . This function allows this module to be passed  as
766       argument to the functor Hashtbl.MakeSeeded .
767
768
769       Since 5.0.0
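
       As a sketch of the functor application mentioned above, assuming
       OCaml 5.0.0 or later: StringTbl, counts and bump are illustrative
       names, not part of the library.

```ocaml
(* String provides t, equal and hash, as Hashtbl.Make requires. *)
module StringTbl = Hashtbl.Make (String)

let counts : int StringTbl.t = StringTbl.create 16

(* Increment the counter stored for a word, starting from 0. *)
let bump word =
  let n = Option.value (StringTbl.find_opt counts word) ~default:0 in
  StringTbl.replace counts word (n + 1)

let () = List.iter bump ["a"; "b"; "a"]

(* The unseeded hash agrees with the generic Hashtbl.hash on strings. *)
let same_as_generic = String.hash "abc" = Hashtbl.hash "abc"
```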
770
771
772
773       val get_int32_be : string -> int -> int32
774
775
776       get_int32_be  b i is b 's big-endian 32-bit integer starting at charac‐
777       ter index i .
778
779
780       Since 4.13.0
781
782
783
784       val get_int32_le : string -> int -> int32
785
786
787       get_int32_le b i is b 's little-endian 32-bit integer starting at char‐
788       acter index i .
789
790
791       Since 4.13.0
792
793
794
795       val get_int64_ne : string -> int -> int64
796
797
798       get_int64_ne b i is b 's native-endian 64-bit integer starting at char‐
799       acter index i .
800
801
802       Since 4.13.0
803
804
805
806       val get_int64_be : string -> int -> int64
807
808
809       get_int64_be b i is b 's big-endian 64-bit integer starting at  charac‐
810       ter index i .
811
812
813       Since 4.13.0
814
815
816
817       val get_int64_le : string -> int -> int64
818
819
820       get_int64_le b i is b 's little-endian 64-bit integer starting at char‐
821       acter index i .
822
823
824       Since 4.13.0
825
826
827
828
829
OCamldoc                          2023-07-20                         String(3)