1StringLabels(3)                  OCaml library                 StringLabels(3)
2
3
4

NAME

6       StringLabels - Strings.
7

Module

9       Module   StringLabels
10

Documentation

12       Module StringLabels
13        : sig end
14
15
16       Strings.
17
18       A  string  s  of  length  n is an indexable and immutable sequence of n
19       bytes. For historical reasons these bytes are referred  to  as  charac‐
20       ters.
21
22       The  semantics  of  string functions is defined in terms of indices and
23       positions. These are depicted and described as follows.
24
       positions  0   1   2   3   4    n-1    n
                  +---+---+---+---+     +-----+
         indices  | 0 | 1 | 2 | 3 | ... | n-1 |
                  +---+---+---+---+     +-----+
27
       -An index i of s is an integer in the range [ 0 ; n-1 ]. It represents
       the i-th byte (character) of s , which can be accessed using the
       constant-time string indexing operator s.[i] .

       -A position i of s is an integer in the range [ 0 ; n ]. It represents
       either the point at the beginning of the string, or the point between
       two indices, or the point at the end of the string. The i-th byte index
       is between positions i and i+1 .
36
37
38       Two integers start and len are said to define a valid substring of s if
39       len >= 0 and start , start+len are positions of s .
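
       The validity rule above can be sketched in code. This is only an
       illustration: the helper name is_valid_substring is hypothetical and
       is not part of the library.

```ocaml
(* Hypothetical helper, not part of StringLabels: true iff [start] and
   [len] define a valid substring of [s], i.e. len >= 0 and both start
   and start+len are positions of s (integers in [0; length s]). *)
let is_valid_substring s ~start ~len =
  let n = String.length s in
  len >= 0 && start >= 0 && start <= n && len <= n - start

let s = "camel"

(* Position 5 = length s is valid, so an empty substring at the end is fine. *)
let empty_at_end = is_valid_substring s ~start:5 ~len:0

(* 3 + 3 = 6 is not a position of a 5-byte string. *)
let overruns = is_valid_substring s ~start:3 ~len:3

(* 5 is a position but not an index: indices stop at n-1 = 4. *)
let index_five_is_invalid =
  match s.[5] with
  | _ -> false
  | exception Invalid_argument _ -> true
```

       The comparison len <= n - start is used instead of
       start + len <= n so that the check cannot overflow.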
40
41       Unicode text. Strings being arbitrary sequences of bytes, they can hold
42       any kind of textual encoding.  However  the  recommended  encoding  for
43       storing  Unicode  text  in OCaml strings is UTF-8. This is the encoding
44       used by Unicode escapes in string  literals.  For  example  the  string
45       "\u{1F42B}" is the UTF-8 encoding of the Unicode character U+1F42B.
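
       As a brief illustration of the point that lengths count bytes rather
       than Unicode characters, the sketch below inspects the literal from
       the example above. The binding names are made up for the example.

```ocaml
(* U+1F42B written with a Unicode escape; the compiler stores its UTF-8
   encoding in the string. *)
let camel = "\u{1F42B}"

(* String.length counts bytes, so one code point may span several. *)
let byte_length = String.length camel

(* The same four bytes written with hexadecimal byte escapes. *)
let same_bytes = (camel = "\xF0\x9F\x90\xAB")
```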
46
47       Past  mutability.  Before  OCaml 4.02, strings used to be modifiable in
48       place like Bytes.t mutable sequences of bytes.   OCaml  4  had  various
49       compiler  flags and configuration options to support the transition pe‐
50       riod from mutable to immutable strings.  Those options  are  no  longer
51       available, and strings are now always immutable.
52
53       The labeled version of this module can be used as described in the Std‐
54       Labels module.
55
56
57
58
59
60
61
62   Strings
63       type t = string
64
65
66       The type for strings.
67
68
69
70       val make : int -> char -> string
71
72
73       make n c is a string of length n with each index holding the  character
74       c .
75
76
77       Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
78
79
80
81       val init : int -> f:(int -> char) -> string
82
83
84       init  n ~f is a string of length n with index i holding the character f
85       i (called in increasing index order).
86
87
88       Since 4.02.0
89
90
91       Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
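
       A short sketch of make and init, including the Invalid_argument case
       for a negative length. The binding names are illustrative only.

```ocaml
(* Three copies of the same character. *)
let dashes = StringLabels.make 3 '-'

(* Index i holds the character computed by ~f i. *)
let alphabet_start =
  StringLabels.init 5 ~f:(fun i -> Char.chr (Char.code 'a' + i))

(* A negative length is rejected with Invalid_argument. *)
let negative_length_rejected =
  match StringLabels.make (-1) 'x' with
  | _ -> false
  | exception Invalid_argument _ -> true
```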
92
93
94
95       val empty : string
96
97       The empty string.
98
99
100       Since 4.13.0
101
102
103
104       val of_bytes : bytes -> string
105
106       Return a new string that contains the same bytes as the given byte  se‐
107       quence.
108
109
110       Since 4.13.0
111
112
113
114       val to_bytes : string -> bytes
115
116       Return  a  new  byte sequence that contains the same bytes as the given
117       string.
118
119
120       Since 4.13.0
121
122
123
124       val length : string -> int
125
126
127       length s is the length (number of bytes/characters) of s .
128
129
130
131       val get : string -> int -> char
132
133
134       get s i is the character at index i in s . This is the same as  writing
135       s.[i] .
136
137
       Raises Invalid_argument if i is not an index of s .
139
140
141
142
143   Concatenating
144       Note. The (^) binary operator concatenates two strings.
145
146       val concat : sep:string -> string list -> string
147
148
149       concat ~sep ss concatenates the list of strings ss , inserting the sep‐
150       arator string sep between each.
151
152
153       Raises   Invalid_argument   if    the    result    is    longer    than
154       Sys.max_string_length bytes.
155
156
157
158       val cat : string -> string -> string
159
160
161       cat s1 s2 concatenates s1 and s2 ( s1 ^ s2 ).
162
163
164       Since 4.13.0
165
166
167       Raises    Invalid_argument    if    the    result    is   longer   than
168       Sys.max_string_length bytes.
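
       The following sketch shows how the separator is only placed between
       elements, and that cat agrees with the (^) operator. The binding
       names are illustrative.

```ocaml
let csv = StringLabels.concat ~sep:", " ["a"; "b"; "c"]

(* With a single element, or none, no separator appears. *)
let singleton = StringLabels.concat ~sep:", " ["a"]
let of_empty_list = StringLabels.concat ~sep:", " []

(* cat s1 s2 is the function form of s1 ^ s2. *)
let joined = StringLabels.cat "foo" "bar"
```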
169
170
171
172
173   Predicates and comparisons
174       val equal : t -> t -> bool
175
176
177       equal s0 s1 is true if and only if s0 and s1 are character-wise equal.
178
179
180       Since 4.05.0
181
182
183
184       val compare : t -> t -> int
185
186
       compare s0 s1 sorts s0 and s1 in lexicographical order. compare
       behaves like Stdlib.compare on strings but may be more efficient.
189
190
191
192       val starts_with : prefix:string -> string -> bool
193
194
195       starts_with ~prefix s is true if and only if s starts with prefix .
196
197
198       Since 4.13.0
199
200
201
202       val ends_with : suffix:string -> string -> bool
203
204
205       ends_with ~suffix s is true if and only if s ends with suffix .
206
207
208       Since 4.13.0
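
       A small sketch of the prefix and suffix predicates, assuming OCaml
       4.13 or later. The file name is invented for the example.

```ocaml
let file = "report.tar.gz"

let is_report = StringLabels.starts_with ~prefix:"report" file
let is_gzip = StringLabels.ends_with ~suffix:".gz" file

(* The empty string is a prefix of every string. *)
let empty_prefix = StringLabels.starts_with ~prefix:"" file

(* A prefix longer than the string itself never matches. *)
let too_long = StringLabels.starts_with ~prefix:"reports" "report"
```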
209
210
211
212       val contains_from : string -> int -> char -> bool
213
214
215       contains_from s start c is true if and only if c appears in s after po‐
216       sition start .
217
218
219       Raises Invalid_argument if start is not a valid position in s .
220
221
222
223       val rcontains_from : string -> int -> char -> bool
224
225
226       rcontains_from s stop c is true if and only if c appears  in  s  before
227       position stop+1 .
228
229
230       Raises  Invalid_argument  if stop < 0 or stop+1 is not a valid position
231       in s .
232
233
234
235       val contains : string -> char -> bool
236
237
238       contains s c is String.contains_from s 0 c .
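
       The three membership tests differ only in which part of the string
       they scan. The sketch below, with illustrative binding names, shows
       the boundaries.

```ocaml
let s = "hello"

(* contains scans from position 0. *)
let has_h = StringLabels.contains s 'h'

(* Starting at position 1 skips index 0, so 'h' is no longer found. *)
let h_after_1 = StringLabels.contains_from s 1 'h'

(* Position 5 = length s is valid; nothing lies after it. *)
let o_at_end = StringLabels.contains_from s 5 'o'

(* rcontains_from s stop c looks at indices 0..stop only. *)
let l_up_to_1 = StringLabels.rcontains_from s 1 'l'
let l_up_to_2 = StringLabels.rcontains_from s 2 'l'

(* 6 is not a position of a 5-byte string. *)
let past_end_rejected =
  match StringLabels.contains_from s 6 'o' with
  | _ -> false
  | exception Invalid_argument _ -> true
```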
239
240
241
242
243   Extracting substrings
244       val sub : string -> pos:int -> len:int -> string
245
246
247       sub s ~pos ~len is a string of length len , containing the substring of
248       s that starts at position pos and has length len .
249
250
251       Raises  Invalid_argument  if  pos and len do not designate a valid sub‐
252       string of s .
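
       A minimal sketch of sub with a valid range, an empty range at the
       end of the string, and a range that overruns it.

```ocaml
let s = "hello world"

(* Bytes at indices 6..10. *)
let word = StringLabels.sub s ~pos:6 ~len:5

(* pos may equal the length when len is 0. *)
let empty_at_end = StringLabels.sub s ~pos:(String.length s) ~len:0

(* 6 + 6 = 12 exceeds the length 11, so the range is invalid. *)
let out_of_range_rejected =
  match StringLabels.sub s ~pos:6 ~len:6 with
  | _ -> false
  | exception Invalid_argument _ -> true
```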
253
254
255
256       val split_on_char : sep:char -> string -> string list
257
258
259       split_on_char ~sep s is the list of all (possibly empty) substrings  of
260       s that are delimited by the character sep .
261
262       The function's result is specified by the following invariants:
263
264       -The list is not empty.
265
       -Concatenating its elements using sep as a separator returns a string
       equal to the input
       ( concat ~sep:(make 1 sep) (split_on_char ~sep s) = s ).
269
270       -No string in the result contains the sep character.
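
       The three invariants can be checked on a concrete input. This is a
       sketch with an arbitrary sample string, not an exhaustive test.

```ocaml
let s = "a,,b,"
let parts = StringLabels.split_on_char ~sep:',' s

(* Invariant 1: the list is never empty, even for the empty string. *)
let of_empty = StringLabels.split_on_char ~sep:',' ""

(* Invariant 2: joining with the separator gives back the input. *)
let rebuilt = StringLabels.concat ~sep:(StringLabels.make 1 ',') parts

(* Invariant 3: no element contains the separator. *)
let none_contains_sep =
  List.for_all (fun p -> not (StringLabels.contains p ',')) parts
```

       Adjacent separators and a trailing separator each produce an empty
       string in the result.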
271
272
273
274       Since 4.05.0
275
276
277
278
279   Transforming
280       val map : f:(char -> char) -> string -> string
281
282
       map ~f s is the string resulting from applying f to all the characters
       of s in increasing order.
285
286
287       Since 4.00.0
288
289
290
291       val mapi : f:(int -> char -> char) -> string -> string
292
293
294       mapi  ~f  s  is like StringLabels.map but the index of the character is
295       also passed to f .
296
297
298       Since 4.02.0
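
       A short sketch of map and mapi. The transformation functions are
       arbitrary choices for the example.

```ocaml
(* Apply the same function to every character. *)
let shout = StringLabels.map ~f:Char.uppercase_ascii "hello"

(* mapi also receives the index; here even indices are uppercased. *)
let alternating =
  StringLabels.mapi
    ~f:(fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c)
    "hello"
```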
299
300
301
302       val fold_left : f:('a -> char -> 'a) -> init:'a -> string -> 'a
303
304
       fold_left ~f ~init:x s computes
       f (... (f (f x s.[0]) s.[1]) ...) s.[n-1] ,
       where n is the length of the string s .
307
308
309       Since 4.13.0
310
311
312
313       val fold_right : f:(char -> 'a -> 'a) -> string -> init:'a -> 'a
314
315
       fold_right ~f s ~init:x computes
       f s.[0] (f s.[1] ( ... (f s.[n-1] x) ...)) ,
       where n is the length of the string s .
318
319
320       Since 4.13.0
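
       The two folds differ in the direction in which the accumulator is
       threaded. The helpers below are hypothetical examples, assuming
       OCaml 4.13 or later.

```ocaml
(* Left fold: the accumulator starts at ~init and sees s.[0] first. *)
let count_char c s =
  StringLabels.fold_left
    ~f:(fun acc x -> if x = c then acc + 1 else acc)
    ~init:0 s

(* Right fold: s.[n-1] is combined with ~init first, so consing
   preserves the original order. *)
let to_list s = StringLabels.fold_right ~f:(fun c acc -> c :: acc) s ~init:[]

(* Prepending in a left fold reverses the string (quadratic, for
   illustration only). *)
let reversed s =
  StringLabels.fold_left ~f:(fun acc c -> String.make 1 c ^ acc) ~init:"" s
```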
321
322
323
324       val for_all : f:(char -> bool) -> string -> bool
325
326
       for_all ~f:p s checks if all characters in s satisfy the predicate p .
328
329
330       Since 4.13.0
331
332
333
334       val exists : f:(char -> bool) -> string -> bool
335
336
       exists ~f:p s checks if at least one character of s satisfies the
       predicate p .
339
340
341       Since 4.13.0
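
       A sketch of both predicates, including their behaviour on the empty
       string. The predicate is_digit is a hypothetical helper.

```ocaml
(* Hypothetical predicate: ASCII decimal digit. *)
let is_digit c = c >= '0' && c <= '9'

let all_digits = StringLabels.for_all ~f:is_digit "2023"
let has_digit = StringLabels.exists ~f:is_digit "OCaml 5"

(* On the empty string, for_all holds vacuously and exists fails. *)
let vacuous = StringLabels.for_all ~f:is_digit ""
let none_in_empty = StringLabels.exists ~f:is_digit ""
```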
342
343
344
345       val trim : string -> string
346
347
348       trim s is s without leading and trailing whitespace. Whitespace charac‐
349       ters are: ' ' , '\x0C' (form feed), '\n' , '\r' , and '\t' .
350
351
352       Since 4.00.0
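
       As a quick illustration, trim removes whitespace only at the two
       ends and leaves interior whitespace alone.

```ocaml
let trimmed = StringLabels.trim "  \t hello world \r\n"

(* Interior whitespace is preserved. *)
let inner_kept = StringLabels.trim " a  b "

(* A string made only of whitespace becomes empty. *)
let all_blank = StringLabels.trim " \n\t"
```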
353
354
355
356       val escaped : string -> string
357
358
359       escaped s is s with special characters represented by escape sequences,
360       following the lexical conventions of OCaml.
361
       All characters outside the US-ASCII printable range [0x20;0x7E] are
       escaped, as well as backslash (0x5C) and double-quote (0x22).
364
365       The function Scanf.unescaped is  a  left  inverse  of  escaped  ,  i.e.
366       Scanf.unescaped  (escaped  s)  =  s  for any string s (unless escaped s
367       fails).
368
369
370       Raises   Invalid_argument   if    the    result    is    longer    than
371       Sys.max_string_length bytes.
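
       The round trip through Scanf.unescaped can be sketched as follows.
       The sample string is arbitrary.

```ocaml
(* Contains a tab, double quotes and a backslash. *)
let raw = "tab\there \"quoted\" back\\slash"

let esc = StringLabels.escaped raw

(* Scanf.unescaped undoes the escaping. *)
let round_trip = (Scanf.unescaped esc = raw)

(* A newline byte becomes the two characters backslash and 'n'. *)
let newline = StringLabels.escaped "a\nb"
```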
372
373
374
375       val uppercase_ascii : string -> string
376
377
378       uppercase_ascii  s is s with all lowercase letters translated to upper‐
379       case, using the US-ASCII character set.
380
381
382       Since 4.05.0
383
384
385
386       val lowercase_ascii : string -> string
387
388
389       lowercase_ascii s is s with all uppercase letters translated to  lower‐
390       case, using the US-ASCII character set.
391
392
393       Since 4.05.0
394
395
396
397       val capitalize_ascii : string -> string
398
399
400       capitalize_ascii  s is s with the first character set to uppercase, us‐
401       ing the US-ASCII character set.
402
403
404       Since 4.05.0
405
406
407
408       val uncapitalize_ascii : string -> string
409
410
411       uncapitalize_ascii s is s with the first character  set  to  lowercase,
412       using the US-ASCII character set.
413
414
415       Since 4.05.0
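
       The four case functions only affect US-ASCII letters. A small
       sketch, with arbitrary sample strings:

```ocaml
let upper = StringLabels.uppercase_ascii "Hello, World"
let lower = StringLabels.lowercase_ascii "Hello, World"

(* Only the first character is changed. *)
let cap = StringLabels.capitalize_ascii "hello world"
let uncap = StringLabels.uncapitalize_ascii "Hello World"

(* Bytes outside US-ASCII, such as the UTF-8 encoding of U+00E9,
   are left untouched. *)
let non_ascii_untouched =
  (StringLabels.uppercase_ascii "\xC3\xA9" = "\xC3\xA9")
```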
416
417
418
419
420   Traversing
421       val iter : f:(char -> unit) -> string -> unit
422
423
424       iter  ~f  s applies function f in turn to all the characters of s .  It
425       is equivalent to f s.[0]; f s.[1]; ...; f s.[length s - 1]; () .
426
427
428
429       val iteri : f:(int -> char -> unit) -> string -> unit
430
431
432       iteri is like StringLabels.iter , but the function is  also  given  the
433       corresponding character index.
434
435
436       Since 4.00.0
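
       Because iter and iteri return unit, they are typically used with
       side effects. The helpers below are hypothetical examples.

```ocaml
(* Hypothetical helper: every character followed by a space. *)
let spaced s =
  let b = Buffer.create (2 * String.length s) in
  StringLabels.iter ~f:(fun c -> Buffer.add_char b c; Buffer.add_char b ' ') s;
  Buffer.contents b

(* Hypothetical helper: indices at which [c] occurs, in increasing order. *)
let positions_of c s =
  let acc = ref [] in
  StringLabels.iteri ~f:(fun i x -> if x = c then acc := i :: !acc) s;
  List.rev !acc
```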
437
438
439
440
441   Searching
442       val index_from : string -> int -> char -> int
443
444
445       index_from  s  i c is the index of the first occurrence of c in s after
446       position i .
447
448
449       Raises Not_found if c does not occur in s after position i .
450
451
452       Raises Invalid_argument if i is not a valid position in s .
453
454
455
456       val index_from_opt : string -> int -> char -> int option
457
458
459       index_from_opt s i c is the index of the first occurrence of c in s af‐
460       ter position i (if any).
461
462
463       Since 4.05
464
465
466       Raises Invalid_argument if i is not a valid position in s .
467
468
469
470       val rindex_from : string -> int -> char -> int
471
472
473       rindex_from  s i c is the index of the last occurrence of c in s before
474       position i+1 .
475
476
477       Raises Not_found if c does not occur in s before position i+1 .
478
479
480       Raises Invalid_argument if i+1 is not a valid position in s .
481
482
483
484       val rindex_from_opt : string -> int -> char -> int option
485
486
487       rindex_from_opt s i c is the index of the last occurrence of c in s be‐
488       fore position i+1 (if any).
489
490
491       Since 4.05
492
493
494       Raises Invalid_argument if i+1 is not a valid position in s .
495
496
497
498       val index : string -> char -> int
499
500
501       index s c is String.index_from s 0 c .
502
503
504
505       val index_opt : string -> char -> int option
506
507
508       index_opt s c is String.index_from_opt s 0 c .
509
510
511       Since 4.05
512
513
514
515       val rindex : string -> char -> int
516
517
518       rindex s c is String.rindex_from s (length s - 1) c .
519
520
521
522       val rindex_opt : string -> char -> int option
523
524
525       rindex_opt s c is String.rindex_from_opt s (length s - 1) c .
526
527
528       Since 4.05
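
       The search functions differ in direction, starting point and how a
       miss is reported. A sketch on a small sample string:

```ocaml
(* Indices: 0 'a', 1 '.', 2 'b', 3 '.', 4 'c'. *)
let path = "a.b.c"

let first_dot = StringLabels.index path '.'
let next_dot = StringLabels.index_from path 2 '.'

let last_dot = StringLabels.rindex path '.'
(* rindex_from s i c scans indices i, i-1, ..., 0. *)
let dot_up_to_2 = StringLabels.rindex_from path 2 '.'

(* The _opt variants return None instead of raising Not_found. *)
let missing = StringLabels.index_opt path '/'
let missing_raises =
  match StringLabels.index path '/' with
  | _ -> false
  | exception Not_found -> true
```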
529
530
531
532
533   Strings and Sequences
534       val to_seq : t -> char Seq.t
535
536
537       to_seq  s  is  a sequence made of the string's characters in increasing
538       order. In "unsafe-string" mode, modifications of the string during  it‐
539       eration will be reflected in the sequence.
540
541
542       Since 4.07
543
544
545
546       val to_seqi : t -> (int * char) Seq.t
547
548
549       to_seqi s is like StringLabels.to_seq but also tuples the corresponding
550       index.
551
552
553       Since 4.07
554
555
556
557       val of_seq : char Seq.t -> t
558
559
560       of_seq s is a string made of the sequence's characters.
561
562
563       Since 4.07
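
       Sequences make it possible to combine strings with the Seq
       combinators. The helper letters_only is a hypothetical example.

```ocaml
(* Hypothetical helper: keep only ASCII letters. *)
let letters_only s =
  StringLabels.to_seq s
  |> Seq.filter (fun c -> (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
  |> StringLabels.of_seq

(* to_seqi pairs each character with its index. *)
let indexed = List.of_seq (StringLabels.to_seqi "ab")
```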
564
565
566
567
568   UTF decoding and validations
569   UTF-8
570       val get_utf_8_uchar : t -> int -> Uchar.utf_decode
571
572
       get_utf_8_uchar b i decodes a UTF-8 character at index i in b .
574
575
576
577       val is_valid_utf_8 : t -> bool
578
579
580       is_valid_utf_8 b is true if and only if b contains valid UTF-8 data.
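
       A decoding loop can be sketched with the Uchar.utf_decode accessors,
       assuming OCaml 4.14 or later. The helper count_uchars is
       hypothetical and is not part of the library.

```ocaml
(* Hypothetical helper: number of decodes needed to traverse [s].
   Each step advances by the number of bytes the decode consumed, which
   is at least 1 even for invalid input, so the loop terminates. *)
let count_uchars s =
  let rec loop i n =
    if i >= String.length s then n
    else
      let d = StringLabels.get_utf_8_uchar s i in
      loop (i + Uchar.utf_decode_length d) (n + 1)
  in
  loop 0 0

(* Decode the first character of a 4-byte UTF-8 sequence. *)
let first = StringLabels.get_utf_8_uchar "\u{1F42B}" 0
let first_code = Uchar.to_int (Uchar.utf_decode_uchar first)

(* 0xFF can never appear in valid UTF-8. *)
let bad = StringLabels.get_utf_8_uchar "\xFF" 0
```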
581
582
583
584
585   UTF-16BE
586       val get_utf_16be_uchar : t -> int -> Uchar.utf_decode
587
588
       get_utf_16be_uchar b i decodes a UTF-16BE character at index i in b .
590
591
592
593       val is_valid_utf_16be : t -> bool
594
595
596       is_valid_utf_16be b is true if and only if b  contains  valid  UTF-16BE
597       data.
598
599
600
601
602   UTF-16LE
603       val get_utf_16le_uchar : t -> int -> Uchar.utf_decode
604
605
       get_utf_16le_uchar b i decodes a UTF-16LE character at index i in b .
607
608
609
610       val is_valid_utf_16le : t -> bool
611
612
613       is_valid_utf_16le  b  is  true if and only if b contains valid UTF-16LE
614       data.
615
616
617
618       val blit : src:string -> src_pos:int ->  dst:bytes  ->  dst_pos:int  ->
619       len:int -> unit
620
621
622       blit  ~src ~src_pos ~dst ~dst_pos ~len copies len bytes from the string
623       src , starting at index src_pos , to byte sequence dst  ,  starting  at
624       character number dst_pos .
625
626
627       Raises  Invalid_argument  if  src_pos  and len do not designate a valid
628       range of src , or if dst_pos and len do not designate a valid range  of
629       dst .
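
       A minimal sketch of blit copying part of a string into a mutable
       byte sequence. The buffer size and offsets are arbitrary.

```ocaml
(* Destination buffer of five '.' bytes. *)
let dst = Bytes.make 5 '.'

(* Copy "ell" (indices 1..3 of the source) to indices 2..4 of dst. *)
let () = StringLabels.blit ~src:"hello" ~src_pos:1 ~dst ~dst_pos:2 ~len:3

let result = Bytes.to_string dst

(* 2 + 4 = 6 exceeds the destination length, so this is rejected. *)
let overrun_rejected =
  match StringLabels.blit ~src:"hello" ~src_pos:0 ~dst ~dst_pos:2 ~len:4 with
  | () -> false
  | exception Invalid_argument _ -> true
```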
630
631
632
633
634   Binary decoding of integers
635       The functions in this section binary decode integers from strings.
636
637       All following functions raise Invalid_argument if the characters needed
638       at index i to decode the integer are not available.
639
640       Little-endian (resp. big-endian) encoding means that least (resp. most)
641       significant  bytes  are stored first.  Big-endian is also known as net‐
642       work byte order.  Native-endian encoding  is  either  little-endian  or
643       big-endian depending on Sys.big_endian .
644
645       32-bit  and  64-bit  integers  are  represented  by the int32 and int64
646       types, which can be interpreted either as signed or unsigned numbers.
647
       8-bit and 16-bit integers are represented by the int type, which has
       more bits than the binary encoding. These extra bits are sign-extended
       (or zero-extended) for functions which decode 8-bit or 16-bit integers
       and represent them with int values.
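
       The effect of sign extension and byte order can be seen on a
       two-byte string. This is a sketch with arbitrary sample bytes,
       assuming OCaml 4.13 or later.

```ocaml
let b = "\xFF\x01"

(* Same byte, zero-extended versus sign-extended. *)
let unsigned8 = StringLabels.get_uint8 b 0
let signed8 = StringLabels.get_int8 b 0

(* Same two bytes, read most or least significant byte first. *)
let be16 = StringLabels.get_uint16_be b 0
let le16 = StringLabels.get_uint16_le b 0

(* 0xFF01 read as signed 16-bit is 0xFF01 - 0x10000. *)
let signed_be16 = StringLabels.get_int16_be b 0

(* Native-endian agrees with one of the two, depending on the platform. *)
let ne16_matches =
  StringLabels.get_uint16_ne b 0 = (if Sys.big_endian then be16 else le16)

(* Only one byte is available at index 1, so a 16-bit read fails. *)
let truncated_rejected =
  match StringLabels.get_uint16_be b 1 with
  | _ -> false
  | exception Invalid_argument _ -> true
```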
652
653       val get_uint8 : string -> int -> int
654
655
656       get_uint8  b i is b 's unsigned 8-bit integer starting at character in‐
657       dex i .
658
659
660       Since 4.13.0
661
662
663
664       val get_int8 : string -> int -> int
665
666
       get_int8 b i is b 's signed 8-bit integer starting at character
       index i .
669
670
671       Since 4.13.0
672
673
674
675       val get_uint16_ne : string -> int -> int
676
677
678       get_uint16_ne  b i is b 's native-endian unsigned 16-bit integer start‐
679       ing at character index i .
680
681
682       Since 4.13.0
683
684
685
686       val get_uint16_be : string -> int -> int
687
688
689       get_uint16_be b i is b 's big-endian unsigned 16-bit  integer  starting
690       at character index i .
691
692
693       Since 4.13.0
694
695
696
697       val get_uint16_le : string -> int -> int
698
699
700       get_uint16_le  b i is b 's little-endian unsigned 16-bit integer start‐
701       ing at character index i .
702
703
704       Since 4.13.0
705
706
707
708       val get_int16_ne : string -> int -> int
709
710
711       get_int16_ne b i is b 's native-endian signed 16-bit  integer  starting
712       at character index i .
713
714
715       Since 4.13.0
716
717
718
719       val get_int16_be : string -> int -> int
720
721
722       get_int16_be  b  i is b 's big-endian signed 16-bit integer starting at
723       character index i .
724
725
726       Since 4.13.0
727
728
729
730       val get_int16_le : string -> int -> int
731
732
733       get_int16_le b i is b 's little-endian signed 16-bit  integer  starting
734       at character index i .
735
736
737       Since 4.13.0
738
739
740
741       val get_int32_ne : string -> int -> int32
742
743
744       get_int32_ne b i is b 's native-endian 32-bit integer starting at char‐
745       acter index i .
746
747
748       Since 4.13.0
749
750
751
752       val hash : t -> int
753
754       An unseeded hash function for strings, with the same  output  value  as
755       Hashtbl.hash  .  This function allows this module to be passed as argu‐
756       ment to the functor Hashtbl.Make .
757
758
759       Since 5.0.0
760
761
762
763       val seeded_hash : int -> t -> int
764
765       A seeded hash function for strings,  with  the  same  output  value  as
766       Hashtbl.seeded_hash  . This function allows this module to be passed as
767       argument to the functor Hashtbl.MakeSeeded .
768
769
770       Since 5.0.0
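
       Since the module provides t , equal and hash , it can be passed to
       Hashtbl.Make directly. The sketch below assumes OCaml 5.0 or later;
       the names Table , counts and bump are made up for the example.

```ocaml
(* A hash table specialised to string keys. *)
module Table = Hashtbl.Make (StringLabels)

let counts : int Table.t = Table.create 16

(* Hypothetical helper: increment the counter stored for [word]. *)
let bump word =
  let n = Option.value (Table.find_opt counts word) ~default:0 in
  Table.replace counts word (n + 1)

let () = List.iter bump ["ocaml"; "string"; "ocaml"]

(* The unseeded hash agrees with the generic Hashtbl.hash. *)
let same_hash = (StringLabels.hash "ocaml" = Hashtbl.hash "ocaml")
```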
771
772
773
774       val get_int32_be : string -> int -> int32
775
776
777       get_int32_be b i is b 's big-endian 32-bit integer starting at  charac‐
778       ter index i .
779
780
781       Since 4.13.0
782
783
784
785       val get_int32_le : string -> int -> int32
786
787
788       get_int32_le b i is b 's little-endian 32-bit integer starting at char‐
789       acter index i .
790
791
792       Since 4.13.0
793
794
795
796       val get_int64_ne : string -> int -> int64
797
798
799       get_int64_ne b i is b 's native-endian 64-bit integer starting at char‐
800       acter index i .
801
802
803       Since 4.13.0
804
805
806
807       val get_int64_be : string -> int -> int64
808
809
810       get_int64_be  b i is b 's big-endian 64-bit integer starting at charac‐
811       ter index i .
812
813
814       Since 4.13.0
815
816
817
818       val get_int64_le : string -> int -> int64
819
820
821       get_int64_le b i is b 's little-endian 64-bit integer starting at char‐
822       acter index i .
823
824
825       Since 4.13.0
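
       The 32-bit and 64-bit decoders return boxed int32 and int64 values.
       A sketch with arbitrary sample bytes, assuming OCaml 4.13 or later:

```ocaml
let word = "\x00\x00\x01\x00"

(* Most significant byte first: 0x00000100. *)
let be32 = StringLabels.get_int32_be word 0

(* Least significant byte first: 0x00010000. *)
let le32 = StringLabels.get_int32_le word 0

(* All bits set reads as -1 when the int32 is interpreted as signed. *)
let minus_one = StringLabels.get_int32_be "\xFF\xFF\xFF\xFF" 0

(* Eight bytes, least significant first. *)
let answer = StringLabels.get_int64_le "\x2A\x00\x00\x00\x00\x00\x00\x00" 0
```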
826
827
828
829
830
831OCamldoc                          2023-07-20                   StringLabels(3)