1StringLabels(3)                  OCaml library                 StringLabels(3)
2
3
4

NAME

6       StringLabels - Strings.
7

Module

9       Module   StringLabels
10

Documentation

12       Module StringLabels
13        : sig end
14
15
16       Strings.
17
18       A  string  s  of  length  n is an indexable and immutable sequence of n
19       bytes. For historical reasons these bytes are referred  to  as  charac‐
20       ters.
21
22       The  semantics  of  string functions is defined in terms of indices and
23       positions. These are depicted and described as follows.
24
       positions  0   1   2   3   4    n-1    n
                  +---+---+---+---+     +-----+
         indices  | 0 | 1 | 2 | 3 | ... | n-1 |
                  +---+---+---+---+     +-----+
27
       -An index i of s is an integer in the range [ 0 ; n-1 ]. It represents
       the i-th byte (character) of s, which can be accessed using the
       constant-time string indexing operator s.[i] .

       -A position i of s is an integer in the range [ 0 ; n ]. It represents
       either the point at the beginning of the string, the point between
       two indices, or the point at the end of the string. The byte at index i
       lies between positions i and i+1 .
36
37
38       Two integers start and len are said to define a valid substring of s if
39       len >= 0 and start , start+len are positions of s .
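
       The difference between indices and positions can be illustrated with
       a small sketch. The names s, n, last, empty_at_end and invalid are
       chosen here for illustration only.

```ocaml
let s = "hello"
let n = StringLabels.length s

(* Indices range over [0; n-1]: the last byte sits at index n-1. *)
let last = s.[n - 1]

(* Positions range over [0; n]. Position n is the end of the string,
   so start = n with len = 0 still defines a valid (empty) substring. *)
let empty_at_end = StringLabels.sub s ~pos:n ~len:0

(* Here start + len = 6 is not a position of s, so the pair is invalid. *)
let invalid =
  match StringLabels.sub s ~pos:3 ~len:3 with
  | _ -> false
  | exception Invalid_argument _ -> true
```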
40
41       Unicode text. Strings being arbitrary sequences of bytes, they can hold
42       any kind of textual encoding.  However  the  recommended  encoding  for
43       storing  Unicode  text  in OCaml strings is UTF-8. This is the encoding
44       used by Unicode escapes in string  literals.  For  example  the  string
45       "\u{1F42B}" is the UTF-8 encoding of the Unicode character U+1F42B.
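
       As a rough illustration, the escape above produces four bytes, not
       one "character" in the sense of this module. The names camel,
       byte_length and same_bytes are illustrative.

```ocaml
(* The Unicode escape is expanded to its UTF-8 encoding at compile time. *)
let camel = "\u{1F42B}"

(* length counts bytes, so a single code point may count for several. *)
let byte_length = StringLabels.length camel

(* U+1F42B encodes in UTF-8 as the bytes F0 9F 90 AB. *)
let same_bytes = StringLabels.equal camel "\xF0\x9F\x90\xAB"
```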
46
47       Past  mutability. OCaml strings used to be modifiable in place, for in‐
48       stance via the String.set and String.blit functions. This use is  nowa‐
49       days  only possible when the compiler is put in "unsafe-string" mode by
50       giving the -unsafe-string command-line option. This compatibility  mode
51       makes the types string and bytes (see Bytes.t ) interchangeable so that
52       functions expecting byte sequences can also accept strings as arguments
53       and modify them.
54
55       The  distinction between bytes and string was introduced in OCaml 4.02,
56       and the "unsafe-string" compatibility mode was the default until  OCaml
57       4.05.  Starting  with 4.06, the compatibility mode is opt-in; we intend
58       to remove the option in the future.
59
60       The labeled version of this module can be used as described in the Std‐
61       Labels module.
62
63
64
65
66
67
68
69   Strings
70       type t = string
71
72
73       The type for strings.
74
75
76
77       val make : int -> char -> string
78
79
80       make  n c is a string of length n with each index holding the character
81       c .
82
83
84       Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
85
86
87
88       val init : int -> f:(int -> char) -> string
89
90
91       init n ~f is a string of length n with index i holding the character  f
92       i (called in increasing index order).
93
94
95       Since 4.02.0
96
97
98       Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
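
       A minimal sketch of make and init follows. The value names are
       illustrative, and the alphabet example assumes an ASCII-compatible
       character encoding.

```ocaml
(* Three copies of the same character. *)
let dashes = StringLabels.make 3 '-'

(* f is called with indices 0, 1, ..., 25 in that order. *)
let alphabet =
  StringLabels.init 26 ~f:(fun i -> Char.chr (Char.code 'a' + i))

(* A negative length is rejected with Invalid_argument. *)
let negative_length_rejected =
  match StringLabels.make (-1) 'x' with
  | _ -> false
  | exception Invalid_argument _ -> true
```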
99
100
101
102       val length : string -> int
103
104
105       length s is the length (number of bytes/characters) of s .
106
107
108
109       val get : string -> int -> char
110
111
112       get  s i is the character at index i in s . This is the same as writing
113       s.[i] .
114
115
       Raises Invalid_argument if i is not an index of s .
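
       Code that prefers an option to an exception could wrap get as
       sketched below. get_opt is a hypothetical helper written for this
       example; it is not part of StringLabels.

```ocaml
(* Hypothetical helper: turns the Invalid_argument case into None. *)
let get_opt s i =
  match StringLabels.get s i with
  | c -> Some c
  | exception Invalid_argument _ -> None

let s = "OCaml"
let first = get_opt s 0

(* get s i and s.[i] denote the same operation. *)
let same_as_operator = StringLabels.get s 1 = s.[1]

(* length s is a position of s but not an index of s. *)
let past_end = get_opt s (StringLabels.length s)
```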
117
118
119
120
121   Concatenating
122       Note. The (^) binary operator concatenates two strings.
123
124       val concat : sep:string -> string list -> string
125
126
127       concat ~sep ss concatenates the list of strings ss , inserting the sep‐
128       arator string sep between each.
129
130
131       Raises    Invalid_argument    if    the    result    is   longer   than
132       Sys.max_string_length bytes.
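
       For example, assuming nothing beyond the signatures above (the value
       names are illustrative):

```ocaml
(* The separator goes between elements only, never at either end. *)
let path = StringLabels.concat ~sep:"/" ["usr"; "local"; "bin"]

(* With one element or none, the separator never appears. *)
let single = StringLabels.concat ~sep:", " ["only"]
let none = StringLabels.concat ~sep:", " []

(* The (^) operator joins exactly two strings. *)
let with_operator = "usr" ^ "/" ^ "local"
```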
133
134
135
136
137   Predicates and comparisons
138       val equal : t -> t -> bool
139
140
141       equal s0 s1 is true if and only if s0 and s1 are character-wise equal.
142
143
144       Since 4.05.0
145
146
147
148       val compare : t -> t -> int
149
150
       compare s0 s1 sorts s0 and s1 in lexicographical order. compare
       behaves like Stdlib.compare on strings but may be more efficient.
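
       The ordering is byte-wise, so uppercase ASCII letters sort before
       lowercase ones. A small sketch, with illustrative names:

```ocaml
(* compare can be passed directly to List.sort. *)
let sorted =
  List.sort StringLabels.compare ["pear"; "Apple"; "apple"; "fig"]

(* A proper prefix sorts before the longer string. *)
let prefix_first = StringLabels.compare "ab" "abc" < 0

(* equal tests contents, not physical identity. *)
let same = StringLabels.equal "abc" ("ab" ^ "c")
```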
153
154
155
156       val contains_from : string -> int -> char -> bool
157
158
159       contains_from s start c is true if and only if c appears in s after po‐
160       sition start .
161
162
163       Raises Invalid_argument if start is not a valid position in s .
164
165
166
167       val rcontains_from : string -> int -> char -> bool
168
169
170       rcontains_from s stop c is true if and only if c appears  in  s  before
171       position stop+1 .
172
173
174       Raises  Invalid_argument  if stop < 0 or stop+1 is not a valid position
175       in s .
176
177
178
179       val contains : string -> char -> bool
180
181
182       contains s c is String.contains_from s 0 c .
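
       The following sketch shows how the start and stop arguments bound the
       search. The string and value names are illustrative.

```ocaml
(* Indices: 'a' = 0, ',' = 1, 'b' = 2, ';' = 3, 'c' = 4. *)
let s = "a,b;c"

let has_comma = StringLabels.contains s ','

(* Only indices 2, 3 and 4 are examined. *)
let comma_from_2 = StringLabels.contains_from s 2 ','
let semi_from_2 = StringLabels.contains_from s 2 ';'

(* Only indices 0 and 1 are examined (before position 2). *)
let comma_upto_1 = StringLabels.rcontains_from s 1 ','
let semi_upto_1 = StringLabels.rcontains_from s 1 ';'
```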
183
184
185
186
187   Extracting substrings
188       val sub : string -> pos:int -> len:int -> string
189
190
191       sub s ~pos ~len is a string of length len , containing the substring of
192       s that starts at position pos and has length len .
193
194
195       Raises  Invalid_argument  if  pos and len do not designate a valid sub‐
196       string of s .
197
198
199
200       val split_on_char : sep:char -> string -> string list
201
202
203       split_on_char ~sep s is the list of all (possibly empty) substrings  of
204       s that are delimited by the character sep .
205
206       The function's result is specified by the following invariants:
207
208       -The list is not empty.
209
       -Concatenating its elements using sep as a separator returns a string
       equal to the input ( concat ~sep:(make 1 sep) (split_on_char ~sep s) = s ).
213
214       -No string in the result contains the sep character.
215
216
217
218       Since 4.05.0
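
       The three invariants can be checked mechanically. check_invariants is
       a hypothetical helper written for this sketch, not a library
       function.

```ocaml
(* Hypothetical helper: tests the three documented invariants. *)
let check_invariants ~sep s =
  let parts = StringLabels.split_on_char ~sep s in
  parts <> []
  && StringLabels.concat ~sep:(StringLabels.make 1 sep) parts = s
  && List.for_all (fun p -> not (StringLabels.contains p sep)) parts

(* Adjacent and trailing separators yield empty substrings. *)
let fields = StringLabels.split_on_char ~sep:',' "a,,b,"

(* The empty string splits into a single empty substring. *)
let empty_input = StringLabels.split_on_char ~sep:',' ""
```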
219
220
221
222
223   Transforming
224       val map : f:(char -> char) -> string -> string
225
226
       map ~f s is the string resulting from applying f to all the characters
       of s in increasing order.
229
230
231       Since 4.00.0
232
233
234
235       val mapi : f:(int -> char -> char) -> string -> string
236
237
238       mapi  ~f  s  is like StringLabels.map but the index of the character is
239       also passed to f .
240
241
242       Since 4.02.0
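
       A short sketch of map and mapi, using illustrative input strings:

```ocaml
(* map: replace every '-' by '_'. *)
let snake =
  StringLabels.map ~f:(fun c -> if c = '-' then '_' else c)
    "max-string-length"

(* mapi: the index decides which characters are uppercased. *)
let alternating =
  StringLabels.mapi
    ~f:(fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c)
    "camel"
```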
243
244
245
246       val trim : string -> string
247
248
249       trim s is s without leading and trailing whitespace. Whitespace charac‐
250       ters are: ' ' , '\x0C' (form feed), '\n' , '\r' , and '\t' .
251
252
253       Since 4.00.0
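
       Only the five characters listed above are stripped. In this sketch the
       vertical tab ('\x0B') serves as an example of a character that trim
       leaves alone; the value names are illustrative.

```ocaml
(* Leading and trailing whitespace is removed, inner whitespace is kept. *)
let trimmed = StringLabels.trim " \t hello world \r\n"

(* Vertical tab is not in the whitespace set used by trim. *)
let vertical_tab_kept = StringLabels.trim "\x0Bx\x0B"

(* A string made only of whitespace trims to the empty string. *)
let all_blank = StringLabels.trim " \n\t"
```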
254
255
256
257       val escaped : string -> string
258
259
260       escaped s is s with special characters represented by escape sequences,
261       following the lexical conventions of OCaml.
262
       All characters outside the US-ASCII printable range [0x20;0x7E] are
       escaped, as well as backslash (0x5C) and double-quote (0x22).
265
266       The  function  Scanf.unescaped  is  a  left  inverse  of escaped , i.e.
267       Scanf.unescaped (escaped s) = s for any  string  s  (unless  escaped  s
268       fails).
269
270
271       Raises    Invalid_argument    if    the    result    is   longer   than
272       Sys.max_string_length bytes.
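
       The sketch below shows the effect on a tab, double quotes, a backslash
       and two non-ASCII bytes, together with the round trip through
       Scanf.unescaped. The value names are illustrative.

```ocaml
(* Contains a tab, two double quotes and one backslash. *)
let raw = "tab\there \"quoted\" back\\slash"

(* Each special character becomes a two-character escape sequence. *)
let esc = StringLabels.escaped raw

(* Bytes outside the printable range become decimal escapes. *)
let non_ascii = StringLabels.escaped "\xC3\xA9"

(* Scanf.unescaped undoes escaped. *)
let round_trip = Scanf.unescaped esc = raw
```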
273
274
275
276       val uppercase_ascii : string -> string
277
278
279       uppercase_ascii s is s with all lowercase letters translated to  upper‐
280       case, using the US-ASCII character set.
281
282
283       Since 4.05.0
284
285
286
287       val lowercase_ascii : string -> string
288
289
290       lowercase_ascii  s is s with all uppercase letters translated to lower‐
291       case, using the US-ASCII character set.
292
293
294       Since 4.05.0
295
296
297
298       val capitalize_ascii : string -> string
299
300
301       capitalize_ascii s is s with the first character set to uppercase,  us‐
302       ing the US-ASCII character set.
303
304
305       Since 4.05.0
306
307
308
309       val uncapitalize_ascii : string -> string
310
311
312       uncapitalize_ascii  s  is  s with the first character set to lowercase,
313       using the US-ASCII character set.
314
315
316       Since 4.05.0
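
       The four *_ascii functions only touch bytes in the US-ASCII letter
       ranges. In the sketch below, the two bytes "\xC3\xA9" (the UTF-8
       encoding of an accented e) are left unchanged. The value names are
       illustrative.

```ocaml
let word = "h\xC3\xA9llo"

(* Non-ASCII bytes pass through untouched. *)
let upper = StringLabels.uppercase_ascii word
let lower = StringLabels.lowercase_ascii "OCaml"

(* Only the first character is affected. *)
let cap = StringLabels.capitalize_ascii "ocaml"
let uncap = StringLabels.uncapitalize_ascii "OCaml"

(* The empty string is returned as is. *)
let empty_ok = StringLabels.capitalize_ascii ""
```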
317
318
319
320
321   Traversing
322       val iter : f:(char -> unit) -> string -> unit
323
324
325       iter ~f s applies function f in turn to all the characters of s  .   It
326       is equivalent to f s.[0]; f s.[1]; ...; f s.[length s - 1]; () .
327
328
329
330       val iteri : f:(int -> char -> unit) -> string -> unit
331
332
333       iteri  is  like  StringLabels.iter , but the function is also given the
334       corresponding character index.
335
336
337       Since 4.00.0
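
       Because iter and iteri return unit, results are usually accumulated in
       a reference. count_char and positions_of are hypothetical helpers
       written for this sketch.

```ocaml
(* Hypothetical helper: number of occurrences of c in s. *)
let count_char c s =
  let n = ref 0 in
  StringLabels.iter ~f:(fun x -> if x = c then incr n) s;
  !n

(* Hypothetical helper: indices of c in s, in increasing order. *)
let positions_of c s =
  let acc = ref [] in
  StringLabels.iteri ~f:(fun i x -> if x = c then acc := i :: !acc) s;
  List.rev !acc
```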
338
339
340
341
342   Searching
343       val index_from : string -> int -> char -> int
344
345
346       index_from s i c is the index of the first occurrence of c in  s  after
347       position i .
348
349
350       Raises Not_found if c does not occur in s after position i .
351
352
353       Raises Invalid_argument if i is not a valid position in s .
354
355
356
357       val index_from_opt : string -> int -> char -> int option
358
359
360       index_from_opt s i c is the index of the first occurrence of c in s af‐
361       ter position i (if any).
362
363
364       Since 4.05
365
366
367       Raises Invalid_argument if i is not a valid position in s .
368
369
370
371       val rindex_from : string -> int -> char -> int
372
373
374       rindex_from s i c is the index of the last occurrence of c in s  before
375       position i+1 .
376
377
378       Raises Not_found if c does not occur in s before position i+1 .
379
380
381       Raises Invalid_argument if i+1 is not a valid position in s .
382
383
384
385       val rindex_from_opt : string -> int -> char -> int option
386
387
388       rindex_from_opt s i c is the index of the last occurrence of c in s be‐
389       fore position i+1 (if any).
390
391
392       Since 4.05
393
394
395       Raises Invalid_argument if i+1 is not a valid position in s .
396
397
398
399       val index : string -> char -> int
400
401
402       index s c is String.index_from s 0 c .
403
404
405
406       val index_opt : string -> char -> int option
407
408
409       index_opt s c is String.index_from_opt s 0 c .
410
411
412       Since 4.05
413
414
415
416       val rindex : string -> char -> int
417
418
419       rindex s c is String.rindex_from s (length s - 1) c .
420
421
422
423       val rindex_opt : string -> char -> int option
424
425
426       rindex_opt s c is String.rindex_from_opt s (length s - 1) c .
427
428
429       Since 4.05
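
       The search functions can be combined as sketched below. Note that
       "after position i" includes the byte at index i itself. The value
       names are illustrative.

```ocaml
(* Indices: 'a' = 0, '.' = 1, 'b' = 2, '.' = 3, 'c' = 4. *)
let s = "a.b.c"

let first_dot = StringLabels.index s '.'
(* Searching from position 1 still finds the dot at index 1. *)
let same_dot = StringLabels.index_from s 1 '.'
let next_dot = StringLabels.index_from s (first_dot + 1) '.'
let last_dot = StringLabels.rindex s '.'

(* The _opt variants return None instead of raising Not_found. *)
let no_more = StringLabels.index_from_opt s 4 '.'

let missing_raises =
  match StringLabels.index s '!' with
  | _ -> false
  | exception Not_found -> true

(* Everything after the last dot, or "" if there is no dot. *)
let extension =
  match StringLabels.rindex_opt s '.' with
  | Some i ->
    StringLabels.sub s ~pos:(i + 1) ~len:(StringLabels.length s - i - 1)
  | None -> ""
```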
430
431
432
433
434   Converting
435       val to_seq : t -> char Seq.t
436
437
438       to_seq s is a sequence made of the string's  characters  in  increasing
439       order.  In "unsafe-string" mode, modifications of the string during it‐
440       eration will be reflected in the iterator.
441
442
443       Since 4.07
444
445
446
447       val to_seqi : t -> (int * char) Seq.t
448
449
450       to_seqi s is like StringLabels.to_seq but also tuples the corresponding
451       index.
452
453
454       Since 4.07
455
456
457
458       val of_seq : char Seq.t -> t
459
460
461       of_seq s is a string made of the sequence's characters.
462
463
464       Since 4.07
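
       A sketch of a round trip through sequences follows. It assumes the
       List and Seq functions introduced alongside these converters in 4.07;
       the value names are illustrative.

```ocaml
(* Characters in increasing index order. *)
let chars = List.of_seq (StringLabels.to_seq "abc")

(* to_seqi pairs each character with its index. *)
let indexed = List.of_seq (StringLabels.to_seqi "ab")

(* of_seq builds a string from any character sequence. *)
let reversed = StringLabels.of_seq (List.to_seq (List.rev chars))

(* Sequences compose with Seq.filter before converting back. *)
let vowels_removed =
  "sequence"
  |> StringLabels.to_seq
  |> Seq.filter (fun c -> not (StringLabels.contains "aeiou" c))
  |> StringLabels.of_seq
```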
465
466
467
468
469   Deprecated functions
470       val create : int -> bytes
471
472       Deprecated.   This  is  a  deprecated  alias of Bytes.create / BytesLa‐
473       bels.create .
474
475
476
477       create n returns a fresh byte sequence of length n .  The  sequence  is
478       uninitialized and contains arbitrary bytes.
479
480
481       Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
482
483
484
485       val set : bytes -> int -> char -> unit
486
       Deprecated. This is a deprecated alias of Bytes.set / BytesLabels.set .
489
490
491
492       set s n c modifies byte sequence s in place, replacing the byte at  in‐
493       dex n with c .  You can also write s.[n] <- c instead of set s n c .
494
495
496       Raises Invalid_argument if n is not a valid index in s .
497
498
499
500       val  blit  :  src:string  -> src_pos:int -> dst:bytes -> dst_pos:int ->
501       len:int -> unit
502
503
       blit ~src ~src_pos ~dst ~dst_pos ~len copies len bytes from the string
       src , starting at index src_pos , to byte sequence dst , starting at
       index dst_pos .
507
508
509       Raises Invalid_argument if src_pos and len do  not  designate  a  valid
510       range  of src , or if dst_pos and len do not designate a valid range of
511       dst .
512
513
514
515       val copy : string -> string
516
517       Deprecated.  Because strings are immutable, it doesn't make much  sense
518       to make identical copies of them.
519
520
521       Return a copy of the given string.
522
523
524
525       val fill : bytes -> pos:int -> len:int -> char -> unit
526
527       Deprecated.   This  is  a  deprecated  alias  of  Bytes.fill / BytesLa‐
528       bels.fill .
529
530
531
532       fill s ~pos ~len c modifies byte sequence s  in  place,  replacing  len
533       bytes by c , starting at pos .
534
535
536       Raises  Invalid_argument  if  pos and len do not designate a valid sub‐
537       string of s .
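
       In-place construction is expected to go through the Bytes module, as
       the deprecation notes above suggest. The following is a sketch under
       that assumption, not a prescribed migration path; the names buf and
       result are illustrative.

```ocaml
(* A fresh buffer; its contents are arbitrary until written. *)
let buf = Bytes.create 5

(* Fill all five bytes, then overwrite the first one. *)
let () = Bytes.fill buf 0 5 '.'
let () = Bytes.set buf 0 '>'

(* blit copies from an immutable string into the byte sequence. *)
let () =
  StringLabels.blit ~src:"ab" ~src_pos:0 ~dst:buf ~dst_pos:2 ~len:2

(* Convert to an immutable string once the buffer is complete. *)
let result = Bytes.to_string buf
```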
538
539
540
541       val uppercase : string -> string
542
       Deprecated. Functions operating on the Latin-1 character set are
       deprecated.
545
546
547       Return a copy of the argument, with all lowercase letters translated to
548       uppercase, including accented letters of the ISO Latin-1 (8859-1) char‐
549       acter set.
550
551
552
553       val lowercase : string -> string
554
       Deprecated. Functions operating on the Latin-1 character set are
       deprecated.
557
558
559       Return a copy of the argument, with all uppercase letters translated to
560       lowercase, including accented letters of the ISO Latin-1 (8859-1) char‐
561       acter set.
562
563
564
565       val capitalize : string -> string
566
       Deprecated. Functions operating on the Latin-1 character set are
       deprecated.
569
570
       Return a copy of the argument, with the first character set to
       uppercase, using the ISO Latin-1 (8859-1) character set.
573
574
575
576       val uncapitalize : string -> string
577
       Deprecated. Functions operating on the Latin-1 character set are
       deprecated.
580
581
582       Return  a  copy of the argument, with the first character set to lower‐
583       case, using the ISO Latin-1 (8859-1) character set.
584
585
586
587
588
589OCamldoc                          2021-07-22                   StringLabels(3)