Bytes(3)                         OCaml library                        Bytes(3)


NAME

6       Bytes - Byte sequence operations.
7

Module

9       Module   Bytes
10

Documentation

12       Module Bytes
13        : sig end
14
15
16       Byte sequence operations.
17
18       A   byte   sequence  is  a  mutable  data  structure  that  contains  a
19       fixed-length sequence of bytes. Each byte can be  indexed  in  constant
20       time for reading or writing.
21
22       Given a byte sequence s of length l , we can access each of the l bytes
23       of s via its index in the sequence. Indexes start at 0 ,  and  we  will
24       call an index valid in s if it falls within the range [0...l-1] (inclu‐
25       sive). A position is the point between two bytes or at the beginning or
26       end  of the sequence.  We call a position valid in s if it falls within
27       the range [0...l] (inclusive). Note that the byte at index n is between
28       positions n and n+1 .
29
30       Two  parameters  start and len are said to designate a valid range of s
31       if len >= 0 and start and start+len are valid positions in s .
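
       The definitions above can be restated as predicates. This is a minimal
       sketch: valid_index, valid_position and valid_range are hypothetical
       helpers written for illustration, not functions of the Bytes module,
       and the sketch ignores integer overflow of start + len.

```ocaml
(* Hypothetical helpers, not part of Bytes: they restate the definitions. *)
let valid_index s i = 0 <= i && i < Bytes.length s
let valid_position s p = 0 <= p && p <= Bytes.length s
let valid_range s start len =
  len >= 0 && valid_position s start && valid_position s (start + len)

let s = Bytes.of_string "abc"
let last_index_ok = valid_index s 2         (* true *)
let end_is_index = valid_index s 3          (* false: 3 is only a position *)
let end_is_position = valid_position s 3    (* true *)
let empty_range_at_end = valid_range s 3 0  (* true *)
let range_too_long = valid_range s 2 2      (* false: 2 + 2 > 3 *)
```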
32
33       Byte sequences can be modified in place, for instance via the  set  and
34       blit  functions  described  below.   See also strings (module String ),
35       which are almost the same data structure, but  cannot  be  modified  in
36       place.
37
38       Bytes are represented by the OCaml type char .
39
40
41       Since 4.02.0
42
43
44
45
46
47
48       val length : bytes -> int
49
50       Return the length (number of bytes) of the argument.
51
52
53
54       val get : bytes -> int -> char
55
56
57       get s n returns the byte at index n in argument s .
58
59       Raise Invalid_argument if n is not a valid index in s .
60
61
62
63       val set : bytes -> int -> char -> unit
64
65
66       set s n c modifies s in place, replacing the byte at index n with c .
67
68       Raise Invalid_argument if n is not a valid index in s .
69
70
71
72       val create : int -> bytes
73
74
75       create  n  returns  a  new  byte sequence of length n . The sequence is
76       uninitialized and contains arbitrary bytes.
77
78       Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
79
80
81
82       val make : int -> char -> bytes
83
84
85       make n c returns a new byte sequence of length n , filled with the byte
86       c .
87
88       Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
89
90
91
92       val init : int -> (int -> char) -> bytes
93
94
95       Bytes.init n f returns a fresh byte sequence of length n , with charac‐
96       ter i initialized to the result of f i (in increasing index order).
97
98       Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
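
       As an illustration, the three constructors can be compared side by
       side. This is only a sketch; the value names are chosen for the
       example.

```ocaml
(* make: every byte is the same character. *)
let dashes = Bytes.make 3 '-'

(* init: byte i is computed from its index. *)
let alphabet = Bytes.init 5 (fun i -> Char.chr (Char.code 'a' + i))

(* create: contents are arbitrary until every byte has been written. *)
let filled =
  let b = Bytes.create 4 in
  Bytes.fill b 0 4 'x';
  b

(* A negative length is rejected. *)
let negative_rejected =
  try ignore (Bytes.make (-1) 'x'); false
  with Invalid_argument _ -> true
```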
99
100
101
102       val empty : bytes
103
104       A byte sequence of size 0.
105
106
107
108       val copy : bytes -> bytes
109
110       Return a new byte sequence that contains the same bytes  as  the  argu‐
111       ment.
112
113
114
115       val of_string : string -> bytes
116
117       Return  a  new  byte sequence that contains the same bytes as the given
118       string.
119
120
121
122       val to_string : bytes -> string
123
124       Return a new string that contains the same  bytes  as  the  given  byte
125       sequence.
126
127
128
129       val sub : bytes -> int -> int -> bytes
130
131
132       sub  s start len returns a new byte sequence of length len , containing
133       the subsequence of s that starts at position start and has length len .
134
135       Raise Invalid_argument if start and len do not designate a valid  range
136       of s .
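
       A short sketch of sub on a concrete value; the names are illustrative
       only. Note that start = length s with len = 0 is a valid range, and
       that the result is a copy independent of s.

```ocaml
let s = Bytes.of_string "hello world"
let word = Bytes.sub s 6 5           (* "world" *)
let empty_tail = Bytes.sub s 11 0    (* valid: 11 is a position in s *)

(* 7 + 5 = 12 is not a valid position in a sequence of length 11. *)
let out_of_range =
  try ignore (Bytes.sub s 7 5); false
  with Invalid_argument _ -> true

(* Mutating the result does not affect the original. *)
let () = Bytes.set word 0 'W'
let original_unchanged = Bytes.to_string s
```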
137
138
139
140       val sub_string : bytes -> int -> int -> string
141
142       Same as sub but return a string instead of a byte sequence.
143
144
145
146       val extend : bytes -> int -> int -> bytes
147
148
149       extend s left right returns a new byte sequence that contains the bytes
150       of s , with left uninitialized bytes prepended and right  uninitialized
151       bytes  appended  to  it.  If  left or right is negative, then bytes are
152       removed (instead of appended) from the corresponding side of s .
153
154       Raise Invalid_argument if the result length is negative or longer  than
155       Sys.max_string_length bytes.
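
       The following sketch shows both growing and shrinking with extend.
       Because the added bytes are uninitialized, only the bytes copied from
       s are inspected here.

```ocaml
let s = Bytes.of_string "abc"

(* Two bytes prepended, one appended: length 6, "abc" sits at indices 2..4. *)
let padded = Bytes.extend s 2 1
let kept = Bytes.sub_string padded 2 3

(* Negative arguments remove bytes from the corresponding side. *)
let cropped = Bytes.extend s (-1) (-1)

(* Removing more bytes than s contains gives a negative result length. *)
let too_short =
  try ignore (Bytes.extend s (-2) (-2)); false
  with Invalid_argument _ -> true
```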
156
157
158
159       val fill : bytes -> int -> int -> char -> unit
160
161
162       fill s start len c modifies s in place, replacing len characters with c
163       , starting at start .
164
165       Raise Invalid_argument if start and len do not designate a valid  range
166       of s .
167
168
169
170       val blit : bytes -> int -> bytes -> int -> int -> unit
171
172
173       blit  src  srcoff  dst  dstoff len copies len bytes from sequence src ,
174       starting at index srcoff , to sequence dst , starting at index dstoff .
175       It  works correctly even if src and dst are the same byte sequence, and
176       the source and destination intervals overlap.
177
178       Raise Invalid_argument if srcoff and len do not designate a valid range
179       of src , or if dstoff and len do not designate a valid range of dst .
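
       To illustrate the overlapping case, here is a small sketch that copies
       a prefix of a sequence onto a later part of the same sequence. The
       result matches what copying from an untouched snapshot of the source
       would give.

```ocaml
let b = Bytes.of_string "abcdef"

(* Copy "abcd" (indices 0..3) onto indices 2..5 of the same sequence. *)
let () = Bytes.blit b 0 b 2 4
let shifted = Bytes.to_string b

(* An invalid destination range is rejected and leaves b untouched. *)
let rejected =
  try Bytes.blit b 0 b 4 4; false
  with Invalid_argument _ -> true
```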
180
181
182
183       val blit_string : string -> int -> bytes -> int -> int -> unit
184
185
186       blit  src  srcoff  dst  dstoff  len  copies len bytes from string src ,
187       starting at index srcoff , to byte sequence dst  ,  starting  at  index
188       dstoff .
189
190       Raise Invalid_argument if srcoff and len do not designate a valid range
191       of src , or if dstoff and len do not designate a valid range of dst .
192
193
194
195       val concat : bytes -> bytes list -> bytes
196
197
198       concat sep sl concatenates the list of byte sequences  sl  ,  inserting
199       the separator byte sequence sep between each, and returns the result as
200       a new byte sequence.
201
202       Raise   Invalid_argument    if    the    result    is    longer    than
203       Sys.max_string_length bytes.
204
205
206
207       val cat : bytes -> bytes -> bytes
208
209
       cat s1 s2 concatenates s1 and s2 and returns the result as a new byte
       sequence.
212
213       Raise   Invalid_argument    if    the    result    is    longer    than
214       Sys.max_string_length bytes.
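
       A brief sketch of concat and cat; the separator is inserted only
       between elements, so an empty list yields an empty sequence.

```ocaml
let sep = Bytes.of_string ", "
let parts = List.map Bytes.of_string ["a"; "b"; "c"]

let joined = Bytes.concat sep parts   (* "a, b, c" *)
let nothing = Bytes.concat sep []     (* empty byte sequence *)
let both = Bytes.cat (Bytes.of_string "ab") (Bytes.of_string "cd")
```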
215
216
217
218       val iter : (char -> unit) -> bytes -> unit
219
220
221       iter  f  s  applies  function  f in turn to all the bytes of s .  It is
222       equivalent to f (get s 0); f (get s 1); ...; f (get s (length s -  1));
223       () .
224
225
226
227       val iteri : (int -> char -> unit) -> bytes -> unit
228
229       Same  as  Bytes.iter  , but the function is applied to the index of the
230       byte as first argument and the byte itself as second argument.
231
232
233
234       val map : (char -> char) -> bytes -> bytes
235
236
237       map f s applies function f in turn to all the bytes of s (in increasing
238       index  order)  and stores the resulting bytes in a new sequence that is
239       returned as the result.
240
241
242
243       val mapi : (int -> char -> char) -> bytes -> bytes
244
245
246       mapi f s calls f with each character of s and its index (in  increasing
247       index  order)  and stores the resulting bytes in a new sequence that is
248       returned as the result.
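
       For illustration, a sketch contrasting map and mapi. Both return a new
       sequence and leave the argument unchanged. Char.uppercase_ascii is
       assumed to be available (OCaml 4.03 or later).

```ocaml
let s = Bytes.of_string "abc"

(* map: the function sees only the byte. *)
let upper = Bytes.map Char.uppercase_ascii s

(* mapi: the function also receives the index. *)
let alternating =
  Bytes.mapi (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c) s
```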
249
250
251
252       val trim : bytes -> bytes
253
254       Return a copy of the argument, without leading and trailing whitespace.
255       The  bytes regarded as whitespace are the ASCII characters ' ' , '\012'
256       , '\n' , '\r' , and '\t' .
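
       A small sketch of trim: only leading and trailing whitespace is
       removed, interior whitespace is kept. In OCaml, '\012' is a decimal
       escape, that is, the form feed character.

```ocaml
let raw = Bytes.of_string " \t hello world \r\n"
let trimmed = Bytes.to_string (Bytes.trim raw)

(* Form feed (decimal code 12) also counts as whitespace. *)
let no_form_feed = Bytes.to_string (Bytes.trim (Bytes.of_string "\012x\012"))
```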
257
258
259
260       val escaped : bytes -> bytes
261
262       Return a copy of the argument, with special characters  represented  by
263       escape  sequences,  following  the  lexical  conventions of OCaml.  All
264       characters outside the ASCII printable range (32..126) are escaped,  as
265       well as backslash and double-quote.
266
267       Raise    Invalid_argument    if    the    result    is    longer   than
268       Sys.max_string_length bytes.
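
       The sketch below applies escaped to a few inputs. escaped_of is a
       hypothetical wrapper introduced here so that the inputs and results
       can be written as strings.

```ocaml
(* Hypothetical convenience wrapper around Bytes.escaped. *)
let escaped_of str = Bytes.to_string (Bytes.escaped (Bytes.of_string str))

let tab = escaped_of "a\tb"      (* four characters: a, backslash, t, b *)
let quote = escaped_of "\""      (* two characters: backslash, double quote *)
let high = escaped_of "\255"     (* four characters: backslash, 2, 5, 5 *)
let plain = escaped_of "abc"     (* printable ASCII is left unchanged *)
```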
269
270
271
272       val index : bytes -> char -> int
273
274
275       index s c returns the index of the first occurrence of byte c in s .
276
277       Raise Not_found if c does not occur in s .
278
279
280
281       val index_opt : bytes -> char -> int option
282
283
284       index_opt s c returns the index of the first occurrence of byte c in  s
285       or None if c does not occur in s .
286
287
288       Since 4.05
289
290
291
292       val rindex : bytes -> char -> int
293
294
295       rindex s c returns the index of the last occurrence of byte c in s .
296
297       Raise Not_found if c does not occur in s .
298
299
300
301       val rindex_opt : bytes -> char -> int option
302
303
304       rindex_opt  s c returns the index of the last occurrence of byte c in s
305       or None if c does not occur in s .
306
307
308       Since 4.05
309
310
311
312       val index_from : bytes -> int -> char -> int
313
314
315       index_from s i c returns the index of the first occurrence of byte c in
316       s after position i .  Bytes.index s c is equivalent to Bytes.index_from
317       s 0 c .
318
319       Raise Invalid_argument if i is not a  valid  position  in  s  .   Raise
320       Not_found if c does not occur in s after position i .
321
322
323
324       val index_from_opt : bytes -> int -> char -> int option
325
326
       index_from_opt s i c returns the index of the first occurrence of byte
       c in s after position i or None if c does not occur in s after position
       i. Bytes.index_opt s c is equivalent to Bytes.index_from_opt s 0 c.
330
331       Raise Invalid_argument if i is not a valid position in s .
332
333
334       Since 4.05
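
       As a sketch of how the position argument behaves: a search "after
       position i" includes the byte at index i, and i = length s is a valid
       position that simply finds nothing. all_indices is a hypothetical
       helper, not part of the Bytes module.

```ocaml
let s = Bytes.of_string "a,b,c"

let first = Bytes.index_from s 0 ','     (* 1 *)
let second = Bytes.index_from s 2 ','    (* 3 *)

(* 5 = length s is a valid position, but nothing follows it. *)
let at_end = Bytes.index_from_opt s 5 ','

(* 6 is not a valid position in s. *)
let past_end =
  try ignore (Bytes.index_from s 6 ','); false
  with Invalid_argument _ -> true

(* Hypothetical helper: every index at which c occurs in s. *)
let all_indices s c =
  let rec go pos acc =
    match Bytes.index_from_opt s pos c with
    | None -> List.rev acc
    | Some i -> go (i + 1) (i :: acc)
  in
  go 0 []
```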
335
336
337
338       val rindex_from : bytes -> int -> char -> int
339
340
341       rindex_from s i c returns the index of the last occurrence of byte c in
342       s before position i+1 .  rindex s c  is  equivalent  to  rindex_from  s
343       (Bytes.length s - 1) c .
344
345       Raise  Invalid_argument  if  i+1  is not a valid position in s .  Raise
346       Not_found if c does not occur in s before position i+1 .
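
       A sketch of the backward search: rindex_from s i c considers the bytes
       at indices 0 to i inclusive. Since i+1 must be a valid position, i = -1
       is accepted (and finds nothing) while i = length s is rejected.

```ocaml
let s = Bytes.of_string "a/b/c"

let last = Bytes.rindex_from s 4 '/'       (* 3 *)
let earlier = Bytes.rindex_from s 2 '/'    (* 1 *)

(* Only index 0 is searched, and it holds 'a'. *)
let none_before =
  try Some (Bytes.rindex_from s 0 '/') with Not_found -> None

(* i = -1: position 0 is valid, the searched range is empty. *)
let from_minus_one =
  try Some (Bytes.rindex_from s (-1) '/') with Not_found -> None

(* i = 5: position 6 is not valid in a sequence of length 5. *)
let too_far =
  try ignore (Bytes.rindex_from s 5 '/'); false
  with Invalid_argument _ -> true
```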
347
348
349
350       val rindex_from_opt : bytes -> int -> char -> int option
351
352
       rindex_from_opt s i c returns the index of the last occurrence of byte
       c in s before position i+1 or None if c does not occur in s before
       position i+1. rindex_opt s c is equivalent to rindex_from_opt s
       (Bytes.length s - 1) c.
357
358       Raise Invalid_argument if i+1 is not a valid position in s .
359
360
361       Since 4.05
362
363
364
365       val contains : bytes -> char -> bool
366
367
368       contains s c tests if byte c appears in s .
369
370
371
372       val contains_from : bytes -> int -> char -> bool
373
374
375       contains_from  s  start  c  tests if byte c appears in s after position
376       start .  contains s c is equivalent to contains_from s 0 c .
377
378       Raise Invalid_argument if start is not a valid position in s .
379
380
381
382       val rcontains_from : bytes -> int -> char -> bool
383
384
385       rcontains_from s stop c tests if byte c appears in  s  before  position
386       stop+1 .
387
388       Raise Invalid_argument if stop < 0 or stop+1 is not a valid position in
389       s .
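
       The two directional tests can be sketched on one value. Note the
       asymmetry in the accepted arguments: contains_from accepts start =
       length s, whereas rcontains_from requires 0 <= stop < length s.

```ocaml
let s = Bytes.of_string "key=value"      (* '=' is at index 3 *)

let in_tail = Bytes.contains_from s 4 '='      (* false *)
let from_eq = Bytes.contains_from s 3 '='      (* true *)
let at_end = Bytes.contains_from s 9 '='       (* false, 9 = length s *)

let in_head = Bytes.rcontains_from s 2 '='     (* false *)
let up_to_eq = Bytes.rcontains_from s 3 '='    (* true *)

let negative_stop =
  try ignore (Bytes.rcontains_from s (-1) '='); false
  with Invalid_argument _ -> true
```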
390
391
392
393       val uppercase : bytes -> bytes
394
395       Deprecated.  Functions operating on Latin-1 character  set  are  depre‐
396       cated.
397
398
399       Return a copy of the argument, with all lowercase letters translated to
400       uppercase, including accented letters of the ISO Latin-1 (8859-1) char‐
401       acter set.
402
403
404
405       val lowercase : bytes -> bytes
406
407       Deprecated.   Functions  operating  on Latin-1 character set are depre‐
408       cated.
409
410
411       Return a copy of the argument, with all uppercase letters translated to
412       lowercase, including accented letters of the ISO Latin-1 (8859-1) char‐
413       acter set.
414
415
416
417       val capitalize : bytes -> bytes
418
419       Deprecated.  Functions operating on Latin-1 character  set  are  depre‐
420       cated.
421
422
       Return a copy of the argument, with the first character set to
       uppercase, using the ISO Latin-1 (8859-1) character set.
425
426
427
428       val uncapitalize : bytes -> bytes
429
430       Deprecated.  Functions operating on Latin-1 character  set  are  depre‐
431       cated.
432
433
       Return a copy of the argument, with the first character set to
       lowercase, using the ISO Latin-1 (8859-1) character set.
436
437
438
439       val uppercase_ascii : bytes -> bytes
440
441       Return a copy of the argument, with all lowercase letters translated to
442       uppercase, using the US-ASCII character set.
443
444
445       Since 4.03.0
446
447
448
449       val lowercase_ascii : bytes -> bytes
450
451       Return a copy of the argument, with all uppercase letters translated to
452       lowercase, using the US-ASCII character set.
453
454
455       Since 4.03.0
456
457
458
459       val capitalize_ascii : bytes -> bytes
460
461       Return a copy of the argument, with the first character set  to  upper‐
462       case, using the US-ASCII character set.
463
464
465       Since 4.03.0
466
467
468
469       val uncapitalize_ascii : bytes -> bytes
470
471       Return  a  copy of the argument, with the first character set to lower‐
472       case, using the US-ASCII character set.
473
474
475       Since 4.03.0
476
477
478       type t = bytes
479
480
481       An alias for the type of byte sequences.
482
483
484
485       val compare : t -> t -> int
486
487       The comparison function for byte sequences, with the same specification
488       as  Pervasives.compare .  Along with the type t , this function compare
489       allows the module Bytes to  be  passed  as  argument  to  the  functors
490       Set.Make and Map.Make .
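
       A minimal sketch of passing Bytes to Set.Make; BytesSet is a name
       chosen for this example. Since byte sequences are mutable, it is
       presumably wise not to mutate a sequence after adding it to a set, as
       the set relies on the ordering of its elements staying fixed.

```ocaml
module BytesSet = Set.Make (Bytes)

let set = BytesSet.of_list (List.map Bytes.of_string ["b"; "a"; "b"])

let size = BytesSet.cardinal set                        (* duplicates merged *)
let smallest = Bytes.to_string (BytesSet.min_elt set)   (* "a" *)
let has_b = BytesSet.mem (Bytes.of_string "b") set      (* compared by content *)
```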
491
492
493
494       val equal : t -> t -> bool
495
496       The equality function for byte sequences.
497
498
499       Since 4.03.0
500
501
502
503
       === Unsafe conversions (for advanced users) ===

       This section describes unsafe, low-level conversion functions between
       bytes and string. They do not copy the internal data; used improperly,
       they can break the immutability invariant on strings provided by the
       -safe-string option. They are available for expert library authors, but
       for most purposes you should use the always-correct Bytes.to_string and
       Bytes.of_string instead.
511
512
513       val unsafe_to_string : bytes -> string
514
515       Unsafely convert a byte sequence into a string.
516
517       To  reason about the use of unsafe_to_string , it is convenient to con‐
518       sider an "ownership" discipline. A piece of code that manipulates  some
519       data "owns" it; there are several disjoint ownership modes, including:
520
       - Unique ownership: the data may be accessed and mutated.

       - Shared ownership: the data has several owners, which may only access
       it, not mutate it.
525
526       Unique ownership is linear: passing the data to another piece  of  code
527       means  giving  up  ownership (we cannot write the data again). A unique
528       owner may decide to make the data shared (giving up mutation rights  on
529       it), but shared data may not become uniquely-owned again.
530
531
532       unsafe_to_string  s  can  only  be  used  when the caller owns the byte
533       sequence s -- either uniquely or as shared immutable data.  The  caller
534       gives up ownership of s , and gains ownership of the returned string.
535
536       There are two valid use-cases that respect this ownership discipline:
537
538       1.  Creating a string by initializing and mutating a byte sequence that
539       is never changed after initialization is performed.
540
541
       let string_init len f : string =
         let s = Bytes.create len in
         for i = 0 to len - 1 do Bytes.set s i (f i) done;
         Bytes.unsafe_to_string s
544
545       This  function  is  safe  because  the  byte  sequence  s will never be
546       accessed or mutated after unsafe_to_string is called.  The  string_init
547       code gives up ownership of s , and returns the ownership of the result‐
548       ing string to its caller.
549
550       Note that it would be unsafe if s was passed as an additional parameter
551       to  the  function  f  as it could escape this way and be mutated in the
552       future -- string_init would give up ownership of s to pass it  to  f  ,
553       and could not call unsafe_to_string safely.
554
555       We have provided the String.init , String.map and String.mapi functions
556       to cover most cases of building new strings. You  should  prefer  those
557       over to_string or unsafe_to_string whenever applicable.
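
       As a further sketch of this first use-case, here is a function that
       fills a local buffer and then gives up ownership of it. hex_of_string
       is a hypothetical example, not a standard library function; the
       pattern is only safe because buf never escapes and is not used after
       the conversion.

```ocaml
(* Hypothetical example: lowercase hexadecimal encoding of a string. *)
let hex_of_string (s : string) : string =
  let digits = "0123456789abcdef" in
  let n = String.length s in
  let buf = Bytes.create (2 * n) in
  for i = 0 to n - 1 do
    let code = Char.code s.[i] in
    Bytes.set buf (2 * i) digits.[code lsr 4];
    Bytes.set buf (2 * i + 1) digits.[code land 15]
  done;
  (* buf is local and never touched again: ownership is given up here. *)
  Bytes.unsafe_to_string buf
```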
558
       2. Temporarily giving ownership of a byte sequence to a function that
       expects a uniquely owned string and returns ownership back, so that we
       can mutate the sequence again after the call has ended.
562
563
564       let bytes_length (s : bytes) = String.length (Bytes.unsafe_to_string s)
565
566       In  this use-case, we do not promise that s will never be mutated after
567       the call to bytes_length s .  The  String.length  function  temporarily
568       borrows  unique ownership of the byte sequence (and sees it as a string
569       ), but returns this ownership back to the caller, which may assume that
       s is still a valid byte sequence after the call. Note that this is only
       correct because we know that String.length does not capture its
       argument; a function that did capture it could let it escape through a
       side channel such as a memoization combinator.
574
575       The caller may not mutate s while the string is borrowed (it has tempo‐
576       rarily  given up ownership). This affects concurrent programs, but also
577       higher-order functions: if  String.length  returned  a  closure  to  be
578       called  later,  s  should  not  be  mutated until this closure is fully
579       applied and returns ownership.
580
581
582
583       val unsafe_of_string : string -> bytes
584
585       Unsafely convert a shared string to a byte sequence that should not  be
586       mutated.
587
588       The  same  ownership  discipline  that  makes  unsafe_to_string correct
589       applies to unsafe_of_string : you may use it if you were the  owner  of
590       the string value, and you will own the return bytes in the same mode.
591
592       In  practice,  unique ownership of string values is extremely difficult
593       to reason about correctly. You should always assume strings are shared,
594       never uniquely owned.
595
596       For  example, string literals are implicitly shared by the compiler, so
597       you never uniquely own them.
598
599
       let incorrect = Bytes.unsafe_of_string "hello"
       let s = Bytes.of_string "hello"

       The first declaration is incorrect, because the string literal "hello"
604       could be shared by the compiler with other parts of  the  program,  and
605       mutating  incorrect  is  a bug. You must always use the second version,
606       which performs a copy and is thus correct.
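
       The safe variant can be sketched as follows: because of_string copies,
       mutating the resulting byte sequence leaves the original string
       untouched.

```ocaml
let original = "hello"

(* of_string copies, so the copy is uniquely owned and may be mutated. *)
let copy = Bytes.of_string original
let () = Bytes.set copy 0 'j'

let changed = Bytes.to_string copy
```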
607
       Assuming unique ownership of strings that are not string literals, but
       are (partly) built from string literals, is also incorrect. For example,
       mutating unsafe_of_string ("foo" ^ s) could mutate the shared string
       "foo" -- assuming a rope-like representation of strings. More generally,
       functions operating on strings assume shared ownership; they do not
       preserve unique ownership. It is thus incorrect to assume unique
       ownership of the result of unsafe_of_string.
615
       The only case in which we are reasonably confident that
       unsafe_of_string is safe is when the produced bytes is shared, that is,
       used as an immutable byte sequence. This is possibly useful for
       incremental migration of low-level programs that manipulate immutable
       sequences of bytes (for example Marshal.from_bytes) and previously used
       the string type for this purpose.
621
622
623
624
625       === Iterators ===
626
627
628       val to_seq : t -> char Seq.t
629
       Iterate on the byte sequence, in increasing index order. Modifications
       of the byte sequence during iteration will be reflected in the iterator.
632
633
634       Since 4.07
635
636
637
638       val to_seqi : t -> (int * char) Seq.t
639
       Iterate on the byte sequence, in increasing index order, yielding
       indices along with bytes.
642
643
644       Since 4.07
645
646
647
648       val of_seq : char Seq.t -> t
649
       Create a byte sequence from the generator.
651
652
653       Since 4.07
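
       A sketch of the three iterator functions, assuming OCaml 4.07 or later
       for Seq, List.of_seq and List.to_seq. The sequence returned by to_seq
       is lazy, so a write made before the sequence is consumed is visible in
       the result.

```ocaml
let b = Bytes.of_string "abc"

(* Nothing is read from b until the sequence is consumed. *)
let seq = Bytes.to_seq b
let () = Bytes.set b 2 'z'
let seen = List.of_seq seq

(* to_seqi pairs each byte with its index. *)
let indexed = List.of_seq (Bytes.to_seqi b)

(* of_seq builds a new byte sequence from a sequence of chars. *)
let rebuilt = Bytes.of_seq (List.to_seq ['o'; 'k'])
```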
654
655
656
657
658
OCamldoc                          2019-02-02                          Bytes(3)