Bytes(3)                         OCaml library                        Bytes(3)

Bytes - Byte sequence operations.

Module Bytes

Module Bytes : sig end
14
15
16 Byte sequence operations.
17
18 A byte sequence is a mutable data structure that contains a
19 fixed-length sequence of bytes. Each byte can be indexed in constant
20 time for reading or writing.
21
Given a byte sequence s of length l, we can access each of the l bytes
of s via its index in the sequence. Indexes start at 0, and we will
call an index valid in s if it falls within the range [0...l-1]
(inclusive). A position is the point between two bytes or at the
beginning or end of the sequence. We call a position valid in s if it
falls within the range [0...l] (inclusive). Note that the byte at index
n is between positions n and n+1.

Two parameters start and len are said to designate a valid range of s
if len >= 0 and start and start+len are valid positions in s.

Byte sequences can be modified in place, for instance via the set and
blit functions described below. See also strings (module String),
which are almost the same data structure, but cannot be modified in
place.
37
38 Bytes are represented by the OCaml type char .
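
The distinction between indexes, positions and valid ranges can be illustrated with a short sketch. The names s, get_at_length_fails and empty_tail are chosen for illustration only, and the bounds check assumes the code is not compiled with the -unsafe option.

```ocaml
(* A 5-byte sequence: valid indexes are 0..4, valid positions are 0..5. *)
let s = Bytes.of_string "hello"

(* Index 0 is valid; set mutates s in place. *)
let () = Bytes.set s 0 'j'

(* Index 5 is not a valid index, so get raises Invalid_argument. *)
let get_at_length_fails =
  match Bytes.get s 5 with
  | _ -> false
  | exception Invalid_argument _ -> true

(* Position 5 is valid, so start = 5 and len = 0 designate a valid
   (empty) range. *)
let empty_tail = Bytes.sub s 5 0
```
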
39
40
41 Since 4.02.0
42
43
44
45
46
47
48 val length : bytes -> int
49
50 Return the length (number of bytes) of the argument.
51
52
53
54 val get : bytes -> int -> char
55
56
57 get s n returns the byte at index n in argument s .
58
59 Raise Invalid_argument if n is not a valid index in s .
60
61
62
63 val set : bytes -> int -> char -> unit
64
65
66 set s n c modifies s in place, replacing the byte at index n with c .
67
68 Raise Invalid_argument if n is not a valid index in s .
69
70
71
72 val create : int -> bytes
73
74
75 create n returns a new byte sequence of length n . The sequence is
76 uninitialized and contains arbitrary bytes.
77
78 Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
79
80
81
82 val make : int -> char -> bytes
83
84
85 make n c returns a new byte sequence of length n , filled with the byte
86 c .
87
88 Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
89
90
91
92 val init : int -> (int -> char) -> bytes
93
94
Bytes.init n f returns a fresh byte sequence of length n, with
character i initialized to the result of f i (in increasing index order).
97
98 Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
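
As a rough illustration of the three constructors above, the following sketch builds small sequences; all value names are hypothetical.

```ocaml
(* make: every byte is the same character. *)
let dashes = Bytes.make 3 '-'

(* init: byte i is the result of f i; here the lowercase ASCII alphabet. *)
let alphabet = Bytes.init 26 (fun i -> Char.chr (Char.code 'a' + i))

(* create: the contents are arbitrary, so fill the sequence before
   reading it. *)
let zeroed =
  let b = Bytes.create 4 in
  Bytes.fill b 0 4 '\000';
  b

(* A negative length is rejected with Invalid_argument. *)
let negative_length_fails =
  match Bytes.make (-1) 'x' with
  | _ -> false
  | exception Invalid_argument _ -> true
```
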
99
100
101
102 val empty : bytes
103
104 A byte sequence of size 0.
105
106
107
108 val copy : bytes -> bytes
109
110 Return a new byte sequence that contains the same bytes as the argu‐
111 ment.
112
113
114
115 val of_string : string -> bytes
116
117 Return a new byte sequence that contains the same bytes as the given
118 string.
119
120
121
122 val to_string : bytes -> string
123
124 Return a new string that contains the same bytes as the given byte
125 sequence.
126
127
128
129 val sub : bytes -> int -> int -> bytes
130
131
132 sub s start len returns a new byte sequence of length len , containing
133 the subsequence of s that starts at position start and has length len .
134
135 Raise Invalid_argument if start and len do not designate a valid range
136 of s .
137
138
139
140 val sub_string : bytes -> int -> int -> string
141
142 Same as sub but return a string instead of a byte sequence.
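
A small sketch of sub and sub_string, assuming a hypothetical "key=value" input. It also shows that the result of sub is a fresh copy, independent of the original.

```ocaml
let line = Bytes.of_string "key=value"

(* A fresh byte sequence holding the 3 bytes that start at position 0. *)
let key = Bytes.sub line 0 3

(* The same kind of extraction, returned as an immutable string. *)
let value = Bytes.sub_string line 4 5

(* Mutating the copy leaves the original untouched. *)
let () = Bytes.set key 0 'K'

(* start = 4, len = 6 runs past the end (4 + 6 > 9), so it is rejected. *)
let bad_range_fails =
  match Bytes.sub line 4 6 with
  | _ -> false
  | exception Invalid_argument _ -> true
```
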
143
144
145
146 val extend : bytes -> int -> int -> bytes
147
148
149 extend s left right returns a new byte sequence that contains the bytes
150 of s , with left uninitialized bytes prepended and right uninitialized
151 bytes appended to it. If left or right is negative, then bytes are
152 removed (instead of appended) from the corresponding side of s .
153
154 Raise Invalid_argument if the result length is negative or longer than
155 Sys.max_string_length bytes.
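
The effect of positive and negative arguments to extend can be sketched as follows; the value names are illustrative. Only the bytes copied from s are inspected, since the added bytes are uninitialized.

```ocaml
let s = Bytes.of_string "abcdef"

(* Two uninitialized bytes in front, one behind: length 6 + 2 + 1 = 9,
   with the bytes of s at indexes 2..7. *)
let padded = Bytes.extend s 2 1

(* Negative amounts remove bytes instead: drop "ab" on the left and "f"
   on the right. *)
let cropped = Bytes.extend s (-2) (-1)
```
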
156
157
158
159 val fill : bytes -> int -> int -> char -> unit
160
161
fill s start len c modifies s in place, replacing len bytes with c,
starting at index start.
164
165 Raise Invalid_argument if start and len do not designate a valid range
166 of s .
167
168
169
170 val blit : bytes -> int -> bytes -> int -> int -> unit
171
172
173 blit src srcoff dst dstoff len copies len bytes from sequence src ,
174 starting at index srcoff , to sequence dst , starting at index dstoff .
175 It works correctly even if src and dst are the same byte sequence, and
176 the source and destination intervals overlap.
177
178 Raise Invalid_argument if srcoff and len do not designate a valid range
179 of src , or if dstoff and len do not designate a valid range of dst .
180
181
182
183 val blit_string : string -> int -> bytes -> int -> int -> unit
184
185
186 blit_string src srcoff dst dstoff len copies len bytes from string src
187 , starting at index srcoff , to byte sequence dst , starting at index
188 dstoff .
189
190 Raise Invalid_argument if srcoff and len do not designate a valid range
191 of src , or if dstoff and len do not designate a valid range of dst .
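
The following sketch illustrates blit on overlapping intervals of the same sequence, and blit_string into a pre-filled buffer. The names buf and dst are hypothetical.

```ocaml
(* Overlapping copy within one sequence: the 4 bytes "abcd" at index 0
   are copied to index 2, as if through a temporary buffer. *)
let buf = Bytes.of_string "abcdef"
let () = Bytes.blit buf 0 buf 2 4

(* Copy 2 bytes from a string into the middle of a byte sequence. *)
let dst = Bytes.make 5 '.'
let () = Bytes.blit_string "hi" 0 dst 1 2
```
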
192
193
194
195 val concat : bytes -> bytes list -> bytes
196
197
198 concat sep sl concatenates the list of byte sequences sl , inserting
199 the separator byte sequence sep between each, and returns the result as
200 a new byte sequence.
201
202 Raise Invalid_argument if the result is longer than
203 Sys.max_string_length bytes.
204
205
206
207 val cat : bytes -> bytes -> bytes
208
209
cat s1 s2 concatenates s1 and s2 and returns the result as a new byte
sequence.
212
213 Raise Invalid_argument if the result is longer than
214 Sys.max_string_length bytes.
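
A brief sketch of concat and cat on small inputs; the value names are illustrative.

```ocaml
let parts = List.map Bytes.of_string ["a"; "b"; "c"]

(* The separator is inserted between elements only, not at the ends. *)
let joined = Bytes.concat (Bytes.of_string ", ") parts

(* Concatenating an empty list yields an empty byte sequence. *)
let nothing = Bytes.concat (Bytes.of_string ", ") []

(* cat joins exactly two sequences. *)
let pair = Bytes.cat (Bytes.of_string "foo") (Bytes.of_string "bar")
```
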
215
216
217
218 val iter : (char -> unit) -> bytes -> unit
219
220
iter f s applies function f in turn to all the bytes of s. It is
equivalent to f (get s 0); f (get s 1); ...; f (get s (length s - 1)); ().
224
225
226
227 val iteri : (int -> char -> unit) -> bytes -> unit
228
229 Same as Bytes.iter , but the function is applied to the index of the
230 byte as first argument and the byte itself as second argument.
231
232
233
234 val map : (char -> char) -> bytes -> bytes
235
236
237 map f s applies function f in turn to all the bytes of s (in increasing
238 index order) and stores the resulting bytes in a new sequence that is
239 returned as the result.
240
241
242
243 val mapi : (int -> char -> char) -> bytes -> bytes
244
245
246 mapi f s calls f with each character of s and its index (in increasing
247 index order) and stores the resulting bytes in a new sequence that is
248 returned as the result.
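
The traversal functions above can be sketched on a three-byte input. The names sum, shifted and alternating are hypothetical.

```ocaml
let s = Bytes.of_string "abc"

(* iter: accumulate the byte codes (97 + 98 + 99). *)
let sum =
  let total = ref 0 in
  Bytes.iter (fun c -> total := !total + Char.code c) s;
  !total

(* map: build a new sequence; s itself is left unchanged. *)
let shifted = Bytes.map (fun c -> Char.chr (Char.code c + 1)) s

(* mapi: the index is passed first; uppercase the bytes at even indexes. *)
let alternating =
  Bytes.mapi (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c) s
```
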
249
250
251
252 val trim : bytes -> bytes
253
254 Return a copy of the argument, without leading and trailing whitespace.
The bytes regarded as whitespace are the ASCII characters ' ', '\012'
(form feed), '\n', '\r', and '\t'.
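
As a small illustration, trim removes only leading and trailing whitespace and returns a copy. The inputs below are made up for the example.

```ocaml
let raw = Bytes.of_string " \t hello world \r\n"

(* Inner whitespace is preserved; raw itself is not modified. *)
let trimmed = Bytes.trim raw

(* '\012' is a decimal escape, that is, the form feed character. *)
let without_form_feeds = Bytes.trim (Bytes.of_string "\012x\012")
```
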
257
258
259
260 val escaped : bytes -> bytes
261
262 Return a copy of the argument, with special characters represented by
263 escape sequences, following the lexical conventions of OCaml. All
264 characters outside the ASCII printable range (32..126) are escaped, as
265 well as backslash and double-quote.
266
267 Raise Invalid_argument if the result is longer than
268 Sys.max_string_length bytes.
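
A sketch of escaped on a hypothetical input containing a newline, a double quote and a backslash. Each of these becomes a two-byte escape sequence in the result.

```ocaml
(* Five bytes: 'a', newline, double quote, 'b', backslash. *)
let raw = Bytes.of_string "a\n\"b\\"

(* Eight bytes: a \ n \ " b \ \ *)
let esc = Bytes.escaped raw
```
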
269
270
271
272 val index : bytes -> char -> int
273
274
275 index s c returns the index of the first occurrence of byte c in s .
276
277 Raise Not_found if c does not occur in s .
278
279
280
281 val index_opt : bytes -> char -> int option
282
283
284 index_opt s c returns the index of the first occurrence of byte c in s
285 or None if c does not occur in s .
286
287
288 Since 4.05
289
290
291
292 val rindex : bytes -> char -> int
293
294
295 rindex s c returns the index of the last occurrence of byte c in s .
296
297 Raise Not_found if c does not occur in s .
298
299
300
301 val rindex_opt : bytes -> char -> int option
302
303
304 rindex_opt s c returns the index of the last occurrence of byte c in s
305 or None if c does not occur in s .
306
307
308 Since 4.05
309
310
311
312 val index_from : bytes -> int -> char -> int
313
314
315 index_from s i c returns the index of the first occurrence of byte c in
316 s after position i . Bytes.index s c is equivalent to Bytes.index_from
317 s 0 c .
318
319 Raise Invalid_argument if i is not a valid position in s . Raise
320 Not_found if c does not occur in s after position i .
321
322
323
324 val index_from_opt : bytes -> int -> char -> int option
325
326
327 index_from_opt s i c returns the index of the first occurrence of byte
328 c in s after position i or None if c does not occur in s after position
329 i . Bytes.index_opt s c is equivalent to Bytes.index_from_opt s 0 c .
330
331 Raise Invalid_argument if i is not a valid position in s .
332
333
334 Since 4.05
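
One way to use index_from_opt is to collect every occurrence of a byte. The helper all_indices below is hypothetical, not part of the module. Since Bytes.length s is a valid position, resuming at j + 1 never raises.

```ocaml
(* Collect the indexes of every occurrence of c in s, in increasing order. *)
let all_indices (s : bytes) (c : char) : int list =
  let rec go i acc =
    match Bytes.index_from_opt s i c with
    | None -> List.rev acc
    | Some j -> go (j + 1) (j :: acc)
  in
  go 0 []

let banana = Bytes.of_string "banana"

(* The search starting at position i includes the byte at index i itself. *)
let at_one = Bytes.index_from_opt banana 1 'a'
```
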
335
336
337
338 val rindex_from : bytes -> int -> char -> int
339
340
341 rindex_from s i c returns the index of the last occurrence of byte c in
342 s before position i+1 . rindex s c is equivalent to rindex_from s
343 (Bytes.length s - 1) c .
344
345 Raise Invalid_argument if i+1 is not a valid position in s . Raise
346 Not_found if c does not occur in s before position i+1 .
347
348
349
350 val rindex_from_opt : bytes -> int -> char -> int option
351
352
rindex_from_opt s i c returns the index of the last occurrence of byte
c in s before position i+1 or None if c does not occur in s before
position i+1. rindex_opt s c is equivalent to rindex_from_opt s
(Bytes.length s - 1) c.
357
358 Raise Invalid_argument if i+1 is not a valid position in s .
359
360
361 Since 4.05
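
The backward searches can be sketched in the same way; the value names are illustrative. Note that i = -1 is accepted, because i+1 = 0 is a valid position.

```ocaml
let s = Bytes.of_string "banana"

(* Last occurrence in the whole sequence. *)
let last_a = Bytes.rindex_opt s 'a'

(* Last occurrence before position 5, that is, at index 4 or lower. *)
let a_up_to_4 = Bytes.rindex_from_opt s 4 'a'

(* Nothing lies before position 0, so the result is None. *)
let before_start = Bytes.rindex_from_opt s (-1) 'a'
```
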
362
363
364
365 val contains : bytes -> char -> bool
366
367
368 contains s c tests if byte c appears in s .
369
370
371
372 val contains_from : bytes -> int -> char -> bool
373
374
contains_from s start c tests if byte c appears in s after position
start. contains s c is equivalent to contains_from s 0 c.
378
379 Raise Invalid_argument if start is not a valid position in s .
380
381
382
383 val rcontains_from : bytes -> int -> char -> bool
384
385
386 rcontains_from s stop c tests if byte c appears in s before position
387 stop+1 .
388
389 Raise Invalid_argument if stop < 0 or stop+1 is not a valid position in
390 s .
391
392
393
394 val uppercase : bytes -> bytes
395
396 Deprecated. Functions operating on Latin-1 character set are depre‐
397 cated.
398
399
400 Return a copy of the argument, with all lowercase letters translated to
401 uppercase, including accented letters of the ISO Latin-1 (8859-1) char‐
402 acter set.
403
404
405
406 val lowercase : bytes -> bytes
407
408 Deprecated. Functions operating on Latin-1 character set are depre‐
409 cated.
410
411
412 Return a copy of the argument, with all uppercase letters translated to
413 lowercase, including accented letters of the ISO Latin-1 (8859-1) char‐
414 acter set.
415
416
417
418 val capitalize : bytes -> bytes
419
420 Deprecated. Functions operating on Latin-1 character set are depre‐
421 cated.
422
423
Return a copy of the argument, with the first character set to
uppercase, using the ISO Latin-1 (8859-1) character set.
426
427
428
429 val uncapitalize : bytes -> bytes
430
431 Deprecated. Functions operating on Latin-1 character set are depre‐
432 cated.
433
434
Return a copy of the argument, with the first character set to
lowercase, using the ISO Latin-1 (8859-1) character set.
437
438
439
440 val uppercase_ascii : bytes -> bytes
441
442 Return a copy of the argument, with all lowercase letters translated to
443 uppercase, using the US-ASCII character set.
444
445
446 Since 4.03.0
447
448
449
450 val lowercase_ascii : bytes -> bytes
451
452 Return a copy of the argument, with all uppercase letters translated to
453 lowercase, using the US-ASCII character set.
454
455
456 Since 4.03.0
457
458
459
460 val capitalize_ascii : bytes -> bytes
461
462 Return a copy of the argument, with the first character set to upper‐
463 case, using the US-ASCII character set.
464
465
466 Since 4.03.0
467
468
469
470 val uncapitalize_ascii : bytes -> bytes
471
472 Return a copy of the argument, with the first character set to lower‐
473 case, using the US-ASCII character set.
474
475
476 Since 4.03.0
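
A short sketch of the US-ASCII case functions. The byte '\xe9' (an accented letter in Latin-1) is included to show that non-ASCII bytes are left unchanged; the value names are hypothetical.

```ocaml
let s = Bytes.of_string "hello World \xe9"

(* Only the ASCII letters a-z are translated; '\xe9' is untouched. *)
let upper = Bytes.uppercase_ascii s

(* Only the first byte is affected. *)
let cap = Bytes.capitalize_ascii (Bytes.of_string "hello")
let uncap = Bytes.uncapitalize_ascii (Bytes.of_string "Hello")
```
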
477
478
479 type t = bytes
480
481
482 An alias for the type of byte sequences.
483
484
485
486 val compare : t -> t -> int
487
The comparison function for byte sequences, with the same specification
as Stdlib.compare. Along with the type t, this function compare allows
the module Bytes to be passed as argument to the functors Set.Make and
Map.Make.
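
Passing the module to a functor might look like the sketch below; BytesSet is a hypothetical module name. Because byte sequences are mutable, elements should not be mutated once they are in the set, otherwise the ordering invariant of the set may be broken.

```ocaml
(* Bytes provides both t and compare, which is all Set.Make requires. *)
module BytesSet = Set.Make (Bytes)

(* Duplicates are detected by content, not by physical identity. *)
let set = BytesSet.of_list (List.map Bytes.of_string ["b"; "a"; "b"])

(* Elements are returned in increasing order according to Bytes.compare. *)
let sorted = List.map Bytes.to_string (BytesSet.elements set)
```
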
492
493
494
495 val equal : t -> t -> bool
496
497 The equality function for byte sequences.
498
499
500 Since 4.03.0
501
502
503
504
505 Unsafe conversions (for advanced users)
506 This section describes unsafe, low-level conversion functions between
507 bytes and string . They do not copy the internal data; used improperly,
508 they can break the immutability invariant on strings provided by the
509 -safe-string option. They are available for expert library authors, but
510 for most purposes you should use the always-correct Bytes.to_string and
511 Bytes.of_string instead.
512
513 val unsafe_to_string : bytes -> string
514
515 Unsafely convert a byte sequence into a string.
516
517 To reason about the use of unsafe_to_string , it is convenient to con‐
518 sider an "ownership" discipline. A piece of code that manipulates some
519 data "owns" it; there are several disjoint ownership modes, including:
520
- Unique ownership: the data may be accessed and mutated.

- Shared ownership: the data has several owners, which may only access
it, not mutate it.
525
526 Unique ownership is linear: passing the data to another piece of code
527 means giving up ownership (we cannot write the data again). A unique
528 owner may decide to make the data shared (giving up mutation rights on
529 it), but shared data may not become uniquely-owned again.
530
531
532 unsafe_to_string s can only be used when the caller owns the byte
533 sequence s -- either uniquely or as shared immutable data. The caller
534 gives up ownership of s , and gains ownership of the returned string.
535
536 There are two valid use-cases that respect this ownership discipline:
537
538 1. Creating a string by initializing and mutating a byte sequence that
539 is never changed after initialization is performed.
540
541
let string_init len f : string =
  let s = Bytes.create len in
  for i = 0 to len - 1 do Bytes.set s i (f i) done;
  Bytes.unsafe_to_string s
546
547
548 This function is safe because the byte sequence s will never be
549 accessed or mutated after unsafe_to_string is called. The string_init
550 code gives up ownership of s , and returns the ownership of the result‐
551 ing string to its caller.
552
553 Note that it would be unsafe if s was passed as an additional parameter
554 to the function f as it could escape this way and be mutated in the
555 future -- string_init would give up ownership of s to pass it to f ,
556 and could not call unsafe_to_string safely.
557
558 We have provided the String.init , String.map and String.mapi functions
559 to cover most cases of building new strings. You should prefer those
560 over to_string or unsafe_to_string whenever applicable.
561
562 2. Temporarily giving ownership of a byte sequence to a function that
563 expects a uniquely owned string and returns ownership back, so that we
564 can mutate the sequence again after the call ended.
565
566
let bytes_length (s : bytes) =
  String.length (Bytes.unsafe_to_string s)
569
570
In this use-case, we do not promise that s will never be mutated after
the call to bytes_length s. The String.length function temporarily
borrows unique ownership of the byte sequence (and sees it as a
string), but returns this ownership back to the caller, which may
assume that s is still a valid byte sequence after the call. Note that
this is only correct because we know that String.length does not
capture its argument; a function that did capture it could let it
escape through a side channel such as a memoization combinator.
579
580 The caller may not mutate s while the string is borrowed (it has tempo‐
581 rarily given up ownership). This affects concurrent programs, but also
582 higher-order functions: if String.length returned a closure to be
583 called later, s should not be mutated until this closure is fully
584 applied and returns ownership.
585
586
587
588 val unsafe_of_string : string -> bytes
589
Unsafely convert a shared string to a byte sequence that should not be
mutated.

The same ownership discipline that makes unsafe_to_string correct
applies to unsafe_of_string: you may use it if you were the owner of
the string value, and you will own the returned bytes in the same mode.
596
597 In practice, unique ownership of string values is extremely difficult
598 to reason about correctly. You should always assume strings are shared,
599 never uniquely owned.
600
601 For example, string literals are implicitly shared by the compiler, so
602 you never uniquely own them.
603
604
let incorrect = Bytes.unsafe_of_string "hello"
let s = Bytes.of_string "hello"
607
608
609 The first declaration is incorrect, because the string literal "hello"
610 could be shared by the compiler with other parts of the program, and
611 mutating incorrect is a bug. You must always use the second version,
612 which performs a copy and is thus correct.
613
Assuming unique ownership of strings that are not string literals, but
are (partly) built from string literals, is also incorrect. For
example, mutating unsafe_of_string ("foo" ^ s) could mutate the shared
string "foo" -- assuming a rope-like representation of strings. More
generally, functions operating on strings assume shared ownership; they
do not preserve unique ownership. It is thus incorrect to assume
unique ownership of the result of unsafe_of_string.
621
The only case that we are reasonably confident is safe is when the
produced bytes value is shared, that is, used as an immutable byte
sequence. This may be useful for incremental migration of low-level
programs that manipulate immutable sequences of bytes (for example
Marshal.from_bytes) and previously used the string type for this
purpose.
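
A minimal sketch of that shared, read-only use, assuming a hypothetical helper byte_sum that only reads the byte sequence and never lets it escape. Any write to b here would be a bug, since the string is still shared with the caller.

```ocaml
(* Sum the byte codes of a string through a bytes view, without copying.
   The view b is treated as shared immutable data: it is read but never
   mutated, stored, or returned. *)
let byte_sum (s : string) : int =
  let b = Bytes.unsafe_of_string s in
  let total = ref 0 in
  for i = 0 to Bytes.length b - 1 do
    total := !total + Char.code (Bytes.get b i)
  done;
  !total
```

When a copy is affordable, Bytes.of_string remains the always-correct choice.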
627
628
629
630
631 Iterators
632 val to_seq : t -> char Seq.t
633
Iterate on the byte sequence, in increasing index order. Modifications
of the byte sequence during iteration will be reflected in the iterator.
636
637
638 Since 4.07
639
640
641
642 val to_seqi : t -> (int * char) Seq.t
643
Iterate on the byte sequence, in increasing index order, yielding
indices along with bytes.
646
647
648 Since 4.07
649
650
651
652 val of_seq : char Seq.t -> t
653
Create a byte sequence from the generator.
655
656
657 Since 4.07
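
The iterators compose with the Seq and List modules (both available since 4.07). The sketch below filters a made-up input and lists indexed bytes; the value names are hypothetical.

```ocaml
let b = Bytes.of_string "a1b2c3"

(* Keep only the lowercase ASCII letters, then rebuild a byte sequence. *)
let letters =
  Bytes.of_seq (Seq.filter (fun c -> c >= 'a' && c <= 'z') (Bytes.to_seq b))

(* to_seqi pairs each byte with its index. *)
let indexed = List.of_seq (Bytes.to_seqi (Bytes.of_string "hi"))
```
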
658
659
660
661
662 Binary encoding/decoding of integers
663 The functions in this section binary encode and decode integers to and
664 from byte sequences.
665
666 All following functions raise Invalid_argument if the space needed at
667 index i to decode or encode the integer is not available.
668
669 Little-endian (resp. big-endian) encoding means that least (resp. most)
670 significant bytes are stored first. Big-endian is also known as net‐
671 work byte order. Native-endian encoding is either little-endian or
672 big-endian depending on Sys.big_endian .
673
674 32-bit and 64-bit integers are represented by the int32 and int64
675 types, which can be interpreted either as signed or unsigned numbers.
676
677 8-bit and 16-bit integers are represented by the int type, which has
678 more bits than the binary encoding. These extra bits are handled as
679 follows:
680
- Functions that decode signed (resp. unsigned) 8-bit or 16-bit
integers represented by int values sign-extend (resp. zero-extend)
their result.

- Functions that encode 8-bit or 16-bit integers represented by int
values truncate their input to their least significant bytes.
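
The truncation and extension rules above can be sketched on a two-byte buffer; the value names are illustrative.

```ocaml
let b = Bytes.make 2 '\000'

(* Encoding truncates: only the least significant byte of 0x1FF is stored. *)
let () = Bytes.set_uint8 b 0 0x1FF

(* The same stored byte 0xFF decodes differently depending on signedness. *)
let as_unsigned = Bytes.get_uint8 b 0   (* zero-extended *)
let as_signed = Bytes.get_int8 b 0      (* sign-extended *)

(* Big-endian stores the most significant byte first. *)
let () = Bytes.set_uint16_be b 0 0x1234
let first_byte = Bytes.get_uint8 b 0

(* Reading the same two bytes as little-endian swaps them. *)
let le_view = Bytes.get_uint16_le b 0
```
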
686
687
688 val get_uint8 : bytes -> int -> int
689
690
691 get_uint8 b i is b 's unsigned 8-bit integer starting at byte index i .
692
693
694 Since 4.08
695
696
697
698 val get_int8 : bytes -> int -> int
699
700
701 get_int8 b i is b 's signed 8-bit integer starting at byte index i .
702
703
704 Since 4.08
705
706
707
708 val get_uint16_ne : bytes -> int -> int
709
710
711 get_uint16_ne b i is b 's native-endian unsigned 16-bit integer start‐
712 ing at byte index i .
713
714
715 Since 4.08
716
717
718
719 val get_uint16_be : bytes -> int -> int
720
721
722 get_uint16_be b i is b 's big-endian unsigned 16-bit integer starting
723 at byte index i .
724
725
726 Since 4.08
727
728
729
730 val get_uint16_le : bytes -> int -> int
731
732
733 get_uint16_le b i is b 's little-endian unsigned 16-bit integer start‐
734 ing at byte index i .
735
736
737 Since 4.08
738
739
740
741 val get_int16_ne : bytes -> int -> int
742
743
744 get_int16_ne b i is b 's native-endian signed 16-bit integer starting
745 at byte index i .
746
747
748 Since 4.08
749
750
751
752 val get_int16_be : bytes -> int -> int
753
754
755 get_int16_be b i is b 's big-endian signed 16-bit integer starting at
756 byte index i .
757
758
759 Since 4.08
760
761
762
763 val get_int16_le : bytes -> int -> int
764
765
766 get_int16_le b i is b 's little-endian signed 16-bit integer starting
767 at byte index i .
768
769
770 Since 4.08
771
772
773
774 val get_int32_ne : bytes -> int -> int32
775
776
777 get_int32_ne b i is b 's native-endian 32-bit integer starting at byte
778 index i .
779
780
781 Since 4.08
782
783
784
785 val get_int32_be : bytes -> int -> int32
786
787
788 get_int32_be b i is b 's big-endian 32-bit integer starting at byte
789 index i .
790
791
792 Since 4.08
793
794
795
796 val get_int32_le : bytes -> int -> int32
797
798
799 get_int32_le b i is b 's little-endian 32-bit integer starting at byte
800 index i .
801
802
803 Since 4.08
804
805
806
807 val get_int64_ne : bytes -> int -> int64
808
809
810 get_int64_ne b i is b 's native-endian 64-bit integer starting at byte
811 index i .
812
813
814 Since 4.08
815
816
817
818 val get_int64_be : bytes -> int -> int64
819
820
821 get_int64_be b i is b 's big-endian 64-bit integer starting at byte
822 index i .
823
824
825 Since 4.08
826
827
828
829 val get_int64_le : bytes -> int -> int64
830
831
832 get_int64_le b i is b 's little-endian 64-bit integer starting at byte
833 index i .
834
835
836 Since 4.08
837
838
839
840 val set_uint8 : bytes -> int -> int -> unit
841
842
843 set_uint8 b i v sets b 's unsigned 8-bit integer starting at byte index
844 i to v .
845
846
847 Since 4.08
848
849
850
851 val set_int8 : bytes -> int -> int -> unit
852
853
854 set_int8 b i v sets b 's signed 8-bit integer starting at byte index i
855 to v .
856
857
858 Since 4.08
859
860
861
862 val set_uint16_ne : bytes -> int -> int -> unit
863
864
865 set_uint16_ne b i v sets b 's native-endian unsigned 16-bit integer
866 starting at byte index i to v .
867
868
869 Since 4.08
870
871
872
873 val set_uint16_be : bytes -> int -> int -> unit
874
875
876 set_uint16_be b i v sets b 's big-endian unsigned 16-bit integer start‐
877 ing at byte index i to v .
878
879
880 Since 4.08
881
882
883
884 val set_uint16_le : bytes -> int -> int -> unit
885
886
887 set_uint16_le b i v sets b 's little-endian unsigned 16-bit integer
888 starting at byte index i to v .
889
890
891 Since 4.08
892
893
894
895 val set_int16_ne : bytes -> int -> int -> unit
896
897
898 set_int16_ne b i v sets b 's native-endian signed 16-bit integer start‐
899 ing at byte index i to v .
900
901
902 Since 4.08
903
904
905
906 val set_int16_be : bytes -> int -> int -> unit
907
908
909 set_int16_be b i v sets b 's big-endian signed 16-bit integer starting
910 at byte index i to v .
911
912
913 Since 4.08
914
915
916
917 val set_int16_le : bytes -> int -> int -> unit
918
919
920 set_int16_le b i v sets b 's little-endian signed 16-bit integer start‐
921 ing at byte index i to v .
922
923
924 Since 4.08
925
926
927
928 val set_int32_ne : bytes -> int -> int32 -> unit
929
930
931 set_int32_ne b i v sets b 's native-endian 32-bit integer starting at
932 byte index i to v .
933
934
935 Since 4.08
936
937
938
939 val set_int32_be : bytes -> int -> int32 -> unit
940
941
942 set_int32_be b i v sets b 's big-endian 32-bit integer starting at byte
943 index i to v .
944
945
946 Since 4.08
947
948
949
950 val set_int32_le : bytes -> int -> int32 -> unit
951
952
953 set_int32_le b i v sets b 's little-endian 32-bit integer starting at
954 byte index i to v .
955
956
957 Since 4.08
958
959
960
961 val set_int64_ne : bytes -> int -> int64 -> unit
962
963
964 set_int64_ne b i v sets b 's native-endian 64-bit integer starting at
965 byte index i to v .
966
967
968 Since 4.08
969
970
971
972 val set_int64_be : bytes -> int -> int64 -> unit
973
974
975 set_int64_be b i v sets b 's big-endian 64-bit integer starting at byte
976 index i to v .
977
978
979 Since 4.08
980
981
982
983 val set_int64_le : bytes -> int -> int64 -> unit
984
985
986 set_int64_le b i v sets b 's little-endian 64-bit integer starting at
987 byte index i to v .
988
989
990 Since 4.08
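
As a rough end-to-end illustration of the 32-bit and 64-bit accessors, the sketch below writes a hypothetical 12-byte header and reads parts of it back. The layout and the names are assumptions made for the example.

```ocaml
(* 4 bytes big-endian followed by 8 bytes little-endian. *)
let header = Bytes.create 12
let () = Bytes.set_int32_be header 0 0x01020304l
let () = Bytes.set_int64_le header 4 0x0102030405060708L

(* Big-endian: the most significant byte comes first. *)
let first_byte = Bytes.get_uint8 header 0

(* Little-endian: the least significant byte comes first. *)
let fifth_byte = Bytes.get_uint8 header 4

(* Decoding with the matching byte order recovers the value. *)
let round_trip = Bytes.get_int32_be header 0

(* Decoding with the other byte order reverses the bytes. *)
let swapped = Bytes.get_int32_le header 0

(* An 8-byte read at index 5 needs indexes 5..12, but the last valid
   index is 11, so Invalid_argument is raised. *)
let out_of_space =
  match Bytes.get_int64_le header 5 with
  | _ -> false
  | exception Invalid_argument _ -> true
```
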
991
OCamldoc                          2020-02-27                          Bytes(3)