Bytes(3) OCaml library Bytes(3)
2
3
4
6 Bytes - Byte sequence operations.
7
9 Module Bytes
10
12 Module Bytes
13 : sig end
14
15
16 Byte sequence operations.
17
18 A byte sequence is a mutable data structure that contains a
19 fixed-length sequence of bytes. Each byte can be indexed in constant
20 time for reading or writing.
21
Given a byte sequence s of length l, we can access each of the l bytes
of s via its index in the sequence. Indexes start at 0, and we will
call an index valid in s if it falls within the range [0...l-1]
(inclusive). A position is the point between two bytes or at the
beginning or end of the sequence. We call a position valid in s if it
falls within the range [0...l] (inclusive). Note that the byte at index
n is between positions n and n+1.
29
30 Two parameters start and len are said to designate a valid range of s
31 if len >= 0 and start and start+len are valid positions in s .
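
As a rough illustration of these definitions, the sketch below tests
whether start and len designate a valid range. The helper name
is_valid_range is hypothetical and is not part of the Bytes module.

```ocaml
(* Hypothetical helper, not part of Bytes: tests whether [start] and
   [len] designate a valid range of [s]. *)
let is_valid_range (s : bytes) (start : int) (len : int) : bool =
  let l = Bytes.length s in
  (* Written as [len <= l - start] to avoid overflow in [start + len]. *)
  len >= 0 && start >= 0 && start <= l && len <= l - start

let () =
  let s = Bytes.of_string "abcde" in
  assert (is_valid_range s 0 5);        (* the whole sequence *)
  assert (is_valid_range s 5 0);        (* empty range at the end position *)
  assert (not (is_valid_range s 3 3));  (* 3 + 3 = 6 is not a valid position *)
  assert (not (is_valid_range s (-1) 2))
```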
32
33 Byte sequences can be modified in place, for instance via the set and
34 blit functions described below. See also strings (module String ),
35 which are almost the same data structure, but cannot be modified in
36 place.
37
38 Bytes are represented by the OCaml type char .
39
40
41 Since 4.02.0
42
43
44
45
46
47
48 val length : bytes -> int
49
50 Return the length (number of bytes) of the argument.
51
52
53
54 val get : bytes -> int -> char
55
56
57 get s n returns the byte at index n in argument s .
58
59
60 Raises Invalid_argument if n is not a valid index in s .
61
62
63
64 val set : bytes -> int -> char -> unit
65
66
67 set s n c modifies s in place, replacing the byte at index n with c .
68
69
70 Raises Invalid_argument if n is not a valid index in s .
71
72
73
74 val create : int -> bytes
75
76
77 create n returns a new byte sequence of length n . The sequence is
78 uninitialized and contains arbitrary bytes.
79
80
81 Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
82
83
84
85 val make : int -> char -> bytes
86
87
88 make n c returns a new byte sequence of length n , filled with the byte
89 c .
90
91
92 Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
93
94
95
96 val init : int -> (int -> char) -> bytes
97
98
Bytes.init n f returns a fresh byte sequence of length n, with byte i
initialized to the result of f i (in increasing index order).
101
102
103 Raises Invalid_argument if n < 0 or n > Sys.max_string_length .
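
The three constructors above differ only in how the new bytes are
initialized. A minimal sketch, using only functions from this module
and Char:

```ocaml
let () =
  (* [make] fills every position with the same byte. *)
  let dashes = Bytes.make 3 '-' in
  assert (Bytes.to_string dashes = "---");
  (* [init] computes each byte from its index. *)
  let alphabet = Bytes.init 26 (fun i -> Char.chr (Char.code 'a' + i)) in
  assert (Bytes.length alphabet = 26);
  assert (Bytes.get alphabet 25 = 'z');
  (* [create] leaves the contents arbitrary, so write before reading. *)
  let buf = Bytes.create 2 in
  Bytes.set buf 0 'o';
  Bytes.set buf 1 'k';
  assert (Bytes.to_string buf = "ok")
```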
104
105
106
107 val empty : bytes
108
109 A byte sequence of size 0.
110
111
112
113 val copy : bytes -> bytes
114
Return a new byte sequence that contains the same bytes as the
argument.
117
118
119
120 val of_string : string -> bytes
121
122 Return a new byte sequence that contains the same bytes as the given
123 string.
124
125
126
127 val to_string : bytes -> string
128
129 Return a new string that contains the same bytes as the given byte
130 sequence.
131
132
133
134 val sub : bytes -> int -> int -> bytes
135
136
137 sub s start len returns a new byte sequence of length len , containing
138 the subsequence of s that starts at position start and has length len .
139
140
141 Raises Invalid_argument if start and len do not designate a valid range
142 of s .
143
144
145
146 val sub_string : bytes -> int -> int -> string
147
148 Same as sub but return a string instead of a byte sequence.
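
A short sketch of sub and sub_string, showing that the result is a
copy and that an empty range at the end position is accepted:

```ocaml
let () =
  let s = Bytes.of_string "hello world" in
  let w = Bytes.sub s 6 5 in
  assert (Bytes.to_string w = "world");
  (* The result is a fresh copy: mutating it leaves [s] unchanged. *)
  Bytes.set w 0 'W';
  assert (Bytes.to_string s = "hello world");
  assert (Bytes.sub_string s 0 5 = "hello");
  (* start = length and len = 0 is a valid (empty) range. *)
  assert (Bytes.length (Bytes.sub s 11 0) = 0)
```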
149
150
151
152 val extend : bytes -> int -> int -> bytes
153
154
155 extend s left right returns a new byte sequence that contains the bytes
156 of s , with left uninitialized bytes prepended and right uninitialized
157 bytes appended to it. If left or right is negative, then bytes are
158 removed (instead of appended) from the corresponding side of s .
159
160
161 Raises Invalid_argument if the result length is negative or longer than
162 Sys.max_string_length bytes.
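
The following sketch illustrates both signs of the left and right
arguments. Because added bytes are uninitialized, it only checks the
length and the copied part of the extended result.

```ocaml
let () =
  let s = Bytes.of_string "abcd" in
  (* Negative amounts drop bytes: one from the left, two from the right. *)
  assert (Bytes.to_string (Bytes.extend s (-1) (-2)) = "b");
  (* Positive amounts add uninitialized bytes, so only the length and
     the copied part are predictable. *)
  let e = Bytes.extend s 2 1 in
  assert (Bytes.length e = 7);
  assert (Bytes.sub_string e 2 4 = "abcd")
```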
163
164
165
166 val fill : bytes -> int -> int -> char -> unit
167
168
fill s start len c modifies s in place, replacing len bytes with c,
starting at start.
171
172
173 Raises Invalid_argument if start and len do not designate a valid range
174 of s .
175
176
177
178 val blit : bytes -> int -> bytes -> int -> int -> unit
179
180
181 blit src srcoff dst dstoff len copies len bytes from sequence src ,
182 starting at index srcoff , to sequence dst , starting at index dstoff .
183 It works correctly even if src and dst are the same byte sequence, and
184 the source and destination intervals overlap.
185
186
187 Raises Invalid_argument if srcoff and len do not designate a valid
188 range of src , or if dstoff and len do not designate a valid range of
189 dst .
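
To make the overlapping case concrete, here is a small sketch that
shifts part of a sequence within itself and then uses fill on a range:

```ocaml
let () =
  let s = Bytes.of_string "abcdef" in
  (* Shift the first four bytes right by two positions; the source
     indices 0..3 and the destination indices 2..5 overlap. *)
  Bytes.blit s 0 s 2 4;
  assert (Bytes.to_string s = "ababcd");
  (* [fill] overwrites a range with one byte. *)
  Bytes.fill s 0 2 '.';
  assert (Bytes.to_string s = "..abcd")
```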
190
191
192
193 val blit_string : string -> int -> bytes -> int -> int -> unit
194
195
196 blit_string src srcoff dst dstoff len copies len bytes from string src
197 , starting at index srcoff , to byte sequence dst , starting at index
198 dstoff .
199
200
201 Raises Invalid_argument if srcoff and len do not designate a valid
202 range of src , or if dstoff and len do not designate a valid range of
203 dst .
204
205
206
207 val concat : bytes -> bytes list -> bytes
208
209
210 concat sep sl concatenates the list of byte sequences sl , inserting
211 the separator byte sequence sep between each, and returns the result as
212 a new byte sequence.
213
214
215 Raises Invalid_argument if the result is longer than
216 Sys.max_string_length bytes.
217
218
219
220 val cat : bytes -> bytes -> bytes
221
222
223 cat s1 s2 concatenates s1 and s2 and returns the result as a new byte
224 sequence.
225
226
227 Raises Invalid_argument if the result is longer than
228 Sys.max_string_length bytes.
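
A brief sketch of concat and cat; the path-like data is only an
example:

```ocaml
let () =
  let parts = List.map Bytes.of_string [ "usr"; "local"; "bin" ] in
  let path = Bytes.concat (Bytes.of_string "/") parts in
  assert (Bytes.to_string path = "usr/local/bin");
  (* An empty list yields an empty sequence; no separator is inserted. *)
  assert (Bytes.length (Bytes.concat (Bytes.of_string ", ") []) = 0);
  let both = Bytes.cat (Bytes.of_string "foo") (Bytes.of_string "bar") in
  assert (Bytes.to_string both = "foobar")
```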
229
230
231
232 val iter : (char -> unit) -> bytes -> unit
233
234
iter f s applies function f in turn to all the bytes of s. It is
equivalent to f (get s 0); f (get s 1); ...; f (get s (length s - 1)); ().
238
239
240
241 val iteri : (int -> char -> unit) -> bytes -> unit
242
243 Same as Bytes.iter , but the function is applied to the index of the
244 byte as first argument and the byte itself as second argument.
245
246
247
248 val map : (char -> char) -> bytes -> bytes
249
250
251 map f s applies function f in turn to all the bytes of s (in increasing
252 index order) and stores the resulting bytes in a new sequence that is
253 returned as the result.
254
255
256
257 val mapi : (int -> char -> char) -> bytes -> bytes
258
259
mapi f s calls f with each byte of s and its index (in increasing
index order) and stores the resulting bytes in a new sequence that is
returned as the result.
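
The traversal functions above can be compared side by side in the
following sketch:

```ocaml
let () =
  let s = Bytes.of_string "abc" in
  (* [iter] visits bytes in index order; here we sum their codes. *)
  let sum = ref 0 in
  Bytes.iter (fun c -> sum := !sum + Char.code c) s;
  assert (!sum = 97 + 98 + 99);
  (* [map] returns a new sequence and leaves [s] untouched. *)
  let next = Bytes.map (fun c -> Char.chr (Char.code c + 1)) s in
  assert (Bytes.to_string next = "bcd");
  assert (Bytes.to_string s = "abc");
  (* [mapi] also receives the index: uppercase the even positions. *)
  let alt =
    Bytes.mapi (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c) s
  in
  assert (Bytes.to_string alt = "AbC")
```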
263
264
265
266 val trim : bytes -> bytes
267
268 Return a copy of the argument, without leading and trailing whitespace.
269 The bytes regarded as whitespace are the ASCII characters ' ' , '\012'
270 , '\n' , '\r' , and '\t' .
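
For instance, a minimal sketch of trim on mixed whitespace:

```ocaml
let () =
  let s = Bytes.of_string " \t hello world \r\n" in
  let t = Bytes.trim s in
  (* Only leading and trailing whitespace is removed. *)
  assert (Bytes.to_string t = "hello world");
  (* A sequence made only of whitespace trims to the empty sequence. *)
  assert (Bytes.length (Bytes.trim (Bytes.of_string " \n\t ")) = 0)
```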
271
272
273
274 val escaped : bytes -> bytes
275
276 Return a copy of the argument, with special characters represented by
277 escape sequences, following the lexical conventions of OCaml. All
278 characters outside the ASCII printable range (32..126) are escaped, as
279 well as backslash and double-quote.
280
281
282 Raises Invalid_argument if the result is longer than
283 Sys.max_string_length bytes.
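
The sketch below shows how escaped treats a double quote, a backslash,
a newline and a non-printable byte. Note that the expected result is
itself written as an OCaml string literal, hence the doubled
backslashes.

```ocaml
let () =
  (* Seven bytes: a, double quote, b, backslash, c, newline, byte 1. *)
  let raw = Bytes.of_string "a\"b\\c\n\001" in
  let esc = Bytes.escaped raw in
  (* Each special byte becomes a backslash escape sequence. *)
  assert (Bytes.to_string esc = "a\\\"b\\\\c\\n\\001");
  assert (Bytes.length esc = 13)
```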
284
285
286
287 val index : bytes -> char -> int
288
289
290 index s c returns the index of the first occurrence of byte c in s .
291
292
293 Raises Not_found if c does not occur in s .
294
295
296
297 val index_opt : bytes -> char -> int option
298
299
300 index_opt s c returns the index of the first occurrence of byte c in s
301 or None if c does not occur in s .
302
303
304 Since 4.05
305
306
307
308 val rindex : bytes -> char -> int
309
310
311 rindex s c returns the index of the last occurrence of byte c in s .
312
313
314 Raises Not_found if c does not occur in s .
315
316
317
318 val rindex_opt : bytes -> char -> int option
319
320
321 rindex_opt s c returns the index of the last occurrence of byte c in s
322 or None if c does not occur in s .
323
324
325 Since 4.05
326
327
328
329 val index_from : bytes -> int -> char -> int
330
331
332 index_from s i c returns the index of the first occurrence of byte c in
333 s after position i . Bytes.index s c is equivalent to Bytes.index_from
334 s 0 c .
335
336
337 Raises Invalid_argument if i is not a valid position in s .
338
339
340 Raises Not_found if c does not occur in s after position i .
341
342
343
344 val index_from_opt : bytes -> int -> char -> int option
345
346
347 index_from_opt s i c returns the index of the first occurrence of byte
348 c in s after position i or None if c does not occur in s after position
349 i . Bytes.index_opt s c is equivalent to Bytes.index_from_opt s 0 c .
350
351
352 Since 4.05
353
354
355 Raises Invalid_argument if i is not a valid position in s .
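
As an illustration of searching from a position, the sketch below
collects every index at which a byte occurs. The helper all_indices is
hypothetical and not part of the Bytes module.

```ocaml
(* Hypothetical helper: indices of every occurrence of [c] in [s]. *)
let all_indices (s : bytes) (c : char) : int list =
  let rec go i acc =
    (* [i] is always a valid position here, including [Bytes.length s]. *)
    match Bytes.index_from_opt s i c with
    | None -> List.rev acc
    | Some j -> go (j + 1) (j :: acc)
  in
  go 0 []

let () =
  let s = Bytes.of_string "banana" in
  assert (Bytes.index s 'a' = 1);
  assert (Bytes.index_opt s 'z' = None);
  assert (all_indices s 'a' = [ 1; 3; 5 ])
```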
356
357
358
359 val rindex_from : bytes -> int -> char -> int
360
361
362 rindex_from s i c returns the index of the last occurrence of byte c in
363 s before position i+1 . rindex s c is equivalent to rindex_from s
364 (Bytes.length s - 1) c .
365
366
367 Raises Invalid_argument if i+1 is not a valid position in s .
368
369
370 Raises Not_found if c does not occur in s before position i+1 .
371
372
373
374 val rindex_from_opt : bytes -> int -> char -> int option
375
376
rindex_from_opt s i c returns the index of the last occurrence of byte
c in s before position i+1 or None if c does not occur in s before
position i+1. rindex_opt s c is equivalent to rindex_from_opt s
(Bytes.length s - 1) c.
381
382
383 Since 4.05
384
385
386 Raises Invalid_argument if i+1 is not a valid position in s .
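
The "before position i+1" convention means that index i itself is
included in the search. A small sketch:

```ocaml
let () =
  let s = Bytes.of_string "a/b/c" in
  assert (Bytes.rindex s '/' = 3);
  (* Search only indices 0..2, i.e. before position 3. *)
  assert (Bytes.rindex_from s 2 '/' = 1);
  assert (Bytes.rindex_from_opt s 0 '/' = None);
  (* i = -1 is accepted because i+1 = 0 is a valid position. *)
  assert (Bytes.rindex_from_opt s (-1) '/' = None)
```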
387
388
389
390 val contains : bytes -> char -> bool
391
392
393 contains s c tests if byte c appears in s .
394
395
396
397 val contains_from : bytes -> int -> char -> bool
398
399
contains_from s start c tests if byte c appears in s after position
start. contains s c is equivalent to contains_from s 0 c.
403
404
405 Raises Invalid_argument if start is not a valid position in s .
406
407
408
409 val rcontains_from : bytes -> int -> char -> bool
410
411
412 rcontains_from s stop c tests if byte c appears in s before position
413 stop+1 .
414
415
416 Raises Invalid_argument if stop < 0 or stop+1 is not a valid position
417 in s .
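
A short sketch contrasting the three membership tests on the same
sequence:

```ocaml
let () =
  let s = Bytes.of_string "key=value" in
  assert (Bytes.contains s '=');
  (* '=' is at index 3, so it is not found from position 4 onward... *)
  assert (not (Bytes.contains_from s 4 '='));
  (* ...but it is found when looking at indices 0..3. *)
  assert (Bytes.rcontains_from s 3 '=');
  assert (not (Bytes.rcontains_from s 2 '='))
```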
418
419
420
421 val uppercase : bytes -> bytes
422
Deprecated. Functions operating on the Latin-1 character set are
deprecated.


Return a copy of the argument, with all lowercase letters translated to
uppercase, including accented letters of the ISO Latin-1 (8859-1)
character set.
430
431
432
433 val lowercase : bytes -> bytes
434
Deprecated. Functions operating on the Latin-1 character set are
deprecated.


Return a copy of the argument, with all uppercase letters translated to
lowercase, including accented letters of the ISO Latin-1 (8859-1)
character set.
442
443
444
445 val capitalize : bytes -> bytes
446
Deprecated. Functions operating on the Latin-1 character set are
deprecated.


Return a copy of the argument, with the first character set to
uppercase, using the ISO Latin-1 (8859-1) character set.
453
454
455
456 val uncapitalize : bytes -> bytes
457
Deprecated. Functions operating on the Latin-1 character set are
deprecated.


Return a copy of the argument, with the first character set to
lowercase, using the ISO Latin-1 (8859-1) character set.
464
465
466
467 val uppercase_ascii : bytes -> bytes
468
469 Return a copy of the argument, with all lowercase letters translated to
470 uppercase, using the US-ASCII character set.
471
472
473 Since 4.03.0
474
475
476
477 val lowercase_ascii : bytes -> bytes
478
479 Return a copy of the argument, with all uppercase letters translated to
480 lowercase, using the US-ASCII character set.
481
482
483 Since 4.03.0
484
485
486
487 val capitalize_ascii : bytes -> bytes
488
Return a copy of the argument, with the first character set to
uppercase, using the US-ASCII character set.
491
492
493 Since 4.03.0
494
495
496
497 val uncapitalize_ascii : bytes -> bytes
498
Return a copy of the argument, with the first character set to
lowercase, using the US-ASCII character set.
501
502
503 Since 4.03.0
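
The _ascii variants only touch the letters A-Z and a-z. The sketch
below checks this with a Latin-1 encoded byte (233, the code of "é" in
ISO 8859-1), which is left unchanged:

```ocaml
let () =
  let s = Bytes.of_string "hello World" in
  assert (Bytes.to_string (Bytes.uppercase_ascii s) = "HELLO WORLD");
  assert (Bytes.to_string (Bytes.lowercase_ascii s) = "hello world");
  assert (Bytes.to_string (Bytes.capitalize_ascii s) = "Hello World");
  (* Bytes outside US-ASCII are left unchanged. *)
  let latin1 = Bytes.of_string "\233t\233" in
  assert (Bytes.to_string (Bytes.uppercase_ascii latin1) = "\233T\233")
```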
504
505
506 type t = bytes
507
508
509 An alias for the type of byte sequences.
510
511
512
513 val compare : t -> t -> int
514
The comparison function for byte sequences, with the same specification
as Stdlib.compare. Along with the type t, this function compare allows
the module Bytes to be passed as argument to the functors Set.Make and
Map.Make.
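
A minimal sketch of passing Bytes to Set.Make. The module name BytesSet
is chosen here for illustration only.

```ocaml
(* Illustrative module name; any name would do. *)
module BytesSet = Set.Make (Bytes)

let () =
  let set =
    BytesSet.of_list (List.map Bytes.of_string [ "pear"; "apple"; "pear" ])
  in
  (* Duplicates are detected by content, using Bytes.compare. *)
  assert (BytesSet.cardinal set = 2);
  assert (BytesSet.mem (Bytes.of_string "apple") set);
  assert (List.map Bytes.to_string (BytesSet.elements set) = [ "apple"; "pear" ])
```

Since byte sequences are mutable, modifying an element after it has
been inserted would silently break the ordering the set relies on.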
519
520
521
522 val equal : t -> t -> bool
523
524 The equality function for byte sequences.
525
526
527 Since 4.03.0
528
529
530
531
532 Unsafe conversions (for advanced users)
533 This section describes unsafe, low-level conversion functions between
534 bytes and string . They do not copy the internal data; used improperly,
535 they can break the immutability invariant on strings provided by the
536 -safe-string option. They are available for expert library authors, but
537 for most purposes you should use the always-correct Bytes.to_string and
538 Bytes.of_string instead.
539
540 val unsafe_to_string : bytes -> string
541
542 Unsafely convert a byte sequence into a string.
543
To reason about the use of unsafe_to_string, it is convenient to
consider an "ownership" discipline. A piece of code that manipulates
some data "owns" it; there are several disjoint ownership modes,
including:
547
548 -Unique ownership: the data may be accessed and mutated
549
550 -Shared ownership: the data has several owners, that may only access
551 it, not mutate it.
552
553 Unique ownership is linear: passing the data to another piece of code
554 means giving up ownership (we cannot write the data again). A unique
555 owner may decide to make the data shared (giving up mutation rights on
556 it), but shared data may not become uniquely-owned again.
557
558
559 unsafe_to_string s can only be used when the caller owns the byte
560 sequence s -- either uniquely or as shared immutable data. The caller
561 gives up ownership of s , and gains ownership of the returned string.
562
563 There are two valid use-cases that respect this ownership discipline:
564
565 1. Creating a string by initializing and mutating a byte sequence that
566 is never changed after initialization is performed.
567
568
569 let string_init len f : string =
570 let s = Bytes.create len in
571 for i = 0 to len - 1 do Bytes.set s i (f i) done;
572 Bytes.unsafe_to_string s
573
574
This function is safe because the byte sequence s will never be
accessed or mutated after unsafe_to_string is called. The string_init
code gives up ownership of s, and returns the ownership of the
resulting string to its caller.
579
580 Note that it would be unsafe if s was passed as an additional parameter
581 to the function f as it could escape this way and be mutated in the
582 future -- string_init would give up ownership of s to pass it to f ,
583 and could not call unsafe_to_string safely.
584
585 We have provided the String.init , String.map and String.mapi functions
586 to cover most cases of building new strings. You should prefer those
587 over to_string or unsafe_to_string whenever applicable.
588
589 2. Temporarily giving ownership of a byte sequence to a function that
590 expects a uniquely owned string and returns ownership back, so that we
591 can mutate the sequence again after the call ended.
592
593
594 let bytes_length (s : bytes) =
595 String.length (Bytes.unsafe_to_string s)
596
597
In this use-case, we do not promise that s will never be mutated after
the call to bytes_length s. The String.length function temporarily
borrows unique ownership of the byte sequence (and sees it as a
string), but returns this ownership back to the caller, which may
assume that s is still a valid byte sequence after the call. Note that
this is only correct because we know that String.length does not
capture its argument; a function that did capture it could let it
escape through a side channel such as a memoization combinator.
606
The caller may not mutate s while the string is borrowed (it has
temporarily given up ownership). This affects concurrent programs, but
also higher-order functions: if String.length returned a closure to be
called later, s should not be mutated until this closure is fully
applied and returns ownership.
612
613
614
615 val unsafe_of_string : string -> bytes
616
617 Unsafely convert a shared string to a byte sequence that should not be
618 mutated.
619
620 The same ownership discipline that makes unsafe_to_string correct
621 applies to unsafe_of_string : you may use it if you were the owner of
622 the string value, and you will own the return bytes in the same mode.
623
624 In practice, unique ownership of string values is extremely difficult
625 to reason about correctly. You should always assume strings are shared,
626 never uniquely owned.
627
628 For example, string literals are implicitly shared by the compiler, so
629 you never uniquely own them.
630
631
632 let incorrect = Bytes.unsafe_of_string "hello"
633 let s = Bytes.of_string "hello"
634
635
636 The first declaration is incorrect, because the string literal "hello"
637 could be shared by the compiler with other parts of the program, and
638 mutating incorrect is a bug. You must always use the second version,
639 which performs a copy and is thus correct.
640
Assuming unique ownership of strings that are not string literals, but
are (partly) built from string literals, is also incorrect. For
example, mutating unsafe_of_string ("foo" ^ s) could mutate the shared
string "foo" -- assuming a rope-like representation of strings. More
generally, functions operating on strings will assume shared ownership;
they do not preserve unique ownership. It is thus incorrect to assume
unique ownership of the result of unsafe_of_string.
648
The only case that we are reasonably confident is safe is when the
produced bytes value is shared, that is, used as an immutable byte
sequence. This is possibly useful for incremental migration of
low-level programs that manipulate immutable sequences of bytes (for
example Marshal.from_bytes) and previously used the string type for
this purpose.
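
As a sketch of this read-only use, the hypothetical helper below
decodes a big-endian 32-bit integer from a string without copying it.
The name int32_be_of_string is an assumption made for this example; the
bytes view is only read, never mutated, and does not escape the
function.

```ocaml
(* Hypothetical helper: the view shares storage with [s], so it must
   be treated as immutable. Here it is only read and then dropped. *)
let int32_be_of_string (s : string) (off : int) : int32 =
  let view = Bytes.unsafe_of_string s in
  Bytes.get_int32_be view off

let () =
  assert (int32_be_of_string "\x00\x00\x01\x00" 0 = 256l)
```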
654
655
656
657
658 Iterators
659 val to_seq : t -> char Seq.t
660
Iterate on the byte sequence, in increasing index order. Modifications
of the byte sequence during iteration will be reflected in the iterator.
663
664
665 Since 4.07
666
667
668
669 val to_seqi : t -> (int * char) Seq.t
670
Iterate on the byte sequence, in increasing index order, yielding
indices along with bytes.
673
674
675 Since 4.07
676
677
678
679 val of_seq : char Seq.t -> t
680
Create a byte sequence from the given sequence of bytes.
682
683
684 Since 4.07
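
A small sketch combining these iterators with functions from the Seq
module (Seq.filter and Seq.fold_left):

```ocaml
let () =
  let s = Bytes.of_string "a1b2c3" in
  let is_digit c = c >= '0' && c <= '9' in
  (* Keep only the digits, going through a lazy sequence. *)
  let digits = Bytes.of_seq (Seq.filter is_digit (Bytes.to_seq s)) in
  assert (Bytes.to_string digits = "123");
  (* [to_seqi] pairs each byte with its index. *)
  let positions =
    Seq.fold_left
      (fun acc (i, c) -> if is_digit c then i :: acc else acc)
      [] (Bytes.to_seqi s)
  in
  assert (List.rev positions = [ 1; 3; 5 ])
```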
685
686
687
688
689 Binary encoding/decoding of integers
690 The functions in this section binary encode and decode integers to and
691 from byte sequences.
692
693 All following functions raise Invalid_argument if the space needed at
694 index i to decode or encode the integer is not available.
695
Little-endian (resp. big-endian) encoding means that least (resp. most)
significant bytes are stored first. Big-endian is also known as
network byte order. Native-endian encoding is either little-endian or
big-endian depending on Sys.big_endian.
700
701 32-bit and 64-bit integers are represented by the int32 and int64
702 types, which can be interpreted either as signed or unsigned numbers.
703
704 8-bit and 16-bit integers are represented by the int type, which has
705 more bits than the binary encoding. These extra bits are handled as
706 follows:
707
-Functions that decode signed (resp. unsigned) 8-bit or 16-bit integers
represented by int values sign-extend (resp. zero-extend) their result.

-Functions that encode 8-bit or 16-bit integers represented by int
values truncate their input to their least significant bytes.
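
The truncation, extension and byte-order rules above can be observed
with the following sketch, which uses functions documented later in
this section:

```ocaml
let () =
  let b = Bytes.create 2 in
  (* Encoding truncates: 0x1FF does not fit in 8 bits, only 0xFF is stored. *)
  Bytes.set_uint8 b 0 0x1FF;
  assert (Bytes.get b 0 = '\xff');
  (* Decoding the same byte: zero-extended vs. sign-extended. *)
  assert (Bytes.get_uint8 b 0 = 255);
  assert (Bytes.get_int8 b 0 = -1);
  (* Big-endian stores the most significant byte first. *)
  Bytes.set_uint16_be b 0 0x1234;
  assert (Bytes.get b 0 = '\x12' && Bytes.get b 1 = '\x34');
  (* Reading the same two bytes as little-endian swaps them. *)
  assert (Bytes.get_uint16_le b 0 = 0x3412)
```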
713
714
715 val get_uint8 : bytes -> int -> int
716
717
718 get_uint8 b i is b 's unsigned 8-bit integer starting at byte index i .
719
720
721 Since 4.08
722
723
724
725 val get_int8 : bytes -> int -> int
726
727
728 get_int8 b i is b 's signed 8-bit integer starting at byte index i .
729
730
731 Since 4.08
732
733
734
735 val get_uint16_ne : bytes -> int -> int
736
737
get_uint16_ne b i is b's native-endian unsigned 16-bit integer
starting at byte index i.
740
741
742 Since 4.08
743
744
745
746 val get_uint16_be : bytes -> int -> int
747
748
749 get_uint16_be b i is b 's big-endian unsigned 16-bit integer starting
750 at byte index i .
751
752
753 Since 4.08
754
755
756
757 val get_uint16_le : bytes -> int -> int
758
759
get_uint16_le b i is b's little-endian unsigned 16-bit integer
starting at byte index i.
762
763
764 Since 4.08
765
766
767
768 val get_int16_ne : bytes -> int -> int
769
770
771 get_int16_ne b i is b 's native-endian signed 16-bit integer starting
772 at byte index i .
773
774
775 Since 4.08
776
777
778
779 val get_int16_be : bytes -> int -> int
780
781
782 get_int16_be b i is b 's big-endian signed 16-bit integer starting at
783 byte index i .
784
785
786 Since 4.08
787
788
789
790 val get_int16_le : bytes -> int -> int
791
792
793 get_int16_le b i is b 's little-endian signed 16-bit integer starting
794 at byte index i .
795
796
797 Since 4.08
798
799
800
801 val get_int32_ne : bytes -> int -> int32
802
803
804 get_int32_ne b i is b 's native-endian 32-bit integer starting at byte
805 index i .
806
807
808 Since 4.08
809
810
811
812 val get_int32_be : bytes -> int -> int32
813
814
815 get_int32_be b i is b 's big-endian 32-bit integer starting at byte
816 index i .
817
818
819 Since 4.08
820
821
822
823 val get_int32_le : bytes -> int -> int32
824
825
826 get_int32_le b i is b 's little-endian 32-bit integer starting at byte
827 index i .
828
829
830 Since 4.08
831
832
833
834 val get_int64_ne : bytes -> int -> int64
835
836
837 get_int64_ne b i is b 's native-endian 64-bit integer starting at byte
838 index i .
839
840
841 Since 4.08
842
843
844
845 val get_int64_be : bytes -> int -> int64
846
847
848 get_int64_be b i is b 's big-endian 64-bit integer starting at byte
849 index i .
850
851
852 Since 4.08
853
854
855
856 val get_int64_le : bytes -> int -> int64
857
858
859 get_int64_le b i is b 's little-endian 64-bit integer starting at byte
860 index i .
861
862
863 Since 4.08
864
865
866
867 val set_uint8 : bytes -> int -> int -> unit
868
869
870 set_uint8 b i v sets b 's unsigned 8-bit integer starting at byte index
871 i to v .
872
873
874 Since 4.08
875
876
877
878 val set_int8 : bytes -> int -> int -> unit
879
880
881 set_int8 b i v sets b 's signed 8-bit integer starting at byte index i
882 to v .
883
884
885 Since 4.08
886
887
888
889 val set_uint16_ne : bytes -> int -> int -> unit
890
891
892 set_uint16_ne b i v sets b 's native-endian unsigned 16-bit integer
893 starting at byte index i to v .
894
895
896 Since 4.08
897
898
899
900 val set_uint16_be : bytes -> int -> int -> unit
901
902
set_uint16_be b i v sets b's big-endian unsigned 16-bit integer
starting at byte index i to v.
905
906
907 Since 4.08
908
909
910
911 val set_uint16_le : bytes -> int -> int -> unit
912
913
914 set_uint16_le b i v sets b 's little-endian unsigned 16-bit integer
915 starting at byte index i to v .
916
917
918 Since 4.08
919
920
921
922 val set_int16_ne : bytes -> int -> int -> unit
923
924
set_int16_ne b i v sets b's native-endian signed 16-bit integer
starting at byte index i to v.
927
928
929 Since 4.08
930
931
932
933 val set_int16_be : bytes -> int -> int -> unit
934
935
936 set_int16_be b i v sets b 's big-endian signed 16-bit integer starting
937 at byte index i to v .
938
939
940 Since 4.08
941
942
943
944 val set_int16_le : bytes -> int -> int -> unit
945
946
set_int16_le b i v sets b's little-endian signed 16-bit integer
starting at byte index i to v.
949
950
951 Since 4.08
952
953
954
955 val set_int32_ne : bytes -> int -> int32 -> unit
956
957
958 set_int32_ne b i v sets b 's native-endian 32-bit integer starting at
959 byte index i to v .
960
961
962 Since 4.08
963
964
965
966 val set_int32_be : bytes -> int -> int32 -> unit
967
968
969 set_int32_be b i v sets b 's big-endian 32-bit integer starting at byte
970 index i to v .
971
972
973 Since 4.08
974
975
976
977 val set_int32_le : bytes -> int -> int32 -> unit
978
979
980 set_int32_le b i v sets b 's little-endian 32-bit integer starting at
981 byte index i to v .
982
983
984 Since 4.08
985
986
987
988 val set_int64_ne : bytes -> int -> int64 -> unit
989
990
991 set_int64_ne b i v sets b 's native-endian 64-bit integer starting at
992 byte index i to v .
993
994
995 Since 4.08
996
997
998
999 val set_int64_be : bytes -> int -> int64 -> unit
1000
1001
1002 set_int64_be b i v sets b 's big-endian 64-bit integer starting at byte
1003 index i to v .
1004
1005
1006 Since 4.08
1007
1008
1009
1010 val set_int64_le : bytes -> int -> int64 -> unit
1011
1012
1013 set_int64_le b i v sets b 's little-endian 64-bit integer starting at
1014 byte index i to v .
1015
1016
1017 Since 4.08
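
To tie the getters and setters together, here is a sketch of a
round-trip through a fixed binary layout. The 12-byte layout and the
names encode and decode are hypothetical, chosen only for this example.

```ocaml
(* Hypothetical 12-byte layout: a 32-bit tag followed by a 64-bit
   payload, both big-endian (network byte order). *)
let encode (tag : int32) (payload : int64) : bytes =
  let b = Bytes.create 12 in
  Bytes.set_int32_be b 0 tag;
  Bytes.set_int64_be b 4 payload;
  b

let decode (b : bytes) : int32 * int64 =
  (Bytes.get_int32_be b 0, Bytes.get_int64_be b 4)

let () =
  let b = encode 0xCAFEl (-1L) in
  assert (decode b = (0xCAFEl, -1L));
  assert (Bytes.sub_string b 0 4 = "\x00\x00\xca\xfe");
  (* Eight bytes are not available at index 5 of a 12-byte sequence. *)
  assert (
    try ignore (Bytes.get_int64_be b 5); false
    with Invalid_argument _ -> true)
```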
1018
1019
1020
1021
1022
OCamldoc 2020-09-01 Bytes(3)