1Bytes(3) OCaml library Bytes(3)
2
3
4
6 Bytes - Byte sequence operations.
7
9 Module Bytes
10
12 Module Bytes
13 : sig end
14
15
16 Byte sequence operations.
17
18 A byte sequence is a mutable data structure that contains a
19 fixed-length sequence of bytes. Each byte can be indexed in constant
20 time for reading or writing.
21
22 Given a byte sequence s of length l , we can access each of the l bytes
23 of s via its index in the sequence. Indexes start at 0 , and we will
24 call an index valid in s if it falls within the range [0...l-1] (inclu‐
25 sive). A position is the point between two bytes or at the beginning or
26 end of the sequence. We call a position valid in s if it falls within
27 the range [0...l] (inclusive). Note that the byte at index n is between
28 positions n and n+1 .
29
30 Two parameters start and len are said to designate a valid range of s
31 if len >= 0 and start and start+len are valid positions in s .
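
The distinction between valid indexes, valid positions and valid ranges can be sketched in code. The helper name is_valid_range below is hypothetical (it is not part of the Bytes module); it only restates the definition given above.

```ocaml
(* Hypothetical helper, not part of Bytes: checks that start and len
   designate a valid range of s, i.e. len >= 0 and both start and
   start+len are valid positions (between 0 and the length, inclusive). *)
let is_valid_range (s : bytes) (start : int) (len : int) : bool =
  let l = Bytes.length s in
  (* Written as len <= l - start to avoid overflow in start + len. *)
  len >= 0 && start >= 0 && start <= l && len <= l - start

let () =
  let s = Bytes.of_string "hello" in
  assert (is_valid_range s 0 5);          (* the whole sequence *)
  assert (is_valid_range s 5 0);          (* 5 is a valid position, not a valid index *)
  assert (not (is_valid_range s 3 3));    (* 3+3 is past the end *)
  assert (not (is_valid_range s (-1) 2))  (* negative start *)
```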
32
33 Byte sequences can be modified in place, for instance via the set and
34 blit functions described below. See also strings (module String ),
35 which are almost the same data structure, but cannot be modified in
36 place.
37
38 Bytes are represented by the OCaml type char .
39
40
41 Since 4.02.0
42
43
44
45
46
47
48 val length : bytes -> int
49
50 Return the length (number of bytes) of the argument.
51
52
53
54 val get : bytes -> int -> char
55
56
57 get s n returns the byte at index n in argument s .
58
59 Raise Invalid_argument if n is not a valid index in s .
60
61
62
63 val set : bytes -> int -> char -> unit
64
65
66 set s n c modifies s in place, replacing the byte at index n with c .
67
68 Raise Invalid_argument if n is not a valid index in s .
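
As a small illustration of get and set, the sketch below mutates one byte in place and then probes an out-of-range index. The names b and out_of_range are chosen for the example only.

```ocaml
let b = Bytes.of_string "cat"

let () =
  Bytes.set b 0 'b';                  (* modifies b in place *)
  assert (Bytes.get b 0 = 'b');
  assert (Bytes.to_string b = "bat")

(* Index 3 is not a valid index in a sequence of length 3. *)
let out_of_range =
  match Bytes.get b 3 with
  | _ -> false
  | exception Invalid_argument _ -> true
```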
69
70
71
72 val create : int -> bytes
73
74
75 create n returns a new byte sequence of length n . The sequence is
76 uninitialized and contains arbitrary bytes.
77
78 Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
79
80
81
82 val make : int -> char -> bytes
83
84
85 make n c returns a new byte sequence of length n , filled with the byte
86 c .
87
88 Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
89
90
91
92 val init : int -> (int -> char) -> bytes
93
94
Bytes.init n f returns a fresh byte sequence of length n, with byte i
initialized to the result of f i (in increasing index order).
97
98 Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
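
The three constructors above differ only in how the contents are initialized. A minimal sketch, with all value names chosen for illustration:

```ocaml
(* Four zero bytes. *)
let zeros = Bytes.make 4 '\000'

(* The lowercase ASCII alphabet, computed from each index. *)
let alphabet = Bytes.init 26 (fun i -> Char.chr (Char.code 'a' + i))

(* Bytes.create leaves the contents arbitrary, so write before reading. *)
let scratch =
  let b = Bytes.create 3 in
  Bytes.fill b 0 3 '-';
  b

(* A negative length is rejected with Invalid_argument. *)
let rejects_negative =
  match Bytes.make (-1) 'x' with
  | _ -> false
  | exception Invalid_argument _ -> true
```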
99
100
101
102 val empty : bytes
103
104 A byte sequence of size 0.
105
106
107
108 val copy : bytes -> bytes
109
110 Return a new byte sequence that contains the same bytes as the argu‐
111 ment.
112
113
114
115 val of_string : string -> bytes
116
117 Return a new byte sequence that contains the same bytes as the given
118 string.
119
120
121
122 val to_string : bytes -> string
123
124 Return a new string that contains the same bytes as the given byte
125 sequence.
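
Because copy, of_string and to_string all return fresh data, later mutation of one value does not affect the others. The sketch below illustrates this; the names are chosen for the example.

```ocaml
let original = Bytes.of_string "abc"
let duplicate = Bytes.copy original
let snapshot = Bytes.to_string original

let () =
  Bytes.set duplicate 0 'X';   (* only the copy changes *)
  Bytes.set original 2 'Z';    (* the earlier string snapshot is unaffected *)
  assert (Bytes.to_string duplicate = "Xbc");
  assert (Bytes.to_string original = "abZ");
  assert (snapshot = "abc")
```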
126
127
128
129 val sub : bytes -> int -> int -> bytes
130
131
132 sub s start len returns a new byte sequence of length len , containing
133 the subsequence of s that starts at position start and has length len .
134
135 Raise Invalid_argument if start and len do not designate a valid range
136 of s .
137
138
139
140 val sub_string : bytes -> int -> int -> string
141
142 Same as sub but return a string instead of a byte sequence.
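
A short sketch of sub and sub_string, assuming a sample sequence "hello world". Both return new data, so mutating the result leaves the source untouched.

```ocaml
let s = Bytes.of_string "hello world"
let hello = Bytes.sub s 0 5          (* a new byte sequence *)
let world = Bytes.sub_string s 6 5   (* a new string *)

let () =
  Bytes.set hello 0 'J';
  assert (Bytes.to_string hello = "Jello");
  assert (Bytes.to_string s = "hello world");
  assert (world = "world")
```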
143
144
145
146 val extend : bytes -> int -> int -> bytes
147
148
149 extend s left right returns a new byte sequence that contains the bytes
150 of s , with left uninitialized bytes prepended and right uninitialized
151 bytes appended to it. If left or right is negative, then bytes are
152 removed (instead of appended) from the corresponding side of s .
153
154 Raise Invalid_argument if the result length is negative or longer than
155 Sys.max_string_length bytes.
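
The effect of positive and negative arguments to extend can be sketched as follows. Only the bytes copied from s are inspected, since the added bytes are uninitialized.

```ocaml
let s = Bytes.of_string "abcdef"

(* Two uninitialized bytes in front, one behind: length 6 + 2 + 1. *)
let padded = Bytes.extend s 2 1

(* Negative arguments drop bytes: two from the left, one from the right. *)
let shortened = Bytes.extend s (-2) (-1)

let () =
  assert (Bytes.length padded = 9);
  assert (Bytes.sub_string padded 2 6 = "abcdef");
  assert (Bytes.to_string shortened = "cde")
```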
156
157
158
159 val fill : bytes -> int -> int -> char -> unit
160
161
fill s start len c modifies s in place, replacing len bytes with c,
starting at index start.
164
165 Raise Invalid_argument if start and len do not designate a valid range
166 of s .
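
For instance, fill can overwrite part of a sequence with a single byte. This is a sketch; the sample data is arbitrary.

```ocaml
let b = Bytes.of_string "password"

let () =
  Bytes.fill b 2 4 '*';   (* overwrite indexes 2 through 5 *)
  assert (Bytes.to_string b = "pa****rd")
```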
167
168
169
170 val blit : bytes -> int -> bytes -> int -> int -> unit
171
172
173 blit src srcoff dst dstoff len copies len bytes from sequence src ,
174 starting at index srcoff , to sequence dst , starting at index dstoff .
175 It works correctly even if src and dst are the same byte sequence, and
176 the source and destination intervals overlap.
177
178 Raise Invalid_argument if srcoff and len do not designate a valid range
179 of src , or if dstoff and len do not designate a valid range of dst .
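
The overlapping case mentioned above can be illustrated by shifting the contents of a sequence within itself. A minimal sketch:

```ocaml
let b = Bytes.of_string "abcdef"

let () =
  (* Copy the four bytes "abcd" (indexes 0..3) onto indexes 2..5 of the
     same sequence; source and destination overlap on indexes 2..3. *)
  Bytes.blit b 0 b 2 4;
  assert (Bytes.to_string b = "ababcd")
```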
180
181
182
183 val blit_string : string -> int -> bytes -> int -> int -> unit
184
185
blit_string src srcoff dst dstoff len copies len bytes from string src,
starting at index srcoff, to byte sequence dst, starting at index
dstoff.
189
190 Raise Invalid_argument if srcoff and len do not designate a valid range
191 of src , or if dstoff and len do not designate a valid range of dst .
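
A sketch of blit_string writing a string into the middle of a preallocated byte sequence; the buffer size and filler byte are arbitrary choices for the example.

```ocaml
let dst = Bytes.make 8 '.'

let () =
  Bytes.blit_string "abc" 0 dst 2 3;   (* write "abc" at indexes 2..4 *)
  assert (Bytes.to_string dst = "..abc...")
```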
192
193
194
195 val concat : bytes -> bytes list -> bytes
196
197
198 concat sep sl concatenates the list of byte sequences sl , inserting
199 the separator byte sequence sep between each, and returns the result as
200 a new byte sequence.
201
202 Raise Invalid_argument if the result is longer than
203 Sys.max_string_length bytes.
204
205
206
207 val cat : bytes -> bytes -> bytes
208
209
cat s1 s2 concatenates s1 and s2 and returns the result as a new byte
sequence.
212
213 Raise Invalid_argument if the result is longer than
214 Sys.max_string_length bytes.
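
The difference between concat (a separator and a list) and cat (exactly two sequences) can be sketched as follows, with sample values chosen for illustration:

```ocaml
let parts = List.map Bytes.of_string ["a"; "b"; "c"]

(* The separator goes between elements, not before or after them. *)
let joined = Bytes.concat (Bytes.of_string ", ") parts

let both = Bytes.cat (Bytes.of_string "foo") (Bytes.of_string "bar")

let () =
  assert (Bytes.to_string joined = "a, b, c");
  assert (Bytes.to_string both = "foobar")
```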
215
216
217
218 val iter : (char -> unit) -> bytes -> unit
219
220
221 iter f s applies function f in turn to all the bytes of s . It is
222 equivalent to f (get s 0); f (get s 1); ...; f (get s (length s - 1));
223 () .
224
225
226
227 val iteri : (int -> char -> unit) -> bytes -> unit
228
229 Same as Bytes.iter , but the function is applied to the index of the
230 byte as first argument and the byte itself as second argument.
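
As an illustration of iter and iteri, the two hypothetical helpers below (count_byte and positions_of are not part of Bytes) count the occurrences of a byte and collect their indexes.

```ocaml
(* Hypothetical helper: number of occurrences of c in s. *)
let count_byte (c : char) (s : bytes) : int =
  let n = ref 0 in
  Bytes.iter (fun x -> if x = c then incr n) s;
  !n

(* Hypothetical helper: indexes at which c occurs, in increasing order. *)
let positions_of (c : char) (s : bytes) : int list =
  let acc = ref [] in
  Bytes.iteri (fun i x -> if x = c then acc := i :: !acc) s;
  List.rev !acc

let () =
  let s = Bytes.of_string "banana" in
  assert (count_byte 'a' s = 3);
  assert (positions_of 'a' s = [1; 3; 5])
```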
231
232
233
234 val map : (char -> char) -> bytes -> bytes
235
236
237 map f s applies function f in turn to all the bytes of s (in increasing
238 index order) and stores the resulting bytes in a new sequence that is
239 returned as the result.
240
241
242
243 val mapi : (int -> char -> char) -> bytes -> bytes
244
245
246 mapi f s calls f with each character of s and its index (in increasing
247 index order) and stores the resulting bytes in a new sequence that is
248 returned as the result.
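
A short sketch of map and mapi. Both build a new sequence and leave the argument unchanged.

```ocaml
let s = Bytes.of_string "abc"

let upper = Bytes.map Char.uppercase_ascii s

(* Uppercase only the bytes at even indexes. *)
let alternating =
  Bytes.mapi (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c) s

let () =
  assert (Bytes.to_string upper = "ABC");
  assert (Bytes.to_string alternating = "AbC");
  assert (Bytes.to_string s = "abc")
```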
249
250
251
252 val trim : bytes -> bytes
253
254 Return a copy of the argument, without leading and trailing whitespace.
The bytes regarded as whitespace are the ASCII characters ' ', '\012'
(form feed), '\n', '\r', and '\t'.
257
258
259
260 val escaped : bytes -> bytes
261
262 Return a copy of the argument, with special characters represented by
263 escape sequences, following the lexical conventions of OCaml. All
264 characters outside the ASCII printable range (32..126) are escaped, as
265 well as backslash and double-quote.
266
267 Raise Invalid_argument if the result is longer than
268 Sys.max_string_length bytes.
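
The behaviour of trim and escaped described above can be sketched on small inputs. Note that escaped writes non-printable bytes as a backslash followed by three decimal digits, per OCaml's lexical conventions.

```ocaml
let trimmed = Bytes.trim (Bytes.of_string " \t hi there \r\n")

(* A tab becomes the two characters \t; byte 255 becomes the four
   characters \255. *)
let escaped = Bytes.escaped (Bytes.of_string "a\tb\255")

let () =
  assert (Bytes.to_string trimmed = "hi there");
  assert (Bytes.to_string escaped = "a\\tb\\255")
```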
269
270
271
272 val index : bytes -> char -> int
273
274
275 index s c returns the index of the first occurrence of byte c in s .
276
277 Raise Not_found if c does not occur in s .
278
279
280
281 val index_opt : bytes -> char -> int option
282
283
284 index_opt s c returns the index of the first occurrence of byte c in s
285 or None if c does not occur in s .
286
287
288 Since 4.05
289
290
291
292 val rindex : bytes -> char -> int
293
294
295 rindex s c returns the index of the last occurrence of byte c in s .
296
297 Raise Not_found if c does not occur in s .
298
299
300
301 val rindex_opt : bytes -> char -> int option
302
303
304 rindex_opt s c returns the index of the last occurrence of byte c in s
305 or None if c does not occur in s .
306
307
308 Since 4.05
309
310
311
312 val index_from : bytes -> int -> char -> int
313
314
315 index_from s i c returns the index of the first occurrence of byte c in
316 s after position i . Bytes.index s c is equivalent to Bytes.index_from
317 s 0 c .
318
319 Raise Invalid_argument if i is not a valid position in s . Raise
320 Not_found if c does not occur in s after position i .
321
322
323
324 val index_from_opt : bytes -> int -> char -> int option
325
326
index_from_opt s i c returns the index of the first occurrence of byte
c in s after position i or None if c does not occur in s after position
i. Bytes.index_opt s c is equivalent to Bytes.index_from_opt s 0 c.
330
331 Raise Invalid_argument if i is not a valid position in s .
332
333
334 Since 4.05
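
Since the search "after position i" includes the byte at index i itself, repeated calls must restart one past the previous hit. The hypothetical helper all_indices below (not part of Bytes) sketches this loop.

```ocaml
(* Hypothetical helper: every index at which c occurs in s. *)
let all_indices (s : bytes) (c : char) : int list =
  let rec go i acc =
    (* i never exceeds Bytes.length s, so it is always a valid position. *)
    match Bytes.index_from_opt s i c with
    | None -> List.rev acc
    | Some j -> go (j + 1) (j :: acc)
  in
  go 0 []

let () =
  assert (all_indices (Bytes.of_string "a,b,,c") ',' = [1; 3; 4])
```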
335
336
337
338 val rindex_from : bytes -> int -> char -> int
339
340
341 rindex_from s i c returns the index of the last occurrence of byte c in
342 s before position i+1 . rindex s c is equivalent to rindex_from s
343 (Bytes.length s - 1) c .
344
345 Raise Invalid_argument if i+1 is not a valid position in s . Raise
346 Not_found if c does not occur in s before position i+1 .
347
348
349
350 val rindex_from_opt : bytes -> int -> char -> int option
351
352
353 rindex_from_opt s i c returns the index of the last occurrence of byte
354 c in s before position i+1 or None if c does not occur in s before
position i+1. rindex_opt s c is equivalent to rindex_from_opt s
(Bytes.length s - 1) c.
357
358 Raise Invalid_argument if i+1 is not a valid position in s .
359
360
361 Since 4.05
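
A sketch of the reverse searches, assuming a file-name-like input. The helper name extension is hypothetical.

```ocaml
(* Hypothetical helper: the bytes after the last '.', if there is one. *)
let extension (name : bytes) : bytes option =
  match Bytes.rindex_opt name '.' with
  | None -> None
  | Some i -> Some (Bytes.sub name (i + 1) (Bytes.length name - i - 1))

let () =
  let name = Bytes.of_string "archive.tar.gz" in
  assert (extension name = Some (Bytes.of_string "gz"));
  (* The last '.' is at index 11; searching back from index 10 finds
     the previous one at index 7. *)
  assert (Bytes.rindex_opt name '.' = Some 11);
  assert (Bytes.rindex_from_opt name 10 '.' = Some 7)
```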
362
363
364
365 val contains : bytes -> char -> bool
366
367
368 contains s c tests if byte c appears in s .
369
370
371
372 val contains_from : bytes -> int -> char -> bool
373
374
375 contains_from s start c tests if byte c appears in s after position
376 start . contains s c is equivalent to contains_from s 0 c .
377
378 Raise Invalid_argument if start is not a valid position in s .
379
380
381
382 val rcontains_from : bytes -> int -> char -> bool
383
384
385 rcontains_from s stop c tests if byte c appears in s before position
386 stop+1 .
387
388 Raise Invalid_argument if stop < 0 or stop+1 is not a valid position in
389 s .
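
The three membership tests differ only in the part of the sequence they examine. A minimal sketch, using a sample sequence in which '=' occurs once, at index 3:

```ocaml
let s = Bytes.of_string "key=value"

let () =
  assert (Bytes.contains s '=');
  assert (not (Bytes.contains_from s 4 '='));   (* searches indexes 4..8 *)
  assert (Bytes.rcontains_from s 3 '=');        (* searches indexes 0..3 *)
  assert (not (Bytes.rcontains_from s 2 '='))   (* searches indexes 0..2 *)
```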
390
391
392
393 val uppercase : bytes -> bytes
394
395 Deprecated. Functions operating on Latin-1 character set are depre‐
396 cated.
397
398
399 Return a copy of the argument, with all lowercase letters translated to
400 uppercase, including accented letters of the ISO Latin-1 (8859-1) char‐
401 acter set.
402
403
404
405 val lowercase : bytes -> bytes
406
407 Deprecated. Functions operating on Latin-1 character set are depre‐
408 cated.
409
410
411 Return a copy of the argument, with all uppercase letters translated to
412 lowercase, including accented letters of the ISO Latin-1 (8859-1) char‐
413 acter set.
414
415
416
417 val capitalize : bytes -> bytes
418
419 Deprecated. Functions operating on Latin-1 character set are depre‐
420 cated.
421
422
Return a copy of the argument, with the first character set to
uppercase, using the ISO Latin-1 (8859-1) character set.
425
426
427
428 val uncapitalize : bytes -> bytes
429
430 Deprecated. Functions operating on Latin-1 character set are depre‐
431 cated.
432
433
Return a copy of the argument, with the first character set to
lowercase, using the ISO Latin-1 (8859-1) character set.
436
437
438
439 val uppercase_ascii : bytes -> bytes
440
441 Return a copy of the argument, with all lowercase letters translated to
442 uppercase, using the US-ASCII character set.
443
444
445 Since 4.03.0
446
447
448
449 val lowercase_ascii : bytes -> bytes
450
451 Return a copy of the argument, with all uppercase letters translated to
452 lowercase, using the US-ASCII character set.
453
454
455 Since 4.03.0
456
457
458
459 val capitalize_ascii : bytes -> bytes
460
461 Return a copy of the argument, with the first character set to upper‐
462 case, using the US-ASCII character set.
463
464
465 Since 4.03.0
466
467
468
469 val uncapitalize_ascii : bytes -> bytes
470
471 Return a copy of the argument, with the first character set to lower‐
472 case, using the US-ASCII character set.
473
474
475 Since 4.03.0
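
The four US-ASCII case functions can be sketched as follows. Bytes outside the ASCII letters, such as '\xe9' below, are left unchanged.

```ocaml
let s = Bytes.of_string "hello World-42"

let () =
  assert (Bytes.to_string (Bytes.uppercase_ascii s) = "HELLO WORLD-42");
  assert (Bytes.to_string (Bytes.lowercase_ascii s) = "hello world-42");
  assert (Bytes.to_string (Bytes.capitalize_ascii s) = "Hello World-42");
  assert (Bytes.to_string (Bytes.uncapitalize_ascii (Bytes.of_string "Hello")) = "hello");
  (* A non-ASCII byte is not a letter as far as these functions go. *)
  assert (Bytes.to_string (Bytes.uppercase_ascii (Bytes.of_string "\xe9")) = "\xe9")
```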
476
477
478 type t = bytes
479
480
481 An alias for the type of byte sequences.
482
483
484
485 val compare : t -> t -> int
486
The comparison function for byte sequences, with the same specification
as the polymorphic Stdlib.compare. Along with the type t, this function
compare allows the module Bytes to be passed as argument to the
functors Set.Make and Map.Make.
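
Passing Bytes to Set.Make can be sketched as below; the module name BytesSet is chosen for the example. Since byte sequences are mutable, it is presumably wise not to mutate a sequence after storing it in such a set, because the set's ordering depends on the contents.

```ocaml
module BytesSet = Set.Make (Bytes)

let set = BytesSet.of_list (List.map Bytes.of_string ["b"; "a"; "b"])

let () =
  assert (BytesSet.cardinal set = 2);            (* the duplicate "b" is merged *)
  assert (BytesSet.mem (Bytes.of_string "a") set);
  assert (Bytes.compare (Bytes.of_string "a") (Bytes.of_string "b") < 0)
```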
491
492
493
494 val equal : t -> t -> bool
495
496 The equality function for byte sequences.
497
498
499 Since 4.03.0
500
501
502
503
504 Unsafe conversions (for advanced users)
505 This section describes unsafe, low-level conversion functions between
506 bytes and string . They do not copy the internal data; used improperly,
507 they can break the immutability invariant on strings provided by the
508 -safe-string option. They are available for expert library authors, but
509 for most purposes you should use the always-correct Bytes.to_string and
510 Bytes.of_string instead.
511
512 val unsafe_to_string : bytes -> string
513
514 Unsafely convert a byte sequence into a string.
515
516 To reason about the use of unsafe_to_string , it is convenient to con‐
517 sider an "ownership" discipline. A piece of code that manipulates some
518 data "owns" it; there are several disjoint ownership modes, including:
519
- Unique ownership: the data may be accessed and mutated.

- Shared ownership: the data has several owners, which may only access
  it, not mutate it.
524
525 Unique ownership is linear: passing the data to another piece of code
526 means giving up ownership (we cannot write the data again). A unique
527 owner may decide to make the data shared (giving up mutation rights on
528 it), but shared data may not become uniquely-owned again.
529
530
531 unsafe_to_string s can only be used when the caller owns the byte
532 sequence s -- either uniquely or as shared immutable data. The caller
533 gives up ownership of s , and gains ownership of the returned string.
534
535 There are two valid use-cases that respect this ownership discipline:
536
537 1. Creating a string by initializing and mutating a byte sequence that
538 is never changed after initialization is performed.
539
540
    let string_init len f : string =
      let s = Bytes.create len in
      for i = 0 to len - 1 do
        Bytes.set s i (f i)
      done;
      Bytes.unsafe_to_string s
543
544 This function is safe because the byte sequence s will never be
545 accessed or mutated after unsafe_to_string is called. The string_init
546 code gives up ownership of s , and returns the ownership of the result‐
547 ing string to its caller.
548
549 Note that it would be unsafe if s was passed as an additional parameter
550 to the function f as it could escape this way and be mutated in the
551 future -- string_init would give up ownership of s to pass it to f ,
552 and could not call unsafe_to_string safely.
553
554 We have provided the String.init , String.map and String.mapi functions
555 to cover most cases of building new strings. You should prefer those
556 over to_string or unsafe_to_string whenever applicable.
557
558 2. Temporarily giving ownership of a byte sequence to a function that
559 expects a uniquely owned string and returns ownership back, so that we
560 can mutate the sequence again after the call ended.
561
562
563 let bytes_length (s : bytes) = String.length (Bytes.unsafe_to_string s)
564
565 In this use-case, we do not promise that s will never be mutated after
566 the call to bytes_length s . The String.length function temporarily
567 borrows unique ownership of the byte sequence (and sees it as a string
568 ), but returns this ownership back to the caller, which may assume that
s is still a valid byte sequence after the call. Note that this is only
correct because we know that String.length does not capture its
argument; a function that did capture it could let it escape through a
side channel such as a memoization combinator.
573
574 The caller may not mutate s while the string is borrowed (it has tempo‐
575 rarily given up ownership). This affects concurrent programs, but also
576 higher-order functions: if String.length returned a closure to be
577 called later, s should not be mutated until this closure is fully
578 applied and returns ownership.
579
580
581
582 val unsafe_of_string : string -> bytes
583
584 Unsafely convert a shared string to a byte sequence that should not be
585 mutated.
586
The same ownership discipline that makes unsafe_to_string correct
applies to unsafe_of_string: you may use it if you were the owner of
the string value, and you will own the returned bytes in the same mode.
590
591 In practice, unique ownership of string values is extremely difficult
592 to reason about correctly. You should always assume strings are shared,
593 never uniquely owned.
594
595 For example, string literals are implicitly shared by the compiler, so
596 you never uniquely own them.
597
598
    let incorrect = Bytes.unsafe_of_string "hello"
    let s = Bytes.of_string "hello"
601
The first declaration is incorrect, because the string literal "hello"
603 could be shared by the compiler with other parts of the program, and
604 mutating incorrect is a bug. You must always use the second version,
605 which performs a copy and is thus correct.
606
Assuming unique ownership of strings that are not string literals, but
are (partly) built from string literals, is also incorrect. For
example, mutating unsafe_of_string ("foo" ^ s) could mutate the shared
string "foo", assuming a rope-like representation of strings. More
generally, functions operating on strings will assume shared ownership;
they do not preserve unique ownership. It is thus incorrect to assume
unique ownership of the result of unsafe_of_string.
614
The only case that we are reasonably confident is safe is when the
produced bytes value is shared, that is, used as an immutable byte
sequence. This is possibly useful for incremental migration of
low-level programs that manipulate immutable sequences of bytes (for
example Marshal.from_bytes) and previously used the string type for
this purpose.
620
621
622
623
624 Iterators
625 val to_seq : t -> char Seq.t
626
Iterate on the byte sequence, in increasing index order. Modifications
of the byte sequence during iteration will be reflected in the iterator.
629
630
631 Since 4.07
632
633
634
635 val to_seqi : t -> (int * char) Seq.t
636
Iterate on the byte sequence, in increasing index order, yielding
indices along with chars.
639
640
641 Since 4.07
642
643
644
645 val of_seq : char Seq.t -> t
646
Create a byte sequence from the generator.
648
649
650 Since 4.07
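
The iterator functions compose with the Seq module. In the sketch below, the helper name digits_only is hypothetical; it filters a byte sequence through to_seq and of_seq.

```ocaml
(* Hypothetical helper: keep only the ASCII digits of s. *)
let digits_only (s : bytes) : bytes =
  Bytes.to_seq s
  |> Seq.filter (fun c -> c >= '0' && c <= '9')
  |> Bytes.of_seq

let () =
  assert (Bytes.to_string (digits_only (Bytes.of_string "a1b22c")) = "122");
  (* to_seqi pairs each byte with its index. *)
  assert (List.of_seq (Bytes.to_seqi (Bytes.of_string "ab")) = [(0, 'a'); (1, 'b')])
```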
651
652
653
654
655 Binary encoding/decoding of integers
656 The functions in this section binary encode and decode integers to and
657 from byte sequences.
658
659 All following functions raise Invalid_argument if the space needed at
660 index i to decode or encode the integer is not available.
661
662 Little-endian (resp. big-endian) encoding means that least (resp. most)
663 significant bytes are stored first. Big-endian is also known as net‐
664 work byte order. Native-endian encoding is either little-endian or
665 big-endian depending on Sys.big_endian .
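
The byte order conventions can be made concrete by writing one 32-bit value and reading it back in different ways. This is a sketch; the value 0x01020304 is arbitrary.

```ocaml
let b = Bytes.create 4

(* Big-endian: the most significant byte 0x01 is stored first. *)
let () = Bytes.set_int32_be b 0 0x01020304l

let () =
  assert (Bytes.to_string b = "\x01\x02\x03\x04");
  assert (Bytes.get_int32_be b 0 = 0x01020304l);
  (* Reading the same four bytes as little-endian reverses their weight. *)
  assert (Bytes.get_int32_le b 0 = 0x04030201l);
  assert (Bytes.get_uint16_be b 2 = 0x0304);
  assert (Bytes.get_uint16_le b 2 = 0x0403)

(* Four bytes are needed from index 1, but only three are available. *)
let no_room =
  match Bytes.get_int32_be b 1 with
  | _ -> false
  | exception Invalid_argument _ -> true
```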
666
667 32-bit and 64-bit integers are represented by the int32 and int64
668 types, which can be interpreted either as signed or unsigned numbers.
669
670 8-bit and 16-bit integers are represented by the int type, which has
671 more bits than the binary encoding. These extra bits are handled as
672 follows:
673
- Functions that decode signed (resp. unsigned) 8-bit or 16-bit integers
  represented by int values sign-extend (resp. zero-extend) their result.

- Functions that encode 8-bit or 16-bit integers represented by int
  values truncate their input to their least significant bytes.
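
The two rules above can be sketched on a pair of 0xff bytes: the same bits decode to different int values depending on signedness, and encoding keeps only the low-order bytes of the input.

```ocaml
let b = Bytes.make 2 '\xff'

let () =
  (* Decoding: zero-extension versus sign-extension of the same bits. *)
  assert (Bytes.get_uint8 b 0 = 255);
  assert (Bytes.get_int8 b 0 = -1);
  assert (Bytes.get_uint16_le b 0 = 0xffff);
  assert (Bytes.get_int16_le b 0 = -1);
  (* Encoding: 0x141 is truncated to its least significant byte, 0x41. *)
  Bytes.set_uint8 b 0 0x141;
  assert (Bytes.get b 0 = 'A');
  (* -2 is stored as the 16-bit pattern 0xfffe. *)
  Bytes.set_int16_be b 0 (-2);
  assert (Bytes.get_uint16_be b 0 = 0xfffe)
```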
679
680
681 val get_uint8 : bytes -> int -> int
682
683
684 get_uint8 b i is b 's unsigned 8-bit integer starting at byte index i .
685
686
687 Since 4.08
688
689
690
691 val get_int8 : bytes -> int -> int
692
693
694 get_int8 b i is b 's signed 8-bit integer starting at byte index i .
695
696
697 Since 4.08
698
699
700
701 val get_uint16_ne : bytes -> int -> int
702
703
704 get_uint16_ne b i is b 's native-endian unsigned 16-bit integer start‐
705 ing at byte index i .
706
707
708 Since 4.08
709
710
711
712 val get_uint16_be : bytes -> int -> int
713
714
715 get_uint16_be b i is b 's big-endian unsigned 16-bit integer starting
716 at byte index i .
717
718
719 Since 4.08
720
721
722
723 val get_uint16_le : bytes -> int -> int
724
725
726 get_uint16_le b i is b 's little-endian unsigned 16-bit integer start‐
727 ing at byte index i .
728
729
730 Since 4.08
731
732
733
734 val get_int16_ne : bytes -> int -> int
735
736
737 get_int16_ne b i is b 's native-endian signed 16-bit integer starting
738 at byte index i .
739
740
741 Since 4.08
742
743
744
745 val get_int16_be : bytes -> int -> int
746
747
748 get_int16_be b i is b 's big-endian signed 16-bit integer starting at
749 byte index i .
750
751
752 Since 4.08
753
754
755
756 val get_int16_le : bytes -> int -> int
757
758
759 get_int16_le b i is b 's little-endian signed 16-bit integer starting
760 at byte index i .
761
762
763 Since 4.08
764
765
766
767 val get_int32_ne : bytes -> int -> int32
768
769
770 get_int32_ne b i is b 's native-endian 32-bit integer starting at byte
771 index i .
772
773
774 Since 4.08
775
776
777
778 val get_int32_be : bytes -> int -> int32
779
780
781 get_int32_be b i is b 's big-endian 32-bit integer starting at byte
782 index i .
783
784
785 Since 4.08
786
787
788
789 val get_int32_le : bytes -> int -> int32
790
791
792 get_int32_le b i is b 's little-endian 32-bit integer starting at byte
793 index i .
794
795
796 Since 4.08
797
798
799
800 val get_int64_ne : bytes -> int -> int64
801
802
803 get_int64_ne b i is b 's native-endian 64-bit integer starting at byte
804 index i .
805
806
807 Since 4.08
808
809
810
811 val get_int64_be : bytes -> int -> int64
812
813
814 get_int64_be b i is b 's big-endian 64-bit integer starting at byte
815 index i .
816
817
818 Since 4.08
819
820
821
822 val get_int64_le : bytes -> int -> int64
823
824
825 get_int64_le b i is b 's little-endian 64-bit integer starting at byte
826 index i .
827
828
829 Since 4.08
830
831
832
833 val set_uint8 : bytes -> int -> int -> unit
834
835
836 set_uint8 b i v sets b 's unsigned 8-bit integer starting at byte index
837 i to v .
838
839
840 Since 4.08
841
842
843
844 val set_int8 : bytes -> int -> int -> unit
845
846
847 set_int8 b i v sets b 's signed 8-bit integer starting at byte index i
848 to v .
849
850
851 Since 4.08
852
853
854
855 val set_uint16_ne : bytes -> int -> int -> unit
856
857
858 set_uint16_ne b i v sets b 's native-endian unsigned 16-bit integer
859 starting at byte index i to v .
860
861
862 Since 4.08
863
864
865
866 val set_uint16_be : bytes -> int -> int -> unit
867
868
869 set_uint16_be b i v sets b 's big-endian unsigned 16-bit integer start‐
870 ing at byte index i to v .
871
872
873 Since 4.08
874
875
876
877 val set_uint16_le : bytes -> int -> int -> unit
878
879
880 set_uint16_le b i v sets b 's little-endian unsigned 16-bit integer
881 starting at byte index i to v .
882
883
884 Since 4.08
885
886
887
888 val set_int16_ne : bytes -> int -> int -> unit
889
890
891 set_int16_ne b i v sets b 's native-endian signed 16-bit integer start‐
892 ing at byte index i to v .
893
894
895 Since 4.08
896
897
898
899 val set_int16_be : bytes -> int -> int -> unit
900
901
902 set_int16_be b i v sets b 's big-endian signed 16-bit integer starting
903 at byte index i to v .
904
905
906 Since 4.08
907
908
909
910 val set_int16_le : bytes -> int -> int -> unit
911
912
913 set_int16_le b i v sets b 's little-endian signed 16-bit integer start‐
914 ing at byte index i to v .
915
916
917 Since 4.08
918
919
920
921 val set_int32_ne : bytes -> int -> int32 -> unit
922
923
924 set_int32_ne b i v sets b 's native-endian 32-bit integer starting at
925 byte index i to v .
926
927
928 Since 4.08
929
930
931
932 val set_int32_be : bytes -> int -> int32 -> unit
933
934
935 set_int32_be b i v sets b 's big-endian 32-bit integer starting at byte
936 index i to v .
937
938
939 Since 4.08
940
941
942
943 val set_int32_le : bytes -> int -> int32 -> unit
944
945
946 set_int32_le b i v sets b 's little-endian 32-bit integer starting at
947 byte index i to v .
948
949
950 Since 4.08
951
952
953
954 val set_int64_ne : bytes -> int -> int64 -> unit
955
956
957 set_int64_ne b i v sets b 's native-endian 64-bit integer starting at
958 byte index i to v .
959
960
961 Since 4.08
962
963
964
965 val set_int64_be : bytes -> int -> int64 -> unit
966
967
968 set_int64_be b i v sets b 's big-endian 64-bit integer starting at byte
969 index i to v .
970
971
972 Since 4.08
973
974
975
976 val set_int64_le : bytes -> int -> int64 -> unit
977
978
979 set_int64_le b i v sets b 's little-endian 64-bit integer starting at
980 byte index i to v .
981
982
983 Since 4.08
984
985
986
987
988
989OCamldoc 2019-07-30 Bytes(3)