Bytes(3) OCaml library Bytes(3)
2
3
4
6 Bytes - Byte sequence operations.
7
9 Module Bytes
10
12 Module Bytes
13 : sig end
14
15
16 Byte sequence operations.
17
18 A byte sequence is a mutable data structure that contains a
19 fixed-length sequence of bytes. Each byte can be indexed in constant
20 time for reading or writing.
21
22 Given a byte sequence s of length l , we can access each of the l bytes
23 of s via its index in the sequence. Indexes start at 0 , and we will
24 call an index valid in s if it falls within the range [0...l-1] (inclu‐
25 sive). A position is the point between two bytes or at the beginning or
26 end of the sequence. We call a position valid in s if it falls within
27 the range [0...l] (inclusive). Note that the byte at index n is between
28 positions n and n+1 .
29
30 Two parameters start and len are said to designate a valid range of s
31 if len >= 0 and start and start+len are valid positions in s .
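
The difference between valid indexes and valid positions can be sketched as follows. This is only an illustration; the names s and raises_invalid are hypothetical and not part of the Bytes module.

```ocaml
(* Sketch: valid indexes are 0..l-1, valid positions are 0..l. *)
let s = Bytes.of_string "hello"          (* length l = 5 *)

(* Hypothetical helper: true if [f ()] raises Invalid_argument. *)
let raises_invalid f =
  try ignore (f ()); false with Invalid_argument _ -> true

(* Index 4 is the last valid index; index 5 is not a valid index. *)
let last = Bytes.get s 4
let bad_index = raises_invalid (fun () -> Bytes.get s 5)

(* start = 5, len = 0 is a valid range: 5 and 5+0 are valid positions. *)
let empty_tail = Bytes.sub s 5 0

(* start = 3, len = 3 is not: 3+3 = 6 lies past the last valid position. *)
let bad_range = raises_invalid (fun () -> Bytes.sub s 3 3)
```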
32
33 Byte sequences can be modified in place, for instance via the set and
34 blit functions described below. See also strings (module String ),
35 which are almost the same data structure, but cannot be modified in
36 place.
37
38 Bytes are represented by the OCaml type char .
39
40
41 Since 4.02.0
42
43
44
45
46
47
48 val length : bytes -> int
49
50 Return the length (number of bytes) of the argument.
51
52
53
54 val get : bytes -> int -> char
55
56
57 get s n returns the byte at index n in argument s .
58
59 Raise Invalid_argument if n is not a valid index in s .
60
61
62
63 val set : bytes -> int -> char -> unit
64
65
66 set s n c modifies s in place, replacing the byte at index n with c .
67
68 Raise Invalid_argument if n is not a valid index in s .
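
A minimal sketch of reading and writing a single byte with get and set (the value names are illustrative only):

```ocaml
let s = Bytes.of_string "cat"

(* Read the byte at index 0. *)
let first = Bytes.get s 0

(* Replace the byte at index 0; s itself is modified, no copy is made. *)
let () = Bytes.set s 0 'b'

let after = Bytes.to_string s
```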
69
70
71
72 val create : int -> bytes
73
74
75 create n returns a new byte sequence of length n . The sequence is
76 uninitialized and contains arbitrary bytes.
77
78 Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
79
80
81
82 val make : int -> char -> bytes
83
84
85 make n c returns a new byte sequence of length n , filled with the byte
86 c .
87
88 Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
89
90
91
92 val init : int -> (int -> char) -> bytes
93
94
95 Bytes.init n f returns a fresh byte sequence of length n , with charac‐
96 ter i initialized to the result of f i (in increasing index order).
97
98 Raise Invalid_argument if n < 0 or n > Sys.max_string_length .
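
The three constructors above might be used as in this short sketch (value names are illustrative). Since create leaves its contents arbitrary, only the length of its result is relied upon here.

```ocaml
(* Three copies of the byte '-'. *)
let dashes = Bytes.make 3 '-'

(* Byte i is computed from its index: 'a', 'b', 'c', 'd', 'e'. *)
let alphabet = Bytes.init 5 (fun i -> Char.chr (Char.code 'a' + i))

(* Contents are arbitrary until written; only the length is known. *)
let scratch = Bytes.create 4
```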
99
100
101
102 val empty : bytes
103
104 A byte sequence of size 0.
105
106
107
108 val copy : bytes -> bytes
109
110 Return a new byte sequence that contains the same bytes as the argu‐
111 ment.
112
113
114
115 val of_string : string -> bytes
116
117 Return a new byte sequence that contains the same bytes as the given
118 string.
119
120
121
122 val to_string : bytes -> string
123
124 Return a new string that contains the same bytes as the given byte
125 sequence.
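
Because of_string and to_string both copy, later mutation of the byte sequence does not affect the strings on either side of the conversion. A small sketch (names are illustrative):

```ocaml
let original = "abc"

(* b is a fresh copy; mutating it leaves original untouched. *)
let b = Bytes.of_string original
let () = Bytes.set b 0 'X'

(* snapshot is again a copy, taken at this point in time. *)
let snapshot = Bytes.to_string b
let () = Bytes.set b 1 'Y'
```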
126
127
128
129 val sub : bytes -> int -> int -> bytes
130
131
132 sub s start len returns a new byte sequence of length len , containing
133 the subsequence of s that starts at position start and has length len .
134
135 Raise Invalid_argument if start and len do not designate a valid range
136 of s .
137
138
139
140 val sub_string : bytes -> int -> int -> string
141
142 Same as sub but return a string instead of a byte sequence.
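
As an illustration of sub and sub_string (value names are hypothetical), note that the result of sub is a new sequence, so writing to it does not change the source:

```ocaml
let s = Bytes.of_string "hello world"

(* Five bytes starting at position 6. *)
let word = Bytes.sub s 6 5

(* Same extraction, but returned as an immutable string. *)
let word_str = Bytes.sub_string s 0 5

(* word is independent of s. *)
let () = Bytes.set word 0 'W'
```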
143
144
145
146 val extend : bytes -> int -> int -> bytes
147
148
149 extend s left right returns a new byte sequence that contains the bytes
150 of s , with left uninitialized bytes prepended and right uninitialized
151 bytes appended to it. If left or right is negative, then bytes are
152 removed (instead of appended) from the corresponding side of s .
153
154 Raise Invalid_argument if the result length is negative or longer than
155 Sys.max_string_length bytes.
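
A brief sketch of extend with positive and negative arguments (value names are illustrative). The added bytes are uninitialized, so the example only inspects the length and the part copied from s.

```ocaml
let s = Bytes.of_string "hello"

(* Negative arguments drop bytes: one from the left, one from the right. *)
let middle = Bytes.extend s (-1) (-1)

(* Positive arguments add room: 2 bytes before and 1 byte after "ab". *)
let padded = Bytes.extend (Bytes.of_string "ab") 2 1

(* The original bytes sit at offset 2 in the result. *)
let kept = Bytes.sub_string padded 2 2
```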
156
157
158
159 val fill : bytes -> int -> int -> char -> unit
160
161
162 fill s start len c modifies s in place, replacing len characters with c
163 , starting at start .
164
165 Raise Invalid_argument if start and len do not designate a valid range
166 of s .
167
168
169
170 val blit : bytes -> int -> bytes -> int -> int -> unit
171
172
173 blit src srcoff dst dstoff len copies len bytes from sequence src ,
174 starting at index srcoff , to sequence dst , starting at index dstoff .
175 It works correctly even if src and dst are the same byte sequence, and
176 the source and destination intervals overlap.
177
178 Raise Invalid_argument if srcoff and len do not designate a valid range
179 of src , or if dstoff and len do not designate a valid range of dst .
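
The in-place operations fill and blit can be sketched as follows, including a blit whose source and destination ranges overlap within the same sequence (value names are illustrative):

```ocaml
(* Overlapping blit: copy bytes 0..3 ("abcd") onto positions 2..5. *)
let s = Bytes.of_string "abcdef"
let () = Bytes.blit s 0 s 2 4

(* fill: overwrite 3 bytes starting at position 1. *)
let t = Bytes.make 6 '.'
let () = Bytes.fill t 1 3 '#'

(* blit between two distinct sequences. *)
let dst = Bytes.make 5 '-'
let () = Bytes.blit (Bytes.of_string "xyz") 1 dst 0 2
```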
180
181
182
183 val blit_string : string -> int -> bytes -> int -> int -> unit
184
185
blit_string src srcoff dst dstoff len copies len bytes from string src,
starting at index srcoff, to byte sequence dst, starting at index
dstoff.
189
190 Raise Invalid_argument if srcoff and len do not designate a valid range
191 of src , or if dstoff and len do not designate a valid range of dst .
192
193
194
195 val concat : bytes -> bytes list -> bytes
196
197
198 concat sep sl concatenates the list of byte sequences sl , inserting
199 the separator byte sequence sep between each, and returns the result as
200 a new byte sequence.
201
202 Raise Invalid_argument if the result is longer than
203 Sys.max_string_length bytes.
204
205
206
207 val cat : bytes -> bytes -> bytes
208
209
cat s1 s2 concatenates s1 and s2 and returns the result as a new byte
sequence.
212
213 Raise Invalid_argument if the result is longer than
214 Sys.max_string_length bytes.
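
A short sketch of concat and cat (value names are illustrative):

```ocaml
let parts = List.map Bytes.of_string ["a"; "b"; "c"]

(* The separator appears between elements only, not at either end. *)
let joined = Bytes.concat (Bytes.of_string ", ") parts

(* Concatenating an empty list gives an empty sequence. *)
let none = Bytes.concat (Bytes.of_string ", ") []

let both = Bytes.cat (Bytes.of_string "foo") (Bytes.of_string "bar")
```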
215
216
217
218 val iter : (char -> unit) -> bytes -> unit
219
220
221 iter f s applies function f in turn to all the bytes of s . It is
222 equivalent to f (get s 0); f (get s 1); ...; f (get s (length s - 1));
223 () .
224
225
226
227 val iteri : (int -> char -> unit) -> bytes -> unit
228
229 Same as Bytes.iter , but the function is applied to the index of the
230 byte as first argument and the byte itself as second argument.
231
232
233
234 val map : (char -> char) -> bytes -> bytes
235
236
237 map f s applies function f in turn to all the bytes of s (in increasing
238 index order) and stores the resulting bytes in a new sequence that is
239 returned as the result.
240
241
242
243 val mapi : (int -> char -> char) -> bytes -> bytes
244
245
246 mapi f s calls f with each character of s and its index (in increasing
247 index order) and stores the resulting bytes in a new sequence that is
248 returned as the result.
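
The traversal functions might be used as in this sketch. The Buffer is only there to record the order of calls; all value names are illustrative.

```ocaml
let s = Bytes.of_string "abc"

(* iteri passes the index first, then the byte. *)
let buf = Buffer.create 16
let () =
  Bytes.iteri
    (fun i c -> Buffer.add_string buf (Printf.sprintf "%d:%c " i c))
    s
let trace = Buffer.contents buf

(* map and mapi return new sequences; s is left unchanged. *)
let upper = Bytes.map Char.uppercase_ascii s
let alternating =
  Bytes.mapi (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c) s
```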
249
250
251
252 val trim : bytes -> bytes
253
254 Return a copy of the argument, without leading and trailing whitespace.
255 The bytes regarded as whitespace are the ASCII characters ' ' , '\012'
256 , '\n' , '\r' , and '\t' .
257
258
259
260 val escaped : bytes -> bytes
261
262 Return a copy of the argument, with special characters represented by
263 escape sequences, following the lexical conventions of OCaml. All
264 characters outside the ASCII printable range (32..126) are escaped, as
265 well as backslash and double-quote.
266
267 Raise Invalid_argument if the result is longer than
268 Sys.max_string_length bytes.
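
A small sketch of trim and escaped on sample inputs (value names are illustrative):

```ocaml
(* Leading and trailing whitespace is removed; inner spaces are kept. *)
let trimmed = Bytes.trim (Bytes.of_string "  \t hi there \r\n")

(* The tab and the double quotes are replaced by escape sequences. *)
let escaped = Bytes.escaped (Bytes.of_string "tab\there \"q\"")
```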
269
270
271
272 val index : bytes -> char -> int
273
274
275 index s c returns the index of the first occurrence of byte c in s .
276
277 Raise Not_found if c does not occur in s .
278
279
280
281 val index_opt : bytes -> char -> int option
282
283
284 index_opt s c returns the index of the first occurrence of byte c in s
285 or None if c does not occur in s .
286
287
288 Since 4.05
289
290
291
292 val rindex : bytes -> char -> int
293
294
295 rindex s c returns the index of the last occurrence of byte c in s .
296
297 Raise Not_found if c does not occur in s .
298
299
300
301 val rindex_opt : bytes -> char -> int option
302
303
304 rindex_opt s c returns the index of the last occurrence of byte c in s
305 or None if c does not occur in s .
306
307
308 Since 4.05
309
310
311
312 val index_from : bytes -> int -> char -> int
313
314
315 index_from s i c returns the index of the first occurrence of byte c in
316 s after position i . Bytes.index s c is equivalent to Bytes.index_from
317 s 0 c .
318
319 Raise Invalid_argument if i is not a valid position in s . Raise
320 Not_found if c does not occur in s after position i .
321
322
323
324 val index_from_opt : bytes -> int -> char -> int option
325
326
index_from_opt s i c returns the index of the first occurrence of byte
c in s after position i or None if c does not occur in s after position
i. Bytes.index_opt s c is equivalent to Bytes.index_from_opt s 0 c.
330
331 Raise Invalid_argument if i is not a valid position in s .
332
333
334 Since 4.05
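
The forward search functions can be illustrated as follows (value names are hypothetical). As the equivalence with index suggests, the search starting at position i includes the byte at index i itself.

```ocaml
let s = Bytes.of_string "a,b,c"          (* length 5, ',' at indexes 1 and 3 *)

let first = Bytes.index s ','
let at_one = Bytes.index_from s 1 ','    (* the byte at index 1 is examined *)
let second = Bytes.index_from s 2 ','
let missing = Bytes.index_from_opt s 4 ','

(* 5 is a valid position (the end of s), so this is None, not an error. *)
let at_end = Bytes.index_from_opt s 5 ','

(* 6 is not a valid position. *)
let past_end_raises =
  try ignore (Bytes.index_from_opt s 6 ','); false
  with Invalid_argument _ -> true
```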
335
336
337
338 val rindex_from : bytes -> int -> char -> int
339
340
341 rindex_from s i c returns the index of the last occurrence of byte c in
342 s before position i+1 . rindex s c is equivalent to rindex_from s
343 (Bytes.length s - 1) c .
344
345 Raise Invalid_argument if i+1 is not a valid position in s . Raise
346 Not_found if c does not occur in s before position i+1 .
347
348
349
350 val rindex_from_opt : bytes -> int -> char -> int option
351
352
rindex_from_opt s i c returns the index of the last occurrence of byte
c in s before position i+1 or None if c does not occur in s before
position i+1. rindex_opt s c is equivalent to rindex_from_opt s
(Bytes.length s - 1) c.
357
358 Raise Invalid_argument if i+1 is not a valid position in s .
359
360
361 Since 4.05
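
A sketch of the backward search functions (value names are illustrative). The search before position i+1 includes the byte at index i.

```ocaml
let s = Bytes.of_string "a,b,c"          (* ',' at indexes 1 and 3 *)

let last = Bytes.rindex s ','
let at_three = Bytes.rindex_from s 3 ','  (* the byte at index 3 is examined *)
let earlier = Bytes.rindex_from s 2 ','
let none = Bytes.rindex_from_opt s 0 ','

(* i = -1 gives position i+1 = 0, which is valid: an empty search. *)
let empty_prefix = Bytes.rindex_from_opt s (-1) ','
```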
362
363
364
365 val contains : bytes -> char -> bool
366
367
368 contains s c tests if byte c appears in s .
369
370
371
372 val contains_from : bytes -> int -> char -> bool
373
374
375 contains_from s start c tests if byte c appears in s after position
376 start . contains s c is equivalent to contains_from s 0 c .
377
378 Raise Invalid_argument if start is not a valid position in s .
379
380
381
382 val rcontains_from : bytes -> int -> char -> bool
383
384
385 rcontains_from s stop c tests if byte c appears in s before position
386 stop+1 .
387
388 Raise Invalid_argument if stop < 0 or stop+1 is not a valid position in
389 s .
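
The membership tests might be used as in this sketch (value names are illustrative):

```ocaml
let s = Bytes.of_string "key=value"      (* '=' is at index 3 *)

let has_eq = Bytes.contains s '='

(* Looks at indexes 4 and up, so the '=' at index 3 is not seen. *)
let after_eq = Bytes.contains_from s 4 '='

(* Looks at indexes 0..2 only. *)
let before_eq = Bytes.rcontains_from s 2 '='

(* Looks at indexes 0..3, which includes the '='. *)
let upto_eq = Bytes.rcontains_from s 3 '='
```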
390
391
392
393 val uppercase : bytes -> bytes
394
395 Deprecated. Functions operating on Latin-1 character set are depre‐
396 cated.
397
398
399 Return a copy of the argument, with all lowercase letters translated to
400 uppercase, including accented letters of the ISO Latin-1 (8859-1) char‐
401 acter set.
402
403
404
405 val lowercase : bytes -> bytes
406
407 Deprecated. Functions operating on Latin-1 character set are depre‐
408 cated.
409
410
411 Return a copy of the argument, with all uppercase letters translated to
412 lowercase, including accented letters of the ISO Latin-1 (8859-1) char‐
413 acter set.
414
415
416
417 val capitalize : bytes -> bytes
418
419 Deprecated. Functions operating on Latin-1 character set are depre‐
420 cated.
421
422
Return a copy of the argument, with the first character set to
uppercase, using the ISO Latin-1 (8859-1) character set.
425
426
427
428 val uncapitalize : bytes -> bytes
429
430 Deprecated. Functions operating on Latin-1 character set are depre‐
431 cated.
432
433
Return a copy of the argument, with the first character set to
lowercase, using the ISO Latin-1 (8859-1) character set.
436
437
438
439 val uppercase_ascii : bytes -> bytes
440
441 Return a copy of the argument, with all lowercase letters translated to
442 uppercase, using the US-ASCII character set.
443
444
445 Since 4.03.0
446
447
448
449 val lowercase_ascii : bytes -> bytes
450
451 Return a copy of the argument, with all uppercase letters translated to
452 lowercase, using the US-ASCII character set.
453
454
455 Since 4.03.0
456
457
458
459 val capitalize_ascii : bytes -> bytes
460
461 Return a copy of the argument, with the first character set to upper‐
462 case, using the US-ASCII character set.
463
464
465 Since 4.03.0
466
467
468
469 val uncapitalize_ascii : bytes -> bytes
470
471 Return a copy of the argument, with the first character set to lower‐
472 case, using the US-ASCII character set.
473
474
475 Since 4.03.0
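
A short sketch of the four US-ASCII case functions (value names are illustrative). The last line uses the byte 233, which is an accented letter in Latin-1, to show that bytes outside US-ASCII are left as they are.

```ocaml
let s = Bytes.of_string "hello World"

let up = Bytes.uppercase_ascii s
let low = Bytes.lowercase_ascii s
let cap = Bytes.capitalize_ascii s
let uncap = Bytes.uncapitalize_ascii (Bytes.of_string "ABC")

(* Byte 233 is not a US-ASCII letter, so it is not translated. *)
let accented = Bytes.uppercase_ascii (Bytes.of_string "\233")
```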
476
477
478 type t = bytes
479
480
481 An alias for the type of byte sequences.
482
483
484
485 val compare : t -> t -> int
486
487 The comparison function for byte sequences, with the same specification
488 as Pervasives.compare . Along with the type t , this function compare
489 allows the module Bytes to be passed as argument to the functors
490 Set.Make and Map.Make .
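
Passing Bytes to the Set.Make and Map.Make functors could look like the sketch below. The module names BytesSet and BytesMap are hypothetical.

```ocaml
module BytesSet = Set.Make (Bytes)
module BytesMap = Map.Make (Bytes)

(* Duplicates are detected by content, using Bytes.compare. *)
let set = BytesSet.of_list (List.map Bytes.of_string ["b"; "a"; "b"])
let elements = List.map Bytes.to_string (BytesSet.elements set)

(* Lookup uses a different but equal byte sequence as the key. *)
let map = BytesMap.add (Bytes.of_string "k") 1 BytesMap.empty
let found = BytesMap.find (Bytes.of_string "k") map

let ordered = Bytes.compare (Bytes.of_string "a") (Bytes.of_string "b") < 0
```

Since byte sequences are mutable, it is probably wise not to modify a sequence after using it as a set element or map key, as the ordering is computed from its contents.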
491
492
493
494 val equal : t -> t -> bool
495
496 The equality function for byte sequences.
497
498
499 Since 4.03.0
500
501
502
503
=== Unsafe conversions (for advanced users) ===

This section describes unsafe, low-level conversion functions between
bytes and string. They do not copy the internal data; used improperly,
they can break the immutability invariant on strings provided by the
-safe-string option. They are available for expert library authors, but
for most purposes you should use the always-correct Bytes.to_string and
Bytes.of_string instead.
511
512
513 val unsafe_to_string : bytes -> string
514
515 Unsafely convert a byte sequence into a string.
516
517 To reason about the use of unsafe_to_string , it is convenient to con‐
518 sider an "ownership" discipline. A piece of code that manipulates some
519 data "owns" it; there are several disjoint ownership modes, including:
520
521 -Unique ownership: the data may be accessed and mutated
522
523 -Shared ownership: the data has several owners, that may only access
524 it, not mutate it.
525
526 Unique ownership is linear: passing the data to another piece of code
527 means giving up ownership (we cannot write the data again). A unique
528 owner may decide to make the data shared (giving up mutation rights on
529 it), but shared data may not become uniquely-owned again.
530
531
532 unsafe_to_string s can only be used when the caller owns the byte
533 sequence s -- either uniquely or as shared immutable data. The caller
534 gives up ownership of s , and gains ownership of the returned string.
535
536 There are two valid use-cases that respect this ownership discipline:
537
538 1. Creating a string by initializing and mutating a byte sequence that
539 is never changed after initialization is performed.
540
541
let string_init len f : string =
  let s = Bytes.create len in
  for i = 0 to len - 1 do Bytes.set s i (f i) done;
  Bytes.unsafe_to_string s
544
545 This function is safe because the byte sequence s will never be
546 accessed or mutated after unsafe_to_string is called. The string_init
547 code gives up ownership of s , and returns the ownership of the result‐
548 ing string to its caller.
549
550 Note that it would be unsafe if s was passed as an additional parameter
551 to the function f as it could escape this way and be mutated in the
552 future -- string_init would give up ownership of s to pass it to f ,
553 and could not call unsafe_to_string safely.
554
555 We have provided the String.init , String.map and String.mapi functions
556 to cover most cases of building new strings. You should prefer those
557 over to_string or unsafe_to_string whenever applicable.
558
559 2. Temporarily giving ownership of a byte sequence to a function that
560 expects a uniquely owned string and returns ownership back, so that we
561 can mutate the sequence again after the call ended.
562
563
564 let bytes_length (s : bytes) = String.length (Bytes.unsafe_to_string s)
565
566 In this use-case, we do not promise that s will never be mutated after
567 the call to bytes_length s . The String.length function temporarily
568 borrows unique ownership of the byte sequence (and sees it as a string
569 ), but returns this ownership back to the caller, which may assume that
570 s is still a valid byte sequence after the call. Note that this is only
571 correct because we know that String.length does not capture its argu‐
572 ment -- it could escape by a side-channel such as a memoization combi‐
573 nator.
574
575 The caller may not mutate s while the string is borrowed (it has tempo‐
576 rarily given up ownership). This affects concurrent programs, but also
577 higher-order functions: if String.length returned a closure to be
578 called later, s should not be mutated until this closure is fully
579 applied and returns ownership.
580
581
582
583 val unsafe_of_string : string -> bytes
584
585 Unsafely convert a shared string to a byte sequence that should not be
586 mutated.
587
588 The same ownership discipline that makes unsafe_to_string correct
589 applies to unsafe_of_string : you may use it if you were the owner of
590 the string value, and you will own the return bytes in the same mode.
591
592 In practice, unique ownership of string values is extremely difficult
593 to reason about correctly. You should always assume strings are shared,
594 never uniquely owned.
595
596 For example, string literals are implicitly shared by the compiler, so
597 you never uniquely own them.
598
599
let incorrect = Bytes.unsafe_of_string "hello"
let s = Bytes.of_string "hello"

The first declaration is incorrect, because the string literal "hello"
could be shared by the compiler with other parts of the program, and
mutating incorrect is a bug. You must always use the second version,
which performs a copy and is thus correct.
607
Assuming unique ownership of strings that are not string literals, but
are (partly) built from string literals, is also incorrect. For
example, mutating unsafe_of_string ("foo" ^ s) could mutate the shared
string "foo" -- assuming a rope-like representation of strings. More
generally, functions operating on strings will assume shared ownership
and do not preserve unique ownership. It is thus incorrect to assume
unique ownership of the result of unsafe_of_string.
615
The only case that we are reasonably confident is safe is when the
produced bytes value is shared, that is, used as an immutable byte
sequence. This is possibly useful for incremental migration of
low-level programs that manipulate immutable sequences of bytes (for
example Marshal.from_bytes ) and previously used the string type for
this purpose.
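
As a sketch of the shared, read-only use described above: the hypothetical function count_byte views a string as a byte sequence without copying, and only ever reads from it. This illustrates the discipline and is not a recommendation; Bytes.of_string remains the safe choice.

```ocaml
(* Hypothetical helper: count occurrences of c in s via a read-only view. *)
let count_byte (s : string) (c : char) : int =
  (* b shares its data with s and must never be mutated or leaked. *)
  let b = Bytes.unsafe_of_string s in
  let n = ref 0 in
  Bytes.iter (fun x -> if x = c then incr n) b;
  !n

let a_count = count_byte "banana" 'a'
```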
621
622
623
624
625 === Iterators ===
626
627
628 val to_seq : t -> char Seq.t
629
Iterate on the byte sequence, in increasing index order. Modifications
of the byte sequence during iteration will be reflected in the iterator.
632
633
634 Since 4.07
635
636
637
638 val to_seqi : t -> (int * char) Seq.t
639
Iterate on the byte sequence, in increasing order, yielding indices
along with chars.
642
643
644 Since 4.07
645
646
647
648 val of_seq : char Seq.t -> t
649
Create a byte sequence from the generator.
651
652
653 Since 4.07
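
The iterator functions might be combined with List.of_seq and List.to_seq as in this sketch (value names are illustrative). The last three definitions show a modification made after the iterator is created but before it is consumed.

```ocaml
let s = Bytes.of_string "abc"

(* Consume the iterators immediately into lists. *)
let chars = List.of_seq (Bytes.to_seq s)
let indexed = List.of_seq (Bytes.to_seqi s)

(* Build a byte sequence from a sequence of chars. *)
let rebuilt = Bytes.of_seq (List.to_seq ['x'; 'y'])

(* The iterator reads s lazily, so this later write is visible to it. *)
let seq = Bytes.to_seq s
let () = Bytes.set s 0 'A'
let seen = List.of_seq seq
```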
654
655
656
657
658
OCamldoc 2019-02-02 Bytes(3)