binary(3)                  Erlang Module Definition                  binary(3)
2
3
4
6 binary - Library for handling binary data.
7
9 This module contains functions for manipulating byte-oriented binaries.
10 Although the majority of functions could be provided using bit-syntax,
11 the functions in this library are highly optimized and are expected to
12 either execute faster or consume less memory, or both, than a counter‐
13 part written in pure Erlang.
14
15 The module is provided according to Erlang Enhancement Proposal (EEP)
16 31.
17
18 Note:
The library handles byte-oriented data. For bitstrings that are not
binaries (that is, bitstrings that do not contain a whole number of
octets), every function in this module raises a badarg exception.
22
23
25 cp()
26
27 Opaque data type representing a compiled search pattern. Guaran‐
28 teed to be a tuple() to allow programs to distinguish it from
29 non-precompiled search patterns.
30
31 part() = {Start :: integer() >= 0, Length :: integer()}
32
33 A representation of a part (or range) in a binary. Start is a
34 zero-based offset into a binary() and Length is the length of
35 that part. As input to functions in this module, a reverse part
36 specification is allowed, constructed with a negative Length, so
37 that the part of the binary begins at Start + Length and is
38 -Length long. This is useful for referencing the last N bytes of
39 a binary as {size(Binary), -N}. The functions in this module al‐
40 ways return part()s with positive Length.
41
43 at(Subject, Pos) -> byte()
44
45 Types:
46
47 Subject = binary()
48 Pos = integer() >= 0
49
50 Returns the byte at position Pos (zero-based) in binary Subject
51 as an integer. If Pos >= byte_size(Subject), a badarg exception
52 is raised.
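
As a minimal sketch of the bounds behavior described above, the
following wraps at/2 so that an out-of-range position yields a plain
error atom instead of an exception. The module name binary_at_demo
and the function safe_at/2 are hypothetical and not part of this
library.

```erlang
-module(binary_at_demo).
-export([safe_at/2]).

%% Hypothetical helper (not part of the binary module): returns
%% {ok, Byte} for a valid zero-based position and error otherwise,
%% instead of letting the badarg exception propagate.
safe_at(Subject, Pos) when is_binary(Subject), is_integer(Pos) ->
    try binary:at(Subject, Pos) of
        Byte -> {ok, Byte}
    catch
        error:badarg -> error
    end.
```

For example, safe_at(<<"erlang">>, 0) gives {ok,101} (the byte value
of $e), while position 6 is outside the six-byte binary and gives
error.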
53
54 bin_to_list(Subject) -> [byte()]
55
56 Types:
57
58 Subject = binary()
59
60 Same as bin_to_list(Subject, {0,byte_size(Subject)}).
61
62 bin_to_list(Subject, PosLen) -> [byte()]
63
64 Types:
65
66 Subject = binary()
67 PosLen = part()
68
69 Converts Subject to a list of byte()s, each representing the
70 value of one byte. part() denotes which part of the binary() to
71 convert.
72
73 Example:
74
75 1> binary:bin_to_list(<<"erlang">>, {1,3}).
76 "rla"
77 %% or [114,108,97] in list notation.
78
79 If PosLen in any way references outside the binary, a badarg ex‐
80 ception is raised.
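
A reverse part specification, as described under part(), can also be
passed here. The sketch below converts the last N bytes of a binary
to a list; the module binary_tail_demo and the function last_bytes/2
are hypothetical names chosen for illustration.

```erlang
-module(binary_tail_demo).
-export([last_bytes/2]).

%% Hypothetical helper (not part of the binary module): the last N
%% bytes of Bin as a list of integers. The part starts at the end of
%% the binary and has a negative length, so it covers the N bytes
%% before that position. Raises badarg if N > byte_size(Bin).
last_bytes(Bin, N) when is_binary(Bin), is_integer(N), N >= 0 ->
    binary:bin_to_list(Bin, {byte_size(Bin), -N}).
```

With this helper, last_bytes(<<"erlang">>, 3) returns "ang".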
81
82 bin_to_list(Subject, Pos, Len) -> [byte()]
83
84 Types:
85
86 Subject = binary()
87 Pos = integer() >= 0
88 Len = integer()
89
90 Same as bin_to_list(Subject, {Pos, Len}).
91
92 compile_pattern(Pattern) -> cp()
93
94 Types:
95
96 Pattern = binary() | [binary()]
97
98 Builds an internal structure representing a compilation of a
99 search pattern, later to be used in functions match/3,
100 matches/3, split/3, or replace/4. The cp() returned is guaran‐
101 teed to be a tuple() to allow programs to distinguish it from
102 non-precompiled search patterns.
103
When a list of binaries is specified, it denotes a set of alternative
binaries to search for. For example, if
[<<"functional">>,<<"programming">>] is specified as Pattern, this
means either <<"functional">> or <<"programming">>. The pattern is a
set of alternatives; when only a single binary is specified, the set
has only one element. The order of alternatives in a pattern is not
significant.
111
112 The list of binaries used for search alternatives must be flat
113 and proper.
114
115 If Pattern is not a binary or a flat proper list of binaries
116 with length > 0, a badarg exception is raised.
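
The following sketch shows one way a compiled pattern might be
reused: the set of separators is compiled once and then passed to
split/3 for every line. The module binary_cp_demo and the function
split_fields/1 are hypothetical; whether precompiling pays off in
practice depends on the workload.

```erlang
-module(binary_cp_demo).
-export([split_fields/1]).

%% Hypothetical helper (not part of the binary module): splits each
%% line on "," or ";" using a single pattern that is compiled up
%% front and reused for every line.
split_fields(Lines) when is_list(Lines) ->
    Separators = binary:compile_pattern([<<",">>, <<";">>]),
    [binary:split(Line, Separators, [global]) || Line <- Lines].
```

For example, split_fields([<<"a,b;c">>]) returns
[[<<"a">>,<<"b">>,<<"c">>]].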
117
118 copy(Subject) -> binary()
119
120 Types:
121
122 Subject = binary()
123
124 Same as copy(Subject, 1).
125
126 copy(Subject, N) -> binary()
127
128 Types:
129
130 Subject = binary()
131 N = integer() >= 0
132
133 Creates a binary with the content of Subject duplicated N times.
134
135 This function always creates a new binary, even if N = 1. By us‐
136 ing copy/1 on a binary referencing a larger binary, one can free
137 up the larger binary for garbage collection.
138
139 Note:
140 By deliberately copying a single binary to avoid referencing a
141 larger binary, one can, instead of freeing up the larger binary
142 for later garbage collection, create much more binary data than
143 needed. Sharing binary data is usually good. Only in special
144 cases, when small parts reference large binaries and the large
145 binaries are no longer used in any process, deliberate copying
146 can be a good idea.
147
148
149 If N < 0, a badarg exception is raised.
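
Besides detaching a subbinary, copy/2 is a convenient way to build
repeated filler data. The sketch below pads a binary on the right to
a fixed width; the module binary_copy_demo and the function
pad_right/3 are hypothetical names used only for illustration.

```erlang
-module(binary_copy_demo).
-export([pad_right/3]).

%% Hypothetical helper (not part of the binary module): pads Bin on
%% the right with PadByte until it is Width bytes long. A binary that
%% is already Width bytes or longer is returned with no padding,
%% because copy/2 with N = 0 yields an empty binary.
pad_right(Bin, Width, PadByte)
  when is_binary(Bin), is_integer(Width),
       is_integer(PadByte), PadByte >= 0, PadByte =< 255 ->
    Missing = max(0, Width - byte_size(Bin)),
    <<Bin/binary, (binary:copy(<<PadByte>>, Missing))/binary>>.
```

For example, pad_right(<<"ab">>, 4, $\s) returns <<"ab  ">>.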
150
151 decode_unsigned(Subject) -> Unsigned
152
153 Types:
154
155 Subject = binary()
156 Unsigned = integer() >= 0
157
158 Same as decode_unsigned(Subject, big).
159
160 decode_unsigned(Subject, Endianness) -> Unsigned
161
162 Types:
163
164 Subject = binary()
165 Endianness = big | little
166 Unsigned = integer() >= 0
167
Converts the binary digit representation, in big endian or little
endian, of a non-negative integer in Subject to an Erlang integer().
171
172 Example:
173
174 1> binary:decode_unsigned(<<169,138,199>>,big).
175 11111111
176
177 encode_unsigned(Unsigned) -> binary()
178
179 Types:
180
181 Unsigned = integer() >= 0
182
183 Same as encode_unsigned(Unsigned, big).
184
185 encode_unsigned(Unsigned, Endianness) -> binary()
186
187 Types:
188
189 Unsigned = integer() >= 0
190 Endianness = big | little
191
Converts a non-negative integer to the smallest possible
representation in a binary digit representation, either big endian or
little endian.
195
196 Example:
197
198 1> binary:encode_unsigned(11111111, big).
199 <<169,138,199>>
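
To illustrate how the two byte orders relate, the sketch below
encodes an integer and decodes it again with the same endianness. The
module binary_unsigned_demo and the function roundtrip/2 are
hypothetical names, not part of this library.

```erlang
-module(binary_unsigned_demo).
-export([roundtrip/2]).

%% Hypothetical helper (not part of the binary module): encodes N
%% with the given endianness and decodes it again, returning both the
%% encoded binary and the decoded integer.
roundtrip(N, Endianness) when is_integer(N), N >= 0 ->
    Encoded = binary:encode_unsigned(N, Endianness),
    {Encoded, binary:decode_unsigned(Encoded, Endianness)}.
```

For example, roundtrip(256, big) returns {<<1,0>>,256} and
roundtrip(256, little) returns {<<0,1>>,256}: the bytes are the same,
only their order differs.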
200
201 encode_hex(Bin) -> Bin2
202
203 encode_hex(Bin, Case) -> Bin2
204
205 Types:
206
207 Bin = binary()
208 Case = lowercase | uppercase
209 Bin2 = <<_:_*16>>
210
211 Encodes a binary into a hex encoded binary using the specified
212 case for the hexadecimal digits "a" to "f".
213
214 The default case is uppercase.
215
216 Example:
217
218 1> binary:encode_hex(<<"f">>).
219 <<"66">>
220 2> binary:encode_hex(<<"/">>).
221 <<"2F">>
222 3> binary:encode_hex(<<"/">>, lowercase).
223 <<"2f">>
224 4> binary:encode_hex(<<"/">>, uppercase).
225 <<"2F">>
226
227
228 decode_hex(Bin) -> Bin2
229
230 Types:
231
232 Bin = <<_:_*16>>
233 Bin2 = binary()
234
235 Decodes a hex encoded binary into a binary.
236
Example:
238
239 1> binary:decode_hex(<<"66">>).
240 <<"f">>
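
When the hex string comes from untrusted input, it can be useful to
turn a failed decode into a return value. The sketch below assumes
that, as for other functions in this module, malformed input (for
example an odd number of hex digits) raises a badarg exception. The
module binary_hex_demo and the function safe_decode_hex/1 are
hypothetical.

```erlang
-module(binary_hex_demo).
-export([safe_decode_hex/1]).

%% Hypothetical helper (not part of the binary module): returns
%% {ok, Decoded} for well-formed hex input and error otherwise.
%% Assumption: decode_hex/1 signals malformed input with badarg.
safe_decode_hex(Hex) when is_binary(Hex) ->
    try binary:decode_hex(Hex) of
        Bin -> {ok, Bin}
    catch
        error:badarg -> error
    end.
```

Both <<"2f">> and <<"2F">> decode to <<"/">>, so the helper accepts
the output of encode_hex/2 in either case.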
241
242 first(Subject) -> byte()
243
244 Types:
245
246 Subject = binary()
247
248 Returns the first byte of binary Subject as an integer. If the
249 size of Subject is zero, a badarg exception is raised.
250
251 last(Subject) -> byte()
252
253 Types:
254
255 Subject = binary()
256
257 Returns the last byte of binary Subject as an integer. If the
258 size of Subject is zero, a badarg exception is raised.
259
260 list_to_bin(ByteList) -> binary()
261
262 Types:
263
264 ByteList = iolist()
265
266 Works exactly as erlang:list_to_binary/1, added for complete‐
267 ness.
268
269 longest_common_prefix(Binaries) -> integer() >= 0
270
271 Types:
272
273 Binaries = [binary()]
274
275 Returns the length of the longest common prefix of the binaries
276 in list Binaries.
277
278 Example:
279
280 1> binary:longest_common_prefix([<<"erlang">>, <<"ergonomy">>]).
281 2
282 2> binary:longest_common_prefix([<<"erlang">>, <<"perl">>]).
283 0
284
285 If Binaries is not a flat list of binaries, a badarg exception
286 is raised.
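
Because the function returns a length rather than the prefix itself,
it is typically combined with part/3. A minimal sketch, using the
hypothetical module binary_prefix_demo and function common_prefix/1:

```erlang
-module(binary_prefix_demo).
-export([common_prefix/1]).

%% Hypothetical helper (not part of the binary module): returns the
%% longest common prefix of a non-empty list of binaries as a binary,
%% by cutting that many bytes off the front of the first element.
common_prefix([First | _] = Binaries) ->
    Len = binary:longest_common_prefix(Binaries),
    binary:part(First, 0, Len).
```

For example, common_prefix([<<"erlang">>, <<"ergonomy">>]) returns
<<"er">>.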
287
288 longest_common_suffix(Binaries) -> integer() >= 0
289
290 Types:
291
292 Binaries = [binary()]
293
294 Returns the length of the longest common suffix of the binaries
295 in list Binaries.
296
297 Example:
298
299 1> binary:longest_common_suffix([<<"erlang">>, <<"fang">>]).
300 3
301 2> binary:longest_common_suffix([<<"erlang">>, <<"perl">>]).
302 0
303
304 If Binaries is not a flat list of binaries, a badarg exception
305 is raised.
306
307 match(Subject, Pattern) -> Found | nomatch
308
309 Types:
310
311 Subject = binary()
312 Pattern = binary() | [binary()] | cp()
313 Found = part()
314
315 Same as match(Subject, Pattern, []).
316
317 match(Subject, Pattern, Options) -> Found | nomatch
318
319 Types:
320
321 Subject = binary()
322 Pattern = binary() | [binary()] | cp()
323 Found = part()
324 Options = [Option]
325 Option = {scope, part()}
326 part() = {Start :: integer() >= 0, Length :: integer()}
327
328 Searches for the first occurrence of Pattern in Subject and re‐
329 turns the position and length.
330
331 The function returns {Pos, Length} for the binary in Pattern,
332 starting at the lowest position in Subject.
333
334 Example:
335
336 1> binary:match(<<"abcde">>, [<<"bcde">>, <<"cd">>],[]).
337 {1,4}
338
339 Even though <<"cd">> ends before <<"bcde">>, <<"bcde">> begins
340 first and is therefore the first match. If two overlapping
341 matches begin at the same position, the longest is returned.
342
343 Summary of the options:
344
345 {scope, {Start, Length}}:
346 Only the specified part is searched. Return values still
347 have offsets from the beginning of Subject. A negative
348 Length is allowed as described in section Data Types in this
349 manual.
350
351 If none of the strings in Pattern is found, the atom nomatch is
352 returned.
353
354 For a description of Pattern, see function compile_pattern/1.
355
356 If {scope, {Start,Length}} is specified in the options such that
357 Start > size of Subject, Start + Length < 0 or Start + Length >
358 size of Subject, a badarg exception is raised.
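
The scope option can be used to resume a search from a given offset.
The sketch below searches from Start to the end of Subject; the
returned position is still relative to the beginning of Subject. The
module binary_match_demo and the function find_from/3 are
hypothetical.

```erlang
-module(binary_match_demo).
-export([find_from/3]).

%% Hypothetical helper (not part of the binary module): finds the
%% first occurrence of Pattern at or after the zero-based offset
%% Start. The scope covers everything from Start to the end of
%% Subject.
find_from(Subject, Pattern, Start)
  when is_binary(Subject), is_integer(Start),
       Start >= 0, Start =< byte_size(Subject) ->
    Scope = {Start, byte_size(Subject) - Start},
    binary:match(Subject, Pattern, [{scope, Scope}]).
```

For example, find_from(<<"abcabc">>, <<"abc">>, 1) skips the match at
position 0 and returns {3,3}.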
359
360 matches(Subject, Pattern) -> Found
361
362 Types:
363
364 Subject = binary()
365 Pattern = binary() | [binary()] | cp()
366 Found = [part()]
367
368 Same as matches(Subject, Pattern, []).
369
370 matches(Subject, Pattern, Options) -> Found
371
372 Types:
373
374 Subject = binary()
375 Pattern = binary() | [binary()] | cp()
376 Found = [part()]
377 Options = [Option]
378 Option = {scope, part()}
379 part() = {Start :: integer() >= 0, Length :: integer()}
380
381 As match/2, but Subject is searched until exhausted and a list
382 of all non-overlapping parts matching Pattern is returned (in
383 order).
384
The first and longest match is preferred to a shorter one, as
illustrated by the following example:
387
388 1> binary:matches(<<"abcde">>,
389 [<<"bcde">>,<<"bc">>,<<"de">>],[]).
390 [{1,4}]
391
The result shows that <<"bcde">> is selected instead of the shorter
match <<"bc">> (which would have given rise to one more match,
<<"de">>). This corresponds to the behavior of POSIX regular
expressions (and programs like awk), but is not consistent with
alternative matches in re (and Perl), where instead lexical ordering
in the search pattern selects which string matches.
398
399 If none of the strings in a pattern is found, an empty list is
400 returned.
401
402 For a description of Pattern, see compile_pattern/1. For a de‐
403 scription of available options, see match/3.
404
If {scope, {Start,Length}} is specified in the options such that
Start > size of Subject, Start + Length < 0 or Start + Length > size
of Subject, a badarg exception is raised.
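
Since only non-overlapping parts are returned, the length of the
result can serve as an occurrence count. A minimal sketch, with the
hypothetical module binary_matches_demo and function count/2:

```erlang
-module(binary_matches_demo).
-export([count/2]).

%% Hypothetical helper (not part of the binary module): counts the
%% non-overlapping occurrences of Pattern in Subject.
count(Subject, Pattern) when is_binary(Subject) ->
    length(binary:matches(Subject, Pattern)).
```

For example, count(<<"aaaa">>, <<"aa">>) returns 2, not 3, because
the match starting at position 1 overlaps the one at position 0.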
408
409 part(Subject, PosLen) -> binary()
410
411 Types:
412
413 Subject = binary()
414 PosLen = part()
415
416 Extracts the part of binary Subject described by PosLen.
417
418 A negative length can be used to extract bytes at the end of a
419 binary:
420
421 1> Bin = <<1,2,3,4,5,6,7,8,9,10>>.
422 2> binary:part(Bin, {byte_size(Bin), -5}).
423 <<6,7,8,9,10>>
424
425 Note:
426 part/2 and part/3 are also available in the erlang module under
427 the names binary_part/2 and binary_part/3. Those BIFs are al‐
428 lowed in guard tests.
429
430
431 If PosLen in any way references outside the binary, a badarg ex‐
432 ception is raised.
433
434 part(Subject, Pos, Len) -> binary()
435
436 Types:
437
438 Subject = binary()
439 Pos = integer() >= 0
440 Len = integer()
441
442 Same as part(Subject, {Pos, Len}).
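
As noted above, the equivalent BIF binary_part/3 is allowed in guard
tests. The sketch below uses it to test for a two-byte prefix; the
module binary_guard_demo and the function has_hex_prefix/1 are
hypothetical names.

```erlang
-module(binary_guard_demo).
-export([has_hex_prefix/1]).

%% Hypothetical helper (not part of the binary module): true if Bin
%% starts with "0x". If Bin is shorter than two bytes, binary_part/3
%% fails inside the guard, the guard evaluates to false, and the
%% second clause is selected.
has_hex_prefix(Bin) when binary_part(Bin, 0, 2) =:= <<"0x">> -> true;
has_hex_prefix(Bin) when is_binary(Bin) -> false.
```

Calling binary:part/3 in the same position would not compile, since
only BIFs permitted in guards can appear there.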
443
444 referenced_byte_size(Binary) -> integer() >= 0
445
446 Types:
447
448 Binary = binary()
449
450 If a binary references a larger binary (often described as being
451 a subbinary), it can be useful to get the size of the referenced
452 binary. This function can be used in a program to trigger the
453 use of copy/1. By copying a binary, one can dereference the
454 original, possibly large, binary that a smaller binary is a ref‐
455 erence to.
456
457 Example:
458
459 store(Binary, GBSet) ->
460 NewBin =
461 case binary:referenced_byte_size(Binary) of
462 Large when Large > 2 * byte_size(Binary) ->
463 binary:copy(Binary);
464 _ ->
465 Binary
466 end,
467 gb_sets:insert(NewBin,GBSet).
468
In this example, we chose to copy the binary content before inserting
it in gb_sets:set() if it references a binary more than twice the
data size we want to keep. Of course, the threshold at which copying
pays off differs between programs.
473
Binary sharing occurs whenever binaries are taken apart. This is the
fundamental reason why binaries are fast: decomposition can always be
done with O(1) complexity. In rare circumstances, however, this data
sharing is undesirable, which is why this function together with
copy/1 can be useful when optimizing for memory use.
479
480 Example of binary sharing:
481
482 1> A = binary:copy(<<1>>, 100).
483 <<1,1,1,1,1 ...
484 2> byte_size(A).
485 100
486 3> binary:referenced_byte_size(A).
487 100
488 4> <<B:10/binary, C:90/binary>> = A.
489 <<1,1,1,1,1 ...
490 5> {byte_size(B), binary:referenced_byte_size(B)}.
491 {10,10}
492 6> {byte_size(C), binary:referenced_byte_size(C)}.
493 {90,100}
494
495 In the above example, the small binary B was copied while the
496 larger binary C references binary A.
497
498 Note:
499 Binary data is shared among processes. If another process still
500 references the larger binary, copying the part this process uses
501 only consumes more memory and does not free up the larger binary
for garbage collection. Use intrusive functions of this kind with
extreme care and only if a real problem is detected.
504
505
506 replace(Subject, Pattern, Replacement) -> Result
507
508 Types:
509
510 Subject = binary()
511 Pattern = binary() | [binary()] | cp()
512 Replacement = Result = binary()
513
514 Same as replace(Subject, Pattern, Replacement,[]).
515
516 replace(Subject, Pattern, Replacement, Options) -> Result
517
518 Types:
519
520 Subject = binary()
521 Pattern = binary() | [binary()] | cp()
522 Replacement = binary()
523 Options = [Option]
524 Option = global | {scope, part()} | {insert_replaced, InsPos}
525 InsPos = OnePos | [OnePos]
526 OnePos = integer() >= 0
527 An integer() =< byte_size(Replacement)
528 Result = binary()
529
530 Constructs a new binary by replacing the parts in Subject match‐
531 ing Pattern with the content of Replacement.
532
If the matching subpart of Subject giving rise to the replacement is
to be inserted in the result, option {insert_replaced, InsPos}
inserts the matching part into Replacement at the specified position
(or positions) before inserting Replacement into Subject.
538
539 Example:
540
541 1> binary:replace(<<"abcde">>,<<"b">>,<<"[]">>, [{insert_replaced,1}]).
542 <<"a[b]cde">>
543 2> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,1}]).
544 <<"a[b]c[d]e">>
545 3> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,[1,1]}]).
546 <<"a[bb]c[dd]e">>
547 4> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[-]">>,[global,{insert_replaced,[1,2]}]).
548 <<"a[b-b]c[d-d]e">>
549
If any position specified in InsPos is greater than the size of the
replacement binary, a badarg exception is raised.
552
553 Options global and {scope, part()} work as for split/3. The re‐
554 turn type is always a binary().
555
556 For a description of Pattern, see compile_pattern/1.
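
The examples above all use insert_replaced. For a plain substitution,
option global alone is enough. A minimal sketch, with the
hypothetical module binary_replace_demo and function
normalize_newlines/1:

```erlang
-module(binary_replace_demo).
-export([normalize_newlines/1]).

%% Hypothetical helper (not part of the binary module): replaces
%% every CRLF sequence in Bin with a single LF. Without the global
%% option only the first occurrence would be replaced.
normalize_newlines(Bin) when is_binary(Bin) ->
    binary:replace(Bin, <<"\r\n">>, <<"\n">>, [global]).
```

For example, normalize_newlines(<<"a\r\nb\r\nc">>) returns
<<"a\nb\nc">>.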
557
558 split(Subject, Pattern) -> Parts
559
560 Types:
561
562 Subject = binary()
563 Pattern = binary() | [binary()] | cp()
564 Parts = [binary()]
565
566 Same as split(Subject, Pattern, []).
567
568 split(Subject, Pattern, Options) -> Parts
569
570 Types:
571
572 Subject = binary()
573 Pattern = binary() | [binary()] | cp()
574 Options = [Option]
575 Option = {scope, part()} | trim | global | trim_all
576 Parts = [binary()]
577
578 Splits Subject into a list of binaries based on Pattern. If op‐
579 tion global is not specified, only the first occurrence of Pat‐
580 tern in Subject gives rise to a split.
581
582 The parts of Pattern found in Subject are not included in the
583 result.
584
585 Example:
586
587 1> binary:split(<<1,255,4,0,0,0,2,3>>, [<<0,0,0>>,<<2>>],[]).
588 [<<1,255,4>>, <<2,3>>]
589 2> binary:split(<<0,1,0,0,4,255,255,9>>, [<<0,0>>, <<255,255>>],[global]).
590 [<<0,1>>,<<4>>,<<9>>]
591
592 Summary of options:
593
594 {scope, part()}:
Works as in match/3 and matches/3. Notice that this only defines the
scope of the search for matching strings; it does not cut the binary
before splitting. The bytes before and after the scope are kept in
the result. See the example below.
600
601 trim:
Removes trailing empty parts of the result (as does trim in
re:split/3).
604
605 trim_all:
606 Removes all empty parts of the result.
607
608 global:
609 Repeats the split until Subject is exhausted. Conceptually
610 option global makes split work on the positions returned by
611 matches/3, while it normally works on the position returned
612 by match/3.
613
614 Example of the difference between a scope and taking the binary
615 apart before splitting:
616
617 1> binary:split(<<"banana">>, [<<"a">>],[{scope,{2,3}}]).
618 [<<"ban">>,<<"na">>]
619 2> binary:split(binary:part(<<"banana">>,{2,3}), [<<"a">>],[]).
620 [<<"n">>,<<"n">>]
621
622 The return type is always a list of binaries that are all refer‐
623 encing Subject. This means that the data in Subject is not
624 copied to new binaries, and that Subject cannot be garbage col‐
625 lected until the results of the split are no longer referenced.
626
627 For a description of Pattern, see compile_pattern/1.
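
The effect of trim_all is easiest to see next to a plain global
split. The sketch below defines two hypothetical helpers in a
hypothetical module binary_split_demo: fields/1 keeps empty parts,
while words/1 drops them.

```erlang
-module(binary_split_demo).
-export([fields/1, words/1]).

%% Hypothetical helper (not part of the binary module): splits on
%% every comma and keeps empty fields, so adjacent or trailing commas
%% produce empty binaries in the result.
fields(Line) when is_binary(Line) ->
    binary:split(Line, <<",">>, [global]).

%% Hypothetical helper (not part of the binary module): splits on
%% spaces and tabs and removes all empty parts, so runs of whitespace
%% act as a single separator.
words(Line) when is_binary(Line) ->
    binary:split(Line, [<<" ">>, <<"\t">>], [global, trim_all]).
```

For example, fields(<<"a,,b,">>) returns [<<"a">>,<<>>,<<"b">>,<<>>],
while words(<<"  a \t b ">>) returns [<<"a">>,<<"b">>].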
628
629
630
Ericsson AB                      stdlib 5.1.1                      binary(3)