binary(3) Erlang Module Definition binary(3)
2
3
4
NAME
6 binary - Library for handling binary data.
7
DESCRIPTION
9 This module contains functions for manipulating byte-oriented binaries.
10 Although the majority of functions could be provided using bit-syntax,
11 the functions in this library are highly optimized and are expected to
12 either execute faster or consume less memory, or both, than a counter‐
13 part written in pure Erlang.
14
15 The module is provided according to Erlang Enhancement Proposal (EEP)
16 31.
17
18 Note:
The library handles byte-oriented data. For bitstrings that are not
binaries (that is, that do not contain whole octets of bits), a badarg
exception is raised by any of the functions in this module.
22
23
DATA TYPES
25 cp()
26
27 Opaque data type representing a compiled search pattern. Guaran‐
28 teed to be a tuple() to allow programs to distinguish it from
29 non-precompiled search patterns.
30
31 part() = {Start :: integer() >= 0, Length :: integer()}
32
33 A representation of a part (or range) in a binary. Start is a
34 zero-based offset into a binary() and Length is the length of
35 that part. As input to functions in this module, a reverse part
36 specification is allowed, constructed with a negative Length, so
37 that the part of the binary begins at Start + Length and is
38 -Length long. This is useful for referencing the last N bytes of
39 a binary as {size(Binary), -N}. The functions in this module al‐
40 ways return part()s with positive Length.
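
As an illustration of the two forms of part(), the following minimal
sketch extracts the same three bytes once with a forward and once with
a reverse specification. The module and function names are made up for
this example and are not part of the library.

```erlang
-module(part_spec_demo).
-export([forward/1, reverse/1, last_n/2]).

%% Bytes at offsets 2, 3 and 4, addressed from the front:
%% Start = 2, Length = 3.
forward(Bin) ->
    binary:part(Bin, {2, 3}).

%% The same bytes addressed in reverse: Start = 5, Length = -3,
%% so the part begins at 5 + (-3) = 2 and is 3 bytes long.
reverse(Bin) ->
    binary:part(Bin, {5, -3}).

%% The last N bytes of Bin, using the {size, -N} idiom.
last_n(Bin, N) ->
    binary:part(Bin, {byte_size(Bin), -N}).
```

For <<"erlang">>, both forward/1 and reverse/1 return <<"lan">>, and
last_n(<<"erlang">>, 2) returns <<"ng">>.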
41
EXPORTS
43 at(Subject, Pos) -> byte()
44
45 Types:
46
47 Subject = binary()
48 Pos = integer() >= 0
49
50 Returns the byte at position Pos (zero-based) in binary Subject
51 as an integer. If Pos >= byte_size(Subject), a badarg exception
52 is raised.
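
A minimal sketch of zero-based access follows. The wrapper safe_at/2
is a hypothetical helper, not part of this module; it only shows one
way to avoid the badarg exception for an out-of-range position.

```erlang
-module(at_demo).
-export([first_char/1, safe_at/2]).

%% Position 0 is the first byte of the binary.
first_char(Bin) ->
    binary:at(Bin, 0).

%% Hypothetical wrapper: returns {ok, Byte}, or the atom error
%% instead of raising badarg when Pos is out of range.
safe_at(Bin, Pos) when is_binary(Bin), is_integer(Pos),
                       Pos >= 0, Pos < byte_size(Bin) ->
    {ok, binary:at(Bin, Pos)};
safe_at(_Bin, _Pos) ->
    error.
```

With <<"erlang">> as Subject, position 5 is the last valid position
and position 6 is out of range.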
53
54 bin_to_list(Subject) -> [byte()]
55
56 Types:
57
58 Subject = binary()
59
60 Same as bin_to_list(Subject, {0,byte_size(Subject)}).
61
62 bin_to_list(Subject, PosLen) -> [byte()]
63
64 Types:
65
66 Subject = binary()
67 PosLen = part()
68
69 Converts Subject to a list of byte()s, each representing the
70 value of one byte. part() denotes which part of the binary() to
71 convert.
72
73 Example:
74
75 1> binary:bin_to_list(<<"erlang">>, {1,3}).
76 "rla"
77 %% or [114,108,97] in list notation.
78
79 If PosLen in any way references outside the binary, a badarg ex‐
80 ception is raised.
81
82 bin_to_list(Subject, Pos, Len) -> [byte()]
83
84 Types:
85
86 Subject = binary()
87 Pos = integer() >= 0
88 Len = integer()
89
90 Same as bin_to_list(Subject, {Pos, Len}).
91
92 compile_pattern(Pattern) -> cp()
93
94 Types:
95
96 Pattern = binary() | [binary()]
97
98 Builds an internal structure representing a compilation of a
99 search pattern, later to be used in functions match/3,
100 matches/3, split/3, or replace/4. The cp() returned is guaran‐
101 teed to be a tuple() to allow programs to distinguish it from
102 non-precompiled search patterns.
103
When a list of binaries is specified, it denotes a set of
alternative binaries to search for. For example, if
[<<"functional">>,<<"programming">>] is specified as Pattern,
this means either <<"functional">> or <<"programming">>. The
pattern is a set of alternatives; when only a single binary is
specified, the set has only one element. The order of
alternatives in a pattern is not significant.
111
112 The list of binaries used for search alternatives must be flat
113 and proper.
114
115 If Pattern is not a binary or a flat proper list of binaries
116 with length > 0, a badarg exception is raised.
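
The following sketch shows the intended use: the set of alternatives
is compiled once and the resulting cp() is reused for several
subjects. The module and function names are illustrative only.

```erlang
-module(cp_demo).
-export([first_matches/1]).

%% Compile the set of alternatives once, then reuse the compiled
%% pattern for every subject in the list.
first_matches(Subjects) ->
    CP = binary:compile_pattern([<<"functional">>, <<"programming">>]),
    [binary:match(Subject, CP) || Subject <- Subjects].
```

Compiling up front is mainly of interest when the same pattern is
applied to many subjects; for a single search, the uncompiled pattern
can be passed directly.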
117
118 copy(Subject) -> binary()
119
120 Types:
121
122 Subject = binary()
123
124 Same as copy(Subject, 1).
125
126 copy(Subject, N) -> binary()
127
128 Types:
129
130 Subject = binary()
131 N = integer() >= 0
132
133 Creates a binary with the content of Subject duplicated N times.
134
135 This function always creates a new binary, even if N = 1. By us‐
136 ing copy/1 on a binary referencing a larger binary, one can free
137 up the larger binary for garbage collection.
138
139 Note:
140 By deliberately copying a single binary to avoid referencing a
141 larger binary, one can, instead of freeing up the larger binary
142 for later garbage collection, create much more binary data than
143 needed. Sharing binary data is usually good. Only in special
144 cases, when small parts reference large binaries and the large
145 binaries are no longer used in any process, deliberate copying
146 can be a good idea.
147
148
149 If N < 0, a badarg exception is raised.
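
A short sketch of both uses of copy follows: duplicating content, and
detaching a small part from a larger binary. The names and the 4-byte
prefix length are assumptions made for this example.

```erlang
-module(copy_demo).
-export([repeat/0, detach_prefix/1]).

%% Duplicate the content three times.
repeat() ->
    binary:copy(<<"ab">>, 3).

%% Keep the first 4 bytes of Large as a fresh binary that no
%% longer references Large.
detach_prefix(Large) ->
    binary:copy(binary:part(Large, 0, 4)).
```

repeat/0 returns <<"ababab">>. Whether detaching is worthwhile depends
on the considerations in the note above.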
150
151 decode_unsigned(Subject) -> Unsigned
152
153 Types:
154
155 Subject = binary()
156 Unsigned = integer() >= 0
157
158 Same as decode_unsigned(Subject, big).
159
160 decode_unsigned(Subject, Endianness) -> Unsigned
161
162 Types:
163
164 Subject = binary()
165 Endianness = big | little
166 Unsigned = integer() >= 0
167
Converts the binary digit representation, in big endian or
little endian, of a non-negative integer in Subject to an Erlang
integer().
171
172 Example:
173
174 1> binary:decode_unsigned(<<169,138,199>>,big).
175 11111111
176
177 encode_unsigned(Unsigned) -> binary()
178
179 Types:
180
181 Unsigned = integer() >= 0
182
183 Same as encode_unsigned(Unsigned, big).
184
185 encode_unsigned(Unsigned, Endianness) -> binary()
186
187 Types:
188
189 Unsigned = integer() >= 0
190 Endianness = big | little
191
Converts a non-negative integer to the smallest possible
representation in a binary digit representation, either big
endian or little endian.
195
196 Example:
197
198 1> binary:encode_unsigned(11111111, big).
199 <<169,138,199>>
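
To make the effect of Endianness concrete, the sketch below encodes
one integer both ways and checks that decoding with the same
endianness restores the value. The module and function names are
illustrative only.

```erlang
-module(unsigned_demo).
-export([encodings/1, roundtrip/2]).

%% The same integer in big-endian and little-endian byte order.
encodings(N) ->
    {binary:encode_unsigned(N, big), binary:encode_unsigned(N, little)}.

%% Encoding and then decoding with the same endianness gives N back.
roundtrip(N, Endianness) ->
    Bin = binary:encode_unsigned(N, Endianness),
    binary:decode_unsigned(Bin, Endianness).
```

encodings(256) returns {<<1,0>>, <<0,1>>}: two bytes are the smallest
representation of 256, and only the byte order differs.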
200
201 encode_hex(Bin) -> Bin2
202
203 Types:
204
205 Bin = binary()
206 Bin2 = <<_:_*16>>
207
208 Encodes a binary into a hex encoded binary.
209
210 Example:
211
212 1> binary:encode_hex(<<"f">>).
213 <<"66">>
214
215 decode_hex(Bin) -> Bin2
216
217 Types:
218
219 Bin = <<_:_*16>>
220 Bin2 = binary()
221
222 Decodes a hex encoded binary into a binary.
223
Example:
225
226 1> binary:decode_hex(<<"66">>).
227 <<"f">>
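
The two hex functions are inverses of each other, which the following
sketch checks. The module and function names are made up for this
example; the sketch deliberately does not depend on the letter case
of the encoded digits.

```erlang
-module(hex_demo).
-export([roundtrip/1]).

%% Encode to hex and decode again; the result equals the input.
roundtrip(Bin) ->
    Hex = binary:encode_hex(Bin),
    %% Every input byte becomes exactly two hex digits.
    true = byte_size(Hex) =:= 2 * byte_size(Bin),
    binary:decode_hex(Hex).
```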
228
229 first(Subject) -> byte()
230
231 Types:
232
233 Subject = binary()
234
235 Returns the first byte of binary Subject as an integer. If the
236 size of Subject is zero, a badarg exception is raised.
237
238 last(Subject) -> byte()
239
240 Types:
241
242 Subject = binary()
243
244 Returns the last byte of binary Subject as an integer. If the
245 size of Subject is zero, a badarg exception is raised.
246
247 list_to_bin(ByteList) -> binary()
248
249 Types:
250
251 ByteList = iolist()
252
253 Works exactly as erlang:list_to_binary/1, added for complete‐
254 ness.
255
256 longest_common_prefix(Binaries) -> integer() >= 0
257
258 Types:
259
260 Binaries = [binary()]
261
262 Returns the length of the longest common prefix of the binaries
263 in list Binaries.
264
265 Example:
266
267 1> binary:longest_common_prefix([<<"erlang">>, <<"ergonomy">>]).
268 2
269 2> binary:longest_common_prefix([<<"erlang">>, <<"perl">>]).
270 0
271
272 If Binaries is not a flat list of binaries, a badarg exception
273 is raised.
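
Because the function returns a length, it combines naturally with
part/3 when the prefix itself is wanted. The helper below is a
hypothetical sketch and assumes a non-empty list of binaries.

```erlang
-module(prefix_demo).
-export([common_prefix/1]).

%% Returns the shared prefix itself rather than its length.
%% Binaries must be a non-empty list of binaries.
common_prefix([First | _] = Binaries) ->
    Len = binary:longest_common_prefix(Binaries),
    binary:part(First, 0, Len).
```

For [<<"erlang">>, <<"ergonomy">>] this returns <<"er">>; for binaries
with no common prefix it returns the empty binary.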
274
275 longest_common_suffix(Binaries) -> integer() >= 0
276
277 Types:
278
279 Binaries = [binary()]
280
281 Returns the length of the longest common suffix of the binaries
282 in list Binaries.
283
284 Example:
285
286 1> binary:longest_common_suffix([<<"erlang">>, <<"fang">>]).
287 3
288 2> binary:longest_common_suffix([<<"erlang">>, <<"perl">>]).
289 0
290
291 If Binaries is not a flat list of binaries, a badarg exception
292 is raised.
293
294 match(Subject, Pattern) -> Found | nomatch
295
296 Types:
297
298 Subject = binary()
299 Pattern = binary() | [binary()] | cp()
300 Found = part()
301
302 Same as match(Subject, Pattern, []).
303
304 match(Subject, Pattern, Options) -> Found | nomatch
305
306 Types:
307
308 Subject = binary()
309 Pattern = binary() | [binary()] | cp()
310 Found = part()
311 Options = [Option]
312 Option = {scope, part()}
313 part() = {Start :: integer() >= 0, Length :: integer()}
314
315 Searches for the first occurrence of Pattern in Subject and re‐
316 turns the position and length.
317
The function returns {Pos, Length} for the binary in Pattern
that starts at the lowest position in Subject.
320
321 Example:
322
323 1> binary:match(<<"abcde">>, [<<"bcde">>, <<"cd">>],[]).
324 {1,4}
325
326 Even though <<"cd">> ends before <<"bcde">>, <<"bcde">> begins
327 first and is therefore the first match. If two overlapping
328 matches begin at the same position, the longest is returned.
329
330 Summary of the options:
331
332 {scope, {Start, Length}}:
333 Only the specified part is searched. Return values still
334 have offsets from the beginning of Subject. A negative
335 Length is allowed as described in section Data Types in this
336 manual.
337
338 If none of the strings in Pattern is found, the atom nomatch is
339 returned.
340
341 For a description of Pattern, see function compile_pattern/1.
342
343 If {scope, {Start,Length}} is specified in the options such that
344 Start > size of Subject, Start + Length < 0 or Start + Length >
345 size of Subject, a badarg exception is raised.
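
The effect of option scope can be sketched as follows. The module and
function names are illustrative only. Note that the returned offsets
are relative to the beginning of Subject, not to the scope.

```erlang
-module(match_demo).
-export([whole/0, scoped/0, outside/0]).

%% Searching all of <<"banana">>: the first <<"an">> is at offset 1.
whole() ->
    binary:match(<<"banana">>, <<"an">>).

%% Searching only offsets 2..5 (<<"nana">>): the first <<"an">>
%% inside the scope is at offset 3 of the whole subject.
scoped() ->
    binary:match(<<"banana">>, <<"an">>, [{scope, {2, 4}}]).

%% <<"b">> occurs only at offset 0, which is outside the scope.
outside() ->
    binary:match(<<"banana">>, <<"b">>, [{scope, {2, 4}}]).
```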
346
347 matches(Subject, Pattern) -> Found
348
349 Types:
350
351 Subject = binary()
352 Pattern = binary() | [binary()] | cp()
353 Found = [part()]
354
355 Same as matches(Subject, Pattern, []).
356
357 matches(Subject, Pattern, Options) -> Found
358
359 Types:
360
361 Subject = binary()
362 Pattern = binary() | [binary()] | cp()
363 Found = [part()]
364 Options = [Option]
365 Option = {scope, part()}
366 part() = {Start :: integer() >= 0, Length :: integer()}
367
368 As match/2, but Subject is searched until exhausted and a list
369 of all non-overlapping parts matching Pattern is returned (in
370 order).
371
372 The first and longest match is preferred to a shorter, which is
373 illustrated by the following example:
374
375 1> binary:matches(<<"abcde">>,
376 [<<"bcde">>,<<"bc">>,<<"de">>],[]).
377 [{1,4}]
378
The result shows that <<"bcde">> is selected instead of the
shorter match <<"bc">> (which would have given rise to one more
match, <<"de">>). This corresponds to the behavior of POSIX
regular expressions (and programs like awk), but is not
consistent with alternative matches in re (and Perl), where
instead lexical ordering in the search pattern selects which
string matches.
385
386 If none of the strings in a pattern is found, an empty list is
387 returned.
388
389 For a description of Pattern, see compile_pattern/1. For a de‐
390 scription of available options, see match/3.
391
If {scope, {Start,Length}} is specified in the options such that
Start > size of Subject, Start + Length < 0 or Start + Length >
size of Subject, a badarg exception is raised.
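
Since only non-overlapping parts are returned, the length of the
result is the number of non-overlapping occurrences. The counting
helper below is a hypothetical sketch, not part of this module.

```erlang
-module(matches_demo).
-export([count/2]).

%% Number of non-overlapping occurrences of Pattern in Subject.
count(Subject, Pattern) ->
    length(binary:matches(Subject, Pattern)).
```

For example, <<"ana">> occurs at offsets 1 and 3 of <<"banana">>, but
the two occurrences overlap, so only the first is reported and
count(<<"banana">>, <<"ana">>) returns 1.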
395
396 part(Subject, PosLen) -> binary()
397
398 Types:
399
400 Subject = binary()
401 PosLen = part()
402
403 Extracts the part of binary Subject described by PosLen.
404
405 A negative length can be used to extract bytes at the end of a
406 binary:
407
408 1> Bin = <<1,2,3,4,5,6,7,8,9,10>>.
409 2> binary:part(Bin, {byte_size(Bin), -5}).
410 <<6,7,8,9,10>>
411
412 Note:
413 part/2 and part/3 are also available in the erlang module under
414 the names binary_part/2 and binary_part/3. Those BIFs are al‐
415 lowed in guard tests.
416
417
418 If PosLen in any way references outside the binary, a badarg ex‐
419 ception is raised.
420
421 part(Subject, Pos, Len) -> binary()
422
423 Types:
424
425 Subject = binary()
426 Pos = integer() >= 0
427 Len = integer()
428
429 Same as part(Subject, {Pos, Len}).
430
431 referenced_byte_size(Binary) -> integer() >= 0
432
433 Types:
434
435 Binary = binary()
436
437 If a binary references a larger binary (often described as being
438 a subbinary), it can be useful to get the size of the referenced
439 binary. This function can be used in a program to trigger the
440 use of copy/1. By copying a binary, one can dereference the
441 original, possibly large, binary that a smaller binary is a ref‐
442 erence to.
443
444 Example:
445
446 store(Binary, GBSet) ->
447 NewBin =
448 case binary:referenced_byte_size(Binary) of
449 Large when Large > 2 * byte_size(Binary) ->
450 binary:copy(Binary);
451 _ ->
452 Binary
453 end,
454 gb_sets:insert(NewBin,GBSet).
455
In this example, we chose to copy the binary content before
inserting it in gb_sets:set() if it references a binary more
than twice the data size we want to keep. Of course, different
programs call for different rules about when to copy.
460
Binary sharing occurs whenever binaries are taken apart. This is
the fundamental reason why binaries are fast: decomposition can
always be done with O(1) complexity. In rare circumstances,
however, this data sharing is undesirable, which is why this
function, together with copy/1, can be useful when optimizing
for memory use.
466
467 Example of binary sharing:
468
469 1> A = binary:copy(<<1>>, 100).
470 <<1,1,1,1,1 ...
471 2> byte_size(A).
472 100
473 3> binary:referenced_byte_size(A).
474 100
475 4> <<B:10/binary, C:90/binary>> = A.
476 <<1,1,1,1,1 ...
477 5> {byte_size(B), binary:referenced_byte_size(B)}.
478 {10,10}
479 6> {byte_size(C), binary:referenced_byte_size(C)}.
480 {90,100}
481
482 In the above example, the small binary B was copied while the
483 larger binary C references binary A.
484
485 Note:
486 Binary data is shared among processes. If another process still
487 references the larger binary, copying the part this process uses
488 only consumes more memory and does not free up the larger binary
for garbage collection. Use this kind of intrusive function
with extreme care and only if a real problem is detected.
491
492
493 replace(Subject, Pattern, Replacement) -> Result
494
495 Types:
496
497 Subject = binary()
498 Pattern = binary() | [binary()] | cp()
499 Replacement = Result = binary()
500
501 Same as replace(Subject, Pattern, Replacement,[]).
502
503 replace(Subject, Pattern, Replacement, Options) -> Result
504
505 Types:
506
507 Subject = binary()
508 Pattern = binary() | [binary()] | cp()
509 Replacement = binary()
510 Options = [Option]
511 Option = global | {scope, part()} | {insert_replaced, InsPos}
512 InsPos = OnePos | [OnePos]
513 OnePos = integer() >= 0
514 An integer() =< byte_size(Replacement)
515 Result = binary()
516
517 Constructs a new binary by replacing the parts in Subject match‐
518 ing Pattern with the content of Replacement.
519
If the matching subpart of Subject giving rise to the
replacement is to be inserted in the result, option
{insert_replaced, InsPos} inserts the matching part into
Replacement at the specified position (or positions) before
inserting Replacement into Subject.
525
526 Example:
527
528 1> binary:replace(<<"abcde">>,<<"b">>,<<"[]">>, [{insert_replaced,1}]).
529 <<"a[b]cde">>
530 2> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,1}]).
531 <<"a[b]c[d]e">>
532 3> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,[1,1]}]).
533 <<"a[bb]c[dd]e">>
534 4> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[-]">>,[global,{insert_replaced,[1,2]}]).
535 <<"a[b-b]c[d-d]e">>
536
537 If any position specified in InsPos > size of the replacement
538 binary, a badarg exception is raised.
539
540 Options global and {scope, part()} work as for split/3. The re‐
541 turn type is always a binary().
542
543 For a description of Pattern, see compile_pattern/1.
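
The difference made by option global can be sketched as follows. The
module and function names are illustrative only.

```erlang
-module(replace_demo).
-export([first_only/0, all/0, strip/2]).

%% Without global, only the first match is replaced.
first_only() ->
    binary:replace(<<"a,b,c">>, <<",">>, <<";">>).

%% With global, every match is replaced.
all() ->
    binary:replace(<<"a,b,c">>, <<",">>, <<";">>, [global]).

%% Hypothetical helper: remove every occurrence of Pattern by
%% replacing it with the empty binary.
strip(Subject, Pattern) ->
    binary:replace(Subject, Pattern, <<>>, [global]).
```

first_only/0 returns <<"a;b,c">> and all/0 returns <<"a;b;c">>.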
544
545 split(Subject, Pattern) -> Parts
546
547 Types:
548
549 Subject = binary()
550 Pattern = binary() | [binary()] | cp()
551 Parts = [binary()]
552
553 Same as split(Subject, Pattern, []).
554
555 split(Subject, Pattern, Options) -> Parts
556
557 Types:
558
559 Subject = binary()
560 Pattern = binary() | [binary()] | cp()
561 Options = [Option]
562 Option = {scope, part()} | trim | global | trim_all
563 Parts = [binary()]
564
565 Splits Subject into a list of binaries based on Pattern. If op‐
566 tion global is not specified, only the first occurrence of Pat‐
567 tern in Subject gives rise to a split.
568
569 The parts of Pattern found in Subject are not included in the
570 result.
571
572 Example:
573
574 1> binary:split(<<1,255,4,0,0,0,2,3>>, [<<0,0,0>>,<<2>>],[]).
575 [<<1,255,4>>, <<2,3>>]
576 2> binary:split(<<0,1,0,0,4,255,255,9>>, [<<0,0>>, <<255,255>>],[global]).
577 [<<0,1>>,<<4>>,<<9>>]
578
579 Summary of options:
580
581 {scope, part()}:
Works as in match/3 and matches/3. Notice that this only
defines the scope of the search for matching strings; it
does not cut the binary before splitting. The bytes before
and after the scope are kept in the result. See the example
below.
587
588 trim:
Removes trailing empty parts of the result (as does trim in
re:split/3).
591
592 trim_all:
593 Removes all empty parts of the result.
594
595 global:
596 Repeats the split until Subject is exhausted. Conceptually
597 option global makes split work on the positions returned by
598 matches/3, while it normally works on the position returned
599 by match/3.
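
The effect of options trim and trim_all on empty parts can be sketched
as follows. The helper fields/2 and the comma separator are
assumptions made for this example.

```erlang
-module(split_demo).
-export([fields/2]).

%% Split Line at every comma; Options can add trim or trim_all.
fields(Line, Options) ->
    binary:split(Line, <<",">>, [global | Options]).
```

For <<"a,,b,">>, fields/2 returns [<<"a">>,<<>>,<<"b">>,<<>>] with no
extra options, [<<"a">>,<<>>,<<"b">>] with [trim], and
[<<"a">>,<<"b">>] with [trim_all].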
600
601 Example of the difference between a scope and taking the binary
602 apart before splitting:
603
604 1> binary:split(<<"banana">>, [<<"a">>],[{scope,{2,3}}]).
605 [<<"ban">>,<<"na">>]
606 2> binary:split(binary:part(<<"banana">>,{2,3}), [<<"a">>],[]).
607 [<<"n">>,<<"n">>]
608
609 The return type is always a list of binaries that are all refer‐
610 encing Subject. This means that the data in Subject is not
611 copied to new binaries, and that Subject cannot be garbage col‐
612 lected until the results of the split are no longer referenced.
613
614 For a description of Pattern, see compile_pattern/1.
615
616
617
Ericsson AB stdlib 3.16.1 binary(3)