binary(3) Erlang Module Definition binary(3)
2
3
4
6 binary - Library for handling binary data.
7
This module contains functions for manipulating byte-oriented binaries.
Although most of these functions could be implemented using the bit
syntax, the functions in this library are highly optimized and are
expected to execute faster, consume less memory, or both, compared with
a counterpart written in pure Erlang.
14
15 The module is provided according to Erlang Enhancement Proposal (EEP)
16 31.
17
Note:
The library handles byte-oriented data. Passing a bitstring that is not
a binary (that is, one that does not contain a whole number of octets)
to any function in this module raises a badarg exception.
22
23
25 cp()
26
27 Opaque data type representing a compiled search pattern. Guaran‐
28 teed to be a tuple() to allow programs to distinguish it from
29 non-precompiled search patterns.
30
31 part() = {Start :: integer() >= 0, Length :: integer()}
32
A representation of a part (or range) in a binary. Start is a
zero-based offset into a binary() and Length is the length of
that part. As input to functions in this module, a reverse part
specification is allowed, constructed with a negative Length, so
that the part of the binary begins at Start + Length and is
-Length long. This is useful for referencing the last N bytes of
a binary as {size(Binary), -N}. The functions in this module
always return part()s with positive Length.
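
As a small illustration of the reverse part specification, the sketch
below writes the same selection twice, once as a reverse part and once
as a forward part. The module and function names (part_demo, last_n/2,
last_n_forward/2) are made up for this example.

```erlang
-module(part_demo).
-export([last_n/2, last_n_forward/2]).

%% Last N bytes as a reverse part: start at the end, negative length.
last_n(Bin, N) when is_binary(Bin), is_integer(N), N >= 0 ->
    binary:part(Bin, {byte_size(Bin), -N}).

%% The same bytes as a forward part: start N bytes before the end.
last_n_forward(Bin, N) when is_binary(Bin), is_integer(N), N >= 0 ->
    binary:part(Bin, {byte_size(Bin) - N, N}).
```

Both functions select the same bytes; for <<"erlang">> and N = 3 each
returns <<"ang">>.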
41
43 at(Subject, Pos) -> byte()
44
45 Types:
46
47 Subject = binary()
48 Pos = integer() >= 0
49
50 Returns the byte at position Pos (zero-based) in binary Subject
51 as an integer. If Pos >= byte_size(Subject), a badarg exception
52 is raised.
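
Because an out-of-range Pos raises badarg, callers that cannot
guarantee the position may want to check it first. The following is a
minimal sketch of such a check; at_demo and safe_at/2 are hypothetical
names, not part of this module.

```erlang
-module(at_demo).
-export([safe_at/2]).

%% Returns {ok, Byte} when Pos is inside Subject, otherwise the atom
%% error, instead of letting binary:at/2 raise badarg.
safe_at(Subject, Pos) when is_binary(Subject), is_integer(Pos),
                           Pos >= 0, Pos < byte_size(Subject) ->
    {ok, binary:at(Subject, Pos)};
safe_at(Subject, Pos) when is_binary(Subject), is_integer(Pos) ->
    error.
```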
53
54 bin_to_list(Subject) -> [byte()]
55
56 Types:
57
58 Subject = binary()
59
60 Same as bin_to_list(Subject, {0,byte_size(Subject)}).
61
62 bin_to_list(Subject, PosLen) -> [byte()]
63
64 Types:
65
66 Subject = binary()
67 PosLen = part()
68
69 Converts Subject to a list of byte()s, each representing the
70 value of one byte. part() denotes which part of the binary() to
71 convert.
72
73 Example:
74
75 1> binary:bin_to_list(<<"erlang">>, {1,3}).
76 "rla"
77 %% or [114,108,97] in list notation.
78
79 If PosLen in any way references outside the binary, a badarg
80 exception is raised.
81
82 bin_to_list(Subject, Pos, Len) -> [byte()]
83
84 Types:
85
86 Subject = binary()
87 Pos = integer() >= 0
88 Len = integer()
89
90 Same as bin_to_list(Subject, {Pos, Len}).
91
92 compile_pattern(Pattern) -> cp()
93
94 Types:
95
96 Pattern = binary() | [binary()]
97
98 Builds an internal structure representing a compilation of a
99 search pattern, later to be used in functions match/3,
100 matches/3, split/3, or replace/4. The cp() returned is guaran‐
101 teed to be a tuple() to allow programs to distinguish it from
102 non-precompiled search patterns.
103
When a list of binaries is specified, it denotes a set of alternative
binaries to search for. For example, if
[<<"functional">>,<<"programming">>] is specified as Pattern, this means
either <<"functional">> or <<"programming">>. The pattern is a
set of alternatives; when only a single binary is specified, the
set has only one element. The order of alternatives in a pattern
is not significant.
111
112 The list of binaries used for search alternatives must be flat
113 and proper.
114
115 If Pattern is not a binary or a flat proper list of binaries
116 with length > 0, a badarg exception is raised.
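
A compiled pattern pays off when the same alternatives are searched for
in many subjects. The sketch below compiles the pattern once and reuses
it with matches/2; cp_demo and count_words/1 are illustrative names
only.

```erlang
-module(cp_demo).
-export([count_words/1]).

%% Compile the set of alternatives once, then reuse it for every subject.
count_words(Subjects) when is_list(Subjects) ->
    CP = binary:compile_pattern([<<"functional">>, <<"programming">>]),
    true = is_tuple(CP),   % cp() is guaranteed to be a tuple
    [length(binary:matches(S, CP)) || S <- Subjects].
```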
117
118 copy(Subject) -> binary()
119
120 Types:
121
122 Subject = binary()
123
124 Same as copy(Subject, 1).
125
126 copy(Subject, N) -> binary()
127
128 Types:
129
130 Subject = binary()
131 N = integer() >= 0
132
133 Creates a binary with the content of Subject duplicated N times.
134
135 This function always creates a new binary, even if N = 1. By
136 using copy/1 on a binary referencing a larger binary, one can
137 free up the larger binary for garbage collection.
138
Note:
Copying a binary deliberately to avoid referencing a larger binary can
backfire: instead of freeing up the larger binary for later garbage
collection, it can create much more binary data than needed. Sharing
binary data is usually good. Deliberate copying is a good idea only in
special cases, when small parts reference large binaries and the large
binaries are no longer used in any process.
147
148
149 If N < 0, a badarg exception is raised.
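
The two uses of copy described above, duplication and detaching a small
part from a larger binary, can be sketched as follows. The names
copy_demo, repeat/0, and detach/1 are invented for this example, and
the 4-byte prefix is an arbitrary choice.

```erlang
-module(copy_demo).
-export([repeat/0, detach/1]).

%% Duplicate the content three times.
repeat() ->
    binary:copy(<<"ab">>, 3).

%% Keep only the first 4 bytes of Big as a fresh binary, so that the
%% result no longer references Big.
detach(Big) when is_binary(Big), byte_size(Big) >= 4 ->
    <<Head:4/binary, _/binary>> = Big,
    binary:copy(Head).
```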
150
151 decode_unsigned(Subject) -> Unsigned
152
153 Types:
154
155 Subject = binary()
156 Unsigned = integer() >= 0
157
158 Same as decode_unsigned(Subject, big).
159
160 decode_unsigned(Subject, Endianness) -> Unsigned
161
162 Types:
163
164 Subject = binary()
165 Endianness = big | little
166 Unsigned = integer() >= 0
167
Converts the binary digit representation, in big endian or little
endian, of a non-negative integer in Subject to an Erlang integer().
171
172 Example:
173
174 1> binary:decode_unsigned(<<169,138,199>>,big).
175 11111111
176
177 encode_unsigned(Unsigned) -> binary()
178
179 Types:
180
181 Unsigned = integer() >= 0
182
183 Same as encode_unsigned(Unsigned, big).
184
185 encode_unsigned(Unsigned, Endianness) -> binary()
186
187 Types:
188
189 Unsigned = integer() >= 0
190 Endianness = big | little
191
Converts a non-negative integer to the smallest possible binary digit
representation, either big endian or little endian.
195
196 Example:
197
198 1> binary:encode_unsigned(11111111, big).
199 <<169,138,199>>
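
encode_unsigned/2 and decode_unsigned/2 are inverses when called with
the same Endianness. A minimal round-trip sketch follows; the module
name unsigned_demo and the function roundtrip/2 are assumptions made
for illustration.

```erlang
-module(unsigned_demo).
-export([roundtrip/2]).

%% Encode N, check that decoding gives N back, and return the binary.
roundtrip(N, Endianness) when is_integer(N), N >= 0 ->
    Bin = binary:encode_unsigned(N, Endianness),
    N = binary:decode_unsigned(Bin, Endianness),
    Bin.
```

With 11111111 as input, the little endian result is the byte-reversed
form of the big endian one: <<199,138,169>> instead of <<169,138,199>>.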
200
201 first(Subject) -> byte()
202
203 Types:
204
205 Subject = binary()
206
207 Returns the first byte of binary Subject as an integer. If the
208 size of Subject is zero, a badarg exception is raised.
209
210 last(Subject) -> byte()
211
212 Types:
213
214 Subject = binary()
215
216 Returns the last byte of binary Subject as an integer. If the
217 size of Subject is zero, a badarg exception is raised.
218
219 list_to_bin(ByteList) -> binary()
220
221 Types:
222
223 ByteList = iodata()
224
Works exactly like erlang:list_to_binary/1; added for completeness.
227
228 longest_common_prefix(Binaries) -> integer() >= 0
229
230 Types:
231
232 Binaries = [binary()]
233
234 Returns the length of the longest common prefix of the binaries
235 in list Binaries.
236
237 Example:
238
239 1> binary:longest_common_prefix([<<"erlang">>, <<"ergonomy">>]).
240 2
241 2> binary:longest_common_prefix([<<"erlang">>, <<"perl">>]).
242 0
243
244 If Binaries is not a flat list of binaries, a badarg exception
245 is raised.
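
Since the function returns a length rather than the prefix itself, it
is typically combined with part/3. A short sketch, with prefix_demo and
common_prefix/1 as hypothetical names:

```erlang
-module(prefix_demo).
-export([common_prefix/1]).

%% Returns the shared prefix itself, taken from the first binary.
common_prefix([First | _] = Binaries) ->
    Len = binary:longest_common_prefix(Binaries),
    binary:part(First, 0, Len).
```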
246
247 longest_common_suffix(Binaries) -> integer() >= 0
248
249 Types:
250
251 Binaries = [binary()]
252
253 Returns the length of the longest common suffix of the binaries
254 in list Binaries.
255
256 Example:
257
258 1> binary:longest_common_suffix([<<"erlang">>, <<"fang">>]).
259 3
260 2> binary:longest_common_suffix([<<"erlang">>, <<"perl">>]).
261 0
262
263 If Binaries is not a flat list of binaries, a badarg exception
264 is raised.
265
266 match(Subject, Pattern) -> Found | nomatch
267
268 Types:
269
270 Subject = binary()
271 Pattern = binary() | [binary()] | cp()
272 Found = part()
273
274 Same as match(Subject, Pattern, []).
275
276 match(Subject, Pattern, Options) -> Found | nomatch
277
278 Types:
279
280 Subject = binary()
281 Pattern = binary() | [binary()] | cp()
282 Found = part()
283 Options = [Option]
284 Option = {scope, part()}
285 part() = {Start :: integer() >= 0, Length :: integer()}
286
287 Searches for the first occurrence of Pattern in Subject and
288 returns the position and length.
289
290 The function returns {Pos, Length} for the binary in Pattern,
291 starting at the lowest position in Subject.
292
293 Example:
294
295 1> binary:match(<<"abcde">>, [<<"bcde">>, <<"cd">>],[]).
296 {1,4}
297
298 Even though <<"cd">> ends before <<"bcde">>, <<"bcde">> begins
299 first and is therefore the first match. If two overlapping
300 matches begin at the same position, the longest is returned.
301
302 Summary of the options:
303
304 {scope, {Start, Length}}:
305 Only the specified part is searched. Return values still
306 have offsets from the beginning of Subject. A negative
307 Length is allowed as described in section Data Types in this
308 manual.
309
310 If none of the strings in Pattern is found, the atom nomatch is
311 returned.
312
313 For a description of Pattern, see function compile_pattern/1.
314
315 If {scope, {Start,Length}} is specified in the options such that
316 Start > size of Subject, Start + Length < 0 or Start + Length >
317 size of Subject, a badarg exception is raised.
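
The scope option can be used to resume a search from a given offset
while keeping offsets relative to the whole Subject. The sketch below
assumes a helper named find_from/3 in a module match_demo; neither is
part of this library.

```erlang
-module(match_demo).
-export([find_from/3]).

%% Search for Pattern in Subject, starting at byte offset From.
%% The returned position is still counted from the start of Subject.
find_from(Subject, Pattern, From)
  when is_binary(Subject), is_integer(From),
       From >= 0, From =< byte_size(Subject) ->
    Scope = {From, byte_size(Subject) - From},
    binary:match(Subject, Pattern, [{scope, Scope}]).
```

For <<"abcabc">> and the pattern <<"bc">>, starting at 0 gives {1,2},
starting at 2 gives {4,2}, and starting at 5 gives nomatch.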
318
319 matches(Subject, Pattern) -> Found
320
321 Types:
322
323 Subject = binary()
324 Pattern = binary() | [binary()] | cp()
325 Found = [part()]
326
327 Same as matches(Subject, Pattern, []).
328
329 matches(Subject, Pattern, Options) -> Found
330
331 Types:
332
333 Subject = binary()
334 Pattern = binary() | [binary()] | cp()
335 Found = [part()]
336 Options = [Option]
337 Option = {scope, part()}
338 part() = {Start :: integer() >= 0, Length :: integer()}
339
340 As match/2, but Subject is searched until exhausted and a list
341 of all non-overlapping parts matching Pattern is returned (in
342 order).
343
The first and longest match is preferred to a shorter one, as
illustrated by the following example:
346
347 1> binary:matches(<<"abcde">>,
348 [<<"bcde">>,<<"bc">>,<<"de">>],[]).
349 [{1,4}]
350
The result shows that <<"bcde">> is selected instead of the
shorter match <<"bc">> (which would have given rise to one more
match, <<"de">>). This corresponds to the behavior of POSIX regular
expressions (and programs like awk), but is not consistent
with alternative matches in re (and Perl), where instead lexical
ordering in the search pattern selects which string matches.
357
358 If none of the strings in a pattern is found, an empty list is
359 returned.
360
361 For a description of Pattern, see compile_pattern/1. For a
362 description of available options, see match/3.
363
If {scope, {Start,Length}} is specified in the options such that
Start > size of Subject, Start + Length < 0 or Start + Length >
size of Subject, a badarg exception is raised.
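
The part()s returned by matches can be passed straight to part/2 to
recover the matched bytes. A minimal sketch, where matches_demo and
matched_parts/2 are names chosen for this example:

```erlang
-module(matches_demo).
-export([matched_parts/2]).

%% Returns the matched sub-binaries, in order of occurrence.
matched_parts(Subject, Pattern) ->
    [binary:part(Subject, PosLen)
     || PosLen <- binary:matches(Subject, Pattern)].
```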
367
368 part(Subject, PosLen) -> binary()
369
370 Types:
371
372 Subject = binary()
373 PosLen = part()
374
375 Extracts the part of binary Subject described by PosLen.
376
377 A negative length can be used to extract bytes at the end of a
378 binary:
379
380 1> Bin = <<1,2,3,4,5,6,7,8,9,10>>.
381 2> binary:part(Bin, {byte_size(Bin), -5}).
382 <<6,7,8,9,10>>
383
384 Note:
385 part/2 and part/3 are also available in the erlang module under
386 the names binary_part/2 and binary_part/3. Those BIFs are
387 allowed in guard tests.
388
389
390 If PosLen in any way references outside the binary, a badarg
391 exception is raised.
392
393 part(Subject, Pos, Len) -> binary()
394
395 Types:
396
397 Subject = binary()
398 Pos = integer() >= 0
399 Len = integer()
400
401 Same as part(Subject, {Pos, Len}).
402
403 referenced_byte_size(Binary) -> integer() >= 0
404
405 Types:
406
407 Binary = binary()
408
If a binary references a larger binary (often described as being
a subbinary), it can be useful to get the size of the referenced
binary. This function can be used in a program to trigger the
use of copy/1. By copying a binary, one can drop the reference to
the original, possibly large, binary that the smaller binary
points into.
415
416 Example:
417
418 store(Binary, GBSet) ->
419 NewBin =
420 case binary:referenced_byte_size(Binary) of
421 Large when Large > 2 * byte_size(Binary) ->
422 binary:copy(Binary);
423 _ ->
424 Binary
425 end,
426 gb_sets:insert(NewBin,GBSet).
427
In this example, we chose to copy the binary content before
inserting it in gb_sets:set() if it references a binary more
than twice the data size we want to keep. Different programs
will, of course, call for different copying rules.

Binary sharing occurs whenever binaries are taken apart. This is
the fundamental reason why binaries are fast: decomposition can
always be done with O(1) complexity. In rare circumstances this
data sharing is undesirable, however, and this function together
with copy/1 can then be useful when optimizing for memory use.
438
439 Example of binary sharing:
440
1> A = binary:copy(<<1>>, 100).
<<1,1,1,1,1 ...
2> byte_size(A).
100
3> binary:referenced_byte_size(A).
100
4> <<_:10/binary,B:10/binary,_/binary>> = A.
<<1,1,1,1,1 ...
5> byte_size(B).
10
6> binary:referenced_byte_size(B).
100
453
Note:
Binary data is shared among processes. If another process still
references the larger binary, copying the part this process uses
only consumes more memory and does not free up the larger binary
for garbage collection. Use intrusive functions of this kind
with extreme care and only if a real problem is detected.
460
461
462 replace(Subject, Pattern, Replacement) -> Result
463
464 Types:
465
466 Subject = binary()
467 Pattern = binary() | [binary()] | cp()
468 Replacement = Result = binary()
469
470 Same as replace(Subject, Pattern, Replacement,[]).
471
472 replace(Subject, Pattern, Replacement, Options) -> Result
473
474 Types:
475
476 Subject = binary()
477 Pattern = binary() | [binary()] | cp()
478 Replacement = binary()
479 Options = [Option]
480 Option = global | {scope, part()} | {insert_replaced, InsPos}
481 InsPos = OnePos | [OnePos]
482 OnePos = integer() >= 0
483 An integer() =< byte_size(Replacement)
484 Result = binary()
485
486 Constructs a new binary by replacing the parts in Subject match‐
487 ing Pattern with the content of Replacement.
488
If the matching subpart of Subject giving rise to the replacement
is to be inserted in the result, option {insert_replaced,
InsPos} inserts the matching part into Replacement at the specified
position (or positions) before inserting Replacement into
Subject.
494
495 Example:
496
497 1> binary:replace(<<"abcde">>,<<"b">>,<<"[]">>, [{insert_replaced,1}]).
498 <<"a[b]cde">>
499 2> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,1}]).
500 <<"a[b]c[d]e">>
501 3> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,[1,1]}]).
502 <<"a[bb]c[dd]e">>
503 4> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[-]">>,[global,{insert_replaced,[1,2]}]).
504 <<"a[b-b]c[d-d]e">>
505
506 If any position specified in InsPos > size of the replacement
507 binary, a badarg exception is raised.
508
509 Options global and {scope, part()} work as for split/3. The
510 return type is always a binary().
511
512 For a description of Pattern, see compile_pattern/1.
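
The effect of option global on replace can be sketched as follows.
The module replace_demo and its functions are illustrative names, and
the comma and semicolon separators are arbitrary choices.

```erlang
-module(replace_demo).
-export([first_only/1, all/1]).

%% Without global, only the first occurrence is replaced.
first_only(Subject) ->
    binary:replace(Subject, <<",">>, <<";">>).

%% With global, every occurrence is replaced.
all(Subject) ->
    binary:replace(Subject, <<",">>, <<";">>, [global]).
```

For <<"a,b,c">>, first_only/1 returns <<"a;b,c">> and all/1 returns
<<"a;b;c">>.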
513
514 split(Subject, Pattern) -> Parts
515
516 Types:
517
518 Subject = binary()
519 Pattern = binary() | [binary()] | cp()
520 Parts = [binary()]
521
522 Same as split(Subject, Pattern, []).
523
524 split(Subject, Pattern, Options) -> Parts
525
526 Types:
527
528 Subject = binary()
529 Pattern = binary() | [binary()] | cp()
530 Options = [Option]
531 Option = {scope, part()} | trim | global | trim_all
532 Parts = [binary()]
533
534 Splits Subject into a list of binaries based on Pattern. If
535 option global is not specified, only the first occurrence of
536 Pattern in Subject gives rise to a split.
537
538 The parts of Pattern found in Subject are not included in the
539 result.
540
541 Example:
542
543 1> binary:split(<<1,255,4,0,0,0,2,3>>, [<<0,0,0>>,<<2>>],[]).
544 [<<1,255,4>>, <<2,3>>]
545 2> binary:split(<<0,1,0,0,4,255,255,9>>, [<<0,0>>, <<255,255>>],[global]).
546 [<<0,1>>,<<4>>,<<9>>]
547
548 Summary of options:
549
550 {scope, part()}:
Works as in match/3 and matches/3. Notice that this only
defines the scope of the search for matching strings; it
does not cut the binary before splitting. The bytes before
and after the scope are kept in the result. See the example
below.
556
557 trim:
Removes trailing empty parts of the result (as does trim in
re:split/3).
560
561 trim_all:
562 Removes all empty parts of the result.
563
564 global:
565 Repeats the split until Subject is exhausted. Conceptually
566 option global makes split work on the positions returned by
567 matches/3, while it normally works on the position returned
568 by match/3.
569
570 Example of the difference between a scope and taking the binary
571 apart before splitting:
572
573 1> binary:split(<<"banana">>, [<<"a">>],[{scope,{2,3}}]).
574 [<<"ban">>,<<"na">>]
575 2> binary:split(binary:part(<<"banana">>,{2,3}), [<<"a">>],[]).
576 [<<"n">>,<<"n">>]
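
The difference between options trim and trim_all shows up when the
subject contains adjacent or trailing separators. A small sketch,
assuming a hypothetical helper fields/2 in a module split_demo that
always splits globally on a comma:

```erlang
-module(split_demo).
-export([fields/2]).

%% Split Subject on every comma; Options may add trim or trim_all.
fields(Subject, Options) when is_binary(Subject), is_list(Options) ->
    binary:split(Subject, <<",">>, [global | Options]).
```

For the subject <<"a,,b,,">>, an empty option list yields
[<<"a">>,<<>>,<<"b">>,<<>>,<<>>], [trim] yields [<<"a">>,<<>>,<<"b">>],
and [trim_all] yields [<<"a">>,<<"b">>].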
577
The return type is always a list of binaries that are all
referencing Subject. This means that the data in Subject is not
copied to new binaries, and that Subject cannot be garbage
collected until the results of the split are no longer referenced.
582
583 For a description of Pattern, see compile_pattern/1.
584
585
586
Ericsson AB stdlib 3.4.5.1 binary(3)