binary(3)                  Erlang Module Definition                  binary(3)

NAME

       binary - Library for handling binary data.

DESCRIPTION

       This module contains functions for manipulating byte-oriented binaries.
       Although the majority of functions could be provided using  bit-syntax,
       the  functions in this library are highly optimized and are expected to
       either execute faster or consume less memory, or both, than a  counter‐
       part written in pure Erlang.

       The  module  is provided according to Erlang Enhancement Proposal (EEP)
       31.

   Note:
       The library handles byte-oriented data. For bitstrings  that  are  not
       binaries  (that  is,  that do not contain whole octets of bits), every
       function in this module raises a badarg exception.



DATA TYPES

       cp()

              Opaque data type representing a compiled search pattern. Guaran‐
              teed  to  be  a tuple() to allow programs to distinguish it from
              non-precompiled search patterns.

       part() = {Start :: integer() >= 0, Length :: integer()}

              A representation of a part (or range) in a binary.  Start  is  a
              zero-based  offset  into  a binary() and Length is the length of
              that part. As input to functions in this module, a reverse  part
              specification is allowed, constructed with a negative Length, so
              that the part of the binary begins at  Start  +  Length  and  is
              -Length long. This is useful for referencing the last N bytes of
              a binary as {size(Binary), -N}. The functions in this module al‐
              ways return part()s with positive Length.

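              The reverse part specification described above can  be  sketched
              as  follows.  This  is  a  minimal  illustration;  the  module
              part_spec_example and the function last_bytes/2 are hypothetical
              names that are not part of this library.

```erlang
-module(part_spec_example).
-export([last_bytes/2]).

%% Return the last N bytes of Bin using a reverse part() specification:
%% the part begins at byte_size(Bin) - N and is N bytes long.
%% The module and function names are illustrative only.
last_bytes(Bin, N) when is_binary(Bin), is_integer(N), N >= 0 ->
    binary:part(Bin, {byte_size(Bin), -N}).
```

              For example, last_bytes(<<"erlang">>, 2) returns <<"ng">>.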

EXPORTS

       at(Subject, Pos) -> byte()

              Types:

                 Subject = binary()
                 Pos = integer() >= 0

              Returns  the byte at position Pos (zero-based) in binary Subject
              as an integer. If Pos >= byte_size(Subject), a badarg  exception
              is raised.

       bin_to_list(Subject) -> [byte()]

              Types:

                 Subject = binary()

              Same as bin_to_list(Subject, {0,byte_size(Subject)}).

       bin_to_list(Subject, PosLen) -> [byte()]

              Types:

                 Subject = binary()
                 PosLen = part()

              Converts  Subject  to  a  list of byte()s, each representing the
              value of one byte. part() denotes which part of the binary()  to
              convert.

              Example:

              1> binary:bin_to_list(<<"erlang">>, {1,3}).
              "rla"
              %% or [114,108,97] in list notation.

              If PosLen in any way references outside the binary, a badarg ex‐
              ception is raised.

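              The badarg behavior for an out-of-range PosLen can be  handled
              as  in  the  following  sketch.  The module bin_to_list_example
              and the function safe_bin_to_list/2 are hypothetical names  used
              only for illustration.

```erlang
-module(bin_to_list_example).
-export([safe_bin_to_list/2]).

%% Wrap binary:bin_to_list/2 so that a part() referencing bytes outside
%% the binary yields {error, badarg} instead of raising an exception.
%% The module and function names are illustrative only.
safe_bin_to_list(Bin, PosLen) when is_binary(Bin) ->
    try binary:bin_to_list(Bin, PosLen) of
        List -> {ok, List}
    catch
        error:badarg -> {error, badarg}
    end.
```

              Here {1,3} lies inside <<"erlang">> and gives  {ok,"rla"},  while
              {4,3} reaches past the last byte and gives {error,badarg}.
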
       bin_to_list(Subject, Pos, Len) -> [byte()]

              Types:

                 Subject = binary()
                 Pos = integer() >= 0
                 Len = integer()

              Same as bin_to_list(Subject, {Pos, Len}).

       compile_pattern(Pattern) -> cp()

              Types:

                 Pattern = binary() | [binary()]

              Builds an internal structure representing  a  compilation  of  a
              search   pattern,   later  to  be  used  in  functions  match/3,
              matches/3, split/3, or replace/4. The cp() returned  is  guaran‐
              teed  to  be  a tuple() to allow programs to distinguish it from
              non-precompiled search patterns.

              When a list of binaries is specified, it denotes a set of alter‐
              native   binaries   to   search   for.    For    example,    if
              [<<"functional">>,<<"programming">>]  is  specified  as Pattern,
              this means either <<"functional">> or <<"programming">>. The pat‐
              tern  is  a  set of alternatives; when only a single binary is
              specified, the set has only one element. The order of  alterna‐
              tives in a pattern is not significant.

              The list of binaries used for search alternatives must  be  flat
              and proper.

              If  Pattern  is  not  a binary or a flat proper list of binaries
              with length > 0, a badarg exception is raised.

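              A minimal sketch of reusing a compiled pattern across  several
              searches  is shown below. The module compile_pattern_example and
              the function count_separators/1 are hypothetical names; only the
              calls into this module are part of the library.

```erlang
-module(compile_pattern_example).
-export([count_separators/1]).

%% Count how many commas or semicolons occur in each binary of a list.
%% The set of alternatives is compiled once and the resulting cp() is
%% reused for every search. Names are illustrative only.
count_separators(Binaries) when is_list(Binaries) ->
    Pattern = binary:compile_pattern([<<",">>, <<";">>]),
    [length(binary:matches(Bin, Pattern)) || Bin <- Binaries].
```

              Compiling once is mainly of interest when  the  same  pattern  is
              applied to many subjects.
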
       copy(Subject) -> binary()

              Types:

                 Subject = binary()

              Same as copy(Subject, 1).

       copy(Subject, N) -> binary()

              Types:

                 Subject = binary()
                 N = integer() >= 0

              Creates a binary with the content of Subject duplicated N times.

              This function always creates a new binary, even if N = 1. By us‐
              ing copy/1 on a binary referencing a larger binary, one can free
              up the larger binary for garbage collection.

          Note:
              By deliberately copying a single binary to avoid  referencing  a
              larger  binary, one can, instead of freeing up the larger binary
              for later garbage collection, create much more binary data  than
              needed.  Sharing  binary  data  is usually good. Only in special
              cases, when small parts reference large binaries and  the  large
              binaries  are  no longer used in any process, deliberate copying
              can be a good idea.


              If N < 0, a badarg exception is raised.

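              Both uses of copy described above, duplicating content and  de‐
              taching  a  small  part from a larger binary, can be sketched as
              follows. The module copy_example and  the  functions  padding/1
              and detach/1 are hypothetical names chosen for illustration.

```erlang
-module(copy_example).
-export([padding/1, detach/1]).

%% Build a binary of N zero bytes by duplicating a one-byte binary.
padding(N) when is_integer(N), N >= 0 ->
    binary:copy(<<0>>, N).

%% Return a fresh copy of Bin so that the result no longer references
%% the (possibly much larger) binary that Bin was taken from.
%% Names are illustrative only.
detach(Bin) when is_binary(Bin) ->
    binary:copy(Bin).
```

              Whether detaching is worthwhile depends on the program,  as  the
              note above explains.
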
       decode_unsigned(Subject) -> Unsigned

              Types:

                 Subject = binary()
                 Unsigned = integer() >= 0

              Same as decode_unsigned(Subject, big).

       decode_unsigned(Subject, Endianness) -> Unsigned

              Types:

                 Subject = binary()
                 Endianness = big | little
                 Unsigned = integer() >= 0

              Converts the binary digit representation, in big endian or  lit‐
              tle  endian, of a non-negative integer in Subject to an Erlang
              integer().

              Example:

              1> binary:decode_unsigned(<<169,138,199>>,big).
              11111111

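              The effect of Endianness can be illustrated  by  decoding  the
              same  bytes in both orders. This is a sketch only; the module
              decode_unsigned_example and the function both_orders/1 are hypo‐
              thetical names.

```erlang
-module(decode_unsigned_example).
-export([both_orders/1]).

%% Decode the same bytes as a big-endian and as a little-endian
%% unsigned integer and return both results as {Big, Little}.
%% Names are illustrative only.
both_orders(Bin) when is_binary(Bin) ->
    {binary:decode_unsigned(Bin, big),
     binary:decode_unsigned(Bin, little)}.
```

              For <<169,138,199>> the big-endian value is 169*65536 +  138*256
              +  199  = 11111111, and the little-endian value is 199*65536 +
              138*256 + 169 = 13077161.
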
       encode_unsigned(Unsigned) -> binary()

              Types:

                 Unsigned = integer() >= 0

              Same as encode_unsigned(Unsigned, big).

       encode_unsigned(Unsigned, Endianness) -> binary()

              Types:

                 Unsigned = integer() >= 0
                 Endianness = big | little

              Converts a non-negative integer to the smallest possible  repre‐
              sentation in a binary digit representation, either big endian or
              little endian.

              Example:

              1> binary:encode_unsigned(11111111, big).
              <<169,138,199>>

       encode_hex(Bin) -> Bin2

              Types:

                 Bin = binary()
                 Bin2 = <<_:_*16>>

              Encodes a binary into a hex encoded binary.

              Example:

              1> binary:encode_hex(<<"f">>).
              <<"66">>

       decode_hex(Bin) -> Bin2

              Types:

                 Bin = <<_:_*16>>
                 Bin2 = binary()

              Decodes a hex encoded binary into a binary.

              Example:

              1> binary:decode_hex(<<"66">>).
              <<"f">>

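              That encode_hex/1 and decode_hex/1 are inverses can  be  checked
              with a small round trip. The sketch below uses the hypothetical
              module hex_example and function roundtrip/1, which are not  part
              of this library.

```erlang
-module(hex_example).
-export([roundtrip/1]).

%% Hex-encode Bin and decode the result again, returning both the
%% hex text and the decoded bytes as {Hex, Decoded}.
%% Names are illustrative only.
roundtrip(Bin) when is_binary(Bin) ->
    Hex = binary:encode_hex(Bin),
    {Hex, binary:decode_hex(Hex)}.
```

              The hex text is always twice as long as the input,  because  each
              byte is written as two hexadecimal digits.
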
       first(Subject) -> byte()

              Types:

                 Subject = binary()

              Returns the first byte of binary Subject as an integer.  If  the
              size of Subject is zero, a badarg exception is raised.

       last(Subject) -> byte()

              Types:

                 Subject = binary()

              Returns  the  last  byte of binary Subject as an integer. If the
              size of Subject is zero, a badarg exception is raised.

       list_to_bin(ByteList) -> binary()

              Types:

                 ByteList = iolist()

              Works exactly as erlang:list_to_binary/1,  added  for  complete‐
              ness.

       longest_common_prefix(Binaries) -> integer() >= 0

              Types:

                 Binaries = [binary()]

              Returns  the length of the longest common prefix of the binaries
              in list Binaries.

              Example:

              1> binary:longest_common_prefix([<<"erlang">>, <<"ergonomy">>]).
              2
              2> binary:longest_common_prefix([<<"erlang">>, <<"perl">>]).
              0

              If Binaries is not a flat list of binaries, a  badarg  exception
              is raised.

       longest_common_suffix(Binaries) -> integer() >= 0

              Types:

                 Binaries = [binary()]

              Returns  the length of the longest common suffix of the binaries
              in list Binaries.

              Example:

              1> binary:longest_common_suffix([<<"erlang">>, <<"fang">>]).
              3
              2> binary:longest_common_suffix([<<"erlang">>, <<"perl">>]).
              0

              If Binaries is not a flat list of binaries, a  badarg  exception
              is raised.

       match(Subject, Pattern) -> Found | nomatch

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = part()

              Same as match(Subject, Pattern, []).

       match(Subject, Pattern, Options) -> Found | nomatch

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = part()
                 Options = [Option]
                 Option = {scope, part()}
                 part() = {Start :: integer() >= 0, Length :: integer()}

              Searches  for the first occurrence of Pattern in Subject and re‐
              turns the position and length.

              The function returns {Pos, Length} for the  binary  in  Pattern,
              starting at the lowest position in Subject.

              Example:

              1> binary:match(<<"abcde">>, [<<"bcde">>, <<"cd">>],[]).
              {1,4}

              Even  though  <<"cd">> ends before <<"bcde">>, <<"bcde">> begins
              first and is therefore  the  first  match.  If  two  overlapping
              matches begin at the same position, the longest is returned.

              Summary of the options:

                {scope, {Start, Length}}:
                  Only  the  specified  part  is searched. Return values still
                  have offsets from  the  beginning  of  Subject.  A  negative
                  Length is allowed as described in section Data Types in this
                  manual.

              If none of the strings in Pattern is found, the atom nomatch  is
              returned.

              For a description of Pattern, see function compile_pattern/1.

              If {scope, {Start,Length}} is specified in the options such that
              Start > size of Subject, Start + Length < 0 or Start + Length  >
              size of Subject, a badarg exception is raised.

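              The scope option can be used to resume a search  from  a  given
              offset,  as in the sketch below. The module match_scope_example
              and the function find_from/3 are hypothetical names. The  re‐
              turned  position  is  still  counted  from the beginning of Sub‐
              ject, as described above.

```erlang
-module(match_scope_example).
-export([find_from/3]).

%% Search for Pattern in Subject, looking only at the bytes from offset
%% From to the end of Subject. Names are illustrative only.
find_from(Subject, Pattern, From)
  when is_binary(Subject), is_integer(From),
       From >= 0, From =< byte_size(Subject) ->
    Scope = {From, byte_size(Subject) - From},
    binary:match(Subject, Pattern, [{scope, Scope}]).
```

              With Subject <<"banana">> and Pattern <<"an">>, offset  0  gives
              {1,2}, offset 2 gives {3,2}, and offset 4 gives nomatch.
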
       matches(Subject, Pattern) -> Found

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = [part()]

              Same as matches(Subject, Pattern, []).

       matches(Subject, Pattern, Options) -> Found

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = [part()]
                 Options = [Option]
                 Option = {scope, part()}
                 part() = {Start :: integer() >= 0, Length :: integer()}

              Works like match/2, but Subject is searched until exhausted  and
              a  list of all non-overlapping parts matching Pattern is re‐
              turned (in order).

              The first and longest match is preferred  to  a  shorter  one,
              which is illustrated by the following example:

              1> binary:matches(<<"abcde">>,
                                [<<"bcde">>,<<"bc">>,<<"de">>],[]).
              [{1,4}]

              The result shows that <<"bcde">>  is  selected  instead  of  the
              shorter  match <<"bc">> (which would have given rise to one more
              match, <<"de">>). This corresponds to the behavior of POSIX reg‐
              ular  expressions (and programs like awk), but is not consistent
              with alternative matches in re (and Perl), where instead lexical
              ordering in the search pattern selects which string matches.

              If  none  of the strings in a pattern is found, an empty list is
              returned.

              For a description of Pattern, see compile_pattern/1. For  a  de‐
              scription of available options, see match/3.

              If {scope, {Start,Length}} is specified in the options such that
              Start > size of Subject, Start + Length < 0 or Start + Length  >
              size of Subject, a badarg exception is raised.

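              The parts returned by matches/2 can be fed directly to part/2 to
              recover  the  matched  bytes,  as  in  this sketch. The module
              matches_example and the function found/2 are hypothetical names
              used for illustration.

```erlang
-module(matches_example).
-export([found/2]).

%% Return the matched sub-binaries themselves rather than their
%% positions, by passing each returned part() to binary:part/2.
%% Names are illustrative only.
found(Subject, Pattern) when is_binary(Subject) ->
    [binary:part(Subject, PosLen)
     || PosLen <- binary:matches(Subject, Pattern)].
```

              Because  matches  do  not  overlap,  searching <<"aaaaa">> for
              <<"aa">> yields two matches, at positions 0 and 2.
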
       part(Subject, PosLen) -> binary()

              Types:

                 Subject = binary()
                 PosLen = part()

              Extracts the part of binary Subject described by PosLen.

              A  negative  length can be used to extract bytes at the end of a
              binary:

              1> Bin = <<1,2,3,4,5,6,7,8,9,10>>.
              2> binary:part(Bin, {byte_size(Bin), -5}).
              <<6,7,8,9,10>>

          Note:
              part/2 and part/3 are also available in the erlang module  under
              the  names  binary_part/2  and binary_part/3. Those BIFs are al‐
              lowed in guard tests.


              If PosLen in any way references outside the binary, a badarg ex‐
              ception is raised.

       part(Subject, Pos, Len) -> binary()

              Types:

                 Subject = binary()
                 Pos = integer() >= 0
                 Len = integer()

              Same as part(Subject, {Pos, Len}).

       referenced_byte_size(Binary) -> integer() >= 0

              Types:

                 Binary = binary()

              If a binary references a larger binary (often described as being
              a subbinary), it can be useful to get the size of the referenced
              binary.  This  function  can be used in a program to trigger the
              use of copy/1. By copying a  binary,  one  can  dereference  the
              original, possibly large, binary that a smaller binary is a ref‐
              erence to.

              Example:

              store(Binary, GBSet) ->
                NewBin =
                    case binary:referenced_byte_size(Binary) of
                        Large when Large > 2 * byte_size(Binary) ->
                           binary:copy(Binary);
                        _ ->
                           Binary
                    end,
                gb_sets:insert(NewBin,GBSet).

              In this example, we chose to copy the binary content before  in‐
              serting  it in gb_sets:set() if it references a binary more than
              twice the data size we want to keep.  Of  course,  different
              programs call for different rules about when to copy.

              Binary sharing occurs whenever binaries are taken apart. This is
              the fundamental reason why binaries are fast: decomposition  can
              always  be  done  with O(1) complexity. In rare circumstances,
              however, this data sharing is undesirable, which  is  why  this
              function  together  with copy/1 can be useful when optimizing
              for memory use.

              Example of binary sharing:

              1> A = binary:copy(<<1>>, 100).
              <<1,1,1,1,1 ...
              2> byte_size(A).
              100
              3> binary:referenced_byte_size(A).
              100
              4> <<B:10/binary, C:90/binary>> = A.
              <<1,1,1,1,1 ...
              5> {byte_size(B), binary:referenced_byte_size(B)}.
              {10,10}
              6> {byte_size(C), binary:referenced_byte_size(C)}.
              {90,100}

              In  the  above  example, the small binary B was copied while the
              larger binary C references binary A.

          Note:
              Binary data is shared among processes. If another process  still
              references the larger binary, copying the part this process uses
              only consumes more memory and does not free up the larger binary
              for  garbage collection. Use these kinds of intrusive functions
              with extreme care and only if a real problem is detected.


       replace(Subject, Pattern, Replacement) -> Result

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Replacement = Result = binary()

              Same as replace(Subject, Pattern, Replacement,[]).

       replace(Subject, Pattern, Replacement, Options) -> Result

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Replacement = binary()
                 Options = [Option]
                 Option = global | {scope, part()} | {insert_replaced, InsPos}
                 InsPos = OnePos | [OnePos]
                 OnePos = integer() >= 0
                   An integer() =< byte_size(Replacement)
                 Result = binary()

              Constructs a new binary by replacing the parts in Subject match‐
              ing Pattern with the content of Replacement.

              If  the  matching subpart of Subject giving rise to the replace‐
              ment is to be inserted in the result,  option  {insert_replaced,
              InsPos} inserts the matching part into Replacement at the speci‐
              fied position (or positions) before inserting  Replacement  into
              Subject.

              Example:

              1> binary:replace(<<"abcde">>,<<"b">>,<<"[]">>, [{insert_replaced,1}]).
              <<"a[b]cde">>
              2> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,1}]).
              <<"a[b]c[d]e">>
              3> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,[1,1]}]).
              <<"a[bb]c[dd]e">>
              4> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[-]">>,[global,{insert_replaced,[1,2]}]).
              <<"a[b-b]c[d-d]e">>

              If  any  position  specified in InsPos > size of the replacement
              binary, a badarg exception is raised.

              Options global and {scope, part()} work as for split/3. The  re‐
              turn type is always a binary().

              For a description of Pattern, see compile_pattern/1.

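              The difference between replacing the first  occurrence  and  re‐
              placing  all  occurrences can be sketched as follows. The module
              replace_example and the functions first_only/1  and  all/1  are
              hypothetical names, and the comma and semicolon are arbitrary.

```erlang
-module(replace_example).
-export([first_only/1, all/1]).

%% Replace only the first comma with a semicolon (no options given).
first_only(Bin) when is_binary(Bin) ->
    binary:replace(Bin, <<",">>, <<";">>).

%% Replace every comma with a semicolon by passing option global.
%% Names are illustrative only.
all(Bin) when is_binary(Bin) ->
    binary:replace(Bin, <<",">>, <<";">>, [global]).
```

              For <<"a,b,c">> the first function returns <<"a;b,c">>  and  the
              second returns <<"a;b;c">>.
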
       split(Subject, Pattern) -> Parts

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Parts = [binary()]

              Same as split(Subject, Pattern, []).

       split(Subject, Pattern, Options) -> Parts

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Options = [Option]
                 Option = {scope, part()} | trim | global | trim_all
                 Parts = [binary()]

              Splits  Subject into a list of binaries based on Pattern. If op‐
              tion global is not specified, only the first occurrence of  Pat‐
              tern in Subject gives rise to a split.

              The  parts  of  Pattern found in Subject are not included in the
              result.

              Example:

              1> binary:split(<<1,255,4,0,0,0,2,3>>, [<<0,0,0>>,<<2>>],[]).
              [<<1,255,4>>, <<2,3>>]
              2> binary:split(<<0,1,0,0,4,255,255,9>>, [<<0,0>>, <<255,255>>],[global]).
              [<<0,1>>,<<4>>,<<9>>]

              Summary of options:

                {scope, part()}:
                  Works as in match/3 and matches/3. Notice that this only de‐
                  fines  the  scope of the search for matching strings; it does
                  not cut the binary before splitting. The  bytes  before  and
                  after  the scope are kept in the result. See the example be‐
                  low.

                trim:
                  Removes trailing empty parts of the result (as does trim  in
                  re:split/3).

                trim_all:
                  Removes all empty parts of the result.

                global:
                  Repeats  the  split until Subject is exhausted. Conceptually
                  option global makes split work on the positions returned  by
                  matches/3,  while it normally works on the position returned
                  by match/3.

              Example of the difference between a scope and taking the  binary
              apart before splitting:

              1> binary:split(<<"banana">>, [<<"a">>],[{scope,{2,3}}]).
              [<<"ban">>,<<"na">>]
              2> binary:split(binary:part(<<"banana">>,{2,3}), [<<"a">>],[]).
              [<<"n">>,<<"n">>]

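              The effect of options trim and trim_all on empty  parts  can  be
              sketched as follows. The module split_trim_example and the func‐
              tion fields/2 are hypothetical names, and the  comma  separator
              is an arbitrary choice.

```erlang
-module(split_trim_example).
-export([fields/2]).

%% Split Bin at every comma. Extra options such as trim or trim_all
%% are passed through unchanged. Names are illustrative only.
fields(Bin, Options) when is_binary(Bin), is_list(Options) ->
    binary:split(Bin, <<",">>, [global | Options]).
```

              For <<"a,,b,">>, no extra option gives  [<<"a">>,<<>>,<<"b">>,
              <<>>],  trim  gives  [<<"a">>,<<>>,<<"b">>],  and trim_all gives
              [<<"a">>,<<"b">>].
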
              The return type is always a list of binaries that are all refer‐
              encing Subject. This means that  the  data  in  Subject  is  not
              copied  to new binaries, and that Subject cannot be garbage col‐
              lected until the results of the split are no longer referenced.

              For a description of Pattern, see compile_pattern/1.



Ericsson AB                     stdlib 4.3.1.3                       binary(3)