binary(3)                  Erlang Module Definition                  binary(3)

NAME

       binary - Library for handling binary data.

DESCRIPTION

       This module contains functions for manipulating byte-oriented binaries.
       Although the majority of functions could be provided using bit-syntax,
       the functions in this library are highly optimized and are expected to
       either execute faster or consume less memory, or both, than a
       counterpart written in pure Erlang.

       The module is provided according to Erlang Enhancement Proposal (EEP)
       31.

   Note:
       The library handles byte-oriented data. For bitstrings that are not
       binaries (that is, do not contain whole octets of bits), a badarg
       exception is raised by every function in this module.

DATA TYPES

       cp()

              Opaque data type representing a compiled search pattern.
              Guaranteed to be a tuple() to allow programs to distinguish it
              from non-precompiled search patterns.

       part() = {Start :: integer() >= 0, Length :: integer()}

              A representation of a part (or range) in a binary. Start is a
              zero-based offset into a binary() and Length is the length of
              that part. As input to functions in this module, a reverse part
              specification is allowed, constructed with a negative Length, so
              that the part of the binary begins at Start + Length and is
              -Length long. This is useful for referencing the last N bytes of
              a binary as {size(Binary), -N}. The functions in this module
              always return part()s with positive Length.

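              The reverse part specification described above can be sketched
              as follows. The module name part_demo and the function last_n/2
              are hypothetical and exist only for illustration; binary:part/2
              is the function documented in this manual.

```erlang
-module(part_demo).
-export([last_n/2]).

%% Returns the last N bytes of Bin using a reverse part specification:
%% the part begins at byte_size(Bin) + (-N) and is N bytes long.
%% binary:part/2 raises badarg if N > byte_size(Bin).
-spec last_n(binary(), non_neg_integer()) -> binary().
last_n(Bin, N) when is_binary(Bin), is_integer(N), N >= 0 ->
    binary:part(Bin, {byte_size(Bin), -N}).
```

              With this sketch, part_demo:last_n(<<"erlang">>, 3) evaluates
              to <<"ang">>.
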

EXPORTS

       at(Subject, Pos) -> byte()

              Types:

                 Subject = binary()
                 Pos = integer() >= 0

              Returns the byte at position Pos (zero-based) in binary Subject
              as an integer. If Pos >= byte_size(Subject), a badarg exception
              is raised.

       bin_to_list(Subject) -> [byte()]

              Types:

                 Subject = binary()

              Same as bin_to_list(Subject, {0,byte_size(Subject)}).

       bin_to_list(Subject, PosLen) -> [byte()]

              Types:

                 Subject = binary()
                 PosLen = part()

              Converts Subject to a list of byte()s, each representing the
              value of one byte. PosLen denotes which part of the binary() to
              convert.

              Example:

              1> binary:bin_to_list(<<"erlang">>, {1,3}).
              "rla"
              %% or [114,108,97] in list notation.

              If PosLen in any way references outside the binary, a badarg
              exception is raised.

       bin_to_list(Subject, Pos, Len) -> [byte()]

              Types:

                 Subject = binary()
                 Pos = integer() >= 0
                 Len = integer()

              Same as bin_to_list(Subject, {Pos, Len}).

       compile_pattern(Pattern) -> cp()

              Types:

                 Pattern = binary() | [binary()]

              Builds an internal structure representing a compilation of a
              search pattern, later to be used in functions match/3,
              matches/3, split/3, or replace/4. The cp() returned is
              guaranteed to be a tuple() to allow programs to distinguish it
              from non-precompiled search patterns.

              When a list of binaries is specified, it denotes a set of
              alternative binaries to search for. For example, if
              [<<"functional">>,<<"programming">>] is specified as Pattern,
              this means either <<"functional">> or <<"programming">>. The
              pattern is a set of alternatives; when only a single binary is
              specified, the set has only one element. The order of
              alternatives in a pattern is not significant.

              The list of binaries used for search alternatives must be flat
              and proper.

              If Pattern is not a binary or a flat proper list of binaries
              with length > 0, a badarg exception is raised.

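              A minimal sketch of reusing one compiled pattern across many
              calls follows. The module name pattern_demo and the function
              fields/1 are hypothetical; the separators "," and ";" are
              chosen only for illustration.

```erlang
-module(pattern_demo).
-export([fields/1]).

%% Splits every line on "," or ";". The set of alternatives is compiled
%% once and the resulting cp() is reused for each call to binary:split/3.
-spec fields([binary()]) -> [[binary()]].
fields(Lines) ->
    Separators = binary:compile_pattern([<<",">>, <<";">>]),
    [binary:split(Line, Separators, [global]) || Line <- Lines].
```

              Compiling the pattern up front is mainly of interest when the
              same pattern is applied to many subjects.
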
       copy(Subject) -> binary()

              Types:

                 Subject = binary()

              Same as copy(Subject, 1).

       copy(Subject, N) -> binary()

              Types:

                 Subject = binary()
                 N = integer() >= 0

              Creates a binary with the content of Subject duplicated N times.

              This function always creates a new binary, even if N = 1. By
              using copy/1 on a binary referencing a larger binary, one can
              free up the larger binary for garbage collection.

          Note:
              Deliberately copying a binary to avoid referencing a larger
              binary can, instead of freeing up the larger binary for later
              garbage collection, create much more binary data than needed.
              Sharing binary data is usually good. Deliberate copying is a
              good idea only in special cases, when small parts reference
              large binaries and the large binaries are no longer used in any
              process.

              If N < 0, a badarg exception is raised.

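              The two uses of copy described above, duplication and
              detaching a small binary from a larger one, can be sketched as
              follows. The module name copy_demo and the functions ruler/1
              and detach/1 are hypothetical.

```erlang
-module(copy_demo).
-export([ruler/1, detach/1]).

%% Builds a binary of N dash characters by duplicating a one-byte binary.
-spec ruler(non_neg_integer()) -> binary().
ruler(N) when is_integer(N), N >= 0 ->
    binary:copy(<<"-">>, N).

%% Returns a new binary with the same content as Bin. The result does not
%% reference the binary that Bin may have been a part of.
-spec detach(binary()) -> binary().
detach(Bin) when is_binary(Bin) ->
    binary:copy(Bin).
```

              For example, copy_demo:ruler(5) evaluates to <<"-----">>.
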
       decode_unsigned(Subject) -> Unsigned

              Types:

                 Subject = binary()
                 Unsigned = integer() >= 0

              Same as decode_unsigned(Subject, big).

       decode_unsigned(Subject, Endianness) -> Unsigned

              Types:

                 Subject = binary()
                 Endianness = big | little
                 Unsigned = integer() >= 0

              Converts the binary digit representation, in big endian or
              little endian, of a non-negative integer in Subject to an Erlang
              integer().

              Example:

              1> binary:decode_unsigned(<<169,138,199>>,big).
              11111111

       encode_unsigned(Unsigned) -> binary()

              Types:

                 Unsigned = integer() >= 0

              Same as encode_unsigned(Unsigned, big).

       encode_unsigned(Unsigned, Endianness) -> binary()

              Types:

                 Unsigned = integer() >= 0
                 Endianness = big | little

              Converts a non-negative integer to the smallest possible
              representation in a binary digit representation, either big
              endian or little endian.

              Example:

              1> binary:encode_unsigned(11111111, big).
              <<169,138,199>>

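              The relationship between encode_unsigned/2 and
              decode_unsigned/2 for both byte orders can be sketched as a
              round trip. The module name unsigned_demo and the function
              roundtrip/2 are hypothetical.

```erlang
-module(unsigned_demo).
-export([roundtrip/2]).

%% Encodes N with the given byte order and decodes the result again with
%% the same byte order. Returns both the encoded binary and the decoded
%% integer, which is expected to equal N.
-spec roundtrip(non_neg_integer(), big | little) ->
          {binary(), non_neg_integer()}.
roundtrip(N, Endianness) when is_integer(N), N >= 0 ->
    Bin = binary:encode_unsigned(N, Endianness),
    {Bin, binary:decode_unsigned(Bin, Endianness)}.
```

              For the integer used in the examples above,
              unsigned_demo:roundtrip(11111111, little) evaluates to
              {<<199,138,169>>, 11111111}, that is, the same bytes as the big
              endian encoding in reverse order.
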
       first(Subject) -> byte()

              Types:

                 Subject = binary()

              Returns the first byte of binary Subject as an integer. If the
              size of Subject is zero, a badarg exception is raised.

       last(Subject) -> byte()

              Types:

                 Subject = binary()

              Returns the last byte of binary Subject as an integer. If the
              size of Subject is zero, a badarg exception is raised.

       list_to_bin(ByteList) -> binary()

              Types:

                 ByteList = iolist()

              Works exactly as erlang:list_to_binary/1, added for
              completeness.

       longest_common_prefix(Binaries) -> integer() >= 0

              Types:

                 Binaries = [binary()]

              Returns the length of the longest common prefix of the binaries
              in list Binaries.

              Example:

              1> binary:longest_common_prefix([<<"erlang">>, <<"ergonomy">>]).
              2
              2> binary:longest_common_prefix([<<"erlang">>, <<"perl">>]).
              0

              If Binaries is not a flat list of binaries, a badarg exception
              is raised.

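              Because only the length is returned, the prefix itself has to
              be extracted separately. A minimal sketch, assuming a non-empty
              list; the module name prefix_demo and the function
              common_prefix/1 are hypothetical:

```erlang
-module(prefix_demo).
-export([common_prefix/1]).

%% Returns the longest common prefix of a non-empty list of binaries.
%% The length comes from binary:longest_common_prefix/1 and the bytes are
%% taken from the first binary in the list.
-spec common_prefix([binary(), ...]) -> binary().
common_prefix([First | _] = Binaries) ->
    Len = binary:longest_common_prefix(Binaries),
    binary:part(First, 0, Len).
```

              With the binaries from the example above,
              prefix_demo:common_prefix([<<"erlang">>, <<"ergonomy">>])
              evaluates to <<"er">>.
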
       longest_common_suffix(Binaries) -> integer() >= 0

              Types:

                 Binaries = [binary()]

              Returns the length of the longest common suffix of the binaries
              in list Binaries.

              Example:

              1> binary:longest_common_suffix([<<"erlang">>, <<"fang">>]).
              3
              2> binary:longest_common_suffix([<<"erlang">>, <<"perl">>]).
              0

              If Binaries is not a flat list of binaries, a badarg exception
              is raised.

       match(Subject, Pattern) -> Found | nomatch

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = part()

              Same as match(Subject, Pattern, []).

       match(Subject, Pattern, Options) -> Found | nomatch

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = part()
                 Options = [Option]
                 Option = {scope, part()}
                 part() = {Start :: integer() >= 0, Length :: integer()}

              Searches for the first occurrence of Pattern in Subject and
              returns the position and length.

              The function returns {Pos, Length} for the binary in Pattern
              that starts at the lowest position in Subject.

              Example:

              1> binary:match(<<"abcde">>, [<<"bcde">>, <<"cd">>],[]).
              {1,4}

              Even though <<"cd">> ends before <<"bcde">>, <<"bcde">> begins
              first and is therefore the first match. If two overlapping
              matches begin at the same position, the longest is returned.

              Summary of the options:

                {scope, {Start, Length}}:
                  Only the specified part is searched. Return values still
                  have offsets from the beginning of Subject. A negative
                  Length is allowed as described in section Data Types in this
                  manual.

              If none of the strings in Pattern is found, the atom nomatch is
              returned.

              For a description of Pattern, see function compile_pattern/1.

              If {scope, {Start,Length}} is specified in the options such that
              Start > size of Subject, Start + Length < 0 or Start + Length >
              size of Subject, a badarg exception is raised.

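              The effect of option scope on the returned offsets can be
              sketched as follows. The module name match_demo and the
              function find_from/3 are hypothetical.

```erlang
-module(match_demo).
-export([find_from/3]).

%% Searches for Pattern in Subject, starting at byte offset From and
%% continuing to the end of Subject. The returned offset is still counted
%% from the beginning of Subject, not from From.
-spec find_from(binary(), binary() | [binary()], non_neg_integer()) ->
          binary:part() | nomatch.
find_from(Subject, Pattern, From)
  when is_binary(Subject), is_integer(From),
       From >= 0, From =< byte_size(Subject) ->
    Scope = {From, byte_size(Subject) - From},
    binary:match(Subject, Pattern, [{scope, Scope}]).
```

              For example, match_demo:find_from(<<"abcabc">>, <<"abc">>, 1)
              evaluates to {3,3}: the occurrence at offset 0 lies outside the
              scope, and the offset of the second occurrence is reported
              relative to the start of the subject.
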
       matches(Subject, Pattern) -> Found

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = [part()]

              Same as matches(Subject, Pattern, []).

       matches(Subject, Pattern, Options) -> Found

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = [part()]
                 Options = [Option]
                 Option = {scope, part()}
                 part() = {Start :: integer() >= 0, Length :: integer()}

              As match/2, but Subject is searched until exhausted and a list
              of all non-overlapping parts matching Pattern is returned (in
              order).

              The first and longest match is preferred to a shorter one, as
              illustrated by the following example:

              1> binary:matches(<<"abcde">>,
                                [<<"bcde">>,<<"bc">>,<<"de">>],[]).
              [{1,4}]

              The result shows that <<"bcde">> is selected instead of the
              shorter match <<"bc">> (which would have given rise to one more
              match, <<"de">>). This corresponds to the behavior of POSIX
              regular expressions (and programs like awk), but is not
              consistent with alternative matches in re (and Perl), where
              instead lexical ordering in the search pattern selects which
              string matches.

              If none of the strings in a pattern is found, an empty list is
              returned.

              For a description of Pattern, see compile_pattern/1. For a
              description of available options, see match/3.

              If {scope, {Start,Length}} is specified in the options such that
              Start > size of Subject, Start + Length < 0 or Start + Length >
              size of Subject, a badarg exception is raised.

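              Since the returned parts never overlap, the length of the
              result is the number of non-overlapping occurrences. A minimal
              sketch; the module name matches_demo and the function count/2
              are hypothetical:

```erlang
-module(matches_demo).
-export([count/2]).

%% Counts the non-overlapping occurrences of Pattern in Subject.
-spec count(binary(), binary() | [binary()]) -> non_neg_integer().
count(Subject, Pattern) ->
    length(binary:matches(Subject, Pattern)).
```

              Here matches_demo:count(<<"aaaa">>, <<"aa">>) evaluates to 2,
              not 3, because the occurrence starting at offset 1 overlaps
              the one found at offset 0.
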
       part(Subject, PosLen) -> binary()

              Types:

                 Subject = binary()
                 PosLen = part()

              Extracts the part of binary Subject described by PosLen.

              A negative length can be used to extract bytes at the end of a
              binary:

              1> Bin = <<1,2,3,4,5,6,7,8,9,10>>.
              2> binary:part(Bin, {byte_size(Bin), -5}).
              <<6,7,8,9,10>>

          Note:
              part/2 and part/3 are also available in the erlang module under
              the names binary_part/2 and binary_part/3. Those BIFs are
              allowed in guard tests.

              If PosLen in any way references outside the binary, a badarg
              exception is raised.

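              The guard-test form mentioned in the note can be sketched as a
              suffix check. The module name guard_demo and the function
              ends_with/2 are hypothetical; binary_part/3 and byte_size/1
              are the BIFs from the erlang module.

```erlang
-module(guard_demo).
-export([ends_with/2]).

%% Returns true if Bin ends with Suffix. The comparison is done entirely
%% in the guard, using binary_part/3 with a negative length to address the
%% last byte_size(Suffix) bytes of Bin. If either argument is not a binary
%% the guard fails and the second clause returns false.
-spec ends_with(term(), term()) -> boolean().
ends_with(Bin, Suffix)
  when byte_size(Bin) >= byte_size(Suffix),
       binary_part(Bin, byte_size(Bin), -byte_size(Suffix)) =:= Suffix ->
    true;
ends_with(_Bin, _Suffix) ->
    false.
```

              For example, guard_demo:ends_with(<<"file.erl">>, <<".erl">>)
              evaluates to true.
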
       part(Subject, Pos, Len) -> binary()

              Types:

                 Subject = binary()
                 Pos = integer() >= 0
                 Len = integer()

              Same as part(Subject, {Pos, Len}).

       referenced_byte_size(Binary) -> integer() >= 0

              Types:

                 Binary = binary()

              If a binary references a larger binary (often described as being
              a subbinary), it can be useful to get the size of the referenced
              binary. This function can be used in a program to trigger the
              use of copy/1. By copying a binary, one can dereference the
              original, possibly large, binary that a smaller binary is a
              reference to.

              Example:

              store(Binary, GBSet) ->
                NewBin =
                    case binary:referenced_byte_size(Binary) of
                        Large when Large > 2 * byte_size(Binary) ->
                           binary:copy(Binary);
                        _ ->
                           Binary
                    end,
                gb_sets:insert(NewBin,GBSet).

              In this example, we chose to copy the binary content before
              inserting it in gb_sets:set() if it references a binary more
              than twice the data size we want to keep. Of course, different
              programs need different rules for when to copy.

              Binary sharing occurs whenever binaries are taken apart. This is
              the fundamental reason why binaries are fast: decomposition can
              always be done with O(1) complexity. In rare circumstances,
              however, this data sharing is undesirable, which is why this
              function together with copy/1 can be useful when optimizing for
              memory use.

              Example of binary sharing:

              1> A = binary:copy(<<1>>, 100).
              <<1,1,1,1,1 ...
              2> byte_size(A).
              100
              3> binary:referenced_byte_size(A).
              100
              4> <<B:10/binary, C:90/binary>> = A.
              <<1,1,1,1,1 ...
              5> {byte_size(B), binary:referenced_byte_size(B)}.
              {10,10}
              6> {byte_size(C), binary:referenced_byte_size(C)}.
              {90,100}

              In the above example, the small binary B was copied while the
              larger binary C references binary A.

          Note:
              Binary data is shared among processes. If another process still
              references the larger binary, copying the part this process uses
              only consumes more memory and does not free up the larger binary
              for garbage collection. Use intrusive functions of this kind
              with extreme care and only if a real problem is detected.

       replace(Subject, Pattern, Replacement) -> Result

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Replacement = Result = binary()

              Same as replace(Subject, Pattern, Replacement,[]).

       replace(Subject, Pattern, Replacement, Options) -> Result

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Replacement = binary()
                 Options = [Option]
                 Option = global | {scope, part()} | {insert_replaced, InsPos}
                 InsPos = OnePos | [OnePos]
                 OnePos = integer() >= 0
                   An integer() =< byte_size(Replacement)
                 Result = binary()

              Constructs a new binary by replacing the parts in Subject
              matching Pattern with the content of Replacement.

              If the matching subpart of Subject giving rise to the
              replacement is to be inserted in the result, option
              {insert_replaced, InsPos} inserts the matching part into
              Replacement at the specified position (or positions) before
              inserting Replacement into Subject.

              Example:

              1> binary:replace(<<"abcde">>,<<"b">>,<<"[]">>, [{insert_replaced,1}]).
              <<"a[b]cde">>
              2> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,1}]).
              <<"a[b]c[d]e">>
              3> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,[1,1]}]).
              <<"a[bb]c[dd]e">>
              4> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[-]">>,[global,{insert_replaced,[1,2]}]).
              <<"a[b-b]c[d-d]e">>

              If any position specified in InsPos is greater than the size of
              the replacement binary, a badarg exception is raised.

              Options global and {scope, part()} work as for split/3. The
              return type is always a binary().

              For a description of Pattern, see compile_pattern/1.

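              The difference between replacing only the first occurrence and
              using option global can be sketched as follows. The module name
              replace_demo and the function underscore/2 are hypothetical.

```erlang
-module(replace_demo).
-export([underscore/2]).

%% Replaces "-" with "_" in Bin. Options is passed through unchanged to
%% binary:replace/4, so [] replaces only the first occurrence and
%% [global] replaces every occurrence.
-spec underscore(binary(), [global | {scope, binary:part()}]) -> binary().
underscore(Bin, Options) ->
    binary:replace(Bin, <<"-">>, <<"_">>, Options).
```

              With this sketch, replace_demo:underscore(<<"a-b-c">>, [])
              evaluates to <<"a_b-c">>, while the same call with [global]
              evaluates to <<"a_b_c">>.
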
       split(Subject, Pattern) -> Parts

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Parts = [binary()]

              Same as split(Subject, Pattern, []).

       split(Subject, Pattern, Options) -> Parts

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Options = [Option]
                 Option = {scope, part()} | trim | global | trim_all
                 Parts = [binary()]

              Splits Subject into a list of binaries based on Pattern. If
              option global is not specified, only the first occurrence of
              Pattern in Subject gives rise to a split.

              The parts of Pattern found in Subject are not included in the
              result.

              Example:

              1> binary:split(<<1,255,4,0,0,0,2,3>>, [<<0,0,0>>,<<2>>],[]).
              [<<1,255,4>>, <<2,3>>]
              2> binary:split(<<0,1,0,0,4,255,255,9>>, [<<0,0>>, <<255,255>>],[global]).
              [<<0,1>>,<<4>>,<<9>>]

              Summary of options:

                {scope, part()}:
                  Works as in match/3 and matches/3. Notice that this only
                  defines the scope of the search for matching strings; it
                  does not cut the binary before splitting. The bytes before
                  and after the scope are kept in the result. See the example
                  below.

                trim:
                  Removes trailing empty parts of the result (as does trim in
                  re:split/3).

                trim_all:
                  Removes all empty parts of the result.

                global:
                  Repeats the split until Subject is exhausted. Conceptually
                  option global makes split work on the positions returned by
                  matches/3, while it normally works on the position returned
                  by match/3.

              Example of the difference between a scope and taking the binary
              apart before splitting:

              1> binary:split(<<"banana">>, [<<"a">>],[{scope,{2,3}}]).
              [<<"ban">>,<<"na">>]
              2> binary:split(binary:part(<<"banana">>,{2,3}), [<<"a">>],[]).
              [<<"n">>,<<"n">>]

              The return type is always a list of binaries that are all
              referencing Subject. This means that the data in Subject is not
              copied to new binaries, and that Subject cannot be garbage
              collected until the results of the split are no longer
              referenced.

              For a description of Pattern, see compile_pattern/1.

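              The effect of options trim and trim_all on empty parts can be
              sketched as follows. The module name split_demo and the
              function fields/2 are hypothetical, and the comma separator is
              chosen only for illustration.

```erlang
-module(split_demo).
-export([fields/2]).

%% Splits Line on every "," and passes any extra options, such as trim or
%% trim_all, through to binary:split/3.
-spec fields(binary(), [trim | trim_all]) -> [binary()].
fields(Line, Options) ->
    binary:split(Line, <<",">>, [global | Options]).
```

              For the subject <<",a,,b,">>, the empty option list gives
              [<<>>,<<"a">>,<<>>,<<"b">>,<<>>], [trim] removes only the
              trailing empty part and gives [<<>>,<<"a">>,<<>>,<<"b">>], and
              [trim_all] gives [<<"a">>,<<"b">>].
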

Ericsson AB                      stdlib 3.12.1                       binary(3)