binary(n)

binary(n)                    Tcl Built-In Commands                   binary(n)

______________________________________________________________________________

NAME

       binary - Insert and extract fields from binary strings

SYNOPSIS

       binary format formatString ?arg arg ...?
       binary scan string formatString ?varName varName ...?
_________________________________________________________________

DESCRIPTION

       This command provides facilities for manipulating binary data.  The
       first form, binary format, creates a binary string from normal Tcl
       values.  For example, given the values 16 and 22, on a 32-bit
       architecture, it might produce an 8-byte binary string consisting of
       two 4-byte integers, one for each of the numbers.  The second form of
       the command, binary scan, does the opposite: it extracts data from a
       binary string and returns it as ordinary Tcl string values.
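
       The round trip described above can be sketched as follows.  This is
       a minimal illustration only: it fixes the field width with the i type
       (32-bit, little-endian) instead of relying on the architecture, and
       the variable names are chosen for the example.

```tcl
# Pack the values 16 and 22 as two 32-bit little-endian integers.
set packed [binary format ii 16 22]
set size [string length $packed]

# Unpack them again; binary scan returns the number of conversions.
set count [binary scan $packed ii first second]
puts "size=$size count=$count first=$first second=$second"
```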

BINARY FORMAT

       The binary format command generates a binary string whose layout is
       specified by the formatString and whose contents come from the
       additional arguments.  The resulting binary value is returned.

       The formatString consists of a sequence of zero or more field
       specifiers separated by zero or more spaces.  Each field specifier is
       a single type character followed by an optional numeric count.  Most
       field specifiers consume one argument to obtain the value to be
       formatted.  The type character specifies how the value is to be
       formatted.  The count typically indicates how many items of the
       specified type are taken from the value.  If present, the count is a
       non-negative decimal integer or *, which normally indicates that all
       of the items in the value are to be used.  If the number of arguments
       does not match the number of fields in the format string that consume
       arguments, then an error is generated.

       Here is a small example to clarify the relation between the field
       specifiers and the arguments:
              binary format d3d {1.0 2.0 3.0 4.0} 0.1

       The first argument is a list of four numbers, but because of the count
       of 3 for the associated field specifier, only the first three will be
       used.  The second argument is associated with the second field
       specifier.  The resulting binary string contains the four numbers 1.0,
       2.0, 3.0 and 0.1.
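
       The pairing of field specifiers with arguments can be checked by
       scanning the result back.  The sketch below assumes nothing beyond
       the commands described on this page; the variable names are
       illustrative.  It also shows the error raised when an argument is
       missing.

```tcl
# d3 consumes the first three elements of the list; d consumes 0.1.
set packed [binary format d3d {1.0 2.0 3.0 4.0} 0.1]

# Scanning four doubles back recovers the stored values; 4.0 was never stored.
binary scan $packed d4 values
puts $values

# Two argument-consuming fields but only one argument: an error is raised.
set failed [catch {binary format d3d {1.0 2.0 3.0}} message]
puts "failed=$failed message=$message"
```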

       Each type-count pair moves an imaginary cursor through the binary data,
       storing bytes at the current position and advancing the cursor to just
       after the last byte stored.  The cursor is initially at position 0 at
       the beginning of the data.  The type may be any one of the following
       characters:

       a    Stores a character string of length count in the output string.
            Every character is taken modulo 256 (i.e. the low byte of every
            character is used, and the high byte discarded), so when storing
            character strings not wholly expressible using the characters
            \u0000-\u00ff, the encoding convertto command should be used first
            if this truncation is not desired (i.e. if the characters are not
            part of the ISO 8859-1 character set).  If arg has fewer than
            count bytes, then additional zero bytes are used to pad out the
            field.  If arg is longer than the specified length, the extra
            characters will be ignored.  If count is *, then all of the bytes
            in arg will be formatted.  If count is omitted, then one character
            will be formatted.  For example,
                   binary format a7a*a alpha bravo charlie
            will return a string equivalent to alpha\000\000bravoc.

       A    This form is the same as a except that spaces are used for padding
            instead of nulls.  For example,
                   binary format A6A*A alpha bravo charlie
            will return alpha bravoc.

       b    Stores a string of count binary digits in low-to-high order within
            each byte in the output string.  Arg must contain a sequence of 1
            and 0 characters.  The resulting bytes are emitted in first to
            last order with the bits being formatted in low-to-high order
            within each byte.  If arg has fewer than count digits, then zeros
            will be used for the remaining bits.  If arg has more than the
            specified number of digits, the extra digits will be ignored.  If
            count is *, then all of the digits in arg will be formatted.  If
            count is omitted, then one digit will be formatted.  If the number
            of bits formatted does not end at a byte boundary, the remaining
            bits of the last byte will be zeros.  For example,
                   binary format b5b* 11100 111000011010
            will return a string equivalent to \x07\x87\x05.

       B    This form is the same as b except that the bits are stored in
            high-to-low order within each byte.  For example,
                   binary format B5B* 11100 111000011010
            will return a string equivalent to \xe0\xe1\xa0.

       h    Stores a string of count hexadecimal digits in low-to-high order
            within each byte in the output string.  Arg must contain a
            sequence of characters in the set ``0123456789abcdefABCDEF''.  The
            resulting bytes are emitted in first to last order with the hex
            digits being formatted in low-to-high order within each byte.  If
            arg has fewer than count digits, then zeros will be used for the
            remaining digits.  If arg has more than the specified number of
            digits, the extra digits will be ignored.  If count is *, then all
            of the digits in arg will be formatted.  If count is omitted, then
            one digit will be formatted.  If the number of digits formatted
            does not end at a byte boundary, the remaining bits of the last
            byte will be zeros.  For example,
                   binary format h3h* AB def
            will return a string equivalent to \xba\x00\xed\x0f.

       H    This form is the same as h except that the digits are stored in
            high-to-low order within each byte.  For example,
                   binary format H3H* ab DEF
            will return a string equivalent to \xab\x00\xde\xf0.

       c    Stores one or more 8-bit integer values in the output string.  If
            no count is specified, then arg must consist of an integer value;
            otherwise arg must consist of a list containing at least count
            integer elements.  The low-order 8 bits of each integer are stored
            as a one-byte value at the cursor position.  If count is *, then
            all of the integers in the list are formatted.  If the number of
            elements in the list is fewer than count, then an error is
            generated.  If the number of elements in the list is greater than
            count, then the extra elements are ignored.  For example,
                   binary format c3cc* {3 -3 128 1} 260 {2 5}
            will return a string equivalent to \x03\xfd\x80\x04\x02\x05,
            whereas
                   binary format c {2 5}
            will generate an error.

       s    This form is the same as c except that it stores one or more
            16-bit integers in little-endian byte order in the output string.
            The low-order 16 bits of each integer are stored as a two-byte
            value at the cursor position with the least significant byte
            stored first.  For example,
                   binary format s3 {3 -3 258 1}
            will return a string equivalent to \x03\x00\xfd\xff\x02\x01.

       S    This form is the same as s except that it stores one or more
            16-bit integers in big-endian byte order in the output string.
            For example,
                   binary format S3 {3 -3 258 1}
            will return a string equivalent to \x00\x03\xff\xfd\x01\x02.

       i    This form is the same as c except that it stores one or more
            32-bit integers in little-endian byte order in the output string.
            The low-order 32 bits of each integer are stored as a four-byte
            value at the cursor position with the least significant byte
            stored first.  For example,
                   binary format i3 {3 -3 65536 1}
            will return a string equivalent to
            \x03\x00\x00\x00\xfd\xff\xff\xff\x00\x00\x01\x00.

       I    This form is the same as i except that it stores one or more
            32-bit integers in big-endian byte order in the output string.
            For example,
                   binary format I3 {3 -3 65536 1}
            will return a string equivalent to
            \x00\x00\x00\x03\xff\xff\xff\xfd\x00\x01\x00\x00.

       w    This form is the same as c except that it stores one or more
            64-bit integers in little-endian byte order in the output string.
            The low-order 64 bits of each integer are stored as an eight-byte
            value at the cursor position with the least significant byte
            stored first.  For example,
                   binary format w 7810179016327718216
            will return the string HelloTcl.

       W    This form is the same as w except that it stores one or more
            64-bit integers in big-endian byte order in the output string.
            For example,
                   binary format Wc 4785469626960341345 110
            will return the string BigEndian.

       f    This form is the same as c except that it stores one or more
            single-precision floating-point numbers in the machine's native
            representation in the output string.  This representation is not
            portable across architectures, so it should not be used to
            communicate floating-point numbers across the network.  The size
            of a floating-point number may vary across architectures, so the
            number of bytes that are generated may vary.  If the value
            overflows the machine's native representation, then the value of
            FLT_MAX as defined by the system will be used instead.  Because
            Tcl uses double-precision floating-point numbers internally, there
            may be some loss of precision in the conversion to
            single-precision.  For example, on a Windows system running on an
            Intel Pentium processor,
                   binary format f2 {1.6 3.4}
            will return a string equivalent to
            \xcd\xcc\xcc\x3f\x9a\x99\x59\x40.

       d    This form is the same as f except that it stores one or more
            double-precision floating-point numbers in the machine's native
            representation in the output string.  For example, on a Windows
            system running on an Intel Pentium processor,
                   binary format d1 {1.6}
            will return a string equivalent to
            \x9a\x99\x99\x99\x99\x99\xf9\x3f.

       x    Stores count null bytes in the output string.  If count is not
            specified, stores one null byte.  If count is *, generates an
            error.  This type does not consume an argument.  For example,
                   binary format a3xa3x2a3 abc def ghi
            will return a string equivalent to abc\000def\000\000ghi.

       X    Moves the cursor back count bytes in the output string.  If count
            is * or is larger than the current cursor position, then the
            cursor is positioned at location 0 so that the next byte stored
            will be the first byte in the result string.  If count is omitted
            then the cursor is moved back one byte.  This type does not
            consume an argument.  For example,
                   binary format a3X*a3X2a3 abc def ghi
            will return dghi.

       @    Moves the cursor to the absolute location in the output string
            specified by count.  Position 0 refers to the first byte in the
            output string.  If count refers to a position beyond the last byte
            stored so far, then null bytes will be placed in the uninitialized
            locations and the cursor will be placed at the specified location.
            If count is *, then the cursor is moved to the current end of the
            output string.  If count is omitted, then an error will be
            generated.  This type does not consume an argument.  For example,
                   binary format a5@2a1@*a3@10a1 abcde f ghi j
            will return abfdeghi\000\000j.
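
       Several of the type characters above are easier to compare side by
       side.  The following is a small sketch, not part of the command
       itself: hexdump is a hypothetical helper built on binary scan with
       H*, and the header layout (tag TCLX, version 7) is invented for the
       illustration.

```tcl
# Hypothetical helper: render a binary string as lowercase hex digits.
proc hexdump {bytes} {
    binary scan $bytes H* hex
    return $hex
}

# Bit order: b fills each byte from the low bit, B from the high bit.
set lowFirst  [hexdump [binary format b5b* 11100 111000011010]]
set highFirst [hexdump [binary format B5B* 11100 111000011010]]

# Byte order: s is little-endian, S is big-endian.
set little [hexdump [binary format s 258]]
set big    [hexdump [binary format S 258]]

# Cursor control: a 16-byte header with a 4-byte tag at offset 0, a
# big-endian 16-bit version at offset 8, and null bytes everywhere else.
set header [binary format a4@8Sx6 TCLX 7]
set headerHex [hexdump $header]

# Precision: f narrows to single precision, d keeps the double as is.
binary scan [binary format f 1.6] f single
binary scan [binary format d 1.6] d double

puts "$lowFirst $highFirst $little $big $headerHex"
```

       Dumping the result as hex digits is a convenient way to check a
       format string before the data is written to a file or socket.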

BINARY SCAN

       The binary scan command parses fields from a binary string, returning
       the number of conversions performed.  String gives the input to be
       parsed and formatString indicates how to parse it.  Each varName gives
       the name of a variable; when a field is scanned from string the result
       is assigned to the corresponding variable.

       As with binary format, the formatString consists of a sequence of zero
       or more field specifiers separated by zero or more spaces.  Each field
       specifier is a single type character followed by an optional numeric
       count.  Most field specifiers consume one argument to obtain the
       variable into which the scanned values should be placed.  The type
       character specifies how the binary data is to be interpreted.  The
       count typically indicates how many items of the specified type are
       taken from the data.  If present, the count is a non-negative decimal
       integer or *, which normally indicates that all of the remaining items
       in the data are to be used.  If there are not enough bytes left after
       the current cursor position to satisfy the current field specifier,
       then the corresponding variable is left untouched and binary scan
       returns immediately with the number of variables that were set.  If
       there are not enough arguments for all of the fields in the format
       string that consume arguments, then an error is generated.

       The following example, similar to the one given for binary format,
       shows the relation between field specifiers and arguments for the
       binary scan subcommand:
              binary scan $bytes s3s first second

       This command (provided the binary string in the variable bytes is long
       enough) assigns a list of three integers to the variable first and
       assigns a single value to the variable second.  If bytes contains fewer
       than 8 bytes (i.e. four 2-byte integers), no assignment to second will
       be made, and if bytes contains fewer than 6 bytes (i.e. three 2-byte
       integers), no assignment to first will be made.  Hence:
              puts [binary scan abcdefg s3s first second]
              puts $first
              puts $second
       will print (assuming neither variable is set previously):
              1
              25185 25699 26213
              can't read "second": no such variable
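
       Because a short input leaves variables unset rather than raising an
       error, callers usually compare the return value with the number of
       fields they expect.  A minimal sketch, in which decodeFields is a
       hypothetical helper name and the error text is made up for the
       example:

```tcl
# Hypothetical helper: decode three 16-bit values plus one trailing value,
# raising an error when the input is too short for both fields.
proc decodeFields {bytes} {
    set count [binary scan $bytes s3s first second]
    if {$count != 2} {
        error "short input: only $count of 2 fields decoded"
    }
    return [list $first $second]
}

# Eight bytes satisfy both fields; seven bytes satisfy only the first.
set ok [decodeFields abcdefgh]
set failed [catch {decodeFields abcdefg} message]
puts "ok=$ok failed=$failed message=$message"
```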

       Note that values of the c, s, and S types (and of the i and I types on
       64-bit systems) are scanned into values of long data size.  In doing
       this, values that have their high bit set (0x80 for chars, 0x8000 for
       shorts, 0x80000000 for ints) will be sign extended.  Thus the
       following will occur:
              set signShort [binary format s1 0x8000]
              binary scan $signShort s1 val; # val == 0xFFFF8000
       If you want to produce an unsigned value, then you can mask the return
       value to the desired size.  For example, to produce an unsigned short
       value:
              set val [expr {$val & 0xFFFF}]; # val == 0x8000
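
       The masking step generalizes to other field widths.  The sketch
       below assumes Tcl 8.5 or later for the shift arithmetic; toUnsigned
       is a hypothetical helper name, not part of Tcl.

```tcl
# Hypothetical helper: reinterpret a sign-extended scan result as an
# unsigned quantity of the given width in bits (8, 16 or 32 here).
proc toUnsigned {value bits} {
    return [expr {$value & ((1 << $bits) - 1)}]
}

# 0x8000 has its high bit set, so s scans it back as a negative number.
binary scan [binary format s 0x8000] s signedShort
set unsignedShort [toUnsigned $signedShort 16]

# The same applies to a single byte holding 200.
binary scan [binary format c 200] c signedByte
set unsignedByte [toUnsigned $signedByte 8]

puts "$signedShort -> $unsignedShort, $signedByte -> $unsignedByte"
```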

       Each type-count pair moves an imaginary cursor through the binary data,
       reading bytes from the current position.  The cursor is initially at
       position 0 at the beginning of the data.  The type may be any one of
       the following characters:

       a    The data is a character string of length count.  If count is *,
            then all of the remaining bytes in string will be scanned into the
            variable.  If count is omitted, then one character will be
            scanned.  All characters scanned will be interpreted as being in
            the range \u0000-\u00ff so the encoding convertfrom command might
            be needed if the string is not an ISO 8859-1 string.  For example,
                   binary scan abcde\000fghi a6a10 var1 var2
            will return 1 with the string equivalent to abcde\000 stored in
            var1 and var2 left unmodified.

       A    This form is the same as a, except trailing blanks and nulls are
            stripped from the scanned value before it is stored in the
            variable.  For example,
                   binary scan "abc efghi  \000" A* var1
            will return 1 with abc efghi stored in var1.

       b    The data is turned into a string of count binary digits in
            low-to-high order represented as a sequence of ``1'' and ``0''
            characters.  The data bytes are scanned in first to last order
            with the bits being taken in low-to-high order within each byte.
            Any extra bits in the last byte are ignored.  If count is *, then
            all of the remaining bits in string will be scanned.  If count is
            omitted, then one bit will be scanned.  For example,
                   binary scan \x07\x87\x05 b5b* var1 var2
            will return 2 with 11100 stored in var1 and 1110000110100000
            stored in var2.

       B    This form is the same as b, except the bits are taken in
            high-to-low order within each byte.  For example,
                   binary scan \x70\x87\x05 B5B* var1 var2
            will return 2 with 01110 stored in var1 and 1000011100000101
            stored in var2.

       h    The data is turned into a string of count hexadecimal digits in
            low-to-high order represented as a sequence of characters in the
            set ``0123456789abcdef''.  The data bytes are scanned in first to
            last order with the hex digits being taken in low-to-high order
            within each byte.  Any extra bits in the last byte are ignored.
            If count is *, then all of the remaining hex digits in string will
            be scanned.  If count is omitted, then one hex digit will be
            scanned.  For example,
                   binary scan \x07\x86\x05 h3h* var1 var2
            will return 2 with 706 stored in var1 and 50 stored in var2.

       H    This form is the same as h, except the digits are taken in
            high-to-low order within each byte.  For example,
                   binary scan \x07\x86\x05 H3H* var1 var2
            will return 2 with 078 stored in var1 and 05 stored in var2.

       c    The data is turned into count 8-bit signed integers and stored in
            the corresponding variable as a list.  If count is *, then all of
            the remaining bytes in string will be scanned.  If count is
            omitted, then one 8-bit integer will be scanned.  For example,
                   binary scan \x07\x86\x05 c2c* var1 var2
            will return 2 with 7 -122 stored in var1 and 5 stored in var2.
            Note that the integers returned are signed, but they can be
            converted to unsigned 8-bit quantities using an expression like:
                   expr { $num & 0xff }

       s    The data is interpreted as count 16-bit signed integers
            represented in little-endian byte order.  The integers are stored
            in the corresponding variable as a list.  If count is *, then all
            of the remaining bytes in string will be scanned.  If count is
            omitted, then one 16-bit integer will be scanned.  For example,
                   binary scan \x05\x00\x07\x00\xf0\xff s2s* var1 var2
            will return 2 with 5 7 stored in var1 and -16 stored in var2.
            Note that the integers returned are signed, but they can be
            converted to unsigned 16-bit quantities using an expression like:
                   expr { $num & 0xffff }

       S    This form is the same as s except that the data is interpreted as
            count 16-bit signed integers represented in big-endian byte order.
            For example,
                   binary scan \x00\x05\x00\x07\xff\xf0 S2S* var1 var2
            will return 2 with 5 7 stored in var1 and -16 stored in var2.

       i    The data is interpreted as count 32-bit signed integers
            represented in little-endian byte order.  The integers are stored
            in the corresponding variable as a list.  If count is *, then all
            of the remaining bytes in string will be scanned.  If count is
            omitted, then one 32-bit integer will be scanned.  For example,
                   binary scan \x05\x00\x00\x00\x07\x00\x00\x00\xf0\xff\xff\xff i2i* var1 var2
            will return 2 with 5 7 stored in var1 and -16 stored in var2.
            Note that the integers returned are signed, but they can be
            converted to unsigned 32-bit quantities using an expression like:
                   expr { $num & 0xffffffff }

       I    This form is the same as i except that the data is interpreted as
            count 32-bit signed integers represented in big-endian byte order.
            For example,
                   binary scan \x00\x00\x00\x05\x00\x00\x00\x07\xff\xff\xff\xf0 I2I* var1 var2
            will return 2 with 5 7 stored in var1 and -16 stored in var2.

       w    The data is interpreted as count 64-bit signed integers
            represented in little-endian byte order.  The integers are stored
            in the corresponding variable as a list.  If count is *, then all
            of the remaining bytes in string will be scanned.  If count is
            omitted, then one 64-bit integer will be scanned.  For example,
                   binary scan \x05\x00\x00\x00\x07\x00\x00\x00\xf0\xff\xff\xff wi* var1 var2
            will return 2 with 30064771077 stored in var1 and -16 stored in
            var2.  Note that the integers returned are signed and cannot be
            represented by Tcl as unsigned values.

       W    This form is the same as w except that the data is interpreted as
            count 64-bit signed integers represented in big-endian byte order.
            For example,
                   binary scan \x00\x00\x00\x05\x00\x00\x00\x07\xff\xff\xff\xf0 WI* var1 var2
            will return 2 with 21474836487 stored in var1 and -16 stored in
            var2.

       f    The data is interpreted as count single-precision floating-point
            numbers in the machine's native representation.  The
            floating-point numbers are stored in the corresponding variable as
            a list.  If count is *, then all of the remaining bytes in string
            will be scanned.  If count is omitted, then one single-precision
            floating-point number will be scanned.  The size of a
            floating-point number may vary across architectures, so the number
            of bytes that are scanned may vary.  If the data does not
            represent a valid floating-point number, the resulting value is
            undefined and compiler-dependent.  For example, on a Windows
            system running on an Intel Pentium processor,
                   binary scan \xcd\xcc\xcc\x3f f var1
            will return 1 with 1.6000000238418579 stored in var1.

       d    This form is the same as f except that the data is interpreted as
            count double-precision floating-point numbers in the machine's
            native representation.  For example, on a Windows system running
            on an Intel Pentium processor,
                   binary scan \x9a\x99\x99\x99\x99\x99\xf9\x3f d var1
            will return 1 with 1.6000000000000001 stored in var1.

       x    Moves the cursor forward count bytes in string.  If count is * or
            is larger than the number of bytes after the current cursor
            position, then the cursor is positioned after the last byte in
            string.  If count is omitted, then the cursor is moved forward one
            byte.  Note that this type does not consume an argument.  For
            example,
                   binary scan \x01\x02\x03\x04 x2H* var1
            will return 1 with 0304 stored in var1.

       X    Moves the cursor back count bytes in string.  If count is * or is
            larger than the current cursor position, then the cursor is
            positioned at location 0 so that the next byte scanned will be the
            first byte in string.  If count is omitted then the cursor is
            moved back one byte.  Note that this type does not consume an
            argument.  For example,
                   binary scan \x01\x02\x03\x04 c2XH* var1 var2
            will return 2 with 1 2 stored in var1 and 020304 stored in var2.

       @    Moves the cursor to the absolute location in the data string
            specified by count.  Note that position 0 refers to the first byte
            in string.  If count refers to a position beyond the end of
            string, then the cursor is positioned after the last byte.  If
            count is omitted, then an error will be generated.  For example,
                   binary scan \x01\x02\x03\x04 c2@1H* var1 var2
            will return 2 with 1 2 stored in var1 and 020304 stored in var2.
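
       The cursor types and the padding-stripping A type combine naturally
       when reading fixed-layout records.  The sketch below is illustrative
       only: the 16-byte record layout (name, version, flags) and all
       variable names are assumptions made for the example.

```tcl
# Build a 16-byte record: 8-byte space-padded name, big-endian 16-bit
# version, two padding bytes, big-endian 32-bit flags.
set record [binary format A8Sx2I widget 7 -16]

# A strips the trailing padding; @12 jumps straight to the flags field.
set count [binary scan $record A8S@12I name version flags]

# a keeps the padding that A removes.
binary scan $record a8 rawName

# x8 skips the name, S reads the version, X2 backs up over it, and H4
# re-reads the same two bytes as hex digits.
binary scan $record x8SX2H4 versionAgain versionHex

puts "count=$count name=$name version=$version flags=$flags hex=$versionHex"
```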

PLATFORM ISSUES

       Sometimes it is desirable to format or scan integer values in the
       native byte order for the machine.  Refer to the byteOrder element of
       the tcl_platform array to decide which type character to use when
       formatting or scanning integers.
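
       One way to apply this is to choose the type character once and reuse
       it.  A minimal sketch, run at global scope; the variable name int32
       is illustrative:

```tcl
# Pick the 32-bit integer type that matches the host byte order.
if {$tcl_platform(byteOrder) eq "littleEndian"} {
    set int32 i
} else {
    set int32 I
}

# Format and scan with the chosen type character.
set native [binary format $int32 1]
binary scan $native $int32 roundTrip

# On a little-endian host the first byte holds the low-order bits.
binary scan $native c firstByte
puts "type=$int32 roundTrip=$roundTrip firstByte=$firstByte"
```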

EXAMPLES

       This is a procedure to write a Tcl string to a binary-encoded channel
       as UTF-8 data preceded by a length word:
              proc writeString {channel string} {
                  set data [encoding convertto utf-8 $string]
                  puts -nonewline $channel [binary format Ia* \
                          [string length $data] $data]
              }

       This procedure reads a string from a channel that was written by the
       previously presented writeString procedure:
              proc readString {channel} {
                  if {![binary scan [read $channel 4] I length]} {
                      error "missing length"
                  }
                  set data [read $channel $length]
                  return [encoding convertfrom utf-8 $data]
              }
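
       The two procedures can be exercised together through a temporary
       file.  This is a sketch under stated assumptions: it redefines both
       procedures so that it is self-contained, it relies on file tempfile,
       which is assumed to be available (Tcl 8.6 or later), and the test
       string is arbitrary.

```tcl
# Length-prefixed writer: 32-bit big-endian byte count, then UTF-8 data.
proc writeString {channel string} {
    set data [encoding convertto utf-8 $string]
    puts -nonewline $channel [binary format Ia* [string length $data] $data]
}

# Matching reader: read the length word, then that many bytes.
proc readString {channel} {
    if {![binary scan [read $channel 4] I length]} {
        error "missing length"
    }
    set data [read $channel $length]
    return [encoding convertfrom utf-8 $data]
}

# Round trip through a temporary file opened for reading and writing.
set chan [file tempfile tmpPath]
fconfigure $chan -translation binary
writeString $chan "caf\u00e9"
seek $chan 0
set result [readString $chan]
close $chan
file delete $tmpPath
puts $result
```

       The channel is put in binary translation mode so that neither
       end-of-line translation nor character encoding alters the bytes.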

KEYWORDS

       binary, format, scan

Tcl                                   8.0                            binary(n)