Convert::Binary::C(3pm)

Convert::Binary::C(3) User Contributed Perl Documentation Convert::Binary::C(3)

NAME

       Convert::Binary::C - Binary Data Conversion using C Types

SYNOPSIS

   Simple
         use Convert::Binary::C;

         #---------------------------------------------
         # Create a new object and parse embedded code
         #---------------------------------------------
         my $c = Convert::Binary::C->new->parse(<<ENDC);

         enum Month { JAN, FEB, MAR, APR, MAY, JUN,
                      JUL, AUG, SEP, OCT, NOV, DEC };

         struct Date {
           int        year;
           enum Month month;
           int        day;
         };

         ENDC

         #-----------------------------------------------
         # Pack Perl data structure into a binary string
         #-----------------------------------------------
         my $date = { year => 2002, month => 'DEC', day => 24 };

         my $packed = $c->pack('Date', $date);

   Advanced
         use Convert::Binary::C;
         use Data::Dumper;

         #---------------------
         # Create a new object
         #---------------------
         my $c = new Convert::Binary::C ByteOrder => 'BigEndian';

         #---------------------------------------------------
         # Add include paths and global preprocessor defines
         #---------------------------------------------------
         $c->Include('/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include',
                     '/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include-fixed',
                     '/usr/include')
           ->Define(qw( __USE_POSIX __USE_ISOC99=1 ));

         #----------------------------------
         # Parse the 'time.h' header file
         #----------------------------------
         $c->parse_file('time.h');

         #---------------------------------------
         # See which files the object depends on
         #---------------------------------------
         print Dumper([$c->dependencies]);

         #-----------------------------------------------------------
         # See if struct timespec is defined and dump its definition
         #-----------------------------------------------------------
         if ($c->def('struct timespec')) {
           print Dumper($c->struct('timespec'));
         }

         #-------------------------------
         # Create some binary dummy data
         #-------------------------------
         my $data = "binary_test_string";

         #--------------------------------------------------------
         # Unpack $data according to 'struct timespec' definition
         #--------------------------------------------------------
         if (length($data) >= $c->sizeof('timespec')) {
           my $perl = $c->unpack('timespec', $data);
           print Dumper($perl);
         }

         #--------------------------------------------------------
         # See which member lies at offset 5 of 'struct timespec'
         #--------------------------------------------------------
         my $member = $c->member('timespec', 5);
         print "member('timespec', 5) = '$member'\n";

DESCRIPTION

       Convert::Binary::C is a preprocessor and parser for C type definitions.
       It is highly configurable and supports arbitrarily complex data
       structures. Its object-oriented interface has "pack" and "unpack"
       methods that act as replacements for Perl's "pack" and "unpack" and
       allow one to use C types instead of a string representation of the data
       structure for conversion of binary data from and to Perl's complex data
       structures.

       Actually, what Convert::Binary::C does is not very different from what
       a C compiler does, just that it doesn't compile the source code into an
       object file or executable, but only parses the code and allows Perl to
       use the enumerations, structs, unions and typedefs that have been
       defined within your C source for binary data conversion, similar to
       Perl's "pack" and "unpack".

       Beyond that, the module offers a lot of convenience methods to retrieve
       information about the C types that have been parsed.

   Background and History
       In late 2000 I wrote a real-time debugging interface for an embedded
       medical device that allowed me to send out data from that device over
       its integrated Ethernet adapter.  The interface was "printf()"-like, so
       you could easily send out strings or numbers. But you could also send
       out what I called arbitrary data, which was intended for arbitrary
       blocks of the device's memory.

       Another part of this real-time debugger was a Perl application running
       on my workstation that gathered all the messages that were sent out
       from the embedded device. It printed all the strings and numbers, and
       hex-dumped the arbitrary data.  However, manually parsing a couple of
       300 byte hex-dumps of a complex C structure is not only frustrating,
       but also error-prone and time consuming.

       Using "unpack" to retrieve the contents of a C structure works fine for
       small structures and if you don't have to deal with struct member
       alignment. But otherwise, maintaining such code can be as awful as
       deciphering hex-dumps.

       As I didn't find anything to solve my problem on the CPAN, I wrote a
       little module that translated simple C structs into "unpack" strings.
       It worked, but it was slow. And since it couldn't deal with struct
       member alignment, I soon found myself adding padding bytes everywhere.
       So again, I had to maintain two sources, and changing one of them
       forced me to touch the other one.

       All in all, this little module seemed to make my task a bit easier, but
       it was far from being what I was thinking of:

       · A module that could directly use the source I've been coding for the
         embedded device without any modifications.

       · A module that could be configured to match the properties of the
         different compilers and target platforms I was using.

       · A module that was fast enough to decode a great amount of binary data
         even on my slow workstation.

       I didn't know how to accomplish these tasks until I read something
       about XS. At least, it seemed as if it could solve my performance
       problems. However, writing a C parser in C isn't easier than it is in
       Perl. And writing a C preprocessor from scratch is even worse.

       Fortunately, after a few weeks of searching I found both a lean,
       open-source C preprocessor library and a reusable YACC grammar for
       ANSI-C. That was the beginning of the development of
       Convert::Binary::C in late 2001.

       I have been using the module successfully in my embedded environment
       since long before it appeared on CPAN. From my point of view, it is
       exactly what I had in mind. It's fast, flexible, easy to use and
       portable. It doesn't require external programs or other Perl modules.

   About this document
       This document describes how to use Convert::Binary::C. A lot of
       different features are presented, and the example code sometimes uses
       Perl's more advanced language elements. If your experience with Perl is
       rather limited, you should know how to use Perl's very good
       documentation system.

       To look up one of the manpages, use the "perldoc" command.  For
       example,

         perldoc perl

       will show you Perl's main manpage. To look up a specific Perl function,
       use "perldoc -f":

         perldoc -f map

       gives you more information about the "map" function.  You can also
       search the FAQ using "perldoc -q":

         perldoc -q array

       will give you everything you ever wanted to know about Perl arrays. But
       now, let's go on with some real stuff!

   Why use Convert::Binary::C?
       Say you want to pack (or unpack) data according to the following C
       structure:

         struct foo {
           char ary[3];
           unsigned short baz;
           int bar;
         };

       You could of course use Perl's "pack" and "unpack" functions:

         @ary = (1, 2, 3);
         $baz = 40000;
         $bar = -4711;
         $binary = pack 'c3 S i', @ary, $baz, $bar;

       But this implies that the struct members are byte aligned. If they were
       long aligned (which is the default for most compilers), you'd have to
       write

         $binary = pack 'c3 x S x2 i', @ary, $baz, $bar;

       which doesn't really increase readability.

       Now imagine that you need to pack the data for a completely different
       architecture with different byte order. You would look into the "pack"
       manpage again and perhaps come up with this:

         $binary = pack 'c3 x n x2 N', @ary, $baz, $bar;

       However, if you try to unpack $binary again with the same template,
       your signed values have turned into unsigned ones.

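       The sign problem is easy to reproduce with core Perl alone. The sketch
       below packs the example values with the big-endian template shown
       above and unpacks them twice. The variable names @naive and @fixed are
       illustrative only, and the "l>" template assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

my @ary = (1, 2, 3);
my ($baz, $bar) = (40000, -4711);

# Big-endian layout with padding bytes for 4-byte alignment
my $binary = pack 'c3 x n x2 N', @ary, $baz, $bar;
printf "packed %d bytes\n", length $binary;

# Unpacking with the same template: 'N' always yields an unsigned value
my @naive = unpack 'c3 x n x2 N', $binary;
print "bar via 'N':  $naive[4]\n";

# 'l>' (signed 32-bit, big-endian) restores the sign
my @fixed = unpack 'c3 x n x2 l>', $binary;
print "bar via 'l>': $fixed[4]\n";
```

       The second call recovers -4711 only because the template was changed
       by hand, which is the kind of maintenance burden described here.
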
       All this can still be managed with Perl. But what if your structures
       get more complex? What if you need to support different platforms?
       What if you need to make changes to the structures? You'll not only
       have to change the C source but also dozens of "pack" strings in your
       Perl code. This is no fun. And Perl should be fun.

       Now, wouldn't it be great if you could just read in the C source you've
       already written and use all the types defined there for packing and
       unpacking? That's what Convert::Binary::C does.

   Creating a Convert::Binary::C object
       To use Convert::Binary::C just say

         use Convert::Binary::C;

       to load the module. Its interface is completely object oriented, so it
       doesn't export any functions.

       Next, you need to create a new Convert::Binary::C object. This can be
       done by either

         $c = Convert::Binary::C->new;

       or

         $c = new Convert::Binary::C;

       You can optionally pass configuration options to the constructor as
       described in the next section.

   Configuring the object
       To configure a Convert::Binary::C object, you can either call the
       "configure" method or directly pass the configuration options to the
       constructor. If you want to change byte order and alignment, you can
       use

         $c->configure(ByteOrder => 'LittleEndian',
                       Alignment => 2);

       or you can change the construction code to

         $c = new Convert::Binary::C ByteOrder => 'LittleEndian',
                                     Alignment => 2;

       Either way, the object will now know that it should use little endian
       (Intel) byte order and 2-byte struct member alignment for packing and
       unpacking.

       Alternatively, you can use the option names as names of methods to
       configure the object, like:

         $c->ByteOrder('LittleEndian');

       You can also retrieve information about the current configuration of a
       Convert::Binary::C object. For details, see the section about the
       "configure" method.

   Parsing C code
       Convert::Binary::C offers two ways of parsing C source. You can either
       parse external C header or C source files:

         $c->parse_file('header.h');

       Or you can parse C code embedded in your script:

         $c->parse(<<'CCODE');
         struct foo {
           char ary[3];
           unsigned short baz;
           int bar;
         };
         CCODE

       Now the object $c will know everything about "struct foo".  The example
       above uses a so-called here-document. It allows one to easily embed
       multi-line strings in your code. You can find more about here-documents
       in perldata or perlop.

       Since the "parse" and "parse_file" methods throw an exception when a
       parse error occurs, you usually want to catch these in an "eval" block:

         eval { $c->parse_file('header.h') };
         if ($@) {
           # handle error appropriately
         }

       Perl's special $@ variable will contain an empty string (which
       evaluates to a false value in boolean context) on success or an error
       string on failure.

       As another feature, "parse" and "parse_file" return a reference to
       their object on success, just like "configure" does when you're
       configuring the object. This will allow you to write constructs like
       this:

         my $c = eval {
           Convert::Binary::C->new(Include => ['/usr/include'])
                             ->parse_file('header.h')
         };
         if ($@) {
           # handle error appropriately
         }

   Packing and unpacking
       Convert::Binary::C has two methods, "pack" and "unpack", that act
       similarly to the Perl functions of the same name.  To perform the
       packing described in the example above, you could write:

         $data = {
           ary => [1, 2, 3],
           baz => 40000,
           bar => -4711,
         };
         $binary = $c->pack('foo', $data);

       Unpacking will work exactly the same way, just that the "unpack" method
       will take a byte string as its input and will return a reference to a
       (possibly very complex) Perl data structure.

         $binary = get_data_from_memory();
         $data = $c->unpack('foo', $binary);

       You can now easily access all of the values:

         print "foo.ary[1] = $data->{ary}[1]\n";

       Or you can even more conveniently use the Data::Dumper module:

         use Data::Dumper;
         print Dumper($data);

       The output would look something like this:

         $VAR1 = {
           'bar' => -271,
           'baz' => 5000,
           'ary' => [
             42,
             48,
             100
           ]
         };

   Preprocessor configuration
       Convert::Binary::C uses Thomas Pornin's "ucpp" as an internal C
       preprocessor. It is compliant with ISO C99, so you don't have to worry
       about using even weird preprocessor constructs in your code.

       If your C source contains includes or depends upon preprocessor
       defines, you may need to configure the internal preprocessor.  Use the
       "Include" and "Define" configuration options for that:

         $c->configure(Include => ['/usr/include',
                                   '/home/mhx/include'],
                       Define  => [qw( NDEBUG FOO=42 )]);

       If your code uses system includes, it is most likely that you will need
       to define the symbols that are usually defined by the compiler.

       On some operating systems, the system includes require the preprocessor
       to predefine a certain set of assertions.  Assertions are supported by
       "ucpp", and you can define them either in the source code using
       "#assert" or as a property of the Convert::Binary::C object using
       "Assert":

         $c->configure(Assert => ['predicate(answer)']);

       Information about defined macros can be retrieved from the preprocessor
       as long as its configuration isn't changed. The preprocessor is
       implicitly reset if you change one of the following configuration
       options:

         Include
         Define
         Assert
         HasCPPComments
         HasMacroVAARGS

   Supported pragma directives
       Convert::Binary::C supports the "pack" pragma to locally override
       struct member alignment. The supported syntax is as follows:

       #pragma pack( ALIGN )
           Sets the new alignment to ALIGN. If ALIGN is 0, resets the
           alignment to its original value.

       #pragma pack
           Resets the alignment to its original value.

       #pragma pack( push, ALIGN )
           Saves the current alignment on a stack and sets the new alignment
           to ALIGN. If ALIGN is 0, sets the alignment to the default
           alignment.

       #pragma pack( pop )
           Restores the alignment to the last value saved on the stack.

         /*  Example assumes sizeof( short ) == 2, sizeof( long ) == 4.  */

         #pragma pack(1)

         struct nopad {
           char a;               /* no padding bytes between 'a' and 'b' */
           long b;
         };

         #pragma pack            /* reset to "native" alignment          */

         #pragma pack( push, 2 )

         struct pad {
           char    a;            /* one padding byte between 'a' and 'b' */
           long    b;

         #pragma pack( push, 1 )

           struct {
             char  c;            /* no padding between 'c' and 'd'       */
             short d;
           }       e;            /* sizeof( e ) == 3                     */

         #pragma pack( pop );    /* back to pack( 2 )                    */

           long    f;            /* one padding byte between 'e' and 'f' */
         };

         #pragma pack( pop );    /* back to "native"                     */

       The "pack" pragma as it is currently implemented only affects the
       maximum struct member alignment. There are compilers that also allow
       one to specify the minimum struct member alignment. This is not
       supported by Convert::Binary::C.

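       The padding noted in the comments of the pragma example follows from
       simple offset arithmetic. The sketch below is not part of
       Convert::Binary::C: layout() is a hypothetical helper that applies the
       common rule of aligning each member to the smaller of its natural
       alignment and the pack value. It assumes that the natural alignment of
       a basic type equals its size.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Compute member offsets when "#pragma pack(N)" caps the alignment.
# Each member is given as [name, size, natural alignment].
sub layout {
    my ($pack, @members) = @_;
    my ($offset, $struct_align, %at) = (0, 1);
    for my $m (@members) {
        my ($name, $size, $natural) = @$m;
        my $align = min($natural, $pack);          # pack limits the maximum
        $offset += ($align - $offset % $align) % $align;   # padding bytes
        $at{$name} = $offset;
        $offset += $size;
        $struct_align = max($struct_align, $align);
    }
    # Round the total size up to the alignment of the struct itself
    $offset += ($struct_align - $offset % $struct_align) % $struct_align;
    return ($offset, \%at);
}

# Assumes sizeof(short) == 2 and sizeof(long) == 4, as in the C example.
my ($nopad_size, $nopad) = layout(1, [a => 1, 1], [b => 4, 4]);
print "nopad: b at $nopad->{b}, size $nopad_size\n";

# The inner struct 'e' was laid out under pack(1): size 3, alignment 1.
my ($pad_size, $pad) = layout(2, [a => 1, 1], [b => 4, 4],
                                 [e => 3, 1], [f => 4, 4]);
print "pad: b at $pad->{b}, e at $pad->{e}, f at $pad->{f}, size $pad_size\n";
```

       Under these assumptions 'b' lands at offset 1 in "struct nopad" and at
       offset 2 in "struct pad", and 'f' lands at offset 10, one byte after
       the end of 'e'.
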
   Automatic configuration using "ccconfig"
       As there are over 20 different configuration options, setting all of
       them correctly can be a lengthy and tedious task.

       The "ccconfig" script, which is bundled with this module, aims at
       automatically determining the correct compiler configuration by testing
       the compiler executable. It works for both native and cross compilers.

UNDERSTANDING TYPES

       This section covers one of the fundamental features of
       Convert::Binary::C. It's how type expressions, referred to as TYPEs in
       the method reference, are handled by the module.

       Many of the methods, namely "pack", "unpack", "sizeof", "typeof",
       "member", "offsetof", "def", "initializer" and "tag", are passed a TYPE
       to operate on as their first argument.

   Standard Types
       These are trivial. Standard types are simply enum names, struct names,
       union names, or typedefs. Almost every method that wants a TYPE will
       accept a standard type.

       For enums, structs and unions, the prefixes "enum", "struct" and
       "union" are optional. However, if a typedef with the same name exists,
       like in

         struct foo {
           int bar;
         };

         typedef int foo;

       you will have to use the prefix to distinguish between the struct and
       the typedef. Otherwise, a typedef is always given preference.

   Basic Types
       Basic types, or atomic types, are "int" or "char", for example.  It's
       possible to use these basic types without having parsed any code. You
       can simply do

         $c = new Convert::Binary::C;
         $size = $c->sizeof('unsigned long');
         $data = $c->pack('short int', 42);

       Even though the above works fine, it is not possible to define more
       complex types on the fly, so

         $size = $c->sizeof('struct { int a, b; }');

       will result in an error.

       Basic types are not supported by all methods. For example, it makes no
       sense to use "member" or "offsetof" on a basic type. Using "typeof"
       isn't very useful, but supported.

   Member Expressions
       This is by far the most complex part, depending on the complexity of
       your data structures. Any standard type that defines a compound or an
       array may be followed by a member expression to select only a certain
       part of the data type. Say you have parsed the following C code:

         struct foo {
           long type;
           struct {
             short x, y;
           } array[20];
         };

         typedef struct foo matrix[8][8];

       You may want to know the size of the "array" member of "struct foo".
       This is quite easy:

         print $c->sizeof('foo.array'), " bytes";

       will print

         80 bytes

       depending of course on the "ShortSize" you configured.

       If you wanted to unpack only a single column of "matrix", that's easy
       as well (and of course it doesn't matter which index you use):

         $column = $c->unpack('matrix[2]', $data);

       Just like in C, it is possible to use out-of-bounds array indices.
       This means that, for example, although "array" is declared to have 20
       elements, the following code

         $size   = $c->sizeof('foo.array[4711]');
         $offset = $c->offsetof('foo', 'array[-13]');

       is perfectly valid and will result in:

         $size   = 4
         $offset = -48

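       These numbers can be checked by hand. The sketch below does not use
       the module; it assumes a "ShortSize" of 2, a "LongSize" of 4 and no
       padding inside "struct foo", and its variable names are illustrative.

```perl
use strict;
use warnings;

# Assumed sizes: short = 2 bytes, long = 4 bytes, no padding.
my ($SHORT, $LONG) = (2, 4);

my $elem_size    = 2 * $SHORT;          # struct { short x, y; }
my $array_offset = $LONG;               # 'array' follows 'long type'
my $array_size   = 20 * $elem_size;     # all 20 elements of foo.array

# An out-of-bounds index is just more of the same arithmetic.
my $neg_offset   = $array_offset + (-13) * $elem_size;

print "sizeof(foo.array)         = $array_size\n";
print "sizeof(foo.array[4711])   = $elem_size\n";
print "offsetof(foo, array[-13]) = $neg_offset\n";
```
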
       Member expressions can be arbitrarily complex:

         $type = $c->typeof('matrix[2][3].array[7].y');
         print "the type is $type";

       will, for example, print

         the type is short

       Member expressions are also used as the second argument to "offsetof".

   Offsets
       Members returned by the "member" method have an optional offset suffix
       to indicate that the given offset doesn't point to the start of that
       member. For example,

         $member = $c->member('matrix', 1431);
         print $member;

       will print

         [2][1].type+3

       If you use this as a member expression, like in

         $size = $c->sizeof("matrix $member");

       the offset suffix will simply be ignored. Actually, it will be ignored
       for all methods if it's used in the first argument.

       When used in the second argument to "offsetof", it will usually do what
       you mean, i.e. the offset suffix, if present, will be considered when
       determining the offset. This behaviour ensures that

         $member = $c->member('foo', 43);
         $offset = $c->offsetof('foo', $member);
         print "'$member' is located at offset $offset of struct foo";

       will always correctly set $offset:

         '.array[9].y+1' is located at offset 43 of struct foo

       If this is not what you mean, e.g. because you want to know the offset
       where the member returned by "member" starts, you just have to remove
       the suffix:

         $member =~ s/\+\d+$//;
         $offset = $c->offsetof('foo', $member);
         print "'$member' starts at offset $offset of struct foo";

       This would then print:

         '.array[9].y' starts at offset 42 of struct foo

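       How such a member string relates to a byte offset can be sketched in
       plain Perl. foo_member() and matrix_member() are hypothetical helpers,
       not the module's implementation, and they hard-code the layout assumed
       in this section (4-byte long, 2-byte short, no padding, so that
       "struct foo" occupies 84 bytes).

```perl
use strict;
use warnings;

# Map a byte offset inside 'struct foo' to a member string.
sub foo_member {
    my ($offset) = @_;
    return '.type' . ($offset ? "+$offset" : '') if $offset < 4;
    my $rel  = $offset - 4;             # offset within 'array'
    my $idx  = int($rel / 4);           # which array element
    my $in   = $rel % 4;                # offset within that element
    my $name = $in < 2 ? 'x' : 'y';
    my $rest = $in % 2;                 # bytes past the member's start
    return ".array[$idx].$name" . ($rest ? "+$rest" : '');
}

# Map a byte offset inside 'matrix' (struct foo [8][8]) to a member string.
sub matrix_member {
    my ($offset) = @_;
    my $elem = int($offset / 84);       # flat index into the 8x8 matrix
    my ($row, $col) = (int($elem / 8), $elem % 8);
    return "[$row][$col]" . foo_member($offset % 84);
}

print matrix_member(1431), "\n";
print foo_member(43), "\n";
```

       Offset 1431 is 17 * 84 + 3, and element 17 of an 8x8 matrix is [2][1],
       which gives the "+3" suffix on "type". Offset 43 is one byte past the
       start of "array[9].y" at offset 42.
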

USING TAGS

       In a nutshell, tags are properties that you can attach to types.

       You can add tags to types using the "tag" method, and remove them using
       "tag" or "untag", for example:

         # Attach 'Format' and 'Hooks' tags
         $c->tag('type', Format => 'String', Hooks => { pack => \&rout });

         $c->untag('type', 'Format');  # Remove only 'Format' tag
         $c->untag('type');            # Remove all tags

       You can also use "tag" to see which tags are attached to a type, for
       example:

         $tags = $c->tag('type');

       This would give you:

         $tags = {
           'Hooks' => {
             'pack' => \&rout
           },
           'Format' => 'String'
         };

       Currently, there are only a couple of different tags that influence the
       way data is packed and unpacked. There are probably more tags to come
       in the future.

   The Format Tag
       One of the tags currently available is the "Format" tag.  Using this
       tag, you can tell a Convert::Binary::C object to pack and unpack a
       certain data type in a special way.

       For example, if you have a (fixed length) string type

         typedef char str_type[40];

       this type would, by default, be unpacked as an array of "char"s. That's
       because it is only an array of "char"s, and Convert::Binary::C doesn't
       know it is actually used as a string.

       But you can tell Convert::Binary::C that "str_type" is a C string using
       the "Format" tag:

         $c->tag('str_type', Format => 'String');

       This will make "unpack" (and of course also "pack") treat the binary
       data like a null-terminated C string:

         $binary = "Hello World!\n\0 this is just some dummy data";
         $hello = $c->unpack('str_type', $binary);
         print $hello;

       would thus print:

         Hello World!

       Of course, this also works the other way round:

         use Data::Hexdumper;

         $binary = $c->pack('str_type', "Just another C::B::C hacker");
         print hexdump(data => $binary);

       would print:

           0x0000 : 4A 75 73 74 20 61 6E 6F 74 68 65 72 20 43 3A 3A : Just.another.C::
           0x0010 : 42 3A 3A 43 20 68 61 63 6B 65 72 00 00 00 00 00 : B::C.hacker.....
           0x0020 : 00 00 00 00 00 00 00 00                         : ........

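       For comparison, a rough core-Perl counterpart of the "String" format
       is the "Z" template of the built-in "pack" and "unpack". This is only
       an analogy for a single 40-byte field, not a description of how the
       module implements the tag.

```perl
use strict;
use warnings;

# Unpack: 'Z*' reads up to the first NUL byte and drops the rest.
my $binary = "Hello World!\n\0 this is just some dummy data";
my $hello  = unpack 'Z*', $binary;
print $hello;

# Pack: 'Z40' pads the string with NUL bytes to the full 40-byte field.
my $packed = pack 'Z40', "Just another C::B::C hacker";
printf "%d bytes, %d of them NUL padding\n",
       length $packed, ($packed =~ tr/\0//);
```

       The difference is that the built-in template has to be kept in sync
       with the C typedef by hand, whereas the tag refers to the parsed type.
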
       If you want Convert::Binary::C to not interpret the binary data at all,
       you can set the "Format" tag to "Binary".  This might not seem very
       useful, as "pack" and "unpack" would just pass through the unmodified
       binary data.  But you can tag not only whole types, but also compound
       members. For example

         $c->parse(<<ENDC);
         struct packet {
           unsigned short header;
           unsigned short flags;
           unsigned char  payload[28];
         };
         ENDC

         $c->tag('packet.payload', Format => 'Binary');

       would allow you to write:

         read FILE, $payload, $c->sizeof('packet.payload');

         $packet = {
                     header  => 4711,
                     flags   => 0xf00f,
                     payload => $payload,
                   };

         $binary = $c->pack('packet', $packet);

         print hexdump(data => $binary);

       This would print something like:

           0x0000 : 12 67 F0 0F 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A : .g..no.no.no.no.
           0x0010 : 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E : no.no.no.no.no.n

       For obvious reasons, it is not allowed to attach a "Format" tag to
       bitfield members. Trying to do so will result in an exception being
       thrown by the "tag" method.

   The ByteOrder Tag
       The "ByteOrder" tag allows you to override the byte order of certain
       types or members. The implementation of this tag is considered
       experimental and may be subject to changes in the future.

       Usually it doesn't make much sense to override the byte order, but
       there may be applications where a sub-structure is packed in a
       different byte order than the surrounding structure.

       Take, for example, the following code:

         $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                      OrderMembers => 1);
         $c->parse(<<'ENDC');

         typedef unsigned short u_16;

         struct coords_3d {
           long x, y, z;
         };

         struct coords_msg {
           u_16 header;
           u_16 length;
           struct coords_3d coords;
         };

         ENDC

       Assume that while "coords_msg" is big endian, the embedded coordinates
       "coords_3d" are stored in little endian format for some reason. In C,
       you'll have to handle this manually.

       But using Convert::Binary::C, you can simply attach a "ByteOrder" tag
       to either the "coords_3d" structure or to the "coords" member of the
       "coords_msg" structure. Both will work in this case. The only
       difference is that if you tag the "coords" member, "coords_3d" will
       only be treated as little endian if you "pack" or "unpack" the
       "coords_msg" structure. (BTW, you could also tag all members of
       "coords_3d" individually, but that would be inefficient.)

       So, let's attach the "ByteOrder" tag to the "coords" member:

         $c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');

       Assume the following binary message:

           0x0000 : 00 2A 00 0C FF FF FF FF 02 00 00 00 2A 00 00 00 : .*..........*...

       If you unpack this message...

         $msg = $c->unpack('coords_msg', $binary);

       ...you will get the following data structure:

         $msg = {
           'header' => 42,
           'length' => 12,
           'coords' => {
             'x' => -1,
             'y' => 2,
             'z' => 42
           }
         };

       Without the "ByteOrder" tag, you would get:

         $msg = {
           'header' => 42,
           'length' => 12,
           'coords' => {
             'x' => -1,
             'y' => 33554432,
             'z' => 704643072
           }
         };

793       The "ByteOrder" tag is a recursive tag, i.e. it applies to all children
794       of the tagged object recursively. Of course, it is also possible to
795       override a "ByteOrder" tag by attaching another "ByteOrder" tag to a
796       child type. Confused? Here's an example. In addition to tagging the
797       "coords" member as little endian, we now tag "coords_3d.y" as big
798       endian:
799
800         $c->tag('coords_3d.y', ByteOrder => 'BigEndian');
801         $msg = $c->unpack('coords_msg', $binary);
802
803       This will return the following data structure:
804
805         $msg = {
806           'header' => 42,
807           'length' => 12,
808           'coords' => {
809             'x' => -1,
810             'y' => 33554432,
811             'z' => 42
812           }
813         };
814
815       Note that if you tag both a type and a member of that type within a
816       compound, the tag attached to the type itself has higher precedence.
817       Using the example above, if you would attach a "ByteOrder" tag to both
818       "coords_msg.coords" and "coords_3d", the tag attached to "coords_3d"
819       would always win.
820
821       Also note that the "ByteOrder" tag might not work as expected along
822       with bitfields, which is why the implementation is considered
823       experimental. Bitfields are currently not affected by the "ByteOrder"
824       tag at all. This is because the byte order would affect the bitfield
825       layout, and a consistent implementation supporting multiple layouts of
826       the same struct would be quite bulky and probably slow down the whole
827       module.
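
       To see why byte order and bitfield layout are entangled, consider
       this plain-Perl sketch. It assumes, purely for illustration, that a
       7-bit field occupies the lowest bits of a 32-bit storage unit. Real
       compilers differ, and this is not a description of how
       Convert::Binary::C lays out bitfields.

```perl
use strict;
use warnings;

# Four bytes holding one 32-bit bitfield storage unit.
my $bytes = pack 'C4', 0x12, 0x34, 0x56, 0x78;

my $le_word = unpack 'V', $bytes;   # read as little endian: 0x78563412
my $be_word = unpack 'N', $bytes;   # read as big endian:    0x12345678

# Assumption: the 7-bit field sits in the lowest bits of the unit.
my $field_le = $le_word & 0x7F;
my $field_be = $be_word & 0x7F;

printf "little endian unit 0x%08X -> field %d\n", $le_word, $field_le;
printf "big endian unit    0x%08X -> field %d\n", $be_word, $field_be;
```

       The same four bytes yield different field values, so swapping the
       byte order cannot be done member by member for bitfields.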
828
829       If you really need the correct behaviour, you can use the following
830       trick:
831
832         $le = Convert::Binary::C->new(ByteOrder => 'LittleEndian');
833
834         $le->parse(<<'ENDC');
835
836         typedef unsigned short u_16;
837         typedef unsigned long  u_32;
838
839         struct message {
840           u_16 header;
841           u_16 length;
842           struct {
843             u_32 a;
844             u_32 b;
845             u_32 c :  7;
846             u_32 d :  5;
847             u_32 e : 20;
848           } data;
849         };
850
851         ENDC
852
853         $be = $le->clone->ByteOrder('BigEndian');
854
855         $le->tag('message.data', Format => 'Binary', Hooks => {
856             unpack => sub { $be->unpack('message.data', @_) },
857             pack   => sub { $be->pack('message.data', @_) },
858           });
859
860
861         $msg = $le->unpack('message', $binary);
862
863       This uses the "Format" and "Hooks" tags along with a big endian "clone"
864       of the original little endian object. It attaches hooks to the little
865       endian object and in the hooks it uses the big endian object to "pack"
866       and "unpack" the binary data.
867
868   The Dimension Tag
869       The "Dimension" tag allows you to override the declared dimension of an
870       array for packing or unpacking data. The implementation of this tag is
871       considered very experimental and will definitely change in a future
872       release.
873
874       That being said, the "Dimension" tag is primarily useful to support
875       variable length arrays. Usually, you have to write the following code
876       for such a variable length array in C:
877
878         struct c_message
879         {
880           unsigned count;
881           char data[1];
882         };
883
884       So, because you cannot declare an empty array, you declare an array
885       with a single element. If you have an ISO-C99 compliant compiler, you
886       can write this code instead:
887
888         struct c99_message
889         {
890           unsigned count;
891           char data[];
892         };
893
894       This explicitly tells the compiler that "data" is a flexible array
895       member. Convert::Binary::C already uses this information to handle
896       flexible array members in a special way.
897
898       As you can see in the following example, the two types are treated
899       differently:
900
901         $data = pack 'NC*', 3, 1..8;
902         $uc   = $c->unpack('c_message', $data);
903         $uc99 = $c->unpack('c99_message', $data);
904
905       This will result in:
906
907         $uc = {'count' => 3,'data' => [1]};
908         $uc99 = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
909
910       However, only a few compilers support ISO-C99, and you probably don't
911       want to change your existing code only to get some extra features when
912       using Convert::Binary::C.
913
914       So it is possible to attach a tag to the "data" member of the
915       "c_message" struct that tells Convert::Binary::C to treat the array as
916       if it were flexible:
917
918         $c->tag('c_message.data', Dimension => '*');
919
920       Now both "c_message" and "c99_message" will behave exactly the same
921       when using "pack" or "unpack".  Repeating the above code:
922
923         $uc = $c->unpack('c_message', $data);
924
925       This will result in:
926
927         $uc = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
928
929       But there's more you can do. Even though it probably doesn't make much
930       sense, you can tag a fixed dimension to an array:
931
932         $c->tag('c_message.data', Dimension => '5');
933
934       This will obviously result in:
935
936         $uc = {'count' => 3,'data' => [1,2,3,4,5]};
937
938       A more useful way to use the "Dimension" tag is to set it to the name
939       of a member in the same compound:
940
941         $c->tag('c_message.data', Dimension => 'count');
942
943       Convert::Binary::C will now use the value of that member to determine
944       the size of the array, so unpacking will result in:
945
946         $uc = {'count' => 3,'data' => [1,2,3]};
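
       The three "Dimension" settings shown so far can be mimicked with
       the built-in "unpack" to make the arithmetic explicit. This is only
       an illustrative sketch, assuming a 32-bit big endian "count" and
       one-byte signed "char" elements; it does not call the module.

```perl
use strict;
use warnings;

# Same test data as above: a 32-bit big endian count, then eight bytes.
my $data = pack 'NC*', 3, 1 .. 8;

my ($count, @bytes) = unpack 'N c*', $data;

my @flexible = @bytes;                    # like Dimension => '*'
my @fixed    = @bytes[0 .. 4];            # like Dimension => '5'
my @counted  = @bytes[0 .. $count - 1];   # like Dimension => 'count'

print "flexible: @flexible\n";   # 1 2 3 4 5 6 7 8
print "fixed:    @fixed\n";      # 1 2 3 4 5
print "counted:  @counted\n";    # 1 2 3
```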
947
948       Of course, you can also tag flexible array members. And yes, it's also
949       possible to use more complex member expressions:
950
951         $c->parse(<<ENDC);
952         struct msg_header
953         {
954           unsigned len[2];
955         };
956
957         struct more_complex
958         {
959           struct msg_header hdr;
960           char data[];
961         };
962         ENDC
963
964         $data = pack 'NNC*', 42, 7, 1 .. 10;
965
966         $c->tag('more_complex.data', Dimension => 'hdr.len[1]');
967
968         $u = $c->unpack('more_complex', $data);
969
970       The result will be:
971
972         $u = {
973           'hdr' => {
974             'len' => [
975               42,
976               7
977             ]
978           },
979           'data' => [
980             1,
981             2,
982             3,
983             4,
984             5,
985             6,
986             7
987           ]
988         };
989
990       By the way, it's also possible to tag arrays that are not embedded
991       inside a compound:
992
993         $c->parse(<<ENDC);
994         typedef unsigned short short_array[];
995         ENDC
996
997         $c->tag('short_array', Dimension => '5');
998
999         $u = $c->unpack('short_array', $data);
1000
1001       Resulting in:
1002
1003         $u = [0,42,0,7,258];
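
       The values follow directly from the bytes in $data. As a quick
       check with the built-in "unpack" (assuming 16-bit big endian
       shorts, which is what the result above implies):

```perl
use strict;
use warnings;

# Same data as in the "more_complex" example.
my $data = pack 'NNC*', 42, 7, 1 .. 10;

# Five unsigned 16-bit big endian values from the start of the buffer.
my @shorts = unpack 'n5', $data;

print "@shorts\n";   # 0 42 0 7 258
```

       The last value, 258, is 0x0102, i.e. the first two data bytes read
       as a single short.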
1004
1005       The final and most powerful way to define a "Dimension" tag is to pass
1006       it a subroutine reference. The referenced subroutine can execute
1007       whatever code is necessary to determine the size of the tagged array:
1008
1009         sub get_size
1010         {
1011           my $m = shift;
1012           return $m->{hdr}{len}[0] / $m->{hdr}{len}[1];
1013         }
1014
1015         $c->tag('more_complex.data', Dimension => \&get_size);
1016
1017         $u = $c->unpack('more_complex', $data);
1018
1019       As you can guess from the above code, the subroutine is passed a
1020       reference to a hash that stores the already unpacked part of the
1021       compound embedding the tagged array. This is the result:
1022
1023         $u = {
1024           'hdr' => {
1025             'len' => [
1026               42,
1027               7
1028             ]
1029           },
1030           'data' => [
1031             1,
1032             2,
1033             3,
1034             4,
1035             5,
1036             6
1037           ]
1038         };
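
       Note that "get_size" above dies with Perl's "Illegal division by
       zero" error if "len[1]" happens to be zero, and returns a fraction
       if the division is not exact. How a fractional return value is
       treated is not stated here, so a more defensive callback might look
       like the following sketch. The name "get_size_checked" is
       hypothetical, and returning 0 for a zero divisor is just one
       possible policy.

```perl
use strict;
use warnings;

# Hypothetical defensive variant of get_size from the example above.
sub get_size_checked
{
  my $m = shift;
  my($total, $per_item) = @{$m->{hdr}{len}}[0, 1];
  return 0 unless $per_item;          # avoid division by zero
  return int($total / $per_item);     # always hand back a whole number
}

# The callback only needs the already unpacked part of the compound:
print get_size_checked({ hdr => { len => [42, 7] } }), "\n";   # 6
print get_size_checked({ hdr => { len => [42, 0] } }), "\n";   # 0
```

       It would be registered exactly like "get_size", by passing
       \&get_size_checked as the value of the "Dimension" tag.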
1039
1040       You can also pass custom arguments to the subroutines by using the
1041       "arg" method. This is similar to the functionality offered by the
1042       "Hooks" tag.
1043
1044       Of course, all of that works for the "pack" method as well.
1045
1046       However, the current implementation has at least one shortcoming,
1047       which is why it's experimental: The "Dimension" tag doesn't impact
1048       compound layout. This means that while you can alter the size of an
1049       array in the middle of a compound, the offset of the members after that
1050       array won't be impacted. I'd rather like to see the layout adapt
1051       dynamically, so this is what I'm hoping to implement in the future.
1052
1053   The Hooks Tag
1054       Hooks are a special kind of tag that can be extremely useful.
1055
1056       Using hooks, you can easily override the way "pack" and "unpack" handle
1057       data using your own subroutines.  If you define hooks for a certain
1058       data type, each time this data type is processed the corresponding hook
1059       will be called to allow you to modify that data.
1060
1061       Basic Hooks
1062
1063       Here's an example. Let's assume the following C code has been parsed:
1064
1065         typedef unsigned long u_32;
1066         typedef u_32          ProtoId;
1067         typedef ProtoId       MyProtoId;
1068
1069         struct MsgHeader {
1070           MyProtoId id;
1071           u_32      len;
1072         };
1073
1074         struct String {
1075           u_32 len;
1076           char buf[];
1077         };
1078
1079       You could now use the types above and, for example, unpack binary data
1080       representing a "MsgHeader" like this:
1081
1082         $msg_header = $c->unpack('MsgHeader', $data);
1083
1084       This would give you:
1085
1086         $msg_header = {
1087           'len' => 13,
1088           'id' => 42
1089         };
1090
1091       Instead of dealing with "ProtoId"s as integers, you would probably
1092       rather have them as clear text. You could provide subroutines to
1093       convert between clear text and integers:
1094
1095         %proto = (
1096           CATS      =>    1,
1097           DOGS      =>   42,
1098           HEDGEHOGS => 4711,
1099         );
1100
1101         %rproto = reverse %proto;
1102
1103         sub ProtoId_unpack {
1104           $rproto{$_[0]} || 'unknown protocol'
1105         }
1106
1107         sub ProtoId_pack {
1108           $proto{$_[0]} or die 'unknown protocol'
1109         }
1110
1111       You can now register these subroutines by attaching a "Hooks" tag to
1112       "ProtoId" using the "tag" method:
1113
1114         $c->tag('ProtoId', Hooks => { pack   => \&ProtoId_pack,
1115                                       unpack => \&ProtoId_unpack });
1116
1117       Doing exactly the same unpack on "MsgHeader" again would now return:
1118
1119         $msg_header = {
1120           'len' => 13,
1121           'id' => 'DOGS'
1122         };
1123
1124       Actually, if you don't need the reverse operation, you don't even have
1125       to register a "pack" hook. Or, even better, you can have a more
1126       intelligent "unpack" hook that creates a dual-typed variable:
1127
1128         use Scalar::Util qw(dualvar);
1129
1130         sub ProtoId_unpack2 {
1131           dualvar $_[0], $rproto{$_[0]} || 'unknown protocol'
1132         }
1133
1134         $c->tag('ProtoId', Hooks => { unpack => \&ProtoId_unpack2 });
1135
1136         $msg_header = $c->unpack('MsgHeader', $data);
1137
1138       Just as before, this would print
1139
1140         $msg_header = {
1141           'len' => 13,
1142           'id' => 'DOGS'
1143         };
1144
1145       but without requiring a "pack" hook for packing, at least as long as
1146       you keep the variable dual-typed.
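
       If dual-typed variables are new to you, this small standalone
       sketch shows what "dualvar" from Scalar::Util does. It reuses the
       lookup table from above and does not involve Convert::Binary::C.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

my %proto  = (CATS => 1, DOGS => 42, HEDGEHOGS => 4711);
my %rproto = reverse %proto;

# Numeric value 42, string value 'DOGS', both in one scalar.
my $id = dualvar 42, $rproto{42} || 'unknown protocol';

print "as string: $id\n";               # DOGS
print "as number: ", $id + 0, "\n";     # 42
```

       Because the numeric value is still present, the scalar can be
       handed back to "pack" without a conversion hook.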
1147
1148       Hooks are usually called with exactly one argument, which is the data
1149       that should be processed (see "Advanced Hooks" for details on how to
1150       customize hook arguments). They are called in scalar context and
1151       expected to return the processed data.
1152
1153       To get rid of registered hooks, you can either undefine only certain
1154       hooks
1155
1156         $c->tag('ProtoId', Hooks => { pack => undef });
1157
1158       or all hooks:
1159
1160         $c->tag('ProtoId', Hooks => undef);
1161
1162       Of course, hooks are not restricted to handling integer values.  You
1163       could just as well attach hooks for the "String" struct from the code
1164       above. A useful example would be to have these hooks:
1165
1166         sub string_unpack {
1167           my $s = shift;
1168           pack "c$s->{len}", @{$s->{buf}};
1169         }
1170
1171         sub string_pack {
1172           my $s = shift;
1173           return {
1174             len => length $s,
1175             buf => [ unpack 'c*', $s ],
1176           }
1177         }
1178
1179       (Don't be confused by the fact that the "unpack" hook uses "pack" and
1180       the "pack" hook uses "unpack".  And also see "Advanced Hooks" for a
1181       more clever approach.)
1182
1183       While you would normally get the following output when unpacking a
1184       "String"
1185
1186         $string = {
1187           'len' => 12,
1188           'buf' => [
1189             72,
1190             101,
1191             108,
1192             108,
1193             111,
1194             32,
1195             87,
1196             111,
1197             114,
1198             108,
1199             100,
1200             33
1201           ]
1202         };
1203
1204       you could just register the hooks using
1205
1206         $c->tag('String', Hooks => { pack   => \&string_pack,
1207                                      unpack => \&string_unpack });
1208
1209       and you would get a nice human-readable Perl string:
1210
1211         $string = 'Hello World!';
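
       The two hooks are plain Perl subroutines, so they can be tried out
       on their own before being registered. The following sketch simply
       feeds the result of "string_pack" back into "string_unpack":

```perl
use strict;
use warnings;

sub string_unpack {
  my $s = shift;
  pack "c$s->{len}", @{$s->{buf}};
}

sub string_pack {
  my $s = shift;
  return {
    len => length $s,
    buf => [ unpack 'c*', $s ],
  }
}

# Round trip: string -> { len, buf } -> string
my $struct = string_pack('Hello World!');
my $text   = string_unpack($struct);

print "len=$struct->{len} first=$struct->{buf}[0] text=$text\n";
```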
1212
1213       Packing a string turns out to be just as easy:
1214
1215         use Data::Hexdumper;
1216
1217         $data = $c->pack('String', 'Just another Perl hacker,');
1218
1219         print hexdump(data => $data);
1220
1221       This would print:
1222
1223           0x0000 : 00 00 00 19 4A 75 73 74 20 61 6E 6F 74 68 65 72 : ....Just.another
1224           0x0010 : 20 50 65 72 6C 20 68 61 63 6B 65 72 2C          : .Perl.hacker,
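
       Data::Hexdumper is not a core module. If it is not available, the
       layout can be inspected with core Perl only. The sketch below
       rebuilds the same bytes by hand, assuming a 4-byte big endian
       "len" as in the dump above, and prints them as hex:

```perl
use strict;
use warnings;

my $text = 'Just another Perl hacker,';

# 32-bit big endian length followed by the raw characters.
my $data = pack 'N a*', length($text), $text;

my $hex = join ' ', map { sprintf '%02X', $_ } unpack 'C*', $data;

print length($data), " bytes\n";   # 29 bytes
print "$hex\n";
```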
1225
1226       If you want to find out if or which hooks are registered for a certain
1227       type, you can also use the "tag" method:
1228
1229         $hooks = $c->tag('String', 'Hooks');
1230
1231       This would return:
1232
1233         $hooks = {
1234           'unpack' => \&string_unpack,
1235           'pack' => \&string_pack
1236         };
1237
1238       Advanced Hooks
1239
1240       It is also possible to combine hooks with the "Format" tag.  This
1241       can be useful if you know better than Convert::Binary::C how to
1242       interpret the binary data. In the previous section, we've handled this
1243       type
1244
1245         struct String {
1246           u_32 len;
1247           char buf[];
1248         };
1249
1250       with the following hooks:
1251
1252         sub string_unpack {
1253           my $s = shift;
1254           pack "c$s->{len}", @{$s->{buf}};
1255         }
1256
1257         sub string_pack {
1258           my $s = shift;
1259           return {
1260             len => length $s,
1261             buf => [ unpack 'c*', $s ],
1262           }
1263         }
1264
1265         $c->tag('String', Hooks => { pack   => \&string_pack,
1266                                      unpack => \&string_unpack });
1267
1268       As you can see in the hook code, "buf" is expected to be an array of
1269       characters. For the "unpack" case Convert::Binary::C first turns the
1270       binary data into a Perl array, and then the hook packs it back into a
1271       string. The intermediate array creation and destruction is completely
1272       useless.  Same thing, of course, for the "pack" case.
1273
1274       Here's a clever way to handle this. Just tag "buf" as binary
1275
1276         $c->tag('String.buf', Format => 'Binary');
1277
1278       and use the following hooks instead:
1279
1280         sub string_unpack2 {
1281           my $s = shift;
1282           substr $s->{buf}, 0, $s->{len};
1283         }
1284
1285         sub string_pack2 {
1286           my $s = shift;
1287           return {
1288             len => length $s,
1289             buf => $s,
1290           }
1291         }
1292
1293         $c->tag('String', Hooks => { pack   => \&string_pack2,
1294                                      unpack => \&string_unpack2 });
1295
1296       This will be exactly equivalent to the old code, but faster and
1297       probably even much easier to understand.
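
       The equivalence of the two "unpack" hooks can be checked outside
       the module. In this sketch the input hashes are built by hand, and
       the trailing NUL bytes in the binary buffer are a hypothetical
       stand-in for data beyond "len" that the hook has to cut off.

```perl
use strict;
use warnings;

sub string_unpack {
  my $s = shift;
  pack "c$s->{len}", @{$s->{buf}};
}

sub string_unpack2 {
  my $s = shift;
  substr $s->{buf}, 0, $s->{len};
}

my $text = 'Hello World!';

# What the hooks would see without and with Format => 'Binary'.
my $as_array  = { len => length $text, buf => [ unpack 'c*', $text ] };
my $as_binary = { len => length $text, buf => $text . "\0\0\0\0" };

my $from_array  = string_unpack($as_array);
my $from_binary = string_unpack2($as_binary);

print $from_array eq $from_binary ? "same\n" : "different\n";   # same
```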
1298
1299       But hooks are even more powerful. You can customize the arguments that
1300       are passed to your hooks and you can use "arg" to pass certain special
1301       arguments, such as the name of the type that is currently being
1302       processed by the hook.
1303
1304       The following example shows how it is easily possible to peek into the
1305       perl internals using hooks.
1306
1307         use Config;
1308
1309         $c = new Convert::Binary::C %CC, OrderMembers => 1;
1310         $c->Include(["$Config{archlib}/CORE", @{$c->Include}]);
1311         $c->parse(<<ENDC);
1312         #include "EXTERN.h"
1313         #include "perl.h"
1314         ENDC
1315
1316         $c->tag($_, Hooks => { unpack_ptr => [\&unpack_ptr,
1317                                               $c->arg(qw(SELF TYPE DATA))] })
1318             for qw( XPVAV XPVHV );
1319
1320       First, we add the perl core include path and parse perl.h. Then, we add
1321       an "unpack_ptr" hook for a couple of the internal data types.
1322
1323       The "unpack_ptr" and "pack_ptr" hooks are called whenever a pointer to
1324       a certain data structure is processed. This is by far the most
1325       experimental part of the hooks feature, as this includes any kind of
1326       pointer. There's no way for the hook to know the difference between a
1327       plain pointer, or a pointer to a pointer, or a pointer to an array
1328       (this is because the difference doesn't matter anywhere else in
1329       Convert::Binary::C).
1330
1331       But the hook above makes use of another very interesting feature: It
1332       uses "arg" to pass special arguments to the hook subroutine.  Usually,
1333       the hook subroutine is simply passed a single data argument.  But using
1334       the above definition, it'll get a reference to the calling object
1335       ("SELF"), the name of the type being processed ("TYPE") and the data
1336       ("DATA").
1337
1338       But what does our hook look like?
1339
1340         sub unpack_ptr {
1341           my($self, $type, $ptr) = @_;
1342           $ptr or return '<NULL>';
1343           my $size = $self->sizeof($type);
1344           $self->unpack($type, unpack("P$size", pack('I', $ptr)));
1345         }
1346
1347       As you can see, the hook is rather simple. First, it receives the
1348       arguments mentioned above. It performs a quick check if the pointer is
1349       "NULL" and shouldn't be processed any further. Next, it determines the
1350       size of the type being processed. And finally, it'll just use the "P"n
1351       unpack template to read from that memory location and recursively call
1352       "unpack" to unpack the type. (And yes, this may of course again call
1353       other hooks.)
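
       The hook above packs the address with the "I" template, which fits
       hosts where a pointer is as wide as an unsigned int. The following
       standalone sketch shows the same "P" mechanism on memory that is
       known to be valid (a live Perl string) and picks the integer
       template from %Config instead. The variable names are illustrative
       only.

```perl
use strict;
use warnings;
use Config;

# Pick an integer template that is as wide as a pointer on this perl.
my $int_tmpl = $Config{ptrsize} == $Config{uvsize}  ? 'J'
             : $Config{ptrsize} == $Config{intsize} ? 'I'
             : die "no integer template matches the pointer size\n";

my $buffer = 'Hello, pointer!';

# pack 'P' stores the address of the string buffer; read it as an integer.
my $ptr = unpack $int_tmpl, pack 'P', $buffer;

# "P<n>" reads <n> bytes back from that address.
my $copy = unpack 'P' . length($buffer), pack $int_tmpl, $ptr;

print "$copy\n";   # Hello, pointer!
```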
1354
1355       Now, let's test that:
1356
1357         my $ref = bless ["Boo!"], "Foo::Bar";
1358         my $ptr = hex(("$ref" =~ /\(0x([[:xdigit:]]+)\)$/)[0]);
1359
1360         print Dumper(unpack_ptr($c, 'AV', $ptr));
1361
1362       Just for the fun of it, we create a blessed array reference. But how do
1363       we get a pointer to the corresponding "AV"? This is rather easy, as the
1364       address of the "AV" is just the hex value that appears when using the
1365       array reference in string context. So we just grab that and turn it
1366       into decimal. All that's left to do is just call our hook, as it can
1367       already handle "AV" pointers. And this is what we get:
1368
1369         $VAR1 = {
1370           'sv_any' => {
1371             'xnv_u' => {
1372               'xnv_nv' => '0',
1373               'xgv_stash' => 0,
1374               'xpad_cop_seq' => {
1375                 'xlow' => 0,
1376                 'xhigh' => 0
1377               },
1378               'xbm_s' => {
1379                 'xbm_previous' => 0,
1380                 'xbm_flags' => 0,
1381                 'xbm_rare' => 0
1382               }
1383             },
1384             'xav_fill' => 2,
1385             'xav_max' => 7,
1386             'xiv_u' => {
1387               'xivu_iv' => 2,
1388               'xivu_uv' => 2,
1389               'xivu_p1' => 2,
1390               'xivu_i32' => 2,
1391               'xivu_namehek' => 2,
1392               'xivu_hv' => 2
1393             },
1394             'xmg_u' => {
1395               'xmg_magic' => 0,
1396               'xmg_ourstash' => 0
1397             },
1398             'xmg_stash' => 0
1399           },
1400           'sv_refcnt' => 1,
1401           'sv_flags' => 536870924,
1402           'sv_u' => {
1403             'svu_pv' => 142054140,
1404             'svu_iv' => 142054140,
1405             'svu_uv' => 142054140,
1406             'svu_rv' => 142054140,
1407             'svu_array' => 142054140,
1408             'svu_hash' => 142054140,
1409             'svu_gp' => 142054140
1410           }
1411         };
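
       The address extraction used above can be cross-checked with
       "refaddr" from Scalar::Util, which returns the same number without
       parsing the stringified reference (and keeps working if a class
       overloads stringification). A minimal sketch:

```perl
use strict;
use warnings;
no warnings 'portable';   # 64-bit addresses exceed 0xffffffff
use Scalar::Util qw(refaddr);

my $ref = bless ['Boo!'], 'Foo::Bar';

# Address parsed from the string form, e.g. "Foo::Bar=ARRAY(0x...)".
my $ptr = hex(("$ref" =~ /\(0x([[:xdigit:]]+)\)$/)[0]);

print $ptr == refaddr($ref) ? "addresses match\n" : "addresses differ\n";
```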
1412
1413       Even though it is rather easy to do such stuff using "unpack_ptr"
1414       hooks, you should really know what you're doing and do it with extreme
1415       care because of the limitations mentioned above. It's really easy to
1416       run into segmentation faults when you're dereferencing pointers that
1417       point to memory which you don't own.
1418
1419       Performance
1420
1421       Using hooks isn't free. In performance-critical applications you
1422       have to keep in mind that hooks are actually perl subroutines and that
1423       they are called once for every value of a registered type that is being
1424       packed or unpacked. If only about 10% of the values require hooks to be
1425       called, you'll hardly notice the difference (if your hooks are
1426       implemented efficiently, that is).  But if all values would require
1427       hooks to be called, that alone could easily make packing and unpacking
1428       very slow.
1429
1430   Tag Order
1431       Since it is possible to attach multiple tags to a single type, the
1432       order in which the tags are processed is important. Here's a small
1433       table that shows the processing order.
1434
1435         pack        unpack
1436         ---------------------
1437         Hooks       Format
1438         Format      ByteOrder
1439         ByteOrder   Hooks
1440
1441       As a general rule, the "Hooks" tag is always the first thing processed
1442       when packing data, and the last thing processed when unpacking data.
1443
1444       The "Format" and "ByteOrder" tags are mutually exclusive; when both
1445       are given, the "Format" tag wins.
1446

METHODS

1448   new
1449       "new"
1450       "new" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1451               The constructor is used to create a new Convert::Binary::C
1452               object.  You can simply use
1453
1454                 $c = new Convert::Binary::C;
1455
1456               without additional arguments to create an object, or you can
1457               optionally pass any arguments to the constructor that are
1458               described for the "configure" method.
1459
1460   configure
1461       "configure"
1462       "configure" OPTION
1463       "configure" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1464               This method can be used to configure an existing
1465               Convert::Binary::C object or to retrieve its current
1466               configuration.
1467
1468               To configure the object, the list of options consists of key
1469               and value pairs and must therefore contain an even number of
1470               elements. "configure" (and also "new" if used with
1471               configuration options) will throw an exception if you pass an
1472               odd number of elements. Configuration will normally look like
1473               this:
1474
1475                 $c->configure(ByteOrder => 'BigEndian', IntSize => 2);
1476
1477               To retrieve the current value of a configuration option, you
1478               must pass a single argument to "configure" that holds the name
1479               of the option, just like
1480
1481                 $order = $c->configure('ByteOrder');
1482
1483               If you want to get the values of all configuration options at
1484               once, you can call "configure" without any arguments and it
1485               will return a reference to a hash table that holds the whole
1486               object configuration. This can be conveniently used with the
1487               Data::Dumper module, for example:
1488
1489                 use Convert::Binary::C;
1490                 use Data::Dumper;
1491
1492                 $c = new Convert::Binary::C Define  => ['DEBUGGING', 'FOO=123'],
1493                                             Include => ['/usr/include'];
1494
1495                 print Dumper($c->configure);
1496
1497               Which will print something like this:
1498
1499                 $VAR1 = {
1500                   'Define' => [
1501                     'DEBUGGING',
1502                     'FOO=123'
1503                   ],
1504                   'StdCVersion' => 199901,
1505                   'ByteOrder' => 'LittleEndian',
1506                   'LongSize' => 4,
1507                   'IntSize' => 4,
1508                   'HostedC' => 1,
1509                   'ShortSize' => 2,
1510                   'HasMacroVAARGS' => 1,
1511                   'Assert' => [],
1512                   'UnsignedChars' => 0,
1513                   'DoubleSize' => 8,
1514                   'CharSize' => 1,
1515                   'EnumType' => 'Integer',
1516                   'PointerSize' => 4,
1517                   'EnumSize' => 4,
1518                   'DisabledKeywords' => [],
1519                   'FloatSize' => 4,
1520                   'Alignment' => 1,
1521                   'LongLongSize' => 8,
1522                   'LongDoubleSize' => 12,
1523                   'KeywordMap' => {},
1524                   'Include' => [
1525                     '/usr/include'
1526                   ],
1527                   'HasCPPComments' => 1,
1528                   'Bitfields' => {
1529                     'Engine' => 'Generic'
1530                   },
1531                   'UnsignedBitfields' => 0,
1532                   'Warnings' => 0,
1533                   'CompoundAlignment' => 1,
1534                   'OrderMembers' => 0
1535                 };
1536
1537               Since you may not always want to write a "configure" call when
1538               you only want to change a single configuration item, you can
1539               use any configuration option name as a method name, like:
1540
1541                 $c->ByteOrder('LittleEndian') if $c->IntSize < 4;
1542
1543               (Yes, the example doesn't make very much sense... ;-)
1544
1545               However, you should keep in mind that configuration methods
1546               that can take lists (namely "Include", "Define" and "Assert",
1547               but not "DisabledKeywords") may behave slightly differently from
1548               their "configure" equivalents.  If you pass these methods a
1549               single argument that is an array reference, the current list
1550               will be replaced by the new one, which is just the behaviour of
1551               the corresponding "configure" call.  So the following are
1552               equivalent:
1553
1554                 $c->configure(Define => ['foo', 'bar=123']);
1555                 $c->Define(['foo', 'bar=123']);
1556
1557               But if you pass a list of strings instead of an array reference
1558               (which cannot be done when using "configure"), the new list
1559               items are appended to the current list, so
1560
1561                 $c = new Convert::Binary::C Include => ['/include'];
1562                 $c->Include('/usr/include', '/usr/local/include');
1563                 print Dumper($c->Include);
1564
1565                 $c->Include(['/usr/local/include']);
1566                 print Dumper($c->Include);
1567
1568               will first print all three include paths, but finally only
1569               "/usr/local/include" will be configured:
1570
1571                 $VAR1 = [
1572                   '/include',
1573                   '/usr/include',
1574                   '/usr/local/include'
1575                 ];
1576                 $VAR1 = [
1577                   '/usr/local/include'
1578                 ];
1579
1580               Furthermore, configuration methods can be chained together, as
1581               they return a reference to their object if called as a set
1582               method. So, if you like, you can configure your object like
1583               this:
1584
1585                 $c = Convert::Binary::C->new(IntSize => 4)
1586                        ->Define(qw( __DEBUG__ DB_LEVEL=3 ))
1587                        ->ByteOrder('BigEndian');
1588
1589                 $c->configure(EnumType => 'Both', Alignment => 4)
1590                   ->Include('/usr/include', '/usr/local/include');
1591
1592               In the example above, "qw( ... )" is the word list quoting
1593               operator. It returns a list of all non-whitespace sequences,
1594               and is especially useful for configuring preprocessor defines
1595               or assertions. The following assignments are equivalent:
1596
1597                 @array = ('one', 'two', 'three');
1598                 @array = qw(one two three);
1599
1600               You can configure the following options. Unknown options, as
1601               well as invalid values for an option, will cause the object to
1602               throw exceptions.
1603
1604               "IntSize" => 0 | 1 | 2 | 4 | 8
1605                   Set the number of bytes that are occupied by an integer.
1606                   This is in most cases 2 or 4. If you set it to zero, the
1607                   size of an integer on the host system will be used. This is
1608                   also the default unless overridden by
1609                   "CBC_DEFAULT_INT_SIZE" at compile time.
1610
1611               "CharSize" => 0 | 1 | 2 | 4 | 8
1612                   Set the number of bytes that are occupied by a "char".
1613                   This rarely needs to be changed, except for some platforms
1614                   that don't care about bytes, for example DSPs.  If you set
1615                   this to zero, the size of a "char" on the host system will
1616                   be used. This is also the default unless overridden by
1617                   "CBC_DEFAULT_CHAR_SIZE" at compile time.
1618
1619               "ShortSize" => 0 | 1 | 2 | 4 | 8
1620                   Set the number of bytes that are occupied by a short
1621                   integer.  Although integers explicitly declared as "short"
1622                   should always be 16 bit, there are compilers that make a
1623                   short 8 bit wide. If you set it to zero, the size of a
1624                   short integer on the host system will be used. This is also
1625                   the default unless overridden by "CBC_DEFAULT_SHORT_SIZE"
1626                   at compile time.
1627
1628               "LongSize" => 0 | 1 | 2 | 4 | 8
1629                   Set the number of bytes that are occupied by a long
1630                   integer.  If set to zero, the size of a long integer on the
1631                   host system will be used. This is also the default unless
1632                   overridden by "CBC_DEFAULT_LONG_SIZE" at compile time.
1633
1634               "LongLongSize" => 0 | 1 | 2 | 4 | 8
1635                   Set the number of bytes that are occupied by a long long
1636                   integer. If set to zero, the size of a long long integer on
1637                   the host system, or 8, will be used. This is also the
1638                   default unless overridden by "CBC_DEFAULT_LONG_LONG_SIZE"
1639                   at compile time.
1640
1641               "FloatSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1642                   Set the number of bytes that are occupied by a single
1643                   precision floating point value.  If you set it to zero, the
1644                   size of a "float" on the host system will be used. This is
1645                   also the default unless overridden by
1646                   "CBC_DEFAULT_FLOAT_SIZE" at compile time.  For details on
1647                   floating point support, see "FLOATING POINT VALUES".
1648
1649               "DoubleSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1650                   Set the number of bytes that are occupied by a double
1651                   precision floating point value.  If you set it to zero, the
1652                   size of a "double" on the host system will be used. This is
1653                   also the default unless overridden by
1654                   "CBC_DEFAULT_DOUBLE_SIZE" at compile time.  For details on
1655                   floating point support, see "FLOATING POINT VALUES".
1656
1657               "LongDoubleSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1658                   Set the number of bytes that are occupied by a long double
1659                   floating point value.  If you set it to zero, the size of a
1660                   "long double" on the host system, or 12, will be used. This
1661                   is also the default unless overridden by
1662                   "CBC_DEFAULT_LONG_DOUBLE_SIZE" at compile time. For details
1663                   on floating point support, see "FLOATING POINT VALUES".
1664
1665               "PointerSize" => 0 | 1 | 2 | 4 | 8
1666                   Set the number of bytes that are occupied by a pointer.
1667                   This is in most cases 2 or 4. If you set it to zero, the
1668                   size of a pointer on the host system will be used. This is
1669                   also the default unless overridden by
1670                   "CBC_DEFAULT_PTR_SIZE" at compile time.
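
                       As a rough sketch of how the size options interact, the
                       following configures a hypothetical 16-bit target and
                       checks the resulting struct size. The chosen sizes and
                       the "sample" struct are assumptions made up for this
                       illustration; "Alignment" is set to 1 so that no padding
                       is involved.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Hypothetical 16-bit target: every size below is an assumption
# chosen for illustration, not a property of a real platform.
my $c = Convert::Binary::C->new(
  CharSize    => 1,
  ShortSize   => 2,
  IntSize     => 2,
  LongSize    => 4,
  PointerSize => 2,
  Alignment   => 1,   # no padding, so member sizes simply add up
);

$c->parse('struct sample { char c; short s; int i; long l; void *p; };');

# 1 + 2 + 2 + 4 + 2 bytes
my $size = $c->sizeof('sample');
print "sizeof(struct sample) = $size\n";
```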
1671
1672               "EnumSize" => -1 | 0 | 1 | 2 | 4 | 8
1673                   Set the number of bytes that are occupied by an enumeration
1674                   type.  On most systems, this is equal to the size of an
1675                   integer, which is also the default. However, for some
1676                   compilers, the size of an enumeration type depends on the
1677                   size occupied by the largest enumerator. So the size may
1678                   vary between 1 and 8. If you have
1679
1680                     enum foo {
1681                       ONE = 100, TWO = 200
1682                     };
1683
1684                   this will occupy one byte because the enum can be
1685                   represented as an unsigned one-byte value. However,
1686
1687                     enum foo {
1688                       ONE = -100, TWO = 200
1689                     };
1690
1691                   will occupy two bytes, because the -100 forces the type to
1692                   be signed, and 200 doesn't fit into a signed one-byte
1693                   value.  Therefore, the type used is a signed two-byte
1694                   value.  If this is the behaviour you need, set the EnumSize
1695                   to 0.
1696
1697                   Some compilers try to follow this strategy, but don't care
1698                   whether the enumeration has signed values or not. They
1699                   always declare an enum as signed. On such a compiler, given
1700
1701                     enum one { ONE = -100, TWO = 100 };
1702                     enum two { THREE = 100, FOUR = 200 };
1703
1704                   enum "one" will occupy only one byte, while enum "two" will
1705                   occupy two bytes, even though it could be represented by an
1706                   unsigned one-byte value. If this is the behaviour of your
1707                   compiler, set EnumSize to "-1".
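
                       The three strategies can be compared side by side with a
                       small sketch. The typedef names "positive" and "mixed"
                       and their enumerators are hypothetical and exist only
                       for this illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Two made-up enums: one with only positive values, one with mixed signs.
my $c = Convert::Binary::C->new->parse(
  'typedef enum { P_LO =  100, P_HI = 200 } positive;'
  . 'typedef enum { M_LO = -100, M_HI = 200 } mixed;'
);

my %size;
for my $mode (4, 0, -1) {
  $c->EnumSize($mode);
  # Record the size of both enums under the current strategy.
  $size{$mode} = [ map { $c->sizeof($_) } qw( positive mixed ) ];
  printf "EnumSize %2d: positive = %d, mixed = %d\n",
         $mode, @{ $size{$mode} };
}
```

                       With a fixed "EnumSize" both enums have the same size.
                       With 0, only "mixed" should need a second byte, and with
                       -1 both should, since 200 does not fit into a signed
                       one-byte value.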
1708
1709               "Alignment" => 0 | 1 | 2 | 4 | 8 | 16
1710                   Set the struct member alignment. This option controls where
1711                   padding bytes are inserted between struct members. It
1712                   globally sets the alignment for all structs/unions.
1713                   However, this can be overridden from within the source code
1714                   with the common "pack" pragma as explained in "Supported
1715                   pragma directives".  The default alignment is 1, which
1716                   means no padding bytes are inserted. A setting of 0 means
1717                   native alignment, i.e.  the alignment of the system that
1718                   Convert::Binary::C has been compiled on. You can determine
1719                   the native properties using the "native" function.
1720
1721                   The "Alignment" option is similar to the "-Zp[n]" option of
1722                   the Intel compiler. It globally specifies the maximum
1723                   boundary to which struct members are aligned. Consider the
1724                   following structure and the sizes of "char", "short",
1725                   "long" and "double" being 1, 2, 4 and 8, respectively.
1726
1727                     struct align {
1728                       char   a;
1729                       short  b, c;
1730                       long   d;
1731                       double e;
1732                     };
1733
1734                   With an alignment of 1 (the default), the struct members
1735                   would be packed tightly:
1736
1737                     0   1   2   3   4   5   6   7   8   9  10  11  12
1738                     +---+---+---+---+---+---+---+---+---+---+---+---+
1739                     | a |   b   |   c   |       d       |             ...
1740                     +---+---+---+---+---+---+---+---+---+---+---+---+
1741
1742                        12  13  14  15  16  17
1743                         +---+---+---+---+---+
1744                     ...     e               |
1745                         +---+---+---+---+---+
1746
1747                   With an alignment of 2, the struct members larger than one
1748                   byte would be aligned to 2-byte boundaries, which results
1749                   in a single padding byte between "a" and "b".
1750
1751                     0   1   2   3   4   5   6   7   8   9  10  11  12
1752                     +---+---+---+---+---+---+---+---+---+---+---+---+
1753                     | a | * |   b   |   c   |       d       |         ...
1754                     +---+---+---+---+---+---+---+---+---+---+---+---+
1755
1756                        12  13  14  15  16  17  18
1757                         +---+---+---+---+---+---+
1758                     ...         e               |
1759                         +---+---+---+---+---+---+
1760
1761                   With an alignment of 4, the struct members of size 2 would
1762                   be aligned to 2-byte boundaries and larger struct members
1763                   would be aligned to 4-byte boundaries:
1764
1765                     0   1   2   3   4   5   6   7   8   9  10  11  12
1766                     +---+---+---+---+---+---+---+---+---+---+---+---+
1767                     | a | * |   b   |   c   | * | * |       d       | ...
1768                     +---+---+---+---+---+---+---+---+---+---+---+---+
1769
1770                        12  13  14  15  16  17  18  19  20
1771                         +---+---+---+---+---+---+---+---+
1772                     ... |               e               |
1773                         +---+---+---+---+---+---+---+---+
1774
1775                   This layout of the struct members allows the compiler to
1776                   generate optimized code because aligned members can be
1777                   accessed more easily by the underlying architecture.
1778
1779                   Finally, setting the alignment to 8 will align "double"s to
1780                   8-byte boundaries:
1781
1782                     0   1   2   3   4   5   6   7   8   9  10  11  12
1783                     +---+---+---+---+---+---+---+---+---+---+---+---+
1784                     | a | * |   b   |   c   | * | * |       d       | ...
1785                     +---+---+---+---+---+---+---+---+---+---+---+---+
1786
1787                        12  13  14  15  16  17  18  19  20  21  22  23  24
1788                         +---+---+---+---+---+---+---+---+---+---+---+---+
1789                     ... | * | * | * | * |               e               |
1790                         +---+---+---+---+---+---+---+---+---+---+---+---+
1791
1792                   Further increasing the alignment does not alter the layout
1793                   of our structure, as only members larger than 8 bytes would
1794                   be affected.
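
                       The layouts drawn above can be reproduced with the
                       "offsetof" and "sizeof" methods. This is only a sketch:
                       the basic type sizes are pinned to the values assumed in
                       the text, and the %layout hash is a helper introduced
                       for this illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Pin the sizes to the ones assumed in the text (1, 2, 4 and 8 bytes).
my $c = Convert::Binary::C->new(
  CharSize => 1, ShortSize => 2, LongSize => 4, DoubleSize => 8,
);

$c->parse('struct align { char a; short b, c; long d; double e; };');

my %layout;
for my $alignment (1, 2, 4, 8, 16) {
  $c->Alignment($alignment);
  # Byte offset of every member, followed by the total struct size.
  my @offsets = map { $c->offsetof('align', $_) } qw( a b c d e );
  $layout{$alignment} = [ @offsets, $c->sizeof('align') ];
  printf "Alignment %2d: offsets = %s, size = %d\n",
         $alignment, join(',', @offsets), $layout{$alignment}[-1];
}
```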
1795
1796                   The alignment of a structure depends on its largest member
1797                   and on the setting of the "Alignment" option. With
1798                   "Alignment" set to 2, a structure holding a "long" would be
1799                   aligned to a 2-byte boundary, while a structure containing
1800                   only "char"s would have no alignment restrictions.
1801                   (Unfortunately, that's not the whole story. See the
1802                   "CompoundAlignment" option for details.)
1803
1804                   Here's another example. Assuming 8-byte alignment, the
1805                   following two structs will both have a size of 16 bytes:
1806
1807                     struct one {
1808                       char   c;
1809                       double d;
1810                     };
1811
1812                     struct two {
1813                       double d;
1814                       char   c;
1815                     };
1816
1817                   This is clear for "struct one", because the member "d" has
1818                   to be aligned to an 8-byte boundary, and thus 7 padding
1819                   bytes are inserted after "c". But for "struct two", the
1820                   padding bytes are inserted at the end of the structure,
1821                   which may not seem to make sense at first. However, it
1822                   makes perfect sense if you think about an array of "struct
1823                   two". Each "double" has to be aligned to an 8-byte
1824                   boundary, and thus each array element has to occupy
1825                   16 bytes. With that in mind, it would be strange if a single
1826                   "struct two" variable had a different size. And it
1827                   would make the widely used construct
1828
1829                     struct two array[] = { {1.0, 0}, {2.0, 1} };
1830                     int elements = sizeof(array) / sizeof(struct two);
1831
1832                   impossible.
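
                       A short sketch to check the sizes claimed for "struct
                       one" and "struct two". The sizes of "char" and "double"
                       are set explicitly here, which is an assumption added
                       for reproducibility.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
  CharSize => 1, DoubleSize => 8, Alignment => 8,
);

$c->parse('struct one { char c; double d; };'
        . 'struct two { double d; char c; };');

my ($one, $two) = map { $c->sizeof($_) } qw( one two );
my $offset_d    = $c->offsetof('one', 'd');   # padding precedes "d"

print "sizeof(one) = $one, sizeof(two) = $two, one.d at $offset_d\n";
```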
1833
1834                   The alignment behaviour described here seems to be common
1835                   to all compilers. However, not all compilers have an
1836                   option to configure their default alignment.
1837
1838               "CompoundAlignment" => 0 | 1 | 2 | 4 | 8 | 16
1839                   Usually, the alignment of a compound (i.e. a "struct" or a
1840                   "union") depends only on its largest member and on the
1841                   setting of the "Alignment" option. There are, however,
1842                   architectures and compilers where compounds can have
1843                   different alignment constraints.
1844
1845                   For most platforms and compilers, the alignment constraint
1846                   for compounds is 1 byte. That is, on most platforms
1847
1848                     struct onebyte {
1849                       char byte;
1850                     };
1851
1852                   will have an alignment of 1 and also a size of 1. But if
1853                   you take an ARM architecture, the above "struct onebyte"
1854                   will have an alignment of 4, and thus also a size of 4.
1855
1856                   You can configure this by setting "CompoundAlignment" to 4.
1857                   This will ensure that the alignment of compounds is always
1858                   4.
1859
1860                   Setting "CompoundAlignment" to 0 means native compound
1861                   alignment, i.e. the compound alignment of the system that
1862                   Convert::Binary::C has been compiled on. You can determine
1863                   the native properties using the "native" function.
1864
1865                   There are also compilers for certain platforms that allow
1866                   you to adjust the compound alignment. If you're not aware
1867                   of the fact that your compiler/architecture has a compound
1868                   alignment other than 1, strange things can happen. If, for
1869                   example, the compound alignment is 2 and you have something
1870                   like
1871
1872                     typedef unsigned char U8;
1873
1874                     struct msg_head {
1875                       U8 cmd;
1876                       struct {
1877                         U8 hi;
1878                         U8 low;
1879                       } crc16;
1880                       U8 len;
1881                     };
1882
1883                   there will be one padding byte inserted before the embedded
1884                   "crc16" struct and after the "len" member, which is most
1885                   probably not what was intended:
1886
1887                     0     1     2     3     4     5     6
1888                     +-----+-----+-----+-----+-----+-----+
1889                     | cmd |  *  | hi  | low | len |  *  |
1890                     +-----+-----+-----+-----+-----+-----+
1891
1892                   Note that both "#pragma pack" and the "Alignment" option
1893                   can override "CompoundAlignment". If you set
1894                   "CompoundAlignment" to 4, but "Alignment" to 2, compounds
1895                   will actually be aligned on 2-byte boundaries.
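
                       The effect on "struct msg_head" can be sketched as
                       follows. "Alignment" is raised to 4 here so that it does
                       not override the compound alignment; that value and the
                       %result hash are assumptions made for this illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Alignment must be at least as large as CompoundAlignment,
# otherwise it would cap the compound alignment.
my $c = Convert::Binary::C->new(CharSize => 1, Alignment => 4);

$c->parse(<<'ENDC');
typedef unsigned char U8;

struct msg_head {
  U8 cmd;
  struct {
    U8 hi;
    U8 low;
  } crc16;
  U8 len;
};
ENDC

my %result;
for my $compound (1, 2) {
  $c->CompoundAlignment($compound);
  $result{$compound} = {
    size  => $c->sizeof('msg_head'),
    crc16 => $c->offsetof('msg_head', 'crc16'),
    len   => $c->offsetof('msg_head', 'len'),
  };
  printf "CompoundAlignment %d: size = %d, crc16 at %d, len at %d\n",
         $compound, @{ $result{$compound} }{qw( size crc16 len )};
}
```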
1896
1897               "ByteOrder" => 'BigEndian' | 'LittleEndian'
1898                   Set the byte order for integers larger than a single byte.
1899                   Little endian (Intel, least significant byte first) and big
1900                   endian (Motorola, most significant byte first) byte order
1901                   are supported. The default byte order is the same as the
1902                   byte order of the host system unless overridden by
1903                   "CBC_DEFAULT_BYTEORDER" at compile time.
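
                       A minimal sketch of the difference between the two byte
                       orders. The "word" struct and the sample value are
                       assumptions chosen for this illustration; "IntSize" is
                       fixed at 4 so the output does not depend on the host.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize => 4)
                          ->parse('struct word { unsigned int value; };');

my %hex;
for my $order (qw( BigEndian LittleEndian )) {
  $c->ByteOrder($order);
  # Pack the same value and show the resulting bytes as hex digits.
  $hex{$order} = unpack 'H*', $c->pack('word', { value => 0x12345678 });
  print "$order: $hex{$order}\n";
}
```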
1904
1905               "EnumType" => 'Integer' | 'String' | 'Both'
1906                   This option controls the type that enumeration constants
1907                   will have in data structures returned by the "unpack"
1908                   method.  If you have the following definitions:
1909
1910                     typedef enum {
1911                       SUNDAY, MONDAY, TUESDAY, WEDNESDAY,
1912                       THURSDAY, FRIDAY, SATURDAY
1913                     } Weekday;
1914
1915                     typedef enum {
1916                       JANUARY, FEBRUARY, MARCH, APRIL, MAY, JUNE, JULY,
1917                       AUGUST, SEPTEMBER, OCTOBER, NOVEMBER, DECEMBER
1918                     } Month;
1919
1920                     typedef struct {
1921                       int     year;
1922                       Month   month;
1923                       int     day;
1924                       Weekday weekday;
1925                     } Date;
1926
1927                   and a byte string that holds a packed Date struct, then
1928                   you'll get the following results from a call to the
1929                   "unpack" method.
1930
1931                   "Integer"
1932                       Enumeration constants are returned as plain integers.
1933                       This is fast, but may not be very useful. It is also
1934                       the default.
1935
1936                         $date = {
1937                           'weekday' => 1,
1938                           'month' => 0,
1939                           'day' => 7,
1940                           'year' => 2002
1941                         };
1942
1943                   "String"
1944                       Enumeration constants are returned as strings. This
1945                       will create a string constant for every unpacked
1946                       enumeration constant and thus consumes more time and
1947                       memory. However, the result may be more useful.
1948
1949                         $date = {
1950                           'weekday' => 'MONDAY',
1951                           'month' => 'JANUARY',
1952                           'day' => 7,
1953                           'year' => 2002
1954                         };
1955
1956                   "Both"
1957                       Enumeration constants are returned as dual-typed
1958                       scalars (dualvars).  If evaluated in string context, the
1959                       enumeration constant will be a string, if evaluated in
1960                       numeric context, the enumeration constant will be an
1961                       integer.
1962
1963                         $date = $c->EnumType('Both')->unpack('Date', $binary);
1964
1965                         printf "Weekday = %s (%d)\n\n", $date->{weekday},
1966                                                         $date->{weekday};
1967
1968                         if ($date->{month} == 0) {
1969                           print "It's $date->{month}, happy new year!\n\n";
1970                         }
1971
1972                         print Dumper($date);
1973
1974                       This will print:
1975
1976                         Weekday = MONDAY (1)
1977
1978                         It's JANUARY, happy new year!
1979
1980                         $VAR1 = {
1981                           'weekday' => 'MONDAY',
1982                           'month' => 'JANUARY',
1983                           'day' => 7,
1984                           'year' => 2002
1985                         };
1986
1987               "DisabledKeywords" => [ KEYWORDS ]
1988                   This option allows you to selectively deactivate certain
1989                   keywords in the C parser. Some C compilers don't have the
1990                   complete ANSI keyword set, i.e. they don't recognize the
1991                   keywords "const" or "void", for example. If you do
1992
1993                     typedef int void;
1994
1995                   on such a compiler, this will usually be ok. But if you
1996                   parse this with an ANSI compiler, it will be a syntax
1997                   error. To parse the above code correctly, you have to
1998                   disable the "void" keyword in the Convert::Binary::C
1999                   parser:
2000
2001                     $c->DisabledKeywords([qw( void )]);
2002
2003                   By default, the Convert::Binary::C parser will recognize
2004                   the keywords "inline" and "restrict". If your compiler
2005                   doesn't have these new keywords, it usually doesn't matter.
2006                   Only if you're using the keywords as identifiers, like in
2007
2008                     typedef struct inline {
2009                       int a, b;
2010                     } restrict;
2011
2012                   you'll have to disable these ISO-C99 keywords:
2013
2014                     $c->DisabledKeywords([qw( inline restrict )]);
2015
2016                   The parser allows you to disable the following keywords:
2017
2018                     asm
2019                     auto
2020                     const
2021                     double
2022                     enum
2023                     extern
2024                     float
2025                     inline
2026                     long
2027                     register
2028                     restrict
2029                     short
2030                     signed
2031                     static
2032                     unsigned
2033                     void
2034                     volatile
2035
2036               "KeywordMap" => { KEYWORD => TOKEN, ... }
2037                   This option allows you to add new keywords to the parser.
2038                   These new keywords can either be mapped to existing tokens
2039                   or simply ignored. For example, recent versions of the GNU
2040                   compiler recognize the keywords "__signed__" and
2041                   "__extension__".  The first one obviously is a synonym for
2042                   "signed", while the second one is only a marker for a
2043                   language extension.
2044
2045                   Using the preprocessor, you could of course do the
2046                   following:
2047
2048                     $c->Define(qw( __signed__=signed __extension__= ));
2049
2050                   However, the preprocessor symbols could be undefined or
2051                   redefined in the code, and
2052
2053                     #ifdef __signed__
2054                     # undef __signed__
2055                     #endif
2056
2057                     typedef __extension__ __signed__ long long s_quad;
2058
2059                   would generate a parse error, because "__signed__" is an
2060                   unexpected identifier.
2061
2062                   Instead of utilizing the preprocessor, you'll have to
2063                   create mappings for the new keywords directly in the parser
2064                   using "KeywordMap". In the above example, you want to map
2065                   "__signed__" to the built-in C keyword "signed" and ignore
2066                   "__extension__". This could be done with the following
2067                   code:
2068
2069                     $c->KeywordMap({ __signed__    => 'signed',
2070                                      __extension__ => undef });
2071
2072                   You can specify any valid identifier as hash key, and
2073                   either a valid C keyword or "undef" as hash value.  Having
2074                   configured the object that way, you could parse even
2075
2076                     #ifdef __signed__
2077                     # undef __signed__
2078                     #endif
2079
2080                     typedef __extension__ __signed__ long long s_quad;
2081
2082                   without problems.
2083
2084                   Note that "KeywordMap" and "DisabledKeywords" work
2085                   together perfectly. You could, for example, disable the "signed"
2086                   keyword, but still have "__signed__" mapped to the original
2087                   "signed" token:
2088
2089                     $c->configure(DisabledKeywords => [ 'signed' ],
2090                                   KeywordMap       => { __signed__  => 'signed' });
2091
2092                   This would allow you to define
2093
2094                     typedef __signed__ long signed;
2095
2096                   which would normally be a syntax error because "signed"
2097                   cannot be used as an identifier.
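
                       Put together as a self-contained sketch, the
                       "__signed__" / "__extension__" mapping from above might
                       be exercised like this. "LongLongSize" is set explicitly
                       and the all-ones test pattern is an assumption added for
                       this illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
  LongLongSize => 8,
  KeywordMap   => { __signed__    => 'signed',   # synonym for "signed"
                    __extension__ => undef },    # silently ignored
);

$c->parse(<<'ENDC');
#ifdef __signed__
# undef __signed__
#endif

typedef __extension__ __signed__ long long s_quad;
ENDC

my $size  = $c->sizeof('s_quad');
# All bits set is -1 for a signed type, whatever the byte order.
my $value = $c->unpack('s_quad', "\xFF" x 8);

print "sizeof(s_quad) = $size, value = $value\n";
```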
2098
2099               "UnsignedChars" => 0 | 1
2100                   Use this boolean option if you want characters to be
2101                   unsigned if specified without an explicit "signed" or
2102                   "unsigned" type specifier.  By default, characters are
2103                   signed.
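
                       The effect is easiest to see when unpacking a byte with
                       the high bit set. In this sketch, the "byte" struct is a
                       hypothetical helper type.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(CharSize => 1)
                          ->parse('struct byte { char value; };');

# Plain "char" is signed by default.
my $signed = $c->unpack('byte', "\xFF")->{value};

# The same byte, with plain "char" treated as unsigned.
my $unsigned = $c->UnsignedChars(1)->unpack('byte', "\xFF")->{value};

print "signed: $signed, unsigned: $unsigned\n";
```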
2104
2105               "UnsignedBitfields" => 0 | 1
2106                   Use this boolean option if you want bitfields to be
2107                   unsigned if specified without an explicit "signed" or
2108                   "unsigned" type specifier.  By default, bitfields are
2109                   signed.
2110
2111               "Warnings" => 0 | 1
2112                   Use this boolean option if you want warnings to be issued
2113                   during the parsing of source code. Currently, warnings are
2114                   only reported by the preprocessor, so don't expect the
2115                   output to cover everything.
2116
2117                   By default, warnings are turned off and only errors will be
2118                   reported. However, even these errors are turned off if you
2119                   run without the "-w" flag.
2120
2121               "HasCPPComments" => 0 | 1
2122                   Use this option to turn C++ comments on or off. By default,
2123                   C++ comments are enabled. Disabling C++ comments may be
2124                   necessary if your code includes strange things like:
2125
2126                     one = 4 //* <- divide */ 4;
2127                     two = 2;
2128
2129                   With C++ comments, the above will be interpreted as
2130
2131                     one = 4
2132                     two = 2;
2133
2134                   which will obviously be a syntax error, but without C++
2135                   comments, it will be interpreted as
2136
2137                     one = 4 / 4;
2138                     two = 2;
2139
2140                   which is correct.
2141
2142               "HasMacroVAARGS" => 0 | 1
2143                   Use this option to turn the "__VA_ARGS__" macro expansion
2144                   on or off. If this is enabled (which is the default), you
2145                   can use variable length argument lists in your preprocessor
2146                   macros.
2147
2148                     #define DEBUG( ... )  fprintf( stderr, __VA_ARGS__ )
2149
2150                   There's normally no reason to turn that feature off.
2151
2152               "StdCVersion" => undef | INTEGER
2153                   Use this option to change the value of the preprocessor's
2154                   predefined "__STDC_VERSION__" macro. When set to "undef",
2155                   the macro will not be defined.
2156
2157               "HostedC" => undef | 0 | 1
2158                   Use this option to change the value of the preprocessor's
2159                   predefined "__STDC_HOSTED__" macro. When set to "undef",
2160                   the macro will not be defined.
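
                       One way to observe these predefined macros is to let
                       them select between two definitions. This sketch does so
                       for "StdCVersion"; the "probe" type is made up for the
                       illustration, and its size merely reveals which
                       preprocessor branch was taken.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Hypothetical probe: two elements if C99 or later is announced,
# one element otherwise.
my $code = <<'ENDC';
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
typedef struct { char c[2]; } probe;
#else
typedef struct { char c[1]; } probe;
#endif
ENDC

my %size;
for my $version (199901, undef) {
  my $c = Convert::Binary::C->new(CharSize    => 1,
                                  StdCVersion => $version);
  my $label = defined $version ? $version : 'undef';
  $size{$label} = $c->parse($code)->sizeof('probe');
  print "StdCVersion $label: sizeof(probe) = $size{$label}\n";
}
```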
2161
2162               "Include" => [ INCLUDES ]
2163                   Use this option to set the include path for the internal
2164                   preprocessor. The option value is a reference to an array
2165                   of strings, each string holding a directory that should be
2166                   searched for includes.
2167
2168               "Define" => [ DEFINES ]
2169                   Use this option to define symbols in the preprocessor.  The
2170                   some systems' include files.  The value is an array
2171                   Each string can be either just a symbol or an assignment to
2172                   a symbol. This is completely equivalent to what the "-D"
2173                   option does for most preprocessors.
2174
2175                   The following will define the symbol "FOO" and define "BAR"
2176                   to be 12345:
2177
2178                     $c->configure(Define => [qw( FOO BAR=12345 )]);
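
                       As a sketch of how such definitions reach the parsed
                       code, the following uses one symbol as a flag and one as
                       a value. The symbol names "WITH_CRC" and "BUFSIZE" and
                       the "packet" struct are assumptions invented for this
                       illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
  CharSize => 1,
  Define   => [qw( WITH_CRC BUFSIZE=32 )],
);

$c->parse(<<'ENDC');
struct packet {
  char payload[BUFSIZE];
#ifdef WITH_CRC
  char crc[2];
#endif
};
ENDC

# 32 payload bytes plus 2 CRC bytes
my $size = $c->sizeof('packet');
print "sizeof(struct packet) = $size\n";
```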
2179
2180               "Assert" => [ ASSERTIONS ]
2181                   Use this option to make assertions in the preprocessor.  If
2182                   you don't know what assertions are, don't be concerned,
2183                   since they're deprecated anyway. They are, however, used in
2184                   some system's include files.  The value is an array
2185                   reference, just like for the macro definitions. Only the
2186                   way the assertions are defined is a bit different and
2187                   mimics the way they are defined with the "#assert"
2188                   directive:
2189
2190                     $c->configure(Assert => ['foo(bar)']);
2191
2192               "OrderMembers" => 0 | 1
2193                   When using "unpack" on compounds and iterating over the
2194                   returned hash, the order of the compound members is
2195                   generally not preserved due to the nature of hash tables.
2196                   It is not even guaranteed that the order is the same
2197                   between different runs of the same program. This can be
2198                   very annoying if you simply use Data::Dumper to dump your
2199                   data structures and the compound members always show up in
2200                   a different order.
2201
2202                   By setting "OrderMembers" to a non-zero value, all hashes
2203                   returned by "unpack" are tied to a class that preserves the
2204                   order of the hash keys.  This way, all compound members
2205                   will be returned in the correct order just as they are
2206                   defined in your C code.
2207
2208                     use Convert::Binary::C;
2209                     use Data::Dumper;
2210
2211                     $c = Convert::Binary::C->new->parse(<<'ENDC');
2212                     struct test {
2213                       char one;
2214                       char two;
2215                       struct {
2216                         char never;
2217                         char change;
2218                         char this;
2219                         char order;
2220                       } three;
2221                       char four;
2222                     };
2223                     ENDC
2224
2225                     $data = "Convert";
2226
2227                     $u1 = $c->unpack('test', $data);
2228                     $c->OrderMembers(1);
2229                     $u2 = $c->unpack('test', $data);
2230
2231                     print Data::Dumper->Dump([$u1, $u2], [qw(u1 u2)]);
2232
2233                   This will print something like:
2234
2235                     $u1 = {
2236                       'three' => {
2237                         'change' => 118,
2238                         'order' => 114,
2239                         'this' => 101,
2240                         'never' => 110
2241                       },
2242                       'one' => 67,
2243                       'two' => 111,
2244                       'four' => 116
2245                     };
2246                     $u2 = {
2247                       'one' => 67,
2248                       'two' => 111,
2249                       'three' => {
2250                         'never' => 110,
2251                         'change' => 118,
2252                         'this' => 101,
2253                         'order' => 114
2254                       },
2255                       'four' => 116
2256                     };
2257
2258                   To be able to use this option, you have to install either
2259                   the Tie::Hash::Indexed or the Tie::IxHash module. If both
2260                   are installed, Convert::Binary::C will give preference to
2261                   Tie::Hash::Indexed because it's faster.
2262
2263                   When using this option, you should keep in mind that tied
2264                   hashes are significantly slower and consume more memory
2265                   than ordinary hashes, even when the class they're tied to
2266                   is implemented efficiently. So don't turn this option on if
2267                   you don't have to.
2268
2269                   You can also influence hash member ordering by using the
2270                   "CBC_ORDER_MEMBERS" environment variable.
2271
2272               "Bitfields" => { OPTION => VALUE, ... }
2273                   Use this option to select and configure a bitfield layout
2274                   engine. You can choose an engine by passing its
2275                   name to the "Engine" option, like:
2276
2277                     $c->configure(Bitfields => { Engine => 'Generic' });
2278
2279                   Each engine can have its own set of options, although
2280                   currently none of them does.
2281
2282                   You can choose between the following bitfield engines:
2283
2284                   "Generic"
2285                       This engine implements the behaviour of most UNIX C
2286                       compilers, including GCC. It does not handle packed
2287                       bitfields yet.
2288
2289                   "Microsoft"
2290                       This engine implements the behaviour of Microsoft's
2291                       "cl" compiler.  It should be fairly complete and can
2292                       handle packed bitfields.
2293
2294                   "Simple"
2295                       This engine is only used for testing the bitfield
2296                       infrastructure in Convert::Binary::C. There's usually
2297                       no reason to use it.
2298
2299               You can reconfigure all options even after you have parsed some
2300               code. The changes will be applied to the already parsed
2301               definitions. This works as long as array lengths are not
2302               affected by the changes. If you have Alignment and IntSize set
2303               to 4 and parse code like this
2304
2305                 typedef struct {
2306                   char abc;
2307                   int  day;
2308                 } foo;
2309
2310                 struct bar {
2311                   foo  zap[2*sizeof(foo)];
2312                 };
2313
2314               the array "zap" in "struct bar" will obviously have 16
2315               elements. If you then reconfigure the alignment to 1, the size
2316               of "foo" becomes 5 instead of 8. While the alignment is adjusted
2317               correctly, the number of elements in array "zap" will still be
2318               16 and will not be changed to 10.
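
                   The limitation above can be observed directly. The following
                   is a minimal sketch rather than an official example; it
                   reuses the C code shown above and relies only on the
                   "configure", "sizeof" and "typeof" methods:

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Parse with 4-byte alignment, so sizeof(foo) is 8 at parse time.
my $c = Convert::Binary::C->new(Alignment => 4, IntSize => 4)->parse(<<'ENDC');
typedef struct {
  char abc;
  int  day;
} foo;

struct bar {
  foo  zap[2*sizeof(foo)];
};
ENDC

my @before = ($c->sizeof('foo'), $c->typeof('bar.zap'));

# Reconfigure after parsing: the member layout is recomputed, but the
# array dimension that was evaluated at parse time stays as it was.
$c->configure(Alignment => 1);

my @after = ($c->sizeof('foo'), $c->typeof('bar.zap'));

print "before: sizeof(foo) = $before[0], zap is '$before[1]'\n";
print "after:  sizeof(foo) = $after[0], zap is '$after[1]'\n";
```

                   If the behaviour is as described, the size of "foo" drops
                   from 8 to 5 while the reported type of "zap" is unchanged.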
2319
2320   parse
2321       "parse" CODE
2322               Parses a string of valid C code. All enumeration, compound and
2323               type definitions are extracted. You can call the "parse" and
2324               "parse_file" methods as often as you like to add further
2325               definitions to the Convert::Binary::C object.
2326
2327               "parse" will throw an exception if an error occurs.  On
2328               success, the method returns a reference to its object.
2329
2330               See "Parsing C code" for an example.
2331
2332   parse_file
2333       "parse_file" FILE
2334               Parses a C source file. All enumeration, compound and type
2335               definitions are extracted. You can call the "parse" and
2336               "parse_file" methods as often as you like to add further
2337               definitions to the Convert::Binary::C object.
2338
2339               "parse_file" will search the include path given via the
2340               "Include" option for the file if it cannot find it in the
2341               current directory.
2342
2343               "parse_file" will throw an exception if an error occurs. On
2344               success, the method returns a reference to its object.
2345
2346               See "Parsing C code" for an example.
2347
2348               When calling "parse" or "parse_file" multiple times, you may
2349               use types previously defined, but you are not allowed to
2350               redefine types. The state of the preprocessor is also saved, so
2351               you may also use defines from a previous parse. This works only
2352               as long as the preprocessor is not reset. See "Preprocessor
2353               configuration" for details.
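
                   As a rough illustration of the points above, the sketch
                   below calls "parse" several times on one object and wraps a
                   deliberately broken snippet in "eval". The type names
                   ("u8", "pixel", "broken") are made up for this example:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;

# First call: define a typedef.
$c->parse('typedef unsigned char u8;');

# Second call: the typedef from the first call is still known.
$c->parse('struct pixel { u8 r, g, b; };');
my $size = $c->sizeof('pixel');

# parse() dies on errors, so guard calls on untrusted input with eval.
my $ok  = eval { $c->parse('struct broken { int x '); 1 };
my $err = $ok ? '' : $@;

print "sizeof(struct pixel) = $size\n";
print "third parse failed: $err" unless $ok;
```

                   Checking the return value of "eval" instead of testing $@
                   alone is a common Perl idiom and is not specific to this
                   module.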
2354
2355               When you're parsing C source files instead of C header files,
2356               note that local definitions are ignored. This means that type
2357               definitions hidden within functions will not be recognized by
2358               Convert::Binary::C. This is necessary because different
2359               functions (even different blocks within the same function) can
2360               define types with the same name:
2361
2362                 void my_func(int i)
2363                 {
2364                   if (i < 10)
2365                   {
2366                     enum digit { ONE, TWO, THREE } x = ONE;
2367                     printf("%d, %d\n", i, x);
2368                   }
2369                   else
2370                   {
2371                     enum digit { THREE, TWO, ONE } x = ONE;
2372                     printf("%d, %d\n", i, x);
2373                   }
2374                 }
2375
2376               The above is a valid piece of C code, but it's not possible for
2377               Convert::Binary::C to distinguish between the different
2378               definitions of "enum digit", as they're only defined locally
2379               within the corresponding block.
2380
2381   clean
2382       "clean" Clears all information that has been collected during previous
2383               calls to "parse" or "parse_file".  You can use this method if
2384               you want to parse some entirely different code, but with the
2385               same configuration.
2386
2387               The "clean" method returns a reference to its object.
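
                   A small sketch of what this means in practice, assuming a
                   made-up "struct rec": after "clean", the parsed type is
                   gone, the configuration is still in place, and the same
                   name can be parsed again with a different definition.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', ShortSize => 2);

$c->parse('struct rec { short id; };');
my $known_before = $c->def('rec');    # the struct is known here

# Drop all parsed definitions, keep the configuration.
$c->clean;
my $known_after = $c->def('rec');     # the name is unknown again

# The same name can now be used for an entirely different definition.
$c->parse('struct rec { short id; short flags; };');
my $size  = $c->sizeof('rec');
my $order = $c->ByteOrder;

printf "before clean: %s\n", defined $known_before ? "'$known_before'" : 'undef';
printf "after clean:  %s\n", defined $known_after  ? "'$known_after'"  : 'undef';
print  "new size: $size, byte order: $order\n";
```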
2388
2389   clone
2390       "clone" Returns an exact, independent copy of the object.
2391
2392                 $c = new Convert::Binary::C Include => ['/usr/include'];
2393                 $c->parse_file('definitions.c');
2394                 $clone = $c->clone;
2395
2396               The above code is mostly equivalent to the code below. The
2397               difference is that using "sourcify" and "parse" might alter
2398               the order of the parsed data, which would make methods such as
2399               "compound" return the definitions in a different order:
2400
2401                 $c = new Convert::Binary::C Include => ['/usr/include'];
2402                 $c->parse_file('definitions.c');
2403                 $clone = new Convert::Binary::C %{$c->configure};
2404                 $clone->parse($c->sourcify);
2405
2406               Using "clone" is just a lot faster.
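
                   The independence of the copy can be illustrated with a
                   short sketch. The struct names "a" and "b" are invented
                   for this example; changes made to the clone should not
                   show up in the original object:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c     = Convert::Binary::C->new(IntSize => 4)->parse('struct a { int x; };');
my $clone = $c->clone;

# Reconfigure and extend the clone only.
$clone->IntSize(2);
$clone->parse('struct b { int y[2]; };');

printf "original: sizeof(a) = %d, knows b: %s\n",
       $c->sizeof('a'), defined $c->def('b') ? 'yes' : 'no';
printf "clone:    sizeof(a) = %d, knows b: %s\n",
       $clone->sizeof('a'), defined $clone->def('b') ? 'yes' : 'no';
```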
2407
2408   def
2409       "def" NAME
2410       "def" TYPE
2411               If you need to know if a definition for a certain type name
2412               exists, use this method. You pass it the name of an enum,
2413               struct, union or typedef, and it will return a non-empty string
2414               being either "enum", "struct", "union", or "typedef" if there's
2415               a definition for the type in question, an empty string if
2416               there's no such definition, or "undef" if the name is
2417               completely unknown. If the type can be interpreted as a basic
2418               type, "basic" will be returned.
2419
2420               If you pass in a TYPE, the output will be slightly different.
2421               If the specified member exists, the "def" method will return
2422               "member". If the member doesn't exist, or if the type cannot
2423               have members, the empty string will be returned. Again, if the
2424               name of the type is completely unknown, "undef" will be
2425               returned. This may be useful if you want to check if a certain
2426               member exists within a compound, for example.
2427
2428                 use Convert::Binary::C;
2429
2430                 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2431
2432                 typedef struct __not  not;
2433                 typedef struct __not *ptr;
2434
2435                 struct foo {
2436                   enum bar *xxx;
2437                 };
2438
2439                 typedef int quad[4];
2440
2441                 ENDC
2442
2443                 for my $type (qw( not ptr foo bar xxx foo.xxx foo.abc xxx.yyy
2444                                   quad quad[3] quad[5] quad[-3] short[1] ),
2445                               'unsigned long')
2446                 {
2447                   my $def = $c->def($type);
2448                   printf "%-14s  =>  %s\n",
2449                           $type,     defined $def ? "'$def'" : 'undef';
2450                 }
2451
2452               The following would be returned by the "def" method:
2453
2454                 not             =>  ''
2455                 ptr             =>  'typedef'
2456                 foo             =>  'struct'
2457                 bar             =>  ''
2458                 xxx             =>  undef
2459                 foo.xxx         =>  'member'
2460                 foo.abc         =>  ''
2461                 xxx.yyy         =>  undef
2462                 quad            =>  'typedef'
2463                 quad[3]         =>  'member'
2464                 quad[5]         =>  'member'
2465                 quad[-3]        =>  'member'
2466                 short[1]        =>  undef
2467                 unsigned long   =>  'basic'
2468
2469               So, if "def" returns a non-empty string, you can safely use any
2470               other method with that type's name or with that member
2471               expression.
2472
2473               Concerning arrays, note that the index into an array doesn't
2474               need to be within the bounds of the array's definition, just
2475               like in C. In the above example, "quad[5]" and "quad[-3]" are
2476               valid members of the "quad" array, even though it is declared
2477               to have only four elements.
2478
2479               In cases where the typedef namespace overlaps with the
2480               namespace of enums/structs/unions, the "def" method will give
2481               preference to the typedef and will thus return the string
2482               "typedef". You could however force interpretation as an enum,
2483               struct or union by putting "enum", "struct" or "union" in front
2484               of the type's name.
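
                   The sketch below shows the namespace rule and a member
                   check built on "def". The types "item" and "point" and the
                   helper "has_member" are hypothetical and exist only for
                   this illustration:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct item { int id; };
typedef int item;          /* same name, but in the typedef namespace */

struct point { int x; int y; };
ENDC

# Without a prefix the typedef is preferred; a prefix forces the struct.
my $plain    = $c->def('item');
my $prefixed = $c->def('struct item');

# Hypothetical helper: true only if the member expression exists.
sub has_member {
  my ($cbc, $expr) = @_;
  my $def = $cbc->def($expr);
  return defined $def && $def eq 'member';
}

print "item        => '$plain'\n";
print "struct item => '$prefixed'\n";
printf "point.x exists: %s\n", has_member($c, 'point.x') ? 'yes' : 'no';
printf "point.z exists: %s\n", has_member($c, 'point.z') ? 'yes' : 'no';
```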
2485
2486   defined
2487       "defined" MACRO
2488               You can use the "defined" method to find out if a certain macro
2489               is defined, just like you would use the "defined" operator of
2490               the preprocessor. For example, the following code
2491
2492                 use Convert::Binary::C;
2493
2494                 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2495
2496                 #define ADD(a, b) ((a) + (b))
2497
2498                 #if 1
2499                 # define DEFINED
2500                 #else
2501                 # define UNDEFINED
2502                 #endif
2503
2504                 ENDC
2505
2506                 for my $macro (qw( ADD DEFINED UNDEFINED )) {
2507                   my $not = $c->defined($macro) ? '' : ' not';
2508                   print "Macro '$macro' is$not defined.\n";
2509                 }
2510
2511               would print:
2512
2513                 Macro 'ADD' is defined.
2514                 Macro 'DEFINED' is defined.
2515                 Macro 'UNDEFINED' is not defined.
2516
2517               You have to keep in mind that this works only as long as the
2518               preprocessor is not reset. See "Preprocessor configuration" for
2519               details.
2520
2521   pack
2522       "pack" TYPE
2523       "pack" TYPE, DATA
2524       "pack" TYPE, DATA, STRING
2525               Use this method to pack a complex data structure into a binary
2526               string according to a type definition that has been previously
2527               parsed. DATA must be a scalar matching the type definition. C
2528               structures and unions are represented by references to Perl
2529               hashes, C arrays by references to Perl arrays.
2530
2531                 use Convert::Binary::C;
2532                 use Data::Dumper;
2533                 use Data::Hexdumper;
2534
2535                 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2536                                             , LongSize  => 4
2537                                             , ShortSize => 2
2538                                             )
2539                                        ->parse(<<'ENDC');
2540                 struct test {
2541                   char    ary[3];
2542                   union {
2543                     short word[2];
2544                     long  quad;
2545                   }       uni;
2546                 };
2547                 ENDC
2548
2549               Hashes don't have to contain a key for each compound member and
2550               arrays may be truncated:
2551
2552                 $binary = $c->pack('test', { ary => [1, 2], uni => { quad => 42 } });
2553
2554               Elements not defined in the Perl data structure will be set to
2555               zero in the packed byte string. If you pass "undef" as the
2556               second parameter, or simply omit it, the whole string will be
2557               initialized with zero bytes. On success, the packed byte string
2558               is returned.
2559
2560                 print hexdump(data => $binary);
2561
2562               The above code would print:
2563
2564                   0x0000 : 01 02 00 00 00 00 2A                            : ......*
2565
2566               You could also use "unpack" and dump the data structure.
2567
2568                 $unpacked = $c->unpack('test', $binary);
2569                 print Data::Dumper->Dump([$unpacked], ['unpacked']);
2570
2571               This would print:
2572
2573                 $unpacked = {
2574                   'uni' => {
2575                     'word' => [
2576                       0,
2577                       42
2578                     ],
2579                     'quad' => 42
2580                   },
2581                   'ary' => [
2582                     1,
2583                     2,
2584                     0
2585                   ]
2586                 };
2587
2588               If TYPE refers to a compound object, you may pack any member of
2589               that compound object. Simply add a member expression to the
2590               type name, just as you would access the member in C:
2591
2592                 $array = $c->pack('test.ary', [1, 2, 3]);
2593                 print hexdump(data => $array);
2594
2595                 $value = $c->pack('test.uni.word[1]', 2);
2596                 print hexdump(data => $value);
2597
2598               This would give you:
2599
2600                   0x0000 : 01 02 03                                        : ...
2601                   0x0000 : 00 02                                           : ..
2602
2603               Call "pack" with the optional STRING argument if you want to
2604               use an existing binary string to insert the data.  If called in
2605               a void context, "pack" will directly modify the string you
2606               passed as the third argument.  Otherwise, a copy of the string
2607               is created, and "pack" will modify and return the copy, so the
2608               original string will remain unchanged.
2609
2610               The 3-argument version may be useful if you want to change only
2611               a few members of a complex data structure without having to
2612               "unpack" everything, change the members, and then "pack" again
2613               (which could waste lots of memory and CPU cycles). So, instead
2614               of doing something like
2615
2616                 $test = $c->unpack('test', $binary);
2617                 $test->{uni}{quad} = 4711;
2618                 $new = $c->pack('test', $test);
2619
2620               to change the "uni.quad" member of $binary, you could simply do
2621               either
2622
2623                 $new = $c->pack('test', { uni => { quad => 4711 } }, $binary);
2624
2625               or
2626
2627                 $c->pack('test', { uni => { quad => 4711 } }, $binary);
2628
2629               The latter would directly modify $binary.  Besides this
2630               code being a lot shorter (and perhaps even more readable), it
2631               can be significantly faster if you're dealing with really big
2632               data blocks.
2633
2634               If the length of the input string is less than the size
2635               required by the type, the string (or its copy) is extended and
2636               the extended part is initialized to zero.  If the length is
2637               more than the size required by the type, the string keeps its
2638               length, and a copy, if one is made, is just as long and
2639               preserves the trailing bytes.
2640
2641                 $too_short = pack "C*", (1 .. 4);
2642                 $too_long  = pack "C*", (1 .. 20);
2643
2644                 $c->pack('test', { uni => { quad => 0x4711 } }, $too_short);
2645                 print "too_short:\n", hexdump(data => $too_short);
2646
2647                 $copy = $c->pack('test', { uni => { quad => 0x4711 } }, $too_long);
2648                 print "\ncopy:\n", hexdump(data => $copy);
2649
2650               This would print:
2651
2652                 too_short:
2653                   0x0000 : 01 02 03 00 00 47 11                            : .....G.
2654
2655                 copy:
2656                   0x0000 : 01 02 03 00 00 47 11 08 09 0A 0B 0C 0D 0E 0F 10 : .....G..........
2657                   0x0010 : 11 12 13 14                                     : ....
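
                   If Data::Hexdumper is not available, the difference between
                   the void-context and the copying form of the 3-argument
                   call can also be checked with Perl's built-in "unpack".
                   This is only a sketch; it sets "Alignment" to 1 explicitly,
                   which is an assumption made here so that the layout matches
                   the 7-byte dumps shown above:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
                               , Alignment => 1
                               , LongSize  => 4
                               , ShortSize => 2
                               )
                          ->parse(<<'ENDC');
struct test {
  char    ary[3];
  union {
    short word[2];
    long  quad;
  }       uni;
};
ENDC

my $binary = $c->pack('test', { ary => [1, 2, 3], uni => { quad => 42 } });
my $before = unpack 'H*', $binary;

# Void context: $binary itself is modified, and only uni.quad changes.
$c->pack('test', { uni => { quad => 4711 } }, $binary);
my $after = unpack 'H*', $binary;

# Non-void context: $binary is left alone and a modified copy is returned.
my $copy = $c->pack('test', { ary => [9, 8, 7] }, $binary);

print "before: $before\n";
print "after:  $after\n";
print "copy:   ", unpack('H*', $copy), "\n";
```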
2658
2659   unpack
2660       "unpack" TYPE, STRING
2661               Use this method to unpack a binary string and create an
2662               arbitrarily complex Perl data structure based on a previously
2663               parsed type definition.
2664
2665                 use Convert::Binary::C;
2666                 use Data::Dumper;
2667
2668                 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2669                                             , LongSize  => 4
2670                                             , ShortSize => 2
2671                                             )
2672                                        ->parse( <<'ENDC' );
2673                 struct test {
2674                   char    ary[3];
2675                   union {
2676                     short word[2];
2677                     long *quad;
2678                   }       uni;
2679                 };
2680                 ENDC
2681
2682                 # Generate some binary dummy data
2683                 $binary = pack "C*", 1 .. $c->sizeof('test');
2684
2685               On failure, e.g. if the specified type cannot be found, the
2686               method will throw an exception. On success, a reference to a
2687               complex Perl data structure is returned, which can directly be
2688               dumped using the Data::Dumper module:
2689
2690                 $unpacked = $c->unpack('test', $binary);
2691                 print Dumper($unpacked);
2692
2693               This would print:
2694
2695                 $VAR1 = {
2696                   'uni' => {
2697                     'word' => [
2698                       1029,
2699                       1543
2700                     ],
2701                     'quad' => 67438087
2702                   },
2703                   'ary' => [
2704                     1,
2705                     2,
2706                     3
2707                   ]
2708                 };
2709
2710               If TYPE refers to a compound object, you may unpack any member
2711               of that compound object. Simply add a member expression to the
2712               type name, just as you would access the member in C:
2713
2714                 $binary2 = substr $binary, $c->offsetof('test', 'uni.word');
2715
2716                 $unpack1 = $unpacked->{uni}{word};
2717                 $unpack2 = $c->unpack('test.uni.word', $binary2);
2718
2719                 print Data::Dumper->Dump([$unpack1, $unpack2], [qw(unpack1 unpack2)]);
2720
2721               You will find that the output is exactly the same for both
2722               $unpack1 and $unpack2:
2723
2724                 $unpack1 = [
2725                   1029,
2726                   1543
2727                 ];
2728                 $unpack2 = [
2729                   1029,
2730                   1543
2731                 ];
2732
2733               When "unpack" is called in list context, it will unpack as many
2734               elements as possible from STRING, including zero if STRING is
2735               not long enough.
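
                   A minimal sketch of the list-context behaviour, using a
                   made-up "struct point" and dummy data built with Perl's
                   own "pack":

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
                               , ShortSize => 2
                               , Alignment => 1
                               )
                          ->parse('struct point { short x; short y; };');

# Six big-endian 16-bit values, i.e. room for three points.
my $data = pack 'n*', 1 .. 6;

# List context: one hash reference per complete element in the string.
my @points = $c->unpack('point', $data);

# Only three bytes: not even one complete point fits.
my @none = $c->unpack('point', substr($data, 0, 3));

printf "unpacked %d points, last one is (%d, %d)\n",
       scalar @points, $points[-1]{x}, $points[-1]{y};
printf "unpacked %d points from the short string\n", scalar @none;
```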
2736
2737   initializer
2738       "initializer" TYPE
2739       "initializer" TYPE, DATA
2740               The "initializer" method can be used to retrieve an initializer
2741               string for a certain TYPE.  This can be useful if you have to
2742               initialize only a couple of members in a huge compound type or
2743               if you simply want to generate initializers automatically.
2744
2745                 struct date {
2746                   unsigned year : 12;
2747                   unsigned month:  4;
2748                   unsigned day  :  5;
2749                   unsigned hour :  5;
2750                   unsigned min  :  6;
2751                 };
2752
2753                 typedef struct {
2754                   enum { DATE, QWORD } type;
2755                   short number;
2756                   union {
2757                     struct date   date;
2758                     unsigned long qword;
2759                   } choice;
2760                 } data;
2761
2762               Given the above code has been parsed
2763
2764                 $init = $c->initializer('data');
2765                 print "data x = $init;\n";
2766
2767               would print the following:
2768
2769                 data x = {
2770                       0,
2771                       0,
2772                       {
2773                               {
2774                                       0,
2775                                       0,
2776                                       0,
2777                                       0,
2778                                       0
2779                               }
2780                       }
2781                 };
2782
2783               You could directly put that into a C program, although it
2784               probably isn't very useful yet. It becomes more useful if you
2785               actually specify how you want to initialize the type:
2786
2787                 $data = {
2788                   type   => 'QWORD',
2789                   choice => {
2790                     date  => { month => 12, day => 24 },
2791                     qword => 4711,
2792                   },
2793                   stuff => 'yes?',
2794                 };
2795
2796                 $init = $c->initializer('data', $data);
2797                 print "data x = $init;\n";
2798
2799               This would print the following:
2800
2801                 data x = {
2802                       QWORD,
2803                       0,
2804                       {
2805                               {
2806                                       0,
2807                                       12,
2808                                       24,
2809                                       0,
2810                                       0
2811                               }
2812                       }
2813                 };
2814
2815               As only the first member of a "union" can be initialized,
2816               "choice.qword" is ignored. You will not be warned about the
2817               fact that you probably tried to initialize a member other than
2818               the first. This is considered a feature, because it allows you
2819               to use "unpack" to generate the initializer data:
2820
2821                 $data = $c->unpack('data', $binary);
2822                 $init = $c->initializer('data', $data);
2823
2824               Since "unpack" unpacks all union members, you would otherwise
2825               have to delete all but the first one before feeding the data
2826               into "initializer".
2827
2828               Also, "stuff" is ignored, because it actually isn't a member of
2829               "data". You won't be warned about that either.
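
                   The silent handling of unknown keys can be tried on a much
                   smaller type. In this sketch, "struct point" and the key
                   "z" are invented; the whitespace is stripped from the
                   result only to make it easy to compare:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse('struct point { int x; int y; };');

# 'x' is missing and 'z' is not a member of struct point.
my $init = $c->initializer('point', { y => 2, z => 3 });

print "struct point p = $init;\n";

# Collapse the layout whitespace for a compact comparison.
(my $flat = $init) =~ s/\s+//g;
print "flattened: $flat\n";
```

                   The missing member should be initialized with zero and the
                   unknown key should be dropped without a warning.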
2830
2831   sizeof
2832       "sizeof" TYPE
2833               This method will return the size of a C type in bytes.  If it
2834               cannot find the type, it will throw an exception.
2835
2836               If the type defines some kind of compound object, you may ask
2837               for the size of a member of that compound object:
2838
2839                 $size = $c->sizeof('test.uni.word[1]');
2840
2841               This would set $size to 2.
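
                   The following sketch queries several sizes at once and
                   guards against unknown types with "eval". It assumes the
                   "struct test" definition from the "pack" section and sets
                   "Alignment" to 1 explicitly; the name "no_such_type" is
                   deliberately undefined:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( Alignment => 1
                               , LongSize  => 4
                               , ShortSize => 2
                               )
                          ->parse(<<'ENDC');
struct test {
  char    ary[3];
  union {
    short word[2];
    long  quad;
  }       uni;
};
ENDC

my %size;
for my $type ('test', 'test.ary', 'test.uni', 'test.uni.word[1]', 'no_such_type') {
  # sizeof() dies for unknown types, so eval leaves undef in that case.
  $size{$type} = eval { $c->sizeof($type) };
  printf "%-18s => %s\n", $type,
         defined $size{$type} ? $size{$type} : 'exception';
}
```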
2842
2843   typeof
2844       "typeof" TYPE
2845               This method will return the type of a C member.  While this
2846               only makes sense for compound types, it's legal to also use it
2847               for non-compound types.  If it cannot find the type, it will
2848               throw an exception.
2849
2850               The "typeof" method can be used on any valid member, even on
2851               arrays or unnamed types. It will always return a string that
2852               holds the name (or in case of unnamed types only the class) of
2853               the type, optionally followed by a '*' character to indicate
2854               it's a pointer type, and optionally followed by one or more
2855               array dimensions if it's an array type. If the type is a
2856               bitfield, the type name is followed by a colon and the number
2857               of bits.
2858
2859                 struct test {
2860                   char    ary[3];
2861                   union {
2862                     short word[2];
2863                     long *quad;
2864                   }       uni;
2865                   struct {
2866                     unsigned short six:6;
2867                     unsigned short ten:10;
2868                   }       bits;
2869                 };
2870
2871               Given the above C code has been parsed, calls to "typeof" would
2872               return the following values:
2873
2874                 $c->typeof('test')             => 'struct test'
2875                 $c->typeof('test.ary')         => 'char [3]'
2876                 $c->typeof('test.uni')         => 'union'
2877                 $c->typeof('test.uni.quad')    => 'long *'
2878                 $c->typeof('test.uni.word')    => 'short [2]'
2879                 $c->typeof('test.uni.word[1]') => 'short'
2880                 $c->typeof('test.bits')        => 'struct'
2881                 $c->typeof('test.bits.six')    => 'unsigned short :6'
2882                 $c->typeof('test.bits.ten')    => 'unsigned short :10'
2883
2884   offsetof
2885       "offsetof" TYPE, MEMBER
2886               You can use "offsetof" just like the C macro of the same
2887               name. It will simply return the offset (in bytes) of
2888               MEMBER relative to TYPE.
2889
2890                 use Convert::Binary::C;
2891
2892                 $c = Convert::Binary::C->new( Alignment   => 4
2893                                             , LongSize    => 4
2894                                             , PointerSize => 4
2895                                             )
2896                                        ->parse(<<'ENDC');
2897                 typedef struct {
2898                   char abc;
2899                   long day;
2900                   int *ptr;
2901                 } week;
2902
2903                 struct test {
2904                   week zap[8];
2905                 };
2906                 ENDC
2907
2908                 @args = (
2909                   ['test',        'zap[5].day'  ],
2910                   ['test.zap[2]', 'day'         ],
2911                   ['test',        'zap[5].day+1'],
2912                   ['test',        'zap[-3].ptr' ],
2913                 );
2914
2915                 for (@args) {
2916                   my $offset = eval { $c->offsetof(@$_) };
2917                   printf "\$c->offsetof('%s', '%s') => $offset\n", @$_;
2918                 }
2919
2920               The final loop will print:
2921
2922                 $c->offsetof('test', 'zap[5].day') => 64
2923                 $c->offsetof('test.zap[2]', 'day') => 4
2924                 $c->offsetof('test', 'zap[5].day+1') => 65
2925                 $c->offsetof('test', 'zap[-3].ptr') => -28
2926
2927               · The first iteration simply shows that the offset of
2928                 "zap[5].day" is 64 relative to the beginning of "struct
2929                 test".
2930
2931               · You may additionally specify a member for the type passed as
2932                 the first argument, as shown in the second iteration.
2933
2934               · The offset suffix is also supported by "offsetof", so the
2935                 third iteration will correctly print 65.
2936
2937               · The last iteration demonstrates that even out-of-bounds array
2938                 indices are handled correctly, just as they are handled in C.
2939
2940               Unlike the C macro, "offsetof" also works on array types.
2941
2942                 $offset = $c->offsetof('test.zap', '[3].ptr+2');
2943                 print "offset = $offset";
2944
2945               This will print:
2946
2947                 offset = 46
2948
2949               If TYPE is a compound, MEMBER may optionally be prefixed with a
2950               dot, so
2951
2952                 printf "offset = %d\n", $c->offsetof('week', 'day');
2953                 printf "offset = %d\n", $c->offsetof('week', '.day');
2954
2955               are both equivalent and will print
2956
2957                 offset = 4
2958                 offset = 4
2959
2960               This allows one to
2961
2962               · use the C macro style, without a leading dot, and
2963
2964               · directly use the output of the "member" method, which
2965                 includes a leading dot for compound types, as input for the
2966                 MEMBER argument.
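
                   The second point can be sketched as a round trip: an
                   offset is turned into a member expression by "member" and
                   fed straight back into "offsetof". The code reuses the
                   "week" and "test" types from above; the chosen offsets are
                   arbitrary:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( Alignment   => 4
                               , LongSize    => 4
                               , PointerSize => 4
                               )
                          ->parse(<<'ENDC');
typedef struct {
  char abc;
  long day;
  int *ptr;
} week;

struct test {
  week zap[8];
};
ENDC

my %roundtrip;
for my $offset (0, 4, 24, 39, 69) {
  # member() returns an expression with a leading dot ...
  my $member = $c->member('test', $offset);
  # ... which offsetof() accepts unchanged as its MEMBER argument.
  $roundtrip{$offset} = $c->offsetof('test', $member);
  printf "%2d -> %-14s -> %d\n", $offset, $member, $roundtrip{$offset};
}
```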
2967
2968   member
2969       "member" TYPE
2970       "member" TYPE, OFFSET
2971               You can think of "member" as being the reverse of the
2972               "offsetof" method. However, as this is more complex, there's no
2973               equivalent to "member" in the C language.
2974
2975               Usually this method is used if you want to retrieve the name of
2976               the member that is located at a specific offset of a previously
2977               parsed type.
2978
2979                 use Convert::Binary::C;
2980
2981                 $c = Convert::Binary::C->new( Alignment   => 4
2982                                             , LongSize    => 4
2983                                             , PointerSize => 4
2984                                             )
2985                                        ->parse(<<'ENDC');
2986                 typedef struct {
2987                   char abc;
2988                   long day;
2989                   int *ptr;
2990                 } week;
2991
2992                 struct test {
2993                   week zap[8];
2994                 };
2995                 ENDC
2996
2997                 for my $offset (24, 39, 69, 99) {
2998                   print "\$c->member('test', $offset)";
2999                   my $member = eval { $c->member('test', $offset) };
3000                   print $@ ? "\n  exception: $@" : " => '$member'\n";
3001                 }
3002
3003               This will print:
3004
3005                 $c->member('test', 24) => '.zap[2].abc'
3006                 $c->member('test', 39) => '.zap[3]+3'
3007                 $c->member('test', 69) => '.zap[5].ptr+1'
3008                 $c->member('test', 99)
3009                   exception: Offset 99 out of range (0 <= offset < 96)
3010
3011               · The output of the first iteration is obvious. The member
3012                 "zap[2].abc" is located at offset 24 of "struct test".
3013
3014               · In the second iteration, the offset points into a region of
3015                 padding bytes and thus no member of "week" can be named.
3016                 Instead of a member name the offset relative to "zap[3]" is
3017                 appended.
3018
3019               · In the third iteration, the offset points to "zap[5].ptr".
3020                 However, "zap[5].ptr" is located at 68, not at 69, and thus
3021                 the remaining offset of 1 is also appended.
3022
3023               · The last iteration causes an exception because the offset of
3024                 99 is not valid for "struct test" since the size of "struct
3025                 test" is only 96. You might argue that this is inconsistent,
3026                 since "offsetof" can also handle out-of-bounds array members.
3027                 But as soon as you have more than one level of array nesting,
3028                 there's an infinite number of out-of-bounds members for a
3029                 single given offset, so it would be impossible to return a
3030                 list of all members.
3031
3032               You can additionally specify a member for the type passed as
3033               the first argument:
3034
3035                 $member = $c->member('test.zap[2]', 6);
3036                 print $member;
3037
3038               This will print:
3039
3040                 .day+2
3041
3042               Like "offsetof", "member" also works on array types:
3043
3044                 $member = $c->member('test.zap', 42);
3045                 print $member;
3046
3047               This will print:
3048
3049                 [3].day+2
3050
3051               While the behaviour for "struct"s is quite obvious, the
3052               behaviour for "union"s is rather tricky. As a single offset
3053               usually references more than one member of a union, there are
3054               certain rules that the algorithm uses for determining the best
3055               member.
3056
3057               · The first non-compound member that is referenced without an
3058                 offset has the highest priority.
3059
3060               · If no member is referenced without an offset, the first non-
3061                 compound member that is referenced with an offset will be
3062                 returned.
3063
3064               · Otherwise the first padding region that is encountered will
3065                 be taken.
3066
3067               As an example, given 4-byte-alignment and the union
3068
3069                 union choice {
3070                   struct {
3071                     char  color[2];
3072                     long  size;
3073                     char  taste;
3074                   }       apple;
3075                   char    grape[3];
3076                   struct {
3077                     long  weight;
3078                     short price[3];
3079                   }       melon;
3080                 };
3081
3082               the "member" method would return what is shown in the Member
3083               column of the following table. The Type column shows the result
3084               of the "typeof" method when passing the corresponding member.
3085
3086                 Offset   Member               Type
3087                 --------------------------------------
3088                    0     .apple.color[0]      'char'
3089                    1     .apple.color[1]      'char'
3090                    2     .grape[2]            'char'
3091                    3     .melon.weight+3      'long'
3092                    4     .apple.size          'long'
3093                    5     .apple.size+1        'long'
3094                    6     .melon.price[1]      'short'
3095                    7     .apple.size+3        'long'
3096                    8     .apple.taste         'char'
3097                    9     .melon.price[2]+1    'short'
3098                   10     .apple+10            'struct'
3099                   11     .apple+11            'struct'
3100
3101               It's like having a stack of all the union members and looking
3102               through the stack for the shiniest piece you can see. The
3103               beginning of a member (denoted by uppercase letters) is always
3104               shinier than the rest of a member, while padding regions
3105               (denoted by dashes) aren't shiny at all.
3106
3107                 Offset   0   1   2   3   4   5   6   7   8   9  10  11
3108                 -------------------------------------------------------
3109                 apple   (C) (C)  -   -  (S) (s)  s  (s) (T)  -  (-) (-)
3110                 grape    G   G  (G)
3111                 melon    W   w   w  (w)  P   p  (P)  p   P  (p)  -   -
3112
3113               If you look through that stack from top to bottom, you'll end
3114               up at the parenthesized members.
3115
3116               Alternatively, if you're not only interested in the best
3117               member, you can call "member" in list context, which makes it
3118               return all members referenced by the given offset.
3119
3120                 Offset   Member               Type
3121                 --------------------------------------
3122                    0     .apple.color[0]      'char'
3123                          .grape[0]            'char'
3124                          .melon.weight        'long'
3125                    1     .apple.color[1]      'char'
3126                          .grape[1]            'char'
3127                          .melon.weight+1      'long'
3128                    2     .grape[2]            'char'
3129                          .melon.weight+2      'long'
3130                          .apple+2             'struct'
3131                    3     .melon.weight+3      'long'
3132                          .apple+3             'struct'
3133                    4     .apple.size          'long'
3134                          .melon.price[0]      'short'
3135                    5     .apple.size+1        'long'
3136                          .melon.price[0]+1    'short'
3137                    6     .melon.price[1]      'short'
3138                          .apple.size+2        'long'
3139                    7     .apple.size+3        'long'
3140                          .melon.price[1]+1    'short'
3141                    8     .apple.taste         'char'
3142                          .melon.price[2]      'short'
3143                    9     .melon.price[2]+1    'short'
3144                          .apple+9             'struct'
3145                   10     .apple+10            'struct'
3146                          .melon+10            'struct'
3147                   11     .apple+11            'struct'
3148                          .melon+11            'struct'
3149
3150               The first member returned is always the best member. The other
3151               members are sorted according to the rules given above. This
3152               means that members referenced without an offset are followed by
3153               members referenced with an offset. Padding regions will be at
3154               the end.
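
                   The ordering rules can be modelled in a few lines of plain
                   Perl. The sketch below is not how Convert::Binary::C
                   implements "member"; it is a simplified model that uses a
                   hand-entered layout of "union choice" (4-byte alignment,
                   4-byte "long", 2-byte "short"). The names @leaves,
                   @compounds and members_at are made up for this
                   illustration.

```perl
use strict;
use warnings;

# Non-compound members in declaration order: [ name, offset, size ].
my @leaves = (
  [ '.apple.color[0]', 0, 1 ], [ '.apple.color[1]', 1, 1 ],
  [ '.apple.size',     4, 4 ], [ '.apple.taste',    8, 1 ],
  [ '.grape[0]',       0, 1 ], [ '.grape[1]',       1, 1 ],
  [ '.grape[2]',       2, 1 ],
  [ '.melon.weight',   0, 4 ], [ '.melon.price[0]', 4, 2 ],
  [ '.melon.price[1]', 6, 2 ], [ '.melon.price[2]', 8, 2 ],
);

# Compound members that may contain padding: [ name, offset, size ].
my @compounds = ( [ '.apple', 0, 12 ], [ '.melon', 0, 12 ] );

# Returns all members referenced by $offset, best member first:
# exact hits, then hits with a remaining offset, then padding regions.
sub members_at {
  my $offset = shift;
  my (@exact, @inside, @padding);

  for my $leaf (@leaves) {
    my ($name, $start, $size) = @$leaf;
    my $rel = $offset - $start;
    next if $rel < 0 || $rel >= $size;
    if ($rel == 0) { push @exact,  $name }
    else           { push @inside, "$name+$rel" }
  }

  for my $compound (@compounds) {
    my ($name, $start, $size) = @$compound;
    my $rel = $offset - $start;
    next if $rel < 0 || $rel >= $size;
    # The offset is padding if no leaf of this compound covers it.
    my $covered = grep {
      index($_->[0], "$name.") == 0
        && $offset >= $_->[1] && $offset < $_->[1] + $_->[2]
    } @leaves;
    push @padding, "$name+$rel" unless $covered;
  }

  return (@exact, @inside, @padding);
}

# Print the best member for every offset of the union.
printf "%2d  %s\n", $_, (members_at($_))[0] for 0 .. 11;
```

                   Within each of the three groups the declaration order is
                   kept, which is what makes the first element of the list the
                   best member.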
3155
3156               If OFFSET is not given in the method call, "member" will return
3157               a list of all possible members of TYPE.
3158
3159                 print "$_\n" for $c->member('choice');
3160
3161               This will print:
3162
3163                 .apple.color[0]
3164                 .apple.color[1]
3165                 .apple.size
3166                 .apple.taste
3167                 .grape[0]
3168                 .grape[1]
3169                 .grape[2]
3170                 .melon.weight
3171                 .melon.price[0]
3172                 .melon.price[1]
3173                 .melon.price[2]
3174
3175               In scalar context, the number of possible members is returned.
3176
3177   tag
3178       "tag" TYPE
3179       "tag" TYPE, TAG
3180       "tag" TYPE, TAG1 => VALUE1, TAG2 => VALUE2, ...
3181               The "tag" method can be used to tag properties to a TYPE. It's
3182               a bit like having "configure" for individual types.
3183
3184               See "USING TAGS" for an example.
3185
3186               Note that while you can tag whole types as well as compound
3187               members, it is not possible to tag array members, i.e. you
3188               cannot treat, for example, "a[1]" and "a[2]" differently.
3189
3190               Also note that in code like this
3191
3192                 struct test {
3193                   int a;
3194                   struct {
3195                     int x;
3196                   } b, c;
3197                 };
3198
3199               if you tag "test.b.x", this will also tag "test.c.x"
3200               implicitly.
3201
3202               It is also possible to tag basic types if you really want to do
3203               that, for example:
3204
3205                 $c->tag('int', Format => 'Binary');
3206
3207               To remove a tag from a type, you can either set that tag to
3208               "undef", for example
3209
3210                 $c->tag('test', Hooks => undef);
3211
3212               or use "untag".
3213
3214               To see if a tag is attached to a type or to get the value of a
3215               tag, pass only the type and tag name to "tag":
3216
3217                 $c->tag('test.a', Format => 'Binary');
3218
3219                 $hooks = $c->tag('test.a', 'Hooks');
3220                 $format = $c->tag('test.a', 'Format');
3221
3222               This will give you:
3223
3224                 $hooks = undef;
3225                 $format = 'Binary';
3226
3227               To see which tags are attached to a type, pass only the type.
3228               The "tag" method will now return a hash reference containing
3229               all tags attached to the type:
3230
3231                 $tags = $c->tag('test.a');
3232
3233               This will give you:
3234
3235                 $tags = {
3236                   'Format' => 'Binary'
3237                 };
3238
3239               "tag" will throw an exception if an error occurs.  If called as
3240               a 'set' method, it will return a reference to its object,
3241               allowing you to chain together consecutive method calls.
3242
3243               Note that when a compound is inlined, tags attached to the
3244               inlined compound are ignored, for example:
3245
3246                 $c->parse(<<ENDC);
3247                 struct header {
3248                   int id;
3249                   int len;
3250                   unsigned flags;
3251                 };
3252
3253                 struct message {
3254                   struct header;
3255                   short samples[32];
3256                 };
3257                 ENDC
3258
3259                 for my $type (qw( header message header.len )) {
3260                   $c->tag($type, Hooks => { unpack => sub { print "unpack: $type\n"; @_ } });
3261                 }
3262
3263                 for my $type (qw( header message )) {
3264                   print "[unpacking $type]\n";
3265                   $u = $c->unpack($type, $data);
3266                 }
3267
3268               This will print:
3269
3270                 [unpacking header]
3271                 unpack: header.len
3272                 unpack: header
3273                 [unpacking message]
3274                 unpack: header.len
3275                 unpack: message
3276
3277               As you can see from the above output, tags attached to members
3278               of inlined compounds ("header.len") are still handled.
3279
3280               The following tags can be configured:
3281
3282               "Format" => 'Binary' | 'String'
3283                   The "Format" tag allows you to control the way binary data
3284                   is converted by "pack" and "unpack".
3285
3286                   If you tag a "TYPE" as "Binary", it will not be converted
3287                   at all, i.e. it will be passed through as a binary string.
3288
3289                   If you tag it as "String", it will be treated like a null-
3290                   terminated C string, i.e. "unpack" will convert the C
3291                   string to a Perl string and vice versa.
3292
3293                   See "The Format Tag" for an example.
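
                       A short sketch of the "String" format follows. The
                       struct "label" and its member are hypothetical and only
                       serve this illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct label {
  char name[8];
};
ENDC

# Treat the char array as a null-terminated C string.
$c->tag('label.name', Format => 'String');

# unpack turns the C string into a Perl string ...
my $label = $c->unpack('label', "perl\0\0\0\0");

# ... and pack turns a Perl string back into the fixed-size array.
my $binary = $c->pack('label', { name => 'perl' });

printf "%s, %d bytes\n", $label->{name}, length $binary;
```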
3294
3295               "ByteOrder" => 'BigEndian' | 'LittleEndian'
3296                   The "ByteOrder" tag allows you to explicitly set the byte
3297                   order of a TYPE.
3298
3299                   See "The ByteOrder Tag" for an example.
3300
3301               "Dimension" => '*'
3302               "Dimension" => VALUE
3303               "Dimension" => MEMBER
3304               "Dimension" => SUB
3305               "Dimension" => [ SUB, ARGS ]
3306                   The "Dimension" tag allows you to alter the size of an
3307                   array dynamically.
3308
3309                   You can tag fixed size arrays as being flexible using '*'.
3310                   This is useful if you cannot use flexible array members in
3311                   your source code.
3312
3313                     $c->tag('type.array', Dimension => '*');
3314
3315                   You can also tag an array to have a fixed size different
3316                   from the one it was originally declared with.
3317
3318                     $c->tag('type.array', Dimension => 42);
3319
3320                   If the array is a member of a compound, you can also tag it
3321                   to have a size corresponding to the value of another
3322                   member in that compound.
3323
3324                     $c->tag('type.array', Dimension => 'count');
3325
3326                   Finally, you can specify a subroutine that is called when
3327                   the size of the array needs to be determined.
3328
3329                     $c->tag('type.array', Dimension => \&get_count);
3330
3331                   By default, and if the array is a compound member, that
3332                   subroutine will be passed a reference to the hash storing
3333                   the data for the compound.
3334
3335                   You can also instruct Convert::Binary::C to pass additional
3336                   arguments to the subroutine by passing an array reference
3337                   instead of the subroutine reference. This array contains
3338                   the subroutine reference as well as a list of arguments.
3339                   It is possible to define certain special arguments using
3340                   the "arg" method.
3341
3342                     $c->tag('type.array', Dimension => [\&get_count, $c->arg('SELF'), 42]);
3343
3344                   See "The Dimension Tag" for various examples.
3345
3346               "Hooks" => { HOOK => SUB, HOOK => [ SUB, ARGS ], ... }, ...
3347                   The "Hooks" tag allows you to register subroutines as
3348                   hooks.
3349
3350                   Hooks are called whenever a certain "TYPE" is packed or
3351                   unpacked. Hooks are currently considered an experimental
3352                   feature.
3353
3354                   "HOOK" can be one of the following:
3355
3356                     pack
3357                     unpack
3358                     pack_ptr
3359                     unpack_ptr
3360
3361                   "pack" and "unpack" hooks are called when processing their
3362                   "TYPE", while "pack_ptr" and "unpack_ptr" hooks are called
3363                   when processing pointers to their "TYPE".
3364
3365                   "SUB" is a reference to a subroutine that usually takes one
3366                   input argument, processes it and returns one output
3367                   argument.
3368
3369                   Alternatively, you can pass a custom list of arguments to
3370                   the hook by using an array reference instead of "SUB" that
3371                   holds the subroutine reference in the first element and the
3372                   arguments to be passed to the subroutine as the other
3373                   elements.  This way, you can even pass special arguments to
3374                   the hook using the "arg" method.
3375
3376                   Here are a few examples for registering hooks:
3377
3378                     $c->tag('ObjectType', Hooks => {
3379                               pack   => \&obj_pack,
3380                               unpack => \&obj_unpack
3381                             });
3382
3383                     $c->tag('ProtocolId', Hooks => {
3384                               unpack => sub { $protos[$_[0]] }
3385                             });
3386
3387                     $c->tag('ProtocolId', Hooks => {
3388                               unpack_ptr => [sub {
3389                                                sprintf "$_[0]:{0x%X}", $_[1]
3390                                              },
3391                                              $c->arg('TYPE', 'DATA')
3392                                             ],
3393                             });
3394
3395                   Note that the above example registers both an "unpack" hook
3396                   and an "unpack_ptr" hook for "ProtocolId" with two separate
3397                   calls to "tag". As long as you don't explicitly overwrite a
3398                   previously registered hook, it won't be modified or removed
3399                   by registering other hooks for the same "TYPE".
3400
3401                   To remove all registered hooks for a type, simply remove
3402                   the "Hooks" tag:
3403
3404                     $c->untag('ProtocolId', 'Hooks');
3405
3406                   To remove only a single hook, pass "undef" as "SUB" instead
3407                   of a subroutine reference:
3408
3409                     $c->tag('ObjectType', Hooks => { pack => undef });
3410
3411                   If all hooks are removed, the whole "Hooks" tag is removed.
3412
3413                   See "The Hooks Tag" for examples on how to use hooks.
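
                       As an end-to-end sketch, the following registers an
                       "unpack" hook and runs it. The type definitions and the
                       @protos lookup table are assumptions made for this
                       illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my @protos = qw( UNKNOWN TCP UDP );   # hypothetical lookup table

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef int ProtocolId;

struct packet {
  ProtocolId proto;
  int        length;
};
ENDC

# Translate the numeric id into a name whenever a ProtocolId is unpacked.
$c->tag('ProtocolId', Hooks => { unpack => sub { $protos[$_[0]] } });

# No pack hook is registered, so the id is packed as a plain integer.
my $binary = $c->pack('packet', { proto => 2, length => 64 });
my $packet = $c->unpack('packet', $binary);

print "$packet->{proto}, $packet->{length}\n";
```

                       Because only an "unpack" hook is registered, packing
                       still expects the numeric id. A matching "pack" hook
                       would be needed to accept the names as input.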
3414
3415   untag
3416       "untag" TYPE
3417       "untag" TYPE, TAG1, TAG2, ...
3418               Use the "untag" method to remove one, several, or all tags from a
3419               type. If you don't pass any tag names, all tags attached to the
3420               type will be removed. Otherwise only the listed tags will be
3421               removed.
3422
3423               See "USING TAGS" for an example.
3424
3425   arg
3426       "arg" 'ARG', ...
3427               Creates placeholders for special arguments to be passed to
3428               hooks or other subroutines. These arguments are currently:
3429
3430               "SELF"
3431                   A reference to the calling Convert::Binary::C object. This
3432                   may be useful if you need to work with the object inside
3433                   the subroutine.
3434
3435               "TYPE"
3436                   The name of the type that is currently being processed by
3437                   the hook.
3438
3439               "DATA"
3440                   The data argument that is passed to the subroutine.
3441
3442               "HOOK"
3443                   The kind of hook the subroutine has been called as,
3444                   for example "pack" or "unpack_ptr".
3445
3446               "arg" will return a placeholder for each argument it is being
3447               passed. Note that not all arguments may be supported depending
3448               on the context of the subroutine.
3449
3450   dependencies
3451       "dependencies"
3452               After some code has been parsed using either the "parse" or
3453               "parse_file" methods, the "dependencies" method can be used to
3454               retrieve information about all files that the object depends
3455               on, i.e. all files that have been parsed.
3456
3457               In scalar context, the method returns a hash reference.  Each
3458               key is the name of a file. The values are again hash
3459               references, each of which holds the size, modification time
3460               (mtime), and change time (ctime) of the file at the moment it
3461               was parsed.
3462
3463                 use Convert::Binary::C;
3464                 use Data::Dumper;
3465
3466                 #----------------------------------------------------------
3467                 # Create object, set include path, parse 'string.h' header
3468                 #----------------------------------------------------------
3469                 my $c = Convert::Binary::C->new
3470                         ->Include('/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include',
3471                                   '/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include-fixed',
3472                                   '/usr/include')
3473                         ->parse_file('string.h');
3474
3475                 #----------------------------------------------------------
3476                 # Get dependencies of the object, extract dependency files
3477                 #----------------------------------------------------------
3478                 my $depend = $c->dependencies;
3479                 my @files  = keys %$depend;
3480
3481                 #-----------------------------
3482                 # Dump dependencies and files
3483                 #-----------------------------
3484                 print Data::Dumper->Dump([$depend, \@files],
3485                                       [qw( depend   *files )]);
3486
3487               The above code would print something like this:
3488
3489                 $depend = {
3490                   '/usr/include/features.h' => {
3491                     'ctime' => 1300268052,
3492                     'mtime' => 1300267911,
3493                     'size' => 12511
3494                   },
3495                   '/usr/include/gnu/stubs-32.h' => {
3496                     'ctime' => 1300268051,
3497                     'mtime' => 1300268010,
3498                     'size' => 624
3499                   },
3500                   '/usr/include/sys/cdefs.h' => {
3501                     'ctime' => 1300268051,
3502                     'mtime' => 1300267957,
3503                     'size' => 13195
3504                   },
3505                   '/usr/include/gnu/stubs.h' => {
3506                     'ctime' => 1300268051,
3507                     'mtime' => 1300267911,
3508                     'size' => 315
3509                   },
3510                   '/usr/include/string.h' => {
3511                     'ctime' => 1300268052,
3512                     'mtime' => 1300267944,
3513                     'size' => 22572
3514                   },
3515                   '/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include/stddef.h' => {
3516                     'ctime' => 1300365679,
3517                     'mtime' => 1300363914,
3518                     'size' => 12542
3519                   },
3520                   '/usr/include/bits/wordsize.h' => {
3521                     'ctime' => 1300268051,
3522                     'mtime' => 1300267937,
3523                     'size' => 873
3524                   },
3525                   '/usr/include/xlocale.h' => {
3526                     'ctime' => 1300268051,
3527                     'mtime' => 1300267915,
3528                     'size' => 1764
3529                   }
3530                 };
3531                 @files = (
3532                   '/usr/include/features.h',
3533                   '/usr/include/gnu/stubs-32.h',
3534                   '/usr/include/sys/cdefs.h',
3535                   '/usr/include/gnu/stubs.h',
3536                   '/usr/include/string.h',
3537                   '/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include/stddef.h',
3538                   '/usr/include/bits/wordsize.h',
3539                   '/usr/include/xlocale.h'
3540                 );
3541
3542               In list context, the method returns the names of all files that
3543               have been parsed, i.e. the following lines are equivalent:
3544
3545                 @files = keys %{$c->dependencies};
3546                 @files = $c->dependencies;
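
                   One possible use of this information is to decide whether
                   previously parsed files have changed since. The helper
                   below is a sketch in plain Perl; deps_changed is a
                   hypothetical name and not part of Convert::Binary::C. It
                   only relies on the hash layout described above and on
                   Perl's built-in "stat".

```perl
use strict;
use warnings;

# Returns true if any file recorded in $depend is missing or differs in
# size, mtime or ctime from what was recorded when it was parsed.
sub deps_changed {
  my $depend = shift;

  for my $file (keys %$depend) {
    my @st = stat $file or return 1;   # file has vanished
    my $recorded = $depend->{$file};

    return 1 if $st[7]  != $recorded->{size}
             || $st[9]  != $recorded->{mtime}
             || $st[10] != $recorded->{ctime};
  }

  return 0;
}

# Typical call. Note the explicit scalar(): a subroutine argument list
# imposes list context, in which "dependencies" returns file names only.
#
#   my $stale = deps_changed(scalar $c->dependencies);
```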
3547
3548   sourcify
3549       "sourcify"
3550       "sourcify" CONFIG
3551               Returns a string that holds the C source code necessary to
3552               represent all parsed C data structures.
3553
3554                 use Convert::Binary::C;
3555
3556                 $c = new Convert::Binary::C;
3557                 $c->parse(<<'END');
3558
3559                 #define ADD(a, b) ((a) + (b))
3560                 #define NUMBER 42
3561
3562                 typedef struct _mytype mytype;
3563
3564                 struct _mytype {
3565                   union {
3566                     int         iCount;
3567                     enum count *pCount;
3568                   } counter;
3569                 #pragma pack( push, 1 )
3570                   struct {
3571                     char string[NUMBER];
3572                     int  array[NUMBER/sizeof(int)];
3573                   } storage;
3574                 #pragma pack( pop )
3575                   mytype *next;
3576                 };
3577
3578                 enum count { ZERO, ONE, TWO, THREE };
3579
3580                 END
3581
3582                 print $c->sourcify;
3583
3584               The above code would print something like this:
3585
3586                 /* typedef predeclarations */
3587
3588                 typedef struct _mytype mytype;
3589
3590                 /* defined enums */
3591
3592                 enum count
3593                 {
3594                       ZERO,
3595                       ONE,
3596                       TWO,
3597                       THREE
3598                 };
3599
3600
3601                 /* defined structs and unions */
3602
3603                 struct _mytype
3604                 {
3605                       union
3606                       {
3607                               int iCount;
3608                               enum count *pCount;
3609                       } counter;
3610                 #pragma pack(push, 1)
3611                       struct
3612                       {
3613                               char string[42];
3614                               int array[10];
3615                       } storage;
3616                 #pragma pack(pop)
3617                       mytype *next;
3618                 };
3619
3620               The purpose of the "sourcify" method is to enable some kind of
3621               platform-independent caching. The C code generated by
3622               "sourcify" can be parsed by any standard C compiler, as well as
3623               of course by the Convert::Binary::C parser. However, the code
3624               may be significantly shorter than the code that has originally
3625               been parsed.
3626
3627               When parsing a typical header file, it's easily possible that
3628               you need to open dozens of other files that are included from
3629               that file, and end up parsing several hundred kilobytes of C
3630               code. Since most of it is usually preprocessor directives,
3631               function prototypes and comments, the "sourcify" method
3632               strips this down to a few kilobytes. Saving the "sourcify"
3633               string and parsing it next time instead of the original code
3634               may be a lot faster.
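
                   A minimal sketch of such a cache is shown below. The
                   struct is made up for this illustration, and a temporary
                   file stands in for a real cache file.

```perl
use strict;
use warnings;
use Convert::Binary::C;
use File::Temp qw(tempfile);

# First run: parse the original code and save the condensed version.
my $orig = Convert::Binary::C->new->parse(<<'ENDC');
struct point {
  int x;
  int y;
};
ENDC

my ($fh, $cache) = tempfile(UNLINK => 1);   # stand-in for a real cache file
print {$fh} $orig->sourcify;
close $fh or die "Cannot write $cache: $!";

# Later run: read the cached code back and parse that instead.
open my $in, '<', $cache or die "Cannot read $cache: $!";
my $code = do { local $/; <$in> };
close $in;

my $cached = Convert::Binary::C->new->parse($code);

printf "sizeof(point): original %d, cached %d\n",
       $orig->sizeof('point'), $cached->sizeof('point');
```

                   The generated code holds only C definitions, so any
                   configuration (for example "Alignment" or "ByteOrder")
                   has to be applied again to the object that parses the
                   cache.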
3635
3636               The "sourcify" method takes a hash reference as an optional
3637               argument. It can be used to tweak the method's output.  The
3638               following options can be configured.
3639
3640               "Context" => 0 | 1
3641                   Turns preprocessor context information on or off. If this
3642                   is turned on, "sourcify" will insert "#line" preprocessor
3643                   directives in its output. So in the above example
3644
3645                     print $c->sourcify({ Context => 1 });
3646
3647                   would print:
3648
3649                     /* typedef predeclarations */
3650
3651                     typedef struct _mytype mytype;
3652
3653                     /* defined enums */
3654
3655
3656                     #line 21 "[buffer]"
3657                     enum count
3658                     {
3659                           ZERO,
3660                           ONE,
3661                           TWO,
3662                           THREE
3663                     };
3664
3665
3666                     /* defined structs and unions */
3667
3668
3669                     #line 7 "[buffer]"
3670                     struct _mytype
3671                     {
3672                     #line 8 "[buffer]"
3673                           union
3674                           {
3675                                   int iCount;
3676                                   enum count *pCount;
3677                           } counter;
3678                     #pragma pack(push, 1)
3679                     #line 13 "[buffer]"
3680                           struct
3681                           {
3682                                   char string[42];
3683                                   int array[10];
3684                           } storage;
3685                     #pragma pack(pop)
3686                           mytype *next;
3687                     };
3688
3689                   Note that "[buffer]" refers to the here-doc buffer when
3690                   using "parse".
3691
3692               "Defines" => 0 | 1
3693                   Turn this on if you want all the defined macros to be part
3694                   of the source code output. Given the example code above
3695
3696                     print $c->sourcify({ Defines => 1 });
3697
3698                   would print:
3699
3700                     /* typedef predeclarations */
3701
3702                     typedef struct _mytype mytype;
3703
3704                     /* defined enums */
3705
3706                     enum count
3707                     {
3708                           ZERO,
3709                           ONE,
3710                           TWO,
3711                           THREE
3712                     };
3713
3714
3715                     /* defined structs and unions */
3716
3717                     struct _mytype
3718                     {
3719                           union
3720                           {
3721                                   int iCount;
3722                                   enum count *pCount;
3723                           } counter;
3724                     #pragma pack(push, 1)
3725                           struct
3726                           {
3727                                   char string[42];
3728                                   int array[10];
3729                           } storage;
3730                     #pragma pack(pop)
3731                           mytype *next;
3732                     };
3733
3734                     /* preprocessor defines */
3735
3736                     #define ADD(a, b) ((a) + (b))
3737                     #define NUMBER 42
3738
3739                   The macro definitions always appear at the end of the
3740                   source code.  The order of the macro definitions is
3741                   undefined.
3742
3743       The following methods can be used to retrieve information about the
3744       definitions that have been parsed. The examples given in the
3745       description for "enum", "compound" and "typedef" all assume this piece
3746       of C code has been parsed:
3747
3748         #define ABC_SIZE 2
3749         #define MULTIPLY(x, y) ((x)*(y))
3750
3751         #ifdef ABC_SIZE
3752         # define DEFINED
3753         #else
3754         # define NOT_DEFINED
3755         #endif
3756
3757         typedef unsigned long U32;
3758         typedef void *any;
3759
3760         enum __socket_type
3761         {
3762           SOCK_STREAM    = 1,
3763           SOCK_DGRAM     = 2,
3764           SOCK_RAW       = 3,
3765           SOCK_RDM       = 4,
3766           SOCK_SEQPACKET = 5,
3767           SOCK_PACKET    = 10
3768         };
3769
3770         struct STRUCT_SV {
3771           void *sv_any;
3772           U32   sv_refcnt;
3773           U32   sv_flags;
3774         };
3775
3776         typedef union {
3777           int abc[ABC_SIZE];
3778           struct xxx {
3779             int a;
3780             int b;
3781           }   ab[3][4];
3782           any ptr;
3783         } test;
3784
3785   enum_names
3786       "enum_names"
3787               Returns a list of identifiers of all defined enumeration
3788               objects. Enumeration objects don't necessarily have an
3789               identifier, so something like
3790
3791                 enum { A, B, C };
3792
3793               will obviously not appear in the list returned by the
3794               "enum_names" method. Also, enumerations that are not defined
3795               within the source code - like in
3796
3797                 struct foo {
3798                   enum weekday *pWeekday;
3799                   unsigned long year;
3800                 };
3801
3802               where only a pointer to the "weekday" enumeration object is
3803               used - will not be returned, even though they have an
3804               identifier. So for the above two enumerations, "enum_names"
3805               will return an empty list:
3806
3807                 @names = $c->enum_names;
3808
3809               The only way to retrieve a list of all enumeration identifiers
3810               is to use the "enum" method without additional arguments. You
3811               can get a list of all enumeration objects that have an
3812               identifier by using
3813
3814                 @enums = map { $_->{identifier} || () } $c->enum;
3815
3816               but these may not have a definition. Thus, the two arrays would
3817               look like this:
3818
3819                 @names = ();
3820                 @enums = ('weekday');
3821
3822               The "def" method returns a true value for all identifiers
3823               returned by "enum_names".
3824
3825   enum
3826       enum
3827       "enum" LIST
3828               Returns a list of references to hashes containing detailed
3829               information about all enumerations that have been parsed.
3830
3831               If a list of enumeration identifiers is passed to the method,
3832               the returned list will only contain hash references for those
3833               enumerations. The enumeration identifiers may optionally be
3834               prefixed by "enum".
3835
3836               If an enumeration identifier cannot be found, the returned list
3837               will contain an undefined value at that position.
3838
               In scalar context, the number of enumerations will be returned
               unless exactly one argument is passed to the method. In that
               case, a hash reference holding information for the enumeration
               will be returned.
3843
3844               The list returned by the "enum" method looks similar to this:
3845
3846                 @enum = (
3847                   {
3848                     'enumerators' => {
3849                       'SOCK_STREAM' => 1,
3850                       'SOCK_RAW' => 3,
3851                       'SOCK_SEQPACKET' => 5,
3852                       'SOCK_RDM' => 4,
3853                       'SOCK_PACKET' => 10,
3854                       'SOCK_DGRAM' => 2
3855                     },
3856                     'identifier' => '__socket_type',
3857                     'context' => 'definitions.c(13)',
3858                     'size' => 4,
3859                     'sign' => 0
3860                   }
3861                 );
3862
3863               "identifier"
3864                   holds the enumeration identifier. This key is not present
3865                   if the enumeration has no identifier.
3866
3867               "context"
3868                   is the context in which the enumeration is defined. This is
3869                   the filename followed by the line number in parentheses.
3870
3871               "enumerators"
3872                   is a reference to a hash table that holds all enumerators
3873                   of the enumeration.
3874
3875               "sign"
3876                   is a boolean indicating if the enumeration is signed (i.e.
3877                   has negative values).
3878
3879               One useful application may be to create a hash table that holds
3880               all enumerators of all defined enumerations:
3881
3882                 %enum = map %{ $_->{enumerators} || {} }, $c->enum;
3883
3884               The %enum hash table would then be:
3885
3886                 %enum = (
3887                   'SOCK_STREAM' => 1,
3888                   'SOCK_RAW' => 3,
3889                   'SOCK_SEQPACKET' => 5,
3890                   'SOCK_RDM' => 4,
3891                   'SOCK_DGRAM' => 2,
3892                   'SOCK_PACKET' => 10
3893                 );
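
               A lookup in the opposite direction, from value to name, can be
               sketched by inverting that table. The data below is copied from
               the listing above, and %name_of is an illustrative name rather
               than anything provided by the module. The inversion is only
               meaningful while no two enumerators share a value, which is
               not guaranteed once several enumerations are merged.

```perl
use strict;
use warnings;

# Enumerator table copied from the example above; with a live object it
# would come from:  map %{ $_->{enumerators} || {} }, $c->enum
my %enum = (
  SOCK_STREAM    => 1,
  SOCK_DGRAM     => 2,
  SOCK_RAW       => 3,
  SOCK_RDM       => 4,
  SOCK_SEQPACKET => 5,
  SOCK_PACKET    => 10,
);

# Swap keys and values; later duplicates would silently overwrite earlier ones
my %name_of = reverse %enum;

print "$name_of{3}\n";   # prints SOCK_RAW
```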
3894
3895   compound_names
3896       "compound_names"
3897               Returns a list of identifiers of all structs and unions
3898               (compound data structures) that are defined in the parsed
3899               source code. Like enumerations, compounds don't need to have an
3900               identifier, nor do they need to be defined.
3901
               Again, the only way to retrieve information about all struct
               and union objects is to call the "compound" method without
               arguments. If you need a list of all struct and union
               identifiers, you can use:
3906
3907                 @compound = map { $_->{identifier} || () } $c->compound;
3908
3909               The "def" method returns a true value for all identifiers
3910               returned by "compound_names".
3911
3912               If you need the names of only the structs or only the unions,
3913               use the "struct_names" and "union_names" methods respectively.
3914
3915   compound
3916       "compound"
3917       "compound" LIST
3918               Returns a list of references to hashes containing detailed
3919               information about all compounds (structs and unions) that have
3920               been parsed.
3921
3922               If a list of struct/union identifiers is passed to the method,
3923               the returned list will only contain hash references for those
3924               compounds. The identifiers may optionally be prefixed by
3925               "struct" or "union", which limits the search to the specified
3926               kind of compound.
3927
3928               If an identifier cannot be found, the returned list will
3929               contain an undefined value at that position.
3930
               In scalar context, the number of compounds will be returned
               unless exactly one argument is passed to the method. In that
               case, a hash reference holding information for the compound
               will be returned.
3935
3936               The list returned by the "compound" method looks similar to
3937               this:
3938
3939                 @compound = (
3940                   {
3941                     'identifier' => 'STRUCT_SV',
3942                     'align' => 1,
3943                     'context' => 'definitions.c(23)',
3944                     'pack' => 0,
3945                     'type' => 'struct',
3946                     'declarations' => [
3947                       {
3948                         'declarators' => [
3949                           {
3950                             'declarator' => '*sv_any',
3951                             'size' => 4,
3952                             'offset' => 0
3953                           }
3954                         ],
3955                         'type' => 'void'
3956                       },
3957                       {
3958                         'declarators' => [
3959                           {
3960                             'declarator' => 'sv_refcnt',
3961                             'size' => 4,
3962                             'offset' => 4
3963                           }
3964                         ],
3965                         'type' => 'U32'
3966                       },
3967                       {
3968                         'declarators' => [
3969                           {
3970                             'declarator' => 'sv_flags',
3971                             'size' => 4,
3972                             'offset' => 8
3973                           }
3974                         ],
3975                         'type' => 'U32'
3976                       }
3977                     ],
3978                     'size' => 12
3979                   },
3980                   {
3981                     'identifier' => 'xxx',
3982                     'align' => 1,
3983                     'context' => 'definitions.c(31)',
3984                     'pack' => 0,
3985                     'type' => 'struct',
3986                     'declarations' => [
3987                       {
3988                         'declarators' => [
3989                           {
3990                             'declarator' => 'a',
3991                             'size' => 4,
3992                             'offset' => 0
3993                           }
3994                         ],
3995                         'type' => 'int'
3996                       },
3997                       {
3998                         'declarators' => [
3999                           {
4000                             'declarator' => 'b',
4001                             'size' => 4,
4002                             'offset' => 4
4003                           }
4004                         ],
4005                         'type' => 'int'
4006                       }
4007                     ],
4008                     'size' => 8
4009                   },
4010                   {
4011                     'align' => 1,
4012                     'context' => 'definitions.c(29)',
4013                     'pack' => 0,
4014                     'type' => 'union',
4015                     'declarations' => [
4016                       {
4017                         'declarators' => [
4018                           {
4019                             'declarator' => 'abc[2]',
4020                             'size' => 8,
4021                             'offset' => 0
4022                           }
4023                         ],
4024                         'type' => 'int'
4025                       },
4026                       {
4027                         'declarators' => [
4028                           {
4029                             'declarator' => 'ab[3][4]',
4030                             'size' => 96,
4031                             'offset' => 0
4032                           }
4033                         ],
4034                         'type' => 'struct xxx'
4035                       },
4036                       {
4037                         'declarators' => [
4038                           {
4039                             'declarator' => 'ptr',
4040                             'size' => 4,
4041                             'offset' => 0
4042                           }
4043                         ],
4044                         'type' => 'any'
4045                       }
4046                     ],
4047                     'size' => 96
4048                   }
4049                 );
4050
4051               "identifier"
4052                   holds the struct or union identifier. This key is not
4053                   present if the compound has no identifier.
4054
4055               "context"
4056                   is the context in which the struct or union is defined.
4057                   This is the filename followed by the line number in
4058                   parentheses.
4059
4060               "type"
4061                   is either 'struct' or 'union'.
4062
4063               "size"
4064                   is the size of the struct or union.
4065
4066               "align"
4067                   is the alignment of the struct or union.
4068
4069               "pack"
4070                   is the struct member alignment if the compound is packed,
4071                   or zero otherwise.
4072
4073               "declarations"
4074                   is an array of hash references describing each struct
4075                   declaration:
4076
4077                   "type"
4078                       is the type of the struct declaration. This may be a
4079                       string or a reference to a hash describing the type.
4080
4081                   "declarators"
4082                       is an array of hashes describing each declarator:
4083
4084                       "declarator"
4085                           is a string representation of the declarator.
4086
4087                       "offset"
4088                           is the offset of the struct member represented by
4089                           the current declarator relative to the beginning of
4090                           the struct or union.
4091
4092                       "size"
4093                           is the size occupied by the struct member
4094                           represented by the current declarator.
4095
4096               It may be useful to have separate lists for structs and unions.
4097               One way to retrieve such lists would be to use
4098
4099                 push @{$_->{type} eq 'union' ? \@unions : \@structs}, $_
4100                     for $c->compound;
4101
               However, using the "struct" and "union" methods is a lot
               simpler:
4104
4105                 @structs = $c->struct;
4106                 @unions  = $c->union;
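
               As a rough illustration of how the "declarations" structure
               described above can be traversed, the following sketch prints
               one row per struct member. The hash is an abbreviated copy of
               the STRUCT_SV entry from the listing; member_table() is a
               hypothetical helper and not part of Convert::Binary::C.

```perl
use strict;
use warnings;

# One entry as returned by $c->compound, abbreviated from the listing above
my $compound = {
  identifier   => 'STRUCT_SV',
  type         => 'struct',
  size         => 12,
  declarations => [
    { type => 'void', declarators => [ { declarator => '*sv_any',   offset => 0, size => 4 } ] },
    { type => 'U32',  declarators => [ { declarator => 'sv_refcnt', offset => 4, size => 4 } ] },
    { type => 'U32',  declarators => [ { declarator => 'sv_flags',  offset => 8, size => 4 } ] },
  ],
};

# Hypothetical helper: returns one "offset size type declarator" row per member
sub member_table {
  my ($comp) = @_;
  my @rows;
  for my $decl (@{ $comp->{declarations} }) {
    my $t = $decl->{type};
    # 'type' is a plain string or a hash reference; compound hashes carry a
    # 'type' key, enum hashes do not, so fall back to 'enum' for those
    my $type = ref $t ? ($t->{type} || 'enum') : $t;
    for my $d (@{ $decl->{declarators} || [] }) {
      push @rows, sprintf '%3d %3d  %s %s',
                          $d->{offset}, $d->{size}, $type, $d->{declarator};
    }
  }
  return @rows;
}

print "$_\n" for member_table($compound);
```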
4107
4108   struct_names
4109       "struct_names"
               Returns a list of all defined struct identifiers.  This is
               equivalent to calling "compound_names", except that it returns
               only struct identifiers and omits union identifiers.
4114
4115   struct
4116       "struct"
4117       "struct" LIST
4118               Like the "compound" method, but only allows for structs.
4119
4120   union_names
4121       "union_names"
               Returns a list of all defined union identifiers.  This is
               equivalent to calling "compound_names", except that it returns
               only union identifiers and omits struct identifiers.
4126
4127   union
4128       "union"
4129       "union" LIST
4130               Like the "compound" method, but only allows for unions.
4131
4132   typedef_names
4133       "typedef_names"
4134               Returns a list of all defined typedef identifiers. Typedefs
4135               that do not specify a type that you could actually work with
4136               will not be returned.
4137
4138               The "def" method returns a true value for all identifiers
4139               returned by "typedef_names".
4140
4141   typedef
4142       "typedef"
4143       "typedef" LIST
4144               Returns a list of references to hashes containing detailed
4145               information about all typedefs that have been parsed.
4146
4147               If a list of typedef identifiers is passed to the method, the
4148               returned list will only contain hash references for those
4149               typedefs.
4150
4151               If an identifier cannot be found, the returned list will
4152               contain an undefined value at that position.
4153
               In scalar context, the number of typedefs will be returned
               unless exactly one argument is passed to the method. In that
               case, a hash reference holding information for the typedef
               will be returned.
4158
4159               The list returned by the "typedef" method looks similar to
4160               this:
4161
4162                 @typedef = (
4163                   {
4164                     'declarator' => 'U32',
4165                     'type' => 'unsigned long'
4166                   },
4167                   {
4168                     'declarator' => '*any',
4169                     'type' => 'void'
4170                   },
4171                   {
4172                     'declarator' => 'test',
4173                     'type' => {
4174                       'align' => 1,
4175                       'context' => 'definitions.c(29)',
4176                       'pack' => 0,
4177                       'type' => 'union',
4178                       'declarations' => [
4179                         {
4180                           'declarators' => [
4181                             {
4182                               'declarator' => 'abc[2]',
4183                               'size' => 8,
4184                               'offset' => 0
4185                             }
4186                           ],
4187                           'type' => 'int'
4188                         },
4189                         {
4190                           'declarators' => [
4191                             {
4192                               'declarator' => 'ab[3][4]',
4193                               'size' => 96,
4194                               'offset' => 0
4195                             }
4196                           ],
4197                           'type' => 'struct xxx'
4198                         },
4199                         {
4200                           'declarators' => [
4201                             {
4202                               'declarator' => 'ptr',
4203                               'size' => 4,
4204                               'offset' => 0
4205                             }
4206                           ],
4207                           'type' => 'any'
4208                         }
4209                       ],
4210                       'size' => 96
4211                     }
4212                   }
4213                 );
4214
4215               "declarator"
4216                   is the type declarator.
4217
4218               "type"
                   is the type specification. This may be a string or a
                   reference to a hash describing the type.  See "enum" and
                   "compound" for a description of how to interpret this hash.
4222
4223   macro_names
4224       "macro_names"
4225               Returns a list of all defined macro names.
4226
4227               The list returned by the "macro_names" method looks similar to
4228               this:
4229
4230                 @macro_names = (
4231                   '__STDC_VERSION__',
4232                   '__STDC_HOSTED__',
4233                   'DEFINED',
4234                   'MULTIPLY',
4235                   'ABC_SIZE'
4236                 );
4237
4238               This works only as long as the preprocessor is not reset.  See
4239               "Preprocessor configuration" for details.
4240
4241   macro
4242       "macro"
4243       "macro" LIST
4244               Returns the definitions for all defined macros.
4245
4246               If a list of macro names is passed to the method, the returned
4247               list will only contain the definitions for those macros. For
4248               undefined macros, "undef" will be returned.
4249
4250               The list returned by the "macro" method looks similar to this:
4251
4252                 @macro = (
4253                   '__STDC_VERSION__ 199901L',
4254                   '__STDC_HOSTED__ 1',
4255                   'DEFINED',
4256                   'MULTIPLY(x, y) ((x)*(y))',
4257                   'ABC_SIZE 2'
4258                 );
4259
4260               This works only as long as the preprocessor is not reset.  See
4261               "Preprocessor configuration" for details.
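
               Since each definition is returned as a single string, splitting
               it into name, parameter list and body is left to the caller.
               The following is a minimal sketch of one way to do that, using
               the strings from the listing above. The regular expression and
               the layout of %macro are assumptions made for illustration; the
               pattern relies on a function-like macro having its opening
               parenthesis directly after the name.

```perl
use strict;
use warnings;

# Definition strings copied from the listing above; with a live object
# they would come from $c->macro
my @macro = (
  '__STDC_VERSION__ 199901L',
  '__STDC_HOSTED__ 1',
  'DEFINED',
  'MULTIPLY(x, y) ((x)*(y))',
  'ABC_SIZE 2',
);

my %macro;
for my $def (@macro) {
  next unless defined $def;   # undef marks a macro that is not defined
  # name, optional parameter list directly after the name, then the body
  my ($name, $params, $body) = $def =~ /^(\w+)(?:\(([^)]*)\))?\s*(.*)$/s
    or next;
  $macro{$name} = {
    params => defined $params ? [ split /\s*,\s*/, $params ] : undef,
    body   => $body,
  };
}

print "MULTIPLY takes @{ $macro{MULTIPLY}{params} }\n";   # MULTIPLY takes x y
print "ABC_SIZE is $macro{ABC_SIZE}{body}\n";             # ABC_SIZE is 2
```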
4262

FUNCTIONS

4264       You can alternatively call the following functions as methods on
4265       Convert::Binary::C objects.
4266
4267   feature
4268       "feature" STRING
4269               Checks if Convert::Binary::C was built with certain features.
4270               For example,
4271
4272                 print "debugging version"
4273                     if Convert::Binary::C::feature('debug');
4274
4275               will check if Convert::Binary::C was built with debugging
4276               support enabled. The "feature" function returns 1 if the
4277               feature is enabled, 0 if the feature is disabled, and "undef"
4278               if the feature is unknown. Currently the only features that can
4279               be checked are "ieeefp" and "debug".
4280
4281               You can enable or disable certain features at compile time of
4282               the module by using the
4283
4284                 perl Makefile.PL enable-feature disable-feature
4285
4286               syntax.
4287
4288   native
4289       "native"
4290       "native" STRING
4291               Returns the value of a property of the native system that
4292               Convert::Binary::C was built on. For example,
4293
4294                 $size = Convert::Binary::C::native('IntSize');
4295
4296               will fetch the size of an "int" on the native system.  The
4297               following properties can be queried:
4298
4299                 Alignment
4300                 ByteOrder
4301                 CharSize
4302                 CompoundAlignment
4303                 DoubleSize
4304                 EnumSize
4305                 FloatSize
4306                 HostedC
4307                 IntSize
4308                 LongDoubleSize
4309                 LongLongSize
4310                 LongSize
4311                 PointerSize
4312                 ShortSize
4313                 StdCVersion
4314                 UnsignedBitfields
4315                 UnsignedChars
4316
4317               You can also call "native" without arguments, in which case it
4318               will return a reference to a hash with all properties, like:
4319
4320                 $native = {
4321                   'StdCVersion' => undef,
4322                   'ByteOrder' => 'LittleEndian',
4323                   'LongSize' => 4,
4324                   'IntSize' => 4,
4325                   'HostedC' => 1,
4326                   'ShortSize' => 2,
4327                   'UnsignedChars' => 0,
4328                   'DoubleSize' => 8,
4329                   'CharSize' => 1,
4330                   'EnumSize' => 4,
4331                   'PointerSize' => 4,
4332                   'FloatSize' => 4,
4333                   'LongLongSize' => 8,
4334                   'Alignment' => 4,
4335                   'LongDoubleSize' => 12,
4336                   'UnsignedBitfields' => 0,
4337                   'CompoundAlignment' => 1
4338                 };
4339
               The contents of that hash are suitable for passing to the
               "configure" method.
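
               One possible use, sketched here under the assumption that the
               target system differs from the build host only in a few
               properties, is to start from the native settings and override
               the rest. The big-endian target with 4-byte pointers is
               hypothetical.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Start from the layout of the system the module was built on ...
my $c = Convert::Binary::C->new;
$c->configure(%{ Convert::Binary::C::native() });

# ... then override only what differs on the (hypothetical) target
$c->configure(ByteOrder => 'BigEndian', PointerSize => 4);

print $c->ByteOrder, "\n";
```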
4342

DEBUGGING

4344       Like perl itself, Convert::Binary::C can be compiled with debugging
       support that can then be selectively enabled at runtime. You can
       specify whether to build Convert::Binary::C with debugging support
       by explicitly giving an argument to Makefile.PL.  Use
4348
4349         perl Makefile.PL enable-debug
4350
4351       to enable debugging, or
4352
4353         perl Makefile.PL disable-debug
4354
4355       to disable debugging. The default will depend on how your perl binary
4356       was built. If it was built with "-DDEBUGGING", Convert::Binary::C will
4357       be built with debugging support, too.
4358
4359       Once you have built Convert::Binary::C with debugging support, you can
4360       use the following syntax to enable debug output. Instead of
4361
4362         use Convert::Binary::C;
4363
4364       you simply say
4365
4366         use Convert::Binary::C debug => 'all';
4367
       which will enable all debug output. However, I don't recommend
       enabling all debug output, because it can be a fairly large amount.
4370
4371   Debugging options
4372       Instead of saying "all", you can pass a string that consists of one or
4373       more of the following characters:
4374
4375         m   enable memory allocation tracing
4376         M   enable memory allocation & assertion tracing
4377
4378         h   enable hash table debugging
4379         H   enable hash table dumps
4380
4381         d   enable debug output from the XS module
4382         c   enable debug output from the ctlib
4383         t   enable debug output about type objects
4384
4385         l   enable debug output from the C lexer
4386         p   enable debug output from the C parser
4387         P   enable debug output from the C preprocessor
4388         r   enable debug output from the #pragma parser
4389
4390         y   enable debug output from yacc (bison)
4391
4392       So the following might give you a brief overview of what's going on
4393       inside Convert::Binary::C:
4394
4395         use Convert::Binary::C debug => 'dct';
4396
4397       When you want to debug memory allocation using
4398
4399         use Convert::Binary::C debug => 'm';
4400
4401       you can use the Perl script check_alloc.pl that resides in the
4402       ctlib/util/tool directory to extract statistics about memory usage and
4403       information about memory leaks from the resulting debug output.
4404
4405   Redirecting debug output
4406       By default, all debug output is written to "stderr". You can, however,
4407       redirect the debug output to a file with the "debugfile" option:
4408
4409         use Convert::Binary::C debug     => 'dcthHm',
4410                                debugfile => './debug.out';
4411
       If the file cannot be opened, you'll receive a warning and the output
       will go to "stderr" again.
4414
4415       Alternatively, you can use the environment variables "CBC_DEBUG_OPT"
4416       and "CBC_DEBUG_FILE" to turn on debug output.
4417
4418       If Convert::Binary::C is built without debugging support, passing the
4419       "debug" or "debugfile" options will cause a warning to be issued. The
4420       corresponding environment variables will simply be ignored.
4421

ENVIRONMENT

4423   "CBC_ORDER_MEMBERS"
4424       Setting this variable to a non-zero value will globally turn on hash
4425       key ordering for compound members. Have a look at the "OrderMembers"
4426       option for details.
4427
       Setting the variable to the name of a perl module will additionally
       tie the hashes to that module instead of the predefined modules for
       member ordering.
4431
4432   "CBC_DEBUG_OPT"
4433       If Convert::Binary::C is built with debugging support, you can use this
4434       variable to specify the debugging options.
4435
4436   "CBC_DEBUG_FILE"
4437       If Convert::Binary::C is built with debugging support, you can use this
4438       variable to redirect the debug output to a file.
4439
4440   "CBC_DISABLE_PARSER"
       This variable is intended purely for development. Setting it to a
       non-zero value disables the Convert::Binary::C parser, which means
       that no information is collected from the file or code that is
       parsed. However, the preprocessor will still run, which is useful for
       benchmarking the preprocessor.
4446

FLEXIBLE ARRAY MEMBERS AND INCOMPLETE TYPES

4448       Flexible array members are a feature introduced with ISO-C99.  It's a
4449       common problem that you have a variable length data field at the end of
4450       a structure, for example an array of characters at the end of a message
4451       struct. ISO-C99 allows you to write this as:
4452
4453         struct message {
4454           long header;
4455           char data[];
4456         };
4457
4458       The advantage is that you clearly indicate that the size of the
4459       appended data is variable, and that the "data" member doesn't
4460       contribute to the size of the "message" structure.
4461
       When packing or unpacking data, Convert::Binary::C deals with flexible
       array members as if their length were adjustable. For example, "unpack"
       will adapt the length of the array depending on the input string:
4465
4466         $msg1 = $c->unpack('message', 'abcdefg');
4467         $msg2 = $c->unpack('message', 'abcdefghijkl');
4468
4469       The following data is unpacked:
4470
4471         $msg1 = {
4472           'data' => [
4473             101,
4474             102,
4475             103
4476           ],
4477           'header' => 1633837924
4478         };
4479         $msg2 = {
4480           'data' => [
4481             101,
4482             102,
4483             103,
4484             104,
4485             105,
4486             106,
4487             107,
4488             108
4489           ],
4490           'header' => 1633837924
4491         };
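
       These numbers can be cross-checked with Perl's builtin "unpack",
       assuming (as the output above implies) a 4-byte big-endian "long" and
       1-byte "char" elements. The sketch does not use Convert::Binary::C at
       all; it only mirrors the layout of the "message" struct.

```perl
use strict;
use warnings;

for my $input ('abcdefg', 'abcdefghijkl') {
  # N  = 32-bit big-endian unsigned integer (the assumed 'long header')
  # C* = all remaining bytes as unsigned chars (the flexible 'data' member)
  my ($header, @data) = unpack 'N C*', $input;
  printf "header=%d data=[%s]\n", $header, join ',', @data;
}
```

       The first four bytes "abcd" are 0x61626364, which is 1633837924, and
       every remaining byte becomes one array element.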
4492
4493       Similarly, pack will adjust the length of the output string according
4494       to the data you feed in:
4495
4496         use Data::Hexdumper;
4497
4498         $msg = {
4499           header => 4711,
4500           data   => [0x10, 0x20, 0x30, 0x40, 0x77..0x88],
4501         };
4502
4503         $data = $c->pack('message', $msg);
4504
4505         print hexdump(data => $data);
4506
4507       This would print:
4508
4509           0x0000 : 00 00 12 67 10 20 30 40 77 78 79 7A 7B 7C 7D 7E : ...g..0@wxyz{|}~
4510           0x0010 : 7F 80 81 82 83 84 85 86 87 88                   : ..........
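
       The same bytes can be reproduced with Perl's builtin "pack", again
       assuming a 4-byte big-endian "long" header followed by one byte per
       array element. This is only a cross-check of the layout, not a
       replacement for the "pack" method.

```perl
use strict;
use warnings;

# N = 32-bit big-endian header, C* = one unsigned byte per data element
my $data = pack 'N C*', 4711, 0x10, 0x20, 0x30, 0x40, 0x77 .. 0x88;

# Print space-separated upper-case hex bytes, 16 per row
my @hex = map { sprintf '%02X', $_ } unpack 'C*', $data;
print join(' ', splice @hex, 0, 16), "\n" while @hex;

print length($data), " bytes\n";   # 26 bytes
```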
4511
4512       Incomplete types such as
4513
4514         typedef unsigned long array[];
4515
       are handled in exactly the same way. Thus, you can easily write
4517
4518         $array = $c->unpack('array', '?'x20);
4519
4520       which will unpack the following array:
4521
4522         $array = [
4523           1061109567,
4524           1061109567,
4525           1061109567,
4526           1061109567,
4527           1061109567
4528         ];
4529
4530       You can also alter the length of an array using the "Dimension" tag.
4531

FLOATING POINT VALUES

4533       When using Convert::Binary::C to handle floating point values, you have
4534       to be aware of some limitations.
4535
4536       You're usually safe if all your platforms are using the IEEE floating
4537       point format. During the Convert::Binary::C build process, the "ieeefp"
4538       feature will automatically be enabled if the host is using IEEE
4539       floating point. You can check for this feature at runtime using the
4540       "feature" function:
4541
4542         if (Convert::Binary::C::feature('ieeefp')) {
4543           # do something
4544         }
4545
4546       When IEEE floating point support is enabled, the module can also handle
4547       floating point values of a different byte order.
4548
4549       If your host platform is not using IEEE floating point, the "ieeefp"
4550       feature will be disabled. Convert::Binary::C will then be more
4551       restrictive, refusing to handle any non-native floating point values.
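
           A minimal sketch of guarding a floating point conversion with the
           feature check. The type "struct sample" and the chosen byte order
           are assumptions made up for this illustration:

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Hypothetical type; BigEndian may or may not be the host's byte order.
my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian');
$c->parse('struct sample { float value; };');

my $roundtrip;

if (Convert::Binary::C::feature('ieeefp')) {
  # With IEEE support, floats of either byte order can be converted.
  my $packed = $c->pack('sample', { value => 1.5 });
  $roundtrip = $c->unpack('sample', $packed)->{value};
  print "round trip gave $roundtrip\n";
}
else {
  # Without IEEE support, only native floating point values are handled.
  warn "no IEEE floating point support, skipping conversion\n";
}
```

           The value 1.5 is used because it is exactly representable as a
           float, so the round trip should not introduce rounding.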
4552
4553       However, Convert::Binary::C cannot detect the floating point format
4554       used by your target platform. It can only try to prevent problems in
4555       obvious cases. If you know your target platform has a completely
4556       different floating point format, don't use floating point conversion at
4557       all.
4558
4559       Whenever Convert::Binary::C detects that it cannot properly do floating
4560       point value conversion, it will issue a warning and will not attempt to
4561       convert the floating point value.
4562

BITFIELDS

4564       Bitfield support in Convert::Binary::C is currently in an experimental
4565       state. You are encouraged to test it, but you should not blindly rely
4566       on its results.
4567
4568       You are also encouraged to supply layout algorithms for compilers
4569       whose bitfield implementation is not handled correctly at the moment.
4570       Even better than a plain algorithm is, of course, a patch that adds a
4571       new bitfield layout engine.
4572
4573       While bitfields may not be handled correctly by the conversion routines
4574       yet, they are always parsed correctly. This means that you can reliably
4575       use the declarator fields as returned by the "struct" or "typedef"
4576       methods.  Given the following source
4577
4578         struct bitfield {
4579           int seven:7;
4580           int :1;
4581           int four:4, :0;
4582           int integer;
4583         };
4584
4585       a call to "struct" will return
4586
4587         @struct = (
4588           {
4589             'identifier' => 'bitfield',
4590             'align' => 1,
4591             'context' => 'bitfields.c(1)',
4592             'pack' => 0,
4593             'type' => 'struct',
4594             'declarations' => [
4595               {
4596                 'declarators' => [
4597                   {
4598                     'declarator' => 'seven:7'
4599                   }
4600                 ],
4601                 'type' => 'int'
4602               },
4603               {
4604                 'declarators' => [
4605                   {
4606                     'declarator' => ':1'
4607                   }
4608                 ],
4609                 'type' => 'int'
4610               },
4611               {
4612                 'declarators' => [
4613                   {
4614                     'declarator' => 'four:4'
4615                   },
4616                   {
4617                     'declarator' => ':0'
4618                   }
4619                 ],
4620                 'type' => 'int'
4621               },
4622               {
4623                 'declarators' => [
4624                   {
4625                     'declarator' => 'integer',
4626                     'size' => 4,
4627                     'offset' => 4
4628                   }
4629                 ],
4630                 'type' => 'int'
4631               }
4632             ],
4633             'size' => 8
4634           }
4635         );
4636
4637       No size/offset keys will currently be returned for bitfield entries.
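
           Because the declarator strings are reliable, the width of each
           bitfield can be recovered from them. The following sketch works on
           a copy of the declarations shown above (abbreviated to the keys it
           uses); the @fields list and its keys are names invented for this
           example:

```perl
use strict;
use warnings;

# Declarations copied from the "struct" output above.
my @declarations = (
  { type => 'int', declarators => [ { declarator => 'seven:7' } ] },
  { type => 'int', declarators => [ { declarator => ':1' } ] },
  { type => 'int', declarators => [ { declarator => 'four:4' },
                                    { declarator => ':0' } ] },
  { type => 'int', declarators => [ { declarator => 'integer',
                                      size => 4, offset => 4 } ] },
);

my @fields;

for my $decl (@declarations) {
  for my $d (@{ $decl->{declarators} }) {
    # Bitfield declarators look like "name:width"; the name is empty
    # for unnamed bitfields.
    if ($d->{declarator} =~ /^(\w*):(\d+)$/) {
      push @fields, { type => $decl->{type}, name => $1, bits => $2 };
    }
    else {
      push @fields, { type => $decl->{type}, name => $d->{declarator},
                      size => $d->{size}, offset => $d->{offset} };
    }
  }
}

for my $f (@fields) {
  if (exists $f->{bits}) {
    printf "%-10s %s bitfield, %d bit(s)\n",
           $f->{name} || '(unnamed)', $f->{type}, $f->{bits};
  }
  else {
    printf "%-10s %s, size %d at offset %d\n",
           $f->{name}, $f->{type}, $f->{size}, $f->{offset};
  }
}
```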
4638

MULTITHREADING

4640       Convert::Binary::C was designed to be thread-safe.
4641

INHERITANCE

4643       If you wish to derive a new class from Convert::Binary::C, this is
4644       relatively easy. Despite their XS implementation, Convert::Binary::C
4645       objects are actually blessed hash references.
4646
4647       The XS data is stored in a read-only hash value for the key that is the
4648       empty string. So it is safe to use any non-empty hash key when deriving
4649       your own class.  In addition, Convert::Binary::C does quite a lot of
4650       checks to detect corruption in the object hash.
4651
4652       If you store private data in the hash, you should override the "clone"
4653       method and provide the necessary code to clone your private data.
4654       You'll have to call "SUPER::clone", but this will only clone the
4655       Convert::Binary::C part of the object.
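
           A derived class following these rules could be sketched as shown
           below. The class name My::Converter and its private "notes" key
           are hypothetical, and the sketch assumes that the base class
           constructor accepts the usual configuration options:

```perl
package My::Converter;   # hypothetical class name

use strict;
use warnings;
use Convert::Binary::C;

our @ISA = ('Convert::Binary::C');

sub new {
  my ($class, %opt) = @_;
  my $notes = delete $opt{notes};          # hypothetical private option
  my $self  = $class->SUPER::new(%opt);    # the rest goes to the base class
  $self->{notes} = [ @{ $notes || [] } ];  # any non-empty key may be used
  return $self;
}

sub clone {
  my $self  = shift;
  my $clone = $self->SUPER::clone;            # clones only the base class part
  $clone->{notes} = [ @{ $self->{notes} } ];  # copy the private data ourselves
  return $clone;
}

package main;

my $orig = My::Converter->new(ByteOrder => 'BigEndian', notes => ['first']);
my $copy = $orig->clone;
push @{ $copy->{notes} }, 'second';

printf "original: %d note(s), clone: %d note(s)\n",
       scalar @{ $orig->{notes} }, scalar @{ $copy->{notes} };
```

           Copying the array in "clone" keeps the private data of the
           original and the clone independent of each other.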
4656
4657       For an example of a derived class, you can have a look at
4658       Convert::Binary::C::Cached.
4659

PORTABILITY

4661       Convert::Binary::C should build and run on most of the platforms that
4662       Perl runs on:
4663
4664       ·   Various Linux systems
4665
4666       ·   Various BSD systems
4667
4668       ·   HP-UX
4669
4670       ·   Compaq/HP Tru64 Unix
4671
4672       ·   Mac OS X
4673
4674       ·   Cygwin
4675
4676       ·   Windows 98/NT/2000/XP
4677
4678       Also, many architectures are supported:
4679
4680       ·   Various Intel Pentium and Itanium systems
4681
4682       ·   Various Alpha systems
4683
4684       ·   HP PA-RISC
4685
4686       ·   PowerPC
4687
4688       ·   StrongARM
4689
4690       The module should build with any perl binary from 5.004 up to the
4691       latest development version.
4692

COMPARISON WITH SIMILAR MODULES

4694       Most of the time when you're really looking for Convert::Binary::C
4695       you'll actually end up finding one of the following modules. Some of
4696       them have different goals, so it's probably worth pointing out the
4697       differences.
4698
4699   C::Include
4700       Like Convert::Binary::C, this module aims at doing conversion from and
4701       to binary data based on C types.  However, its configurability is very
4702       limited compared to Convert::Binary::C. Also, it does not parse all C
4703       code correctly. It's slower than Convert::Binary::C and doesn't have a
4704       preprocessor. On the plus side, it's written in pure Perl.
4705
4706   C::DynaLib::Struct
4707       This module doesn't allow you to reuse your C source code. One main
4708       goal of Convert::Binary::C was to avoid code duplication or, even
4709       worse, having to maintain different representations of your data
4710       structures.  Like C::Include, C::DynaLib::Struct is rather limited in
4711       its configurability.
4712
4713   Win32::API::Struct
4714       This module has a special purpose. It aims at building structs for
4715       interfacing Perl code with Windows API code.
4716

CREDITS

4718       · My love Jennifer for always being there, for filling my life with joy
4719         and last but not least for proofreading the documentation.
4720
4721       · Alain Barbet <alian@cpan.org> for testing and debugging support.
4722
4723       · Mitchell N. Charity for giving me pointers into various interesting
4724         directions.
4725
4726       · Alexis Denis for making me improve (externally) and simplify
4727         (internally) floating point support. He can also be blamed
4728         (indirectly) for the "initializer" method, as I need it in my effort
4729         to support bitfields some day.
4730
4731       · Michael J. Hohmann <mjh@scientist.de> for endless discussions on our
4732         way to and back home from work, and for making me think about
4733         supporting "pack" and "unpack" for compound members.
4734
4735       · Thorsten Jens <thojens@gmx.de> for testing the package on various
4736         platforms.
4737
4738       · Mark Overmeer <mark@overmeer.net> for suggesting the module name and
4739         giving invaluable feedback.
4740
4741       · Thomas Pornin <pornin@bolet.org> for his excellent "ucpp"
4742         preprocessor library.
4743
4744       · Marc Rosenthal for his suggestions and support.
4745
4746       · James Roskind, as his C parser was a great starting point to fix all
4747         the problems I had with my original parser based only on the ANSI
4748         ruleset.
4749
4750       · Gisbert W. Selke for spotting some interesting bugs and providing
4751         extensive reports.
4752
4753       · Steffen Zimmermann for a prolific discussion on the cloning
4754         algorithm.
4755

MAILING LIST

4757       There's also a mailing list that you can join:
4758
4759         convert-binary-c@yahoogroups.com
4760
4761       To subscribe, simply send mail to:
4762
4763         convert-binary-c-subscribe@yahoogroups.com
4764
4765       You can use this mailing list for non-bug problems, questions or
4766       discussions.
4767

BUGS

4769       I'm sure there are still lots of bugs in the code for this module. If
4770       you find any bugs, if Convert::Binary::C doesn't build on your system,
4771       or if any of its tests fail, please use the CPAN Request Tracker at
4772       <http://rt.cpan.org/> to create a ticket for the module. Alternatively,
4773       just send a mail to <mhx@cpan.org>.
4774

EXPERIMENTAL FEATURES

4776       Some features in Convert::Binary::C are marked as experimental.  This
4777       is most probably for one of the following reasons:
4778
4779       · The feature does not behave in exactly the way that I wish it did,
4780         possibly due to some limitations in the current design of the module.
4781
4782       · The feature hasn't been tested enough and may completely fail to
4783         produce the expected results.
4784
4785       I hope to fix most issues with these experimental features someday, but
4786       this may mean that I have to change the way they currently work in a
4787       way that's not backwards compatible.  So if any of these features is
4788       useful to you, you can use it, but you should be aware that the
4789       behaviour or the interface may change in future releases of this
4790       module.
4791

TODO

4793       If you're interested in what I currently plan to improve (or fix), have
4794       a look at the TODO file.
4795

POSTCARDS

4797       If you're using my module and like it, you can show your appreciation
4798       by sending me a postcard from where you live. I won't urge you to do
4799       it, it's completely up to you. To me, this is just a very nice way of
4800       receiving feedback about my work. Please send your postcard to:
4801
4802         Marcus Holland-Moritz
4803         Kuppinger Weg 28
4804         71116 Gaertringen
4805         GERMANY
4806
4807       If you feel that sending a postcard is too much effort, you may want
4808       to rate the module at <http://cpanratings.perl.org/>.
4809

COPYRIGHT

4811       Copyright (c) 2002-2015 Marcus Holland-Moritz. All rights reserved.
4812       This program is free software; you can redistribute it and/or modify it
4813       under the same terms as Perl itself.
4814
4815       The "ucpp" library is (c) 1998-2002 Thomas Pornin. For license and
4816       redistribution details refer to ctlib/ucpp/README.
4817
4818       Portions copyright (c) 1989, 1990 James A. Roskind.
4819
4820       The include files located in tests/include/include, which are used in
4821       some of the test scripts, are (c) 1991-1999, 2000, 2001 Free Software
4822       Foundation, Inc. They are neither required to create the binary nor
4823       linked to the source code of this module in any other way.
4824