Convert::Binary::C(3pm)

Convert::Binary::C(3)   User Contributed Perl Documentation   Convert::Binary::C(3)

NAME

       Convert::Binary::C - Binary Data Conversion using C Types

SYNOPSIS

   Simple
         use Convert::Binary::C;

         #---------------------------------------------
         # Create a new object and parse embedded code
         #---------------------------------------------
         my $c = Convert::Binary::C->new->parse(<<ENDC);

         enum Month { JAN, FEB, MAR, APR, MAY, JUN,
                      JUL, AUG, SEP, OCT, NOV, DEC };

         struct Date {
           int        year;
           enum Month month;
           int        day;
         };

         ENDC

         #-----------------------------------------------
         # Pack Perl data structure into a binary string
         #-----------------------------------------------
         my $date = { year => 2002, month => 'DEC', day => 24 };

         my $packed = $c->pack('Date', $date);

   Advanced
         use Convert::Binary::C;
         use Data::Dumper;

         #---------------------
         # Create a new object
         #---------------------
         my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian');

         #---------------------------------------------------
         # Add include paths and global preprocessor defines
         #---------------------------------------------------
         $c->Include('/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include',
                     '/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include-fixed',
                     '/usr/include')
           ->Define(qw( __USE_POSIX __USE_ISOC99=1 ));

         #----------------------------------
         # Parse the 'time.h' header file
         #----------------------------------
         $c->parse_file('time.h');

         #---------------------------------------
         # See which files the object depends on
         #---------------------------------------
         print Dumper([$c->dependencies]);

         #-----------------------------------------------------------
         # See if struct timespec is defined and dump its definition
         #-----------------------------------------------------------
         if ($c->def('struct timespec')) {
           print Dumper($c->struct('timespec'));
         }

         #-------------------------------
         # Create some binary dummy data
         #-------------------------------
         my $data = "binary_test_string";

         #--------------------------------------------------------
         # Unpack $data according to 'struct timespec' definition
         #--------------------------------------------------------
         if (length($data) >= $c->sizeof('timespec')) {
           my $perl = $c->unpack('timespec', $data);
           print Dumper($perl);
         }

         #--------------------------------------------------------
         # See which member lies at offset 5 of 'struct timespec'
         #--------------------------------------------------------
         my $member = $c->member('timespec', 5);
         print "member('timespec', 5) = '$member'\n";

DESCRIPTION

89       Convert::Binary::C is a preprocessor and parser for C type definitions.
90       It is highly configurable and supports arbitrarily complex data
91       structures. Its object-oriented interface has "pack" and "unpack"
92       methods that act as replacements for Perl's "pack" and "unpack" and
93       allow one to use C types instead of a string representation of the data
94       structure for conversion of binary data from and to Perl's complex data
95       structures.
96
97       Actually, what Convert::Binary::C does is not very different from what
98       a C compiler does, just that it doesn't compile the source code into an
99       object file or executable, but only parses the code and allows Perl to
100       use the enumerations, structs, unions and typedefs that have been
101       defined within your C source for binary data conversion, similar to
102       Perl's "pack" and "unpack".
103
104       Beyond that, the module offers a lot of convenience methods to retrieve
105       information about the C types that have been parsed.
106
107   Background and History
108       In late 2000 I wrote a real-time debugging interface for an embedded
109       medical device that allowed me to send out data from that device over
110       its integrated Ethernet adapter.  The interface was "printf()"-like, so
111       you could easily send out strings or numbers. But you could also send
112       out what I called arbitrary data, which was intended for arbitrary
113       blocks of the device's memory.
114
115       Another part of this real-time debugger was a Perl application running
116       on my workstation that gathered all the messages that were sent out
117       from the embedded device. It printed all the strings and numbers, and
118       hex-dumped the arbitrary data.  However, manually parsing a couple of
119       300 byte hex-dumps of a complex C structure is not only frustrating,
120       but also error-prone and time consuming.
121
122       Using "unpack" to retrieve the contents of a C structure works fine for
123       small structures and if you don't have to deal with struct member
124       alignment. But otherwise, maintaining such code can be as awful as
125       deciphering hex-dumps.
126
127       As I didn't find anything to solve my problem on the CPAN, I wrote a
128       little module that translated simple C structs into "unpack" strings.
129       It worked, but it was slow. And since it couldn't deal with struct
130       member alignment, I soon found myself adding padding bytes everywhere.
131       So again, I had to maintain two sources, and changing one of them
132       forced me to touch the other one.
133
134       All in all, this little module seemed to make my task a bit easier, but
135       it was far from being what I was thinking of:
136
137       • A module that could directly use the source I've been coding for the
138         embedded device without any modifications.
139
140       • A module that could be configured to match the properties of the
141         different compilers and target platforms I was using.
142
143       • A module that was fast enough to decode a great amount of binary data
144         even on my slow workstation.
145
146       I didn't know how to accomplish these tasks until I read something
147       about XS. At least, it seemed as if it could solve my performance
148       problems. However, writing a C parser in C isn't easier than it is in
149       Perl. But writing a C preprocessor from scratch is even worse.
150
       Fortunately, after a few weeks of searching I found both a lean,
       open-source C preprocessor library and a reusable YACC grammar for
       ANSI-C. That was the beginning of the development of
       Convert::Binary::C in late 2001.
155
       I have been using the module successfully in my embedded environment
       since long before it appeared on CPAN. From my point of view, it is
       exactly what I had in mind. It's fast, flexible, easy to use and
       portable. It doesn't require external programs or other Perl modules.
160
161   About this document
162       This document describes how to use Convert::Binary::C. A lot of
163       different features are presented, and the example code sometimes uses
164       Perl's more advanced language elements. If your experience with Perl is
165       rather limited, you should know how to use Perl's very good
166       documentation system.
167
168       To look up one of the manpages, use the "perldoc" command.  For
169       example,
170
171         perldoc perl
172
173       will show you Perl's main manpage. To look up a specific Perl function,
174       use "perldoc -f":
175
176         perldoc -f map
177
178       gives you more information about the "map" function.  You can also
179       search the FAQ using "perldoc -q":
180
181         perldoc -q array
182
183       will give you everything you ever wanted to know about Perl arrays. But
184       now, let's go on with some real stuff!
185
186   Why use Convert::Binary::C?
187       Say you want to pack (or unpack) data according to the following C
188       structure:
189
190         struct foo {
191           char ary[3];
192           unsigned short baz;
193           int bar;
194         };
195
196       You could of course use Perl's "pack" and "unpack" functions:
197
198         @ary = (1, 2, 3);
199         $baz = 40000;
200         $bar = -4711;
201         $binary = pack 'c3 S i', @ary, $baz, $bar;
202
203       But this implies that the struct members are byte aligned. If they were
204       long aligned (which is the default for most compilers), you'd have to
205       write
206
207         $binary = pack 'c3 x S x2 i', @ary, $baz, $bar;
208
209       which doesn't really increase readability.
210
211       Now imagine that you need to pack the data for a completely different
212       architecture with different byte order. You would look into the "pack"
213       manpage again and perhaps come up with this:
214
215         $binary = pack 'c3 x n x2 N', @ary, $baz, $bar;
216
       However, if you try to unpack $binary again, your signed values have
       turned into unsigned ones.
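
       The loss of sign can be reproduced with core Perl alone. The
       following is a minimal sketch that reuses the values from above; the
       final reinterpretation step is only one possible workaround and is
       not taken from the original text.

```perl
use strict;
use warnings;

my @ary = (1, 2, 3);
my ($baz, $bar) = (40000, -4711);

# Pack with explicit padding in network (big endian) byte order.
my $binary = pack 'c3 x n x2 N', @ary, $baz, $bar;

# Unpack with the same template: 'N' always yields an unsigned value.
my @fields = unpack 'c3 x n x2 N', $binary;
print "bar came back as $fields[4]\n";

# One possible workaround: reinterpret the 32-bit value as signed.
my $signed = unpack 'l', pack 'L', $fields[4];
print "bar reinterpreted as signed: $signed\n";
```

       Every signed member needs this kind of extra handling, which is part
       of what makes hand-written "pack" strings hard to maintain.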
219
       All this can still be managed with Perl. But what if your structures
       get more complex? What if you need to support different platforms?
       What if you need to make changes to the structures? You'll not only
       have to change the C source but also dozens of "pack" strings in your
       Perl code. This is no fun. And Perl should be fun.
225
226       Now, wouldn't it be great if you could just read in the C source you've
227       already written and use all the types defined there for packing and
228       unpacking? That's what Convert::Binary::C does.
229
230   Creating a Convert::Binary::C object
231       To use Convert::Binary::C just say
232
233         use Convert::Binary::C;
234
235       to load the module. Its interface is completely object oriented, so it
236       doesn't export any functions.
237
       Next, you need to create a new Convert::Binary::C object. This can be
       done by either

         $c = Convert::Binary::C->new;

       or

         $c = new Convert::Binary::C;
246
247       You can optionally pass configuration options to the constructor as
248       described in the next section.
249
250   Configuring the object
251       To configure a Convert::Binary::C object, you can either call the
252       "configure" method or directly pass the configuration options to the
253       constructor. If you want to change byte order and alignment, you can
254       use
255
256         $c->configure(ByteOrder => 'LittleEndian',
257                       Alignment => 2);
258
259       or you can change the construction code to
260
261         $c = Convert::Binary::C->new(ByteOrder => 'LittleEndian',
262                                      Alignment => 2);
263
264       Either way, the object will now know that it should use little endian
265       (Intel) byte order and 2-byte struct member alignment for packing and
266       unpacking.
267
268       Alternatively, you can use the option names as names of methods to
269       configure the object, like:
270
271         $c->ByteOrder('LittleEndian');
272
273       You can also retrieve information about the current configuration of a
274       Convert::Binary::C object. For details, see the section about the
275       "configure" method.
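
       As a minimal sketch of reading the configuration back, the following
       assumes the behaviour described for the "configure" method: an option
       method called without arguments returns the current value, and
       "configure" called without arguments returns a reference to a hash of
       all options.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'LittleEndian',
                                Alignment => 2);

# Option-named method without an argument: read a single option.
my $order = $c->ByteOrder;

# configure() without arguments: read the whole configuration.
my $config = $c->configure;

print "ByteOrder = $order\n";
print "Alignment = $config->{Alignment}\n";
```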
276
277   Parsing C code
278       Convert::Binary::C allows two ways of parsing C source. Either by
279       parsing external C header or C source files:
280
281         $c->parse_file('header.h');
282
283       Or by parsing C code embedded in your script:
284
285         $c->parse(<<'CCODE');
286         struct foo {
287           char ary[3];
288           unsigned short baz;
289           int bar;
290         };
291         CCODE
292
293       Now the object $c will know everything about "struct foo".  The example
294       above uses a so-called here-document. It allows one to easily embed
295       multi-line strings in your code. You can find more about here-documents
296       in perldata or perlop.
297
298       Since the "parse" and "parse_file" methods throw an exception when a
299       parse error occurs, you usually want to catch these in an "eval" block:
300
301         eval { $c->parse_file('header.h') };
302         if ($@) {
303           # handle error appropriately
304         }
305
306       Perl's special $@ variable will contain an empty string (which
307       evaluates to a false value in boolean context) on success or an error
308       string on failure.
309
310       As another feature, "parse" and "parse_file" return a reference to
311       their object on success, just like "configure" does when you're
312       configuring the object. This will allow you to write constructs like
313       this:
314
315         my $c = eval {
316           Convert::Binary::C->new(Include => ['/usr/include'])
317                             ->parse_file('header.h')
318         };
319         if ($@) {
320           # handle error appropriately
321         }
322
323   Packing and unpacking
       Convert::Binary::C has two methods, "pack" and "unpack", that act
       similarly to the Perl functions of the same name.  To perform the
       packing described in the example above, you could write:
327
328         $data = {
329           ary => [1, 2, 3],
330           baz => 40000,
331           bar => -4711,
332         };
333         $binary = $c->pack('foo', $data);
334
335       Unpacking will work exactly the same way, just that the "unpack" method
336       will take a byte string as its input and will return a reference to a
337       (possibly very complex) Perl data structure.
338
339         $binary = get_data_from_memory();
340         $data = $c->unpack('foo', $binary);
341
342       You can now easily access all of the values:
343
344         print "foo.ary[1] = $data->{ary}[1]\n";
345
346       Or you can even more conveniently use the Data::Dumper module:
347
348         use Data::Dumper;
349         print Dumper($data);
350
351       The output would look something like this:
352
353         $VAR1 = {
354           'ary' => [
355             42,
356             48,
357             100
358           ],
359           'baz' => 5000,
360           'bar' => -271
361         };
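
       Putting the pieces together, a complete round trip might look like
       the sketch below. The explicit "ShortSize", "IntSize" and "Alignment"
       settings are assumptions chosen so that the layout matches the
       'c3 x n x2 N' template shown earlier; they are not required by the
       module.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Pin sizes and alignment so the layout does not depend on the host.
my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                ShortSize => 2,
                                IntSize   => 4,
                                Alignment => 4);

$c->parse(<<'ENDC');
struct foo {
  char ary[3];
  unsigned short baz;
  int bar;
};
ENDC

# Perl data structure -> binary string -> Perl data structure
my $binary = $c->pack('foo', { ary => [1, 2, 3],
                               baz => 40000,
                               bar => -4711 });
my $data   = $c->unpack('foo', $binary);

printf "%d bytes, baz = %d, bar = %d\n",
       length $binary, $data->{baz}, $data->{bar};
```

       Unlike the hand-written "pack" string, the signed member keeps its
       sign without any extra work.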
362
363   Preprocessor configuration
       Convert::Binary::C uses Thomas Pornin's "ucpp" as an internal C
       preprocessor. It is compliant with ISO C99, so you don't have to worry
       about using even unusual preprocessor constructs in your code.
367
368       If your C source contains includes or depends upon preprocessor
369       defines, you may need to configure the internal preprocessor.  Use the
370       "Include" and "Define" configuration options for that:
371
372         $c->configure(Include => ['/usr/include',
373                                   '/home/mhx/include'],
374                       Define  => [qw( NDEBUG FOO=42 )]);
375
376       If your code uses system includes, it is most likely that you will need
377       to define the symbols that are usually defined by the compiler.
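
       The effect of a "Define" can be observed with embedded code. In the
       sketch below, the macro USE_WIDE_COUNTER and the type counter_t are
       hypothetical names made up for illustration, and the sizes are pinned
       so that the two branches are distinguishable.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize      => 4,
                                LongLongSize => 8,
                                Define       => ['USE_WIDE_COUNTER=1']);

# The preprocessor picks the branch according to the Define option.
$c->parse(<<'ENDC');
#if USE_WIDE_COUNTER
typedef long long counter_t;
#else
typedef int counter_t;
#endif
ENDC

my $size = $c->sizeof('counter_t');
print "counter_t occupies $size bytes\n";
```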
378
379       On some operating systems, the system includes require the preprocessor
380       to predefine a certain set of assertions.  Assertions are supported by
381       "ucpp", and you can define them either in the source code using
382       "#assert" or as a property of the Convert::Binary::C object using
383       "Assert":
384
385         $c->configure(Assert => ['predicate(answer)']);
386
387       Information about defined macros can be retrieved from the preprocessor
388       as long as its configuration isn't changed. The preprocessor is
389       implicitly reset if you change one of the following configuration
390       options:
391
392         Include
393         Define
394         Assert
395         HasCPPComments
396         HasMacroVAARGS
397
398   Supported pragma directives
399       Convert::Binary::C supports the "pack" pragma to locally override
400       struct member alignment. The supported syntax is as follows:
401
402       #pragma pack( ALIGN )
403           Sets the new alignment to ALIGN. If ALIGN is 0, resets the
404           alignment to its original value.
405
406       #pragma pack
407           Resets the alignment to its original value.
408
409       #pragma pack( push, ALIGN )
410           Saves the current alignment on a stack and sets the new alignment
411           to ALIGN. If ALIGN is 0, sets the alignment to the default
412           alignment.
413
414       #pragma pack( pop )
415           Restores the alignment to the last value saved on the stack.
416
         /*  Example assumes sizeof( short ) == 2, sizeof( long ) == 4.  */

         #pragma pack(1)

         struct nopad {
           char a;               /* no padding bytes between 'a' and 'b' */
           long b;
         };

         #pragma pack            /* reset to "native" alignment          */

         #pragma pack( push, 2 )

         struct pad {
           char    a;            /* one padding byte between 'a' and 'b' */
           long    b;

         #pragma pack( push, 1 )

           struct {
             char  c;            /* no padding between 'c' and 'd'       */
             short d;
           }       e;            /* sizeof( e ) == 3                     */

         #pragma pack( pop )     /* back to pack( 2 )                    */

           long    f;            /* one padding byte between 'e' and 'f' */
         };

         #pragma pack( pop )     /* back to "native"                     */
447
448       The "pack" pragma as it is currently implemented only affects the
449       maximum struct member alignment. There are compilers that also allow
450       one to specify the minimum struct member alignment. This is not
451       supported by Convert::Binary::C.
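
       The layout produced by the pragmas can be checked from Perl. This is
       a sketch under the same assumptions as the C example (2-byte short,
       4-byte long), with a "native" alignment of 4 chosen arbitrarily; the
       values it prints should agree with the comments in the C code.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ShortSize => 2,
                                LongSize  => 4,
                                Alignment => 4);

$c->parse(<<'ENDC');
#pragma pack(1)
struct nopad { char a; long b; };
#pragma pack

#pragma pack( push, 2 )
struct pad {
  char a;
  long b;
#pragma pack( push, 1 )
  struct { char c; short d; } e;
#pragma pack( pop )
  long f;
};
#pragma pack( pop )
ENDC

my $nopad_size = $c->sizeof('nopad');          # no padding at all
my $e_size     = $c->sizeof('pad.e');          # packed inner struct
my $f_offset   = $c->offsetof('pad', 'f');     # after 'e' plus padding

print "sizeof(nopad)    = $nopad_size\n";
print "sizeof(pad.e)    = $e_size\n";
print "offsetof(pad, f) = $f_offset\n";
```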
452
453   Automatic configuration using "ccconfig"
454       As there are over 20 different configuration options, setting all of
455       them correctly can be a lengthy and tedious task.
456
       The "ccconfig" script, which is bundled with this module, aims at
       automatically determining the correct compiler configuration by testing
       the compiler executable. It works for both native and cross compilers.
460

UNDERSTANDING TYPES

462       This section covers one of the fundamental features of
463       Convert::Binary::C. It's how type expressions, referred to as TYPEs in
464       the method reference, are handled by the module.
465
466       Many of the methods, namely "pack", "unpack", "sizeof", "typeof",
467       "member", "offsetof", "def", "initializer" and "tag", are passed a TYPE
468       to operate on as their first argument.
469
470   Standard Types
471       These are trivial. Standard types are simply enum names, struct names,
472       union names, or typedefs. Almost every method that wants a TYPE will
473       accept a standard type.
474
475       For enums, structs and unions, the prefixes "enum", "struct" and
476       "union" are optional. However, if a typedef with the same name exists,
477       like in
478
479         struct foo {
480           int bar;
481         };
482
483         typedef int foo;
484
485       you will have to use the prefix to distinguish between the struct and
486       the typedef. Otherwise, a typedef is always given preference.
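
       The preference for typedefs can be made visible with "sizeof". The
       sketch below deliberately uses a struct and a typedef of different
       sizes (both invented for this illustration) so that the two lookups
       return different results.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize => 4);

# A struct and a typedef that share the name 'foo'.
$c->parse(<<'ENDC');
struct foo {
  int bar[4];
};

typedef char foo;
ENDC

my $plain    = $c->sizeof('foo');          # resolves to the typedef
my $prefixed = $c->sizeof('struct foo');   # resolves to the struct

print "sizeof('foo')        = $plain\n";
print "sizeof('struct foo') = $prefixed\n";
```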
487
488   Basic Types
489       Basic types, or atomic types, are "int" or "char", for example.  It's
490       possible to use these basic types without having parsed any code. You
491       can simply do
492
493         $c = Convert::Binary::C->new;
494         $size = $c->sizeof('unsigned long');
495         $data = $c->pack('short int', 42);
496
497       Even though the above works fine, it is not possible to define more
498       complex types on the fly, so
499
500         $size = $c->sizeof('struct { int a, b; }');
501
502       will result in an error.
503
504       Basic types are not supported by all methods. For example, it makes no
505       sense to use "member" or "offsetof" on a basic type. Using "typeof"
506       isn't very useful, but supported.
507
508   Member Expressions
509       This is by far the most complex part, depending on the complexity of
510       your data structures. Any standard type that defines a compound or an
511       array may be followed by a member expression to select only a certain
512       part of the data type. Say you have parsed the following C code:
513
514         struct foo {
515           long type;
516           struct {
517             short x, y;
518           } array[20];
519         };
520
521         typedef struct foo matrix[8][8];
522
523       You may want to know the size of the "array" member of "struct foo".
524       This is quite easy:
525
526         print $c->sizeof('foo.array'), " bytes";
527
528       will print
529
530         80 bytes
531
532       depending of course on the "ShortSize" you configured.
533
534       If you wanted to unpack only a single column of "matrix", that's easy
535       as well (and of course it doesn't matter which index you use):
536
537         $column = $c->unpack('matrix[2]', $data);
538
       Just like in C, it is possible to use out-of-bounds array indices.
       This means that, for example, although "array" is declared to have 20
       elements, the following code
542
543         $size   = $c->sizeof('foo.array[4711]');
544         $offset = $c->offsetof('foo', 'array[-13]');
545
546       is perfectly valid and will result in:
547
548         $size   = 4
549         $offset = -44
550
551       Member expressions can be arbitrarily complex:
552
553         $type = $c->typeof('matrix[2][3].array[7].y');
554         print "the type is $type";
555
556       will, for example, print
557
558         the type is short
559
560       Member expressions are also used as the second argument to "offsetof".
561
562   Offsets
563       Members returned by the "member" method have an optional offset suffix
564       to indicate that the given offset doesn't point to the start of that
565       member. For example,
566
567         $member = $c->member('matrix', 1431);
568         print $member;
569
570       will print
571
572         [2][0].array[3].y+1
573
       If you use this as a member expression, like in
575
576         $size = $c->sizeof("matrix $member");
577
578       the offset suffix will simply be ignored. Actually, it will be ignored
579       for all methods if it's used in the first argument.
580
       When used in the second argument to "offsetof", it will usually do what
       you mean, i.e. the offset suffix, if present, will be considered when
       determining the offset. This behaviour ensures that
584
585         $member = $c->member('foo', 43);
586         $offset = $c->offsetof('foo', $member);
587         print "'$member' is located at offset $offset of struct foo";
588
589       will always correctly set $offset:
590
591         '.array[8].y+1' is located at offset 43 of struct foo
592
593       If this is not what you mean, e.g. because you want to know the offset
594       where the member returned by "member" starts, you just have to remove
595       the suffix:
596
597         $member =~ s/\+\d+$//;
598         $offset = $c->offsetof('foo', $member);
599         print "'$member' starts at offset $offset of struct foo";
600
601       This would then print:
602
603         '.array[8].y' starts at offset 42 of struct foo
604

USING TAGS

606       In a nutshell, tags are properties that you can attach to types.
607
608       You can add tags to types using the "tag" method, and remove them using
609       "tag" or "untag", for example:
610
611         # Attach 'Format' and 'Hooks' tags
612         $c->tag('type', Format => 'String', Hooks => { pack => \&rout });
613
614         $c->untag('type', 'Format');  # Remove only 'Format' tag
615         $c->untag('type');            # Remove all tags
616
617       You can also use "tag" to see which tags are attached to a type, for
618       example:
619
620         $tags = $c->tag('type');
621
622       This would give you:
623
624         $tags = {
625           'Hooks' => {
626             'pack' => \&rout
627           },
628           'Format' => 'String'
629         };
630
631       Currently, there are only a couple of different tags that influence the
632       way data is packed and unpacked. There are probably more tags to come
633       in the future.
634
635   The Format Tag
636       One of the tags currently available is the "Format" tag.  Using this
637       tag, you can tell a Convert::Binary::C object to pack and unpack a
638       certain data type in a special way.
639
640       For example, if you have a (fixed length) string type
641
642         typedef char str_type[40];
643
644       this type would, by default, be unpacked as an array of "char"s. That's
645       because it is only an array of "char"s, and Convert::Binary::C doesn't
646       know it is actually used as a string.
647
648       But you can tell Convert::Binary::C that "str_type" is a C string using
649       the "Format" tag:
650
651         $c->tag('str_type', Format => 'String');
652
653       This will make "unpack" (and of course also "pack") treat the binary
654       data like a null-terminated C string:
655
656         $binary = "Hello World!\n\0 this is just some dummy data";
657         $hello = $c->unpack('str_type', $binary);
658         print $hello;
659
       would thus print:
661
662         Hello World!
663
664       Of course, this also works the other way round:
665
666         use Data::Hexdumper;
667
668         $binary = $c->pack('str_type', "Just another C::B::C hacker");
669         print hexdump(data => $binary);
670
671       would print:
672
673           0x0000 : 4A 75 73 74 20 61 6E 6F 74 68 65 72 20 43 3A 3A : Just.another.C::
674           0x0010 : 42 3A 3A 43 20 68 61 63 6B 65 72 00 00 00 00 00 : B::C.hacker.....
675           0x0020 : 00 00 00 00 00 00 00 00                         : ........
676
       If you want Convert::Binary::C to not interpret the binary data at all,
       you can set the "Format" tag to "Binary".  This might not seem very
       useful, as "pack" and "unpack" would just pass through the unmodified
       binary data.  But you can tag not only whole types, but also compound
       members. For example
682
683         $c->parse(<<ENDC);
684         struct packet {
685           unsigned short header;
686           unsigned short flags;
687           unsigned char  payload[28];
688         };
689         ENDC
690
691         $c->tag('packet.payload', Format => 'Binary');
692
693       would allow you to write:
694
695         read FILE, $payload, $c->sizeof('packet.payload');
696
697         $packet = {
698                     header  => 4711,
699                     flags   => 0xf00f,
700                     payload => $payload,
701                   };
702
703         $binary = $c->pack('packet', $packet);
704
705         print hexdump(data => $binary);
706
707       This would print something like:
708
709           0x0000 : 12 67 F0 0F 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A : .g..no.no.no.no.
710           0x0010 : 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E : no.no.no.no.no.n
711
712       For obvious reasons, it is not allowed to attach a "Format" tag to
713       bitfield members. Trying to do so will result in an exception being
714       thrown by the "tag" method.
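
       Since "tag" throws an exception in this case, it can be guarded with
       an "eval" block just like "parse". A minimal sketch, with a struct
       invented for illustration:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;

$c->parse(<<'ENDC');
struct flags {
  unsigned int ready : 1;
  unsigned int mode  : 3;
};
ENDC

# Attaching a Format tag to a bitfield member is expected to fail.
my $ok = eval { $c->tag('flags.mode', Format => 'Binary'); 1 };

print $ok ? "tag accepted\n" : "tag rejected: $@";
```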
715
716   The ByteOrder Tag
717       The "ByteOrder" tag allows you to override the byte order of certain
718       types or members. The implementation of this tag is considered
719       experimental and may be subject to changes in the future.
720
721       Usually it doesn't make much sense to override the byte order, but
722       there may be applications where a sub-structure is packed in a
723       different byte order than the surrounding structure.
724
725       Take, for example, the following code:
726
727         $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
728                                      OrderMembers => 1);
729         $c->parse(<<'ENDC');
730
731         typedef unsigned short u_16;
732
733         struct coords_3d {
734           int x, y, z;
735         };
736
737         struct coords_msg {
738           u_16 header;
739           u_16 length;
740           struct coords_3d coords;
741         };
742
743         ENDC
744
745       Assume that while "coords_msg" is big endian, the embedded coordinates
746       "coords_3d" are stored in little endian format for some reason. In C,
747       you'll have to handle this manually.
748
749       But using Convert::Binary::C, you can simply attach a "ByteOrder" tag
750       to either the "coords_3d" structure or to the "coords" member of the
751       "coords_msg" structure. Both will work in this case. The only
752       difference is that if you tag the "coords" member, "coords_3d" will
753       only be treated as little endian if you "pack" or "unpack" the
754       "coords_msg" structure. (BTW, you could also tag all members of
755       "coords_3d" individually, but that would be inefficient.)
756
757       So, let's attach the "ByteOrder" tag to the "coords" member:
758
759         $c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');
760
761       Assume the following binary message:
762
763           0x0000 : 00 2A 00 0C FF FF FF FF 02 00 00 00 2A 00 00 00 : .*..........*...
764
765       If you unpack this message...
766
767         $msg = $c->unpack('coords_msg', $binary);
768
769       ...you will get the following data structure:
770
771         $msg = {
772           'header' => 42,
773           'length' => 12,
774           'coords' => {
775             'x' => -1,
776             'y' => 2,
777             'z' => 42
778           }
779         };
780
781       Without the "ByteOrder" tag, you would get:
782
783         $msg = {
784           'header' => 42,
785           'length' => 12,
786           'coords' => {
787             'x' => -1,
788             'y' => 33554432,
789             'z' => 704643072
790           }
791         };
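
       The difference between tagging the "coords" member and tagging the
       "coords_3d" type can be checked directly. In this sketch the sizes
       and alignment are pinned explicitly (an assumption, so that the
       result does not depend on the host), and the binary data is built
       with core "pack".

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                ShortSize => 2,
                                IntSize   => 4,
                                Alignment => 4);
$c->parse(<<'ENDC');
typedef unsigned short u_16;
struct coords_3d  { int x, y, z; };
struct coords_msg { u_16 header; u_16 length; struct coords_3d coords; };
ENDC

# Tag only the member, not the coords_3d type itself.
$c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');

my $coords = pack 'V3', 0xFFFFFFFF, 2, 42;    # little endian payload
my $binary = pack('n2', 42, 12) . $coords;    # big endian header

my $msg  = $c->unpack('coords_msg', $binary); # member tag applies
my $bare = $c->unpack('coords_3d',  $coords); # member tag does not apply

print "via coords_msg: y = $msg->{coords}{y}\n";
print "via coords_3d:  y = $bare->{y}\n";
```

       Tagging "coords_3d" instead would make both calls treat the
       coordinates as little endian.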
792
793       The "ByteOrder" tag is a recursive tag, i.e. it applies to all children
794       of the tagged object recursively. Of course, it is also possible to
795       override a "ByteOrder" tag by attaching another "ByteOrder" tag to a
796       child type. Confused? Here's an example. In addition to tagging the
797       "coords" member as little endian, we now tag "coords_3d.y" as big
798       endian:
799
800         $c->tag('coords_3d.y', ByteOrder => 'BigEndian');
801         $msg = $c->unpack('coords_msg', $binary);
802
803       This will return the following data structure:
804
805         $msg = {
806           'header' => 42,
807           'length' => 12,
808           'coords' => {
809             'x' => -1,
810             'y' => 33554432,
811             'z' => 42
812           }
813         };
814
       Note that if you tag both a type and a member of that type within a
       compound, the tag attached to the type itself has higher precedence.
       Using the example above, if you attached a "ByteOrder" tag to both
       "coords_msg.coords" and "coords_3d", the tag attached to "coords_3d"
       would always win.
820
821       Also note that the "ByteOrder" tag might not work as expected along
822       with bitfields, which is why the implementation is considered
823       experimental. Bitfields are currently not affected by the "ByteOrder"
824       tag at all. This is because the byte order would affect the bitfield
825       layout, and a consistent implementation supporting multiple layouts of
826       the same struct would be quite bulky and probably slow down the whole
827       module.
828
829       If you really need the correct behaviour, you can use the following
830       trick:
831
832         $le = Convert::Binary::C->new(ByteOrder => 'LittleEndian');
833
834         $le->parse(<<'ENDC');
835
836         typedef unsigned short u_16;
837         typedef unsigned long  u_32;
838
839         struct message {
840           u_16 header;
841           u_16 length;
842           struct {
843             u_32 a;
844             u_32 b;
845             u_32 c :  7;
846             u_32 d :  5;
847             u_32 e : 20;
848           } data;
849         };
850
851         ENDC
852
853         $be = $le->clone->ByteOrder('BigEndian');
854
855         $le->tag('message.data', Format => 'Binary', Hooks => {
856             unpack => sub { $be->unpack('message.data', @_) },
857             pack   => sub { $be->pack('message.data', @_) },
858           });
859
860
861         $msg = $le->unpack('message', $binary);
862
863       This uses the "Format" and "Hooks" tags along with a big endian "clone"
864       of the original little endian object. It attaches hooks to the little
865       endian object and in the hooks it uses the big endian object to "pack"
866       and "unpack" the binary data.
867
868   The Dimension Tag
869       The "Dimension" tag allows you to override the declared dimension of an
870       array for packing or unpacking data. The implementation of this tag is
871       considered very experimental and will definitely change in a future
872       release.
873
874       That being said, the "Dimension" tag is primarily useful to support
875       variable length arrays. Usually, you have to write the following code
876       for such a variable length array in C:
877
878         struct c_message
879         {
880           unsigned count;
881           char data[1];
882         };
883
884       So, because you cannot declare an empty array, you declare an array
885       with a single element. If you have an ISO-C99 compliant compiler, you
886       can write this code instead:
887
888         struct c99_message
889         {
890           unsigned count;
891           char data[];
892         };
893
894       This explicitly tells the compiler that "data" is a flexible array
895       member. Convert::Binary::C already uses this information to handle
896       flexible array members in a special way.
897
898       As you can see in the following example, the two types are treated
899       differently:
900
901         $data = pack 'NC*', 3, 1..8;
902         $uc   = $c->unpack('c_message', $data);
903         $uc99 = $c->unpack('c99_message', $data);
904
905       This will result in:
906
907         $uc = {'count' => 3,'data' => [1]};
908         $uc99 = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
909
910       However, not every compiler supports ISO-C99, and you probably don't
911       want to change your existing code only to get some extra features when
912       using Convert::Binary::C.
913
914       So it is possible to attach a tag to the "data" member of the
915       "c_message" struct that tells Convert::Binary::C to treat the array as
916       if it were flexible:
917
918         $c->tag('c_message.data', Dimension => '*');
919
920       Now both "c_message" and "c99_message" will behave exactly the same
921       when using "pack" or "unpack".  Repeating the above code:
922
923         $uc = $c->unpack('c_message', $data);
924
925       This will result in:
926
927         $uc = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
928
929       But there's more you can do. Even though it probably doesn't make much
930       sense, you can tag a fixed dimension to an array:
931
932         $c->tag('c_message.data', Dimension => '5');
933
934       This will obviously result in:
935
936         $uc = {'count' => 3,'data' => [1,2,3,4,5]};
937
938       A more useful way to use the "Dimension" tag is to set it to the name
939       of a member in the same compound:
940
941         $c->tag('c_message.data', Dimension => 'count');
942
943       Convert::Binary::C will now use the value of that member to determine
944       the size of the array, so unpacking will result in:
945
946         $uc = {'count' => 3,'data' => [1,2,3]};
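
          What a "Dimension" tag naming "count" does on unpack can be
          imitated in core Perl for this one struct. A minimal sketch,
          assuming a 4-byte big endian count and 1-byte elements as in the
          "pack 'NC*'" call above; it does not use Convert::Binary::C, and
          the variable names are illustrative.

```perl
use strict;
use warnings;

my $data = pack 'NC*', 3, 1 .. 8;

# Read the leading count first, then use it as the repeat count
# for the elements that follow (x4 skips the 4-byte count field).
my ($count) = unpack 'N', $data;
my @elems   = unpack "x4 C$count", $data;

print "count=$count data=[@elems]\n";   # count=3 data=[1 2 3]
```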
947
948       Of course, you can also tag flexible array members. And yes, it's also
949       possible to use more complex member expressions:
950
951         $c->parse(<<ENDC);
952         struct msg_header
953         {
954           unsigned len[2];
955         };
956
957         struct more_complex
958         {
959           struct msg_header hdr;
960           char data[];
961         };
962         ENDC
963
964         $data = pack 'NNC*', 42, 7, 1 .. 10;
965
966         $c->tag('more_complex.data', Dimension => 'hdr.len[1]');
967
968         $u = $c->unpack('more_complex', $data);
969
970       The result will be:
971
972         $u = {
973           'hdr' => {
974             'len' => [
975               42,
976               7
977             ]
978           },
979           'data' => [
980             1,
981             2,
982             3,
983             4,
984             5,
985             6,
986             7
987           ]
988         };
989
990       By the way, it's also possible to tag arrays that are not embedded
991       inside a compound:
992
993         $c->parse(<<ENDC);
994         typedef unsigned short short_array[];
995         ENDC
996
997         $c->tag('short_array', Dimension => '5');
998
999         $u = $c->unpack('short_array', $data);
1000
1001       Resulting in:
1002
1003         $u = [0,42,0,7,258];
1004
1005       The final and most powerful way to define a "Dimension" tag is to pass
1006       it a subroutine reference. The referenced subroutine can execute
1007       whatever code is necessary to determine the size of the tagged array:
1008
1009         sub get_size
1010         {
1011           my $m = shift;
1012           return $m->{hdr}{len}[0] / $m->{hdr}{len}[1];
1013         }
1014
1015         $c->tag('more_complex.data', Dimension => \&get_size);
1016
1017         $u = $c->unpack('more_complex', $data);
1018
1019       As you can guess from the above code, the subroutine is passed a
1020       reference to the hash that stores the already unpacked part of the
1021       compound embedding the tagged array. This is the result:
1022
1023         $u = {
1024           'hdr' => {
1025             'len' => [
1026               42,
1027               7
1028             ]
1029           },
1030           'data' => [
1031             1,
1032             2,
1033             3,
1034             4,
1035             5,
1036             6
1037           ]
1038         };
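
           The subroutine can be exercised on its own by handing it a hash
           shaped like the unpacked header. The variant below is a sketch,
           not part of the module: "get_size_checked" is a hypothetical
           name, and truncating to an integer and returning 0 for a zero
           divisor are assumptions about what a caller might want.

```perl
use strict;
use warnings;

# Same computation as get_size above, but guarded against a zero
# divisor and truncated to a whole number of elements.
sub get_size_checked {
    my $m = shift;
    my ($total, $per) = @{ $m->{hdr}{len} }[0, 1];
    return 0 unless $per;
    return int($total / $per);
}

print get_size_checked({ hdr => { len => [42, 7] } }), "\n";  # 6
print get_size_checked({ hdr => { len => [42, 0] } }), "\n";  # 0
print get_size_checked({ hdr => { len => [43, 7] } }), "\n";  # 6
```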
1039
1040       You can also pass custom arguments to the subroutines by using the
1041       "arg" method. This is similar to the functionality offered by the
1042       "Hooks" tag.
1043
1044       Of course, all of this works for the "pack" method as well.
1045
1046       However, the current implementation has at least one shortcoming,
1047       which is why it's experimental: The "Dimension" tag doesn't impact
1048       compound layout. This means that while you can alter the size of an
1049       array in the middle of a compound, the offset of the members after that
1050       array won't be impacted. I'd rather like to see the layout adapt
1051       dynamically, so this is what I'm hoping to implement in the future.
1052
1053   The Hooks Tag
1054       Hooks are a special kind of tag that can be extremely useful.
1055
1056       Using hooks, you can easily override the way "pack" and "unpack" handle
1057       data using your own subroutines.  If you define hooks for a certain
1058       data type, each time this data type is processed the corresponding hook
1059       will be called to allow you to modify that data.
1060
1061       Basic Hooks
1062
1063       Here's an example. Let's assume the following C code has been parsed:
1064
1065         typedef unsigned int u_32;
1066         typedef u_32         ProtoId;
1067         typedef ProtoId      MyProtoId;
1068
1069         struct MsgHeader {
1070           MyProtoId id;
1071           u_32      len;
1072         };
1073
1074         struct String {
1075           u_32 len;
1076           char buf[];
1077         };
1078
1079       You could now use the types above and, for example, unpack binary data
1080       representing a "MsgHeader" like this:
1081
1082         $msg_header = $c->unpack('MsgHeader', $data);
1083
1084       This would give you:
1085
1086         $msg_header = {
1087           'id' => 42,
1088           'len' => 13
1089         };
1090
1091       Instead of dealing with "ProtoId"s as integers, you might prefer to
1092       have them as clear text. You could provide subroutines to convert
1093       between clear text and integers:
1094
1095         %proto = (
1096           CATS      =>    1,
1097           DOGS      =>   42,
1098           HEDGEHOGS => 4711,
1099         );
1100
1101         %rproto = reverse %proto;
1102
1103         sub ProtoId_unpack {
1104           $rproto{$_[0]} || 'unknown protocol'
1105         }
1106
1107         sub ProtoId_pack {
1108           $proto{$_[0]} or die 'unknown protocol'
1109         }
1110
1111       You can now register these subroutines by attaching a "Hooks" tag to
1112       "ProtoId" using the "tag" method:
1113
1114         $c->tag('ProtoId', Hooks => { pack   => \&ProtoId_pack,
1115                                       unpack => \&ProtoId_unpack });
1116
1117       Doing exactly the same unpack on "MsgHeader" again would now return:
1118
1119         $msg_header = {
1120           'id' => 'DOGS',
1121           'len' => 13
1122         };
1123
1124       Actually, if you don't need the reverse operation, you don't even have
1125       to register a "pack" hook. Or, even better, you can have a more
1126       intelligent "unpack" hook that creates a dual-typed variable:
1127
1128         use Scalar::Util qw(dualvar);
1129
1130         sub ProtoId_unpack2 {
1131           dualvar $_[0], $rproto{$_[0]} || 'unknown protocol'
1132         }
1133
1134         $c->tag('ProtoId', Hooks => { unpack => \&ProtoId_unpack2 });
1135
1136         $msg_header = $c->unpack('MsgHeader', $data);
1137
1138       Just as before, this would give you
1139
1140         $msg_header = {
1141           'id' => 'DOGS',
1142           'len' => 13
1143         };
1144
1145       but without requiring a "pack" hook for packing, at least as long as
1146       you keep the variable dual-typed.
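
           The effect of the dual-typed value can be seen without unpacking
           anything. The sketch below calls a copy of the hook directly;
           Scalar::Util ships with Perl, and the lookup tables repeat the
           ones defined earlier.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

my %proto  = (CATS => 1, DOGS => 42, HEDGEHOGS => 4711);
my %rproto = reverse %proto;

sub ProtoId_unpack2 {
    dualvar $_[0], $rproto{$_[0]} || 'unknown protocol'
}

my $id = ProtoId_unpack2(42);

print  "as string: $id\n";        # as string: DOGS
printf "as number: %d\n", $id;    # as number: 42
```

           Because the numeric half is still 42, such a value can be handed
           back to "pack" unchanged, which is why no "pack" hook is needed.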
1147
1148       Hooks are usually called with exactly one argument, which is the data
1149       that should be processed (see "Advanced Hooks" for details on how to
1150       customize hook arguments). They are called in scalar context and
1151       expected to return the processed data.
1152
1153       To get rid of registered hooks, you can either undefine only certain
1154       hooks
1155
1156         $c->tag('ProtoId', Hooks => { pack => undef });
1157
1158       or all hooks:
1159
1160         $c->tag('ProtoId', Hooks => undef);
1161
1162       Of course, hooks are not restricted to handling integer values.  You
1163       could just as well attach hooks for the "String" struct from the code
1164       above. A useful example would be to have these hooks:
1165
1166         sub string_unpack {
1167           my $s = shift;
1168           pack "c$s->{len}", @{$s->{buf}};
1169         }
1170
1171         sub string_pack {
1172           my $s = shift;
1173           return {
1174             len => length $s,
1175             buf => [ unpack 'c*', $s ],
1176           }
1177         }
1178
1179       (Don't be confused by the fact that the "unpack" hook uses "pack" and
1180       the "pack" hook uses "unpack".  And also see "Advanced Hooks" for a
1181       more clever approach.)
1182
1183       While you would normally get the following output when unpacking a
1184       "String"
1185
1186         $string = {
1187           'len' => 12,
1188           'buf' => [
1189             72,
1190             101,
1191             108,
1192             108,
1193             111,
1194             32,
1195             87,
1196             111,
1197             114,
1198             108,
1199             100,
1200             33
1201           ]
1202         };
1203
1204       you could just register the hooks using
1205
1206         $c->tag('String', Hooks => { pack   => \&string_pack,
1207                                      unpack => \&string_unpack });
1208
1209       and you would get a nice human-readable Perl string:
1210
1211         $string = 'Hello World!';
1212
1213       Packing a string turns out to be just as easy:
1214
1215         use Data::Hexdumper;
1216
1217         $data = $c->pack('String', 'Just another Perl hacker,');
1218
1219         print hexdump(data => $data);
1220
1221       This would print:
1222
1223           0x0000 : 00 00 00 19 4A 75 73 74 20 61 6E 6F 74 68 65 72 : ....Just.another
1224           0x0010 : 20 50 65 72 6C 20 68 61 63 6B 65 72 2C          : .Perl.hacker,
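
           The bytes in this dump can be reproduced with Perl's built-in
           "pack" as a cross-check. This sketch assumes a 4-byte big endian
           length field, as the dump shows; the "N/a*" template writes the
           length of the string followed by the string itself.

```perl
use strict;
use warnings;

my $text  = 'Just another Perl hacker,';
my $bytes = pack 'N/a*', $text;

# First four bytes: the length (25 = 0x19); the rest: the characters.
printf "%s\n", join ' ', map { sprintf '%02X', $_ } unpack 'C*', $bytes;
```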
1225
1226       If you want to find out if or which hooks are registered for a certain
1227       type, you can also use the "tag" method:
1228
1229         $hooks = $c->tag('String', 'Hooks');
1230
1231       This would return:
1232
1233         $hooks = {
1234           'unpack' => \&string_unpack,
1235           'pack' => \&string_pack
1236         };
1237
1238       Advanced Hooks
1239
1240       It is also possible to combine hooks with the "Format" tag.  This
1241       can be useful if you know better than Convert::Binary::C how to
1242       interpret the binary data. In the previous section, we've handled this
1243       type
1244
1245         struct String {
1246           u_32 len;
1247           char buf[];
1248         };
1249
1250       with the following hooks:
1251
1252         sub string_unpack {
1253           my $s = shift;
1254           pack "c$s->{len}", @{$s->{buf}};
1255         }
1256
1257         sub string_pack {
1258           my $s = shift;
1259           return {
1260             len => length $s,
1261             buf => [ unpack 'c*', $s ],
1262           }
1263         }
1264
1265         $c->tag('String', Hooks => { pack   => \&string_pack,
1266                                      unpack => \&string_unpack });
1267
1268       As you can see in the hook code, "buf" is expected to be an array of
1269       characters. For the "unpack" case Convert::Binary::C first turns the
1270       binary data into a Perl array, and then the hook packs it back into a
1271       string. The intermediate array creation and destruction is pure
1272       overhead.  The same is true, of course, for the "pack" case.
1273
1274       Here's a clever way to handle this. Just tag "buf" as binary
1275
1276         $c->tag('String.buf', Format => 'Binary');
1277
1278       and use the following hooks instead:
1279
1280         sub string_unpack2 {
1281           my $s = shift;
1282           substr $s->{buf}, 0, $s->{len};
1283         }
1284
1285         sub string_pack2 {
1286           my $s = shift;
1287           return {
1288             len => length $s,
1289             buf => $s,
1290           }
1291         }
1292
1293         $c->tag('String', Hooks => { pack   => \&string_pack2,
1294                                      unpack => \&string_unpack2 });
1295
1296       This will be exactly equivalent to the old code, but faster and
1297       probably even much easier to understand.
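
           To see that the two "unpack" hooks agree, they can be called
           directly on hand-built inputs. A sketch: the hashes below mimic
           what each hook would receive (an array of character codes without
           the "Format" tag, a raw byte string with it). The trailing
           padding in the second buffer is an assumption for illustration;
           it shows why the "substr" is there.

```perl
use strict;
use warnings;

sub string_unpack  { my $s = shift; pack "c$s->{len}", @{ $s->{buf} } }
sub string_unpack2 { my $s = shift; substr $s->{buf}, 0, $s->{len} }

# Without Format => 'Binary': buf arrives as a list of character codes.
my $from_array  = string_unpack({ len => 5, buf => [ 72, 101, 108, 108, 111 ] });

# With Format => 'Binary': buf arrives as one string, possibly longer than len.
my $from_binary = string_unpack2({ len => 5, buf => "Hello\0\0\0" });

print "$from_array / $from_binary\n";   # Hello / Hello
```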
1298
1299       But hooks are even more powerful. You can customize the arguments that
1300       are passed to your hooks and you can use "arg" to pass certain special
1301       arguments, such as the name of the type that is currently being
1302       processed by the hook.
1303
1304       The following example shows how easily you can peek into the perl
1305       internals using hooks.
1306
1307         use Config;
1308
1309         $c = Convert::Binary::C->new(%CC, OrderMembers => 1);
1310         $c->Include(["$Config{archlib}/CORE", @{$c->Include}]);
1311         $c->parse(<<ENDC);
1312         #include "EXTERN.h"
1313         #include "perl.h"
1314         ENDC
1315
1316         $c->tag($_, Hooks => { unpack_ptr => [\&unpack_ptr,
1317                                               $c->arg(qw(SELF TYPE DATA))] })
1318             for qw( XPVAV XPVHV );
1319
1320       First, we add the perl core include path and parse perl.h. Then, we add
1321       an "unpack_ptr" hook for a couple of the internal data types.
1322
1323       The "unpack_ptr" and "pack_ptr" hooks are called whenever a pointer to
1324       a certain data structure is processed. This is by far the most
1325       experimental part of the hooks feature, as this includes any kind of
1326       pointer. There's no way for the hook to know the difference between a
1327       plain pointer, or a pointer to a pointer, or a pointer to an array
1328       (this is because the difference doesn't matter anywhere else in
1329       Convert::Binary::C).
1330
1331       But the hook above makes use of another very interesting feature: It
1332       uses "arg" to pass special arguments to the hook subroutine.  Usually,
1333       the hook subroutine is simply passed a single data argument.  But using
1334       the above definition, it'll get a reference to the calling object
1335       ("SELF"), the name of the type being processed ("TYPE") and the data
1336       ("DATA").
1337
1338       But what does our hook look like?
1339
1340         sub unpack_ptr {
1341           my($self, $type, $ptr) = @_;
1342           $ptr or return '<NULL>';
1343           my $size = $self->sizeof($type);
1344           $self->unpack($type, unpack("P$size", pack('Q', $ptr)));
1345         }
1346
1347       As you can see, the hook is rather simple. First, it receives the
1348       arguments mentioned above. It performs a quick check if the pointer is
1349       "NULL" and shouldn't be processed any further. Next, it determines the
1350       size of the type being processed. And finally, it'll just use the "P"n
1351       unpack template to read from that memory location and recursively call
1352       "unpack" to unpack the type. (And yes, this may of course again call
1353       other hooks.)
1354
1355       Now, let's test that:
1356
1357         my $ref = { foo => 42, bar => 4711 };
1358         my $ptr = hex(("$ref" =~ /\(0x([[:xdigit:]]+)\)$/)[0]);
1359
1360         print Dumper(unpack_ptr($c, 'AV', $ptr));
1361
1362       Just for the fun of it, we create a blessed array reference. But how do
1363       we get a pointer to the corresponding "AV"? This is rather easy, as the
1364       address of the "AV" is just the hex value that appears when using the
1365       array reference in string context. So we just grab that and turn it
1366       into decimal. All that's left to do is just call our hook, as it can
1367       already handle "AV" pointers. And this is what we get:
1368
1369         $VAR1 = {
1370           'sv_any' => {
1371             'xmg_stash' => 0,
1372             'xmg_u' => {
1373               'xmg_magic' => 0,
1374               'xmg_hash_index' => 0
1375             },
1376             'xav_fill' => 2,
1377             'xav_max' => 7,
1378             'xav_alloc' => 0
1379           },
1380           'sv_refcnt' => 1,
1381           'sv_flags' => 536870924,
1382           'sv_u' => {
1383             'svu_pv' => '94716517508048',
1384             'svu_iv' => '94716517508048',
1385             'svu_uv' => '94716517508048',
1386             'svu_nv' => '4.67961773944475e-310',
1387             'svu_rv' => '94716517508048',
1388             'svu_array' => '94716517508048',
1389             'svu_hash' => '94716517508048',
1390             'svu_gp' => '94716517508048',
1391             'svu_fp' => '94716517508048'
1392           }
1393         };
1394
1395       Even though it is rather easy to do such stuff using "unpack_ptr"
1396       hooks, you should really know what you're doing and do it with extreme
1397       care because of the limitations mentioned above. It's really easy to
1398       run into segmentation faults when you're dereferencing pointers that
1399       point to memory which you don't own.
1400
1401       Performance
1402
1403       Hooks don't come for free. In performance-critical applications you
1404       have to keep in mind that hooks are actually perl subroutines and that
1405       they are called once for every value of a registered type that is being
1406       packed or unpacked. If only about 10% of the values require hooks to be
1407       called, you'll hardly notice the difference (if your hooks are
1408       implemented efficiently, that is).  But if all values would require
1409       hooks to be called, that alone could easily make packing and unpacking
1410       very slow.
1411
1412   Tag Order
1413       Since it is possible to attach multiple tags to a single type, the
1414       order in which the tags are processed is important. Here's a small
1415       table that shows the processing order.
1416
1417         pack        unpack
1418         ---------------------
1419         Hooks       Format
1420         Format      ByteOrder
1421         ByteOrder   Hooks
1422
1423       As a general rule, the "Hooks" tag is always the first thing processed
1424       when packing data, and the last thing processed when unpacking data.
1425
1426       The "Format" and "ByteOrder" tags are mutually exclusive; when both
1427       are given, the "Format" tag wins.
1428

METHODS

1430   new
1431       "new"
1432       "new" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1433               The constructor is used to create a new Convert::Binary::C
1434               object.  You can simply use
1435
1436                 $c = Convert::Binary::C->new;
1437
1438               without additional arguments to create an object, or you can
1439               optionally pass any arguments to the constructor that are
1440               described for the "configure" method.
1441
1442   configure
1443       "configure"
1444       "configure" OPTION
1445       "configure" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1446               This method can be used to configure an existing
1447               Convert::Binary::C object or to retrieve its current
1448               configuration.
1449
1450               To configure the object, the list of options consists of key
1451               and value pairs and must therefore contain an even number of
1452               elements. "configure" (and also "new" if used with
1453               configuration options) will throw an exception if you pass an
1454               odd number of elements. Configuration will normally look like
1455               this:
1456
1457                 $c->configure(ByteOrder => 'BigEndian', IntSize => 2);
1458
1459               To retrieve the current value of a configuration option, you
1460               must pass a single argument to "configure" that holds the name
1461               of the option, just like
1462
1463                 $order = $c->configure('ByteOrder');
1464
1465               If you want to get the values of all configuration options at
1466               once, you can call "configure" without any arguments and it
1467               will return a reference to a hash table that holds the whole
1468               object configuration. This can be conveniently used with the
1469               Data::Dumper module, for example:
1470
1471                 use Convert::Binary::C;
1472                 use Data::Dumper;
1473
1474                 $c = Convert::Binary::C->new(Define  => ['DEBUGGING', 'FOO=123'],
1475                                              Include => ['/usr/include']);
1476
1477                 print Dumper($c->configure);
1478
1479               Which will print something like this:
1480
1481                 $VAR1 = {
1482                   'DisabledKeywords' => [],
1483                   'HasCPPComments' => 1,
1484                   'UnsignedChars' => 0,
1485                   'LongDoubleSize' => 16,
1486                   'OrderMembers' => 1,
1487                   'CompoundAlignment' => 1,
1488                   'UnsignedBitfields' => 0,
1489                   'DoubleSize' => 8,
1490                   'Assert' => [],
1491                   'PointerSize' => 8,
1492                   'ByteOrder' => 'LittleEndian',
1493                   'Warnings' => 0,
1494                   'LongSize' => 8,
1495                   'Include' => [
1496                     '/usr/include'
1497                   ],
1498                   'EnumType' => 'Integer',
1499                   'EnumSize' => 4,
1500                   'ShortSize' => 2,
1501                   'IntSize' => 4,
1502                   'StdCVersion' => 199901,
1503                   'HostedC' => 1,
1504                   'Alignment' => 1,
1505                   'HasMacroVAARGS' => 1,
1506                   'KeywordMap' => {},
1507                   'Define' => [
1508                     'DEBUGGING',
1509                     'FOO=123'
1510                   ],
1511                   'LongLongSize' => 8,
1512                   'CharSize' => 1,
1513                   'FloatSize' => 4,
1514                   'Bitfields' => {
1515                     'Engine' => 'Generic'
1516                   }
1517                 };
1518
1519               Since you may not always want to write a "configure" call when
1520               you only want to change a single configuration item, you can
1521               use any configuration option name as a method name, like:
1522
1523                 $c->ByteOrder('LittleEndian') if $c->IntSize < 4;
1524
1525               (Yes, the example doesn't make very much sense... ;-)
1526
1527               However, you should keep in mind that configuration methods
1528               that can take lists (namely "Include", "Define" and "Assert",
1529               but not "DisabledKeywords") may behave slightly differently from
1530               their "configure" equivalents.  If you pass these methods a
1531               single argument that is an array reference, the current list
1532               will be replaced by the new one, which is just the behaviour of
1533               the corresponding "configure" call.  So the following are
1534               equivalent:
1535
1536                 $c->configure(Define => ['foo', 'bar=123']);
1537                 $c->Define(['foo', 'bar=123']);
1538
1539               But if you pass a list of strings instead of an array reference
1540               (which cannot be done when using "configure"), the new list
1541               items are appended to the current list, so
1542
1543                 $c = Convert::Binary::C->new(Include => ['/include']);
1544                 $c->Include('/usr/include', '/usr/local/include');
1545                 print Dumper($c->Include);
1546
1547                 $c->Include(['/usr/local/include']);
1548                 print Dumper($c->Include);
1549
1550               will first print all three include paths, but finally only
1551               "/usr/local/include" will be configured:
1552
1553                 $VAR1 = [
1554                   '/include',
1555                   '/usr/include',
1556                   '/usr/local/include'
1557                 ];
1558                 $VAR1 = [
1559                   '/usr/local/include'
1560                 ];
1561
1562               Furthermore, configuration methods can be chained together, as
1563               they return a reference to their object if called as a set
1564               method. So, if you like, you can configure your object like
1565               this:
1566
1567                 $c = Convert::Binary::C->new(IntSize => 4)
1568                        ->Define(qw( __DEBUG__ DB_LEVEL=3 ))
1569                        ->ByteOrder('BigEndian');
1570
1571                 $c->configure(EnumType => 'Both', Alignment => 4)
1572                   ->Include('/usr/include', '/usr/local/include');
1573
1574               In the example above, "qw( ... )" is the word list quoting
1575               operator. It returns a list of all non-whitespace sequences,
1576               and is especially useful for configuring preprocessor defines
1577               or assertions. The following assignments are equivalent:
1578
1579                 @array = ('one', 'two', 'three');
1580                 @array = qw(one two three);
1581
1582               You can configure the following options. Unknown options, as
1583               well as invalid values for an option, will cause the object to
1584               throw exceptions.
1585
1586               "IntSize" => 0 | 1 | 2 | 4 | 8
1587                   Set the number of bytes that are occupied by an integer.
1588                   This is in most cases 2 or 4. If you set it to zero, the
1589                   size of an integer on the host system will be used. This is
1590                   also the default unless overridden by
1591                   "CBC_DEFAULT_INT_SIZE" at compile time.
1592
1593               "CharSize" => 0 | 1 | 2 | 4 | 8
1594                   Set the number of bytes that are occupied by a "char".
1595                   This rarely needs to be changed, except for some platforms
1596                   that don't care about bytes, for example DSPs.  If you set
1597                   this to zero, the size of a "char" on the host system will
1598                   be used. This is also the default unless overridden by
1599                   "CBC_DEFAULT_CHAR_SIZE" at compile time.
1600
1601               "ShortSize" => 0 | 1 | 2 | 4 | 8
1602                   Set the number of bytes that are occupied by a short
1603                   integer.  Although integers explicitly declared as "short"
1604                   should always be 16 bits, there are compilers that make a
1605                   short 8 bits wide. If you set it to zero, the size of a
1606                   short integer on the host system will be used. This is also
1607                   the default unless overridden by "CBC_DEFAULT_SHORT_SIZE"
1608                   at compile time.
1609
1610               "LongSize" => 0 | 1 | 2 | 4 | 8
1611                   Set the number of bytes that are occupied by a long
1612                   integer.  If set to zero, the size of a long integer on the
1613                   host system will be used. This is also the default unless
1614                   overridden by "CBC_DEFAULT_LONG_SIZE" at compile time.
1615
1616               "LongLongSize" => 0 | 1 | 2 | 4 | 8
1617                   Set the number of bytes that are occupied by a long long
1618                   integer. If set to zero, the size of a long long integer on
1619                   the host system, or 8, will be used. This is also the
1620                   default unless overridden by "CBC_DEFAULT_LONG_LONG_SIZE"
1621                   at compile time.
1622
1623               "FloatSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1624                   Set the number of bytes that are occupied by a single
1625                   precision floating point value.  If you set it to zero, the
1626                   size of a "float" on the host system will be used. This is
1627                   also the default unless overridden by
1628                   "CBC_DEFAULT_FLOAT_SIZE" at compile time.  For details on
1629                   floating point support, see "FLOATING POINT VALUES".
1630
1631               "DoubleSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1632                   Set the number of bytes that are occupied by a double
1633                   precision floating point value.  If you set it to zero, the
1634                   size of a "double" on the host system will be used. This is
1635                   also the default unless overridden by
1636                   "CBC_DEFAULT_DOUBLE_SIZE" at compile time.  For details on
1637                   floating point support, see "FLOATING POINT VALUES".
1638
1639               "LongDoubleSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1640                   Set the number of bytes that are occupied by a long double
1641                   floating point value.  If you set it to zero, the size of
1642                   a "long double" on the host system, or 12, will be used.
1643                   This is also the default unless overridden by
1644                   "CBC_DEFAULT_LONG_DOUBLE_SIZE" at compile time. For details
1645                   on floating point support, see "FLOATING POINT VALUES".
1646
1647               "PointerSize" => 0 | 1 | 2 | 4 | 8
1648                   Set the number of bytes that are occupied by a pointer.
1649                   This is in most cases 2 or 4. If you set it to zero, the
1650                   size of a pointer on the host system will be used. This is
1651                   also the default unless overridden by
1652                   "CBC_DEFAULT_PTR_SIZE" at compile time.
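
To find out what a value of 0 would resolve to for the size options above, or which explicit values match a particular compiler, a small C program can report the sizes directly. The following is only a sketch: the table "basic_type_sizes" and the helpers "size_of_basic_type" and "print_basic_type_sizes" are hypothetical names introduced for illustration and are not part of Convert::Binary::C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One row per size option: a C type and its size on this compiler. */
struct type_size {
    const char *name;
    size_t      size;
};

static const struct type_size basic_type_sizes[] = {
    { "char",        sizeof(char)        },
    { "short",       sizeof(short)       },
    { "int",         sizeof(int)         },
    { "long",        sizeof(long)        },
    { "long long",   sizeof(long long)   },
    { "float",       sizeof(float)       },
    { "double",      sizeof(double)      },
    { "long double", sizeof(long double) },
    { "pointer",     sizeof(void *)      },
};

#define TYPE_COUNT (sizeof(basic_type_sizes) / sizeof(basic_type_sizes[0]))

/* Return the size recorded for NAME, or 0 if NAME is not in the table. */
size_t size_of_basic_type(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < TYPE_COUNT; i++)
        if (strcmp(basic_type_sizes[i].name, name) == 0)
            return basic_type_sizes[i].size;
    return 0;
}

/* Print the whole table, one type per line. */
void print_basic_type_sizes(void)
{
    for (size_t i = 0; i < TYPE_COUNT; i++)
        printf("%-12s %zu\n", basic_type_sizes[i].name,
               basic_type_sizes[i].size);
}
```

Calling "print_basic_type_sizes()" from a program built with the compiler you want to mimic gives candidate values for "ShortSize", "LongSize", "PointerSize" and so on.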
1653
1654               "EnumSize" => -1 | 0 | 1 | 2 | 4 | 8
1655                   Set the number of bytes that are occupied by an enumeration
1656                   type.  On most systems, this is equal to the size of an
1657                   integer, which is also the default. However, for some
1658                   compilers, the size of an enumeration type depends on the
1659                   size occupied by the largest enumerator. So the size may
1660                   vary between 1 and 8. If you have
1661
1662                     enum foo {
1663                       ONE = 100, TWO = 200
1664                     };
1665
1666                   this will occupy one byte because the enum can be
1667                   represented as an unsigned one-byte value. However,
1668
1669                     enum foo {
1670                       ONE = -100, TWO = 200
1671                     };
1672
1673                   will occupy two bytes, because the -100 forces the type to
1674                   be signed, and 200 doesn't fit into a signed one-byte
1675                   value.  Therefore, the type used is a signed two-byte
1676                   value.  If this is the behaviour you need, set the EnumSize
1677                   to 0.
1678
1679                   Some compilers try to follow this strategy, but don't care
1680                   whether the enumeration has signed values or not. They
1681                   always declare an enum as signed. On such a compiler, given
1682
1683                     enum one { ONE = -100, TWO = 100 };
1684                     enum two { ONE =  100, TWO = 200 };
1685
1686                   enum "one" will occupy only one byte, while enum "two" will
1687                   occupy two bytes, even though it could be represented by an
1688                   unsigned one-byte value. If this is the behaviour of your
1689                   compiler, set EnumSize to "-1".
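
The two sizing strategies described above (EnumSize 0 and EnumSize -1) can be modelled in a few lines of C. This is a sketch of the rule as stated in the text, not the module's actual implementation; "enum_size", "fits_signed" and "fits_unsigned" are hypothetical names, and the range of enumerator values is assumed to fit into a signed 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Does the range [lo, hi] fit into a signed integer of BYTES bytes? */
static int fits_signed(int64_t lo, int64_t hi, int bytes)
{
    if (bytes >= 8)
        return 1;
    int64_t max = ((int64_t)1 << (8 * bytes - 1)) - 1;
    return lo >= -max - 1 && hi <= max;
}

/* Does the range [lo, hi] fit into an unsigned integer of BYTES bytes? */
static int fits_unsigned(int64_t lo, int64_t hi, int bytes)
{
    if (lo < 0)
        return 0;
    if (bytes >= 8)
        return 1;
    return hi <= ((int64_t)1 << (8 * bytes)) - 1;
}

/*
 * Smallest size in bytes for an enum whose enumerators span [lo, hi].
 * always_signed == 0 models EnumSize => 0 (unsigned if nothing is negative),
 * always_signed != 0 models EnumSize => -1 (the type is always signed).
 */
int enum_size(int64_t lo, int64_t hi, int always_signed)
{
    static const int sizes[] = { 1, 2, 4, 8 };

    assert(lo <= hi);
    for (int i = 0; i < 4; i++) {
        if (fits_signed(lo, hi, sizes[i]))
            return sizes[i];
        if (!always_signed && fits_unsigned(lo, hi, sizes[i]))
            return sizes[i];
    }
    return 8;
}
```

For the enumerations shown above, "enum_size(100, 200, 0)" yields 1 and "enum_size(-100, 200, 0)" yields 2, while with the always-signed strategy "enum_size(-100, 100, 1)" yields 1 and "enum_size(100, 200, 1)" yields 2.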
1690
1691               "Alignment" => 0 | 1 | 2 | 4 | 8 | 16
1692                   Set the struct member alignment. This option controls where
1693                   padding bytes are inserted between struct members. It
1694                   globally sets the alignment for all structs/unions.
1695                   However, this can be overridden from within the source code
1696                   with the common "pack" pragma as explained in "Supported
1697                   pragma directives".  The default alignment is 1, which
1698                   means no padding bytes are inserted. A setting of 0 means
1699                   native alignment, i.e.  the alignment of the system that
1700                   Convert::Binary::C has been compiled on. You can determine
1701                   the native properties using the "native" function.
1702
1703                   The "Alignment" option is similar to the "-Zp[n]" option of
1704                   the Intel compiler. It globally specifies the maximum
1705                   boundary to which struct members are aligned. Consider the
1706                   following structure and the sizes of "char", "short",
1707                   "long" and "double" being 1, 2, 4 and 8, respectively.
1708
1709                     struct align {
1710                       char   a;
1711                       short  b, c;
1712                       long   d;
1713                       double e;
1714                     };
1715
1716                   With an alignment of 1 (the default), the struct members
1717                   would be packed tightly:
1718
1719                     0   1   2   3   4   5   6   7   8   9  10  11  12
1720                     +---+---+---+---+---+---+---+---+---+---+---+---+
1721                     | a |   b   |   c   |       d       |             ...
1722                     +---+---+---+---+---+---+---+---+---+---+---+---+
1723
1724                        12  13  14  15  16  17
1725                         +---+---+---+---+---+
1726                     ...     e               |
1727                         +---+---+---+---+---+
1728
1729                   With an alignment of 2, the struct members larger than one
1730                   byte would be aligned to 2-byte boundaries, which results
1731                   in a single padding byte between "a" and "b".
1732
1733                     0   1   2   3   4   5   6   7   8   9  10  11  12
1734                     +---+---+---+---+---+---+---+---+---+---+---+---+
1735                     | a | * |   b   |   c   |       d       |         ...
1736                     +---+---+---+---+---+---+---+---+---+---+---+---+
1737
1738                        12  13  14  15  16  17  18
1739                         +---+---+---+---+---+---+
1740                     ...         e               |
1741                         +---+---+---+---+---+---+
1742
1743                   With an alignment of 4, the struct members of size 2 would
1744                   be aligned to 2-byte boundaries and larger struct members
1745                   would be aligned to 4-byte boundaries:
1746
1747                     0   1   2   3   4   5   6   7   8   9  10  11  12
1748                     +---+---+---+---+---+---+---+---+---+---+---+---+
1749                     | a | * |   b   |   c   | * | * |       d       | ...
1750                     +---+---+---+---+---+---+---+---+---+---+---+---+
1751
1752                        12  13  14  15  16  17  18  19  20
1753                         +---+---+---+---+---+---+---+---+
1754                     ... |               e               |
1755                         +---+---+---+---+---+---+---+---+
1756
1757                   This layout of the struct members allows the compiler to
1758                   generate optimized code because aligned members can be
1759                   accessed more easily by the underlying architecture.
1760
1761                   Finally, setting the alignment to 8 will align "double"s to
1762                   8-byte boundaries:
1763
1764                     0   1   2   3   4   5   6   7   8   9  10  11  12
1765                     +---+---+---+---+---+---+---+---+---+---+---+---+
1766                     | a | * |   b   |   c   | * | * |       d       | ...
1767                     +---+---+---+---+---+---+---+---+---+---+---+---+
1768
1769                        12  13  14  15  16  17  18  19  20  21  22  23  24
1770                         +---+---+---+---+---+---+---+---+---+---+---+---+
1771                     ... | * | * | * | * |               e               |
1772                         +---+---+---+---+---+---+---+---+---+---+---+---+
1773
1774                   Further increasing the alignment does not alter the layout
1775                   of our structure, as only members larger than 8 bytes would
1776                   be affected.
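
If your compiler supports the common "pack" pragma, the four layouts above can be reproduced and inspected directly. The sketch below assumes a compiler that accepts "#pragma pack(push, n)" (GCC, Clang and Microsoft's compiler do) and a platform where "double" is naturally aligned to 8 bytes; fixed-width types stand in for "short" and "long" so that the member sizes are 1, 2, 4 and 8 as in the text. The struct names and "print_align_layouts" are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* The same members under four different maximum alignments. */
#pragma pack(push, 1)
struct align1 { char a; int16_t b, c; int32_t d; double e; };
#pragma pack(pop)

#pragma pack(push, 2)
struct align2 { char a; int16_t b, c; int32_t d; double e; };
#pragma pack(pop)

#pragma pack(push, 4)
struct align4 { char a; int16_t b, c; int32_t d; double e; };
#pragma pack(pop)

#pragma pack(push, 8)
struct align8 { char a; int16_t b, c; int32_t d; double e; };
#pragma pack(pop)

/* Print the offset of every member and the total size of one layout. */
#define PRINT_LAYOUT(type)                                          \
    printf("%-14s a=%zu b=%zu c=%zu d=%zu e=%zu size=%zu\n", #type, \
           offsetof(type, a), offsetof(type, b), offsetof(type, c), \
           offsetof(type, d), offsetof(type, e), sizeof(type))

void print_align_layouts(void)
{
    PRINT_LAYOUT(struct align1);
    PRINT_LAYOUT(struct align2);
    PRINT_LAYOUT(struct align4);
    PRINT_LAYOUT(struct align8);
}
```

Under the stated assumptions the reported sizes should be 17, 18, 20 and 24 bytes, matching the diagrams. If they differ, the corresponding "Alignment" setting will not describe your compiler's layout.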
1777
1778                   The alignment of a structure depends on its largest member
1779                   and on the setting of the "Alignment" option. With
1780                   "Alignment" set to 2, a structure holding a "long" would be
1781                   aligned to a 2-byte boundary, while a structure containing
1782                   only "char"s would have no alignment restrictions.
1783                   (Unfortunately, that's not the whole story. See the
1784                   "CompoundAlignment" option for details.)
1785
1786                   Here's another example. Assuming 8-byte alignment, the
1787                   following two structs will both have a size of 16 bytes:
1788
1789                     struct one {
1790                       char   c;
1791                       double d;
1792                     };
1793
1794                     struct two {
1795                       double d;
1796                       char   c;
1797                     };
1798
1799                   This is clear for "struct one", because the member "d" has
1800                   to be aligned to an 8-byte boundary, and thus 7 padding
1801                   bytes are inserted after "c". But for "struct two", the
1802                   padding bytes are inserted at the end of the structure,
1803                   which may not seem to make sense at first. However, it
1804                   makes perfect sense if you think about an array of "struct
1805                   two". Each "double" has to be aligned to an 8-byte
1806                   boundary, and thus each array element has to occupy 16
1807                   bytes. With that in mind, it would be strange if a single
1808                   "struct two" variable had a different size. It would also
1809                   make the widely used construct
1810
1811                     struct two array[] = { {1.0, 0}, {2.0, 1} };
1812                     int elements = sizeof(array) / sizeof(struct two);
1813
1814                   impossible.
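
The effect of the trailing padding can be checked with a short C sketch. It uses the two structs and the array from the text; the helper names "element_count" and "stride_between_doubles" are hypothetical. The checks are written so that they hold regardless of the platform's actual alignment of "double".

```c
#include <assert.h>
#include <stddef.h>

struct one { char c; double d; };
struct two { double d; char c; };

/* Both layouts end up the same size: "two" is padded at its end. */
static_assert(sizeof(struct one) == sizeof(struct two),
              "tail padding makes both structs equally large");

static const struct two array[] = { {1.0, 0}, {2.0, 1} };

/* Number of elements in the array, using the idiom from the text. */
size_t element_count(void)
{
    return sizeof(array) / sizeof(struct two);
}

/* Distance in bytes between the "d" members of neighbouring elements. */
size_t stride_between_doubles(void)
{
    return (size_t)((const char *)&array[1].d - (const char *)&array[0].d);
}
```

Because the stride between elements equals "sizeof(struct two)", the division in "element_count" yields exactly 2. Without the padding at the end of the struct, the two quantities would differ and the idiom would break.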
1815
1816                   The alignment behaviour described here seems to be common
1817                   for all compilers. However, not all compilers have an
1818                   option to configure their default alignment.
1819
1820               "CompoundAlignment" => 0 | 1 | 2 | 4 | 8 | 16
1821                   Usually, the alignment of a compound (i.e. a "struct" or a
1822                   "union") depends only on its largest member and on the
1823                   setting of the "Alignment" option. There are, however,
1824                   architectures and compilers where compounds can have
1825                   different alignment constraints.
1826
1827                   For most platforms and compilers, the alignment constraint
1828                   for compounds is 1 byte. That is, on most platforms
1829
1830                     struct onebyte {
1831                       char byte;
1832                     };
1833
1834                   will have an alignment of 1 and also a size of 1. But if
1835                   you take an ARM architecture, the above "struct onebyte"
1836                   will have an alignment of 4, and thus also a size of 4.
1837
1838                   You can configure this by setting "CompoundAlignment" to 4.
1839                   This will ensure that the alignment of compounds is always
1840                   4.
1841
1842                   Setting "CompoundAlignment" to 0 means native compound
1843                   alignment, i.e. the compound alignment of the system that
1844                   Convert::Binary::C has been compiled on. You can determine
1845                   the native properties using the "native" function.
1846
1847                   There are also compilers for certain platforms that allow
1848                   you to adjust the compound alignment. If you're not aware
1849                   of the fact that your compiler/architecture has a compound
1850                   alignment other than 1, strange things can happen. If, for
1851                   example, the compound alignment is 2 and you have something
1852                   like
1853
1854                     typedef unsigned char U8;
1855
1856                     struct msg_head {
1857                       U8 cmd;
1858                       struct {
1859                         U8 hi;
1860                         U8 low;
1861                       } crc16;
1862                       U8 len;
1863                     };
1864
1865                   there will be one padding byte inserted before the embedded
1866                   "crc16" struct and after the "len" member, which is most
1867                   probably not what was intended:
1868
1869                     0     1     2     3     4     5     6
1870                     +-----+-----+-----+-----+-----+-----+
1871                     | cmd |  *  | hi  | low | len |  *  |
1872                     +-----+-----+-----+-----+-----+-----+
1873
1874                   Note that both "#pragma pack" and the "Alignment" option
1875                   can override "CompoundAlignment". If you set
1876                   "CompoundAlignment" to 4, but "Alignment" to 2, compounds
1877                   will actually be aligned on 2-byte boundaries.
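
Most desktop compilers use a compound alignment of 1, so the layout in the diagram above cannot be observed there directly. As a rough emulation, the C11 "_Alignas" specifier can force the embedded struct onto a 2-byte boundary. This is a sketch under that assumption (a C11 compiler), not a description of how any particular ABI implements compound alignment; "print_msg_head_layout" is a hypothetical helper.

```c
#include <stddef.h>
#include <stdio.h>

typedef unsigned char U8;

struct msg_head {
    U8 cmd;
    struct {
        _Alignas(2) U8 hi;  /* gives the embedded struct an alignment of 2 */
        U8 low;
    } crc16;
    U8 len;
};

/* Print where each member ends up and how large the struct is. */
void print_msg_head_layout(void)
{
    printf("cmd=%zu crc16=%zu len=%zu size=%zu\n",
           offsetof(struct msg_head, cmd),
           offsetof(struct msg_head, crc16),
           offsetof(struct msg_head, len),
           sizeof(struct msg_head));
}
```

With the embedded struct aligned to 2 bytes, "crc16" starts at offset 2, "len" at offset 4, and the whole struct occupies 6 bytes, which is the layout shown in the diagram.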
1878
1879               "ByteOrder" => 'BigEndian' | 'LittleEndian'
1880                   Set the byte order for integers larger than a single byte.
1881                   Little endian (Intel, least significant byte first) and big
1882                   endian (Motorola, most significant byte first) byte order
1883                   are supported. The default byte order is the same as the
1884                   byte order of the host system unless overridden by
1885                   "CBC_DEFAULT_BYTEORDER" at compile time.
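
To illustrate what the two byte orders mean for a packed integer, the following C sketch stores a 32-bit value either most significant byte first or least significant byte first, and probes the byte order of the host. The function names "store_u32" and "host_is_little_endian" are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store VALUE into OUT, big endian if BIG_ENDIAN is non-zero. */
void store_u32(uint8_t out[4], uint32_t value, int big_endian)
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        /* Big endian: highest byte first. Little endian: lowest first. */
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        out[i] = (uint8_t)(value >> shift);
    }
}

/* Returns 1 if the host stores the least significant byte first. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}
```

Storing 0x12345678 in big endian order gives the bytes 12 34 56 78; in little endian order the same value becomes 78 56 34 12.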
1886
1887               "EnumType" => 'Integer' | 'String' | 'Both'
1888                   This option controls the type that enumeration constants
1889                   will have in data structures returned by the "unpack"
1890                   method.  If you have the following definitions:
1891
1892                     typedef enum {
1893                       SUNDAY, MONDAY, TUESDAY, WEDNESDAY,
1894                       THURSDAY, FRIDAY, SATURDAY
1895                     } Weekday;
1896
1897                     typedef enum {
1898                       JANUARY, FEBRUARY, MARCH, APRIL, MAY, JUNE, JULY,
1899                       AUGUST, SEPTEMBER, OCTOBER, NOVEMBER, DECEMBER
1900                     } Month;
1901
1902                     typedef struct {
1903                       int     year;
1904                       Month   month;
1905                       int     day;
1906                       Weekday weekday;
1907                     } Date;
1908
1909                   and a byte string that holds a packed Date struct, then
1910                   you'll get the following results from a call to the
1911                   "unpack" method.
1912
1913                   "Integer"
1914                       Enumeration constants are returned as plain integers.
1915                       This is fast, but may not be very useful. It is also
1916                       the default.
1917
1918                         $date = {
1919                           'year' => 2002,
1920                           'month' => 0,
1921                           'day' => 7,
1922                           'weekday' => 1
1923                         };
1924
1925                   "String"
1926                       Enumeration constants are returned as strings. This
1927                       will create a string constant for every unpacked
1928                       enumeration constant and thus consumes more time and
1929                       memory. However, the result may be more useful.
1930
1931                         $date = {
1932                           'year' => 2002,
1933                           'month' => 'JANUARY',
1934                           'day' => 7,
1935                           'weekday' => 'MONDAY'
1936                         };
1937
1938                   "Both"
1939                       Enumeration constants are returned as dual-typed
1940                       scalars.  If evaluated in string context, the
1941                       enumeration constant will be a string; if evaluated in
1942                       numeric context, the enumeration constant will be an
1943                       integer.
1944
1945                         $date = $c->EnumType('Both')->unpack('Date', $binary);
1946
1947                         printf "Weekday = %s (%d)\n\n", $date->{weekday},
1948                                                         $date->{weekday};
1949
1950                         if ($date->{month} == 0) {
1951                           print "It's $date->{month}, happy new year!\n\n";
1952                         }
1953
1954                         print Dumper($date);
1955
1956                       This will print:
1957
1958                         Weekday = MONDAY (1)
1959
1960                         It's JANUARY, happy new year!
1961
1962                         $VAR1 = {
1963                           'year' => 2002,
1964                           'month' => 'JANUARY',
1965                           'day' => 7,
1966                           'weekday' => 'MONDAY'
1967                         };
1968
1969               "DisabledKeywords" => [ KEYWORDS ]
1970                   This option allows you to selectively deactivate certain
1971                   keywords in the C parser. Some C compilers don't have the
1972                   complete ANSI keyword set; for example, they may not
1973                   recognize the keywords "const" or "void". If you do
1974
1975                     typedef int void;
1976
1977                   on such a compiler, this will usually be ok. But if you
1978                   parse this with an ANSI compiler, it will be a syntax
1979                   error. To parse the above code correctly, you have to
1980                   disable the "void" keyword in the Convert::Binary::C
1981                   parser:
1982
1983                     $c->DisabledKeywords([qw( void )]);
1984
1985                   By default, the Convert::Binary::C parser will recognize
1986                   the keywords "inline" and "restrict". If your compiler
1987                   doesn't have these new keywords, it usually doesn't matter.
1988                   Only if you're using the keywords as identifiers, as in
1989
1990                     typedef struct inline {
1991                       int a, b;
1992                     } restrict;
1993
1994                   you'll have to disable these ISO-C99 keywords:
1995
1996                     $c->DisabledKeywords([qw( inline restrict )]);
1997
1998                   The parser allows you to disable the following keywords:
1999
2000                     asm
2001                     auto
2002                     const
2003                     double
2004                     enum
2005                     extern
2006                     float
2007                     inline
2008                     long
2009                     register
2010                     restrict
2011                     short
2012                     signed
2013                     static
2014                     unsigned
2015                     void
2016                     volatile
2017
2018               "KeywordMap" => { KEYWORD => TOKEN, ... }
2019                   This option allows you to add new keywords to the parser.
2020                   These new keywords can either be mapped to existing tokens
2021                   or simply ignored. For example, recent versions of the GNU
2022                   compiler recognize the keywords "__signed__" and
2023                   "__extension__".  The first one obviously is a synonym for
2024                   "signed", while the second one is only a marker for a
2025                   language extension.
2026
2027                   Using the preprocessor, you could of course do the
2028                   following:
2029
2030                     $c->Define(qw( __signed__=signed __extension__= ));
2031
2032                   However, the preprocessor symbols could be undefined or
2033                   redefined in the code, and
2034
2035                     #ifdef __signed__
2036                     # undef __signed__
2037                     #endif
2038
2039                     typedef __extension__ __signed__ long long s_quad;
2040
2041                   would generate a parse error, because "__signed__" is an
2042                   unexpected identifier.
2043
2044                   Instead of utilizing the preprocessor, you'll have to
2045                   create mappings for the new keywords directly in the parser
2046                   using "KeywordMap". In the above example, you want to map
2047                   "__signed__" to the built-in C keyword "signed" and ignore
2048                   "__extension__". This could be done with the following
2049                   code:
2050
2051                     $c->KeywordMap({ __signed__    => 'signed',
2052                                      __extension__ => undef });
2053
2054                   You can specify any valid identifier as hash key, and
2055                   either a valid C keyword or "undef" as hash value.  Having
2056                   configured the object that way, you could parse even
2057
2058                     #ifdef __signed__
2059                     # undef __signed__
2060                     #endif
2061
2062                     typedef __extension__ __signed__ long long s_quad;
2063
2064                   without problems.
2065
2066                   Note that "KeywordMap" and "DisabledKeywords" work together
2067                   perfectly. You could, for example, disable the "signed"
2068                   keyword, but still have "__signed__" mapped to the original
2069                   "signed" token:
2070
2071                     $c->configure(DisabledKeywords => [ 'signed' ],
2072                                   KeywordMap       => { __signed__  => 'signed' });
2073
2074                   This would allow you to define
2075
2076                     typedef __signed__ long signed;
2077
2078                   which would normally be a syntax error because "signed"
2079                   cannot be used as an identifier.
2080
2081               "UnsignedChars" => 0 | 1
2082                   Use this boolean option if you want characters that are
2083                   declared without an explicit "signed" or "unsigned" type
2084                   specifier to be unsigned.  By default, characters are
2085                   signed.
2086
2087               "UnsignedBitfields" => 0 | 1
2088                   Use this boolean option if you want bitfields that are
2089                   declared without an explicit "signed" or "unsigned" type
2090                   specifier to be unsigned.  By default, bitfields are
2091                   signed.
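
Whether plain "char" and plain "int" bitfields are signed is implementation-defined in C, so it can be worth checking what your compiler does before setting "UnsignedChars" or "UnsignedBitfields". The probe below is a sketch; "PLAIN_CHAR_IS_UNSIGNED" and "plain_bitfield_is_unsigned" are hypothetical names, and the bitfield test assumes a two's complement representation.

```c
#include <limits.h>
#include <string.h>

/* 1 if plain "char" is unsigned on this compiler, 0 if it is signed. */
enum { PLAIN_CHAR_IS_UNSIGNED = (CHAR_MIN == 0) };

/* Returns 1 if a plain "int" bitfield is unsigned, 0 if it is signed. */
int plain_bitfield_is_unsigned(void)
{
    struct { int flag : 1; } probe;

    /* Set every bit of the struct, and with it the one-bit field. */
    memset(&probe, 0xFF, sizeof probe);

    /* A set one-bit field reads as 1 if unsigned and as -1 if signed. */
    return probe.flag > 0;
}
```

If either probe reports "unsigned" for the compiler that produced your binary data, set the corresponding option to 1.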
2092
2093               "Warnings" => 0 | 1
2094                   Use this boolean option if you want warnings to be issued
2095                   during the parsing of source code. Currently, warnings are
2096                   only reported by the preprocessor, so don't expect the
2097                   output to cover everything.
2098
2099                   By default, warnings are turned off and only errors will be
2100                   reported. However, even these errors are turned off if you
2101                   run without the "-w" flag.
2102
2103               "HasCPPComments" => 0 | 1
2104                   Use this option to turn C++ comments on or off. By default,
2105                   C++ comments are enabled. Disabling C++ comments may be
2106                   necessary if your code includes strange things like:
2107
2108                     one = 4 //* <- divide */ 4;
2109                     two = 2;
2110
2111                   With C++ comments, the above will be interpreted as
2112
2113                     one = 4
2114                     two = 2;
2115
2116                   which will obviously be a syntax error, but without C++
2117                   comments, it will be interpreted as
2118
2119                     one = 4 / 4;
2120                     two = 2;
2121
2122                   which is correct.
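
The same construct can serve as a probe that tells you whether a given C compiler treats "//" as the start of a comment. In the sketch below, "SLASH_SLASH_PROBE" and "HAS_CPP_COMMENTS" are hypothetical names. The enumerator evaluates to 2 when the rest of the line is discarded as a C++ comment, and to 2 / 2 = 1 when only the "/* ... */" part is a comment.

```c
enum {
    SLASH_SLASH_PROBE = 2 //* dropped entirely if "//" starts a comment */ 2
};

/* 1 if the compiler recognizes C++ comments, 0 otherwise. */
#define HAS_CPP_COMMENTS (SLASH_SLASH_PROBE == 2)

int has_cpp_comments(void)
{
    return HAS_CPP_COMMENTS;
}
```

A compiler in C99 mode or later reports 1; a strict C89/C90 compiler is expected to report 0, in which case "HasCPPComments" should be turned off for code written for it.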
2123
2124               "HasMacroVAARGS" => 0 | 1
2125                   Use this option to turn the "__VA_ARGS__" macro expansion
2126                   on or off. If this is enabled (which is the default), you
2127                   can use variable length argument lists in your preprocessor
2128                   macros.
2129
2130                     #define DEBUG( ... )  fprintf( stderr, __VA_ARGS__ )
2131
2132                   There's normally no reason to turn that feature off.
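
As a slightly fuller illustration of a variadic macro, the C sketch below forwards its variable arguments to "snprintf". The macro "FORMAT_TO" and the function "format_status" are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Everything after the first two arguments is passed on via __VA_ARGS__. */
#define FORMAT_TO(buf, size, ...)  snprintf((buf), (size), __VA_ARGS__)

/* Write "status=<code>" into BUF and return the length of that text. */
int format_status(char *buf, size_t size, int code)
{
    assert(buf != NULL && size > 0);
    return FORMAT_TO(buf, size, "status=%d", code);
}
```

Headers that define macros like this can only be parsed when "HasMacroVAARGS" is enabled.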
2133
2134               "StdCVersion" => undef | INTEGER
2135                   Use this option to change the value of the preprocessor's
2136                   predefined "__STDC_VERSION__" macro. When set to "undef",
2137                   the macro will not be defined.
2138
2139               "HostedC" => undef | 0 | 1
2140                   Use this option to change the value of the preprocessor's
2141                   predefined "__STDC_HOSTED__" macro. When set to "undef",
2142                   the macro will not be defined.
2143
2144               "Include" => [ INCLUDES ]
2145                   Use this option to set the include path for the internal
2146                   preprocessor. The option value is a reference to an array
2147                   of strings, each string holding a directory that should be
2148                   searched for includes.
2149
2150               "Define" => [ DEFINES ]
2151                   Use this option to define symbols in the preprocessor.  The
2152                   option value is, again, a reference to an array of strings.
2153                   Each string can be either just a symbol or an assignment to
2154                   a symbol. This is completely equivalent to what the "-D"
2155                   option does for most preprocessors.
2156
2157                   The following will define the symbol "FOO" and define "BAR"
2158                   to be 12345:
2159
2160                     $c->configure(Define => [qw( FOO BAR=12345 )]);
2161
2162               "Assert" => [ ASSERTIONS ]
2163                   Use this option to make assertions in the preprocessor.  If
2164                   you don't know what assertions are, don't be concerned,
2165                   since they're deprecated anyway. They are, however, used in
2166                   some systems' include files.  The value is an array
2167                   reference, just like for the macro definitions. Only the
2168                   way the assertions are defined is a bit different and
2169                   mimics the way they are defined with the "#assert"
2170                   directive:
2171
2172                     $c->configure(Assert => ['foo(bar)']);
2173
2174               "OrderMembers" => 0 | 1
2175                   When using "unpack" on compounds and iterating over the
2176                   returned hash, the order of the compound members is
2177                   generally not preserved due to the nature of hash tables.
2178                   It is not even guaranteed that the order is the same
2179                   between different runs of the same program. This can be
2180                   very annoying if you simply use Data::Dumper to dump your
2181                   data structures and the compound members always show up in
2182                   a different order.
2183
2184                   By setting "OrderMembers" to a non-zero value, all hashes
2185                   returned by "unpack" are tied to a class that preserves the
2186                   order of the hash keys.  This way, all compound members
2187                   will be returned in the correct order just as they are
2188                   defined in your C code.
2189
2190                     use Convert::Binary::C;
2191                     use Data::Dumper;
2192
2193                     $c = Convert::Binary::C->new->parse(<<'ENDC');
2194                     struct test {
2195                       char one;
2196                       char two;
2197                       struct {
2198                         char never;
2199                         char change;
2200                         char this;
2201                         char order;
2202                       } three;
2203                       char four;
2204                     };
2205                     ENDC
2206
2207                     $data = "Convert";
2208
2209                     $u1 = $c->unpack('test', $data);
2210                     $c->OrderMembers(1);
2211                     $u2 = $c->unpack('test', $data);
2212
2213                     print Data::Dumper->Dump([$u1, $u2], [qw(u1 u2)]);
2214
2215                   This will print something like (key order in $u1 may vary):
2216
2217                     $u1 = {
2218                       'one' => 67,
2219                       'two' => 111,
2220                       'three' => {
2221                         'never' => 110,
2222                         'change' => 118,
2223                         'this' => 101,
2224                         'order' => 114
2225                       },
2226                       'four' => 116
2227                     };
2228                     $u2 = {
2229                       'one' => 67,
2230                       'two' => 111,
2231                       'three' => {
2232                         'never' => 110,
2233                         'change' => 118,
2234                         'this' => 101,
2235                         'order' => 114
2236                       },
2237                       'four' => 116
2238                     };
2239
2240                   To be able to use this option, you have to install one of
2241                   the following modules: Tie::Hash::Indexed, Hash::Ordered or
2242                   Tie::IxHash.  If more than one of these modules is
2243                   installed, Convert::Binary::C will use them in that order
2244                   of preference.
2245
2246                   When using this option, you should keep in mind that tied
2247                   hashes are significantly slower and consume more memory
2248                   than ordinary hashes, even when the class they're tied to
2249                   is implemented efficiently. So don't turn this option on if
2250                   you don't have to.
2251
2252                   You can also influence hash member ordering by using the
2253                   "CBC_ORDER_MEMBERS" environment variable.
2254
2255               "Bitfields" => { OPTION => VALUE, ... }
2256                   Use this option to specify and configure a bitfield layout
2257                   engine. You can choose an engine by passing its name to the
2258                   "Engine" option, like:
2259
2260                     $c->configure(Bitfields => { Engine => 'Generic' });
2261
2262                   Each engine can have its own set of options, although
2263                   currently none of them does.
2264
2265                   You can choose between the following bitfield engines:
2266
2267                   "Generic"
2268                       This engine implements the behaviour of most UNIX C
2269                       compilers, including GCC. It does not handle packed
2270                       bitfields yet.
2271
2272                   "Microsoft"
2273                       This engine implements the behaviour of Microsoft's
2274                       "cl" compiler.  It should be fairly complete and can
2275                       handle packed bitfields.
2276
2277                   "Simple"
2278                       This engine is only used for testing the bitfield
2279                       infrastructure in Convert::Binary::C. There's usually
2280                       no reason to use it.
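
                   To compare engines, you can switch between them on the same
                   object and query the resulting size. The following is a
                   minimal sketch; "struct flags" is a made-up example type,
                   and the sizes printed depend on your configuration, so no
                   particular output is claimed here.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# 'struct flags' is a hypothetical type used only for illustration.
my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct flags {
  unsigned short a : 3;
  unsigned int   b : 5;
};
ENDC

# Switch the layout engine and see how the struct size is affected.
for my $engine (qw( Generic Microsoft )) {
  $c->configure(Bitfields => { Engine => $engine });
  printf "%-9s engine: sizeof(struct flags) = %d\n",
         $engine, $c->sizeof('flags');
}
```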
2281
2282               You can reconfigure all options even after you have parsed some
2283               code. The changes will be applied to the already parsed
2284               definitions. This works as long as array lengths are not
2285               affected by the changes. If you have Alignment and IntSize set
2286               to 4 and parse code like this
2287
2288                 typedef struct {
2289                   char abc;
2290                   int  day;
2291                 } foo;
2292
2293                 struct bar {
2294                   foo  zap[2*sizeof(foo)];
2295                 };
2296
2297               the array "zap" in "struct bar" will obviously have 16
2298               elements. If you reconfigure the alignment to 1 now, the size
2299               of "foo" is now 5 instead of 8. While the alignment is adjusted
2300               correctly, the number of elements in array "zap" will still be
2301               16 and will not be changed to 10.
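
               The effect described above can be observed with a short script.
               This is a sketch that assumes the two definitions shown above;
               it prints the size of "foo" and the type of "zap" before and
               after changing the alignment.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(Alignment => 4, IntSize => 4)->parse(<<'ENDC');
typedef struct {
  char abc;
  int  day;
} foo;

struct bar {
  foo  zap[2*sizeof(foo)];
};
ENDC

# The array length was fixed when the code was parsed, so only the
# element size changes when the alignment is reconfigured.
for my $alignment (4, 1) {
  $c->configure(Alignment => $alignment);
  printf "Alignment %d: sizeof(foo) = %d, zap is '%s'\n",
         $alignment, $c->sizeof('foo'), $c->typeof('bar.zap');
}
```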
2302
2303   parse
2304       "parse" CODE
2305               Parses a string of valid C code. All enumeration, compound and
2306               type definitions are extracted. You can call the "parse" and
2307               "parse_file" methods as often as you like to add further
2308               definitions to the Convert::Binary::C object.
2309
2310               "parse" will throw an exception if an error occurs.  On
2311               success, the method returns a reference to its object.
2312
2313               See "Parsing C code" for an example.
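
               Because "parse" throws on error and returns its object on
               success, a typical caller wraps it in "eval" and chains further
               calls. The snippet below is a sketch; the struct names are
               invented for the example, and the exact wording of the error
               message is not shown because it is not specified here.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# The closing brace is missing, so this is not valid C.
my $ok = eval {
  Convert::Binary::C->new->parse('struct broken { int x;');
  1;
};
print "parse failed: $@" unless $ok;

# On success the object is returned, so method calls can be chained.
my $c    = Convert::Binary::C->new;
my $size = $c->parse('struct point { int x; int y; };')->sizeof('point');
print "sizeof(struct point) = $size\n";
```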
2314
2315   parse_file
2316       "parse_file" FILE
2317               Parses a C source file. All enumeration, compound and type
2318               definitions are extracted. You can call the "parse" and
2319               "parse_file" methods as often as you like to add further
2320               definitions to the Convert::Binary::C object.
2321
2322               "parse_file" will search the include path given via the
2323               "Include" option for the file if it cannot find it in the
2324               current directory.
2325
2326               "parse_file" will throw an exception if an error occurs. On
2327               success, the method returns a reference to its object.
2328
2329               See "Parsing C code" for an example.
2330
2331               When calling "parse" or "parse_file" multiple times, you may
2332               use types previously defined, but you are not allowed to
2333               redefine types. The state of the preprocessor is also saved, so
2334               you may also use defines from a previous parse. This works only
2335               as long as the preprocessor is not reset. See "Preprocessor
2336               configuration" for details.
2337
2338               When you're parsing C source files instead of C header files,
2339               note that local definitions are ignored. This means that type
2340               definitions hidden within functions will not be recognized by
2341               Convert::Binary::C. This is necessary because different
2342               functions (even different blocks within the same function) can
2343               define types with the same name:
2344
2345                 void my_func(int i)
2346                 {
2347                   if (i < 10)
2348                   {
2349                     enum digit { ONE, TWO, THREE } x = ONE;
2350                     printf("%d, %d\n", i, x);
2351                   }
2352                   else
2353                   {
2354                     enum digit { THREE, TWO, ONE } x = ONE;
2355                     printf("%d, %d\n", i, x);
2356                   }
2357                 }
2358
2359               The above is a valid piece of C code, but it's not possible for
2360               Convert::Binary::C to distinguish between the different
2361               definitions of "enum digit", as they're only defined locally
2362               within the corresponding block.
2363
2364   clean
2365       "clean" Clears all information that has been collected during previous
2366               calls to "parse" or "parse_file".  You can use this method if
2367               you want to parse some entirely different code, but with the
2368               same configuration.
2369
2370               The "clean" method returns a reference to its object.
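
               As a sketch of how this plays out (the struct names are
               hypothetical), the following parses one definition, cleans the
               object, and parses another. The earlier type is gone while the
               configuration is kept.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian');
$c->parse('struct first { int a; };');

# clean() returns the object, so the next parse can be chained.
$c->clean->parse('struct second { long b; };');

for my $name (qw( first second )) {
  my $def = $c->def($name);
  printf "%-6s => %s\n", $name, defined $def ? "'$def'" : 'undef';
}
print "ByteOrder is still ", $c->ByteOrder, "\n";
```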
2371
2372   clone
2373       "clone" Returns an exact, independent copy of the object.
2374
2375                 $c = Convert::Binary::C->new(Include => ['/usr/include']);
2376                 $c->parse_file('definitions.c');
2377                 $clone = $c->clone;
2378
2379               The above code is mostly equivalent to the code below. The
2380               one difference is that using "sourcify" and "parse" might alter
2381               the order of the parsed data, so methods such as "compound"
2382               could return the definitions in a different order:
2383
2384                 $c = Convert::Binary::C->new(Include => ['/usr/include']);
2385                 $c->parse_file('definitions.c');
2386                 $clone = Convert::Binary::C->new(%{$c->configure});
2387                 $clone->parse($c->sourcify);
2388
2389               Using "clone" is just a lot faster.
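
               The independence of the copy can be checked with a small
               sketch like the one below. The struct names are made up for
               the example; definitions added to the clone should not appear
               in the original.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c     = Convert::Binary::C->new->parse('struct shared { int a; };');
my $clone = $c->clone;

# Extend only the clone with a further definition.
$clone->parse('struct extra { int b; };');

for my $obj ([original => $c], [clone => $clone]) {
  my ($label, $cbc) = @$obj;
  my $def = $cbc->def('extra');
  printf "%-8s knows 'extra': %s\n", $label, defined $def ? 'yes' : 'no';
}
```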
2390
2391   def
2392       "def" NAME
2393       "def" TYPE
2394               If you need to know if a definition for a certain type name
2395               exists, use this method. You pass it the name of an enum,
2396               struct, union or typedef, and it will return a non-empty string
2397               being either "enum", "struct", "union", or "typedef" if there's
2398               a definition for the type in question, an empty string if
2399               there's no such definition, or "undef" if the name is
2400               completely unknown. If the type can be interpreted as a basic
2401               type, "basic" will be returned.
2402
2403               If you pass in a TYPE, the output will be slightly different.
2404               If the specified member exists, the "def" method will return
2405               "member". If the member doesn't exist, or if the type cannot
2406               have members, the empty string will be returned. Again, if the
2407               name of the type is completely unknown, "undef" will be
2408               returned. This may be useful if you want to check if a certain
2409               member exists within a compound, for example.
2410
2411                 use Convert::Binary::C;
2412
2413                 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2414
2415                 typedef struct __not  not;
2416                 typedef struct __not *ptr;
2417
2418                 struct foo {
2419                   enum bar *xxx;
2420                 };
2421
2422                 typedef int quad[4];
2423
2424                 ENDC
2425
2426                 for my $type (qw( not ptr foo bar xxx foo.xxx foo.abc xxx.yyy
2427                                   quad quad[3] quad[5] quad[-3] short[1] ),
2428                               'unsigned long')
2429                 {
2430                   my $def = $c->def($type);
2431                   printf "%-14s  =>  %s\n",
2432                           $type,     defined $def ? "'$def'" : 'undef';
2433                 }
2434
2435               The following would be returned by the "def" method:
2436
2437                 not             =>  ''
2438                 ptr             =>  'typedef'
2439                 foo             =>  'struct'
2440                 bar             =>  ''
2441                 xxx             =>  undef
2442                 foo.xxx         =>  'member'
2443                 foo.abc         =>  ''
2444                 xxx.yyy         =>  undef
2445                 quad            =>  'typedef'
2446                 quad[3]         =>  'member'
2447                 quad[5]         =>  'member'
2448                 quad[-3]        =>  'member'
2449                 short[1]        =>  undef
2450                 unsigned long   =>  'basic'
2451
2452               So, if "def" returns a non-empty string, you can safely use any
2453               other method with that type's name or with that member
2454               expression.
2455
2456               Concerning arrays, note that the index into an array doesn't
2457               need to be within the bounds of the array's definition, just
2458               like in C. In the above example, "quad[5]" and "quad[-3]" are
2459               valid members of the "quad" array, even though it is declared
2460               to have only four elements.
2461
2462               In cases where the typedef namespace overlaps with the
2463               namespace of enums/structs/unions, the "def" method will give
2464               preference to the typedef and will thus return the string
2465               "typedef". You could however force interpretation as an enum,
2466               struct or union by putting "enum", "struct" or "union" in front
2467               of the type's name.
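
               A common case of such an overlap is a struct and a typedef
               that share one name. The following sketch uses an invented
               type "node" to show both lookups.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# 'node' exists both as a struct tag and as a typedef name.
my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef struct node {
  int value;
} node;
ENDC

for my $name ('node', 'struct node') {
  my $def = $c->def($name);
  printf "%-12s => %s\n", $name, defined $def ? "'$def'" : 'undef';
}
```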
2468
2469   defined
2470       "defined" MACRO
2471               You can use the "defined" method to find out if a certain macro
2472               is defined, just like you would use the "defined" operator of
2473               the preprocessor. For example, the following code
2474
2475                 use Convert::Binary::C;
2476
2477                 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2478
2479                 #define ADD(a, b) ((a) + (b))
2480
2481                 #if 1
2482                 # define DEFINED
2483                 #else
2484                 # define UNDEFINED
2485                 #endif
2486
2487                 ENDC
2488
2489                 for my $macro (qw( ADD DEFINED UNDEFINED )) {
2490                   my $not = $c->defined($macro) ? '' : ' not';
2491                   print "Macro '$macro' is$not defined.\n";
2492                 }
2493
2494               would print:
2495
2496                 Macro 'ADD' is defined.
2497                 Macro 'DEFINED' is defined.
2498                 Macro 'UNDEFINED' is not defined.
2499
2500               You have to keep in mind that this works only as long as the
2501               preprocessor is not reset. See "Preprocessor configuration" for
2502               details.
2503
2504   pack
2505       "pack" TYPE
2506       "pack" TYPE, DATA
2507       "pack" TYPE, DATA, STRING
2508               Use this method to pack a complex data structure into a binary
2509               string according to a type definition that has been previously
2510               parsed. DATA must be a scalar matching the type definition. C
2511               structures and unions are represented by references to Perl
2512               hashes, C arrays by references to Perl arrays.
2513
2514                 use Convert::Binary::C;
2515                 use Data::Dumper;
2516                 use Data::Hexdumper;
2517
2518                 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2519                                             , LongSize  => 4
2520                                             , ShortSize => 2
2521                                             )
2522                                        ->parse(<<'ENDC');
2523                 struct test {
2524                   char    ary[3];
2525                   union {
2526                     short word[2];
2527                     long  quad;
2528                   }       uni;
2529                 };
2530                 ENDC
2531
2532               Hashes don't have to contain a key for each compound member and
2533               arrays may be truncated:
2534
2535                 $binary = $c->pack('test', { ary => [1, 2], uni => { quad => 42 } });
2536
2537               Elements not defined in the Perl data structure will be set to
2538               zero in the packed byte string. If you pass "undef" as the
2539               second parameter or simply omit it, the whole string will be
2540               initialized with zero bytes. On success, the packed byte string
2541               is returned.
2542
2543                 print hexdump(data => $binary);
2544
2545               The above code would print:
2546
2547                   0x0000 : 01 02 00 00 00 00 2A                            : ......*
2548
2549               You could also use "unpack" and dump the data structure.
2550
2551                 $unpacked = $c->unpack('test', $binary);
2552                 print Data::Dumper->Dump([$unpacked], ['unpacked']);
2553
2554               This would print:
2555
2556                 $unpacked = {
2557                   'ary' => [
2558                     1,
2559                     2,
2560                     0
2561                   ],
2562                   'uni' => {
2563                     'word' => [
2564                       0,
2565                       42
2566                     ],
2567                     'quad' => 42
2568                   }
2569                 };
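
               The zero initialization mentioned above can be verified with a
               short sketch. It assumes the same "struct test" as before and
               checks that omitting DATA yields a string of zero bytes whose
               length matches "sizeof".

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
                               , LongSize  => 4
                               , ShortSize => 2
                               )
                          ->parse(<<'ENDC');
struct test {
  char    ary[3];
  union {
    short word[2];
    long  quad;
  }       uni;
};
ENDC

# No DATA argument: every byte of the result is zero.
my $blank = $c->pack('test');

printf "%d bytes, %s\n", length $blank,
       $blank =~ /[^\0]/ ? 'not all zero' : 'all zero';
```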
2570
2571               If TYPE refers to a compound object, you may pack any member of
2572               that compound object. Simply add a member expression to the
2573               type name, just as you would access the member in C:
2574
2575                 $array = $c->pack('test.ary', [1, 2, 3]);
2576                 print hexdump(data => $array);
2577
2578                 $value = $c->pack('test.uni.word[1]', 2);
2579                 print hexdump(data => $value);
2580
2581               This would give you:
2582
2583                   0x0000 : 01 02 03                                        : ...
2584                   0x0000 : 00 02                                           : ..
2585
2586               Call "pack" with the optional STRING argument if you want to
2587               use an existing binary string to insert the data.  If called in
2588               a void context, "pack" will directly modify the string you
2589               passed as the third argument.  Otherwise, a copy of the string
2590               is created, and "pack" will modify and return the copy, so the
2591               original string will remain unchanged.
2592
2593               The 3-argument version may be useful if you want to change only
2594               a few members of a complex data structure without having to
2595               "unpack" everything, change the members, and then "pack" again
2596               (which could waste lots of memory and CPU cycles). So, instead
2597               of doing something like
2598
2599                 $test = $c->unpack('test', $binary);
2600                 $test->{uni}{quad} = 4711;
2601                 $new = $c->pack('test', $test);
2602
2603               to change the "uni.quad" member of $binary, you could simply do
2604               either
2605
2606                 $new = $c->pack('test', { uni => { quad => 4711 } }, $binary);
2607
2608               or
2609
2610                 $c->pack('test', { uni => { quad => 4711 } }, $binary);
2611
2612               where the latter directly modifies $binary.  Besides this
2613               code being a lot shorter (and perhaps even more readable), it
2614               can be significantly faster if you're dealing with really big
2615               data blocks.
2616
2617               If the length of the input string is less than the size
2618               required by the type, the string (or its copy) is extended and
2619               the extended part is initialized to zero.  If the length is
2620               more than the size required by the type, the string keeps its
2621               length, and a returned copy has that same length; the bytes
2622               beyond the type's size are left untouched.
2623
2624                 $too_short = pack "C*", (1 .. 4);
2625                 $too_long  = pack "C*", (1 .. 20);
2626
2627                 $c->pack('test', { uni => { quad => 0x4711 } }, $too_short);
2628                 print "too_short:\n", hexdump(data => $too_short);
2629
2630                 $copy = $c->pack('test', { uni => { quad => 0x4711 } }, $too_long);
2631                 print "\ncopy:\n", hexdump(data => $copy);
2632
2633               This would print:
2634
2635                 too_short:
2636                   0x0000 : 01 02 03 00 00 47 11                            : .....G.
2637
2638                 copy:
2639                   0x0000 : 01 02 03 00 00 47 11 08 09 0A 0B 0C 0D 0E 0F 10 : .....G..........
2640                   0x0010 : 11 12 13 14                                     : ....
2641
2642   unpack
2643       "unpack" TYPE, STRING
2644               Use this method to unpack a binary string and create an
2645               arbitrarily complex Perl data structure based on a previously
2646               parsed type definition.
2647
2648                 use Convert::Binary::C;
2649                 use Data::Dumper;
2650
2651                 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2652                                             , LongSize  => 4
2653                                             , ShortSize => 2
2654                                             )
2655                                        ->parse( <<'ENDC' );
2656                 struct test {
2657                   char    ary[3];
2658                   union {
2659                     short word[2];
2660                     long *quad;
2661                   }       uni;
2662                 };
2663                 ENDC
2664
2665                 # Generate some binary dummy data
2666                 $binary = pack "C*", 1 .. $c->sizeof('test');
2667
2668               On failure, e.g. if the specified type cannot be found, the
2669               method will throw an exception. On success, a reference to a
2670               complex Perl data structure is returned, which can directly be
2671               dumped using the Data::Dumper module:
2672
2673                 $unpacked = $c->unpack('test', $binary);
2674                 print Dumper($unpacked);
2675
2676               This would print:
2677
2678                 $VAR1 = {
2679                   'ary' => [
2680                     1,
2681                     2,
2682                     3
2683                   ],
2684                   'uni' => {
2685                     'word' => [
2686                       1029,
2687                       1543
2688                     ],
2689                     'quad' => '289644378304612875'
2690                   }
2691                 };
2692
2693               If TYPE refers to a compound object, you may unpack any member
2694               of that compound object. Simply add a member expression to the
2695               type name, just as you would access the member in C:
2696
2697                 $binary2 = substr $binary, $c->offsetof('test', 'uni.word');
2698
2699                 $unpack1 = $unpacked->{uni}{word};
2700                 $unpack2 = $c->unpack('test.uni.word', $binary2);
2701
2702                 print Data::Dumper->Dump([$unpack1, $unpack2], [qw(unpack1 unpack2)]);
2703
2704               You will find that the output is exactly the same for both
2705               $unpack1 and $unpack2:
2706
2707                 $unpack1 = [
2708                   1029,
2709                   1543
2710                 ];
2711                 $unpack2 = [
2712                   1029,
2713                   1543
2714                 ];
2715
2716               When "unpack" is called in list context, it will unpack as many
2717               elements as possible from STRING, including zero if STRING is
2718               not long enough.
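
               As a sketch of the list-context behaviour, the following
               unpacks a buffer that holds several consecutive records. The
               type "struct point" and the buffer contents are invented for
               the example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
                               , ShortSize => 2
                               )
                          ->parse('struct point { short x; short y; };');

# Six big-endian 16-bit values, i.e. three consecutive records.
my $buffer = pack 'n*', 1 .. 6;

# List context: one hash reference per complete record in the buffer.
my @points = $c->unpack('point', $buffer);

printf "(%d, %d)\n", @{$_}{qw( x y )} for @points;
```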
2719
2720   initializer
2721       "initializer" TYPE
2722       "initializer" TYPE, DATA
2723               The "initializer" method can be used to retrieve an initializer
2724               string for a certain TYPE.  This can be useful if you have to
2725               initialize only a couple of members in a huge compound type or
2726               if you simply want to generate initializers automatically.
2727
2728                 struct date {
2729                   unsigned year : 12;
2730                   unsigned month:  4;
2731                   unsigned day  :  5;
2732                   unsigned hour :  5;
2733                   unsigned min  :  6;
2734                 };
2735
2736                 typedef struct {
2737                   enum { DATE, QWORD } type;
2738                   short number;
2739                   union {
2740                     struct date   date;
2741                     unsigned long qword;
2742                   } choice;
2743                 } data;
2744
2745               Given the above code has been parsed
2746
2747                 $init = $c->initializer('data');
2748                 print "data x = $init;\n";
2749
2750               would print the following:
2751
2752                 data x = {
2753                       0,
2754                       0,
2755                       {
2756                               {
2757                                       0,
2758                                       0,
2759                                       0,
2760                                       0,
2761                                       0
2762                               }
2763                       }
2764                 };
2765
2766               You could directly put that into a C program, although it
2767               probably isn't very useful yet. It becomes more useful if you
2768               actually specify how you want to initialize the type:
2769
2770                 $data = {
2771                   type   => 'QWORD',
2772                   choice => {
2773                     date  => { month => 12, day => 24 },
2774                     qword => 4711,
2775                   },
2776                   stuff => 'yes?',
2777                 };
2778
2779                 $init = $c->initializer('data', $data);
2780                 print "data x = $init;\n";
2781
2782               This would print the following:
2783
2784                 data x = {
2785                       QWORD,
2786                       0,
2787                       {
2788                               {
2789                                       0,
2790                                       12,
2791                                       24,
2792                                       0,
2793                                       0
2794                               }
2795                       }
2796                 };
2797
2798               As only the first member of a "union" can be initialized,
2799               "choice.qword" is ignored. You will not be warned about the
2800               fact that you probably tried to initialize a member other than
2801               the first. This is considered a feature, because it allows you
2802               to use "unpack" to generate the initializer data:
2803
2804                 $data = $c->unpack('data', $binary);
2805                 $init = $c->initializer('data', $data);
2806
2807               Since "unpack" unpacks all union members, you would otherwise
2808               have to delete all but the first one before feeding it into
2809               "initializer".
2810
2811               Also, "stuff" is ignored, because it actually isn't a member of
2812               "data". You won't be warned about that either.
2813
2814   sizeof
2815       "sizeof" TYPE
2816               This method will return the size of a C type in bytes.  If it
2817               cannot find the type, it will throw an exception.
2818
2819               If the type defines some kind of compound object, you may ask
2820               for the size of a member of that compound object:
2821
2822                 $size = $c->sizeof('test.uni.word[1]');
2823
2824               This would set $size to 2.
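
               One practical use of "sizeof" is to check that a buffer is
               long enough before it is unpacked. The sketch below assumes a
               hypothetical "struct header" and a deliberately short input
               string.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(Alignment => 4, IntSize => 4)->parse(<<'ENDC');
struct header {
  char tag[4];
  int  length;
};
ENDC

my $need = $c->sizeof('header');
my $data = 'RIFF';                 # only the tag, no length field

if (length($data) < $need) {
  printf "need %d bytes, got only %d\n", $need, length $data;
}
else {
  my $header = $c->unpack('header', $data);
  print "length field: $header->{length}\n";
}
```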
2825
2826   typeof
2827       "typeof" TYPE
2828               This method will return the type of a C member.  While this
2829               only makes sense for compound types, it's legal to also use it
2830               for non-compound types.  If it cannot find the type, it will
2831               throw an exception.
2832
2833               The "typeof" method can be used on any valid member, even on
2834               arrays or unnamed types. It will always return a string that
2835               holds the name (or in case of unnamed types only the class) of
2836               the type, optionally followed by a '*' character to indicate
2837               it's a pointer type, and optionally followed by one or more
2838               array dimensions if it's an array type. If the type is a
2839               bitfield, the type name is followed by a colon and the number
2840               of bits.
2841
2842                 struct test {
2843                   char    ary[3];
2844                   union {
2845                     short word[2];
2846                     long *quad;
2847                   }       uni;
2848                   struct {
2849                     unsigned short six:6;
2850                     unsigned short ten:10;
2851                   }       bits;
2852                 };
2853
2854               Given the above C code has been parsed, calls to "typeof" would
2855               return the following values:
2856
2857                 $c->typeof('test')             => 'struct test'
2858                 $c->typeof('test.ary')         => 'char [3]'
2859                 $c->typeof('test.uni')         => 'union'
2860                 $c->typeof('test.uni.quad')    => 'long *'
2861                 $c->typeof('test.uni.word')    => 'short [2]'
2862                 $c->typeof('test.uni.word[1]') => 'short'
2863                 $c->typeof('test.bits')        => 'struct'
2864                 $c->typeof('test.bits.six')    => 'unsigned short :6'
2865                 $c->typeof('test.bits.ten')    => 'unsigned short :10'
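
               Because the returned strings follow a fixed pattern, they can
               be inspected with simple regular expressions. The following
               sketch assumes the "struct test" shown above; the classification
               labels are purely illustrative.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct test {
  char    ary[3];
  union {
    short word[2];
    long *quad;
  }       uni;
  struct {
    unsigned short six:6;
    unsigned short ten:10;
  }       bits;
};
ENDC

for my $member (qw( test.ary test.uni.quad test.uni.word[1] test.bits.ten )) {
  my $type = $c->typeof($member);

  # Classify the member by looking at the suffix of the type string.
  my @notes;
  push @notes, 'pointer'                if $type =~ /\*/;
  push @notes, "array of $1 elements"   if $type =~ /\[(\d+)\]/;
  push @notes, "bitfield of $1 bits"    if $type =~ /:(\d+)$/;

  printf "%-18s %-22s %s\n", $member, "'$type'",
         @notes ? join(', ', @notes) : 'plain';
}
```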
2866
2867   offsetof
2868       "offsetof" TYPE, MEMBER
2869               You can use "offsetof" just like the C macro of the same
2870               name. It simply returns the offset (in bytes) of MEMBER
2871               relative to TYPE.
2872
2873                 use Convert::Binary::C;
2874
2875                 $c = Convert::Binary::C->new( Alignment   => 4
2876                                             , LongSize    => 4
2877                                             , PointerSize => 4
2878                                             )
2879                                        ->parse(<<'ENDC');
2880                 typedef struct {
2881                   char abc;
2882                   long day;
2883                   int *ptr;
2884                 } week;
2885
2886                 struct test {
2887                   week zap[8];
2888                 };
2889                 ENDC
2890
2891                 @args = (
2892                   ['test',        'zap[5].day'  ],
2893                   ['test.zap[2]', 'day'         ],
2894                   ['test',        'zap[5].day+1'],
2895                   ['test',        'zap[-3].ptr' ],
2896                 );
2897
2898                 for (@args) {
2899                   my $offset = eval { $c->offsetof(@$_) };
2900                   printf "\$c->offsetof('%s', '%s') => $offset\n", @$_;
2901                 }
2902
2903               The final loop will print:
2904
2905                 $c->offsetof('test', 'zap[5].day') => 64
2906                 $c->offsetof('test.zap[2]', 'day') => 4
2907                 $c->offsetof('test', 'zap[5].day+1') => 65
2908                 $c->offsetof('test', 'zap[-3].ptr') => -28
2909
2910               • The first iteration simply shows that the offset of
2911                 "zap[5].day" is 64 relative to the beginning of "struct
2912                 test".
2913
2914               • You may additionally specify a member for the type passed as
2915                 the first argument, as shown in the second iteration.
2916
2917               • The offset suffix is also supported by "offsetof", so the
2918                 third iteration will correctly print 65.
2919
2920               • The last iteration demonstrates that even out-of-bounds array
2921                 indices are handled correctly, just as they are handled in C.
2922
2923               Unlike the C macro, "offsetof" also works on array types.
2924
2925                 $offset = $c->offsetof('test.zap', '[3].ptr+2');
2926                 print "offset = $offset";
2927
2928               This will print:
2929
2930                 offset = 46
2931
2932               If TYPE is a compound, MEMBER may optionally be prefixed with a
2933               dot, so
2934
2935                 printf "offset = %d\n", $c->offsetof('week', 'day');
2936                 printf "offset = %d\n", $c->offsetof('week', '.day');
2937
2938               are both equivalent and will print
2939
2940                 offset = 4
2941                 offset = 4
2942
2943               This allows one to
2944
2945               • use the C macro style, without a leading dot, and
2946
2947               • directly use the output of the "member" method, which
2948                 includes a leading dot for compound types, as input for the
2949                 MEMBER argument.
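
               The second point can be illustrated with a round trip: take an
               offset, ask "member" for the matching member expression, and
               pass that expression straight back to "offsetof". This is a
               sketch using the definitions from the example above; the
               offsets chosen are arbitrary.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( Alignment   => 4
                               , LongSize    => 4
                               , PointerSize => 4
                               )
                          ->parse(<<'ENDC');
typedef struct {
  char abc;
  long day;
  int *ptr;
} week;

struct test {
  week zap[8];
};
ENDC

my %round_trip;

for my $offset (24, 39, 69) {
  # member() yields an expression with a leading dot ...
  my $member = $c->member('test', $offset);
  # ... which offsetof() accepts unchanged as its MEMBER argument.
  $round_trip{$offset} = $c->offsetof('test', $member);
  printf "%2d -> %-14s -> %2d\n", $offset, "'$member'", $round_trip{$offset};
}
```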
2950
2951   member
2952       "member" TYPE
2953       "member" TYPE, OFFSET
2954               You can think of "member" as being the reverse of the
2955               "offsetof" method. However, as this is more complex, there's no
2956               equivalent to "member" in the C language.
2957
2958               Usually this method is used if you want to retrieve the name of
2959               the member that is located at a specific offset of a previously
2960               parsed type.
2961
2962                 use Convert::Binary::C;
2963
2964                 $c = Convert::Binary::C->new( Alignment   => 4
2965                                             , LongSize    => 4
2966                                             , PointerSize => 4
2967                                             )
2968                                        ->parse(<<'ENDC');
2969                 typedef struct {
2970                   char abc;
2971                   long day;
2972                   int *ptr;
2973                 } week;
2974
2975                 struct test {
2976                   week zap[8];
2977                 };
2978                 ENDC
2979
2980                 for my $offset (24, 39, 69, 99) {
2981                   print "\$c->member('test', $offset)";
2982                   my $member = eval { $c->member('test', $offset) };
2983                   print $@ ? "\n  exception: $@" : " => '$member'\n";
2984                 }
2985
2986               This will print:
2987
2988                 $c->member('test', 24) => '.zap[2].abc'
2989                 $c->member('test', 39) => '.zap[3]+3'
2990                 $c->member('test', 69) => '.zap[5].ptr+1'
2991                 $c->member('test', 99)
2992                   exception: Offset 99 out of range (0 <= offset < 96)
2993
2994               • The output of the first iteration is obvious. The member
2995                 "zap[2].abc" is located at offset 24 of "struct test".
2996
2997               • In the second iteration, the offset points into a region of
2998                 padding bytes and thus no member of "week" can be named.
2999                 Instead of a member name the offset relative to "zap[3]" is
3000                 appended.
3001
3002               • In the third iteration, the offset points to "zap[5].ptr".
3003                 However, "zap[5].ptr" is located at 68, not at 69, and thus
3004                 the remaining offset of 1 is also appended.
3005
3006               • The last iteration causes an exception because the offset of
3007                 99 is not valid for "struct test" since the size of "struct
3008                 test" is only 96. You might argue that this is inconsistent,
3009                 since "offsetof" can also handle out-of-bounds array members.
3010                 But as soon as you have more than one level of array nesting,
3011                 there's an infinite number of out-of-bounds members for a
3012                 single given offset, so it would be impossible to return a
3013                 list of all members.
3014
3015               You can additionally specify a member for the type passed as
3016               the first argument:
3017
3018                 $member = $c->member('test.zap[2]', 6);
3019                 print $member;
3020
3021               This will print:
3022
3023                 .day+2
3024
3025               Like "offsetof", "member" also works on array types:
3026
3027                 $member = $c->member('test.zap', 42);
3028                 print $member;
3029
3030               This will print:
3031
3032                 [3].day+2
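
               Since the output of "member" can be fed back into "offsetof",
               the two methods can be checked against each other. The
               following is only a sketch, assuming the same configuration
               and C code as in the example above; the @mismatch array is
               just an illustrative helper.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Same layout as in the example above (4-byte alignment, longs, pointers).
my $c = Convert::Binary::C->new( Alignment   => 4
                               , LongSize    => 4
                               , PointerSize => 4
                               )
                          ->parse(<<'ENDC');
typedef struct {
  char abc;
  long day;
  int *ptr;
} week;

struct test {
  week zap[8];
};
ENDC

# Round trip: member name -> offset -> member name.
# Any name that does not survive the trip is collected in @mismatch.
my @mismatch;
for my $name ($c->member('test')) {
  my $offset = $c->offsetof('test', $name);
  my $back   = $c->member('test', $offset);
  push @mismatch, "$name -> $offset -> $back" if $back ne $name;
}

print "round trip ok\n" unless @mismatch;
print "mismatch: $_\n" for @mismatch;
```

               For a pure "struct" every member starts at its own offset, so
               no mismatch is expected here. For unions this does not hold in
               general, as explained below.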
3033
3034               While the behaviour for "struct"s is quite obvious, the
3035               behaviour for "union"s is rather tricky. As a single offset
3036               usually references more than one member of a union, there are
3037               certain rules that the algorithm uses for determining the best
3038               member.
3039
3040               • The first non-compound member that is referenced without an
3041                 offset has the highest priority.
3042
3043               • If no member is referenced without an offset, the first non-
3044                 compound member that is referenced with an offset will be
3045                 returned.
3046
3047               • Otherwise the first padding region that is encountered will
3048                 be taken.
3049
3050               As an example, given 4-byte-alignment and the union
3051
3052                 union choice {
3053                   struct {
3054                     char  color[2];
3055                     long  size;
3056                     char  taste;
3057                   }       apple;
3058                   char    grape[3];
3059                   struct {
3060                     long  weight;
3061                     short price[3];
3062                   }       melon;
3063                 };
3064
3065               the "member" method would return what is shown in the Member
3066               column of the following table. The Type column shows the result
3067               of the "typeof" method when passing the corresponding member.
3068
3069                 Offset   Member               Type
3070                 --------------------------------------
3071                    0     .apple.color[0]      'char'
3072                    1     .apple.color[1]      'char'
3073                    2     .grape[2]            'char'
3074                    3     .melon.weight+3      'long'
3075                    4     .apple.size          'long'
3076                    5     .apple.size+1        'long'
3077                    6     .melon.price[1]      'short'
3078                    7     .apple.size+3        'long'
3079                    8     .apple.taste         'char'
3080                    9     .melon.price[2]+1    'short'
3081                   10     .apple+10            'struct'
3082                   11     .apple+11            'struct'
3083
3084               It's like having a stack of all the union members and looking
3085               through the stack for the shiniest piece you can see. The
3086               beginning of a member (denoted by uppercase letters) is always
3087               shinier than the rest of a member, while padding regions
3088               (denoted by dashes) aren't shiny at all.
3089
3090                 Offset   0   1   2   3   4   5   6   7   8   9  10  11
3091                 -------------------------------------------------------
3092                 apple   (C) (C)  -   -  (S) (s)  s  (s) (T)  -  (-) (-)
3093                 grape    G   G  (G)
3094                 melon    W   w   w  (w)  P   p  (P)  p   P  (p)  -   -
3095
3096               If you look through that stack from top to bottom, you'll end
3097               up at the parenthesized members.
3098
3099               Alternatively, if you're not only interested in the best
3100               member, you can call "member" in list context, which makes it
3101               return all members referenced by the given offset.
3102
3103                 Offset   Member               Type
3104                 --------------------------------------
3105                    0     .apple.color[0]      'char'
3106                          .grape[0]            'char'
3107                          .melon.weight        'long'
3108                    1     .apple.color[1]      'char'
3109                          .grape[1]            'char'
3110                          .melon.weight+1      'long'
3111                    2     .grape[2]            'char'
3112                          .melon.weight+2      'long'
3113                          .apple+2             'struct'
3114                    3     .melon.weight+3      'long'
3115                          .apple+3             'struct'
3116                    4     .apple.size          'long'
3117                          .melon.price[0]      'short'
3118                    5     .apple.size+1        'long'
3119                          .melon.price[0]+1    'short'
3120                    6     .melon.price[1]      'short'
3121                          .apple.size+2        'long'
3122                    7     .apple.size+3        'long'
3123                          .melon.price[1]+1    'short'
3124                    8     .apple.taste         'char'
3125                          .melon.price[2]      'short'
3126                    9     .melon.price[2]+1    'short'
3127                          .apple+9             'struct'
3128                   10     .apple+10            'struct'
3129                          .melon+10            'struct'
3130                   11     .apple+11            'struct'
3131                          .melon+11            'struct'
3132
3133               The first member returned is always the best member. The other
3134               members are sorted according to the rules given above. This
3135               means that members referenced without an offset are followed by
3136               members referenced with an offset. Padding regions will be at
3137               the end.
3138
3139               If OFFSET is not given in the method call, "member" will return
3140               a list of all possible members of TYPE.
3141
3142                 print "$_\n" for $c->member('choice');
3143
3144               This will print:
3145
3146                 .apple.color[0]
3147                 .apple.color[1]
3148                 .apple.size
3149                 .apple.taste
3150                 .grape[0]
3151                 .grape[1]
3152                 .grape[2]
3153                 .melon.weight
3154                 .melon.price[0]
3155                 .melon.price[1]
3156                 .melon.price[2]
3157
3158               In scalar context, the number of possible members is returned.
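
               The different calling conventions can be compared side by
               side. This is a minimal sketch, assuming the union from above
               and a configuration ("Alignment", "LongSize", "ShortSize")
               chosen to match the layout shown in the tables.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Sizes are set explicitly so the layout matches the tables above.
my $c = Convert::Binary::C->new( Alignment => 4
                               , LongSize  => 4
                               , ShortSize => 2
                               )
                          ->parse(<<'ENDC');
union choice {
  struct {
    char  color[2];
    long  size;
    char  taste;
  }       apple;
  char    grape[3];
  struct {
    long  weight;
    short price[3];
  }       melon;
};
ENDC

my $offset = 6;

my $best  = $c->member('choice', $offset);  # scalar context: best member only
my @all   = $c->member('choice', $offset);  # list context: all candidates
my $count = $c->member('choice');           # scalar context, no OFFSET: count
my @names = $c->member('choice');           # list context, no OFFSET: names

print "best member at offset $offset: $best\n";
print "all members at offset $offset: @all\n";
print "$count possible members\n";
```

               The first element of @all is the same member that is returned
               in scalar context.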
3159
3160   tag
3161       "tag" TYPE
3162       "tag" TYPE, TAG
3163       "tag" TYPE, TAG1 => VALUE1, TAG2 => VALUE2, ...
3164               The "tag" method can be used to tag properties to a TYPE. It's
3165               a bit like having "configure" for individual types.
3166
3167               See "USING TAGS" for an example.
3168
3169               Note that while you can tag whole types as well as compound
3170               members, it is not possible to tag array members, i.e. you
3171               cannot treat, for example, "a[1]" and "a[2]" differently.
3172
3173               Also note that in code like this
3174
3175                 struct test {
3176                   int a;
3177                   struct {
3178                     int x;
3179                   } b, c;
3180                 };
3181
3182               if you tag "test.b.x", this will also tag "test.c.x"
3183               implicitly.
3184
3185               It is also possible to tag basic types if you really want to do
3186               that, for example:
3187
3188                 $c->tag('int', Format => 'Binary');
3189
3190               To remove a tag from a type, you can either set that tag to
3191               "undef", for example
3192
3193                 $c->tag('test', Hooks => undef);
3194
3195               or use "untag".
3196
3197               To see if a tag is attached to a type or to get the value of a
3198               tag, pass only the type and tag name to "tag":
3199
3200                 $c->tag('test.a', Format => 'Binary');
3201
3202                 $hooks = $c->tag('test.a', 'Hooks');
3203                 $format = $c->tag('test.a', 'Format');
3204
3205               This will give you:
3206
3207                 $hooks = undef;
3208                 $format = 'Binary';
3209
3210               To see which tags are attached to a type, pass only the type.
3211               The "tag" method will now return a hash reference containing
3212               all tags attached to the type:
3213
3214                 $tags = $c->tag('test.a');
3215
3216               This will give you:
3217
3218                 $tags = {
3219                   'Format' => 'Binary'
3220                 };
3221
3222               "tag" will throw an exception if an error occurs.  If called as
3223               a 'set' method, it will return a reference to its object,
3224               allowing you to chain together consecutive method calls.
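
               The set, query and remove forms described above can be
               combined as in the following sketch. It assumes the "struct
               test" shown earlier; the variable names are only
               illustrative.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct test {
  int a;
  struct {
    int x;
  } b, c;
};
ENDC

# Calls that set tags return the object, so they can be chained.
$c->tag('test.a',   Format    => 'Binary')
  ->tag('test.b.x', ByteOrder => 'BigEndian');

my $format = $c->tag('test.a', 'Format');   # value of an attached tag
my $hooks  = $c->tag('test.a', 'Hooks');    # undef, no such tag attached
my $tags   = $c->tag('test.a');             # hash reference with all tags

# "test.c.x" shares its declaration with "test.b.x" (see above).
my $shared = $c->tag('test.c.x', 'ByteOrder');

# Remove the tag again and query once more.
$c->untag('test.a', 'Format');
my $after = $c->tag('test.a', 'Format');

print "Format tag: $format\n";
print "tags: ", join(', ', sort keys %$tags), "\n";
print "after untag: ", defined $after ? $after : 'undef', "\n";
```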
3225
3226               Note that when a compound is inlined, tags attached to the
3227               inlined compound are ignored, for example:
3228
3229                 $c->parse(<<ENDC);
3230                 struct header {
3231                   int id;
3232                   int len;
3233                   unsigned flags;
3234                 };
3235
3236                 struct message {
3237                   struct header;
3238                   short samples[32];
3239                 };
3240                 ENDC
3241
3242                 for my $type (qw( header message header.len )) {
3243                   $c->tag($type, Hooks => { unpack => sub { print "unpack: $type\n"; @_ } });
3244                 }
3245
3246                 for my $type (qw( header message )) {
3247                   print "[unpacking $type]\n";
3248                   $u = $c->unpack($type, $data);
3249                 }
3250
3251               This will print:
3252
3253                 [unpacking header]
3254                 unpack: header.len
3255                 unpack: header
3256                 [unpacking message]
3257                 unpack: header.len
3258                 unpack: message
3259
3260               As you can see from the above output, tags attached to members
3261               of inlined compounds ("header.len") are still handled.
3262
3263               The following tags can be configured:
3264
3265               "Format" => 'Binary' | 'String'
3266                   The "Format" tag allows you to control the way binary data
3267                   is converted by "pack" and "unpack".
3268
3269                   If you tag a "TYPE" as "Binary", it will not be converted
3270                   at all, i.e. it will be passed through as a binary string.
3271
3272                   If you tag it as "String", it will be treated like a
3273                   null-terminated C string, i.e. "unpack" will convert the C
3274                   string to a Perl string and "pack" will do the reverse.
3275
3276                   See "The Format Tag" for an example.
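
                   As a short sketch of both formats, assuming a made-up
                   "struct record" with one member tagged as "String" and
                   one tagged as "Binary":

```perl
use strict;
use warnings;
use Convert::Binary::C;

# "record" and its members are hypothetical names for illustration.
my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct record {
  char          name[8];
  unsigned char raw[4];
};
ENDC

$c->tag('record.name', Format => 'String')
  ->tag('record.raw',  Format => 'Binary');

# Both members are now given as plain Perl strings instead of arrays.
my $packed = $c->pack('record', { name => 'abc',
                                  raw  => "\x01\x02\x03\x04" });

my $rec = $c->unpack('record', $packed);

print "packed length: ", length($packed), "\n";
print "name: '$rec->{name}'\n";
print "raw:  ", unpack('H*', $rec->{raw}), "\n";
```

                   Without the tags, both members would be unpacked as array
                   references holding one number per element.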
3277
3278               "ByteOrder" => 'BigEndian' | 'LittleEndian'
3279                   The "ByteOrder" tag allows you to explicitly set the byte
3280                   order of a TYPE.
3281
3282                   See "The ByteOrder Tag" for an example.
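
                   A minimal sketch, assuming two hypothetical typedefs that
                   differ only in the tag attached to them:

```perl
use strict;
use warnings;
use Convert::Binary::C;

# The object itself is little endian; "be_u32" is overridden by a tag.
# Both typedef names are made up for this example.
my $c = Convert::Binary::C->new( ByteOrder => 'LittleEndian'
                               , IntSize   => 4
                               )
                          ->parse(<<'ENDC');
typedef unsigned int le_u32;
typedef unsigned int be_u32;
ENDC

$c->tag('be_u32', ByteOrder => 'BigEndian');

# Pack the same value through both types and show the bytes as hex.
my $le = unpack 'H*', $c->pack('le_u32', 0x01020304);
my $be = unpack 'H*', $c->pack('be_u32', 0x01020304);

print "le_u32: $le\n";
print "be_u32: $be\n";
```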
3283
3284               "Dimension" => '*'
3285               "Dimension" => VALUE
3286               "Dimension" => MEMBER
3287               "Dimension" => SUB
3288               "Dimension" => [ SUB, ARGS ]
3289                   The "Dimension" tag allows you to alter the size of an
3290                   array dynamically.
3291
3292                   You can tag fixed size arrays as being flexible using '*'.
3293                   This is useful if you cannot use flexible array members in
3294                   your source code.
3295
3296                     $c->tag('type.array', Dimension => '*');
3297
3298                   You can also tag an array to have a fixed size different
3299                   from the one it was originally declared with.
3300
3301                     $c->tag('type.array', Dimension => 42);
3302
3303                   If the array is a member of a compound, you can also tag it
3304                   to have a size corresponding to the value of another member
3305                   in that compound.
3306
3307                     $c->tag('type.array', Dimension => 'count');
3308
3309                   Finally, you can specify a subroutine that is called when
3310                   the size of the array needs to be determined.
3311
3312                     $c->tag('type.array', Dimension => \&get_count);
3313
3314                   By default, and if the array is a compound member, that
3315                   subroutine will be passed a reference to the hash storing
3316                   the data for the compound.
3317
3318                   You can also instruct Convert::Binary::C to pass additional
3319                   arguments to the subroutine by passing an array reference
3320                   instead of the subroutine reference. This array contains
3321                   the subroutine reference as well as a list of arguments.
3322                   It is possible to define certain special arguments using
3323                   the "arg" method.
3324
3325                     $c->tag('type.array', Dimension => [\&get_count, $c->arg('SELF'), 42]);
3326
3327                   See "The Dimension Tag" for various examples.
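
                   The member and subroutine variants can be sketched as
                   follows. The "struct message" and the binary string are
                   assumptions made up for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# A length-prefixed message; "data" is declared with a dummy size of 1.
my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct message {
  unsigned char count;
  unsigned char data[1];
};
ENDC

my $binary = "\x03\x0A\x0B\x0C";   # count = 3, followed by three bytes

# Variant 1: take the dimension from the "count" member.
$c->tag('message.data', Dimension => 'count');
my $by_member = $c->unpack('message', $binary);

# Variant 2: compute the dimension in a subroutine, which receives a
# reference to the hash holding the data of the enclosing compound.
$c->tag('message.data', Dimension => sub { $_[0]->{count} });
my $by_sub = $c->unpack('message', $binary);

print "by member: @{$by_member->{data}}\n";
print "by sub:    @{$by_sub->{data}}\n";
```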
3328
3329               "Hooks" => { HOOK => SUB, HOOK => [ SUB, ARGS ], ... }, ...
3330                   The "Hooks" tag allows you to register subroutines as
3331                   hooks.
3332
3333                   Hooks are called whenever a certain "TYPE" is packed or
3334                   unpacked. Hooks are currently considered an experimental
3335                   feature.
3336
3337                   "HOOK" can be one of the following:
3338
3339                     pack
3340                     unpack
3341                     pack_ptr
3342                     unpack_ptr
3343
3344                   "pack" and "unpack" hooks are called when processing their
3345                   "TYPE", while "pack_ptr" and "unpack_ptr" hooks are called
3346                   when processing pointers to their "TYPE".
3347
3348                   "SUB" is a reference to a subroutine that usually takes one
3349                   input argument, processes it and returns one output
3350                   argument.
3351
3352                   Alternatively, you can pass a custom list of arguments to
3353                   the hook by using an array reference instead of "SUB" that
3354                   holds the subroutine reference in the first element and the
3355                   arguments to be passed to the subroutine as the other
3356                   elements.  This way, you can even pass special arguments to
3357                   the hook using the "arg" method.
3358
3359                   Here are a few examples for registering hooks:
3360
3361                     $c->tag('ObjectType', Hooks => {
3362                               pack   => \&obj_pack,
3363                               unpack => \&obj_unpack
3364                             });
3365
3366                     $c->tag('ProtocolId', Hooks => {
3367                               unpack => sub { $protos[$_[0]] }
3368                             });
3369
3370                     $c->tag('ProtocolId', Hooks => {
3371                               unpack_ptr => [sub {
3372                                                sprintf "$_[0]:{0x%X}", $_[1]
3373                                              },
3374                                              $c->arg('TYPE', 'DATA')
3375                                             ],
3376                             });
3377
3378                   Note that the above example registers both an "unpack" hook
3379                   and an "unpack_ptr" hook for "ProtocolId" with two separate
3380                   calls to "tag". As long as you don't explicitly overwrite a
3381                   previously registered hook, it won't be modified or removed
3382                   by registering other hooks for the same "TYPE".
3383
3384                   To remove all registered hooks for a type, simply remove
3385                   the "Hooks" tag:
3386
3387                     $c->untag('ProtocolId', 'Hooks');
3388
3389                   To remove only a single hook, pass "undef" as "SUB" instead
3390                   of a subroutine reference:
3391
3392                     $c->tag('ObjectType', Hooks => { pack => undef });
3393
3394                   If all hooks are removed, the whole "Hooks" tag is removed.
3395
3396                   See "The Hooks Tag" for examples on how to use hooks.
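
                   As a small self-contained sketch, assuming "ProtocolId"
                   is a typedef for "unsigned char" and that @protos and
                   %proto_id are lookup tables made up for this example:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new
                          ->parse('typedef unsigned char ProtocolId;');

# Hypothetical lookup tables mapping protocol numbers to names and back.
my @protos   = qw( ICMP TCP UDP );
my %proto_id = map { $protos[$_] => $_ } 0 .. $#protos;

$c->tag('ProtocolId', Hooks => {
          pack   => sub { $proto_id{$_[0]} },   # name   -> number
          unpack => sub { $protos[$_[0]]   },   # number -> name
        });

my $packed = $c->pack('ProtocolId', 'UDP');
my $name   = $c->unpack('ProtocolId', $packed);

# Remove only the "unpack" hook; the "pack" hook stays registered.
$c->tag('ProtocolId', Hooks => { unpack => undef });
my $raw = $c->unpack('ProtocolId', $packed);

print "with hook:    $name\n";
print "without hook: $raw\n";
```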
3397
3398   untag
3399       "untag" TYPE
3400       "untag" TYPE, TAG1, TAG2, ...
3401               Use the "untag" method to remove one, several, or all tags from
3402               a type. If you don't pass any tag names, all tags attached to
3403               the type will be removed. Otherwise only the listed tags will
3404               be removed.
3405
3406               See "USING TAGS" for an example.
3407
3408   arg
3409       "arg" 'ARG', ...
3410               Creates placeholders for special arguments to be passed to
3411               hooks or other subroutines. These arguments are currently:
3412
3413               "SELF"
3414                   A reference to the calling Convert::Binary::C object. This
3415                   may be useful if you need to work with the object inside
3416                   the subroutine.
3417
3418               "TYPE"
3419                   The name of the type that is currently being processed by
3420                   the hook.
3421
3422               "DATA"
3423                   The data argument that is passed to the subroutine.
3424
3425               "HOOK"
3426                   The kind of hook for which the subroutine has been called,
3427                   for example "pack" or "unpack_ptr".
3428
3429               "arg" will return a placeholder for each argument it is being
3430               passed. Note that not all arguments may be supported depending
3431               on the context of the subroutine.
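
               The following sketch shows how the placeholders could be used
               to build a generic tracing hook. The $tracer subroutine, the
               @trace array and the "ProtocolId" typedef are assumptions
               made up for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new
                          ->parse('typedef unsigned char ProtocolId;');

# The hook receives its arguments in the order given to "arg" below
# and passes the data through unchanged.
my @trace;
my $tracer = sub {
  my($self, $hook, $type, $data) = @_;
  push @trace, { class => ref $self, hook => $hook, type => $type };
  return $data;
};

$c->tag('ProtocolId', Hooks => {
          unpack => [ $tracer, $c->arg('SELF', 'HOOK', 'TYPE', 'DATA') ],
        });

my $value = $c->unpack('ProtocolId', "\x07");

print "value: $value\n";
print "$_->{hook} hook called for $_->{type}\n" for @trace;
```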
3432
3433   dependencies
3434       "dependencies"
3435               After some code has been parsed using either the "parse" or
3436               "parse_file" methods, the "dependencies" method can be used to
3437               retrieve information about all files that the object depends
3438               on, i.e. all files that have been parsed.
3439
3440               In scalar context, the method returns a hash reference.  Each
3441               key is the name of a file. The values are again hash
3442               references, each of which holds the size, modification time
3443               (mtime), and change time (ctime) of the file at the moment it
3444               was parsed.
3445
3446                 use Convert::Binary::C;
3447                 use Data::Dumper;
3448
3449                 #----------------------------------------------------------
3450                 # Create object, set include path, parse 'string.h' header
3451                 #----------------------------------------------------------
3452                 my $c = Convert::Binary::C->new
3453                         ->Include('/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include',
3454                                   '/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include-fixed',
3455                                   '/usr/include')
3456                         ->parse_file('string.h');
3457
3458                 #----------------------------------------------------------
3459                 # Get dependencies of the object, extract dependency files
3460                 #----------------------------------------------------------
3461                 my $depend = $c->dependencies;
3462                 my @files  = keys %$depend;
3463
3464                 #-----------------------------
3465                 # Dump dependencies and files
3466                 #-----------------------------
3467                 print Data::Dumper->Dump([$depend, \@files],
3468                                       [qw( depend   *files )]);
3469
3470               The above code would print something like this:
3471
3472                 $depend = {
3473                   '/usr/include/sys/cdefs.h' => {
3474                     'size' => 20051,
3475                     'mtime' => 1604969938,
3476                     'ctime' => 1604969964
3477                   },
3478                   '/usr/include/gnu/stubs-32.h' => {
3479                     'size' => 449,
3480                     'mtime' => 1604969908,
3481                     'ctime' => 1604969964
3482                   },
3483                   '/usr/include/bits/wordsize.h' => {
3484                     'size' => 442,
3485                     'mtime' => 1604969934,
3486                     'ctime' => 1604969964
3487                   },
3488                   '/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include/stddef.h' => {
3489                     'size' => 12959,
3490                     'mtime' => 1604974286,
3491                     'ctime' => 1604975398
3492                   },
3493                   '/usr/include/stdc-predef.h' => {
3494                     'size' => 2290,
3495                     'mtime' => 1604969927,
3496                     'ctime' => 1604969964
3497                   },
3498                   '/usr/include/string.h' => {
3499                     'size' => 18766,
3500                     'mtime' => 1604969936,
3501                     'ctime' => 1604969964
3502                   },
3503                   '/usr/include/bits/types/locale_t.h' => {
3504                     'size' => 983,
3505                     'mtime' => 1604969927,
3506                     'ctime' => 1604969964
3507                   },
3508                   '/usr/include/bits/long-double.h' => {
3509                     'size' => 970,
3510                     'mtime' => 1604969933,
3511                     'ctime' => 1604969964
3512                   },
3513                   '/usr/include/bits/libc-header-start.h' => {
3514                     'size' => 3288,
3515                     'mtime' => 1604969927,
3516                     'ctime' => 1604969964
3517                   },
3518                   '/usr/include/strings.h' => {
3519                     'size' => 4753,
3520                     'mtime' => 1604969936,
3521                     'ctime' => 1604969964
3522                   },
3523                   '/usr/include/gnu/stubs.h' => {
3524                     'size' => 384,
3525                     'mtime' => 1604969927,
3526                     'ctime' => 1604969964
3527                   },
3528                   '/usr/include/bits/types/__locale_t.h' => {
3529                     'size' => 1722,
3530                     'mtime' => 1604969927,
3531                     'ctime' => 1604969964
3532                   },
3533                   '/usr/include/features.h' => {
3534                     'size' => 17235,
3535                     'mtime' => 1604969927,
3536                     'ctime' => 1604969964
3537                   }
3538                 };
3539                 @files = (
3540                   '/usr/include/sys/cdefs.h',
3541                   '/usr/include/gnu/stubs-32.h',
3542                   '/usr/include/bits/wordsize.h',
3543                   '/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include/stddef.h',
3544                   '/usr/include/stdc-predef.h',
3545                   '/usr/include/string.h',
3546                   '/usr/include/bits/types/locale_t.h',
3547                   '/usr/include/bits/long-double.h',
3548                   '/usr/include/bits/libc-header-start.h',
3549                   '/usr/include/strings.h',
3550                   '/usr/include/gnu/stubs.h',
3551                   '/usr/include/bits/types/__locale_t.h',
3552                   '/usr/include/features.h'
3553                 );
3554
3555               In list context, the method returns the names of all files that
3556               have been parsed, i.e. the following lines are equivalent:
3557
3558                 @files = keys %{$c->dependencies};
3559                 @files = $c->dependencies;
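
               The recorded values can be compared with the current state of
               the files, for example to decide whether cached data is still
               valid. The following is only a sketch; the "is_stale"
               subroutine and the temporary header file are assumptions made
               up for this example and are not part of the module.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Convert::Binary::C;

# Hypothetical helper: true if any file differs from what was recorded.
sub is_stale {
  my($depend) = @_;
  for my $file (keys %$depend) {
    my @st = stat $file or return 1;         # file has vanished
    my $d  = $depend->{$file};
    return 1 if $st[7]  != $d->{size}        # size
             || $st[9]  != $d->{mtime}       # modification time
             || $st[10] != $d->{ctime};      # change time
  }
  return 0;
}

# Create a small header file in a temporary directory and parse it.
my $dir    = tempdir( CLEANUP => 1 );
my $header = "$dir/point.h";
open my $fh, '>', $header or die "$header: $!";
print $fh "struct point { int x; int y; };\n";
close $fh;

my $c      = Convert::Binary::C->new->parse_file($header);
my $depend = $c->dependencies;

my $before = is_stale($depend);

# Modify the header; at least its size is different afterwards.
open $fh, '>>', $header or die "$header: $!";
print $fh "struct line { struct point a, b; };\n";
close $fh;

my $after = is_stale($depend);

print "stale before change: $before\n";
print "stale after change:  $after\n";
```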
3560
3561   sourcify
3562       "sourcify"
3563       "sourcify" CONFIG
3564               Returns a string that holds the C source code necessary to
3565               represent all parsed C data structures.
3566
3567                 use Convert::Binary::C;
3568
3569                 $c = Convert::Binary::C->new;
3570                 $c->parse(<<'END');
3571
3572                 #define ADD(a, b) ((a) + (b))
3573                 #define NUMBER 42
3574
3575                 typedef struct _mytype mytype;
3576
3577                 struct _mytype {
3578                   union {
3579                     int         iCount;
3580                     enum count *pCount;
3581                   } counter;
3582                 #pragma pack( push, 1 )
3583                   struct {
3584                     char string[NUMBER];
3585                     int  array[NUMBER/sizeof(int)];
3586                   } storage;
3587                 #pragma pack( pop )
3588                   mytype *next;
3589                 };
3590
3591                 enum count { ZERO, ONE, TWO, THREE };
3592
3593                 END
3594
3595                 print $c->sourcify;
3596
3597               The above code would print something like this:
3598
3599                 /* typedef predeclarations */
3600
3601                 typedef struct _mytype mytype;
3602
3603                 /* defined enums */
3604
3605                 enum count
3606                 {
3607                       ZERO,
3608                       ONE,
3609                       TWO,
3610                       THREE
3611                 };
3612
3613
3614                 /* defined structs and unions */
3615
3616                 struct _mytype
3617                 {
3618                       union
3619                       {
3620                               int iCount;
3621                               enum count *pCount;
3622                       } counter;
3623                 #pragma pack(push, 1)
3624                       struct
3625                       {
3626                               char string[42];
3627                               int array[10];
3628                       } storage;
3629                 #pragma pack(pop)
3630                       mytype *next;
3631                 };
3632
3633               The purpose of the "sourcify" method is to enable some kind of
3634               platform-independent caching. The C code generated by
3635               "sourcify" can be parsed by any standard C compiler, as well as
3636               of course by the Convert::Binary::C parser. However, the code
3637               may be significantly shorter than the code that has originally
3638               been parsed.
3639
3640               When parsing a typical header file, it's easily possible that
3641               you need to open dozens of other files that are included from
3642               that file, and end up parsing several hundred kilobytes of C
3643               code. Since most of it is usually preprocessor directives,
3644               function prototypes and comments, the "sourcify" method
3645               strips this down to a few kilobytes. Saving the "sourcify"
3646               string and parsing it next time instead of the original code
3647               may be a lot faster.
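
               A minimal sketch of this kind of caching, assuming a small
               piece of made-up C code and an explicit configuration shared
               by both objects:

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Both objects must use the same configuration to get the same layout.
my %config = ( IntSize => 4, PointerSize => 4, Alignment => 4 );

my $orig = Convert::Binary::C->new(%config)->parse(<<'ENDC');
#define NUMBER 42

typedef struct _mytype mytype;

struct _mytype {
  int     count;
  char    string[NUMBER];
  mytype *next;
};
ENDC

# This string is what would be written to a cache file.
my $cache = $orig->sourcify;

# Later, a new object is created from the cached string only.
my $copy = Convert::Binary::C->new(%config)->parse($cache);

printf "sizeof(mytype): original %d, from cache %d\n",
       $orig->sizeof('mytype'), $copy->sizeof('mytype');
```

               In a real application the string would be stored in a file,
               and the "dependencies" method could be used to decide when
               that file needs to be regenerated.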
3648
3649               The "sourcify" method takes a hash reference as an optional
3650               argument. It can be used to tweak the method's output.  The
3651               following options can be configured.
3652
3653               "Context" => 0 | 1
3654                   Turns preprocessor context information on or off. If this
3655                   is turned on, "sourcify" will insert "#line" preprocessor
3656                   directives in its output. So in the above example
3657
3658                     print $c->sourcify({ Context => 1 });
3659
3660                   would print:
3661
3662                     /* typedef predeclarations */
3663
3664                     typedef struct _mytype mytype;
3665
3666                     /* defined enums */
3667
3668
3669                     #line 21 "[buffer]"
3670                     enum count
3671                     {
3672                           ZERO,
3673                           ONE,
3674                           TWO,
3675                           THREE
3676                     };
3677
3678
3679                     /* defined structs and unions */
3680
3681
3682                     #line 7 "[buffer]"
3683                     struct _mytype
3684                     {
3685                     #line 8 "[buffer]"
3686                           union
3687                           {
3688                                   int iCount;
3689                                   enum count *pCount;
3690                           } counter;
3691                     #pragma pack(push, 1)
3692                     #line 13 "[buffer]"
3693                           struct
3694                           {
3695                                   char string[42];
3696                                   int array[10];
3697                           } storage;
3698                     #pragma pack(pop)
3699                           mytype *next;
3700                     };
3701
3702                   Note that "[buffer]" refers to the here-doc buffer when
3703                   using "parse".
3704
3705               "Defines" => 0 | 1
3706                   Turn this on if you want all the defined macros to be part
3707                   of the source code output. Given the example code above
3708
3709                     print $c->sourcify({ Defines => 1 });
3710
3711                   would print:
3712
3713                     /* typedef predeclarations */
3714
3715                     typedef struct _mytype mytype;
3716
3717                     /* defined enums */
3718
3719                     enum count
3720                     {
3721                           ZERO,
3722                           ONE,
3723                           TWO,
3724                           THREE
3725                     };
3726
3727
3728                     /* defined structs and unions */
3729
3730                     struct _mytype
3731                     {
3732                           union
3733                           {
3734                                   int iCount;
3735                                   enum count *pCount;
3736                           } counter;
3737                     #pragma pack(push, 1)
3738                           struct
3739                           {
3740                                   char string[42];
3741                                   int array[10];
3742                           } storage;
3743                     #pragma pack(pop)
3744                           mytype *next;
3745                     };
3746
3747                     /* preprocessor defines */
3748
3749                     #define ADD(a, b) ((a) + (b))
3750                     #define NUMBER 42
3751
3752                   The macro definitions always appear at the end of the
3753                   source code.  The order of the macro definitions is
3754                   undefined.
3755
3756       The following methods can be used to retrieve information about the
3757       definitions that have been parsed. The examples given in the
3758       description for "enum", "compound" and "typedef" all assume this piece
3759       of C code has been parsed:
3760
3761         #define ABC_SIZE 2
3762         #define MULTIPLY(x, y) ((x)*(y))
3763
3764         #ifdef ABC_SIZE
3765         # define DEFINED
3766         #else
3767         # define NOT_DEFINED
3768         #endif
3769
3770         typedef unsigned long U32;
3771         typedef void *any;
3772
3773         enum __socket_type
3774         {
3775           SOCK_STREAM    = 1,
3776           SOCK_DGRAM     = 2,
3777           SOCK_RAW       = 3,
3778           SOCK_RDM       = 4,
3779           SOCK_SEQPACKET = 5,
3780           SOCK_PACKET    = 10
3781         };
3782
3783         struct STRUCT_SV {
3784           void *sv_any;
3785           U32   sv_refcnt;
3786           U32   sv_flags;
3787         };
3788
3789         typedef union {
3790           int abc[ABC_SIZE];
3791           struct xxx {
3792             int a;
3793             int b;
3794           }   ab[3][4];
3795           any ptr;
3796         } test;
3797
3798   enum_names
3799       "enum_names"
3800               Returns a list of identifiers of all defined enumeration
3801               objects. Enumeration objects don't necessarily have an
3802               identifier, so something like
3803
3804                 enum { A, B, C };
3805
3806               will obviously not appear in the list returned by the
3807               "enum_names" method. Also, enumerations that are not defined
3808               within the source code - like in
3809
3810                 struct foo {
3811                   enum weekday *pWeekday;
3812                   unsigned long year;
3813                 };
3814
3815               where only a pointer to the "weekday" enumeration object is
3816               used - will not be returned, even though they have an
3817               identifier. So for the above two enumerations, "enum_names"
3818               will return an empty list:
3819
3820                 @names = $c->enum_names;
3821
3822               The only way to retrieve a list of all enumeration identifiers
3823               is to use the "enum" method without additional arguments. You
3824               can get a list of all enumeration objects that have an
3825               identifier by using
3826
3827                 @enums = map { $_->{identifier} || () } $c->enum;
3828
3829               but these may not have a definition. Thus, the two arrays would
3830               look like this:
3831
3832                 @names = ();
3833                 @enums = ('weekday');
3834
3835               The "def" method returns a true value for all identifiers
3836               returned by "enum_names".
3837
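               The difference between the two approaches can be sketched as
               follows. This is only an illustration: the "color" enumeration
               is a hypothetical addition that does not appear in the examples
               above, and the order of the returned lists is not assumed.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# 'color' is a hypothetical enumeration added for illustration.
my $c = Convert::Binary::C->new->parse(<<'ENDC');
enum { A, B, C };

struct foo {
  enum weekday *pWeekday;
  unsigned long year;
};

enum color { RED, GREEN, BLUE };
ENDC

# Only enumerations that are defined and have an identifier.
my @names = $c->enum_names;

# Every enumeration object that has an identifier, defined or not.
my @enums = map { $_->{identifier} || () } $c->enum;

print "enum_names: @names\n";
print "enum:       @enums\n";
```
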
3838   enum
3839       "enum"
3840       "enum" LIST
3841               Returns a list of references to hashes containing detailed
3842               information about all enumerations that have been parsed.
3843
3844               If a list of enumeration identifiers is passed to the method,
3845               the returned list will only contain hash references for those
3846               enumerations. The enumeration identifiers may optionally be
3847               prefixed by "enum".
3848
3849               If an enumeration identifier cannot be found, the returned list
3850               will contain an undefined value at that position.
3851
3852               In scalar context, the number of enumerations will be returned
3853               as long as the number of arguments to the method call is not 1.
3854               In the latter case, a hash reference holding information for
3855               the enumeration will be returned.
3856
3857               The list returned by the "enum" method looks similar to this:
3858
3859                 @enum = (
3860                   {
3861                     'enumerators' => {
3862                       'SOCK_STREAM' => 1,
3863                       'SOCK_DGRAM' => 2,
3864                       'SOCK_PACKET' => 10,
3865                       'SOCK_SEQPACKET' => 5,
3866                       'SOCK_RDM' => 4,
3867                       'SOCK_RAW' => 3
3868                     },
3869                     'identifier' => '__socket_type',
3870                     'size' => 4,
3871                     'sign' => 0,
3872                     'context' => 'definitions.c(13)'
3873                   }
3874                 );
3875
3876               "identifier"
3877                   holds the enumeration identifier. This key is not present
3878                   if the enumeration has no identifier.
3879
3880               "context"
3881                   is the context in which the enumeration is defined. This is
3882                   the filename followed by the line number in parentheses.
3883
3884               "enumerators"
3885                   is a reference to a hash table that holds all enumerators
3886                   of the enumeration.
3887
3888               "sign"
3889                   is a boolean indicating if the enumeration is signed (i.e.
3890                   has negative values).
3891
3892               One useful application may be to create a hash table that holds
3893               all enumerators of all defined enumerations:
3894
3895                 %enum = map %{ $_->{enumerators} || {} }, $c->enum;
3896
3897               The %enum hash table would then be:
3898
3899                 %enum = (
3900                   'SOCK_RDM' => 4,
3901                   'SOCK_SEQPACKET' => 5,
3902                   'SOCK_PACKET' => 10,
3903                   'SOCK_STREAM' => 1,
3904                   'SOCK_DGRAM' => 2,
3905                   'SOCK_RAW' => 3
3906                 );
3907
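               The context rules described above can be sketched like this.
               The variable names are illustrative only; the snippet parses
               just the "__socket_type" enumeration from the example code and
               inverts its enumerators to look up a name by value.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
enum __socket_type
{
  SOCK_STREAM    = 1,
  SOCK_DGRAM     = 2,
  SOCK_RAW       = 3,
  SOCK_RDM       = 4,
  SOCK_SEQPACKET = 5,
  SOCK_PACKET    = 10
};
ENDC

# Scalar context without arguments: the number of enumerations.
my $count = $c->enum;

# Scalar context with exactly one argument: a hash reference.
my $info = $c->enum('__socket_type');

# Invert the enumerators hash to look up names by value.
my %name_of = reverse %{ $info->{enumerators} };

print "$count enumeration(s) parsed\n";
print "value 10 is $name_of{10}\n";
```

               Inverting the hash only works reliably when no two enumerators
               share the same value.
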
3908   compound_names
3909       "compound_names"
3910               Returns a list of identifiers of all structs and unions
3911               (compound data structures) that are defined in the parsed
3912               source code. Like enumerations, compounds don't need to have an
3913               identifier, nor do they need to be defined.
3914
3915               Again, the only way to retrieve information about all struct
3916               and union objects is to call the "compound" method without
3917               arguments. If you need a list of all struct and union
3918               identifiers, you can use:
3919
3920                 @compound = map { $_->{identifier} || () } $c->compound;
3921
3922               The "def" method returns a true value for all identifiers
3923               returned by "compound_names".
3924
3925               If you need the names of only the structs or only the unions,
3926               use the "struct_names" and "union_names" methods respectively.
3927
3928   compound
3929       "compound"
3930       "compound" LIST
3931               Returns a list of references to hashes containing detailed
3932               information about all compounds (structs and unions) that have
3933               been parsed.
3934
3935               If a list of struct/union identifiers is passed to the method,
3936               the returned list will only contain hash references for those
3937               compounds. The identifiers may optionally be prefixed by
3938               "struct" or "union", which limits the search to the specified
3939               kind of compound.
3940
3941               If an identifier cannot be found, the returned list will
3942               contain an undefined value at that position.
3943
3944               In scalar context, the number of compounds will be returned as
3945               long as the number of arguments to the method call is not 1. In
3946               the latter case, a hash reference holding information for the
3947               compound will be returned.
3948
3949               The list returned by the "compound" method looks similar to
3950               this:
3951
3952                 @compound = (
3953                   {
3954                     'identifier' => 'STRUCT_SV',
3955                     'align' => 1,
3956                     'declarations' => [
3957                       {
3958                         'type' => 'void',
3959                         'declarators' => [
3960                           {
3961                             'size' => 8,
3962                             'offset' => 0,
3963                             'declarator' => '*sv_any'
3964                           }
3965                         ]
3966                       },
3967                       {
3968                         'type' => 'U32',
3969                         'declarators' => [
3970                           {
3971                             'size' => 8,
3972                             'offset' => 8,
3973                             'declarator' => 'sv_refcnt'
3974                           }
3975                         ]
3976                       },
3977                       {
3978                         'type' => 'U32',
3979                         'declarators' => [
3980                           {
3981                             'size' => 8,
3982                             'offset' => 16,
3983                             'declarator' => 'sv_flags'
3984                           }
3985                         ]
3986                       }
3987                     ],
3988                     'type' => 'struct',
3989                     'size' => 24,
3990                     'context' => 'definitions.c(23)',
3991                     'pack' => 0
3992                   },
3993                   {
3994                     'identifier' => 'xxx',
3995                     'align' => 1,
3996                     'declarations' => [
3997                       {
3998                         'type' => 'int',
3999                         'declarators' => [
4000                           {
4001                             'size' => 4,
4002                             'offset' => 0,
4003                             'declarator' => 'a'
4004                           }
4005                         ]
4006                       },
4007                       {
4008                         'type' => 'int',
4009                         'declarators' => [
4010                           {
4011                             'size' => 4,
4012                             'offset' => 4,
4013                             'declarator' => 'b'
4014                           }
4015                         ]
4016                       }
4017                     ],
4018                     'type' => 'struct',
4019                     'size' => 8,
4020                     'context' => 'definitions.c(31)',
4021                     'pack' => 0
4022                   },
4023                   {
4024                     'align' => 1,
4025                     'declarations' => [
4026                       {
4027                         'type' => 'int',
4028                         'declarators' => [
4029                           {
4030                             'size' => 8,
4031                             'offset' => 0,
4032                             'declarator' => 'abc[2]'
4033                           }
4034                         ]
4035                       },
4036                       {
4037                         'type' => 'struct xxx',
4038                         'declarators' => [
4039                           {
4040                             'size' => 96,
4041                             'offset' => 0,
4042                             'declarator' => 'ab[3][4]'
4043                           }
4044                         ]
4045                       },
4046                       {
4047                         'type' => 'any',
4048                         'declarators' => [
4049                           {
4050                             'size' => 8,
4051                             'offset' => 0,
4052                             'declarator' => 'ptr'
4053                           }
4054                         ]
4055                       }
4056                     ],
4057                     'type' => 'union',
4058                     'size' => 96,
4059                     'context' => 'definitions.c(29)',
4060                     'pack' => 0
4061                   }
4062                 );
4063
4064               "identifier"
4065                   holds the struct or union identifier. This key is not
4066                   present if the compound has no identifier.
4067
4068               "context"
4069                   is the context in which the struct or union is defined.
4070                   This is the filename followed by the line number in
4071                   parentheses.
4072
4073               "type"
4074                   is either 'struct' or 'union'.
4075
4076               "size"
4077                   is the size of the struct or union.
4078
4079               "align"
4080                   is the alignment of the struct or union.
4081
4082               "pack"
4083                   is the struct member alignment if the compound is packed,
4084                   or zero otherwise.
4085
4086               "declarations"
4087                   is an array of hash references describing each struct
4088                   declaration:
4089
4090                   "type"
4091                       is the type of the struct declaration. This may be a
4092                       string or a reference to a hash describing the type.
4093
4094                   "declarators"
4095                       is an array of hashes describing each declarator:
4096
4097                       "declarator"
4098                           is a string representation of the declarator.
4099
4100                       "offset"
4101                           is the offset of the struct member represented by
4102                           the current declarator relative to the beginning of
4103                           the struct or union.
4104
4105                       "size"
4106                           is the size occupied by the struct member
4107                           represented by the current declarator.
4108
4109               It may be useful to have separate lists for structs and unions.
4110               One way to retrieve such lists would be to use
4111
4112                 push @{$_->{type} eq 'union' ? \@unions : \@structs}, $_
4113                     for $c->compound;
4114
4115               However, it is a lot simpler to use the "struct" and "union"
4116               methods:
4117
4118                 @structs = $c->struct;
4119                 @unions  = $c->union;
4120
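               As a rough sketch of how the nested "declarations" and
               "declarators" arrays can be traversed, the following prints one
               line per member of "STRUCT_SV". The @members array and the
               output format are illustrative choices, and the sizes and
               offsets printed depend on the object's configuration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned long U32;

struct STRUCT_SV {
  void *sv_any;
  U32   sv_refcnt;
  U32   sv_flags;
};
ENDC

# One argument in scalar context yields a single hash reference.
my $sv = $c->struct('STRUCT_SV');

my @members;
for my $decl (@{ $sv->{declarations} }) {
  # 'type' is a plain string here; it may also be a hash reference
  # that describes a nested type.
  my $type = ref $decl->{type} ? '(nested)' : $decl->{type};
  for my $d (@{ $decl->{declarators} }) {
    push @members, $d->{declarator};
    printf "%-8s %-10s offset %2d, size %2d\n",
           $type, $d->{declarator}, $d->{offset}, $d->{size};
  }
}
```
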
4121   struct_names
4122       "struct_names"
4123               Returns a list of all defined struct identifiers.  This is
4124               equivalent to calling "compound_names", except that it
4125               returns only the struct identifiers and omits the union
4126               identifiers.
4127
4128   struct
4129       "struct"
4130       "struct" LIST
4131               Like the "compound" method, but restricted to structs.
4132
4133   union_names
4134       "union_names"
4135               Returns a list of all defined union identifiers.  This is
4136               equivalent to calling "compound_names", except that it
4137               returns only the union identifiers and omits the struct
4138               identifiers.
4139
4140   union
4141       "union"
4142       "union" LIST
4143               Like the "compound" method, but restricted to unions.
4144
4145   typedef_names
4146       "typedef_names"
4147               Returns a list of all defined typedef identifiers. Typedefs
4148               that do not specify a type that you could actually work with
4149               will not be returned.
4150
4151               The "def" method returns a true value for all identifiers
4152               returned by "typedef_names".
4153
4154   typedef
4155       "typedef"
4156       "typedef" LIST
4157               Returns a list of references to hashes containing detailed
4158               information about all typedefs that have been parsed.
4159
4160               If a list of typedef identifiers is passed to the method, the
4161               returned list will only contain hash references for those
4162               typedefs.
4163
4164               If an identifier cannot be found, the returned list will
4165               contain an undefined value at that position.
4166
4167               In scalar context, the number of typedefs will be returned as
4168               long as the number of arguments to the method call is not 1. In
4169               the latter case, a hash reference holding information for the
4170               typedef will be returned.
4171
4172               The list returned by the "typedef" method looks similar to
4173               this:
4174
4175                 @typedef = (
4176                   {
4177                     'type' => 'unsigned long',
4178                     'declarator' => 'U32'
4179                   },
4180                   {
4181                     'type' => 'void',
4182                     'declarator' => '*any'
4183                   },
4184                   {
4185                     'type' => {
4186                       'align' => 1,
4187                       'declarations' => [
4188                         {
4189                           'type' => 'int',
4190                           'declarators' => [
4191                             {
4192                               'size' => 8,
4193                               'offset' => 0,
4194                               'declarator' => 'abc[2]'
4195                             }
4196                           ]
4197                         },
4198                         {
4199                           'type' => 'struct xxx',
4200                           'declarators' => [
4201                             {
4202                               'size' => 96,
4203                               'offset' => 0,
4204                               'declarator' => 'ab[3][4]'
4205                             }
4206                           ]
4207                         },
4208                         {
4209                           'type' => 'any',
4210                           'declarators' => [
4211                             {
4212                               'size' => 8,
4213                               'offset' => 0,
4214                               'declarator' => 'ptr'
4215                             }
4216                           ]
4217                         }
4218                       ],
4219                       'type' => 'union',
4220                       'size' => 96,
4221                       'context' => 'definitions.c(29)',
4222                       'pack' => 0
4223                     },
4224                     'declarator' => 'test'
4225                   }
4226                 );
4227
4228               "declarator"
4229                   is the type declarator.
4230
4231               "type"
4232                   is the type specification. This may be a string or a
4233                   reference to a hash describing the type.  See "enum" and
4234                   "compound" for a description of how to interpret this hash.
4235
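               Because "type" can be either a string or a hash reference, code
               that processes typedefs usually has to check for both. The
               following is a minimal sketch of that check. The union members
               and the ' { ... }' placeholder are illustrative, and the
               fallback to 'enum' assumes that an enumeration hash carries no
               "type" key, as in the "enum" output shown earlier.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned long U32;
typedef void *any;
typedef union { int number; any ptr; } test;
ENDC

my @lines;
for my $t ($c->typedef) {
  # 'type' is a string for simple types and a hash reference
  # when the typedef wraps a struct, union or enum definition.
  my $type = ref $t->{type}
           ? ($t->{type}{type} // 'enum') . ' { ... }'
           : $t->{type};
  push @lines, "typedef $type $t->{declarator};";
}

print "$_\n" for @lines;
```
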
4236   macro_names
4237       "macro_names"
4238               Returns a list of all defined macro names.
4239
4240               The list returned by the "macro_names" method looks similar to
4241               this:
4242
4243                 @macro_names = (
4244                   '__STDC_VERSION__',
4245                   '__STDC_HOSTED__',
4246                   'DEFINED',
4247                   'MULTIPLY',
4248                   'ABC_SIZE'
4249                 );
4250
4251               This works only as long as the preprocessor is not reset.  See
4252               "Preprocessor configuration" for details.
4253
4254   macro
4255       "macro"
4256       "macro" LIST
4257               Returns the definitions for all defined macros.
4258
4259               If a list of macro names is passed to the method, the returned
4260               list will only contain the definitions for those macros. For
4261               undefined macros, "undef" will be returned.
4262
4263               The list returned by the "macro" method looks similar to this:
4264
4265                 @macro = (
4266                   '__STDC_VERSION__ 199901L',
4267                   '__STDC_HOSTED__ 1',
4268                   'DEFINED',
4269                   'MULTIPLY(x, y) ((x)*(y))',
4270                   'ABC_SIZE 2'
4271                 );
4272
4273               This works only as long as the preprocessor is not reset.  See
4274               "Preprocessor configuration" for details.
4275
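               Since each definition is returned as a single string, a little
               post-processing is needed to separate a macro's name from its
               replacement text. The sketch below shows one possible way. The
               regular expression and the %body hash are illustrative and
               assume the "NAME(params) replacement" layout shown above;
               "NOT_THERE" is a deliberately undefined name.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
#define ABC_SIZE 2
#define MULTIPLY(x, y) ((x)*(y))
ENDC

# List context; a name that is not defined yields undef.
my ($size, $mul, $none) = $c->macro(qw( ABC_SIZE MULTIPLY NOT_THERE ));

# Map each macro name to its replacement text, skipping the
# optional parameter list.
my %body;
for my $def (grep { defined } $size, $mul, $none) {
  my ($name, $text) = $def =~ /^(\w+)(?:\([^)]*\))?\s*(.*)$/;
  $body{$name} = $text;
}

print "$_ => $body{$_}\n" for sort keys %body;
```
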

FUNCTIONS

4277       You can alternatively call the following functions as methods on
4278       Convert::Binary::C objects.
4279
4280   feature
4281       "feature" STRING
4282               Checks if Convert::Binary::C was built with certain features.
4283               For example,
4284
4285                 print "debugging version"
4286                     if Convert::Binary::C::feature('debug');
4287
4288               will check if Convert::Binary::C was built with debugging
4289               support enabled. The "feature" function returns 1 if the
4290               feature is enabled, 0 if the feature is disabled, and "undef"
4291               if the feature is unknown. Currently the only features that can
4292               be checked are "ieeefp" and "debug".
4293
4294               You can enable or disable certain features at compile time of
4295               the module by using the
4296
4297                 perl Makefile.PL enable-feature disable-feature
4298
4299               syntax.
4300
4301   native
4302       "native"
4303       "native" STRING
4304               Returns the value of a property of the native system that
4305               Convert::Binary::C was built on. For example,
4306
4307                 $size = Convert::Binary::C::native('IntSize');
4308
4309               will fetch the size of an "int" on the native system.  The
4310               following properties can be queried:
4311
4312                 Alignment
4313                 ByteOrder
4314                 CharSize
4315                 CompoundAlignment
4316                 DoubleSize
4317                 EnumSize
4318                 FloatSize
4319                 HostedC
4320                 IntSize
4321                 LongDoubleSize
4322                 LongLongSize
4323                 LongSize
4324                 PointerSize
4325                 ShortSize
4326                 StdCVersion
4327                 UnsignedBitfields
4328                 UnsignedChars
4329
4330               You can also call "native" without arguments, in which case it
4331               will return a reference to a hash with all properties, like:
4332
4333                 $native = {
4334                   'EnumSize' => 4,
4335                   'ShortSize' => 2,
4336                   'UnsignedChars' => 0,
4337                   'IntSize' => 4,
4338                   'LongDoubleSize' => 16,
4339                   'StdCVersion' => 201710,
4340                   'HostedC' => 1,
4341                   'CompoundAlignment' => 1,
4342                   'UnsignedBitfields' => 0,
4343                   'DoubleSize' => 8,
4344                   'Alignment' => 16,
4345                   'PointerSize' => 8,
4346                   'ByteOrder' => 'LittleEndian',
4347                   'LongLongSize' => 8,
4348                   'CharSize' => 1,
4349                   'LongSize' => 8,
4350                   'FloatSize' => 4
4351                 };
4352
4353               The contents of that hash are suitable for passing to the
4354               "configure" method.
4355
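               For instance, an object can be set up to mirror the native
               system as sketched below. The "pair" struct is a hypothetical
               type used only to show that the configured sizes take effect.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Query a single property ...
my $int_size = Convert::Binary::C::native('IntSize');

# ... or fetch all of them and configure an object to match the host.
my $native = Convert::Binary::C::native();
my $c = Convert::Binary::C->new;
$c->configure(%$native);

# 'pair' is a hypothetical struct used for illustration.
$c->parse('struct pair { int a; int b; };');

printf "int is %d bytes, struct pair is %d bytes\n",
       $int_size, $c->sizeof('pair');
```
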

DEBUGGING

4357       Like perl itself, Convert::Binary::C can be compiled with debugging
4358       support that can then be selectively enabled at runtime. You can
4359       specify whether to build Convert::Binary::C with debugging support
4360       by explicitly giving an argument to Makefile.PL.  Use
4361
4362         perl Makefile.PL enable-debug
4363
4364       to enable debugging, or
4365
4366         perl Makefile.PL disable-debug
4367
4368       to disable debugging. The default will depend on how your perl binary
4369       was built. If it was built with "-DDEBUGGING", Convert::Binary::C will
4370       be built with debugging support, too.
4371
4372       Once you have built Convert::Binary::C with debugging support, you can
4373       use the following syntax to enable debug output. Instead of
4374
4375         use Convert::Binary::C;
4376
4377       you simply say
4378
4379         use Convert::Binary::C debug => 'all';
4380
4381       which will enable all debug output. However, I don't recommend
4382       enabling all debug output, because it can be a fairly large amount.
4383
4384   Debugging options
4385       Instead of saying "all", you can pass a string that consists of one or
4386       more of the following characters:
4387
4388         m   enable memory allocation tracing
4389         M   enable memory allocation & assertion tracing
4390
4391         h   enable hash table debugging
4392         H   enable hash table dumps
4393
4394         d   enable debug output from the XS module
4395         c   enable debug output from the ctlib
4396         t   enable debug output about type objects
4397
4398         l   enable debug output from the C lexer
4399         p   enable debug output from the C parser
4400         P   enable debug output from the C preprocessor
4401         r   enable debug output from the #pragma parser
4402
4403         y   enable debug output from yacc (bison)
4404
4405       So the following might give you a brief overview of what's going on
4406       inside Convert::Binary::C:
4407
4408         use Convert::Binary::C debug => 'dct';
4409
4410       When you want to debug memory allocation using
4411
4412         use Convert::Binary::C debug => 'm';
4413
4414       you can use the Perl script check_alloc.pl that resides in the
4415       ctlib/util/tool directory to extract statistics about memory usage and
4416       information about memory leaks from the resulting debug output.
4417
4418   Redirecting debug output
4419       By default, all debug output is written to "stderr". You can, however,
4420       redirect the debug output to a file with the "debugfile" option:
4421
4422         use Convert::Binary::C debug     => 'dcthHm',
4423                                debugfile => './debug.out';
4424
4425       If the file cannot be opened, you'll receive a warning and the output
4426       will go to "stderr" again.
4427
4428       Alternatively, you can use the environment variables "CBC_DEBUG_OPT"
4429       and "CBC_DEBUG_FILE" to turn on debug output.
4430
4431       If Convert::Binary::C is built without debugging support, passing the
4432       "debug" or "debugfile" options will cause a warning to be issued. The
4433       corresponding environment variables will simply be ignored.
4434

ENVIRONMENT

4436   "CBC_ORDER_MEMBERS"
4437       Setting this variable to a non-zero value will globally turn on hash
4438       key ordering for compound members. Have a look at the "OrderMembers"
4439       option for details.
4440
4441       Setting the variable to the name of a perl module will additionally
4442       cause that module to be used for tying the hashes, instead of the
4443       predefined modules for member ordering.
4444
4445   "CBC_DEBUG_OPT"
4446       If Convert::Binary::C is built with debugging support, you can use this
4447       variable to specify the debugging options.
4448
4449   "CBC_DEBUG_FILE"
4450       If Convert::Binary::C is built with debugging support, you can use this
4451       variable to redirect the debug output to a file.
4452
4453   "CBC_DISABLE_PARSER"
4454       This variable is intended purely for development. Setting it to a non-
4455       zero value disables the Convert::Binary::C parser, which means that no
4456       information is collected from the file or code that is parsed. However,
4457       the preprocessor will run, which is useful for benchmarking the
4458       preprocessor.
4459

FLEXIBLE ARRAY MEMBERS AND INCOMPLETE TYPES

4461       Flexible array members are a feature introduced with ISO-C99.  It's a
4462       common problem that you have a variable length data field at the end of
4463       a structure, for example an array of characters at the end of a message
4464       struct. ISO-C99 allows you to write this as:
4465
4466         struct message {
4467           long header;
4468           char data[];
4469         };
4470
4471       The advantage is that you clearly indicate that the size of the
4472       appended data is variable, and that the "data" member doesn't
4473       contribute to the size of the "message" structure.
4474
4475       When packing or unpacking data, Convert::Binary::C deals with flexible
4476       array members as if their length were adjustable. For example, "unpack"
4477       will adapt the length of the array depending on the input string:
4478
4479         $msg1 = $c->unpack('message', 'abcdefg');
4480         $msg2 = $c->unpack('message', 'abcdefghijkl');
4481
4482       The following data is unpacked:
4483
4484         $msg1 = {
4485           'header' => 1633837924,
4486           'data' => [
4487             101,
4488             102,
4489             103
4490           ]
4491         };
4492         $msg2 = {
4493           'header' => 1633837924,
4494           'data' => [
4495             101,
4496             102,
4497             103,
4498             104,
4499             105,
4500             106,
4501             107,
4502             108
4503           ]
4504         };
4505
4506       Similarly, "pack" will adjust the length of the output string according
4507       to the data you feed in:
4508
4509         use Data::Hexdumper;
4510
4511         $msg = {
4512           header => 4711,
4513           data   => [0x10, 0x20, 0x30, 0x40, 0x77..0x88],
4514         };
4515
4516         $data = $c->pack('message', $msg);
4517
4518         print hexdump(data => $data);
4519
4520       This would print:
4521
4522           0x0000 : 00 00 12 67 10 20 30 40 77 78 79 7A 7B 7C 7D 7E : ...g..0@wxyz{|}~
4523           0x0010 : 7F 80 81 82 83 84 85 86 87 88                   : ..........
4524
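       Data::Hexdumper is not part of the Perl core. As a sketch under the
       same assumptions as before (big-endian, 4-byte "long"), the packed
       string can also be inspected with Perl's builtin "unpack" function
       and the "H*" template; this is the core builtin, not the
       Convert::Binary::C method of the same name:

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Assumed configuration: big-endian target with a 4-byte long.
my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', LongSize => 4);
$c->parse('struct message { long header; char data[]; };');

my $msg  = { header => 4711, data => [0x10, 0x20, 0x30, 0x40] };
my $data = $c->pack('message', $msg);

# Four header bytes plus one byte per element of "data".
print length($data), " bytes: ", unpack('H*', $data), "\n";
# prints "8 bytes: 0000126710203040"
```
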
4525       Incomplete types such as
4526
4527         typedef unsigned long array[];
4528
4529       are handled in exactly the same way. Thus, you can simply call
4530
4531         $array = $c->unpack('array', '?'x20);
4532
4533       which will unpack the following array:
4534
4535         $array = [
4536           1061109567,
4537           1061109567,
4538           1061109567,
4539           1061109567,
4540           1061109567
4541         ];
4542
4543       You can also alter the length of an array using the "Dimension" tag.
4544
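       As a rough sketch of the "Dimension" tag, the following assumes its
       member-expression form, in which the array length is taken from
       another member of the same structure. The "packet" type and its
       members are hypothetical names chosen for illustration, and the
       "ByteOrder" and "ShortSize" settings are assumptions:

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Assumed configuration: big-endian target with a 2-byte short.
my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', ShortSize => 2);

$c->parse(<<'ENDC');
struct packet {
  unsigned short count;
  char payload[];
};
ENDC

# Take the length of "payload" from the "count" member instead of
# from the length of the input string.
$c->tag('packet.payload', Dimension => 'count');

# "count" is 2 here, although five payload bytes follow it.
my $p = $c->unpack('packet', "\x00\x02abcde");
print scalar @{ $p->{payload} }, " payload element(s)\n";
```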

FLOATING POINT VALUES

4546       When using Convert::Binary::C to handle floating point values, you have
4547       to be aware of some limitations.
4548
4549       You're usually safe if all your platforms are using the IEEE floating
4550       point format. During the Convert::Binary::C build process, the "ieeefp"
4551       feature will automatically be enabled if the host is using IEEE
4552       floating point. You can check for this feature at runtime using the
4553       "feature" function:
4554
4555         if (Convert::Binary::C::feature('ieeefp')) {
4556           # do something
4557         }
4558
4559       When IEEE floating point support is enabled, the module can also handle
4560       floating point values of a different byte order.
4561
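       A minimal sketch of what this means in practice, assuming an 8-byte
       IEEE "double": the same value is packed once for a big-endian and
       once for a little-endian target. The "sample" structure and its
       "value" member are hypothetical names used only for illustration:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my ($big, $little);

if (Convert::Binary::C::feature('ieeefp')) {
  my $be = Convert::Binary::C->new(ByteOrder => 'BigEndian', DoubleSize => 8);
  $be->parse('struct sample { double value; };');

  # Same parsed types, opposite byte order.
  my $le = $be->clone->ByteOrder('LittleEndian');

  $big    = $be->pack('sample', { value => 1.0 });
  $little = $le->pack('sample', { value => 1.0 });

  print unpack('H*', $big), "\n";     # 3ff0000000000000
  print unpack('H*', $little), "\n";  # 000000000000f03f
}
else {
  print "ieeefp feature not available; skipping conversion\n";
}
```
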
4562       If your host platform is not using IEEE floating point, the "ieeefp"
4563       feature will be disabled. Convert::Binary::C will then be more
4564       restrictive, refusing to handle any non-native floating point values.
4565
4566       However, Convert::Binary::C cannot detect the floating point format
4567       used by your target platform. It can only try to prevent problems in
4568       obvious cases. If you know your target platform has a completely
4569       different floating point format, don't use floating point conversion at
4570       all.
4571
4572       Whenever Convert::Binary::C detects that it cannot properly do floating
4573       point value conversion, it will issue a warning and will not attempt to
4574       convert the floating point value.
4575

BITFIELDS

4577       Bitfield support in Convert::Binary::C is currently in an experimental
4578       state. You are encouraged to test it, but you should not blindly rely
4579       on its results.
4580
4581       You are also encouraged to supply layout algorithms for compilers
4582       whose bitfield implementation is not handled correctly at the moment.
4583       Even better than the plain algorithm is, of course, a patch that adds a
4584       new bitfield layout engine.
4585
4586       While bitfields may not be handled correctly by the conversion routines
4587       yet, they are always parsed correctly. This means that you can reliably
4588       use the declarator fields as returned by the "struct" or "typedef"
4589       methods.  Given the following source
4590
4591         struct bitfield {
4592           int seven:7;
4593           int :1;
4594           int four:4, :0;
4595           int integer;
4596         };
4597
4598       a call to "struct" will return
4599
4600         @struct = (
4601           {
4602             'identifier' => 'bitfield',
4603             'align' => 1,
4604             'declarations' => [
4605               {
4606                 'type' => 'int',
4607                 'declarators' => [
4608                   {
4609                     'declarator' => 'seven:7'
4610                   }
4611                 ]
4612               },
4613               {
4614                 'type' => 'int',
4615                 'declarators' => [
4616                   {
4617                     'declarator' => ':1'
4618                   }
4619                 ]
4620               },
4621               {
4622                 'type' => 'int',
4623                 'declarators' => [
4624                   {
4625                     'declarator' => 'four:4'
4626                   },
4627                   {
4628                     'declarator' => ':0'
4629                   }
4630                 ]
4631               },
4632               {
4633                 'type' => 'int',
4634                 'declarators' => [
4635                   {
4636                     'size' => 4,
4637                     'offset' => 4,
4638                     'declarator' => 'integer'
4639                   }
4640                 ]
4641               }
4642             ],
4643             'type' => 'struct',
4644             'size' => 8,
4645             'context' => 'bitfields.c(1)',
4646             'pack' => 0
4647           }
4648         );
4649
4650       No size/offset keys will currently be returned for bitfield entries.
4651
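       Because the declarators are parsed reliably, the bitfield widths can
       be recovered from the "declarator" strings. The following is only a
       sketch: splitting on the colon is an assumption based on the output
       shown above, and the @fields list is a helper introduced for
       illustration:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct bitfield {
  int seven:7;
  int :1;
  int four:4, :0;
  int integer;
};
ENDC

my ($s) = $c->struct('bitfield');
my @fields;

for my $decl (@{ $s->{declarations} }) {
  for my $d (@{ $decl->{declarators} }) {
    # Bitfield declarators look like "name:width"; the name is empty
    # for unnamed bitfields, and the width is undefined for members
    # that are not bitfields.
    my ($name, $width) = split /:/, $d->{declarator}, 2;
    push @fields, { type => $decl->{type}, name => $name, width => $width };
  }
}

for my $f (@fields) {
  printf "%-10s %s\n",
         ($f->{name} eq '' ? '(unnamed)' : $f->{name}),
         (defined $f->{width} ? "$f->{width} bit(s)" : 'not a bitfield');
}
```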

MULTITHREADING

4653       Convert::Binary::C was designed to be thread-safe.
4654

INHERITANCE

4656       If you wish to derive a new class from Convert::Binary::C, this is
4657       relatively easy. Despite their XS implementation, Convert::Binary::C
4658       objects are actually blessed hash references.
4659
4660       The XS data is stored in a read-only hash value for the key that is the
4661       empty string. So it is safe to use any non-empty hash key when deriving
4662       your own class.  In addition, Convert::Binary::C does quite a lot of
4663       checks to detect corruption in the object hash.
4664
4665       If you store private data in the hash, you should override the "clone"
4666       method and provide the necessary code to clone your private data.
4667       You'll have to call "SUPER::clone", but this will only clone the
4668       Convert::Binary::C part of the object.
4669
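       A minimal sketch of such a derived class follows. The package name
       "My::Converter" and the "notes" key are hypothetical; any non-empty
       hash key would do:

```perl
use strict;
use warnings;

package My::Converter;              # hypothetical class name
use Convert::Binary::C;
our @ISA = ('Convert::Binary::C');

sub new {
  my ($class, %opt) = @_;
  my $self = $class->SUPER::new(%opt);
  $self->{notes} = [];              # private data under a non-empty key
  return $self;
}

sub clone {
  my $self  = shift;
  my $clone = $self->SUPER::clone;  # clones only the Convert::Binary::C part
  $clone->{notes} = [ @{ $self->{notes} } ];  # copy private data explicitly
  return $clone;
}

package main;

my $orig = My::Converter->new(ByteOrder => 'BigEndian');
push @{ $orig->{notes} }, 'first';

my $copy = $orig->clone;
push @{ $copy->{notes} }, 'second';

# The clone received its own copy of the private data.
print scalar @{ $orig->{notes} }, " ", scalar @{ $copy->{notes} }, "\n";
# prints "1 2"
```
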
4670       For an example of a derived class, you can have a look at
4671       Convert::Binary::C::Cached.
4672

PORTABILITY

4674       Convert::Binary::C should build and run on most of the platforms that
4675       Perl runs on:
4676
4677       •   Various Linux systems
4678
4679       •   Various BSD systems
4680
4681       •   HP-UX
4682
4683       •   Compaq/HP Tru64 Unix
4684
4685       •   Mac OS X
4686
4687       •   Cygwin
4688
4689       •   Windows 98/NT/2000/XP
4690
4691       Also, many architectures are supported:
4692
4693       •   Various Intel Pentium and Itanium systems
4694
4695       •   Various Alpha systems
4696
4697       •   HP PA-RISC
4698
4699       •   PowerPC
4700
4701       •   StrongARM
4702
4703       The module should build with any perl binary from 5.004 up to the
4704       latest development version.
4705

COMPARISON WITH SIMILAR MODULES

4707       Most of the time when you're really looking for Convert::Binary::C
4708       you'll actually end up finding one of the following modules. Some of
4709       them have different goals, so it's probably worth pointing out the
4710       differences.
4711
4712   C::Include
4713       Like Convert::Binary::C, this module aims at doing conversion from and
4714       to binary data based on C types.  However, its configurability is very
4715       limited compared to Convert::Binary::C. Also, it does not parse all C
4716       code correctly. It's slower than Convert::Binary::C and doesn't have a
4717       preprocessor. On the plus side, it's written in pure Perl.
4718
4719   C::DynaLib::Struct
4720       This module doesn't allow you to reuse your C source code. One main
4721       goal of Convert::Binary::C was to avoid code duplication or, even
4722       worse, having to maintain different representations of your data
4723       structures.  Like C::Include, C::DynaLib::Struct is rather limited in
4724       its configurability.
4725
4726   Win32::API::Struct
4727       This module has a special purpose. It aims at building structs for
4728       interfacing Perl code with Windows API code.
4729

CREDITS

4731       • Alain Barbet <alian@cpan.org> for testing and debugging support.
4732
4733       • Mitchell N. Charity for giving me pointers into various interesting
4734         directions.
4735
4736       • Alexis Denis for making me improve (externally) and simplify
4737         (internally) floating point support. He can also be blamed
4738         (indirectly) for the "initializer" method, as I need it in my effort
4739         to support bitfields some day.
4740
4741       • Michael J. Hohmann <mjh@scientist.de> for endless discussions on our
4742         way to and back home from work, and for making me think about
4743         supporting "pack" and "unpack" for compound members.
4744
4745       • Thorsten Jens <thojens@gmx.de> for testing the package on various
4746         platforms.
4747
4748       • Mark Overmeer <mark@overmeer.net> for suggesting the module name and
4749         giving invaluable feedback.
4750
4751       • Thomas Pornin <pornin@bolet.org> for his excellent "ucpp"
4752         preprocessor library.
4753
4754       • Marc Rosenthal for his suggestions and support.
4755
4756       • James Roskind, as his C parser was a great starting point to fix all
4757         the problems I had with my original parser based only on the ANSI
4758         ruleset.
4759
4760       • Gisbert W. Selke for spotting some interesting bugs and providing
4761         extensive reports.
4762
4763       • Steffen Zimmermann for a prolific discussion on the cloning
4764         algorithm.
4765

BUGS

4767       I'm sure there are still lots of bugs in the code for this module. If
4768       you find any bugs, if Convert::Binary::C doesn't build on your system,
4769       or if any of its tests fail, please report the issue at
4770       <https://github.com/mhx/Convert-Binary-C/issues>.
4771

EXPERIMENTAL FEATURES

4773       Some features in Convert::Binary::C are marked as experimental.  This
4774       is most probably for one of the following reasons:
4775
4776       • The feature does not behave in exactly the way that I wish it did,
4777         possibly due to some limitations in the current design of the module.
4778
4779       • The feature hasn't been tested enough and may completely fail to
4780         produce the expected results.
4781
4782       I hope to fix most issues with these experimental features someday, but
4783       this may mean that I have to change the way they currently work in a
4784       way that's not backwards compatible.  So if any of these features is
4785       useful to you, you can use it, but you should be aware that the
4786       behaviour or the interface may change in future releases of this
4787       module.
4788

TODO

4790       If you're interested in what I currently plan to improve (or fix), have
4791       a look at the TODO file.
4792

COPYRIGHT

4794       Copyright (c) 2002-2020 Marcus Holland-Moritz. All rights reserved.
4795       This program is free software; you can redistribute it and/or modify it
4796       under the same terms as Perl itself.
4797
4798       The "ucpp" library is (c) 1998-2002 Thomas Pornin. For license and
4799       redistribution details refer to ctlib/ucpp/README.
4800
4801       Portions copyright (c) 1989, 1990 James A. Roskind.
4802