Convert::Binary::C(3)  User Contributed Perl Documentation  Convert::Binary::C(3)

NAME

       Convert::Binary::C - Binary Data Conversion using C Types

SYNOPSIS

       Simple

         use Convert::Binary::C;

         #---------------------------------------------
         # Create a new object and parse embedded code
         #---------------------------------------------
         my $c = Convert::Binary::C->new->parse(<<ENDC);

         enum Month { JAN, FEB, MAR, APR, MAY, JUN,
                      JUL, AUG, SEP, OCT, NOV, DEC };

         struct Date {
           int        year;
           enum Month month;
           int        day;
         };

         ENDC

         #-----------------------------------------------
         # Pack Perl data structure into a binary string
         #-----------------------------------------------
         my $date = { year => 2002, month => 'DEC', day => 24 };

         my $packed = $c->pack('Date', $date);

       Advanced

         use Convert::Binary::C;
         use Data::Dumper;

         #---------------------
         # Create a new object
         #---------------------
         my $c = new Convert::Binary::C ByteOrder => 'BigEndian';

         #---------------------------------------------------
         # Add include paths and global preprocessor defines
         #---------------------------------------------------
         $c->Include('/usr/lib/gcc/i686-pc-linux-gnu/4.1.2/include',
                     '/usr/include')
           ->Define(qw( __USE_POSIX __USE_ISOC99=1 ));

         #----------------------------------
         # Parse the 'time.h' header file
         #----------------------------------
         $c->parse_file('time.h');

         #---------------------------------------
         # See which files the object depends on
         #---------------------------------------
         print Dumper([$c->dependencies]);

         #-----------------------------------------------------------
         # See if struct timespec is defined and dump its definition
         #-----------------------------------------------------------
         if ($c->def('struct timespec')) {
           print Dumper($c->struct('timespec'));
         }

         #-------------------------------
         # Create some binary dummy data
         #-------------------------------
         my $data = "binary_test_string";

         #--------------------------------------------------------
         # Unpack $data according to 'struct timespec' definition
         #--------------------------------------------------------
         if (length($data) >= $c->sizeof('timespec')) {
           my $perl = $c->unpack('timespec', $data);
           print Dumper($perl);
         }

         #--------------------------------------------------------
         # See which member lies at offset 5 of 'struct timespec'
         #--------------------------------------------------------
         my $member = $c->member('timespec', 5);
         print "member('timespec', 5) = '$member'\n";

DESCRIPTION

       Convert::Binary::C is a preprocessor and parser for C type definitions.
       It is highly configurable and supports arbitrarily complex data
       structures. Its object-oriented interface has "pack" and "unpack"
       methods that act as replacements for Perl's "pack" and "unpack". They
       let you use C types, instead of a string representation of the data
       structure, to convert binary data to and from Perl's complex data
       structures.
97
98       Actually, what Convert::Binary::C does is not very different from what
99       a C compiler does, just that it doesn't compile the source code into an
100       object file or executable, but only parses the code and allows Perl to
101       use the enumerations, structs, unions and typedefs that have been
102       defined within your C source for binary data conversion, similar to
103       Perl's "pack" and "unpack".
104
105       Beyond that, the module offers a lot of convenience methods to retrieve
106       information about the C types that have been parsed.
107
108       Background and History
109
110       In late 2000 I wrote a real-time debugging interface for an embedded
111       medical device that allowed me to send out data from that device over
112       its integrated Ethernet adapter.  The interface was "printf()"-like, so
113       you could easily send out strings or numbers. But you could also send
114       out what I called arbitrary data, which was intended for arbitrary
115       blocks of the device's memory.
116
       Another part of this real-time debugger was a Perl application running
       on my workstation that gathered all the messages that were sent out
       from the embedded device. It printed all the strings and numbers, and
       hex-dumped the arbitrary data.  However, manually parsing a couple of
       300-byte hex-dumps of a complex C structure is not only frustrating,
       but also error-prone and time-consuming.
123
124       Using "unpack" to retrieve the contents of a C structure works fine for
125       small structures and if you don't have to deal with struct member
126       alignment. But otherwise, maintaining such code can be as awful as
127       deciphering hex-dumps.
128
129       As I didn't find anything to solve my problem on the CPAN, I wrote a
130       little module that translated simple C structs into "unpack" strings.
131       It worked, but it was slow. And since it couldn't deal with struct mem‐
132       ber alignment, I soon found myself adding padding bytes everywhere.  So
133       again, I had to maintain two sources, and changing one of them forced
134       me to touch the other one.
135
136       All in all, this little module seemed to make my task a bit easier, but
137       it was far from being what I was thinking of:
138
139       · A module that could directly use the source I've been coding for the
140         embedded device without any modifications.
141
142       · A module that could be configured to match the properties of the dif‐
143         ferent compilers and target platforms I was using.
144
145       · A module that was fast enough to decode a great amount of binary data
146         even on my slow workstation.
147
148       I didn't know how to accomplish these tasks until I read something
149       about XS. At least, it seemed as if it could solve my performance prob‐
150       lems. However, writing a C parser in C isn't easier than it is in Perl.
151       But writing a C preprocessor from scratch is even worse.
152
       Fortunately, after a few weeks of searching I found both a lean,
       open-source C preprocessor library and a reusable YACC grammar for
       ANSI C. That was the beginning of the development of
       Convert::Binary::C in late 2001.
157
       I have been using the module successfully in my embedded environment
       since long before it appeared on CPAN. From my point of view, it is
       exactly what I had in mind. It's fast, flexible, easy to use and
       portable. It doesn't require external programs or other Perl modules.
162
163       About this document
164
       This document describes how to use Convert::Binary::C. A lot of
       different features are presented, and the example code sometimes uses
       Perl's more advanced language elements. If your experience with Perl is
       rather limited, it helps to know how to use Perl's very good
       documentation system.
170
171       To look up one of the manpages, use the "perldoc" command.  For exam‐
172       ple,
173
174         perldoc perl
175
176       will show you Perl's main manpage. To look up a specific Perl function,
177       use "perldoc -f":
178
179         perldoc -f map
180
181       gives you more information about the "map" function.  You can also
182       search the FAQ using "perldoc -q":
183
184         perldoc -q array
185
186       will give you everything you ever wanted to know about Perl arrays. But
187       now, let's go on with some real stuff!
188
189       Why use Convert::Binary::C?
190
191       Say you want to pack (or unpack) data according to the following C
192       structure:
193
194         struct foo {
195           char ary[3];
196           unsigned short baz;
197           int bar;
198         };
199
200       You could of course use Perl's "pack" and "unpack" functions:
201
202         @ary = (1, 2, 3);
203         $baz = 40000;
204         $bar = -4711;
205         $binary = pack 'c3 S i', @ary, $baz, $bar;
206
207       But this implies that the struct members are byte aligned. If they were
208       long aligned (which is the default for most compilers), you'd have to
209       write
210
211         $binary = pack 'c3 x S x2 i', @ary, $baz, $bar;
212
213       which doesn't really increase readability.
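
       The effect of the padding can be checked with core Perl alone. The
       following is a minimal sketch that reuses the two templates from above
       and compares the lengths of the resulting strings; the variable names
       $unaligned and $aligned are only chosen for illustration.

```perl
use strict;
use warnings;

# Same values as in the text above.
my @ary = (1, 2, 3);
my $baz = 40000;
my $bar = -4711;

# Byte-aligned layout: members follow each other without gaps.
my $unaligned = pack 'c3 S i', @ary, $baz, $bar;

# Long-aligned layout: one pad byte after 'ary', two after 'baz'.
my $aligned = pack 'c3 x S x2 i', @ary, $baz, $bar;

printf "byte aligned: %d bytes\n", length $unaligned;
printf "long aligned: %d bytes\n", length $aligned;

# The 'x' padding consists of NUL bytes; show the aligned string in hex.
print unpack('H*', $aligned), "\n";
```

       The two strings differ by exactly the three pad bytes, which is the
       kind of detail that has to be kept in sync by hand whenever the C
       structure changes.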
214
215       Now imagine that you need to pack the data for a completely different
216       architecture with different byte order. You would look into the "pack"
217       manpage again and perhaps come up with this:
218
219         $binary = pack 'c3 x n x2 N', @ary, $baz, $bar;
220
       However, if you try to unpack $binary again, your signed values will
       have turned into unsigned ones.
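
       The loss of the sign is easy to reproduce. The sketch below uses only
       core "pack" and "unpack"; the detour through the native "L" and "l"
       formats is just one possible workaround, shown here as an assumption
       about how you might repair the value by hand.

```perl
use strict;
use warnings;

my $bar = -4711;

# 'N' is an unsigned 32-bit big-endian integer, so the sign is lost.
my $binary   = pack 'N', $bar;
my $unsigned = unpack 'N', $binary;

# Reinterpret the same 32 bits as a signed value via native 'L' and 'l'.
my $signed = unpack 'l', pack 'L', $unsigned;

print "unsigned: $unsigned\n";
print "signed:   $signed\n";
```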
223
       All this can still be managed with Perl. But what if your structures
       get more complex? What if you need to support different platforms, or
       make changes to the structures? You'll not only have to change the C
       source but also dozens of "pack" strings in your Perl code. This is no
       fun. And Perl should be fun.
229
230       Now, wouldn't it be great if you could just read in the C source you've
231       already written and use all the types defined there for packing and
232       unpacking? That's what Convert::Binary::C does.
233
234       Creating a Convert::Binary::C object
235
236       To use Convert::Binary::C just say
237
238         use Convert::Binary::C;
239
240       to load the module. Its interface is completely object oriented, so it
241       doesn't export any functions.
242
243       Next, you need to create a new Convert::Binary::C object. This can be
244       done by either
245
246         $c = Convert::Binary::C->new;
247
248       or
249
250         $c = new Convert::Binary::C;
251
252       You can optionally pass configuration options to the constructor as
253       described in the next section.
254
255       Configuring the object
256
257       To configure a Convert::Binary::C object, you can either call the "con‐
258       figure" method or directly pass the configuration options to the con‐
259       structor. If you want to change byte order and alignment, you can use
260
261         $c->configure(ByteOrder => 'LittleEndian',
262                       Alignment => 2);
263
264       or you can change the construction code to
265
266         $c = new Convert::Binary::C ByteOrder => 'LittleEndian',
267                                     Alignment => 2;
268
269       Either way, the object will now know that it should use little endian
270       (Intel) byte order and 2-byte struct member alignment for packing and
271       unpacking.
272
273       Alternatively, you can use the option names as names of methods to con‐
274       figure the object, like:
275
276         $c->ByteOrder('LittleEndian');
277
278       You can also retrieve information about the current configuration of a
279       Convert::Binary::C object. For details, see the section about the "con‐
280       figure" method.
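
       As a short sketch of reading the configuration back, the following
       assumes the documented behaviour of "configure": called with one
       option name it returns that option's value, and called without
       arguments it returns a reference to a hash of all options.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'LittleEndian',
                                Alignment => 2);

# An option name used as a method without arguments returns its value.
my $order = $c->ByteOrder;

# "configure" with a single option name does the same.
my $align = $c->configure('Alignment');

# Without arguments, "configure" returns a hash reference of all options.
my $config = $c->configure;

print "ByteOrder = $order, Alignment = $align\n";
print "$_\n" for sort keys %$config;
```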
281
282       Parsing C code
283
       Convert::Binary::C offers two ways of parsing C source. You can parse
       external C header or C source files:

         $c->parse_file('header.h');

       Or you can parse C code embedded in your script:
290
291         $c->parse(<<'CCODE');
292         struct foo {
293           char ary[3];
294           unsigned short baz;
295           int bar;
296         };
297         CCODE
298
       Now the object $c will know everything about "struct foo".  The example
       above uses a so-called here-document, which lets you easily embed
       multi-line strings in your code. You can find more about here-documents
       in perldata or perlop.
303
304       Since the "parse" and "parse_file" methods throw an exception when a
305       parse error occurs, you usually want to catch these in an "eval" block:
306
307         eval { $c->parse_file('header.h') };
308         if ($@) {
309           # handle error appropriately
310         }
311
312       Perl's special $@ variable will contain an empty string (which evalu‐
313       ates to a false value in boolean context) on success or an error string
314       on failure.
315
316       As another feature, "parse" and "parse_file" return a reference to
317       their object on success, just like "configure" does when you're config‐
318       uring the object. This will allow you to write constructs like this:
319
320         my $c = eval {
321           Convert::Binary::C->new(Include => ['/usr/include'])
322                             ->parse_file('header.h')
323         };
324         if ($@) {
325           # handle error appropriately
326         }
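
       To see the exception handling in action without needing a header file,
       here is a minimal sketch that feeds deliberately broken C code to
       "parse". The struct name "broken" and the variables $ok and $error are
       made up for this example; the exact wording of the error message is
       not shown because it depends on the parser.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;

# The missing semicolon after "int a" is a deliberate syntax error.
my $ok = eval {
  $c->parse('struct broken { int a };');
  1;
};

# Copy $@ right away, before anything else can overwrite it.
my $error = $ok ? '' : $@;

if ($ok) {
  print "parsed successfully\n";
}
else {
  print "parse failed: $error";
}
```

       Returning a true value as the last statement of the "eval" block and
       testing that value is a common idiom. It avoids relying on $@ alone.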
327
328       Packing and unpacking
329
       Convert::Binary::C has two methods, "pack" and "unpack", that act
       similarly to the Perl functions of the same name.  To perform the
       packing described in the example above, you could write:
333
334         $data = {
335           ary => [1, 2, 3],
336           baz => 40000,
337           bar => -4711,
338         };
339         $binary = $c->pack('foo', $data);
340
341       Unpacking will work exactly the same way, just that the "unpack" method
342       will take a byte string as its input and will return a reference to a
343       (possibly very complex) Perl data structure.
344
345         $binary = get_data_from_memory();
346         $data = $c->unpack('foo', $binary);
347
348       You can now easily access all of the values:
349
350         print "foo.ary[1] = $data->{ary}[1]\n";
351
352       Or you can even more conveniently use the Data::Dumper module:
353
354         use Data::Dumper;
355         print Dumper($data);
356
357       The output would look something like this:
358
359         $VAR1 = {
360           'bar' => -271,
361           'baz' => 5000,
362           'ary' => [
363             42,
364             48,
365             100
366           ]
367         };
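
       Putting the pieces together, the following sketch packs the values
       from the earlier example and unpacks them again. It runs with the
       object's default configuration, so the size of the packed string
       depends on your platform and module version; only the round trip
       itself is the point here.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'CCODE');
struct foo {
  char ary[3];
  unsigned short baz;
  int bar;
};
CCODE

my $data   = { ary => [1, 2, 3], baz => 40000, bar => -4711 };
my $binary = $c->pack('foo', $data);
my $copy   = $c->unpack('foo', $binary);

printf "packed %d bytes (sizeof says %d)\n",
       length($binary), $c->sizeof('foo');
print "foo.ary[1] = $copy->{ary}[1]\n";
print "foo.baz    = $copy->{baz}\n";
print "foo.bar    = $copy->{bar}\n";
```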
368
369       Preprocessor configuration
370
       Convert::Binary::C uses Thomas Pornin's "ucpp" as an internal C
       preprocessor. It is compliant with ISO C99, so you don't have to worry
       about using even weird preprocessor constructs in your code.
374
375       If your C source contains includes or depends upon preprocessor
376       defines, you may need to configure the internal preprocessor.  Use the
377       "Include" and "Define" configuration options for that:
378
379         $c->configure(Include => ['/usr/include',
380                                   '/home/mhx/include'],
381                       Define  => [qw( NDEBUG FOO=42 )]);
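
       The effect of "Define" is easiest to see with embedded code. In this
       sketch the macros BUFSIZE and WITH_CRC and the struct "buffer" are
       invented for illustration; they control the array size and whether an
       extra member exists.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(Define => ['BUFSIZE=16', 'WITH_CRC']);

$c->parse(<<'ENDC');
struct buffer {
  unsigned char data[BUFSIZE];
#ifdef WITH_CRC
  unsigned char crc;
#endif
};
ENDC

# 16 data bytes plus the crc byte that WITH_CRC enables.
my $size       = $c->sizeof('buffer');
my $crc_offset = $c->offsetof('buffer', 'crc');

print "sizeof(buffer) = $size, crc at offset $crc_offset\n";
```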
382
383       If your code uses system includes, it is most likely that you will need
384       to define the symbols that are usually defined by the compiler.
385
386       On some operating systems, the system includes require the preprocessor
387       to predefine a certain set of assertions.  Assertions are supported by
388       "ucpp", and you can define them either in the source code using
389       "#assert" or as a property of the Convert::Binary::C object using
390       "Assert":
391
392         $c->configure(Assert => ['predicate(answer)']);
393
394       Information about defined macros can be retrieved from the preprocessor
395       as long as its configuration isn't changed. The preprocessor is implic‐
396       itly reset if you change one of the following configuration options:
397
398         Include
399         Define
400         Assert
401         HasCPPComments
402         HasMacroVAARGS
403
404       Supported pragma directives
405
406       Convert::Binary::C supports the "pack" pragma to locally override
407       struct member alignment. The supported syntax is as follows:
408
409       #pragma pack( ALIGN )
410           Sets the new alignment to ALIGN. If ALIGN is 0, resets the align‐
411           ment to its original value.
412
413       #pragma pack
414           Resets the alignment to its original value.
415
416       #pragma pack( push, ALIGN )
417           Saves the current alignment on a stack and sets the new alignment
418           to ALIGN. If ALIGN is 0, sets the alignment to the default align‐
419           ment.
420
421       #pragma pack( pop )
422           Restores the alignment to the last value saved on the stack.
423
         /*  Example assumes sizeof( short ) == 2, sizeof( long ) == 4.  */

         #pragma pack(1)

         struct nopad {
           char a;               /* no padding bytes between 'a' and 'b' */
           long b;
         };

         #pragma pack            /* reset to "native" alignment          */

         #pragma pack( push, 2 )

         struct pad {
           char    a;            /* one padding byte between 'a' and 'b' */
           long    b;

         #pragma pack( push, 1 )

           struct {
             char  c;            /* no padding between 'c' and 'd'       */
             short d;
           }       e;            /* sizeof( e ) == 3                     */

         #pragma pack( pop );    /* back to pack( 2 )                    */

           long    f;            /* one padding byte between 'e' and 'f' */
         };

         #pragma pack( pop );    /* back to "native"                     */
454
       The "pack" pragma as it is currently implemented only affects the
       maximum struct member alignment. Some compilers also allow you to
       specify the minimum struct member alignment. This is not supported by
       Convert::Binary::C.
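
       The influence of the pragma can be observed with "sizeof" and
       "offsetof". The sketch below is a reduced variant of the example
       above. It assumes a configuration with 4-byte longs and 4-byte
       alignment, and the struct name "padded" is made up for comparison.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(LongSize => 4, Alignment => 4);

$c->parse(<<'ENDC');
struct padded {
  char a;               /* three padding bytes before 'b' */
  long b;
};

#pragma pack( push, 1 )
struct nopad {
  char a;               /* no padding bytes before 'b' */
  long b;
};
#pragma pack( pop )
ENDC

for my $type (qw( padded nopad )) {
  printf "%-6s: offsetof(b) = %d, sizeof = %d\n",
         $type, $c->offsetof($type, 'b'), $c->sizeof($type);
}
```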
459
460       Automatic configuration using "ccconfig"
461
462       As there are over 20 different configuration options, setting all of
463       them correctly can be a lengthy and tedious task.
464
       The "ccconfig" script, which is bundled with this module, aims at
       automatically determining the correct compiler configuration by testing
       the compiler executable. It works for both native and cross compilers.
468

UNDERSTANDING TYPES

470       This section covers one of the fundamental features of Con‐
471       vert::Binary::C. It's how type expressions, referred to as TYPEs in the
472       method reference, are handled by the module.
473
474       Many of the methods, namely "pack", "unpack", "sizeof", "typeof", "mem‐
475       ber", "offsetof", "def", "initializer" and "tag", are passed a TYPE to
476       operate on as their first argument.
477
478       Standard Types
479
480       These are trivial. Standard types are simply enum names, struct names,
481       union names, or typedefs. Almost every method that wants a TYPE will
482       accept a standard type.
483
484       For enums, structs and unions, the prefixes "enum", "struct" and
485       "union" are optional. However, if a typedef with the same name exists,
486       like in
487
488         struct foo {
489           int bar;
490         };
491
492         typedef int foo;
493
494       you will have to use the prefix to distinguish between the struct and
495       the typedef. Otherwise, a typedef is always given preference.
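
       A small sketch makes the preference visible. It is a variation of the
       code above with a second struct member added (an assumption made only
       so that the two sizes differ) and with the size of "int" fixed at
       4 bytes.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize => 4, Alignment => 4);

$c->parse(<<'ENDC');
struct foo {
  int bar;
  int baz;
};

typedef int foo;
ENDC

# Without a prefix the typedef is chosen, with the prefix the struct.
my $typedef_size = $c->sizeof('foo');
my $struct_size  = $c->sizeof('struct foo');

print "sizeof(foo)        = $typedef_size\n";
print "sizeof(struct foo) = $struct_size\n";
```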
496
497       Basic Types
498
499       Basic types, or atomic types, are "int" or "char", for example.  It's
500       possible to use these basic types without having parsed any code. You
501       can simply do
502
503         $c = new Convert::Binary::C;
504         $size = $c->sizeof('unsigned long');
505         $data = $c->pack('short int', 42);
506
507       Even though the above works fine, it is not possible to define more
508       complex types on the fly, so
509
510         $size = $c->sizeof('struct { int a, b; }');
511
512       will result in an error.
513
514       Basic types are not supported by all methods. For example, it makes no
515       sense to use "member" or "offsetof" on a basic type. Using "typeof"
516       isn't very useful, but supported.
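
       The following sketch exercises basic types on a fresh object and
       catches the error for the on-the-fly struct. The sizes for "short"
       and "long" are set explicitly, which is an assumption made so that the
       results do not depend on the platform.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ShortSize => 2, LongSize => 8);

# Basic types work without parsing any code.
my $size  = $c->sizeof('unsigned long');
my $data  = $c->pack('short int', 42);
my $value = $c->unpack('short int', $data);

printf "sizeof(unsigned long) = %d, packed short = %d bytes, value = %d\n",
       $size, length($data), $value;

# Defining a compound type on the fly is not supported and throws.
my $error;
eval { $c->sizeof('struct { int a, b; }'); 1 } or $error = $@;
print "error: $error" if defined $error;
```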
517
518       Member Expressions
519
520       This is by far the most complex part, depending on the complexity of
521       your data structures. Any standard type that defines a compound or an
522       array may be followed by a member expression to select only a certain
523       part of the data type. Say you have parsed the following C code:
524
525         struct foo {
526           long type;
527           struct {
528             short x, y;
529           } array[20];
530         };
531
532         typedef struct foo matrix[8][8];
533
534       You may want to know the size of the "array" member of "struct foo".
535       This is quite easy:
536
537         print $c->sizeof('foo.array'), " bytes";
538
539       will print
540
541         80 bytes
542
543       depending of course on the "ShortSize" you configured.
544
545       If you wanted to unpack only a single column of "matrix", that's easy
546       as well (and of course it doesn't matter which index you use):
547
548         $column = $c->unpack('matrix[2]', $data);
549
       Just like in C, it is possible to use out-of-bounds array indices.
       This means that, for example, although "array" is declared to have 20
       elements, the following code
553
554         $size   = $c->sizeof('foo.array[4711]');
555         $offset = $c->offsetof('foo', 'array[-13]');
556
557       is perfectly valid and will result in:
558
559         $size   = 4
560         $offset = -48
561
562       Member expressions can be arbitrarily complex:
563
564         $type = $c->typeof('matrix[2][3].array[7].y');
565         print "the type is $type";
566
567       will, for example, print
568
569         the type is short
570
571       Member expressions are also used as the second argument to "offsetof".
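
       The results quoted in this section can be reproduced with the sketch
       below. It assumes 2-byte shorts, 4-byte longs and 4-byte alignment,
       which is the configuration the numbers above are consistent with.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ShortSize => 2, LongSize => 4,
                                Alignment => 4);

$c->parse(<<'ENDC');
struct foo {
  long type;
  struct {
    short x, y;
  } array[20];
};

typedef struct foo matrix[8][8];
ENDC

# Sizes of a whole type, a member, an element and an array slice.
for my $type (qw( foo foo.array foo.array[4711] matrix matrix[2] )) {
  printf "%-16s %5d bytes\n", $type, $c->sizeof($type);
}

my $type   = $c->typeof('matrix[2][3].array[7].y');
my $offset = $c->offsetof('foo', 'array[-13]');

print "the type is $type\n";
print "array[-13] is at offset $offset\n";
```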
572
573       Offsets
574
575       Members returned by the "member" method have an optional offset suffix
576       to indicate that the given offset doesn't point to the start of that
577       member. For example,
578
579         $member = $c->member('matrix', 1431);
580         print $member;
581
582       will print
583
584         [2][1].type+3
585
586       If you would use this as a member expression, like in
587
588         $size = $c->sizeof("matrix $member");
589
590       the offset suffix will simply be ignored. Actually, it will be ignored
591       for all methods if it's used in the first argument.
592
       When used in the second argument to "offsetof", it will usually do what
       you mean, i.e. the offset suffix, if present, will be considered when
       determining the offset. This behaviour ensures that
596
597         $member = $c->member('foo', 43);
598         $offset = $c->offsetof('foo', $member);
599         print "'$member' is located at offset $offset of struct foo";
600
601       will always correctly set $offset:
602
603         '.array[9].y+1' is located at offset 43 of struct foo
604
605       If this is not what you mean, e.g. because you want to know the offset
606       where the member returned by "member" starts, you just have to remove
607       the suffix:
608
609         $member =~ s/\+\d+$//;
610         $offset = $c->offsetof('foo', $member);
611         print "'$member' starts at offset $offset of struct foo";
612
613       This would then print:
614
615         '.array[9].y' starts at offset 42 of struct foo
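
       The round trip between "member" and "offsetof" can be checked for a
       handful of offsets. This sketch reuses "struct foo" from above and
       again assumes 2-byte shorts, 4-byte longs and 4-byte alignment; the
       list of offsets is arbitrary.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ShortSize => 2, LongSize => 4,
                                Alignment => 4);

$c->parse(<<'ENDC');
struct foo {
  long type;
  struct {
    short x, y;
  } array[20];
};
ENDC

my %member_at;

for my $offset (0, 3, 4, 43, 83) {
  # In scalar context "member" returns a single member expression.
  my $member = $c->member('foo', $offset);
  my $back   = $c->offsetof('foo', $member);
  $member_at{$offset} = $member;

  # Strip the offset suffix to find where the member itself starts.
  (my $plain = $member) =~ s/\+\d+$//;
  my $start = $c->offsetof('foo', $plain);

  printf "%2d -> %-14s -> %2d (member starts at %2d)\n",
         $offset, $member, $back, $start;
}
```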
616

USING TAGS

618       In a nutshell, tags are properties that you can attach to types.
619
620       You can add tags to types using the "tag" method, and remove them using
621       "tag" or "untag", for example:
622
623         # Attach 'Format' and 'Hooks' tags
624         $c->tag('type', Format => 'String', Hooks => { pack => \&rout });
625
626         $c->untag('type', 'Format');  # Remove only 'Format' tag
627         $c->untag('type');            # Remove all tags
628
629       You can also use "tag" to see which tags are attached to a type, for
630       example:
631
632         $tags = $c->tag('type');
633
634       This would give you:
635
636         $tags = {
637           'Hooks' => {
638             'pack' => \&rout
639           },
640           'Format' => 'String'
641         };
642
643       Currently, there are only a couple of different tags that influence the
644       way data is packed and unpacked. There are probably more tags to come
645       in the future.
646
647       The Format Tag
648
649       One of the tags currently available is the "Format" tag.  Using this
650       tag, you can tell a Convert::Binary::C object to pack and unpack a cer‐
651       tain data type in a special way.
652
653       For example, if you have a (fixed length) string type
654
655         typedef char str_type[40];
656
657       this type would, by default, be unpacked as an array of "char"s. That's
658       because it is only an array of "char"s, and Convert::Binary::C doesn't
659       know it is actually used as a string.
660
661       But you can tell Convert::Binary::C that "str_type" is a C string using
662       the "Format" tag:
663
664         $c->tag('str_type', Format => 'String');
665
666       This will make "unpack" (and of course also "pack") treat the binary
667       data like a null-terminated C string:
668
669         $binary = "Hello World!\n\0 this is just some dummy data";
670         $hello = $c->unpack('str_type', $binary);
671         print $hello;
672
       would thus print:
674
675         Hello World!
676
677       Of course, this also works the other way round:
678
679         use Data::Hexdumper;
680
681         $binary = $c->pack('str_type', "Just another C::B::C hacker");
682         print hexdump(data => $binary);
683
684       would print:
685
686           0x0000 : 4A 75 73 74 20 61 6E 6F 74 68 65 72 20 43 3A 3A : Just.another.C::
687           0x0010 : 42 3A 3A 43 20 68 61 63 6B 65 72 00 00 00 00 00 : B::C.hacker.....
688           0x0020 : 00 00 00 00 00 00 00 00                         : ........
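
       If Data::Hexdumper is not installed, core "unpack" with the "H*"
       template is enough to inspect the packed string. The following sketch
       combines the two directions shown above and also queries the tag
       again; it assumes a module version that supports the "Format" tag.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse('typedef char str_type[40];');

$c->tag('str_type', Format => 'String');

my $binary = $c->pack('str_type', "Just another C::B::C hacker");
my $string = $c->unpack('str_type', $binary);

# The string is padded with NUL bytes up to the size of the array.
printf "%d bytes: %s\n", length($binary), unpack('H*', $binary);
print "$string\n";

# "tag" with only a type returns a hash reference of the attached tags.
print "tags: ", join(', ', sort keys %{ $c->tag('str_type') }), "\n";
```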
689
       If you want Convert::Binary::C to not interpret the binary data at all,
       you can set the "Format" tag to "Binary".  This might not seem very
       useful, as "pack" and "unpack" would just pass through the unmodified
       binary data.  But you can tag not only whole types, but also compound
       members. For example
695
696         $c->parse(<<ENDC);
697         struct packet {
698           unsigned short header;
699           unsigned short flags;
700           unsigned char  payload[28];
701         };
702         ENDC
703
704         $c->tag('packet.payload', Format => 'Binary');
705
706       would allow you to write:
707
708         read FILE, $payload, $c->sizeof('packet.payload');
709
710         $packet = {
711                     header  => 4711,
712                     flags   => 0xf00f,
713                     payload => $payload,
714                   };
715
716         $binary = $c->pack('packet', $packet);
717
718         print hexdump(data => $binary);
719
720       This would print something like:
721
722           0x0000 : 12 67 F0 0F 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A : .g..no.no.no.no.
723           0x0010 : 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E : no.no.no.no.no.n
724
725       For obvious reasons, it is not allowed to attach a "Format" tag to bit‐
726       field members. Trying to do so will result in an exception being thrown
727       by the "tag" method.
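
       A self-contained variant of the packet example, without a file handle
       or Data::Hexdumper, might look like the sketch below. The payload
       contents are made up, and big endian byte order with 2-byte shorts is
       assumed because that matches the hexdump shown above.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                ShortSize => 2);

$c->parse(<<'ENDC');
struct packet {
  unsigned short header;
  unsigned short flags;
  unsigned char  payload[28];
};
ENDC

$c->tag('packet.payload', Format => 'Binary');

# Build a payload that is exactly as long as the member.
my $payload = substr("no\n" x 10, 0, $c->sizeof('packet.payload'));

my $binary = $c->pack('packet', {
  header  => 4711,
  flags   => 0xf00f,
  payload => $payload,
});

my $packet = $c->unpack('packet', $binary);

print "first four bytes: ", unpack('H8', $binary), "\n";
print "payload unchanged\n" if $packet->{payload} eq $payload;
```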
728
729       The ByteOrder Tag
730
731       The "ByteOrder" tag allows you to override the byte order of certain
732       types or members. The implementation of this tag is considered experi‐
733       mental and may be subject to changes in the future.
734
735       Usually it doesn't make much sense to override the byte order, but
736       there may be applications where a sub-structure is packed in a differ‐
737       ent byte order than the surrounding structure.
738
739       Take, for example, the following code:
740
741         $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
742                                      OrderMembers => 1);
743         $c->parse(<<'ENDC');
744
745         typedef unsigned short u_16;
746
747         struct coords_3d {
748           long x, y, z;
749         };
750
751         struct coords_msg {
752           u_16 header;
753           u_16 length;
754           struct coords_3d coords;
755         };
756
757         ENDC
758
759       Assume that while "coords_msg" is big endian, the embedded coordinates
760       "coords_3d" are stored in little endian format for some reason. In C,
761       you'll have to handle this manually.
762
763       But using Convert::Binary::C, you can simply attach a "ByteOrder" tag
764       to either the "coords_3d" structure or to the "coords" member of the
765       "coords_msg" structure. Both will work in this case. The only differ‐
766       ence is that if you tag the "coords" member, "coords_3d" will only be
767       treated as little endian if you "pack" or "unpack" the "coords_msg"
768       structure. (BTW, you could also tag all members of "coords_3d" individ‐
769       ually, but that would be inefficient.)
770
771       So, let's attach the "ByteOrder" tag to the "coords" member:
772
773         $c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');
774
775       Assume the following binary message:
776
777           0x0000 : 00 2A 00 0C FF FF FF FF 02 00 00 00 2A 00 00 00 : .*..........*...
778
779       If you unpack this message...
780
781         $msg = $c->unpack('coords_msg', $binary);
782
783       ...you will get the following data structure:
784
785         $msg = {
786           'header' => 42,
787           'length' => 12,
788           'coords' => {
789             'x' => -1,
790             'y' => 2,
791             'z' => 42
792           }
793         };
794
795       Without the "ByteOrder" tag, you would get:
796
797         $msg = {
798           'header' => 42,
799           'length' => 12,
800           'coords' => {
801             'x' => -1,
802             'y' => 33554432,
803             'z' => 704643072
804           }
805         };
806
807       The "ByteOrder" tag is a recursive tag, i.e. it applies to all children
808       of the tagged object recursively. Of course, it is also possible to
809       override a "ByteOrder" tag by attaching another "ByteOrder" tag to a
810       child type. Confused? Here's an example. In addition to tagging the
811       "coords" member as little endian, we now tag "coords_3d.y" as big
812       endian:
813
814         $c->tag('coords_3d.y', ByteOrder => 'BigEndian');
815         $msg = $c->unpack('coords_msg', $binary);
816
817       This will return the following data structure:
818
819         $msg = {
820           'header' => 42,
821           'length' => 12,
822           'coords' => {
823             'x' => -1,
824             'y' => 33554432,
825             'z' => 42
826           }
827         };
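
       The first two results above can be reproduced from the hex string of
       the message. This is a sketch under the assumption of 2-byte shorts,
       4-byte longs and a module version that supports the "ByteOrder" tag;
       the variable names $plain and $tagged are illustrative.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                ShortSize => 2,
                                LongSize  => 4,
                                Alignment => 4);

$c->parse(<<'ENDC');
typedef unsigned short u_16;

struct coords_3d {
  long x, y, z;
};

struct coords_msg {
  u_16 header;
  u_16 length;
  struct coords_3d coords;
};
ENDC

# The binary message from the hexdump above.
my $binary = pack 'H*', '002a000cffffffff020000002a000000';

# First without the tag, then with the member tagged as little endian.
my $plain = $c->unpack('coords_msg', $binary);

$c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');
my $tagged = $c->unpack('coords_msg', $binary);

for my $axis (qw( x y z )) {
  printf "%s: untagged %d, tagged %d\n",
         $axis, $plain->{coords}{$axis}, $tagged->{coords}{$axis};
}
```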
828
       Note that if you tag both a type and a member of that type within a
       compound, the tag attached to the type itself has higher precedence.
       Using the example above, if you attached a "ByteOrder" tag to both
       "coords_msg.coords" and "coords_3d", the tag attached to "coords_3d"
       would always win.
834
       Also note that the "ByteOrder" tag might not work as expected with
       bitfields, which is why the implementation is considered
       experimental. Bitfields are currently not affected by the "ByteOrder"
       tag at all. This is because the byte order would affect the bitfield
       layout, and a consistent implementation supporting multiple layouts of
       the same struct would be quite bulky and would probably slow down the
       whole module.
841
842       If you really need the correct behaviour, you can use the following
843       trick:
844
845         $le = Convert::Binary::C->new(ByteOrder => 'LittleEndian');
846
847         $le->parse(<<'ENDC');
848
849         typedef unsigned short u_16;
850         typedef unsigned long  u_32;
851
852         struct message {
853           u_16 header;
854           u_16 length;
855           struct {
856             u_32 a;
857             u_32 b;
858             u_32 c :  7;
859             u_32 d :  5;
860             u_32 e : 20;
861           } data;
862         };
863
864         ENDC
865
866         $be = $le->clone->ByteOrder('BigEndian');
867
868         $le->tag('message.data', Format => 'Binary', Hooks => {
869             unpack => sub { $be->unpack('message.data', @_) },
870             pack   => sub { $be->pack('message.data', @_) },
871           });
872
873         $msg = $le->unpack('message', $binary);
874
875       This uses the "Format" and "Hooks" tags along with a big endian "clone"
876       of the original little endian object. It attaches hooks to the little
877       endian object and in the hooks it uses the big endian object to "pack"
878       and "unpack" the binary data.
879
880       The Dimension Tag
881
882       The "Dimension" tag allows you to override the declared dimension of an
883       array for packing or unpacking data. The implementation of this tag is
884       considered very experimental and will definitely change in a future
885       release.
886
887       That being said, the "Dimension" tag is primarily useful to support
888       variable length arrays. Usually, you have to write the following code
889       for such a variable length array in C:
890
891         struct c_message
892         {
893           unsigned count;
894           char data[1];
895         };
896
       So, because you cannot declare an empty array, you declare an array
       with a single element. If you have an ISO-C99 compliant compiler, you
       can write this code instead:
900
901         struct c99_message
902         {
903           unsigned count;
904           char data[];
905         };
906
       This explicitly tells the compiler that "data" is a flexible array
       member. Convert::Binary::C already uses this information to handle
       flexible array members in a special way.
910
       As you can see in the following example, the two types are treated
       differently:
913
914         $data = pack 'NC*', 3, 1..8;
915         $uc   = $c->unpack('c_message', $data);
916         $uc99 = $c->unpack('c99_message', $data);
917
918       This will result in:
919
920         $uc = {'count' => 3,'data' => [1]};
921         $uc99 = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
922
       However, only a few compilers support ISO-C99, and you probably don't
       want to change your existing code only to get some extra features when
       using Convert::Binary::C.
926
       So it is possible to attach a tag to the "data" member of the
       "c_message" struct that tells Convert::Binary::C to treat the array as
       if it were flexible:
930
931         $c->tag('c_message.data', Dimension => '*');
932
933       Now both "c_message" and "c99_message" will behave exactly the same
934       when using "pack" or "unpack".  Repeating the above code:
935
936         $uc = $c->unpack('c_message', $data);
937
938       This will result in:
939
940         $uc = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
941
942       But there's more you can do. Even though it probably doesn't make much
943       sense, you can tag a fixed dimension to an array:
944
945         $c->tag('c_message.data', Dimension => '5');
946
947       This will obviously result in:
948
949         $uc = {'count' => 3,'data' => [1,2,3,4,5]};
950
951       A more useful way to use the "Dimension" tag is to set it to the name
952       of a member in the same compound:
953
954         $c->tag('c_message.data', Dimension => 'count');
955
956       Convert::Binary::C will now use the value of that member to determine
957       the size of the array, so unpacking will result in:
958
959         $uc = {'count' => 3,'data' => [1,2,3]};
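
       For comparison, the three "Dimension" variants shown so far can be
       mimicked with perl's built-in "unpack". This is only a sketch of the
       idea, not of how Convert::Binary::C works internally; it assumes that
       "unsigned" occupies four bytes in big endian order, matching the "N"
       used to build $data above:

```perl
use strict;
use warnings;

# Same test data as above: a 32-bit big endian count followed by 8 bytes.
my $data = pack 'NC*', 3, 1 .. 8;

# Read the count member first.
my ($count) = unpack 'N', $data;

# Skip the 4 count bytes, then read signed chars with different dimensions.
my @flexible = unpack 'x4 c*', $data;        # like Dimension => '*'
my @fixed    = unpack 'x4 c5', $data;        # like Dimension => '5'
my @counted  = unpack "x4 c$count", $data;   # like Dimension => 'count'

print "flexible: @flexible\n";
print "fixed:    @fixed\n";
print "counted:  @counted\n";
```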
960
961       Of course, you can also tag flexible array members. And yes, it's also
962       possible to use more complex member expressions:
963
964         $c->parse(<<ENDC);
965         struct msg_header
966         {
967           unsigned len[2];
968         };
969
970         struct more_complex
971         {
972           struct msg_header hdr;
973           char data[];
974         };
975         ENDC
976
977         $data = pack 'NNC*', 42, 7, 1 .. 10;
978
979         $c->tag('more_complex.data', Dimension => 'hdr.len[1]');
980
981         $u = $c->unpack('more_complex', $data);
982
983       The result will be:
984
985         $u = {
986           'hdr' => {
987             'len' => [
988               42,
989               7
990             ]
991           },
992           'data' => [
993             1,
994             2,
995             3,
996             4,
997             5,
998             6,
999             7
1000           ]
1001         };
1002
1003       By the way, it's also possible to tag arrays that are not embedded
1004       inside a compound:
1005
1006         $c->parse(<<ENDC);
1007         typedef unsigned short short_array[];
1008         ENDC
1009
1010         $c->tag('short_array', Dimension => '5');
1011
1012         $u = $c->unpack('short_array', $data);
1013
1014       Resulting in:
1015
1016         $u = [0,42,0,7,258];
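
       The values in this result follow directly from the bytes in $data.
       As a quick cross-check using only perl's built-in "unpack", and
       assuming 16-bit big endian shorts, the first ten bytes can be decoded
       like this:

```perl
use strict;
use warnings;

# Same data as in the "more_complex" example above.
my $data = pack 'NNC*', 42, 7, 1 .. 10;

# Five unsigned 16-bit big endian values ('n') cover the first 10 bytes:
# 00 00 | 00 2A | 00 00 | 00 07 | 01 02
my @shorts = unpack 'n5', $data;

print "@shorts\n";
```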
1017
       The final and most powerful way to define a "Dimension" tag is to pass
       it a subroutine reference. The referenced subroutine can execute
       whatever code is necessary to determine the size of the tagged array:
1021
1022         sub get_size
1023         {
1024           my $m = shift;
1025           return $m->{hdr}{len}[0] / $m->{hdr}{len}[1];
1026         }
1027
1028         $c->tag('more_complex.data', Dimension => \&get_size);
1029
1030         $u = $c->unpack('more_complex', $data);
1031
       As you can guess from the above code, the subroutine is passed a
       reference to a hash that stores the already unpacked part of the
       compound embedding the tagged array. This is the result:
1035
1036         $u = {
1037           'hdr' => {
1038             'len' => [
1039               42,
1040               7
1041             ]
1042           },
1043           'data' => [
1044             1,
1045             2,
1046             3,
1047             4,
1048             5,
1049             6
1050           ]
1051         };
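
       The array size of 6 is simply 42 / 7. The following sketch replays
       that computation by hand with perl's built-in "unpack"; the hash %m
       is a hypothetical stand-in for the partially unpacked compound that
       the subroutine receives, and the layout (two 32-bit big endian
       lengths followed by chars) is an assumption taken from the "pack"
       call above:

```perl
use strict;
use warnings;

my $data = pack 'NNC*', 42, 7, 1 .. 10;

# Decode the header first: two 32-bit big endian values.
my %m = ( hdr => { len => [ unpack 'N2', $data ] } );

# Same computation as in get_size above.
my $size = $m{hdr}{len}[0] / $m{hdr}{len}[1];

# Skip the 8 header bytes and read $size signed chars.
my @payload = unpack "x8 c$size", $data;

print "size=$size data=@payload\n";
```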
1052
1053       You can also pass custom arguments to the subroutines by using the
1054       "arg" method. This is similar to the functionality offered by the
1055       "Hooks" tag.
1056
       Of course, all of this works for the "pack" method as well.
1058
       However, the current implementation has at least one shortcoming,
       which is why it's experimental: The "Dimension" tag doesn't impact
       compound layout. This means that while you can alter the size of an
       array in the middle of a compound, the offset of the members after
       that array won't be impacted. I'd rather like to see the layout adapt
       dynamically, so this is what I'm hoping to implement in the future.
1065
1066       The Hooks Tag
1067
1068       Hooks are a special kind of tag that can be extremely useful.
1069
1070       Using hooks, you can easily override the way "pack" and "unpack" handle
1071       data using your own subroutines.  If you define hooks for a certain
1072       data type, each time this data type is processed the corresponding hook
1073       will be called to allow you to modify that data.
1074
1075       Basic Hooks
1076
1077       Here's an example. Let's assume the following C code has been parsed:
1078
1079         typedef unsigned long u_32;
1080         typedef u_32          ProtoId;
1081         typedef ProtoId       MyProtoId;
1082
1083         struct MsgHeader {
1084           MyProtoId id;
1085           u_32      len;
1086         };
1087
1088         struct String {
1089           u_32 len;
1090           char buf[];
1091         };
1092
1093       You could now use the types above and, for example, unpack binary data
1094       representing a "MsgHeader" like this:
1095
1096         $msg_header = $c->unpack('MsgHeader', $data);
1097
1098       This would give you:
1099
1100         $msg_header = {
1101           'len' => 13,
1102           'id' => 42
1103         };
1104
       Instead of dealing with "ProtoId" values as integers, you might prefer
       to have them as clear text. You could provide subroutines to convert
       between clear text and integers:
1108
1109         %proto = (
1110           CATS      =>    1,
1111           DOGS      =>   42,
1112           HEDGEHOGS => 4711,
1113         );
1114
1115         %rproto = reverse %proto;
1116
         sub ProtoId_unpack {
           $rproto{$_[0]} || 'unknown protocol'
         }

         sub ProtoId_pack {
           $proto{$_[0]} or die 'unknown protocol'
         }
1124
1125       You can now register these subroutines by attaching a "Hooks" tag to
1126       "ProtoId" using the "tag" method:
1127
1128         $c->tag('ProtoId', Hooks => { pack   => \&ProtoId_pack,
1129                                       unpack => \&ProtoId_unpack });
1130
1131       Doing exactly the same unpack on "MsgHeader" again would now return:
1132
1133         $msg_header = {
1134           'len' => 13,
1135           'id' => 'DOGS'
1136         };
1137
       Actually, if you don't need the reverse operation, you don't even have
       to register a "pack" hook. Or, even better, you can have a more
       intelligent "unpack" hook that creates a dual-typed variable:
1141
         use Scalar::Util qw(dualvar);

         sub ProtoId_unpack2 {
           dualvar $_[0], $rproto{$_[0]} || 'unknown protocol'
         }

         $c->tag('ProtoId', Hooks => { unpack => \&ProtoId_unpack2 });

         $msg_header = $c->unpack('MsgHeader', $data);
1151
       Just as before, this would give you
1153
1154         $msg_header = {
1155           'len' => 13,
1156           'id' => 'DOGS'
1157         };
1158
1159       but without requiring a "pack" hook for packing, at least as long as
1160       you keep the variable dual-typed.
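
       To see what "dual-typed" means in practice, here is a small
       standalone sketch using only Scalar::Util. It builds the same kind of
       value that "ProtoId_unpack2" returns for the id 42 and then reads it
       once in string context and once in numeric context:

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

my %proto  = ( CATS => 1, DOGS => 42, HEDGEHOGS => 4711 );
my %rproto = reverse %proto;

# Same construction as in ProtoId_unpack2, applied to the id 42.
my $id = dualvar 42, $rproto{42} || 'unknown protocol';

my $as_string = "$id";     # the string slot of the dualvar
my $as_number = $id + 0;   # the numeric slot of the dualvar

print "string: $as_string, number: $as_number\n";
```

       Because the numeric slot still holds the original integer, the value
       can be handed back to "pack" without a "pack" hook.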
1161
1162       Hooks are usually called with exactly one argument, which is the data
1163       that should be processed (see "Advanced Hooks" for details on how to
1164       customize hook arguments). They are called in scalar context and
1165       expected to return the processed data.
1166
1167       To get rid of registered hooks, you can either undefine only certain
1168       hooks
1169
1170         $c->tag('ProtoId', Hooks => { pack => undef });
1171
1172       or all hooks:
1173
1174         $c->tag('ProtoId', Hooks => undef);
1175
1176       Of course, hooks are not restricted to handling integer values.  You
1177       could just as well attach hooks for the "String" struct from the code
1178       above. A useful example would be to have these hooks:
1179
1180         sub string_unpack {
1181           my $s = shift;
1182           pack "c$s->{len}", @{$s->{buf}};
1183         }
1184
1185         sub string_pack {
1186           my $s = shift;
1187           return {
1188             len => length $s,
1189             buf => [ unpack 'c*', $s ],
1190           }
1191         }
1192
1193       (Don't be confused by the fact that the "unpack" hook uses "pack" and
1194       the "pack" hook uses "unpack".  And also see "Advanced Hooks" for a
1195       more clever approach.)
1196
1197       While you would normally get the following output when unpacking a
1198       "String"
1199
1200         $string = {
1201           'len' => 12,
1202           'buf' => [
1203             72,
1204             101,
1205             108,
1206             108,
1207             111,
1208             32,
1209             87,
1210             111,
1211             114,
1212             108,
1213             100,
1214             33
1215           ]
1216         };
1217
1218       you could just register the hooks using
1219
1220         $c->tag('String', Hooks => { pack   => \&string_pack,
1221                                      unpack => \&string_unpack });
1222
1223       and you would get a nice human-readable Perl string:
1224
1225         $string = 'Hello World!';
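
       The two hooks are ordinary subroutines, so they can be tried out
       without Convert::Binary::C. The following sketch calls them directly
       and shows that they convert between the hash form and the string
       form in both directions:

```perl
use strict;
use warnings;

# The hooks from above, unchanged.
sub string_unpack {
  my $s = shift;
  pack "c$s->{len}", @{$s->{buf}};
}

sub string_pack {
  my $s = shift;
  return {
    len => length $s,
    buf => [ unpack 'c*', $s ],
  }
}

# String -> hash with length and character codes -> string again.
my $struct = string_pack('Hello World!');
my $text   = string_unpack($struct);

print "len=$struct->{len} buf=@{$struct->{buf}}\n";
print "$text\n";
```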
1226
1227       Packing a string turns out to be just as easy:
1228
1229         use Data::Hexdumper;
1230
1231         $data = $c->pack('String', 'Just another Perl hacker,');
1232
1233         print hexdump(data => $data);
1234
1235       This would print:
1236
1237           0x0000 : 00 00 00 19 4A 75 73 74 20 61 6E 6F 74 68 65 72 : ....Just.another
1238           0x0010 : 20 50 65 72 6C 20 68 61 63 6B 65 72 2C          : .Perl.hacker,
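
       The layout in this hexdump is a 32-bit length followed by the string
       bytes. As a point of comparison only, perl's built-in "pack" can
       produce the same byte sequence with the "N/a*" template, assuming
       that "u_32" is four bytes wide and big endian as in the dump:

```perl
use strict;
use warnings;

my $string = 'Just another Perl hacker,';

# 'N/a*' writes a 32-bit big endian length followed by that many bytes.
my $data = pack 'N/a*', $string;

# Hex representation of the packed bytes, for comparison with the dump.
my $hex = uc(unpack('H*', $data));

# The same template reads the length-prefixed string back.
my ($back) = unpack 'N/a*', $data;

print "$hex\n";
print "$back\n";
```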
1239
1240       If you want to find out if or which hooks are registered for a certain
1241       type, you can also use the "tag" method:
1242
1243         $hooks = $c->tag('String', 'Hooks');
1244
1245       This would return:
1246
1247         $hooks = {
1248           'unpack' => \&string_unpack,
1249           'pack' => \&string_pack
1250         };
1251
1252       Advanced Hooks
1253
       It is also possible to combine hooks with the "Format" tag.  This can
       be useful if you know better than Convert::Binary::C how to interpret
       the binary data. In the previous section, we've handled this type
1257
1258         struct String {
1259           u_32 len;
1260           char buf[];
1261         };
1262
1263       with the following hooks:
1264
1265         sub string_unpack {
1266           my $s = shift;
1267           pack "c$s->{len}", @{$s->{buf}};
1268         }
1269
1270         sub string_pack {
1271           my $s = shift;
1272           return {
1273             len => length $s,
1274             buf => [ unpack 'c*', $s ],
1275           }
1276         }
1277
1278         $c->tag('String', Hooks => { pack   => \&string_pack,
1279                                      unpack => \&string_unpack });
1280
1281       As you can see in the hook code, "buf" is expected to be an array of
1282       characters. For the "unpack" case Convert::Binary::C first turns the
1283       binary data into a Perl array, and then the hook packs it back into a
1284       string. The intermediate array creation and destruction is completely
1285       useless.  Same thing, of course, for the "pack" case.
1286
1287       Here's a clever way to handle this. Just tag "buf" as binary
1288
1289         $c->tag('String.buf', Format => 'Binary');
1290
1291       and use the following hooks instead:
1292
1293         sub string_unpack2 {
1294           my $s = shift;
1295           substr $s->{buf}, 0, $s->{len};
1296         }
1297
1298         sub string_pack2 {
1299           my $s = shift;
1300           return {
1301             len => length $s,
1302             buf => $s,
1303           }
1304         }
1305
1306         $c->tag('String', Hooks => { pack   => \&string_pack2,
1307                                      unpack => \&string_unpack2 });
1308
       This will be exactly equivalent to the old code, but faster and
       probably even much easier to understand.
1311
       But hooks are even more powerful. You can customize the arguments that
       are passed to your hooks and you can use "arg" to pass certain special
       arguments, such as the name of the type that is currently being
       processed by the hook.
1316
       The following example shows how easily you can peek into the perl
       internals using hooks.
1319
1320         use Config;
1321
1322         $c = new Convert::Binary::C %CC, OrderMembers => 1;
1323         $c->Include(["$Config{archlib}/CORE", @{$c->Include}]);
1324         $c->parse(<<ENDC);
1325         #include "EXTERN.h"
1326         #include "perl.h"
1327         ENDC
1328
1329         $c->tag($_, Hooks => { unpack_ptr => [\&unpack_ptr,
1330                                               $c->arg(qw(SELF TYPE DATA))] })
1331             for qw( XPVAV XPVHV );
1332
1333       First, we add the perl core include path and parse perl.h. Then, we add
1334       an "unpack_ptr" hook for a couple of the internal data types.
1335
       The "unpack_ptr" and "pack_ptr" hooks are called whenever a pointer to
       a certain data structure is processed. This is by far the most
       experimental part of the hooks feature, as this includes any kind of
       pointer. There's no way for the hook to know the difference between a
       plain pointer, or a pointer to a pointer, or a pointer to an array
       (this is because the difference doesn't matter anywhere else in
       Convert::Binary::C).
1343
1344       But the hook above makes use of another very interesting feature: It
1345       uses "arg" to pass special arguments to the hook subroutine.  Usually,
1346       the hook subroutine is simply passed a single data argument.  But using
1347       the above definition, it'll get a reference to the calling object
1348       ("SELF"), the name of the type being processed ("TYPE") and the data
1349       ("DATA").
1350
       But what does our hook look like?
1352
1353         sub unpack_ptr {
1354           my($self, $type, $ptr) = @_;
1355           $ptr or return '<NULL>';
1356           my $size = $self->sizeof($type);
1357           $self->unpack($type, unpack("P$size", pack('I', $ptr)));
1358         }
1359
       As you can see, the hook is rather simple. First, it receives the
       arguments mentioned above. It then checks whether the pointer is
       "NULL", in which case it isn't processed any further. Next, it
       determines the size of the type being processed. And finally, it'll
       just use the "P"n unpack template to read from that memory location
       and recursively call "unpack" to unpack the type. (And yes, this may
       of course again call other hooks.)
1367
1368       Now, let's test that:
1369
1370         my $ref = { foo => 42, bar => 4711 };
1371         my $ptr = hex(("$ref" =~ /\(0x([[:xdigit:]]+)\)$/)[0]);
1372
1373         print Dumper(unpack_ptr($c, 'AV', $ptr));
1374
       Just for the fun of it, we create a hash reference. But how do we get
       a pointer to the underlying perl data structure? This is rather easy,
       as its address is just the hex value that appears when using the
       reference in string context. So we just grab that and convert it to a
       number using "hex". All that's left to do is just call our hook, as it
       can already handle "AV" pointers. And this is what we get:
1381
1382         $VAR1 = {
1383           'sv_any' => {
1384             'xnv_u' => {
1385               'xnv_nv' => '2.18376848395956105e-4933',
1386               'xgv_stash' => 0,
1387               'xpad_cop_seq' => {
1388                 'xlow' => 0,
1389                 'xhigh' => 139484332
1390               },
1391               'xbm_s' => {
1392                 'xbm_previous' => 0,
1393                 'xbm_flags' => 172,
1394                 'xbm_rare' => 92
1395               }
1396             },
1397             'xav_fill' => 2,
1398             'xav_max' => 7,
1399             'xiv_u' => {
1400               'xivu_iv' => 2,
1401               'xivu_uv' => 2,
1402               'xivu_p1' => 2,
1403               'xivu_i32' => 2,
1404               'xivu_namehek' => 2,
1405               'xivu_hv' => 2
1406             },
1407             'xmg_u' => {
1408               'xmg_magic' => 0,
1409               'xmg_ourstash' => 0
1410             },
1411             'xmg_stash' => 0
1412           },
1413           'sv_refcnt' => 1,
1414           'sv_flags' => 536870924,
1415           'sv_u' => {
1416             'svu_iv' => 139483844,
1417             'svu_uv' => 139483844,
1418             'svu_rv' => 139483844,
1419             'svu_pv' => 139483844,
1420             'svu_array' => 139483844,
1421             'svu_hash' => 139483844,
1422             'svu_gp' => 139483844
1423           }
1424         };
1425
1426       Even though it is rather easy to do such stuff using "unpack_ptr"
1427       hooks, you should really know what you're doing and do it with extreme
1428       care because of the limitations mentioned above. It's really easy to
1429       run into segmentation faults when you're dereferencing pointers that
1430       point to memory which you don't own.
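
       The address extraction used in the test above can be checked safely,
       without dereferencing anything. This sketch compares the value
       parsed from the stringified reference with the one returned by
       "refaddr" from Scalar::Util; using "refaddr" as a cross-check is an
       addition for illustration and not something the example above relies
       on:

```perl
use strict;
use warnings;
no warnings 'portable';   # addresses above 0xffffffff are expected on 64-bit
use Scalar::Util qw(refaddr);

my $ref = { foo => 42, bar => 4711 };

# Same expression as in the example above: parse "HASH(0x...)".
my $ptr = hex(("$ref" =~ /\(0x([[:xdigit:]]+)\)$/)[0]);

# refaddr() returns the same address as a plain number.
my $same = $ptr == refaddr($ref) ? 'yes' : 'no';

print "addresses match: $same\n";
```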
1431
1432       Performance
1433
       Hooks don't come for free. In performance-critical applications you
       have to keep in mind that hooks are actually perl subroutines and that
       they are called once for every value of a registered type that is
       being packed or unpacked. If only about 10% of the values require
       hooks to be called, you'll hardly notice the difference (if your hooks
       are implemented efficiently, that is).  But if all values require
       hooks to be called, that alone could easily make packing and unpacking
       very slow.
1442
1443       Tag Order
1444
       Since it is possible to attach multiple tags to a single type, the
       order in which the tags are processed is important. Here's a small
       table that shows the processing order.
1448
1449         pack        unpack
1450         ---------------------
1451         Hooks       Format
1452         Format      ByteOrder
1453         ByteOrder   Hooks
1454
1455       As a general rule, the "Hooks" tag is always the first thing processed
1456       when packing data, and the last thing processed when unpacking data.
1457
       The "Format" and "ByteOrder" tags are mutually exclusive; when both
       are given, the "Format" tag wins.
1460

METHODS

1462       new
1463
1464       "new"
1465       "new" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1466               The constructor is used to create a new Convert::Binary::C
1467               object.  You can simply use
1468
1469                 $c = new Convert::Binary::C;
1470
1471               without additional arguments to create an object, or you can
1472               optionally pass any arguments to the constructor that are
1473               described for the "configure" method.
1474
1475       configure
1476
1477       "configure"
1478       "configure" OPTION
1479       "configure" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
               This method can be used to configure an existing
               Convert::Binary::C object or to retrieve its current
               configuration.
1483
               To configure the object, the list of options consists of key
               and value pairs and must therefore contain an even number of
               elements. "configure" (and also "new" if used with
               configuration options) will throw an exception if you pass an
               odd number of elements. Configuration will normally look like
               this:
1489
1490                 $c->configure(ByteOrder => 'BigEndian', IntSize => 2);
1491
1492               To retrieve the current value of a configuration option, you
1493               must pass a single argument to "configure" that holds the name
1494               of the option, just like
1495
1496                 $order = $c->configure('ByteOrder');
1497
1498               If you want to get the values of all configuration options at
1499               once, you can call "configure" without any arguments and it
1500               will return a reference to a hash table that holds the whole
1501               object configuration. This can be conveniently used with the
1502               Data::Dumper module, for example:
1503
1504                 use Convert::Binary::C;
1505                 use Data::Dumper;
1506
1507                 $c = new Convert::Binary::C Define  => ['DEBUGGING', 'FOO=123'],
1508                                             Include => ['/usr/include'];
1509
1510                 print Dumper($c->configure);
1511
1512               Which will print something like this:
1513
1514                 $VAR1 = {
1515                   'Define' => [
1516                     'DEBUGGING',
1517                     'FOO=123'
1518                   ],
1519                   'StdCVersion' => 199901,
1520                   'ByteOrder' => 'LittleEndian',
1521                   'LongSize' => 4,
1522                   'IntSize' => 4,
1523                   'HostedC' => 1,
1524                   'ShortSize' => 2,
1525                   'HasMacroVAARGS' => 1,
1526                   'Assert' => [],
1527                   'UnsignedChars' => 0,
1528                   'DoubleSize' => 8,
1529                   'CharSize' => 1,
1530                   'EnumType' => 'Integer',
1531                   'PointerSize' => 4,
1532                   'EnumSize' => 4,
1533                   'DisabledKeywords' => [],
1534                   'FloatSize' => 4,
1535                   'Alignment' => 1,
1536                   'LongLongSize' => 8,
1537                   'LongDoubleSize' => 12,
1538                   'KeywordMap' => {},
1539                   'Include' => [
1540                     '/usr/include'
1541                   ],
1542                   'HasCPPComments' => 1,
1543                   'Bitfields' => {
1544                     'Engine' => 'Generic'
1545                   },
1546                   'UnsignedBitfields' => 0,
1547                   'Warnings' => 0,
1548                   'CompoundAlignment' => 1,
1549                   'OrderMembers' => 0
1550                 };
1551
1552               Since you may not always want to write a "configure" call when
1553               you only want to change a single configuration item, you can
1554               use any configuration option name as a method name, like:
1555
1556                 $c->ByteOrder('LittleEndian') if $c->IntSize < 4;
1557
1558               (Yes, the example doesn't make very much sense... ;-)
1559
               However, you should keep in mind that configuration methods
               that can take lists (namely "Include", "Define" and "Assert",
               but not "DisabledKeywords") may behave slightly differently
               from their "configure" equivalent.  If you pass these methods
               a single argument that is an array reference, the current list
               will be replaced by the new one, which is just the behaviour
               of the corresponding "configure" call.  So the following are
               equivalent:
1568
1569                 $c->configure(Define => ['foo', 'bar=123']);
1570                 $c->Define(['foo', 'bar=123']);
1571
1572               But if you pass a list of strings instead of an array reference
1573               (which cannot be done when using "configure"), the new list
1574               items are appended to the current list, so
1575
1576                 $c = new Convert::Binary::C Include => ['/include'];
1577                 $c->Include('/usr/include', '/usr/local/include');
1578                 print Dumper($c->Include);
1579
1580                 $c->Include(['/usr/local/include']);
1581                 print Dumper($c->Include);
1582
1583               will first print all three include paths, but finally only
1584               "/usr/local/include" will be configured:
1585
1586                 $VAR1 = [
1587                   '/include',
1588                   '/usr/include',
1589                   '/usr/local/include'
1590                 ];
1591                 $VAR1 = [
1592                   '/usr/local/include'
1593                 ];
1594
1595               Furthermore, configuration methods can be chained together, as
1596               they return a reference to their object if called as a set
1597               method. So, if you like, you can configure your object like
1598               this:
1599
1600                 $c = Convert::Binary::C->new(IntSize => 4)
1601                        ->Define(qw( __DEBUG__ DB_LEVEL=3 ))
1602                        ->ByteOrder('BigEndian');
1603
1604                 $c->configure(EnumType => 'Both', Alignment => 4)
1605                   ->Include('/usr/include', '/usr/local/include');
1606
1607               In the example above, "qw( ... )" is the word list quoting
1608               operator. It returns a list of all non-whitespace sequences,
1609               and is especially useful for configuring preprocessor defines
1610               or assertions. The following assignments are equivalent:
1611
1612                 @array = ('one', 'two', 'three');
1613                 @array = qw(one two three);
1614
1615               You can configure the following options. Unknown options, as
1616               well as invalid values for an option, will cause the object to
1617               throw exceptions.
1618
               "IntSize" => 0 | 1 | 2 | 4 | 8
1620                   Set the number of bytes that are occupied by an integer.
1621                   This is in most cases 2 or 4. If you set it to zero, the
1622                   size of an integer on the host system will be used. This is
1623                   also the default unless overridden by
1624                   "CBC_DEFAULT_INT_SIZE" at compile time.
1625
               "CharSize" => 0 | 1 | 2 | 4 | 8
1627                   Set the number of bytes that are occupied by a "char".
1628                   This rarely needs to be changed, except for some platforms
1629                   that don't care about bytes, for example DSPs.  If you set
1630                   this to zero, the size of a "char" on the host system will
1631                   be used. This is also the default unless overridden by
1632                   "CBC_DEFAULT_CHAR_SIZE" at compile time.
1633
               "ShortSize" => 0 | 1 | 2 | 4 | 8
                   Set the number of bytes that are occupied by a short
                   integer.  Although integers explicitly declared as "short"
                   should always be 16 bit, there are compilers that make a
                   short 8 bit wide. If you set it to zero, the size of a
                   short integer on the host system will be used. This is also
                   the default unless overridden by "CBC_DEFAULT_SHORT_SIZE"
                   at compile time.
1642
               "LongSize" => 0 | 1 | 2 | 4 | 8
                   Set the number of bytes that are occupied by a long
                   integer.  If set to zero, the size of a long integer on the
                   host system will be used. This is also the default unless
                   overridden by "CBC_DEFAULT_LONG_SIZE" at compile time.
1648
               "LongLongSize" => 0 | 1 | 2 | 4 | 8
1650                   Set the number of bytes that are occupied by a long long
1651                   integer. If set to zero, the size of a long long integer on
1652                   the host system, or 8, will be used. This is also the
1653                   default unless overridden by "CBC_DEFAULT_LONG_LONG_SIZE"
1654                   at compile time.
1655
1656               "FloatSize" => 0 ⎪ 1 ⎪ 2 ⎪ 4 ⎪ 8 ⎪ 12 ⎪ 16
1657                   Set the number of bytes that are occupied by a single pre‐
1658                   cision floating point value.  If you set it to zero, the
1659                   size of a "float" on the host system will be used. This is
1660                   also the default unless overridden by
1661                   "CBC_DEFAULT_FLOAT_SIZE" at compile time.  For details on
1662                   floating point support, see "FLOATING POINT VALUES".
1663
1664               "DoubleSize" => 0 ⎪ 1 ⎪ 2 ⎪ 4 ⎪ 8 ⎪ 12 ⎪ 16
1665                   Set the number of bytes that are occupied by a double pre‐
1666                   cision floating point value.  If you set it to zero, the
1667                   size of a "double" on the host system will be used. This is
1668                   also the default unless overridden by "CBC_DEFAULT_DOU‐
1669                   BLE_SIZE" at compile time.  For details on floating point
1670                   support, see "FLOATING POINT VALUES".
1671
1672               "LongDoubleSize" => 0 ⎪ 1 ⎪ 2 ⎪ 4 ⎪ 8 ⎪ 12 ⎪ 16
1673                   Set the number of bytes that are occupied by a "long
1674                   double" floating point value.  If you set it to zero, the
1675                   size of a "long double" on the host system, or 12, will be
1676                   used. This is also the default unless overridden by
1677                   "CBC_DEFAULT_LONG_DOUBLE_SIZE" at compile time. For details
1678                   on floating point support, see "FLOATING POINT VALUES".
1679
1680               "PointerSize" => 0 ⎪ 1 ⎪ 2 ⎪ 4 ⎪ 8
1681                   Set the number of bytes that are occupied by a pointer.
1682                   This is in most cases 2 or 4. If you set it to zero, the
1683                   size of a pointer on the host system will be used. This is
1684                   also the default unless overridden by
1685                   "CBC_DEFAULT_PTR_SIZE" at compile time.
1686
1687               "EnumSize" => -1 ⎪ 0 ⎪ 1 ⎪ 2 ⎪ 4 ⎪ 8
1688                   Set the number of bytes that are occupied by an enumeration
1689                   type.  On most systems, this is equal to the size of an
1690                   integer, which is also the default. However, for some com‐
1691                   pilers, the size of an enumeration type depends on the size
1692                   occupied by the largest enumerator. So the size may vary
1693                   between 1 and 8. If you have
1694
1695                     enum foo {
1696                       ONE = 100, TWO = 200
1697                     };
1698
1699                   this will occupy one byte because the enum can be repre‐
1700                   sented as an unsigned one-byte value. However,
1701
1702                     enum foo {
1703                       ONE = -100, TWO = 200
1704                     };
1705
1706                   will occupy two bytes, because the -100 forces the type to
1707                   be signed, and 200 doesn't fit into a signed one-byte
1708                   value.  Therefore, the type used is a signed two-byte
1709                   value.  If this is the behaviour you need, set the EnumSize
1710                   to 0.
1711
1712                   Some compilers try to follow this strategy, but don't care
1713                   whether the enumeration has signed values or not. They
1714                   always declare an enum as signed. On such a compiler, given
1715
1716                     enum one { ONE = -100, TWO = 100 };
1717                     enum two { ONE =  100, TWO = 200 };
1718
1719                   enum "one" will occupy only one byte, while enum "two" will
1720                   occupy two bytes, even though it could be represented by an
1721                   unsigned one-byte value. If this is the behaviour of your
1722                   compiler, set EnumSize to "-1".
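
The two sizing strategies described above can be sketched in C. This is only an illustration of the rules as stated in the text, not code taken from Convert::Binary::C; the function names are hypothetical, and the enum_size parameter simply mirrors the values of the "EnumSize" option.

```c
#include <stdint.h>

/* Smallest of 1, 2, 4 or 8 bytes whose unsigned range holds max. */
static int unsigned_bytes(uint64_t max)
{
  if (max <= UINT8_MAX)  return 1;
  if (max <= UINT16_MAX) return 2;
  if (max <= UINT32_MAX) return 4;
  return 8;
}

/* Smallest of 1, 2, 4 or 8 bytes whose signed range holds min..max. */
static int signed_bytes(int64_t min, int64_t max)
{
  if (min >= INT8_MIN  && max <= INT8_MAX)  return 1;
  if (min >= INT16_MIN && max <= INT16_MAX) return 2;
  if (min >= INT32_MIN && max <= INT32_MAX) return 4;
  return 8;
}

/*
 * Size in bytes of an enum whose enumerators span min..max.
 * A positive enum_size is a fixed size, 0 picks the smallest type
 * (unsigned unless a value is negative), and -1 always picks the
 * smallest signed type.
 */
int enum_bytes(int enum_size, int64_t min, int64_t max)
{
  if (enum_size > 0)
    return enum_size;
  if (enum_size == 0 && min >= 0)
    return unsigned_bytes((uint64_t) max);
  return signed_bytes(min, max);
}
```

For the enumerations shown above, enum_bytes(0, 100, 200) yields 1 and enum_bytes(0, -100, 200) yields 2, while enum_bytes(-1, -100, 100) yields 1 and enum_bytes(-1, 100, 200) yields 2.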
1723
1724               "Alignment" => 0 ⎪ 1 ⎪ 2 ⎪ 4 ⎪ 8 ⎪ 16
1725                   Set the struct member alignment. This option controls where
1726                   padding bytes are inserted between struct members. It glob‐
1727                   ally sets the alignment for all structs/unions. However,
1728                   this can be overridden from within the source code with the
1729                   common "pack" pragma as explained in "Supported pragma
1730                   directives".  The default alignment is 1, which means no
1731                   padding bytes are inserted. A setting of 0 means native
1732                   alignment, i.e.  the alignment of the system that Con‐
1733                   vert::Binary::C has been compiled on. You can determine the
1734                   native properties using the "native" function.
1735
1736                   The "Alignment" option is similar to the "-Zp[n]" option of
1737                   the Intel compiler. It globally specifies the maximum
1738                   boundary to which struct members are aligned. Consider the
1739                   following structure and the sizes of "char", "short",
1740                   "long" and "double" being 1, 2, 4 and 8, respectively.
1741
1742                     struct align {
1743                       char   a;
1744                       short  b, c;
1745                       long   d;
1746                       double e;
1747                     };
1748
1749                   With an alignment of 1 (the default), the struct members
1750                   would be packed tightly:
1751
1752                     0   1   2   3   4   5   6   7   8   9  10  11  12
1753                     +---+---+---+---+---+---+---+---+---+---+---+---+
1754                     ⎪ a ⎪   b   ⎪   c   ⎪       d       ⎪             ...
1755                     +---+---+---+---+---+---+---+---+---+---+---+---+
1756
1757                        12  13  14  15  16  17
1758                         +---+---+---+---+---+
1759                     ...     e               ⎪
1760                         +---+---+---+---+---+
1761
1762                   With an alignment of 2, the struct members larger than one
1763                   byte would be aligned to 2-byte boundaries, which results
1764                   in a single padding byte between "a" and "b".
1765
1766                     0   1   2   3   4   5   6   7   8   9  10  11  12
1767                     +---+---+---+---+---+---+---+---+---+---+---+---+
1768                     ⎪ a ⎪ * ⎪   b   ⎪   c   ⎪       d       ⎪         ...
1769                     +---+---+---+---+---+---+---+---+---+---+---+---+
1770
1771                        12  13  14  15  16  17  18
1772                         +---+---+---+---+---+---+
1773                     ...         e               ⎪
1774                         +---+---+---+---+---+---+
1775
1776                   With an alignment of 4, the struct members of size 2 would
1777                   be aligned to 2-byte boundaries and larger struct members
1778                   would be aligned to 4-byte boundaries:
1779
1780                     0   1   2   3   4   5   6   7   8   9  10  11  12
1781                     +---+---+---+---+---+---+---+---+---+---+---+---+
1782                     ⎪ a ⎪ * ⎪   b   ⎪   c   ⎪ * ⎪ * ⎪       d       ⎪ ...
1783                     +---+---+---+---+---+---+---+---+---+---+---+---+
1784
1785                        12  13  14  15  16  17  18  19  20
1786                         +---+---+---+---+---+---+---+---+
1787                     ... ⎪               e               ⎪
1788                         +---+---+---+---+---+---+---+---+
1789
1790                   This layout of the struct members allows the compiler to
1791                   generate optimized code because aligned members can be
1792                   accessed more easily by the underlying architecture.
1793
1794                   Finally, setting the alignment to 8 will align "double"s to
1795                   8-byte boundaries:
1796
1797                     0   1   2   3   4   5   6   7   8   9  10  11  12
1798                     +---+---+---+---+---+---+---+---+---+---+---+---+
1799                     ⎪ a ⎪ * ⎪   b   ⎪   c   ⎪ * ⎪ * ⎪       d       ⎪ ...
1800                     +---+---+---+---+---+---+---+---+---+---+---+---+
1801
1802                        12  13  14  15  16  17  18  19  20  21  22  23  24
1803                         +---+---+---+---+---+---+---+---+---+---+---+---+
1804                     ... ⎪ * ⎪ * ⎪ * ⎪ * ⎪               e               ⎪
1805                         +---+---+---+---+---+---+---+---+---+---+---+---+
1806
1807                   Further increasing the alignment does not alter the layout
1808                   of our structure, as only members larger than 8 bytes would
1809                   be affected.
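
The diagrams above appear to follow one rule: each member is aligned to the smaller of its own size and the "Alignment" setting. The following C sketch reproduces the member offsets under that assumption. The function names are hypothetical and the code is not part of Convert::Binary::C.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (align >= 1). */
static size_t align_up(size_t offset, size_t align)
{
  return (offset + align - 1) / align * align;
}

/*
 * Compute member offsets for a struct whose scalar members have the
 * given sizes. Assumes each member is aligned to the smaller of its own
 * size and the alignment setting. Returns the offset just past the last
 * member; trailing padding is not considered here.
 */
size_t layout_members(const size_t *sizes, size_t count,
                      size_t alignment, size_t *offsets)
{
  size_t offset = 0;
  for (size_t i = 0; i < count; i++) {
    size_t member_align = sizes[i] < alignment ? sizes[i] : alignment;
    offset = align_up(offset, member_align);
    offsets[i] = offset;
    offset += sizes[i];
  }
  return offset;
}
```

Called with the member sizes of "struct align" (1, 2, 2, 4 and 8), this returns 17, 18, 20 and 24 for alignments of 1, 2, 4 and 8, which matches the end offsets in the diagrams.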
1810
1811                   The alignment of a structure depends on its largest member
1812                   and on the setting of the "Alignment" option. With "Align‐
1813                   ment" set to 2, a structure holding a "long" would be
1814                   aligned to a 2-byte boundary, while a structure containing
1815                   only "char"s would have no alignment restrictions. (Unfor‐
1816                   tunately, that's not the whole story. See the "Com‐
1817                   poundAlignment" option for details.)
1818
1819                   Here's another example. Assuming 8-byte alignment, the fol‐
1820                   lowing two structs will both have a size of 16 bytes:
1821
1822                     struct one {
1823                       char   c;
1824                       double d;
1825                     };
1826
1827                     struct two {
1828                       double d;
1829                       char   c;
1830                     };
1831
1832                   This is clear for "struct one", because the member "d" has
1833                   to be aligned to an 8-byte boundary, and thus 7 padding
1834                   bytes are inserted after "c". But for "struct two", the
1835                   padding bytes are inserted at the end of the structure,
1836                   which may not seem to make sense at first. However, it
1837                   makes perfect sense if you think about an array of "struct
1838                   two". Each "double" has to be aligned to an 8-byte
1839                   boundary, and thus each array element would have to occupy
1840                   16 bytes. With that in mind, it would be strange if a
1841                   "struct two" variable had a different size. And it would
1842                   make the widely used construct
1843
1844                     struct two array[] = { {1.0, 0}, {2.0, 1} };
1845                     int elements = sizeof(array) / sizeof(struct two);
1846
1847                   impossible.
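
The trailing padding can be expressed as rounding the struct size up to the alignment of the struct itself. The C sketch below illustrates this; the function names are hypothetical, and the struct alignment of 8 used in the note that follows is the alignment of the "double" member under the 8-byte setting assumed above.

```c
#include <stddef.h>

/*
 * Size of a struct including trailing padding.
 * end_offset is the offset just past the last member, struct_align is
 * the alignment of the struct itself.
 */
size_t padded_size(size_t end_offset, size_t struct_align)
{
  size_t rest = end_offset % struct_align;
  return rest ? end_offset + (struct_align - rest) : end_offset;
}

/* Offset of the element with the given index in an array of such structs. */
size_t element_offset(size_t index, size_t end_offset, size_t struct_align)
{
  return index * padded_size(end_offset, struct_align);
}
```

For "struct two", the last member ends at offset 9, so padded_size(9, 8) gives 16, and element_offset(1, 9, 8) places the second array element, and with it its "double", on an 8-byte boundary.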
1848
1849                   The alignment behaviour described here seems to be common
1850                   for all compilers. However, not all compilers have an
1851                   option to configure their default alignment.
1852
1853               "CompoundAlignment" => 0 ⎪ 1 ⎪ 2 ⎪ 4 ⎪ 8 ⎪ 16
1854                   Usually, the alignment of a compound (i.e. a "struct" or a
1855                   "union") depends only on its largest member and on the set‐
1856                   ting of the "Alignment" option. There are, however, archi‐
1857                   tectures and compilers where compounds can have different
1858                   alignment constraints.
1859
1860                   For most platforms and compilers, the alignment constraint
1861                   for compounds is 1 byte. That is, on most platforms
1862
1863                     struct onebyte {
1864                       char byte;
1865                     };
1866
1867                   will have an alignment of 1 and also a size of 1. But if
1868                   you take an ARM architecture, the above "struct onebyte"
1869                   will have an alignment of 4, and thus also a size of 4.
1870
1871                   You can configure this by setting "CompoundAlignment" to 4.
1872                   This will ensure that the alignment of compounds is always
1873                   4.
1874
1875                   Setting "CompoundAlignment" to 0 means native compound
1876                   alignment, i.e. the compound alignment of the system that
1877                   Convert::Binary::C has been compiled on. You can determine
1878                   the native properties using the "native" function.
1879
1880                   There are also compilers for certain platforms that allow
1881                   you to adjust the compound alignment. If you're not aware
1882                   of the fact that your compiler/architecture has a compound
1883                   alignment other than 1, strange things can happen. If, for
1884                   example, the compound alignment is 2 and you have something
1885                   like
1886
1887                     typedef unsigned char U8;
1888
1889                     struct msg_head {
1890                       U8 cmd;
1891                       struct {
1892                         U8 hi;
1893                         U8 low;
1894                       } crc16;
1895                       U8 len;
1896                     };
1897
1898                   there will be one padding byte inserted before the embedded
1899                   "crc16" struct and after the "len" member, which is most
1900                   probably not what was intended:
1901
1902                     0     1     2     3     4     5     6
1903                     +-----+-----+-----+-----+-----+-----+
1904                     ⎪ cmd ⎪  *  ⎪ hi  ⎪ low ⎪ len ⎪  *  ⎪
1905                     +-----+-----+-----+-----+-----+-----+
1906
1907                   Note that both "#pragma pack" and the "Alignment" option
1908                   can override "CompoundAlignment". If you set "Com‐
1909                   poundAlignment" to 4, but "Alignment" to 2, compounds will
1910                   actually be aligned on 2-byte boundaries.
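
As a rough model of the "msg_head" example, the following C sketch computes the layout for a given pair of settings. It assumes, based on the note above, that the smaller of "CompoundAlignment" and "Alignment" applies to compounds that contain only "char" members. All names are hypothetical and the code is not part of Convert::Binary::C.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (align >= 1). */
static size_t align_up(size_t offset, size_t align)
{
  return (offset + align - 1) / align * align;
}

/*
 * Effective alignment of a compound that contains only "char" members.
 * Assumption: the "Alignment" setting caps "CompoundAlignment".
 */
size_t effective_compound_alignment(size_t compound_alignment,
                                    size_t alignment)
{
  return compound_alignment < alignment ? compound_alignment : alignment;
}

/* Offsets of the members of "struct msg_head" and its total size. */
struct msg_head_layout {
  size_t cmd, crc16, len, size;
};

struct msg_head_layout layout_msg_head(size_t compound_alignment,
                                       size_t alignment)
{
  size_t ca = effective_compound_alignment(compound_alignment, alignment);
  struct msg_head_layout l;
  size_t offset = 0;

  l.cmd = offset;                 /* U8, no alignment constraint */
  offset += 1;
  offset = align_up(offset, ca);  /* embedded struct starts on a compound boundary */
  l.crc16 = offset;
  offset += align_up(2, ca);      /* hi + low, padded to the compound alignment */
  l.len = offset;
  offset += 1;
  l.size = align_up(offset, ca);  /* trailing padding of the outer struct */
  return l;
}
```

With a compound alignment of 2 and an alignment of at least 2, this places "crc16" at offset 2 and "len" at offset 4, and gives a total size of 6, as in the diagram.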
1911
1912               "ByteOrder" => 'BigEndian' ⎪ 'LittleEndian'
1913                   Set the byte order for integers larger than a single byte.
1914                   Little endian (Intel, least significant byte first) and big
1915                   endian (Motorola, most significant byte first) byte order
1916                   are supported. The default byte order is the same as the
1917                   byte order of the host system unless overridden by
1918                   "CBC_DEFAULT_BYTEORDER" at compile time.
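
To make the two byte orders concrete, here is a small C sketch that stores an integer into a byte buffer in either order. The enum and function names are hypothetical and only serve the illustration.

```c
#include <stddef.h>
#include <stdint.h>

enum byte_order { BIG_ENDIAN_ORDER, LITTLE_ENDIAN_ORDER };

/*
 * Store the low `size` bytes (1 to 8) of value into out[0..size-1]
 * in the requested byte order.
 */
void store_integer(uint64_t value, size_t size, enum byte_order order,
                   unsigned char *out)
{
  for (size_t i = 0; i < size; i++) {
    /* i counts upwards from the least significant byte */
    size_t pos = (order == LITTLE_ENDIAN_ORDER) ? i : size - 1 - i;
    out[pos] = (unsigned char) (value & 0xFF);
    value >>= 8;
  }
}
```

Storing the 4-byte value 2002 (0x000007D2) yields the bytes 00 00 07 D2 in big endian order and D2 07 00 00 in little endian order.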
1919
1920               "EnumType" => 'Integer' ⎪ 'String' ⎪ 'Both'
1921                   This option controls the type that enumeration constants
1922                   will have in data structures returned by the "unpack"
1923                   method.  If you have the following definitions:
1924
1925                     typedef enum {
1926                       SUNDAY, MONDAY, TUESDAY, WEDNESDAY,
1927                       THURSDAY, FRIDAY, SATURDAY
1928                     } Weekday;
1929
1930                     typedef enum {
1931                       JANUARY, FEBRUARY, MARCH, APRIL, MAY, JUNE, JULY,
1932                       AUGUST, SEPTEMBER, OCTOBER, NOVEMBER, DECEMBER
1933                     } Month;
1934
1935                     typedef struct {
1936                       int     year;
1937                       Month   month;
1938                       int     day;
1939                       Weekday weekday;
1940                     } Date;
1941
1942                   and a byte string that holds a packed Date struct, then
1943                   you'll get the following results from a call to the
1944                   "unpack" method.
1945
1946                   "Integer"
1947                       Enumeration constants are returned as plain integers.
1948                       This is fast, but may not be very useful. It is also
1949                       the default.
1950
1951                         $date = {
1952                           'weekday' => 1,
1953                           'month' => 0,
1954                           'day' => 7,
1955                           'year' => 2002
1956                         };
1957
1958                   "String"
1959                       Enumeration constants are returned as strings. This
1960                       will create a string constant for every unpacked enu‐
1961                       meration constant and thus consumes more time and mem‐
1962                       ory. However, the result may be more useful.
1963
1964                         $date = {
1965                           'weekday' => 'MONDAY',
1966                           'month' => 'JANUARY',
1967                           'day' => 7,
1968                           'year' => 2002
1969                         };
1970
1971                   "Both"
1972                       Enumeration constants are returned as dual-typed
1973                       scalars.  If evaluated in string context, the
1974                       enumeration constant will be a string; if evaluated in
1975                       numeric context, it will be an integer.
1976
1977                         $date = $c->EnumType('Both')->unpack('Date', $binary);
1978
1979                         printf "Weekday = %s (%d)\n\n", $date->{weekday},
1980                                                         $date->{weekday};
1981
1982                         if ($date->{month} == 0) {
1983                           print "It's $date->{month}, happy new year!\n\n";
1984                         }
1985
1986                         print Dumper($date);
1987
1988                       This will print:
1989
1990                         Weekday = MONDAY (1)
1991
1992                         It's JANUARY, happy new year!
1993
1994                         $VAR1 = {
1995                           'weekday' => 'MONDAY',
1996                           'month' => 'JANUARY',
1997                           'day' => 7,
1998                           'year' => 2002
1999                         };
2000
2001               "DisabledKeywords" => [ KEYWORDS ]
2002                   This option allows you to selectively deactivate certain
2003                   keywords in the C parser. Some C compilers don't have the
2004                   complete ANSI keyword set, i.e. they don't recognize the
2005                   keywords "const" or "void", for example. If you do
2006
2007                     typedef int void;
2008
2009                   on such a compiler, this will usually be ok. But if you
2010                   parse this with an ANSI compiler, it will be a syntax
2011                   error. To parse the above code correctly, you have to dis‐
2012                   able the "void" keyword in the Convert::Binary::C parser:
2013
2014                     $c->DisabledKeywords([qw( void )]);
2015
2016                   By default, the Convert::Binary::C parser will recognize
2017                   the keywords "inline" and "restrict". If your compiler
2018                   doesn't have these new keywords, it usually doesn't matter.
2019                   Only if you're using the keywords as identifiers, like in
2020
2021                     typedef struct inline {
2022                       int a, b;
2023                     } restrict;
2024
2025                   you'll have to disable these ISO-C99 keywords:
2026
2027                     $c->DisabledKeywords([qw( inline restrict )]);
2028
2029                   The parser allows you to disable the following keywords:
2030
2031                     asm
2032                     auto
2033                     const
2034                     double
2035                     enum
2036                     extern
2037                     float
2038                     inline
2039                     long
2040                     register
2041                     restrict
2042                     short
2043                     signed
2044                     static
2045                     unsigned
2046                     void
2047                     volatile
2048
2049               "KeywordMap" => { KEYWORD => TOKEN, ... }
2050                   This option allows you to add new keywords to the parser.
2051                   These new keywords can either be mapped to existing tokens
2052                   or simply ignored. For example, recent versions of the GNU
2053                   compiler recognize the keywords "__signed__" and "__exten‐
2054                   sion__".  The first one obviously is a synonym for
2055                   "signed", while the second one is only a marker for a lan‐
2056                   guage extension.
2057
2058                   Using the preprocessor, you could of course do the follow‐
2059                   ing:
2060
2061                     $c->Define(qw( __signed__=signed __extension__= ));
2062
2063                   However, the preprocessor symbols could be undefined or
2064                   redefined in the code, and
2065
2066                     #ifdef __signed__
2067                     # undef __signed__
2068                     #endif
2069
2070                     typedef __extension__ __signed__ long long s_quad;
2071
2072                   would generate a parse error, because "__signed__" is an
2073                   unexpected identifier.
2074
2075                   Instead of utilizing the preprocessor, you'll have to cre‐
2076                   ate mappings for the new keywords directly in the parser
2077                   using "KeywordMap". In the above example, you want to map
2078                   "__signed__" to the built-in C keyword "signed" and ignore
2079                   "__extension__". This could be done with the following
2080                   code:
2081
2082                     $c->KeywordMap({ __signed__    => 'signed',
2083                                      __extension__ => undef });
2084
2085                   You can specify any valid identifier as hash key, and
2086                   either a valid C keyword or "undef" as hash value.  Having
2087                   configured the object that way, you could parse even
2088
2089                     #ifdef __signed__
2090                     # undef __signed__
2091                     #endif
2092
2093                     typedef __extension__ __signed__ long long s_quad;
2094
2095                   without problems.
2096
2097                   Note that "KeywordMap" and "DisabledKeywords" work together
2098                   perfectly. You could, for example, disable the "signed"
2099                   keyword, but still have "__signed__" mapped to the original
2100                   "signed" token:
2101
2102                     $c->configure(DisabledKeywords => [ 'signed' ],
2103                                   KeywordMap       => { __signed__  => 'signed' });
2104
2105                   This would allow you to define
2106
2107                     typedef __signed__ long signed;
2108
2109                   which would normally be a syntax error because "signed"
2110                   cannot be used as an identifier.
2111
2112               "UnsignedChars" => 0 ⎪ 1
2113                   Use this boolean option if you want characters to be
2114                   unsigned if specified without an explicit "signed" or
2115                   "unsigned" type specifier.  By default, characters are
2116                   signed.
2117
2118               "UnsignedBitfields" => 0 ⎪ 1
2119                   Use this boolean option if you want bitfields to be
2120                   unsigned if specified without an explicit "signed" or
2121                   "unsigned" type specifier.  By default, bitfields are
2122                   signed.
2123
2124               "Warnings" => 0 ⎪ 1
2125                   Use this boolean option if you want warnings to be issued
2126                   during the parsing of source code. Currently, warnings are
2127                   only reported by the preprocessor, so don't expect the out‐
2128                   put to cover everything.
2129
2130                   By default, warnings are turned off and only errors will be
2131                   reported. However, even these errors are turned off if you
2132                   run without the "-w" flag.
2133
2134               "HasCPPComments" => 0 ⎪ 1
2135                   Use this option to turn C++ comments on or off. By default,
2136                   C++ comments are enabled. Disabling C++ comments may be
2137                   necessary if your code includes strange things like:
2138
2139                     one = 4 //* <- divide */ 4;
2140                     two = 2;
2141
2142                   With C++ comments, the above will be interpreted as
2143
2144                     one = 4
2145                     two = 2;
2146
2147                   which will obviously be a syntax error, but without C++
2148                   comments, it will be interpreted as
2149
2150                     one = 4 / 4;
2151                     two = 2;
2152
2153                   which is correct.
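
The difference between the two interpretations can be reproduced with a simplified comment stripper. The C sketch below is not the preprocessor used by Convert::Binary::C; it ignores string and character literals, replaces each comment with a single space, and uses a hypothetical function name.

```c
/*
 * Copy src to dst, replacing each comment with a single space.
 * C comments are always recognized; C++ comments only if cpp_comments
 * is non-zero. String and character literals are not handled.
 * dst must have room for strlen(src) + 1 bytes.
 */
void strip_comments(const char *src, char *dst, int cpp_comments)
{
  while (*src) {
    if (src[0] == '/' && src[1] == '*') {
      src += 2;
      while (*src && !(src[0] == '*' && src[1] == '/'))
        src++;
      if (*src)
        src += 2;          /* skip the closing delimiter */
      *dst++ = ' ';
    } else if (cpp_comments && src[0] == '/' && src[1] == '/') {
      while (*src && *src != '\n')
        src++;             /* the newline itself is kept */
      *dst++ = ' ';
    } else {
      *dst++ = *src++;
    }
  }
  *dst = '\0';
}
```

With cpp_comments set to 1, the first line of the example above is cut off after "one = 4"; with cpp_comments set to 0, the division and the second operand survive.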
2154
2155               "HasMacroVAARGS" => 0 ⎪ 1
2156                   Use this option to turn the "__VA_ARGS__" macro expansion
2157                   on or off. If this is enabled (which is the default), you
2158                   can use variable length argument lists in your preprocessor
2159                   macros.
2160
2161                     #define DEBUG( ... )  fprintf( stderr, __VA_ARGS__ )
2162
2163                   There's normally no reason to turn that feature off.
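
For readers unfamiliar with the feature, the following C sketch shows a variadic macro in use. The macro and function names are hypothetical; the example writes to a buffer with snprintf instead of stderr so that the result is easy to inspect.

```c
#include <stdio.h>

/* Everything passed in place of "..." is substituted for __VA_ARGS__. */
#define FORMAT_N(buf, size, ...)  snprintf((buf), (size), __VA_ARGS__)

/*
 * Format a date as YYYY-MM-DD into out, which holds out_size bytes.
 * Returns the value returned by snprintf.
 */
int format_date(char *out, size_t out_size, int year, int month, int day)
{
  return FORMAT_N(out, out_size, "%04d-%02d-%02d", year, month, day);
}
```

Here the format string and the three integers together take the place of "__VA_ARGS__" in the macro expansion.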
2164
2165               "StdCVersion" => undef ⎪ INTEGER
2166                   Use this option to change the value of the preprocessor's
2167                   predefined "__STDC_VERSION__" macro. When set to "undef",
2168                   the macro will not be defined.
2169
2170               "HostedC" => undef ⎪ 0 ⎪ 1
2171                   Use this option to change the value of the preprocessor's
2172                   predefined "__STDC_HOSTED__" macro. When set to "undef",
2173                   the macro will not be defined.
2174
2175               "Include" => [ INCLUDES ]
2176                   Use this option to set the include path for the internal
2177                   preprocessor. The option value is a reference to an array
2178                   of strings, each string holding a directory that should be
2179                   searched for includes.
2180
2181               "Define" => [ DEFINES ]
2182                   Use this option to define symbols in the preprocessor.  The
2183                   option value is, again, a reference to an array of strings.
2184                   Each string can be either just a symbol or an assignment to
2185                   a symbol. This is completely equivalent to what the "-D"
2186                   option does for most preprocessors.
2187
2188                   The following will define the symbol "FOO" and define "BAR"
2189                   to be 12345:
2190
2191                     $c->configure(Define => [qw( FOO BAR=12345 )]);
2192
2193               "Assert" => [ ASSERTIONS ]
2194                   Use this option to make assertions in the preprocessor.  If
2195                   you don't know what assertions are, don't be concerned,
2196                   since they're deprecated anyway. They are, however, used in
2197                   some systems' include files.  The value is an array
2198                   reference, just like for the macro definitions. Only the
2199                   way the assertions are defined is a bit different and mimics
2200                   the way they are defined with the "#assert" directive:
2201
2202                     $c->configure(Assert => ['foo(bar)']);
2203
2204               "OrderMembers" => 0 ⎪ 1
2205                   When using "unpack" on compounds and iterating over the
2206                   returned hash, the order of the compound members is gener‐
2207                   ally not preserved due to the nature of hash tables. It is
2208                   not even guaranteed that the order is the same between
2209                   different runs of the same program. This can be very annoying
2210                   if you simply want to dump your data structures and the
2211                   compound members always show up in a different order.
2212
2213                   By setting "OrderMembers" to a non-zero value, all hashes
2214                   returned by "unpack" are tied to a class that preserves the
2215                   order of the hash keys.  This way, all compound members
2216                   will be returned in the correct order just as they are
2217                   defined in your C code.
2218
2219                     use Convert::Binary::C;
2220                     use Data::Dumper;
2221
2222                     $c = Convert::Binary::C->new->parse(<<'ENDC');
2223                     struct test {
2224                       char one;
2225                       char two;
2226                       struct {
2227                         char never;
2228                         char change;
2229                         char this;
2230                         char order;
2231                       } three;
2232                       char four;
2233                     };
2234                     ENDC
2235
2236                     $data = "Convert";
2237
2238                     $u1 = $c->unpack('test', $data);
2239                     $c->OrderMembers(1);
2240                     $u2 = $c->unpack('test', $data);
2241
2242                     print Data::Dumper->Dump([$u1, $u2], [qw(u1 u2)]);
2243
2244                   This will print something like:
2245
2246                     $u1 = {
2247                       'three' => {
2248                         'change' => 118,
2249                         'order' => 114,
2250                         'this' => 101,
2251                         'never' => 110
2252                       },
2253                       'one' => 67,
2254                       'two' => 111,
2255                       'four' => 116
2256                     };
2257                     $u2 = {
2258                       'one' => 67,
2259                       'two' => 111,
2260                       'three' => {
2261                         'never' => 110,
2262                         'change' => 118,
2263                         'this' => 101,
2264                         'order' => 114
2265                       },
2266                       'four' => 116
2267                     };
2268
2269                   To be able to use this option, you have to install either
2270                   the Tie::Hash::Indexed or the Tie::IxHash module. If both
2271                   are installed, Convert::Binary::C will give preference to
2272                   Tie::Hash::Indexed because it's faster.
2273
2274                   When using this option, you should keep in mind that tied
2275                   hashes are significantly slower and consume more memory
2276                   than ordinary hashes, even when the class they're tied to
2277                   is implemented efficiently. So don't turn this option on if
2278                   you don't have to.
2279
2280                   You can also influence hash member ordering by using the
2281                   "CBC_ORDER_MEMBERS" environment variable.
2282
2283               "Bitfields" => { OPTION => VALUE, ... }
2284                   Use this option to specify and configure a bitfield layout‐
2285                   ing engine. You can choose an engine by passing its name to
2286                   the "Engine" option, like:
2287
2288                     $c->configure(Bitfields => { Engine => 'Generic' });
2289
2290                   Each engine can have its own set of options, although cur‐
2291                   rently none of them does.
2292
2293                   You can choose between the following bitfield engines:
2294
2295                   "Generic"
2296                       This engine implements the behaviour of most UNIX C
2297                       compilers, including GCC. It does not handle packed
2298                       bitfields yet.
2299
2300                   "Microsoft"
2301                       This engine implements the behaviour of Microsoft's
2302                       "cl" compiler.  It should be fairly complete and can
2303                       handle packed bitfields.
2304
2305                   "Simple"
2306                       This engine is only used for testing the bitfield in‐
2307                       frastructure in Convert::Binary::C. There's usually no
2308                       reason to use it.
2309
2310               You can reconfigure all options even after you have parsed some
2311               code. The changes will be applied to the already parsed defini‐
2312               tions. This works as long as array lengths are not affected by
2313               the changes. If you have Alignment and IntSize set to 4 and
2314               parse code like this
2315
2316                 typedef struct {
2317                   char abc;
2318                   int  day;
2319                 } foo;
2320
2321                 struct bar {
2322                   foo  zap[2*sizeof(foo)];
2323                 };
2324
               the array "zap" in "struct bar" will have 16 elements, because
               "sizeof(foo)" is 8 at the time the code is parsed. If you now
               reconfigure the alignment to 1, the size of "foo" becomes 5
               instead of 8. While the alignment is adjusted correctly, the
               number of elements in array "zap" will still be 16 and will not
               be changed to 10.
2330
2331       parse
2332
2333       "parse" CODE
2334               Parses a string of valid C code. All enumeration, compound and
2335               type definitions are extracted. You can call the "parse" and
2336               "parse_file" methods as often as you like to add further defi‐
2337               nitions to the Convert::Binary::C object.
2338
2339               "parse" will throw an exception if an error occurs.  On suc‐
2340               cess, the method returns a reference to its object.
2341
2342               See "Parsing C code" for an example.
2343
2344       parse_file
2345
2346       "parse_file" FILE
2347               Parses a C source file. All enumeration, compound and type def‐
2348               initions are extracted. You can call the "parse" and
2349               "parse_file" methods as often as you like to add further defi‐
2350               nitions to the Convert::Binary::C object.
2351
2352               "parse_file" will search the include path given via the
2353               "Include" option for the file if it cannot find it in the cur‐
2354               rent directory.
2355
2356               "parse_file" will throw an exception if an error occurs. On
2357               success, the method returns a reference to its object.
2358
2359               See "Parsing C code" for an example.
2360
2361               When calling "parse" or "parse_file" multiple times, you may
2362               use types previously defined, but you are not allowed to rede‐
2363               fine types. The state of the preprocessor is also saved, so you
2364               may also use defines from a previous parse. This works only as
2365               long as the preprocessor is not reset. See "Preprocessor con‐
2366               figuration" for details.
2367
2368               When you're parsing C source files instead of C header files,
2369               note that local definitions are ignored. This means that type
2370               definitions hidden within functions will not be recognized by
2371               Convert::Binary::C. This is necessary because different func‐
2372               tions (even different blocks within the same function) can
2373               define types with the same name:
2374
2375                 void my_func(int i)
2376                 {
2377                   if (i < 10)
2378                   {
2379                     enum digit { ONE, TWO, THREE } x = ONE;
2380                     printf("%d, %d\n", i, x);
2381                   }
2382                   else
2383                   {
2384                     enum digit { THREE, TWO, ONE } x = ONE;
2385                     printf("%d, %d\n", i, x);
2386                   }
2387                 }
2388
2389               The above is a valid piece of C code, but it's not possible for
2390               Convert::Binary::C to distinguish between the different defini‐
2391               tions of "enum digit", as they're only defined locally within
2392               the corresponding block.
2393
2394       clean
2395
2396       "clean" Clears all information that has been collected during previous
2397               calls to "parse" or "parse_file".  You can use this method if
2398               you want to parse some entirely different code, but with the
2399               same configuration.
2400
2401               The "clean" method returns a reference to its object.
2402
2403       clone
2404
       "clone" Returns an exact, independent copy of the object.
2406
2407                 $c = new Convert::Binary::C Include => ['/usr/include'];
2408                 $c->parse_file('definitions.c');
2409                 $clone = $c->clone;
2410
               The above code is mostly equivalent to the code below. The only
               difference is that using "sourcify" and "parse" might alter the
               order of the parsed data, which would make methods such as
               "compound" return the definitions in a different order:
2415
2416                 $c = new Convert::Binary::C Include => ['/usr/include'];
2417                 $c->parse_file('definitions.c');
2418                 $clone = new Convert::Binary::C %{$c->configure};
2419                 $clone->parse($c->sourcify);
2420
2421               Using "clone" is just a lot faster.
2422
2423       def
2424
2425       "def" NAME
2426       "def" TYPE
2427               If you need to know if a definition for a certain type name
2428               exists, use this method. You pass it the name of an enum,
2429               struct, union or typedef, and it will return a non-empty string
2430               being either "enum", "struct", "union", or "typedef" if there's
2431               a definition for the type in question, an empty string if
2432               there's no such definition, or "undef" if the name is com‐
2433               pletely unknown. If the type can be interpreted as a basic
2434               type, "basic" will be returned.
2435
2436               If you pass in a TYPE, the output will be slightly different.
2437               If the specified member exists, the "def" method will return
2438               "member". If the member doesn't exist, or if the type cannot
2439               have members, the empty string will be returned. Again, if the
2440               name of the type is completely unknown, "undef" will be
2441               returned. This may be useful if you want to check if a certain
2442               member exists within a compound, for example.
2443
2444                 use Convert::Binary::C;
2445
2446                 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2447
2448                 typedef struct __not  not;
2449                 typedef struct __not *ptr;
2450
2451                 struct foo {
2452                   enum bar *xxx;
2453                 };
2454
2455                 typedef int quad[4];
2456
2457                 ENDC
2458
2459                 for my $type (qw( not ptr foo bar xxx foo.xxx foo.abc xxx.yyy
2460                                   quad quad[3] quad[5] quad[-3] short[1] ),
2461                               'unsigned long')
2462                 {
2463                   my $def = $c->def($type);
2464                   printf "%-14s  =>  %s\n",
2465                           $type,     defined $def ? "'$def'" : 'undef';
2466                 }
2467
2468               The following would be returned by the "def" method:
2469
2470                 not             =>  ''
2471                 ptr             =>  'typedef'
2472                 foo             =>  'struct'
2473                 bar             =>  ''
2474                 xxx             =>  undef
2475                 foo.xxx         =>  'member'
2476                 foo.abc         =>  ''
2477                 xxx.yyy         =>  undef
2478                 quad            =>  'typedef'
2479                 quad[3]         =>  'member'
2480                 quad[5]         =>  'member'
2481                 quad[-3]        =>  'member'
2482                 short[1]        =>  undef
2483                 unsigned long   =>  'basic'
2484
2485               So, if "def" returns a non-empty string, you can safely use any
2486               other method with that type's name or with that member expres‐
2487               sion.
2488
2489               Concerning arrays, note that the index into an array doesn't
2490               need to be within the bounds of the array's definition, just
2491               like in C. In the above example, "quad[5]" and "quad[-3]" are
2492               valid members of the "quad" array, even though it is declared
2493               to have only four elements.
2494
2495               In cases where the typedef namespace overlaps with the names‐
2496               pace of enums/structs/unions, the "def" method will give pref‐
2497               erence to the typedef and will thus return the string "type‐
2498               def". You could however force interpretation as an enum, struct
2499               or union by putting "enum", "struct" or "union" in front of the
2500               type's name.
2501
2502       defined
2503
2504       "defined" MACRO
2505               You can use the "defined" method to find out if a certain macro
2506               is defined, just like you would use the "defined" operator of
2507               the preprocessor. For example, the following code
2508
2509                 use Convert::Binary::C;
2510
2511                 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2512
2513                 #define ADD(a, b) ((a) + (b))
2514
2515                 #if 1
2516                 # define DEFINED
2517                 #else
2518                 # define UNDEFINED
2519                 #endif
2520
2521                 ENDC
2522
2523                 for my $macro (qw( ADD DEFINED UNDEFINED )) {
2524                   my $not = $c->defined($macro) ? '' : ' not';
2525                   print "Macro '$macro' is$not defined.\n";
2526                 }
2527
2528               would print:
2529
2530                 Macro 'ADD' is defined.
2531                 Macro 'DEFINED' is defined.
2532                 Macro 'UNDEFINED' is not defined.
2533
2534               You have to keep in mind that this works only as long as the
2535               preprocessor is not reset. See "Preprocessor configuration" for
2536               details.
2537
2538       pack
2539
2540       "pack" TYPE
2541       "pack" TYPE, DATA
2542       "pack" TYPE, DATA, STRING
2543               Use this method to pack a complex data structure into a binary
2544               string according to a type definition that has been previously
2545               parsed. DATA must be a scalar matching the type definition. C
2546               structures and unions are represented by references to Perl
2547               hashes, C arrays by references to Perl arrays.
2548
2549                 use Convert::Binary::C;
2550                 use Data::Dumper;
2551                 use Data::Hexdumper;
2552
2553                 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2554                                             , LongSize  => 4
2555                                             , ShortSize => 2
2556                                             )
2557                                        ->parse(<<'ENDC');
2558                 struct test {
2559                   char    ary[3];
2560                   union {
2561                     short word[2];
2562                     long  quad;
2563                   }       uni;
2564                 };
2565                 ENDC
2566
2567               Hashes don't have to contain a key for each compound member and
2568               arrays may be truncated:
2569
2570                 $binary = $c->pack('test', { ary => [1, 2], uni => { quad => 42 } });
2571
               Elements not defined in the Perl data structure will be set to
               zero in the packed byte string. If you pass "undef" as the
               second parameter, or simply omit it, the whole string will be
               initialized with zero bytes. On success, the packed byte string
               is returned.
2577
2578                 print hexdump(data => $binary);
2579
2580               The above code would print:
2581
2582                   0x0000 : 01 02 00 00 00 00 2A                            : ......*
2583
2584               You could also use "unpack" and dump the data structure.
2585
2586                 $unpacked = $c->unpack('test', $binary);
2587                 print Data::Dumper->Dump([$unpacked], ['unpacked']);
2588
2589               This would print:
2590
2591                 $unpacked = {
2592                   'uni' => {
2593                     'word' => [
2594                       0,
2595                       42
2596                     ],
2597                     'quad' => 42
2598                   },
2599                   'ary' => [
2600                     1,
2601                     2,
2602                     0
2603                   ]
2604                 };
2605
2606               If TYPE refers to a compound object, you may pack any member of
2607               that compound object. Simply add a member expression to the
2608               type name, just as you would access the member in C:
2609
2610                 $array = $c->pack('test.ary', [1, 2, 3]);
2611                 print hexdump(data => $array);
2612
2613                 $value = $c->pack('test.uni.word[1]', 2);
2614                 print hexdump(data => $value);
2615
2616               This would give you:
2617
2618                   0x0000 : 01 02 03                                        : ...
2619                   0x0000 : 00 02                                           : ..
2620
2621               Call "pack" with the optional STRING argument if you want to
2622               use an existing binary string to insert the data.  If called in
2623               a void context, "pack" will directly modify the string you
2624               passed as the third argument.  Otherwise, a copy of the string
2625               is created, and "pack" will modify and return the copy, so the
2626               original string will remain unchanged.
2627
2628               The 3-argument version may be useful if you want to change only
2629               a few members of a complex data structure without having to
2630               "unpack" everything, change the members, and then "pack" again
2631               (which could waste lots of memory and CPU cycles). So, instead
2632               of doing something like
2633
2634                 $test = $c->unpack('test', $binary);
2635                 $test->{uni}{quad} = 4711;
2636                 $new = $c->pack('test', $test);
2637
               to change the "uni.quad" member of $binary, you could simply do
               either
2640
2641                 $new = $c->pack('test', { uni => { quad => 4711 } }, $binary);
2642
2643               or
2644
2645                 $c->pack('test', { uni => { quad => 4711 } }, $binary);
2646
               where the latter would directly modify $binary. Besides this
               code being a lot shorter (and perhaps even more readable), it
               can be significantly faster if you're dealing with really big
               data blocks.
2651
               If the length of the input string is less than the size
               required by the type, the string (or its copy) is extended and
               the extended part is initialized to zero. If the length is
               more than the size required by the type, the string keeps its
               length, and a returned copy likewise contains the complete
               string, including the bytes beyond the end of the type.
2658
2659                 $too_short = pack "C*", (1 .. 4);
2660                 $too_long  = pack "C*", (1 .. 20);
2661
2662                 $c->pack('test', { uni => { quad => 0x4711 } }, $too_short);
2663                 print "too_short:\n", hexdump(data => $too_short);
2664
2665                 $copy = $c->pack('test', { uni => { quad => 0x4711 } }, $too_long);
2666                 print "\ncopy:\n", hexdump(data => $copy);
2667
2668               This would print:
2669
2670                 too_short:
2671                   0x0000 : 01 02 03 00 00 47 11                            : .....G.
2672
2673                 copy:
2674                   0x0000 : 01 02 03 00 00 47 11 08 09 0A 0B 0C 0D 0E 0F 10 : .....G..........
2675                   0x0010 : 11 12 13 14                                     : ....
2676
2677       unpack
2678
2679       "unpack" TYPE, STRING
2680               Use this method to unpack a binary string and create an arbi‐
2681               trarily complex Perl data structure based on a previously
2682               parsed type definition.
2683
2684                 use Convert::Binary::C;
2685                 use Data::Dumper;
2686
2687                 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2688                                             , LongSize  => 4
2689                                             , ShortSize => 2
2690                                             )
2691                                        ->parse( <<'ENDC' );
2692                 struct test {
2693                   char    ary[3];
2694                   union {
2695                     short word[2];
2696                     long *quad;
2697                   }       uni;
2698                 };
2699                 ENDC
2700
2701                 # Generate some binary dummy data
2702                 $binary = pack "C*", 1 .. $c->sizeof('test');
2703
2704               On failure, e.g. if the specified type cannot be found, the
2705               method will throw an exception. On success, a reference to a
2706               complex Perl data structure is returned, which can directly be
2707               dumped using the Data::Dumper module:
2708
2709                 $unpacked = $c->unpack('test', $binary);
2710                 print Dumper($unpacked);
2711
2712               This would print:
2713
2714                 $VAR1 = {
2715                   'uni' => {
2716                     'word' => [
2717                       1029,
2718                       1543
2719                     ],
2720                     'quad' => 67438087
2721                   },
2722                   'ary' => [
2723                     1,
2724                     2,
2725                     3
2726                   ]
2727                 };
2728
2729               If TYPE refers to a compound object, you may unpack any member
2730               of that compound object. Simply add a member expression to the
2731               type name, just as you would access the member in C:
2732
2733                 $binary2 = substr $binary, $c->offsetof('test', 'uni.word');
2734
2735                 $unpack1 = $unpacked->{uni}{word};
2736                 $unpack2 = $c->unpack('test.uni.word', $binary2);
2737
2738                 print Data::Dumper->Dump([$unpack1, $unpack2], [qw(unpack1 unpack2)]);
2739
2740               You will find that the output is exactly the same for both
2741               $unpack1 and $unpack2:
2742
2743                 $unpack1 = [
2744                   1029,
2745                   1543
2746                 ];
2747                 $unpack2 = [
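               The "initializer" method can be used to retrieve an initializer
               string for a certain TYPE. This can be useful if you have to
               initialize only a couple of members in a huge compound type or
               if you simply want to generate initializers automatically.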
2752               When "unpack" is called in list context, it will unpack as many
2753               elements as possible from STRING, including zero if STRING is
2754               not long enough.
2755
2756       initializer
2757
2758       "initializer" TYPE
2759       "initializer" TYPE, DATA
2760               The "initializer" method can be used retrieve an initializer
2761               string for a certain TYPE.  This can be useful if you have to
2762               initialize only a couple of members in a huge compound type or
2763               if you simply want to generate initializers automatically.
2764
2765                 struct date {
2766                   unsigned year : 12;
2767                   unsigned month:  4;
2768                   unsigned day  :  5;
2769                   unsigned hour :  5;
2770                   unsigned min  :  6;
2771                 };
2772
2773                 typedef struct {
2774                   enum { DATE, QWORD } type;
2775                   short number;
2776                   union {
2777                     struct date   date;
2778                     unsigned long qword;
2779                   } choice;
2780                 } data;
2781
2782               Given the above code has been parsed
2783
2784                 $init = $c->initializer('data');
2785                 print "data x = $init;\n";
2786
2787               would print the following:
2788
2789                 data x = {
2790                       0,
2791                       0,
2792                       {
2793                               {
2794                                       0,
2795                                       0,
2796                                       0,
2797                                       0,
2798                                       0
2799                               }
2800                       }
2801                 };
2802
2803               You could directly put that into a C program, although it prob‐
2804               ably isn't very useful yet. It becomes more useful if you actu‐
2805               ally specify how you want to initialize the type:
2806
2807                 $data = {
2808                   type   => 'QWORD',
2809                   choice => {
2810                     date  => { month => 12, day => 24 },
2811                     qword => 4711,
2812                   },
2813                   stuff => 'yes?',
2814                 };
2815
2816                 $init = $c->initializer('data', $data);
2817                 print "data x = $init;\n";
2818
2819               This would print the following:
2820
2821                 data x = {
2822                       QWORD,
2823                       0,
2824                       {
2825                               {
2826                                       0,
2827                                       12,
2828                                       24,
2829                                       0,
2830                                       0
2831                               }
2832                       }
2833                 };
2834
2835               As only the first member of a "union" can be initialized,
2836               "choice.qword" is ignored. You will not be warned about the
2837               fact that you probably tried to initialize a member other than
2838               the first. This is considered a feature, because it allows you
2839               to use "unpack" to generate the initializer data:
2840
2841                 $data = $c->unpack('data', $binary);
2842                 $init = $c->initializer('data', $data);
2843
               Since "unpack" unpacks all union members, you would otherwise
               have to delete all but the first one before passing the data
               to "initializer".
2847
2848               Also, "stuff" is ignored, because it actually isn't a member of
2849               "data". You won't be warned about that either.
2850
2851       sizeof
2852
2853       "sizeof" TYPE
2854               This method will return the size of a C type in bytes.  If it
2855               cannot find the type, it will throw an exception.
2856
2857               If the type defines some kind of compound object, you may ask
2858               for the size of a member of that compound object:
2859
2860                 $size = $c->sizeof('test.uni.word[1]');
2861
2862               This would set $size to 2.
2863
2864       typeof
2865
2866       "typeof" TYPE
2867               This method will return the type of a C member.  While this
2868               only makes sense for compound types, it's legal to also use it
2869               for non-compound types.  If it cannot find the type, it will
2870               throw an exception.
2871
2872               The "typeof" method can be used on any valid member, even on
2873               arrays or unnamed types. It will always return a string that
2874               holds the name (or in case of unnamed types only the class) of
2875               the type, optionally followed by a '*' character to indicate
2876               it's a pointer type, and optionally followed by one or more
2877               array dimensions if it's an array type. If the type is a bit‐
2878               field, the type name is followed by a colon and the number of
2879               bits.
2880
2881                 struct test {
2882                   char    ary[3];
2883                   union {
2884                     short word[2];
2885                     long *quad;
2886                   }       uni;
2887                   struct {
2888                     unsigned short six:6;
2889                     unsigned short ten:10;
2890                   }       bits;
2891                 };
2892
2893               Given the above C code has been parsed, calls to "typeof" would
2894               return the following values:
2895
2896                 $c->typeof('test')             => 'struct test'
2897                 $c->typeof('test.ary')         => 'char [3]'
2898                 $c->typeof('test.uni')         => 'union'
2899                 $c->typeof('test.uni.quad')    => 'long *'
2900                 $c->typeof('test.uni.word')    => 'short [2]'
2901                 $c->typeof('test.uni.word[1]') => 'short'
2902                 $c->typeof('test.bits')        => 'struct'
2903                 $c->typeof('test.bits.six')    => 'unsigned short :6'
2904                 $c->typeof('test.bits.ten')    => 'unsigned short :10'
2905
2906       offsetof
2907
2908       "offsetof" TYPE, MEMBER
               You can use "offsetof" just like the C macro of the same name.
               It will simply return the offset (in bytes) of MEMBER relative
               to TYPE.
2912
2913                 use Convert::Binary::C;
2914
2915                 $c = Convert::Binary::C->new( Alignment   => 4
2916                                             , LongSize    => 4
2917                                             , PointerSize => 4
2918                                             )
2919                                        ->parse(<<'ENDC');
2920                 typedef struct {
2921                   char abc;
2922                   long day;
2923                   int *ptr;
2924                 } week;
2925
2926                 struct test {
2927                   week zap[8];
2928                 };
2929                 ENDC
2930
2931                 @args = (
2932                   ['test',        'zap[5].day'  ],
2933                   ['test.zap[2]', 'day'         ],
2934                   ['test',        'zap[5].day+1'],
2935                   ['test',        'zap[-3].ptr' ],
2936                 );
2937
2938                 for (@args) {
2939                   my $offset = eval { $c->offsetof(@$_) };
2940                   printf "\$c->offsetof('%s', '%s') => $offset\n", @$_;
2941                 }
2942
2943               The final loop will print:
2944
2945                 $c->offsetof('test', 'zap[5].day') => 64
2946                 $c->offsetof('test.zap[2]', 'day') => 4
2947                 $c->offsetof('test', 'zap[5].day+1') => 65
2948                 $c->offsetof('test', 'zap[-3].ptr') => -28
2949
2950               * The first iteration simply shows that the offset of
2951                 "zap[5].day" is 64 relative to the beginning of "struct
2952                 test".
2953
2954               * You may additionally specify a member for the type passed as
2955                 the first argument, as shown in the second iteration.
2956
2957               * The offset suffix is also supported by "offsetof", so the
2958                 third iteration will correctly print 65.
2959
2960               * The last iteration demonstrates that even out-of-bounds array
2961                 indices are handled correctly, just as they are handled in C.
2962
2963               Unlike the C macro, "offsetof" also works on array types.
2964
2965                 $offset = $c->offsetof('test.zap', '[3].ptr+2');
2966                 print "offset = $offset";
2967
2968               This will print:
2969
2970                 offset = 46
2971
2972               If TYPE is a compound, MEMBER may optionally be prefixed with a
2973               dot, so
2974
2975                 printf "offset = %d\n", $c->offsetof('week', 'day');
2976                 printf "offset = %d\n", $c->offsetof('week', '.day');
2977
2978               are both equivalent and will print
2979
2980                 offset = 4
2981                 offset = 4
2982
               This allows you to
2984
2985               * use the C macro style, without a leading dot, and
2986
2987               * directly use the output of the "member" method, which
2988                 includes a leading dot for compound types, as input for the
2989                 MEMBER argument.
2990
2991       member
2992
2993       "member" TYPE
2994       "member" TYPE, OFFSET
2995               You can think of "member" as being the reverse of the "off‐
2996               setof" method. However, as this is more complex, there's no
2997               equivalent to "member" in the C language.
2998
2999               Usually this method is used if you want to retrieve the name of
3000               the member that is located at a specific offset of a previously
3001               parsed type.
3002
3003                 use Convert::Binary::C;
3004
3005                 $c = Convert::Binary::C->new( Alignment   => 4
3006                                             , LongSize    => 4
3007                                             , PointerSize => 4
3008                                             )
3009                                        ->parse(<<'ENDC');
3010                 typedef struct {
3011                   char abc;
3012                   long day;
3013                   int *ptr;
3014                 } week;
3015
3016                 struct test {
3017                   week zap[8];
3018                 };
3019                 ENDC
3020
3021                 for my $offset (24, 39, 69, 99) {
3022                   print "\$c->member('test', $offset)";
3023                   my $member = eval { $c->member('test', $offset) };
3024                   print $@ ? "\n  exception: $@" : " => '$member'\n";
3025                 }
3026
3027               This will print:
3028
3029                 $c->member('test', 24) => '.zap[2].abc'
3030                 $c->member('test', 39) => '.zap[3]+3'
3031                 $c->member('test', 69) => '.zap[5].ptr+1'
3032                 $c->member('test', 99)
3033                   exception: Offset 99 out of range (0 <= offset < 96)
3034
3035               * The output of the first iteration is obvious. The member
3036                 "zap[2].abc" is located at offset 24 of "struct test".
3037
3038               * In the second iteration, the offset points into a region of
3039                 padding bytes and thus no member of "week" can be named.
3040                 Instead of a member name the offset relative to "zap[3]" is
3041                 appended.
3042
3043               * In the third iteration, the offset points to "zap[5].ptr".
3044                 However, "zap[5].ptr" is located at 68, not at 69, and thus
3045                 the remaining offset of 1 is also appended.
3046
3047               * The last iteration causes an exception because the offset of
3048                 99 is not valid for "struct test" since the size of "struct
3049                 test" is only 96. You might argue that this is inconsistent,
3050                 since "offsetof" can also handle out-of-bounds array members.
3051                 But as soon as you have more than one level of array nesting,
3052                 there's an infinite number of out-of-bounds members for a
3053                 single given offset, so it would be impossible to return a
3054                 list of all members.
3055
3056               You can additionally specify a member for the type passed as
3057               the first argument:
3058
3059                 $member = $c->member('test.zap[2]', 6);
3060                 print $member;
3061
3062               This will print:
3063
3064                 .day+2
3065
3066               Like "offsetof", "member" also works on array types:
3067
3068                 $member = $c->member('test.zap', 42);
3069                 print $member;
3070
3071               This will print:
3072
3073                 [3].day+2
3074
3075               While the behaviour for "struct"s is quite obvious, the behav‐
3076               iour for "union"s is rather tricky. As a single offset usually
3077               references more than one member of a union, there are certain
3078               rules that the algorithm uses for determining the best member.
3079
3080               * The first non-compound member that is referenced without an
3081                 offset has the highest priority.
3082
3083               * If no member is referenced without an offset, the first non-
3084                 compound member that is referenced with an offset will be
3085                 returned.
3086
3087               * Otherwise the first padding region that is encountered will
3088                 be taken.
3089
3090               As an example, given 4-byte-alignment and the union
3091
3092                 union choice {
3093                   struct {
3094                     char  color[2];
3095                     long  size;
3096                     char  taste;
3097                   }       apple;
3098                   char    grape[3];
3099                   struct {
3100                     long  weight;
3101                     short price[3];
3102                   }       melon;
3103                 };
3104
3105               the "member" method would return what is shown in the Member
3106               column of the following table. The Type column shows the result
3107               of the "typeof" method when passing the corresponding member.
3108
3109                 Offset   Member               Type
3110                 --------------------------------------
3111                    0     .apple.color[0]      'char'
3112                    1     .apple.color[1]      'char'
3113                    2     .grape[2]            'char'
3114                    3     .melon.weight+3      'long'
3115                    4     .apple.size          'long'
3116                    5     .apple.size+1        'long'
3117                    6     .melon.price[1]      'short'
3118                    7     .apple.size+3        'long'
3119                    8     .apple.taste         'char'
3120                    9     .melon.price[2]+1    'short'
3121                   10     .apple+10            'struct'
3122                   11     .apple+11            'struct'
3123
3124               It's like having a stack of all the union members and looking
3125               through the stack for the shiniest piece you can see. The
3126               beginning of a member (denoted by uppercase letters) is always
3127               shinier than the rest of a member, while padding regions
3128               (denoted by dashes) aren't shiny at all.
3129
3130                 Offset   0   1   2   3   4   5   6   7   8   9  10  11
3131                 -------------------------------------------------------
3132                 apple   (C) (C)  -   -  (S) (s)  s  (s) (T)  -  (-) (-)
3133                 grape    G   G  (G)
3134                 melon    W   w   w  (w)  P   p  (P)  p   P  (p)  -   -
3135
3136               If you look through that stack from top to bottom, you'll end
3137               up at the parenthesized members.
3138
3139               Alternatively, if you're not only interested in the best mem‐
3140               ber, you can call "member" in list context, which makes it
3141               return all members referenced by the given offset.
3142
3143                 Offset   Member               Type
3144                 --------------------------------------
3145                    0     .apple.color[0]      'char'
3146                          .grape[0]            'char'
3147                          .melon.weight        'long'
3148                    1     .apple.color[1]      'char'
3149                          .grape[1]            'char'
3150                          .melon.weight+1      'long'
3151                    2     .grape[2]            'char'
3152                          .melon.weight+2      'long'
3153                          .apple+2             'struct'
3154                    3     .melon.weight+3      'long'
3155                          .apple+3             'struct'
3156                    4     .apple.size          'long'
3157                          .melon.price[0]      'short'
3158                    5     .apple.size+1        'long'
3159                          .melon.price[0]+1    'short'
3160                    6     .melon.price[1]      'short'
3161                          .apple.size+2        'long'
3162                    7     .apple.size+3        'long'
3163                          .melon.price[1]+1    'short'
3164                    8     .apple.taste         'char'
3165                          .melon.price[2]      'short'
3166                    9     .melon.price[2]+1    'short'
3167                          .apple+9             'struct'
3168                   10     .apple+10            'struct'
3169                          .melon+10            'struct'
3170                   11     .apple+11            'struct'
3171                          .melon+11            'struct'
3172
3173               The first member returned is always the best member. The other
3174               members are sorted according to the rules given above. This
3175               means that members referenced without an offset are followed by
3176               members referenced with an offset. Padding regions will be at
3177               the end.
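
                   As a short sketch of the difference between the two
                   calling contexts, the following parses the "choice" union
                   from above. The type sizes are set explicitly here as an
                   assumption, so that the layout matches the tables:

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Sizes are fixed so the layout matches the tables in the text.
my $c = Convert::Binary::C->new(
  Alignment => 4,
  LongSize  => 4,
  ShortSize => 2,
)->parse(q{
  union choice {
    struct {
      char  color[2];
      long  size;
      char  taste;
    }       apple;
    char    grape[3];
    struct {
      long  weight;
      short price[3];
    }       melon;
  };
});

my $best = $c->member('choice', 6);   # scalar context: best member only
my @all  = $c->member('choice', 6);   # list context: every candidate

print "best: $best\n";
print "all:  @all\n";
```

                   According to the tables above, the best member for offset
                   6 is ".melon.price[1]", and the list additionally contains
                   ".apple.size+2".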
3178
3179               If OFFSET is not given in the method call, "member" will return
3180               a list of all possible members of TYPE.
3181
3182                 print "$_\n" for $c->member('choice');
3183
3184               This will print:
3185
3186                 .apple.color[0]
3187                 .apple.color[1]
3188                 .apple.size
3189                 .apple.taste
3190                 .grape[0]
3191                 .grape[1]
3192                 .grape[2]
3193                 .melon.weight
3194                 .melon.price[0]
3195                 .melon.price[1]
3196                 .melon.price[2]
3197
3198               In scalar context, the number of possible members is returned.
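
                   A minimal sketch of both contexts, using a small
                   hypothetical "struct point" that is not part of the
                   examples above. Each returned name is also passed on to
                   "offsetof":

```perl
use strict;
use warnings;
use Convert::Binary::C;

# IntSize is fixed here only to make the offsets predictable.
my $c = Convert::Binary::C->new(IntSize => 4)->parse(q{
  struct point {
    int x;
    int y[2];
  };
});

my @members = $c->member('point');   # list context: all member names
my $count   = $c->member('point');   # scalar context: just the count

print "$count members\n";
printf "%-6s at offset %d\n", $_, $c->offsetof('point', $_) for @members;
```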
3199
3200       tag
3201
3202       "tag" TYPE
3203       "tag" TYPE, TAG
3204       "tag" TYPE, TAG1 => VALUE1, TAG2 => VALUE2, ...
3205               The "tag" method can be used to attach properties to a TYPE.
3206               It's a bit like having "configure" for individual types.
3207
3208               See "USING TAGS" for an example.
3209
3210               Note that while you can tag whole types as well as compound
3211               members, it is not possible to tag array members, i.e. you can‐
3212               not treat, for example, "a[1]" and "a[2]" differently.
3213
3214               Also note that in code like this
3215
3216                 struct test {
3217                   int a;
3218                   struct {
3219                     int x;
3220                   } b, c;
3221                 };
3222
3223               if you tag "test.b.x", this will also tag "test.c.x" implic‐
3224               itly.
3225
3226               It is also possible to tag basic types if you really want to do
3227               that, for example:
3228
3229                 $c->tag('int', Format => 'Binary');
3230
3231               To remove a tag from a type, you can either set that tag to
3232               "undef", for example
3233
3234                 $c->tag('test', Hooks => undef);
3235
3236               or use "untag".
3237
3238               To see if a tag is attached to a type or to get the value of a
3239               tag, pass only the type and tag name to "tag":
3240
3241                 $c->tag('test.a', Format => 'Binary');
3242
3243                 $hooks = $c->tag('test.a', 'Hooks');
3244                 $format = $c->tag('test.a', 'Format');
3245
3246               This will give you:
3247
3248                 $hooks = undef;
3249                 $format = 'Binary';
3250
3251               To see which tags are attached to a type, pass only the type.
3252               The "tag" method will now return a hash reference containing
3253               all tags attached to the type:
3254
3255                 $tags = $c->tag('test.a');
3256
3257               This will give you:
3258
3259                 $tags = {
3260                   'Format' => 'Binary'
3261                 };
3262
3263               "tag" will throw an exception if an error occurs.  If called as
3264               a 'set' method, it will return a reference to its object,
3265               allowing you to chain together consecutive method calls.
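
                   A brief sketch of such chaining, assuming the "struct
                   test" definition shown earlier in this section. The hook
                   is a do-nothing placeholder:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(q{
  struct test {
    int a;
    struct {
      int x;
    } b, c;
  };
});

# Each 'set' call returns the object, so the calls can be chained.
$c->tag('test.a', Format => 'Binary')
  ->tag('test',   Hooks  => { unpack => sub { $_[0] } });

my $format = $c->tag('test.a', 'Format');   # value of a single tag
my $tags   = $c->tag('test');               # hash ref with all tags

print "test.a Format: $format\n";
print "tags on test:  @{[ sort keys %$tags ]}\n";
```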
3266
3267               Note that when a compound is inlined, tags attached to the
3268               inlined compound are ignored, for example:
3269
3270                 $c->parse(<<ENDC);
3271                 struct header {
3272                   int id;
3273                   int len;
3274                   unsigned flags;
3275                 };
3276
3277                 struct message {
3278                   struct header;
3279                   short samples[32];
3280                 };
3281                 ENDC
3282
3283                 for my $type (qw( header message header.len )) {
3284                   $c->tag($type, Hooks => { unpack => sub { print "unpack: $type\n"; @_ } });
3285                 }
3286
3287                 for my $type (qw( header message )) {
3288                   print "[unpacking $type]\n";
3289                   $u = $c->unpack($type, $data);
3290                 }
3291
3292               This will print:
3293
3294                 [unpacking header]
3295                 unpack: header.len
3296                 unpack: header
3297                 [unpacking message]
3298                 unpack: header.len
3299                 unpack: message
3300
3301               As you can see from the above output, tags attached to members
3302               of inlined compounds ("header.len") are still handled.
3303
3304               The following tags can be configured:
3305
3306               "Format" => 'Binary' ⎪ 'String'
3307                   The "Format" tag allows you to control the way binary data
3308                   is converted by "pack" and "unpack".
3309
3310                   If you tag a "TYPE" as "Binary", it will not be converted
3311                   at all, i.e. it will be passed through as a binary string.
3312
3313                   If you tag it as "String", it will be treated like a null-
3314                   terminated C string, i.e. "unpack" will convert the C
3315                   string to a Perl string and vice versa.
3316
3317                   See "The Format Tag" for an example.
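
                       As a quick sketch of both settings, the following
                       uses a hypothetical "struct record". Byte order and
                       integer size are fixed only so that the packed test
                       data is predictable:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
  ByteOrder => 'LittleEndian',
  IntSize   => 4,
)->parse(q{
  struct record {
    char name[8];
    int  id;
  };
});

$c->tag('record.name', Format => 'String');   # NUL-terminated C string
$c->tag('record.id',   Format => 'Binary');   # raw, unconverted bytes

# "perl" padded with NULs to 8 bytes, then a little-endian 42
my $data = pack 'a8 V', 'perl', 42;
my $rec  = $c->unpack('record', $data);

printf "name = '%s'\n", $rec->{name};
printf "id   = %s\n",   unpack 'H*', $rec->{id};
```

                       Without the tags, "name" would be unpacked as an
                       array of eight character values and "id" as the
                       integer 42.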
3318
3319               "ByteOrder" => 'BigEndian' ⎪ 'LittleEndian'
3320                   The "ByteOrder" tag allows you to explicitly set the byte
3321                   order of a TYPE.
3322
3323                   See "The ByteOrder Tag" for an example.
3324
3325               "Dimension" => '*'
3326               "Dimension" => VALUE
3327               "Dimension" => MEMBER
3328               "Dimension" => SUB
3329               "Dimension" => [ SUB, ARGS ]
3330                   The "Dimension" tag allows you to alter the size of an
3331                   array dynamically.
3332
3333                   You can tag fixed size arrays as being flexible using '*'.
3334                   This is useful if you cannot use flexible array members in
3335                   your source code.
3336
3337                     $c->tag('type.array', Dimension => '*');
3338
3339                   You can also tag an array to have a fixed size different
3340                   from the one it was originally declared with.
3341
3342                     $c->tag('type.array', Dimension => 42);
3343
3344                   If the array is a member of a compound, you can also tag it
3345                   to have a size corresponding to the value of another member
3346                   in that compound.
3347
3348                     $c->tag('type.array', Dimension => 'count');
3349
3350                   Finally, you can specify a subroutine that is called when
3351                   the size of the array needs to be determined.
3352
3353                     $c->tag('type.array', Dimension => \&get_count);
3354
3355                   By default, and if the array is a compound member, that
3356                   subroutine will be passed a reference to the hash storing
3357                   the data for the compound.
3358
3359                   You can also instruct Convert::Binary::C to pass additional
3360                   arguments to the subroutine by passing an array reference
3361                   instead of the subroutine reference. This array contains
3362                   the subroutine reference as well as a list of arguments.
3363                   It is possible to define certain special arguments using
3364                   the "arg" method.
3365
3366                     $c->tag('type.array', Dimension => [\&get_count, $c->arg('SELF'), 42]);
3367
3368                   See "The Dimension Tag" for various examples.
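
                       A compact sketch of the member and subroutine forms,
                       using a hypothetical "struct list" whose "count"
                       member precedes the array. The cap of two elements in
                       the subroutine is arbitrary:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
  ByteOrder => 'LittleEndian',
  IntSize   => 4,
)->parse(q{
  struct list {
    int count;
    int item[1];
  };
});

# count = 3, followed by three little-endian integers
my $data = pack 'V*', 3, 10, 20, 30;

# Size taken from another member of the same compound
$c->tag('list.item', Dimension => 'count');
my $by_member = $c->unpack('list', $data);

# Size computed by a subroutine that receives the compound's hash
$c->tag('list.item', Dimension => sub {
  my $list = shift;
  return $list->{count} > 2 ? 2 : $list->{count};
});
my $by_sub = $c->unpack('list', $data);

print "by member: @{ $by_member->{item} }\n";
print "by sub:    @{ $by_sub->{item} }\n";
```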
3369
3370               "Hooks" => { HOOK => SUB, HOOK => [ SUB, ARGS ], ... }, ...
3371                   The "Hooks" tag allows you to register subroutines as
3372                   hooks.
3373
3374                   Hooks are called whenever a certain "TYPE" is packed or
3375                   unpacked. Hooks are currently considered an experimental
3376                   feature.
3377
3378                   "HOOK" can be one of the following:
3379
3380                     pack
3381                     unpack
3382                     pack_ptr
3383                     unpack_ptr
3384
3385                   "pack" and "unpack" hooks are called when processing their
3386                   "TYPE", while "pack_ptr" and "unpack_ptr" hooks are called
3387                   when processing pointers to their "TYPE".
3388
3389                   "SUB" is a reference to a subroutine that usually takes one
3390                   input argument, processes it and returns one output argu‐
3391                   ment.
3392
3393                   Alternatively, you can pass a custom list of arguments to
3394                   the hook by using an array reference instead of "SUB" that
3395                   holds the subroutine reference in the first element and the
3396                   arguments to be passed to the subroutine as the other ele‐
3397                   ments.  This way, you can even pass special arguments to
3398                   the hook using the "arg" method.
3399
3400                   Here are a few examples for registering hooks:
3401
3402                     $c->tag('ObjectType', Hooks => {
3403                               pack   => \&obj_pack,
3404                               unpack => \&obj_unpack
3405                             });
3406
3407                     $c->tag('ProtocolId', Hooks => {
3408                               unpack => sub { $protos[$_[0]] }
3409                             });
3410
3411                     $c->tag('ProtocolId', Hooks => {
3412                               unpack_ptr => [sub {
3413                                                sprintf "$_[0]:{0x%X}", $_[1]
3414                                              },
3415                                              $c->arg('TYPE', 'DATA')
3416                                             ],
3417                             });
3418
3419                   Note that the above example registers both an "unpack" hook
3420                   and an "unpack_ptr" hook for "ProtocolId" with two separate
3421                   calls to "tag". As long as you don't explicitly overwrite a
3422                   previously registered hook, it won't be modified or removed
3423                   by registering other hooks for the same "TYPE".
3424
3425                   To remove all registered hooks for a type, simply remove
3426                   the "Hooks" tag:
3427
3428                     $c->untag('ProtocolId', 'Hooks');
3429
3430                   To remove only a single hook, pass "undef" as "SUB" instead
3431                   of a subroutine reference:
3432
3433                     $c->tag('ObjectType', Hooks => { pack => undef });
3434
3435                   If all hooks are removed, the whole "Hooks" tag is removed.
3436
3437                   See "The Hooks Tag" for examples on how to use hooks.
3438
3439       untag
3440
3441       "untag" TYPE
3442       "untag" TYPE, TAG1, TAG2, ...
3443               Use the "untag" method to remove one, several, or all tags from
3444               a type. If you don't pass any tag names, all tags attached to
3445               the type will be removed. Otherwise only the listed tags will
3446               be removed.
3447
3448               See "USING TAGS" for an example.
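
                   A minimal sketch of both forms, assuming a small
                   hypothetical "struct test" and a do-nothing hook:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(q{
  struct test {
    int a;
    int b;
  };
});

# Attach two tags in a single call
$c->tag('test.a', Format => 'Binary',
                  Hooks  => { unpack => sub { $_[0] } });

# Remove only the listed tag
$c->untag('test.a', 'Hooks');
my @left = sort keys %{ $c->tag('test.a') };

# Remove everything that is still attached
$c->untag('test.a');
my $format = $c->tag('test.a', 'Format');

print "after untag(Hooks): @left\n";
print "after untag():      ", defined $format ? $format : 'undef', "\n";
```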
3449
3450       arg
3451
3452       "arg" 'ARG', ...
3453               Creates placeholders for special arguments to be passed to
3454               hooks or other subroutines. These arguments are currently:
3455
3456               "SELF"
3457                   A reference to the calling Convert::Binary::C object. This
3458                   may be useful if you need to work with the object inside
3459                   the subroutine.
3460
3461               "TYPE"
3462                   The name of the type that is currently being processed by
3463                   the hook.
3464
3465               "DATA"
3466                   The data argument that is passed to the subroutine.
3467
3468               "HOOK"
3469                   The kind of hook for which the subroutine has been called,
3470                   for example "pack" or "unpack_ptr".
3471
3472               "arg" will return a placeholder for each argument it is being
3473               passed. Note that not all arguments may be supported depending
3474               on the context of the subroutine.
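
                   The following sketch passes all four placeholders plus
                   one ordinary argument to an "unpack" hook. The
                   "ProtocolId" typedef and the lookup table are assumptions
                   made up for illustration:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(q{
  typedef unsigned char ProtocolId;
});

my @protos = qw( ICMP TCP UDP );   # hypothetical lookup table

$c->tag('ProtocolId', Hooks => {
  unpack => [
    sub {
      my ($self, $type, $data, $hook, $names) = @_;
      my $name = $names->[$data];
      $name = "unknown($data)" unless defined $name;
      return "$name [$hook hook on $type]";
    },
    # Placeholders are filled in at call time; \@protos is passed as is.
    $c->arg('SELF', 'TYPE', 'DATA', 'HOOK'), \@protos,
  ],
});

my $proto = $c->unpack('ProtocolId', "\x01");
print "$proto\n";
```

                   Note that with a custom argument list the data is only
                   passed to the subroutine if "DATA" is listed explicitly.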
3475
3476       dependencies
3477
3478       "dependencies"
3479               After some code has been parsed using either the "parse" or
3480               "parse_file" methods, the "dependencies" method can be used to
3481               retrieve information about all files that the object depends
3482               on, i.e. all files that have been parsed.
3483
3484               In scalar context, the method returns a hash reference.  Each
3485               key is the name of a file. The values are again hash refer‐
3486               ences, each of which holds the size, modification time (mtime),
3487               and change time (ctime) of the file at the moment it was
3488               parsed.
3489
3490                 use Convert::Binary::C;
3491                 use Data::Dumper;
3492
3493                 #----------------------------------------------------------
3494                 # Create object, set include path, parse 'string.h' header
3495                 #----------------------------------------------------------
3496                 my $c = Convert::Binary::C->new
3497                         ->Include('/usr/lib/gcc/i686-pc-linux-gnu/4.1.2/include',
3498                                   '/usr/include')
3499                         ->parse_file('string.h');
3500
3501                 #----------------------------------------------------------
3502                 # Get dependencies of the object, extract dependency files
3503                 #----------------------------------------------------------
3504                 my $depend = $c->dependencies;
3505                 my @files  = keys %$depend;
3506
3507                 #-----------------------------
3508                 # Dump dependencies and files
3509                 #-----------------------------
3510                 print Data::Dumper->Dump([$depend, \@files],
3511                                       [qw( depend   *files )]);
3512
3513               The above code would print something like this:
3514
3515                 $depend = {
3516                   '/usr/include/features.h' => {
3517                     'ctime' => 1196609327,
3518                     'mtime' => 1196609232,
3519                     'size' => 11688
3520                   },
3521                   '/usr/include/gnu/stubs-32.h' => {
3522                     'ctime' => 1196609327,
3523                     'mtime' => 1196609305,
3524                     'size' => 624
3525                   },
3526                   '/usr/include/sys/cdefs.h' => {
3527                     'ctime' => 1196609327,
3528                     'mtime' => 1196609269,
3529                     'size' => 11773
3530                   },
3531                   '/usr/include/gnu/stubs.h' => {
3532                     'ctime' => 1196609327,
3533                     'mtime' => 1196609232,
3534                     'size' => 315
3535                   },
3536                   '/usr/lib/gcc/i686-pc-linux-gnu/4.1.2/include/stddef.h' => {
3537                     'ctime' => 1203359674,
3538                     'mtime' => 1203357922,
3539                     'size' => 12695
3540                   },
3541                   '/usr/include/string.h' => {
3542                     'ctime' => 1196609327,
3543                     'mtime' => 1196609262,
3544                     'size' => 16438
3545                   },
3546                   '/usr/include/bits/wordsize.h' => {
3547                     'ctime' => 1196609327,
3548                     'mtime' => 1196609257,
3549                     'size' => 873
3550                   }
3551                 };
3552                 @files = (
3553                   '/usr/include/features.h',
3554                   '/usr/include/gnu/stubs-32.h',
3555                   '/usr/include/sys/cdefs.h',
3556                   '/usr/include/gnu/stubs.h',
3557                   '/usr/lib/gcc/i686-pc-linux-gnu/4.1.2/include/stddef.h',
3558                   '/usr/include/string.h',
3559                   '/usr/include/bits/wordsize.h'
3560                 );
3561
3562               In list context, the method returns the names of all files that
3563               have been parsed, i.e. the following lines are equivalent:
3564
3565                 @files = keys %{$c->dependencies};
3566                 @files = $c->dependencies;
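
                   One possible use of this information is to detect whether
                   a previously parsed file has changed. The sketch below
                   writes a throwaway header so that it does not depend on
                   system files; "is_stale" is a hypothetical helper, not
                   part of the module:

```perl
use strict;
use warnings;
use Convert::Binary::C;
use File::Temp qw( tempfile );

# Write a throwaway header so the example does not depend on system files.
my ($fh, $header) = tempfile(SUFFIX => '.h', UNLINK => 1);
print {$fh} "struct point { int x; int y; };\n";
close $fh or die "close: $!";

my $c      = Convert::Binary::C->new->parse_file($header);
my $depend = $c->dependencies;

# Hypothetical helper: true if any recorded file changed size or mtime.
sub is_stale {
  my ($dep) = @_;
  for my $file (keys %$dep) {
    my @st = stat $file;
    return 1 unless @st;                          # file has disappeared
    return 1 if $st[7] != $dep->{$file}{size}     # size in bytes
             || $st[9] != $dep->{$file}{mtime};   # modification time
  }
  return 0;
}

my $before = is_stale($depend);

# Appending to the header changes its size, so the snapshot becomes stale.
open my $out, '>>', $header or die "open: $!";
print {$out} "struct extra { int z; };\n";
close $out or die "close: $!";

my $after = is_stale($depend);
print "stale before edit: $before, after edit: $after\n";
```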
3567
3568       sourcify
3569
3570       "sourcify"
3571       "sourcify" CONFIG
3572               Returns a string that holds the C source code necessary to rep‐
3573               resent all parsed C data structures.
3574
3575                 use Convert::Binary::C;
3576
3577                 $c = new Convert::Binary::C;
3578                 $c->parse(<<'END');
3579
3580                 #define ADD(a, b) ((a) + (b))
3581                 #define NUMBER 42
3582
3583                 typedef struct _mytype mytype;
3584
3585                 struct _mytype {
3586                   union {
3587                     int         iCount;
3588                     enum count *pCount;
3589                   } counter;
3590                 #pragma pack( push, 1 )
3591                   struct {
3592                     char string[NUMBER];
3593                     int  array[NUMBER/sizeof(int)];
3594                   } storage;
3595                 #pragma pack( pop )
3596                   mytype *next;
3597                 };
3598
3599                 enum count { ZERO, ONE, TWO, THREE };
3600
3601                 END
3602
3603                 print $c->sourcify;
3604
3605               The above code would print something like this:
3606
3607                 /* typedef predeclarations */
3608
3609                 typedef struct _mytype mytype;
3610
3611                 /* defined enums */
3612
3613                 enum count
3614                 {
3615                       ZERO,
3616                       ONE,
3617                       TWO,
3618                       THREE
3619                 };
3620
3621                 /* defined structs and unions */
3622
3623                 struct _mytype
3624                 {
3625                       union
3626                       {
3627                               int iCount;
3628                               enum count *pCount;
3629                       } counter;
3630                 #pragma pack(push, 1)
3631                       struct
3632                       {
3633                               char string[42];
3634                               int array[10];
3635                       } storage;
3636                 #pragma pack(pop)
3637                       mytype *next;
3638                 };
3639
3640               The purpose of the "sourcify" method is to enable some kind of
3641               platform-independent caching. The C code generated by
3642               "sourcify" can be parsed by any standard C compiler and, of
3643               course, by the Convert::Binary::C parser. However, the code
3644               may be significantly shorter than the code that was originally
3645               parsed.
3646
3647               When parsing a typical header file, it's easily possible that
3648               you need to open dozens of other files that are included from
3649               that file, and end up parsing several hundred kilobytes of C
3650               code. Since most of it is usually preprocessor directives,
3651               function prototypes and comments, the "sourcify" method
3652               strips this down to a few kilobytes. Saving the "sourcify"
3653               string and parsing it next time instead of the original code
3654               may be a lot faster.
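
                   A minimal sketch of that caching idea, keeping the cache
                   in a variable instead of a file. The configuration values
                   are assumptions; the point is that the second object is
                   created with the same configuration, because the generated
                   source contains only C code and no object settings:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my %config = (IntSize => 4, PointerSize => 4, Alignment => 4);

my $c = Convert::Binary::C->new(%config)->parse(q{
  #define NUMBER 42

  typedef struct _mytype mytype;

  struct _mytype {
    char    string[NUMBER];
    int     array[NUMBER/sizeof(int)];
    mytype *next;
  };
});

# The string could be written to a file here and read back later.
my $cache = $c->sourcify;

# Parse the reduced source with an identically configured object.
my $cached = Convert::Binary::C->new(%config)->parse($cache);

printf "sizeof(mytype): original %d, cached %d\n",
       $c->sizeof('mytype'), $cached->sizeof('mytype');
printf "offsetof(next): original %d, cached %d\n",
       $c->offsetof('mytype', 'next'), $cached->offsetof('mytype', 'next');
```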
3655
3656               The "sourcify" method takes a hash reference as an optional
3657               argument. It can be used to tweak the method's output.  The
3658               following options can be configured.
3659
3660               "Context" => 0 ⎪ 1
3661                   Turns preprocessor context information on or off. If this
3662                   is turned on, "sourcify" will insert "#line" preprocessor
3663                   directives in its output. So in the above example
3664
3665                     print $c->sourcify({ Context => 1 });
3666
3667                   would print:
3668
3669                     /* typedef predeclarations */
3670
3671                     typedef struct _mytype mytype;
3672
3673                     /* defined enums */
3674
3675                     #line 21 "[buffer]"
3676                     enum count
3677                     {
3678                           ZERO,
3679                           ONE,
3680                           TWO,
3681                           THREE
3682                     };
3683
3684                     /* defined structs and unions */
3685
3686                     #line 7 "[buffer]"
3687                     struct _mytype
3688                     {
3689                     #line 8 "[buffer]"
3690                           union
3691                           {
3692                                   int iCount;
3693                                   enum count *pCount;
3694                           } counter;
3695                     #pragma pack(push, 1)
3696                     #line 13 "[buffer]"
3697                           struct
3698                           {
3699                                   char string[42];
3700                                   int array[10];
3701                           } storage;
3702                     #pragma pack(pop)
3703                           mytype *next;
3704                     };
3705
3706                   Note that "[buffer]" refers to the here-doc buffer when
3707                   using "parse".
3708
3709               "Defines" => 0 ⎪ 1
3710                   Turn this on if you want all the defined macros to be part
3711                   of the source code output. Given the example code above
3712
3713                     print $c->sourcify({ Defines => 1 });
3714
3715                   would print:
3716
3717                     /* typedef predeclarations */
3718
3719                     typedef struct _mytype mytype;
3720
3721                     /* defined enums */
3722
3723                     enum count
3724                     {
3725                           ZERO,
3726                           ONE,
3727                           TWO,
3728                           THREE
3729                     };
3730
3731                     /* defined structs and unions */
3732
3733                     struct _mytype
3734                     {
3735                           union
3736                           {
3737                                   int iCount;
3738                                   enum count *pCount;
3739                           } counter;
3740                     #pragma pack(push, 1)
3741                           struct
3742                           {
3743                                   char string[42];
3744                                   int array[10];
3745                           } storage;
3746                     #pragma pack(pop)
3747                           mytype *next;
3748                     };
3749
3750                     /* preprocessor defines */
3751
3752                     #define ADD(a, b) ((a) + (b))
3753                     #define NUMBER 42
3754
3755                   The macro definitions always appear at the end of the
3756                   source code.  The order of the macro definitions is unde‐
3757                   fined.
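                   A minimal sketch of the effect of this option, assuming
                   the module is installed. The C snippet is a shortened,
                   illustrative variant of the example code above, and the
                   variable names are hypothetical:

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Illustrative input; NUMBER, ADD and "enum count" mirror the example above.
my $c = Convert::Binary::C->new->parse(<<'ENDC');
#define NUMBER 42
#define ADD(a, b) ((a) + (b))

enum count { ZERO, ONE, TWO, THREE };
ENDC

# Default behaviour: only the type definitions are emitted.
my $plain = $c->sourcify;

# With "Defines" enabled, the macro definitions are emitted as well.
my $with_defines = $c->sourcify({ Defines => 1 });

print $with_defines;
```

                   Since the order of the macro definitions is undefined, it
                   is safer to search the output for a particular macro than
                   to rely on its position.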
3758
3759       The following methods can be used to retrieve information about the
3760       definitions that have been parsed. The examples given in the descrip‐
3761       tion for "enum", "compound" and "typedef" all assume this piece of C
3762       code has been parsed:
3763
3764         #define ABC_SIZE 2
3765         #define MULTIPLY(x, y) ((x)*(y))
3766
3767         #ifdef ABC_SIZE
3768         # define DEFINED
3769         #else
3770         # define NOT_DEFINED
3771         #endif
3772
3773         typedef unsigned long U32;
3774         typedef void *any;
3775
3776         enum __socket_type
3777         {
3778           SOCK_STREAM    = 1,
3779           SOCK_DGRAM     = 2,
3780           SOCK_RAW       = 3,
3781           SOCK_RDM       = 4,
3782           SOCK_SEQPACKET = 5,
3783           SOCK_PACKET    = 10
3784         };
3785
3786         struct STRUCT_SV {
3787           void *sv_any;
3788           U32   sv_refcnt;
3789           U32   sv_flags;
3790         };
3791
3792         typedef union {
3793           int abc[ABC_SIZE];
3794           struct xxx {
3795             int a;
3796             int b;
3797           }   ab[3][4];
3798           any ptr;
3799         } test;
3800
3801       enum_names
3802
3803       "enum_names"
3804               Returns a list of identifiers of all defined enumeration
3805               objects. Enumeration objects don't necessarily have an identi‐
3806               fier, so something like
3807
3808                 enum { A, B, C };
3809
3810               will obviously not appear in the list returned by the
3811               "enum_names" method. Also, enumerations that are not defined
3812               within the source code - like in
3813
3814                 struct foo {
3815                   enum weekday *pWeekday;
3816                   unsigned long year;
3817                 };
3818
3819               where only a pointer to the "weekday" enumeration object is
3820               used - will not be returned, even though they have an identi‐
3821               fier. So for the above two enumerations, "enum_names" will
3822               return an empty list:
3823
3824                 @names = $c->enum_names;
3825
3826               The only way to retrieve a list of all enumeration identifiers
3827               is to use the "enum" method without additional arguments. You
3828               can get a list of all enumeration objects that have an identi‐
3829               fier by using
3830
                 @enums = map { $_->{identifier} || () } $c->enum;
3832
3833               but these may not have a definition. Thus, the two arrays would
3834               look like this:
3835
3836                 @names = ();
3837                 @enums = ('weekday');
3838
3839               The "def" method returns a true value for all identifiers
3840               returned by "enum_names".
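               The difference between the two lists can be sketched as
               follows. This is only an illustration: the "count"
               enumeration is an addition that is not part of the snippets
               above, included to show a named and defined enumeration next
               to the other two cases.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
enum { A, B, C };                 /* defined, but has no identifier */

struct foo {
  enum weekday *pWeekday;         /* has an identifier, never defined */
  unsigned long year;
};

enum count { ZERO, ONE, TWO };    /* defined and has an identifier */
ENDC

# Only enumerations that are both named and defined.
my @names = sort $c->enum_names;

# Every enumeration that has an identifier, defined or not.
my @enums = sort map { $_->{identifier} || () } $c->enum;

# Named enumerations that are referenced but lack a definition.
my %is_defined = map { $_ => 1 } @names;
my @undefined  = grep { !$is_defined{$_} } @enums;

print "defined:   @names\n";
print "all named: @enums\n";
print "undefined: @undefined\n";
```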
3841
3842       enum
3843
       "enum"
3845       "enum" LIST
3846               Returns a list of references to hashes containing detailed
3847               information about all enumerations that have been parsed.
3848
3849               If a list of enumeration identifiers is passed to the method,
3850               the returned list will only contain hash references for those
3851               enumerations. The enumeration identifiers may optionally be
3852               prefixed by "enum".
3853
3854               If an enumeration identifier cannot be found, the returned list
3855               will contain an undefined value at that position.
3856
               In scalar context, the number of enumerations is returned,
               unless exactly one argument is passed to the method. In that
               case, a hash reference holding information for that
               enumeration is returned instead.
3861
3862               The list returned by the "enum" method looks similar to this:
3863
3864                 @enum = (
3865                   {
3866                     'enumerators' => {
3867                       'SOCK_STREAM' => 1,
3868                       'SOCK_RAW' => 3,
3869                       'SOCK_SEQPACKET' => 5,
3870                       'SOCK_RDM' => 4,
3871                       'SOCK_PACKET' => 10,
3872                       'SOCK_DGRAM' => 2
3873                     },
3874                     'identifier' => '__socket_type',
3875                     'context' => 'definitions.c(13)',
3876                     'size' => 4,
3877                     'sign' => 0
3878                   }
3879                 );
3880
3881               "identifier"
3882                   holds the enumeration identifier. This key is not present
3883                   if the enumeration has no identifier.
3884
3885               "context"
3886                   is the context in which the enumeration is defined. This is
3887                   the filename followed by the line number in parentheses.
3888
3889               "enumerators"
3890                   is a reference to a hash table that holds all enumerators
3891                   of the enumeration.
3892
               "sign"
                   is a boolean indicating if the enumeration is signed (i.e.
                   has negative values).

               "size"
                   is the size of the enumeration.
3896
3897               One useful application may be to create a hash table that holds
3898               all enumerators of all defined enumerations:
3899
                 %enum = map %{ $_->{enumerators} || {} }, $c->enum;
3901
3902               The %enum hash table would then be:
3903
3904                 %enum = (
3905                   'SOCK_STREAM' => 1,
3906                   'SOCK_RAW' => 3,
3907                   'SOCK_SEQPACKET' => 5,
3908                   'SOCK_RDM' => 4,
3909                   'SOCK_DGRAM' => 2,
3910                   'SOCK_PACKET' => 10
3911                 );
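               The calling conventions described above can be sketched like
               this. The "EnumSize" setting and the variable names are
               assumptions made for the illustration; the reverse lookup
               table is just one possible use of the "enumerators" hash.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(EnumSize => 4)->parse(<<'ENDC');
enum __socket_type
{
  SOCK_STREAM    = 1,
  SOCK_DGRAM     = 2,
  SOCK_RAW       = 3,
  SOCK_RDM       = 4,
  SOCK_SEQPACKET = 5,
  SOCK_PACKET    = 10
};
ENDC

# Scalar context without arguments: the number of enumerations.
my $count = $c->enum;

# Scalar context with exactly one argument: a hash reference.
my $info = $c->enum('__socket_type');

# List context with an unknown identifier: undef at that position.
my ($missing) = $c->enum('no_such_enum');

# Reverse lookup table from value to enumerator name.
my %name_of = reverse %{ $info->{enumerators} };

printf "%d enumeration(s), value 3 is %s\n", $count, $name_of{3};
print "no_such_enum is unknown\n" unless defined $missing;
```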
3912
3913       compound_names
3914
3915       "compound_names"
3916               Returns a list of identifiers of all structs and unions (com‐
3917               pound data structures) that are defined in the parsed source
3918               code. Like enumerations, compounds don't need to have an iden‐
3919               tifier, nor do they need to be defined.
3920
               Again, the only way to retrieve information about all struct
               and union objects is to call the "compound" method without
               arguments. If you need a list of all struct and union
               identifiers, you can use:

                 @compound = map { $_->{identifier} || () } $c->compound;
3927
3928               The "def" method returns a true value for all identifiers
3929               returned by "compound_names".
3930
3931               If you need the names of only the structs or only the unions,
3932               use the "struct_names" and "union_names" methods respectively.
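               As a rough illustration, the three name methods can be
               compared side by side. The C declarations below are
               hypothetical and are not part of the example code used in
               this section:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct point { int x; int y; };
union  value { int i; char c[4]; };

typedef struct { int id; } anonymous_t;   /* struct without identifier */

struct list { struct node *head; };       /* "node" is never defined */
ENDC

# Defined compounds that have an identifier.
my @compounds = sort $c->compound_names;
my @structs   = sort $c->struct_names;
my @unions    = sort $c->union_names;

# All compounds that have an identifier; following the text above, this
# list may also contain compounds that are referenced but never defined.
my @all_named = sort map { $_->{identifier} || () } $c->compound;

print "compound_names: @compounds\n";
print "struct_names:   @structs\n";
print "union_names:    @unions\n";
print "all named:      @all_named\n";
```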
3933
3934       compound
3935
3936       "compound"
3937       "compound" LIST
3938               Returns a list of references to hashes containing detailed
3939               information about all compounds (structs and unions) that have
3940               been parsed.
3941
3942               If a list of struct/union identifiers is passed to the method,
3943               the returned list will only contain hash references for those
3944               compounds. The identifiers may optionally be prefixed by
3945               "struct" or "union", which limits the search to the specified
3946               kind of compound.
3947
3948               If an identifier cannot be found, the returned list will con‐
3949               tain an undefined value at that position.
3950
               In scalar context, the number of compounds is returned, unless
               exactly one argument is passed to the method. In that case, a
               hash reference holding information for that compound is
               returned instead.
3955
3956               The list returned by the "compound" method looks similar to
3957               this:
3958
3959                 @compound = (
3960                   {
3961                     'identifier' => 'STRUCT_SV',
3962                     'align' => 1,
3963                     'context' => 'definitions.c(23)',
3964                     'pack' => 0,
3965                     'type' => 'struct',
3966                     'declarations' => [
3967                       {
3968                         'declarators' => [
3969                           {
3970                             'declarator' => '*sv_any',
3971                             'size' => 4,
3972                             'offset' => 0
3973                           }
3974                         ],
3975                         'type' => 'void'
3976                       },
3977                       {
3978                         'declarators' => [
3979                           {
3980                             'declarator' => 'sv_refcnt',
3981                             'size' => 4,
3982                             'offset' => 4
3983                           }
3984                         ],
3985                         'type' => 'U32'
3986                       },
3987                       {
3988                         'declarators' => [
3989                           {
3990                             'declarator' => 'sv_flags',
3991                             'size' => 4,
3992                             'offset' => 8
3993                           }
3994                         ],
3995                         'type' => 'U32'
3996                       }
3997                     ],
3998                     'size' => 12
3999                   },
4000                   {
4001                     'identifier' => 'xxx',
4002                     'align' => 1,
4003                     'context' => 'definitions.c(31)',
4004                     'pack' => 0,
4005                     'type' => 'struct',
4006                     'declarations' => [
4007                       {
4008                         'declarators' => [
4009                           {
4010                             'declarator' => 'a',
4011                             'size' => 4,
4012                             'offset' => 0
4013                           }
4014                         ],
4015                         'type' => 'int'
4016                       },
4017                       {
4018                         'declarators' => [
4019                           {
4020                             'declarator' => 'b',
4021                             'size' => 4,
4022                             'offset' => 4
4023                           }
4024                         ],
4025                         'type' => 'int'
4026                       }
4027                     ],
4028                     'size' => 8
4029                   },
4030                   {
4031                     'align' => 1,
4032                     'context' => 'definitions.c(29)',
4033                     'pack' => 0,
4034                     'type' => 'union',
4035                     'declarations' => [
4036                       {
4037                         'declarators' => [
4038                           {
4039                             'declarator' => 'abc[2]',
4040                             'size' => 8,
4041                             'offset' => 0
4042                           }
4043                         ],
4044                         'type' => 'int'
4045                       },
4046                       {
4047                         'declarators' => [
4048                           {
4049                             'declarator' => 'ab[3][4]',
4050                             'size' => 96,
4051                             'offset' => 0
4052                           }
4053                         ],
4054                         'type' => 'struct xxx'
4055                       },
4056                       {
4057                         'declarators' => [
4058                           {
4059                             'declarator' => 'ptr',
4060                             'size' => 4,
4061                             'offset' => 0
4062                           }
4063                         ],
4064                         'type' => 'any'
4065                       }
4066                     ],
4067                     'size' => 96
4068                   }
4069                 );
4070
4071               "identifier"
4072                   holds the struct or union identifier. This key is not
4073                   present if the compound has no identifier.
4074
4075               "context"
4076                   is the context in which the struct or union is defined.
4077                   This is the filename followed by the line number in paren‐
4078                   theses.
4079
4080               "type"
4081                   is either 'struct' or 'union'.
4082
4083               "size"
4084                   is the size of the struct or union.
4085
4086               "align"
4087                   is the alignment of the struct or union.
4088
4089               "pack"
4090                   is the struct member alignment if the compound is packed,
4091                   or zero otherwise.
4092
4093               "declarations"
4094                   is an array of hash references describing each struct dec‐
4095                   laration:
4096
4097                   "type"
4098                       is the type of the struct declaration. This may be a
4099                       string or a reference to a hash describing the type.
4100
4101                   "declarators"
4102                       is an array of hashes describing each declarator:
4103
4104                       "declarator"
4105                           is a string representation of the declarator.
4106
4107                       "offset"
4108                           is the offset of the struct member represented by
4109                           the current declarator relative to the beginning of
4110                           the struct or union.
4111
4112                       "size"
4113                           is the size occupied by the struct member repre‐
4114                           sented by the current declarator.
4115
4116               It may be useful to have separate lists for structs and unions.
4117               One way to retrieve such lists would be to use
4118
4119                 push @{$_->{type} eq 'union' ? \@unions : \@structs}, $_
4120                     for $c->compound;
4121
               However, the "struct" and "union" methods make this a lot
               simpler:
4124
4125                 @structs = $c->struct;
4126                 @unions  = $c->union;
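               The nested "declarations" and "declarators" arrays can be
               walked to print a member layout. The following is a sketch
               only; the type sizes passed to "new" are assumptions chosen
               so that the result matches the listing above, and @layout is
               a hypothetical helper variable.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
  IntSize => 4, LongSize => 4, PointerSize => 4, Alignment => 4,
)->parse(<<'ENDC');
typedef unsigned long U32;

struct STRUCT_SV {
  void *sv_any;
  U32   sv_refcnt;
  U32   sv_flags;
};
ENDC

# Exactly one argument in scalar context yields a hash reference.
my $sv = $c->struct('STRUCT_SV');

my @layout;
for my $decl (@{ $sv->{declarations} }) {
  # "type" is a plain string, or a hash reference for inline types.
  my $type = ref $decl->{type} ? '(inline type)' : $decl->{type};
  for my $d (@{ $decl->{declarators} }) {
    push @layout, sprintf '%-6s %-10s offset %2d, size %2d',
                          $type, $d->{declarator}, $d->{offset}, $d->{size};
  }
}
print "$_\n" for @layout;

# A "union" prefix limits the search to unions, so this finds nothing.
my ($not_found) = $c->compound('union STRUCT_SV');
print "no union named STRUCT_SV\n" unless defined $not_found;
```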
4127
4128       struct_names
4129
4130       "struct_names"
               Returns a list of all defined struct identifiers.  This works
               like "compound_names", except that it returns only the struct
               identifiers and omits the union identifiers.
4135
4136       struct
4137
4138       "struct"
4139       "struct" LIST
4140               Like the "compound" method, but only allows for structs.
4141
4142       union_names
4143
4144       "union_names"
               Returns a list of all defined union identifiers.  This works
               like "compound_names", except that it returns only the union
               identifiers and omits the struct identifiers.
4149
4150       union
4151
4152       "union"
4153       "union" LIST
4154               Like the "compound" method, but only allows for unions.
4155
4156       typedef_names
4157
4158       "typedef_names"
4159               Returns a list of all defined typedef identifiers. Typedefs
4160               that do not specify a type that you could actually work with
4161               will not be returned.
4162
4163               The "def" method returns a true value for all identifiers
4164               returned by "typedef_names".
4165
4166       typedef
4167
4168       "typedef"
4169       "typedef" LIST
4170               Returns a list of references to hashes containing detailed
4171               information about all typedefs that have been parsed.
4172
4173               If a list of typedef identifiers is passed to the method, the
4174               returned list will only contain hash references for those type‐
4175               defs.
4176
4177               If an identifier cannot be found, the returned list will con‐
4178               tain an undefined value at that position.
4179
               In scalar context, the number of typedefs is returned, unless
               exactly one argument is passed to the method. In that case, a
               hash reference holding information for that typedef is
               returned instead.
4184
4185               The list returned by the "typedef" method looks similar to
4186               this:
4187
4188                 @typedef = (
4189                   {
4190                     'declarator' => 'U32',
4191                     'type' => 'unsigned long'
4192                   },
4193                   {
4194                     'declarator' => '*any',
4195                     'type' => 'void'
4196                   },
4197                   {
4198                     'declarator' => 'test',
4199                     'type' => {
4200                       'align' => 1,
4201                       'context' => 'definitions.c(29)',
4202                       'pack' => 0,
4203                       'type' => 'union',
4204                       'declarations' => [
4205                         {
4206                           'declarators' => [
4207                             {
4208                               'declarator' => 'abc[2]',
4209                               'size' => 8,
4210                               'offset' => 0
4211                             }
4212                           ],
4213                           'type' => 'int'
4214                         },
4215                         {
4216                           'declarators' => [
4217                             {
4218                               'declarator' => 'ab[3][4]',
4219                               'size' => 96,
4220                               'offset' => 0
4221                             }
4222                           ],
4223                           'type' => 'struct xxx'
4224                         },
4225                         {
4226                           'declarators' => [
4227                             {
4228                               'declarator' => 'ptr',
4229                               'size' => 4,
4230                               'offset' => 0
4231                             }
4232                           ],
4233                           'type' => 'any'
4234                         }
4235                       ],
4236                       'size' => 96
4237                     }
4238                   }
4239                 );
4240
4241               "declarator"
4242                   is the type declarator.
4243
4244               "type"
                   is the type specification. This may be a string or a
                   reference to a hash describing the type.  See "enum" and
                   "compound" for a description of how to interpret this
                   hash.
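               Because "type" can be either a string or a hash reference,
               code that inspects typedefs usually has to check for both. A
               small sketch, using a shortened version of the example code
               and a hypothetical %kind hash:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
  IntSize => 4, LongSize => 4, PointerSize => 4,
)->parse(<<'ENDC');
typedef unsigned long U32;
typedef void *any;

typedef union {
  int abc[2];
  any ptr;
} test;
ENDC

my @names = sort $c->typedef_names;

my %kind;
for my $name (@names) {
  # Exactly one argument in scalar context yields a hash reference.
  my $td = $c->typedef($name);

  # A hash reference means the type is an inline struct, union or enum.
  $kind{$name} = ref $td->{type} ? 'inline ' . ($td->{type}{type} || 'enum')
                                 : $td->{type};

  printf "%-5s declarator %-5s type %s\n",
         $name, $td->{declarator}, $kind{$name};
}
```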
4248
4249       macro_names
4250
4251       "macro_names"
4252               Returns a list of all defined macro names.
4253
4254               The list returned by the "macro_names" method looks similar to
4255               this:
4256
4257                 @macro_names = (
4258                   '__STDC_VERSION__',
4259                   '__STDC_HOSTED__',
4260                   'DEFINED',
4261                   'MULTIPLY',
4262                   'ABC_SIZE'
4263                 );
4264
4265               This works only as long as the preprocessor is not reset.  See
4266               "Preprocessor configuration" for details.
4267
4268       macro
4269
4270       "macro"
4271       "macro" LIST
4272               Returns the definitions for all defined macros.
4273
4274               If a list of macro names is passed to the method, the returned
4275               list will only contain the definitions for those macros. For
4276               undefined macros, "undef" will be returned.
4277
4278               The list returned by the "macro" method looks similar to this:
4279
4280                 @macro = (
4281                   '__STDC_VERSION__ 199901L',
4282                   '__STDC_HOSTED__ 1',
4283                   'DEFINED',
4284                   'MULTIPLY(x, y) ((x)*(y))',
4285                   'ABC_SIZE 2'
4286                 );
4287
4288               This works only as long as the preprocessor is not reset.  See
4289               "Preprocessor configuration" for details.
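               The "macro_names" and "macro" methods can be combined to
               build a lookup table from macro name to macro body. This is
               a sketch only; stripping the name from the front of the
               definition string is an assumption based on the format shown
               above, and %body is a hypothetical variable.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
#define ABC_SIZE 2
#define MULTIPLY(x, y) ((x)*(y))
ENDC

my %body;
for my $name ($c->macro_names) {
  my ($def) = $c->macro($name);

  # Each definition starts with the macro name; keep what follows it.
  (my $rest = $def) =~ s/^\Q$name\E\s*//;
  $body{$name} = $rest;
}

# Undefined macros yield undef.
my ($undefined) = $c->macro('NO_SUCH_MACRO');

print "ABC_SIZE expands to '$body{ABC_SIZE}'\n";
print "MULTIPLY is defined as '$body{MULTIPLY}'\n";
print "NO_SUCH_MACRO is not defined\n" unless defined $undefined;
```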
4290

FUNCTIONS

4292       You can alternatively call the following functions as methods on Con‐
4293       vert::Binary::C objects.
4294
4295       feature
4296
4297       "feature" STRING
4298               Checks if Convert::Binary::C was built with certain features.
4299               For example,
4300
4301                 print "debugging version"
4302                     if Convert::Binary::C::feature('debug');
4303
4304               will check if Convert::Binary::C was built with debugging sup‐
4305               port enabled. The "feature" function returns 1 if the feature
4306               is enabled, 0 if the feature is disabled, and "undef" if the
4307               feature is unknown. Currently the only features that can be
4308               checked are "ieeefp" and "debug".
4309
4310               You can enable or disable certain features at compile time of
4311               the module by using the
4312
4313                 perl Makefile.PL enable-feature disable-feature
4314
4315               syntax.
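               The three possible return values can be told apart as
               sketched below. The feature name "no-such-feature" is made
               up to demonstrate the "undef" case.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my %state;
for my $name (qw( ieeefp debug no-such-feature )) {
  my $value = Convert::Binary::C::feature($name);

  # undef: unknown feature, 0: disabled, 1: enabled
  $state{$name} = !defined($value) ? 'unknown'
                : $value           ? 'enabled'
                :                    'disabled';

  printf "%-16s %s\n", $name, $state{$name};
}
```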
4316
4317       native
4318
4319       "native"
4320       "native" STRING
4321               Returns the value of a property of the native system that Con‐
4322               vert::Binary::C was built on. For example,
4323
4324                 $size = Convert::Binary::C::native('IntSize');
4325
4326               will fetch the size of an "int" on the native system.  The fol‐
4327               lowing properties can be queried:
4328
4329                 Alignment
4330                 ByteOrder
4331                 CharSize
4332                 CompoundAlignment
4333                 DoubleSize
4334                 EnumSize
4335                 FloatSize
4336                 HostedC
4337                 IntSize
4338                 LongDoubleSize
4339                 LongLongSize
4340                 LongSize
4341                 PointerSize
4342                 ShortSize
4343                 StdCVersion
4344                 UnsignedBitfields
4345                 UnsignedChars
4346
4347               You can also call "native" without arguments, in which case it
4348               will return a reference to a hash with all properties, like:
4349
4350                 $native = {
4351                   'StdCVersion' => undef,
4352                   'ByteOrder' => 'LittleEndian',
4353                   'LongSize' => 4,
4354                   'IntSize' => 4,
4355                   'HostedC' => 1,
4356                   'ShortSize' => 2,
4357                   'UnsignedChars' => 0,
4358                   'DoubleSize' => 8,
4359                   'CharSize' => 1,
4360                   'EnumSize' => 4,
4361                   'PointerSize' => 4,
4362                   'FloatSize' => 4,
4363                   'LongLongSize' => 8,
4364                   'Alignment' => 4,
4365                   'LongDoubleSize' => 12,
4366                   'UnsignedBitfields' => 0,
4367                   'CompoundAlignment' => 1
4368                 };
4369
               The contents of that hash are suitable for passing to the
               "configure" method.
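               For instance, an object that mimics the native system could
               be set up as sketched below. Reading a single option back by
               passing only its name to "configure" is used here purely as
               a check.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Fetch all native properties at once.
my $native = Convert::Binary::C::native();

# Apply them to a fresh object.
my $c = Convert::Binary::C->new;
$c->configure(%$native);

# Read two of the options back to confirm they were applied.
my $int_size   = $c->configure('IntSize');
my $byte_order = $c->configure('ByteOrder');

print "int has $int_size bytes, byte order is $byte_order\n";
```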
4372

DEBUGGING

       Like perl itself, Convert::Binary::C can be compiled with debugging
       support that can then be selectively enabled at runtime. You can
       specify whether you would like to build Convert::Binary::C with
       debugging support by explicitly passing an argument to Makefile.PL.
       Use
4378
4379         perl Makefile.PL enable-debug
4380
4381       to enable debugging, or
4382
4383         perl Makefile.PL disable-debug
4384
4385       to disable debugging. The default will depend on how your perl binary
4386       was built. If it was built with "-DDEBUGGING", Convert::Binary::C will
4387       be built with debugging support, too.
4388
4389       Once you have built Convert::Binary::C with debugging support, you can
4390       use the following syntax to enable debug output. Instead of
4391
4392         use Convert::Binary::C;
4393
4394       you simply say
4395
4396         use Convert::Binary::C debug => 'all';
4397
       which will enable all debug output. However, enabling all debug output
       is not recommended, because it can produce a fairly large amount of
       data.
4400
4401       Debugging options
4402
4403       Instead of saying "all", you can pass a string that consists of one or
4404       more of the following characters:
4405
4406         m   enable memory allocation tracing
4407         M   enable memory allocation & assertion tracing
4408
4409         h   enable hash table debugging
4410         H   enable hash table dumps
4411
4412         d   enable debug output from the XS module
4413         c   enable debug output from the ctlib
4414         t   enable debug output about type objects
4415
4416         l   enable debug output from the C lexer
4417         p   enable debug output from the C parser
4418         P   enable debug output from the C preprocessor
4419         r   enable debug output from the #pragma parser
4420
4421         y   enable debug output from yacc (bison)
4422
4423       So the following might give you a brief overview of what's going on
4424       inside Convert::Binary::C:
4425
4426         use Convert::Binary::C debug => 'dct';
4427
4428       When you want to debug memory allocation using
4429
4430         use Convert::Binary::C debug => 'm';
4431
4432       you can use the Perl script check_alloc.pl that resides in the
4433       ctlib/util/tool directory to extract statistics about memory usage and
4434       information about memory leaks from the resulting debug output.
4435
4436       Redirecting debug output
4437
4438       By default, all debug output is written to "stderr". You can, however,
4439       redirect the debug output to a file with the "debugfile" option:
4440
4441         use Convert::Binary::C debug     => 'dcthHm',
4442                                debugfile => './debug.out';
4443
       If the file cannot be opened, you'll receive a warning and the output
       will be written to "stderr" instead.
4446
4447       Alternatively, you can use the environment variables "CBC_DEBUG_OPT"
4448       and "CBC_DEBUG_FILE" to turn on debug output.
4449
4450       If Convert::Binary::C is built without debugging support, passing the
4451       "debug" or "debugfile" options will cause a warning to be issued. The
4452       corresponding environment variables will simply be ignored.
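       To avoid that warning in code that must run with both kinds of
       builds, the debug options can be passed only when the "debug" feature
       is available. The following is a sketch, not an officially documented
       idiom: it relies on the general Perl rule that "use Module LIST"
       calls "Module->import(LIST)", and the variable $have_debug is
       hypothetical.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $have_debug;

BEGIN {
  # 1 if built with debugging support, 0 otherwise
  $have_debug = Convert::Binary::C::feature('debug');

  # Equivalent to passing the options on the "use" line, but conditional.
  Convert::Binary::C->import(debug => 'dct', debugfile => './debug.out')
    if $have_debug;
}

print $have_debug ? "debug output goes to ./debug.out\n"
                  : "built without debugging support\n";
```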
4453

ENVIRONMENT

4455       "CBC_ORDER_MEMBERS"
4456
4457       Setting this variable to a non-zero value will globally turn on hash
4458       key ordering for compound members. Have a look at the "OrderMembers"
4459       option for details.
4460
       Setting the variable to the name of a perl module will additionally
       tie the member hashes to that module instead of one of the predefined
       modules for member ordering.
4464
4465       "CBC_DEBUG_OPT"
4466
4467       If Convert::Binary::C is built with debugging support, you can use this
4468       variable to specify the debugging options.
4469
4470       "CBC_DEBUG_FILE"
4471
4472       If Convert::Binary::C is built with debugging support, you can use this
4473       variable to redirect the debug output to a file.
4474
4475       "CBC_DISABLE_PARSER"
4476
4477       This variable is intended purely for development. Setting it to a non-
4478       zero value disables the Convert::Binary::C parser, which means that no
4479       information is collected from the file or code that is parsed. However,
4480       the preprocessor will run, which is useful for benchmarking the pre‐
4481       processor.
4482

FLEXIBLE ARRAY MEMBERS AND INCOMPLETE TYPES

4484       Flexible array members are a feature introduced with ISO-C99.  It's a
4485       common problem that you have a variable length data field at the end of
4486       a structure, for example an array of characters at the end of a message
4487       struct. ISO-C99 allows you to write this as:
4488
4489         struct message {
4490           long header;
4491           char data[];
4492         };
4493
4494       The advantage is that you clearly indicate that the size of the
4495       appended data is variable, and that the "data" member doesn't contrib‐
4496       ute to the size of the "message" structure.
4497
4498       When packing or unpacking data, Convert::Binary::C deals with flexible
4499       array members as if their length was adjustable. For example, "unpack"
4500       will adapt the length of the array depending on the input string:
4501
4502         $msg1 = $c->unpack('message', 'abcdefg');
4503         $msg2 = $c->unpack('message', 'abcdefghijkl');
4504
4505       The following data is unpacked:
4506
4507         $msg1 = {
4508           'data' => [
4509             101,
4510             102,
4511             103
4512           ],
4513           'header' => 1633837924
4514         };
4515         $msg2 = {
4516           'data' => [
4517             101,
4518             102,
4519             103,
4520             104,
4521             105,
4522             106,
4523             107,
4524             108
4525           ],
4526           'header' => 1633837924
4527         };
4528
       Similarly, "pack" will adjust the length of the output string according
4530       to the data you feed in:
4531
4532         use Data::Hexdumper;
4533
4534         $msg = {
4535           header => 4711,
4536           data   => [0x10, 0x20, 0x30, 0x40, 0x77..0x88],
4537         };
4538
4539         $data = $c->pack('message', $msg);
4540
4541         print hexdump(data => $data);
4542
4543       This would print:
4544
           0x0000 : 00 00 12 67 10 20 30 40 77 78 79 7A 7B 7C 7D 7E : ...g..0@wxyz{|}~
4546           0x0010 : 7F 80 81 82 83 84 85 86 87 88                   : ..........
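       The same round trip can be sketched without Data::Hexdumper, using
       only Perl's built-in "unpack" to format the bytes. The configuration
       options passed to "new" are assumptions chosen to reproduce the
       big-endian output with a 4-byte "long" shown above; "UnsignedChars"
       is enabled so that values above 0x7F survive the round trip
       unchanged.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
  ByteOrder => 'BigEndian', LongSize => 4, Alignment => 4,
  UnsignedChars => 1,
)->parse(<<'ENDC');
struct message {
  long header;
  char data[];
};
ENDC

# The flexible array member does not contribute to the struct size.
my $fixed = $c->sizeof('message');

my $msg = {
  header => 4711,
  data   => [0x10, 0x20, 0x30, 0x40, 0x77 .. 0x88],
};

# The packed string grows with the number of array elements.
my $packed = $c->pack('message', $msg);

printf "sizeof = %d, packed length = %d\n", $fixed, length $packed;
print join(' ', map { sprintf '%02X', $_ } unpack 'C*', $packed), "\n";

# Unpacking adapts the array length to the input string.
my $back = $c->unpack('message', $packed);
printf "unpacked %d data bytes\n", scalar @{ $back->{data} };
```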
4547
4548       Incomplete types such as
4549
4550         typedef unsigned long array[];
4551
4552       are handled in exactly the same way. Thus, you can easily
4553
4554         $array = $c->unpack('array', '?'x20);
4555
4556       which will unpack the following array:
4557
4558         $array = [
4559           1061109567,
4560           1061109567,
4561           1061109567,
4562           1061109567,
4563           1061109567
4564         ];
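       A short sketch of how the number of unpacked elements follows the
       length of the input. The byte order and "long" size are assumptions
       chosen to reproduce the values shown above; the input lengths are
       arbitrary multiples of the element size.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
  ByteOrder => 'BigEndian', LongSize => 4,
)->parse('typedef unsigned long array[];');

my %elements;
for my $bytes (8, 20) {
  # Each element consumes four bytes of the input string.
  my $array = $c->unpack('array', '?' x $bytes);
  $elements{$bytes} = $array;
  printf "%2d bytes -> %d elements\n", $bytes, scalar @$array;
}
```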
4565
4566       You can also alter the length of an array using the "Dimension" tag.
4567

FLOATING POINT VALUES

4569       When using Convert::Binary::C to handle floating point values, you have
4570       to be aware of some limitations.
4571
4572       You're usually safe if all your platforms are using the IEEE floating
4573       point format. During the Convert::Binary::C build process, the "ieeefp"
4574       feature will automatically be enabled if the host is using IEEE float‐
4575       ing point. You can check for this feature at runtime using the "fea‐
4576       ture" function:
4577
4578         if (Convert::Binary::C::feature('ieeefp')) {
4579           # do something
4580         }
4581
       When IEEE floating point support is enabled, the module can also handle
       floating point values of a different byte order.
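
       To illustrate what a different byte order means for IEEE values, the
       following sketch uses only Perl's built-in pack and unpack, not
       Convert::Binary::C. It assumes that the host stores a C float as a
       4-byte IEEE 754 single precision value, and the "<" and ">" modifiers
       require perl 5.10 or later.

```perl
use strict;
use warnings;

my $value = 1.5;                     # exactly representable in binary

my $little = pack('f<', $value);     # little-endian byte layout
my $big    = pack('f>', $value);     # big-endian byte layout

# On an IEEE host the two layouts differ only in the order of their
# bytes, so reversing one yields the other.
my $swapped = scalar reverse $little;

printf "big-endian: %s, swapped little-endian: %s\n",
    unpack('H*', $big), unpack('H*', $swapped);
```

       On a host with a non-IEEE format, a plain byte swap like this would
       not yield a meaningful value.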
4584
4585       If your host platform is not using IEEE floating point, the "ieeefp"
4586       feature will be disabled. Convert::Binary::C then will be more restric‐
4587       tive, refusing to handle any non-native floating point values.
4588
4589       However, Convert::Binary::C cannot detect the floating point format
4590       used by your target platform. It can only try to prevent problems in
4591       obvious cases. If you know your target platform has a completely dif‐
4592       ferent floating point format, don't use floating point conversion at
4593       all.
4594
4595       Whenever Convert::Binary::C detects that it cannot properly do floating
4596       point value conversion, it will issue a warning and will not attempt to
4597       convert the floating point value.
4598

BITFIELDS

4600       Bitfield support in Convert::Binary::C is currently in an experimental
4601       state. You are encouraged to test it, but you should not blindly rely
4602       on its results.
4603
       You are also encouraged to supply layout algorithms for compilers
       whose bitfield implementation is not handled correctly at the moment.
       Even better than a plain algorithm is, of course, a patch that adds a
       new bitfield layout engine.
4608
4609       While bitfields may not be handled correctly by the conversion routines
4610       yet, they are always parsed correctly. This means that you can reliably
4611       use the declarator fields as returned by the "struct" or "typedef"
4612       methods.  Given the following source
4613
4614         struct bitfield {
4615           int seven:7;
4616           int :1;
4617           int four:4, :0;
4618           int integer;
4619         };
4620
4621       a call to "struct" will return
4622
4623         @struct = (
4624           {
4625             'identifier' => 'bitfield',
4626             'align' => 1,
4627             'context' => 'bitfields.c(1)',
4628             'pack' => 0,
4629             'type' => 'struct',
4630             'declarations' => [
4631               {
4632                 'declarators' => [
4633                   {
4634                     'declarator' => 'seven:7'
4635                   }
4636                 ],
4637                 'type' => 'int'
4638               },
4639               {
4640                 'declarators' => [
4641                   {
4642                     'declarator' => ':1'
4643                   }
4644                 ],
4645                 'type' => 'int'
4646               },
4647               {
4648                 'declarators' => [
4649                   {
4650                     'declarator' => 'four:4'
4651                   },
4652                   {
4653                     'declarator' => ':0'
4654                   }
4655                 ],
4656                 'type' => 'int'
4657               },
4658               {
4659                 'declarators' => [
4660                   {
4661                     'declarator' => 'integer',
4662                     'size' => 4,
4663                     'offset' => 4
4664                   }
4665                 ],
4666                 'type' => 'int'
4667               }
4668             ],
4669             'size' => 8
4670           }
4671         );
4672
4673       No size/offset keys will currently be returned for bitfield entries.
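
       Since the declarator strings are reliable, the bit widths can be
       extracted from them. The following sketch works on a hand-copied
       excerpt of the structure shown above, so it runs without
       Convert::Binary::C. The helper name bitfield_info is hypothetical,
       and the regular expression only covers the "name:width" forms that
       appear in this example.

```perl
use strict;
use warnings;

# Excerpt of the 'declarations' list from the structure shown above.
my @declarations = (
    { type => 'int', declarators => [ { declarator => 'seven:7' } ] },
    { type => 'int', declarators => [ { declarator => ':1' } ] },
    { type => 'int', declarators => [ { declarator => 'four:4' },
                                      { declarator => ':0' } ] },
    { type => 'int', declarators => [ { declarator => 'integer',
                                        size => 4, offset => 4 } ] },
);

# Hypothetical helper: split a declarator into name and bit width.
# Returns an empty list for declarators that are not bitfields.
sub bitfield_info {
    my ($declarator) = @_;
    return $declarator =~ /^(\w*):(\d+)$/ ? ($1, $2) : ();
}

my @bitfields;
for my $decl (@declarations) {
    for my $d (@{ $decl->{declarators} }) {
        my @info = bitfield_info($d->{declarator});
        next unless @info;    # skip ordinary members such as 'integer'
        push @bitfields,
            { name => $info[0], bits => $info[1], type => $decl->{type} };
    }
}

printf "%-10s %s, %d bit(s)\n",
    (length $_->{name} ? $_->{name} : '(unnamed)'), $_->{type}, $_->{bits}
    for @bitfields;
```

       Unnamed bitfields such as ":1" and ":0" come back with an empty name,
       so they can be told apart from named members.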
4674

MULTITHREADING

4676       Convert::Binary::C was designed to be thread-safe.
4677

INHERITANCE

4679       If you wish to derive a new class from Convert::Binary::C, this is rel‐
4680       atively easy. Despite their XS implementation, Convert::Binary::C
4681       objects are actually blessed hash references.
4682
4683       The XS data is stored in a read-only hash value for the key that is the
4684       empty string. So it is safe to use any non-empty hash key when deriving
4685       your own class.  In addition, Convert::Binary::C does quite a lot of
4686       checks to detect corruption in the object hash.
4687
4688       If you store private data in the hash, you should override the "clone"
4689       method and provide the necessary code to clone your private data.
4690       You'll have to call "SUPER::clone", but this will only clone the Con‐
4691       vert::Binary::C part of the object.
4692
4693       For an example of a derived class, you can have a look at Con‐
4694       vert::Binary::C::Cached.
4695

PORTABILITY

4697       Convert::Binary::C should build and run on most of the platforms that
4698       Perl runs on:
4699
4700       ·   Various Linux systems
4701
4702       ·   Various BSD systems
4703
4704       ·   HP-UX
4705
4706       ·   Compaq/HP Tru64 Unix
4707
       ·   Mac OS X
4709
4710       ·   Cygwin
4711
4712       ·   Windows 98/NT/2000/XP
4713
4714       Also, many architectures are supported:
4715
4716       ·   Various Intel Pentium and Itanium systems
4717
4718       ·   Various Alpha systems
4719
4720       ·   HP PA-RISC
4721
       ·   PowerPC
4723
4724       ·   StrongARM
4725
4726       The module should build with any perl binary from 5.004 up to the lat‐
4727       est development version.
4728

COMPARISON WITH SIMILAR MODULES

4730       Most of the time when you're really looking for Convert::Binary::C
4731       you'll actually end up finding one of the following modules. Some of
4732       them have different goals, so it's probably worth pointing out the dif‐
4733       ferences.
4734
4735       C::Include
4736
       Like Convert::Binary::C, this module aims at doing conversion from and
       to binary data based on C types.  However, its configurability is very
       limited compared to Convert::Binary::C. Also, it does not parse all C
       code correctly. It's slower than Convert::Binary::C and doesn't have a
       preprocessor. On the plus side, it's written in pure Perl.
4742
4743       C::DynaLib::Struct
4744
4745       This module doesn't allow you to reuse your C source code. One main
4746       goal of Convert::Binary::C was to avoid code duplication or, even
4747       worse, having to maintain different representations of your data struc‐
4748       tures.  Like C::Include, C::DynaLib::Struct is rather limited in its
4749       configurability.
4750
4751       Win32::API::Struct
4752
4753       This module has a special purpose. It aims at building structs for
4754       interfacing Perl code with Windows API code.
4755

CREDITS

4757       · My love Jennifer for always being there, for filling my life with joy
4758         and last but not least for proofreading the documentation.
4759
4760       · Alain Barbet <alian@cpan.org> for testing and debugging support.
4761
4762       · Mitchell N. Charity for giving me pointers into various interesting
4763         directions.
4764
4765       · Alexis Denis for making me improve (externally) and simplify (inter‐
4766         nally) floating point support. He can also be blamed (indirectly) for
4767         the "initializer" method, as I need it in my effort to support bit‐
4768         fields some day.
4769
4770       · Michael J. Hohmann <mjh@scientist.de> for endless discussions on our
4771         way to and back home from work, and for making me think about sup‐
4772         porting "pack" and "unpack" for compound members.
4773
4774       · Thorsten Jens <thojens@gmx.de> for testing the package on various
4775         platforms.
4776
4777       · Mark Overmeer <mark@overmeer.net> for suggesting the module name and
4778         giving invaluable feedback.
4779
4780       · Thomas Pornin <pornin@bolet.org> for his excellent "ucpp" preproces‐
4781         sor library.
4782
4783       · Marc Rosenthal for his suggestions and support.
4784
4785       · James Roskind, as his C parser was a great starting point to fix all
4786         the problems I had with my original parser based only on the ANSI
4787         ruleset.
4788
4789       · Gisbert W. Selke for spotting some interesting bugs and providing
4790         extensive reports.
4791
4792       · Steffen Zimmermann for a prolific discussion on the cloning algo‐
4793         rithm.
4794

MAILING LIST

4796       There's also a mailing list that you can join:
4797
4798         convert-binary-c@yahoogroups.com
4799
4800       To subscribe, simply send mail to:
4801
4802         convert-binary-c-subscribe@yahoogroups.com
4803
4804       You can use this mailing list for non-bug problems, questions or dis‐
4805       cussions.
4806

BUGS

       I'm sure there are still lots of bugs in the code for this module. If
       you find any bugs, if Convert::Binary::C doesn't seem to build on your
       system, or if any of its tests fail, please use the CPAN Request
       Tracker at <http://rt.cpan.org/> to create a ticket for the module.
       Alternatively, just send a mail to <mhx@cpan.org>.
4813

EXPERIMENTAL FEATURES

       Some features in Convert::Binary::C are marked as experimental.  This
       is most probably for one of the following reasons:
4817
4818       · The feature does not behave in exactly the way that I wish it did,
4819         possibly due to some limitations in the current design of the module.
4820
4821       · The feature hasn't been tested enough and may completely fail to pro‐
4822         duce the expected results.
4823
4824       I hope to fix most issues with these experimental features someday, but
4825       this may mean that I have to change the way they currently work in a
4826       way that's not backwards compatible.  So if any of these features is
4827       useful to you, you can use it, but you should be aware that the behav‐
4828       iour or the interface may change in future releases of this module.
4829

TODO

4831       If you're interested in what I currently plan to improve (or fix), have
4832       a look at the TODO file.
4833

POSTCARDS

4835       If you're using my module and like it, you can show your appreciation
4836       by sending me a postcard from where you live. I won't urge you to do
4837       it, it's completely up to you. To me, this is just a very nice way of
4838       receiving feedback about my work. Please send your postcard to:
4839
4840         Marcus Holland-Moritz
4841         Kuppinger Weg 28
4842         71116 Gaertringen
4843         GERMANY
4844
       If you feel that sending a postcard is too much effort, you may want
       to rate the module at <http://cpanratings.perl.org/>.
4847

COPYRIGHT

4849       Copyright (c) 2002-2008 Marcus Holland-Moritz. All rights reserved.
4850       This program is free software; you can redistribute it and/or modify it
4851       under the same terms as Perl itself.
4852
4853       The "ucpp" library is (c) 1998-2002 Thomas Pornin. For license and
4854       redistribution details refer to ctlib/ucpp/README.
4855
4856       Portions copyright (c) 1989, 1990 James A. Roskind.
4857
       The include files located in tests/include/include, which are used in
       some of the test scripts, are (c) 1991-1999, 2000, 2001 Free Software
       Foundation, Inc. They are neither required to create the binary nor
       linked to the source code of this module in any other way.
4862

SEE ALSO

4864       See ccconfig, perl, perldata, perlop, perlvar, Data::Dumper and
4865       Scalar::Util.
4866
4867
4868
4869perl v5.8.8                       2008-04-15             Convert::Binary::C(3)