Convert::Binary::C(3)   User Contributed Perl Documentation   Convert::Binary::C(3)

NAME
Convert::Binary::C - Binary Data Conversion using C Types
7
SYNOPSIS
9 Simple
10 use Convert::Binary::C;
11
12 #---------------------------------------------
13 # Create a new object and parse embedded code
14 #---------------------------------------------
15 my $c = Convert::Binary::C->new->parse(<<ENDC);
16
17 enum Month { JAN, FEB, MAR, APR, MAY, JUN,
18 JUL, AUG, SEP, OCT, NOV, DEC };
19
20 struct Date {
21 int year;
22 enum Month month;
23 int day;
24 };
25
26 ENDC
27
28 #-----------------------------------------------
29 # Pack Perl data structure into a binary string
30 #-----------------------------------------------
31 my $date = { year => 2002, month => 'DEC', day => 24 };
32
33 my $packed = $c->pack('Date', $date);
34
35 Advanced
36 use Convert::Binary::C;
37 use Data::Dumper;
38
39 #---------------------
40 # Create a new object
41 #---------------------
42 my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian');
43
44 #---------------------------------------------------
45 # Add include paths and global preprocessor defines
46 #---------------------------------------------------
47 $c->Include('/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include',
48 '/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include-fixed',
49 '/usr/include')
50 ->Define(qw( __USE_POSIX __USE_ISOC99=1 ));
51
52 #----------------------------------
53 # Parse the 'time.h' header file
54 #----------------------------------
55 $c->parse_file('time.h');
56
57 #---------------------------------------
58 # See which files the object depends on
59 #---------------------------------------
60 print Dumper([$c->dependencies]);
61
62 #-----------------------------------------------------------
63 # See if struct timespec is defined and dump its definition
64 #-----------------------------------------------------------
65 if ($c->def('struct timespec')) {
66 print Dumper($c->struct('timespec'));
67 }
68
69 #-------------------------------
70 # Create some binary dummy data
71 #-------------------------------
72 my $data = "binary_test_string";
73
74 #--------------------------------------------------------
75 # Unpack $data according to 'struct timespec' definition
76 #--------------------------------------------------------
77 if (length($data) >= $c->sizeof('timespec')) {
78 my $perl = $c->unpack('timespec', $data);
79 print Dumper($perl);
80 }
81
82 #--------------------------------------------------------
83 # See which member lies at offset 5 of 'struct timespec'
84 #--------------------------------------------------------
85 my $member = $c->member('timespec', 5);
86 print "member('timespec', 5) = '$member'\n";
87
DESCRIPTION
89 Convert::Binary::C is a preprocessor and parser for C type definitions.
90 It is highly configurable and supports arbitrarily complex data
91 structures. Its object-oriented interface has "pack" and "unpack"
92 methods that act as replacements for Perl's "pack" and "unpack" and
93 allow one to use C types instead of a string representation of the data
94 structure for conversion of binary data from and to Perl's complex data
95 structures.
96
97 Actually, what Convert::Binary::C does is not very different from what
98 a C compiler does, just that it doesn't compile the source code into an
99 object file or executable, but only parses the code and allows Perl to
100 use the enumerations, structs, unions and typedefs that have been
101 defined within your C source for binary data conversion, similar to
102 Perl's "pack" and "unpack".
103
104 Beyond that, the module offers a lot of convenience methods to retrieve
105 information about the C types that have been parsed.
106
107 Background and History
108 In late 2000 I wrote a real-time debugging interface for an embedded
109 medical device that allowed me to send out data from that device over
110 its integrated Ethernet adapter. The interface was "printf()"-like, so
111 you could easily send out strings or numbers. But you could also send
112 out what I called arbitrary data, which was intended for arbitrary
113 blocks of the device's memory.
114
115 Another part of this real-time debugger was a Perl application running
116 on my workstation that gathered all the messages that were sent out
117 from the embedded device. It printed all the strings and numbers, and
118 hex-dumped the arbitrary data. However, manually parsing a couple of
119 300 byte hex-dumps of a complex C structure is not only frustrating,
120 but also error-prone and time consuming.
121
122 Using "unpack" to retrieve the contents of a C structure works fine for
123 small structures and if you don't have to deal with struct member
124 alignment. But otherwise, maintaining such code can be as awful as
125 deciphering hex-dumps.
126
127 As I didn't find anything to solve my problem on the CPAN, I wrote a
128 little module that translated simple C structs into "unpack" strings.
129 It worked, but it was slow. And since it couldn't deal with struct
130 member alignment, I soon found myself adding padding bytes everywhere.
131 So again, I had to maintain two sources, and changing one of them
132 forced me to touch the other one.
133
134 All in all, this little module seemed to make my task a bit easier, but
135 it was far from being what I was thinking of:
136
137 • A module that could directly use the source I've been coding for the
138 embedded device without any modifications.
139
140 • A module that could be configured to match the properties of the
141 different compilers and target platforms I was using.
142
143 • A module that was fast enough to decode a great amount of binary data
144 even on my slow workstation.
145
146 I didn't know how to accomplish these tasks until I read something
147 about XS. At least, it seemed as if it could solve my performance
148 problems. However, writing a C parser in C isn't easier than it is in
149 Perl. But writing a C preprocessor from scratch is even worse.
150
Fortunately, after a few weeks of searching I found both a lean,
open-source C preprocessor library and a reusable YACC grammar for
ANSI-C. That was the beginning of the development of
Convert::Binary::C in late 2001.
155
I had been using the module successfully in my embedded environment
long before it appeared on CPAN. From my point of view, it is exactly
what I had in mind. It's fast, flexible, easy to use and portable. It
doesn't require external programs or other Perl modules.
160
161 About this document
162 This document describes how to use Convert::Binary::C. A lot of
163 different features are presented, and the example code sometimes uses
164 Perl's more advanced language elements. If your experience with Perl is
165 rather limited, you should know how to use Perl's very good
166 documentation system.
167
168 To look up one of the manpages, use the "perldoc" command. For
169 example,
170
171 perldoc perl
172
173 will show you Perl's main manpage. To look up a specific Perl function,
174 use "perldoc -f":
175
176 perldoc -f map
177
178 gives you more information about the "map" function. You can also
179 search the FAQ using "perldoc -q":
180
181 perldoc -q array
182
183 will give you everything you ever wanted to know about Perl arrays. But
184 now, let's go on with some real stuff!
185
186 Why use Convert::Binary::C?
187 Say you want to pack (or unpack) data according to the following C
188 structure:
189
190 struct foo {
191 char ary[3];
192 unsigned short baz;
193 int bar;
194 };
195
196 You could of course use Perl's "pack" and "unpack" functions:
197
198 @ary = (1, 2, 3);
199 $baz = 40000;
200 $bar = -4711;
201 $binary = pack 'c3 S i', @ary, $baz, $bar;
202
203 But this implies that the struct members are byte aligned. If they were
204 long aligned (which is the default for most compilers), you'd have to
205 write
206
207 $binary = pack 'c3 x S x2 i', @ary, $baz, $bar;
208
209 which doesn't really increase readability.
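The effect of the padding bytes can be checked with nothing but core
Perl. The following sketch packs the same values with both templates
from above and compares the resulting lengths. Note that the size of
"i" depends on how your perl was built, so the absolute numbers may
differ on your system.

```perl
use strict;
use warnings;

my @ary = (1, 2, 3);
my ($baz, $bar) = (40000, -4711);

# Byte-aligned layout: no padding between the members.
my $unpadded = pack 'c3 S i', @ary, $baz, $bar;

# Padded layout: one 'x' byte before baz, two before bar.
my $padded = pack 'c3 x S x2 i', @ary, $baz, $bar;

printf "unpadded: %d bytes, padded: %d bytes\n",
    length($unpadded), length($padded);
```

The padded string is always three bytes longer, which is exactly the
number of "x" bytes in the template.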
210
211 Now imagine that you need to pack the data for a completely different
212 architecture with different byte order. You would look into the "pack"
213 manpage again and perhaps come up with this:
214
215 $binary = pack 'c3 x n x2 N', @ary, $baz, $bar;
216
However, if you try to unpack $binary again, your signed values have
turned into unsigned ones.
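The loss of the sign is easy to reproduce with core Perl alone. The
sketch below round-trips the values through the big-endian template
from above. The "l>" template in the second "unpack" is my own
addition and assumes Perl 5.10 or later, which accepts endianness
modifiers on the native integer formats.

```perl
use strict;
use warnings;

# Big-endian template from the text: 'n' and 'N' are unsigned.
my $binary = pack 'c3 x n x2 N', 1, 2, 3, 40000, -4711;

my @unsigned = unpack 'c3 x n x2 N', $binary;
print "bar via 'N':  $unsigned[4]\n";    # the sign is lost

# With an endianness modifier, a signed big-endian 32-bit value
# can be requested explicitly.
my @signed = unpack 'c3 x n x2 l>', $binary;
print "bar via 'l>': $signed[4]\n";
```

This works, but it is one more detail that has to be kept in sync by
hand between the C source and every "pack" template.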
219
All this can still be managed with Perl. But imagine your structures
get more complex. Imagine you need to support different platforms.
Imagine you need to make changes to the structures. You'll not only
have to change the C source but also dozens of "pack" strings in your
Perl code. This is no fun. And Perl should be fun.
225
226 Now, wouldn't it be great if you could just read in the C source you've
227 already written and use all the types defined there for packing and
228 unpacking? That's what Convert::Binary::C does.
229
230 Creating a Convert::Binary::C object
231 To use Convert::Binary::C just say
232
233 use Convert::Binary::C;
234
235 to load the module. Its interface is completely object oriented, so it
236 doesn't export any functions.
237
Next, you need to create a new Convert::Binary::C object. This is done
by calling the constructor:

    $c = Convert::Binary::C->new;

You can optionally pass configuration options to the constructor as
described in the next section.
249
250 Configuring the object
251 To configure a Convert::Binary::C object, you can either call the
252 "configure" method or directly pass the configuration options to the
253 constructor. If you want to change byte order and alignment, you can
254 use
255
256 $c->configure(ByteOrder => 'LittleEndian',
257 Alignment => 2);
258
259 or you can change the construction code to
260
261 $c = Convert::Binary::C->new(ByteOrder => 'LittleEndian',
262 Alignment => 2);
263
264 Either way, the object will now know that it should use little endian
265 (Intel) byte order and 2-byte struct member alignment for packing and
266 unpacking.
267
268 Alternatively, you can use the option names as names of methods to
269 configure the object, like:
270
271 $c->ByteOrder('LittleEndian');
272
273 You can also retrieve information about the current configuration of a
274 Convert::Binary::C object. For details, see the section about the
275 "configure" method.
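As a small sketch of how setting and reading options fit together, the
following code configures an object in the constructor, reads two
options back, and then changes them again using chained option
methods. The particular values are arbitrary and only chosen for
illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'LittleEndian',
                                Alignment => 2);

# Called with only an option name, configure returns that option's value.
print "ByteOrder: ", $c->configure('ByteOrder'), "\n";

# Option methods act as getters when called without arguments ...
print "Alignment: ", $c->Alignment, "\n";

# ... and return the object when used as setters, so calls can be chained.
$c->ByteOrder('BigEndian')->Alignment(4);

# Without arguments, configure returns a hash reference holding
# the complete current configuration.
my $cfg = $c->configure;
print "now: $cfg->{ByteOrder}, alignment $cfg->{Alignment}\n";
```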
276
277 Parsing C code
Convert::Binary::C offers two ways of parsing C source. You can parse
external C header or C source files:

    $c->parse_file('header.h');

Or you can parse C code embedded in your script:
284
285 $c->parse(<<'CCODE');
286 struct foo {
287 char ary[3];
288 unsigned short baz;
289 int bar;
290 };
291 CCODE
292
293 Now the object $c will know everything about "struct foo". The example
294 above uses a so-called here-document. It allows one to easily embed
295 multi-line strings in your code. You can find more about here-documents
296 in perldata or perlop.
297
298 Since the "parse" and "parse_file" methods throw an exception when a
299 parse error occurs, you usually want to catch these in an "eval" block:
300
301 eval { $c->parse_file('header.h') };
302 if ($@) {
303 # handle error appropriately
304 }
305
306 Perl's special $@ variable will contain an empty string (which
307 evaluates to a false value in boolean context) on success or an error
308 string on failure.
309
310 As another feature, "parse" and "parse_file" return a reference to
311 their object on success, just like "configure" does when you're
312 configuring the object. This will allow you to write constructs like
313 this:
314
315 my $c = eval {
316 Convert::Binary::C->new(Include => ['/usr/include'])
317 ->parse_file('header.h')
318 };
319 if ($@) {
320 # handle error appropriately
321 }
322
323 Packing and unpacking
Convert::Binary::C has two methods, "pack" and "unpack", that act
similarly to the Perl functions of the same name. To perform the
packing described in the example above, you could write:
327
328 $data = {
329 ary => [1, 2, 3],
330 baz => 40000,
331 bar => -4711,
332 };
333 $binary = $c->pack('foo', $data);
334
335 Unpacking will work exactly the same way, just that the "unpack" method
336 will take a byte string as its input and will return a reference to a
337 (possibly very complex) Perl data structure.
338
339 $binary = get_data_from_memory();
340 $data = $c->unpack('foo', $binary);
341
342 You can now easily access all of the values:
343
344 print "foo.ary[1] = $data->{ary}[1]\n";
345
346 Or you can even more conveniently use the Data::Dumper module:
347
348 use Data::Dumper;
349 print Dumper($data);
350
351 The output would look something like this:
352
353 $VAR1 = {
354 'ary' => [
355 42,
356 48,
357 100
358 ],
359 'baz' => 5000,
360 'bar' => -271
361 };
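Putting the pieces together, a complete round trip might look like the
following sketch. The byte order, alignment and integer sizes are
fixed explicitly here, which is an assumption made only so that the
result doesn't depend on the defaults of your installation.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Explicit configuration (assumed for this sketch).
my $c = Convert::Binary::C->new(
    ByteOrder => 'BigEndian',
    Alignment => 4,
    ShortSize => 2,
    IntSize   => 4,
);

$c->parse(<<'CCODE');
struct foo {
  char ary[3];
  unsigned short baz;
  int bar;
};
CCODE

# Pack a Perl hash into a binary string ...
my $binary = $c->pack('foo', { ary => [1, 2, 3], baz => 40000, bar => -4711 });
printf "%d bytes: %s\n", length($binary), unpack('H*', $binary);

# ... and unpack it again into a Perl data structure.
my $data = $c->unpack('foo', $binary);
print "foo.baz = $data->{baz}, foo.bar = $data->{bar}\n";
```

With this configuration the layout matches the padded "pack" template
shown earlier, and the signed member survives the round trip.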
362
363 Preprocessor configuration
Convert::Binary::C uses Thomas Pornin's "ucpp" as an internal C
preprocessor. It is compliant with ISO-C99, so you don't have to worry
about using even weird preprocessor constructs in your code.
367
368 If your C source contains includes or depends upon preprocessor
369 defines, you may need to configure the internal preprocessor. Use the
370 "Include" and "Define" configuration options for that:
371
372 $c->configure(Include => ['/usr/include',
373 '/home/mhx/include'],
374 Define => [qw( NDEBUG FOO=42 )]);
375
376 If your code uses system includes, it is most likely that you will need
377 to define the symbols that are usually defined by the compiler.
378
379 On some operating systems, the system includes require the preprocessor
380 to predefine a certain set of assertions. Assertions are supported by
381 "ucpp", and you can define them either in the source code using
382 "#assert" or as a property of the Convert::Binary::C object using
383 "Assert":
384
385 $c->configure(Assert => ['predicate(answer)']);
386
387 Information about defined macros can be retrieved from the preprocessor
388 as long as its configuration isn't changed. The preprocessor is
389 implicitly reset if you change one of the following configuration
390 options:
391
392 Include
393 Define
394 Assert
395 HasCPPComments
396 HasMacroVAARGS
397
398 Supported pragma directives
399 Convert::Binary::C supports the "pack" pragma to locally override
400 struct member alignment. The supported syntax is as follows:
401
402 #pragma pack( ALIGN )
403 Sets the new alignment to ALIGN. If ALIGN is 0, resets the
404 alignment to its original value.
405
406 #pragma pack
407 Resets the alignment to its original value.
408
409 #pragma pack( push, ALIGN )
410 Saves the current alignment on a stack and sets the new alignment
411 to ALIGN. If ALIGN is 0, sets the alignment to the default
412 alignment.
413
414 #pragma pack( pop )
415 Restores the alignment to the last value saved on the stack.
416
417 /* Example assumes sizeof( short ) == 2, sizeof( long ) == 4. */
418
419 #pragma pack(1)
420
421 struct nopad {
422 char a; /* no padding bytes between 'a' and 'b' */
423 long b;
424 };
425
426 #pragma pack /* reset to "native" alignment */
427
428 #pragma pack( push, 2 )
429
430 struct pad {
431 char a; /* one padding byte between 'a' and 'b' */
432 long b;
433
434 #pragma pack( push, 1 )
435
436 struct {
437 char c; /* no padding between 'c' and 'd' */
438 short d;
439 } e; /* sizeof( e ) == 3 */
440
441 #pragma pack( pop ); /* back to pack( 2 ) */
442
443 long f; /* one padding byte between 'e' and 'f' */
444 };
445
446 #pragma pack( pop ); /* back to "native" */
447
448 The "pack" pragma as it is currently implemented only affects the
449 maximum struct member alignment. There are compilers that also allow
450 one to specify the minimum struct member alignment. This is not
451 supported by Convert::Binary::C.
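To see the effect of the pragma from Perl, you can compare the size
and member offsets of a packed struct with those of an otherwise
identical struct that uses the configured alignment. This is only a
sketch: the "Alignment" and "LongSize" values are assumptions, and
"native" is a struct name made up for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(Alignment => 4, LongSize => 4);

$c->parse(<<'CCODE');
#pragma pack(1)
struct nopad {
  char a;
  long b;
};
#pragma pack

struct native {
  char a;
  long b;
};
CCODE

# Same members, different layout.
for my $type (qw( nopad native )) {
    printf "%-6s size %d, b at offset %d\n",
        $type, $c->sizeof($type), $c->offsetof($type, 'b');
}
```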
452
453 Automatic configuration using "ccconfig"
454 As there are over 20 different configuration options, setting all of
455 them correctly can be a lengthy and tedious task.
456
The "ccconfig" script, which is bundled with this module, aims at
automatically determining the correct compiler configuration by testing
the compiler executable. It works for both native and cross compilers.
460
462 This section covers one of the fundamental features of
463 Convert::Binary::C. It's how type expressions, referred to as TYPEs in
464 the method reference, are handled by the module.
465
466 Many of the methods, namely "pack", "unpack", "sizeof", "typeof",
467 "member", "offsetof", "def", "initializer" and "tag", are passed a TYPE
468 to operate on as their first argument.
469
470 Standard Types
471 These are trivial. Standard types are simply enum names, struct names,
472 union names, or typedefs. Almost every method that wants a TYPE will
473 accept a standard type.
474
475 For enums, structs and unions, the prefixes "enum", "struct" and
476 "union" are optional. However, if a typedef with the same name exists,
477 like in
478
479 struct foo {
480 int bar;
481 };
482
483 typedef int foo;
484
485 you will have to use the prefix to distinguish between the struct and
486 the typedef. Otherwise, a typedef is always given preference.
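The preference for typedefs can be observed with "sizeof". In the
sketch below, the struct has been given a second member (my own
addition) so that the two types differ in size, and "IntSize" is set
explicitly as an assumption.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize => 4);

$c->parse(<<'CCODE');
struct foo {
  int bar;
  int baz;
};

typedef int foo;
CCODE

# Without a prefix, the typedef wins.
print "foo:        ", $c->sizeof('foo'), "\n";

# The prefix selects the struct explicitly.
print "struct foo: ", $c->sizeof('struct foo'), "\n";
```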
487
488 Basic Types
489 Basic types, or atomic types, are "int" or "char", for example. It's
490 possible to use these basic types without having parsed any code. You
491 can simply do
492
493 $c = Convert::Binary::C->new;
494 $size = $c->sizeof('unsigned long');
495 $data = $c->pack('short int', 42);
496
497 Even though the above works fine, it is not possible to define more
498 complex types on the fly, so
499
500 $size = $c->sizeof('struct { int a, b; }');
501
502 will result in an error.
503
504 Basic types are not supported by all methods. For example, it makes no
505 sense to use "member" or "offsetof" on a basic type. Using "typeof"
506 isn't very useful, but supported.
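Here is a short sketch of both cases, with the sizes and byte order
set explicitly (an assumption for this example). The failing call is
wrapped in "eval" so that the exception can be inspected instead of
terminating the script.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# No code has been parsed; basic types work anyway.
my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                ShortSize => 2,
                                LongSize  => 8);

print "sizeof(unsigned long) = ", $c->sizeof('unsigned long'), "\n";

my $data = $c->pack('short int', 42);
print "packed short int: ", unpack('H*', $data), "\n";

# Compound types cannot be defined on the fly.
my $size  = eval { $c->sizeof('struct { int a, b; }') };
my $error = $@;
print "error: $error" if $error;
```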
507
508 Member Expressions
509 This is by far the most complex part, depending on the complexity of
510 your data structures. Any standard type that defines a compound or an
511 array may be followed by a member expression to select only a certain
512 part of the data type. Say you have parsed the following C code:
513
514 struct foo {
515 long type;
516 struct {
517 short x, y;
518 } array[20];
519 };
520
521 typedef struct foo matrix[8][8];
522
523 You may want to know the size of the "array" member of "struct foo".
524 This is quite easy:
525
526 print $c->sizeof('foo.array'), " bytes";
527
528 will print
529
530 80 bytes
531
532 depending of course on the "ShortSize" you configured.
533
534 If you wanted to unpack only a single column of "matrix", that's easy
535 as well (and of course it doesn't matter which index you use):
536
537 $column = $c->unpack('matrix[2]', $data);
538
Just like in C, it is possible to use out-of-bounds array indices.
This means that, for example, although "array" is declared to have 20
elements, the following code
542
543 $size = $c->sizeof('foo.array[4711]');
544 $offset = $c->offsetof('foo', 'array[-13]');
545
546 is perfectly valid and will result in:
547
548 $size = 4
549 $offset = -44
550
551 Member expressions can be arbitrarily complex:
552
553 $type = $c->typeof('matrix[2][3].array[7].y');
554 print "the type is $type";
555
556 will, for example, print
557
558 the type is short
559
560 Member expressions are also used as the second argument to "offsetof".
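The member expressions from this section can be tried out with the
following sketch. The sizes and the alignment are set explicitly; this
is an assumption chosen so that the numbers agree with the ones quoted
in this section, which correspond to an 8-byte "long".

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(Alignment => 8,
                                ShortSize => 2,
                                LongSize  => 8);

$c->parse(<<'CCODE');
struct foo {
  long type;
  struct {
    short x, y;
  } array[20];
};

typedef struct foo matrix[8][8];
CCODE

print "sizeof foo.array   = ", $c->sizeof('foo.array'), "\n";
print "sizeof foo         = ", $c->sizeof('foo'), "\n";
print "sizeof matrix[2]   = ", $c->sizeof('matrix[2]'), "\n";
print "offset array[-13]  = ", $c->offsetof('foo', 'array[-13]'), "\n";
print "typeof ...array[7].y = ",
      $c->typeof('matrix[2][3].array[7].y'), "\n";
```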
561
562 Offsets
563 Members returned by the "member" method have an optional offset suffix
564 to indicate that the given offset doesn't point to the start of that
565 member. For example,
566
567 $member = $c->member('matrix', 1431);
568 print $member;
569
570 will print
571
572 [2][0].array[3].y+1
573
574 If you would use this as a member expression, like in
575
576 $size = $c->sizeof("matrix $member");
577
578 the offset suffix will simply be ignored. Actually, it will be ignored
579 for all methods if it's used in the first argument.
580
When used in the second argument to "offsetof", it will usually do what
you mean, i.e. the offset suffix, if present, will be considered when
determining the offset. This behaviour ensures that
584
585 $member = $c->member('foo', 43);
586 $offset = $c->offsetof('foo', $member);
587 print "'$member' is located at offset $offset of struct foo";
588
589 will always correctly set $offset:
590
591 '.array[8].y+1' is located at offset 43 of struct foo
592
593 If this is not what you mean, e.g. because you want to know the offset
594 where the member returned by "member" starts, you just have to remove
595 the suffix:
596
597 $member =~ s/\+\d+$//;
598 $offset = $c->offsetof('foo', $member);
599 print "'$member' starts at offset $offset of struct foo";
600
601 This would then print:
602
603 '.array[8].y' starts at offset 42 of struct foo
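The whole sequence can be run as one script. As before, this is a
sketch that assumes an 8-byte "long", a 2-byte "short" and 8-byte
alignment, which are the values that reproduce the offsets shown
above.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(Alignment => 8,
                                ShortSize => 2,
                                LongSize  => 8);

$c->parse(<<'CCODE');
struct foo {
  long type;
  struct {
    short x, y;
  } array[20];
};
CCODE

# The member at offset 43, including the offset suffix.
my $member = $c->member('foo', 43);
my $offset = $c->offsetof('foo', $member);
print "'$member' is located at offset $offset\n";

# The same member without the suffix, i.e. where it starts.
(my $start = $member) =~ s/\+\d+$//;
my $start_offset = $c->offsetof('foo', $start);
print "'$start' starts at offset $start_offset\n";
```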
604
606 In a nutshell, tags are properties that you can attach to types.
607
608 You can add tags to types using the "tag" method, and remove them using
609 "tag" or "untag", for example:
610
611 # Attach 'Format' and 'Hooks' tags
612 $c->tag('type', Format => 'String', Hooks => { pack => \&rout });
613
614 $c->untag('type', 'Format'); # Remove only 'Format' tag
615 $c->untag('type'); # Remove all tags
616
617 You can also use "tag" to see which tags are attached to a type, for
618 example:
619
620 $tags = $c->tag('type');
621
622 This would give you:
623
624 $tags = {
625 'Hooks' => {
626 'pack' => \&rout
627 },
628 'Format' => 'String'
629 };
630
631 Currently, there are only a couple of different tags that influence the
632 way data is packed and unpacked. There are probably more tags to come
633 in the future.
634
635 The Format Tag
636 One of the tags currently available is the "Format" tag. Using this
637 tag, you can tell a Convert::Binary::C object to pack and unpack a
638 certain data type in a special way.
639
640 For example, if you have a (fixed length) string type
641
642 typedef char str_type[40];
643
644 this type would, by default, be unpacked as an array of "char"s. That's
645 because it is only an array of "char"s, and Convert::Binary::C doesn't
646 know it is actually used as a string.
647
648 But you can tell Convert::Binary::C that "str_type" is a C string using
649 the "Format" tag:
650
651 $c->tag('str_type', Format => 'String');
652
653 This will make "unpack" (and of course also "pack") treat the binary
654 data like a null-terminated C string:
655
656 $binary = "Hello World!\n\0 this is just some dummy data";
657 $hello = $c->unpack('str_type', $binary);
658 print $hello;
659
would print:
661
662 Hello World!
663
664 Of course, this also works the other way round:
665
666 use Data::Hexdumper;
667
668 $binary = $c->pack('str_type', "Just another C::B::C hacker");
669 print hexdump(data => $binary);
670
671 would print:
672
673 0x0000 : 4A 75 73 74 20 61 6E 6F 74 68 65 72 20 43 3A 3A : Just.another.C::
674 0x0010 : 42 3A 3A 43 20 68 61 63 6B 65 72 00 00 00 00 00 : B::C.hacker.....
675 0x0020 : 00 00 00 00 00 00 00 00 : ........
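To make the difference visible, the following sketch unpacks the same
binary data once without and once with the "Format" tag. It relies
only on the calls shown above; the variable names are mine.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;
$c->parse('typedef char str_type[40];');

my $binary = "Hello World!\n\0 this is just some dummy data";

# Untagged: a reference to an array of 40 character codes.
my $chars = $c->unpack('str_type', $binary);
print "untagged: ", scalar(@$chars), " elements, first is $chars->[0]\n";

# Tagged: a plain Perl string, cut off at the first null byte.
$c->tag('str_type', Format => 'String');
my $string = $c->unpack('str_type', $binary);
print "tagged:   $string";

# Packing always yields the full 40 bytes of the type.
my $packed = $c->pack('str_type', "Just another C::B::C hacker");
print "packed:   ", length($packed), " bytes\n";
```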
676
If you want Convert::Binary::C to not interpret the binary data at all,
you can set the "Format" tag to "Binary". This might not seem very
useful, as "pack" and "unpack" would just pass through the unmodified
binary data. But you can tag not only whole types, but also compound
members. For example
682
683 $c->parse(<<ENDC);
684 struct packet {
685 unsigned short header;
686 unsigned short flags;
687 unsigned char payload[28];
688 };
689 ENDC
690
691 $c->tag('packet.payload', Format => 'Binary');
692
693 would allow you to write:
694
695 read FILE, $payload, $c->sizeof('packet.payload');
696
697 $packet = {
698 header => 4711,
699 flags => 0xf00f,
700 payload => $payload,
701 };
702
703 $binary = $c->pack('packet', $packet);
704
705 print hexdump(data => $binary);
706
707 This would print something like:
708
709 0x0000 : 12 67 F0 0F 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A : .g..no.no.no.no.
710 0x0010 : 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E : no.no.no.no.no.n
711
712 For obvious reasons, it is not allowed to attach a "Format" tag to
713 bitfield members. Trying to do so will result in an exception being
714 thrown by the "tag" method.
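A self-contained variant of the packet example, which doesn't need a
file to read from, could look like this. The payload string is made up
for the sketch, and byte order and "ShortSize" are set explicitly as
an assumption.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                ShortSize => 2);

$c->parse(<<'ENDC');
struct packet {
  unsigned short header;
  unsigned short flags;
  unsigned char  payload[28];
};
ENDC

$c->tag('packet.payload', Format => 'Binary');

# 28 bytes of dummy payload instead of data read from a file.
my $payload = substr("no\n" x 10, 0, $c->sizeof('packet.payload'));

my $binary = $c->pack('packet', {
    header  => 4711,
    flags   => 0xf00f,
    payload => $payload,
});

# The payload comes back as the very same byte string.
my $packet = $c->unpack('packet', $binary);
print "header and flags: ", unpack('H*', substr($binary, 0, 4)), "\n";
print "payload intact\n" if $packet->{payload} eq $payload;
```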
715
716 The ByteOrder Tag
717 The "ByteOrder" tag allows you to override the byte order of certain
718 types or members. The implementation of this tag is considered
719 experimental and may be subject to changes in the future.
720
721 Usually it doesn't make much sense to override the byte order, but
722 there may be applications where a sub-structure is packed in a
723 different byte order than the surrounding structure.
724
725 Take, for example, the following code:
726
727 $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
728 OrderMembers => 1);
729 $c->parse(<<'ENDC');
730
731 typedef unsigned short u_16;
732
733 struct coords_3d {
734 int x, y, z;
735 };
736
737 struct coords_msg {
738 u_16 header;
739 u_16 length;
740 struct coords_3d coords;
741 };
742
743 ENDC
744
745 Assume that while "coords_msg" is big endian, the embedded coordinates
746 "coords_3d" are stored in little endian format for some reason. In C,
747 you'll have to handle this manually.
748
749 But using Convert::Binary::C, you can simply attach a "ByteOrder" tag
750 to either the "coords_3d" structure or to the "coords" member of the
751 "coords_msg" structure. Both will work in this case. The only
752 difference is that if you tag the "coords" member, "coords_3d" will
753 only be treated as little endian if you "pack" or "unpack" the
754 "coords_msg" structure. (BTW, you could also tag all members of
755 "coords_3d" individually, but that would be inefficient.)
756
757 So, let's attach the "ByteOrder" tag to the "coords" member:
758
759 $c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');
760
761 Assume the following binary message:
762
763 0x0000 : 00 2A 00 0C FF FF FF FF 02 00 00 00 2A 00 00 00 : .*..........*...
764
765 If you unpack this message...
766
767 $msg = $c->unpack('coords_msg', $binary);
768
769 ...you will get the following data structure:
770
771 $msg = {
772 'header' => 42,
773 'length' => 12,
774 'coords' => {
775 'x' => -1,
776 'y' => 2,
777 'z' => 42
778 }
779 };
780
781 Without the "ByteOrder" tag, you would get:
782
783 $msg = {
784 'header' => 42,
785 'length' => 12,
786 'coords' => {
787 'x' => -1,
788 'y' => 33554432,
789 'z' => 704643072
790 }
791 };
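Both results can be reproduced with the sketch below. It builds the
binary message from the hex dump above, unpacks it once before and
once after attaching the tag, and prints the coordinates. The explicit
sizes and alignment are assumptions, and "OrderMembers" is left out
because it isn't needed here.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                Alignment => 4,
                                ShortSize => 2,
                                IntSize   => 4);

$c->parse(<<'ENDC');
typedef unsigned short u_16;

struct coords_3d {
  int x, y, z;
};

struct coords_msg {
  u_16 header;
  u_16 length;
  struct coords_3d coords;
};
ENDC

my $binary = pack 'H*', '002A000CFFFFFFFF020000002A000000';

# Everything is interpreted as big endian.
my $plain = $c->unpack('coords_msg', $binary);

# Only the embedded coordinates are interpreted as little endian.
$c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');
my $tagged = $c->unpack('coords_msg', $binary);

for my $msg ($plain, $tagged) {
    print join(', ', map { "$_ = $msg->{coords}{$_}" } qw( x y z )), "\n";
}
```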
792
793 The "ByteOrder" tag is a recursive tag, i.e. it applies to all children
794 of the tagged object recursively. Of course, it is also possible to
795 override a "ByteOrder" tag by attaching another "ByteOrder" tag to a
796 child type. Confused? Here's an example. In addition to tagging the
797 "coords" member as little endian, we now tag "coords_3d.y" as big
798 endian:
799
800 $c->tag('coords_3d.y', ByteOrder => 'BigEndian');
801 $msg = $c->unpack('coords_msg', $binary);
802
803 This will return the following data structure:
804
805 $msg = {
806 'header' => 42,
807 'length' => 12,
808 'coords' => {
809 'x' => -1,
810 'y' => 33554432,
811 'z' => 42
812 }
813 };
814
815 Note that if you tag both a type and a member of that type within a
816 compound, the tag attached to the type itself has higher precedence.
817 Using the example above, if you would attach a "ByteOrder" tag to both
818 "coords_msg.coords" and "coords_3d", the tag attached to "coords_3d"
819 would always win.
820
821 Also note that the "ByteOrder" tag might not work as expected along
822 with bitfields, which is why the implementation is considered
823 experimental. Bitfields are currently not affected by the "ByteOrder"
824 tag at all. This is because the byte order would affect the bitfield
825 layout, and a consistent implementation supporting multiple layouts of
826 the same struct would be quite bulky and probably slow down the whole
827 module.
828
829 If you really need the correct behaviour, you can use the following
830 trick:
831
832 $le = Convert::Binary::C->new(ByteOrder => 'LittleEndian');
833
834 $le->parse(<<'ENDC');
835
836 typedef unsigned short u_16;
837 typedef unsigned long u_32;
838
839 struct message {
840 u_16 header;
841 u_16 length;
842 struct {
843 u_32 a;
844 u_32 b;
845 u_32 c : 7;
846 u_32 d : 5;
847 u_32 e : 20;
848 } data;
849 };
850
851 ENDC
852
853 $be = $le->clone->ByteOrder('BigEndian');
854
855 $le->tag('message.data', Format => 'Binary', Hooks => {
856 unpack => sub { $be->unpack('message.data', @_) },
857 pack => sub { $be->pack('message.data', @_) },
858 });
859
860
861 $msg = $le->unpack('message', $binary);
862
863 This uses the "Format" and "Hooks" tags along with a big endian "clone"
864 of the original little endian object. It attaches hooks to the little
865 endian object and in the hooks it uses the big endian object to "pack"
866 and "unpack" the binary data.
867
868 The Dimension Tag
869 The "Dimension" tag allows you to override the declared dimension of an
870 array for packing or unpacking data. The implementation of this tag is
871 considered very experimental and will definitely change in a future
872 release.
873
874 That being said, the "Dimension" tag is primarily useful to support
875 variable length arrays. Usually, you have to write the following code
876 for such a variable length array in C:
877
878 struct c_message
879 {
880 unsigned count;
881 char data[1];
882 };
883
So, because you cannot declare an empty array, you declare an array
with a single element. If you have an ISO-C99 compliant compiler, you
can write this code instead:
887
888 struct c99_message
889 {
890 unsigned count;
891 char data[];
892 };
893
894 This explicitly tells the compiler that "data" is a flexible array
895 member. Convert::Binary::C already uses this information to handle
896 flexible array members in a special way.
897
898 As you can see in the following example, the two types are treated
899 differently:
900
901 $data = pack 'NC*', 3, 1..8;
902 $uc = $c->unpack('c_message', $data);
903 $uc99 = $c->unpack('c99_message', $data);
904
905 This will result in:
906
907 $uc = {'count' => 3,'data' => [1]};
908 $uc99 = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
909
910 However, only few compilers support ISO-C99, and you probably don't
911 want to change your existing code only to get some extra features when
912 using Convert::Binary::C.
913
914 So it is possible to attach a tag to the "data" member of the
915 "c_message" struct that tells Convert::Binary::C to treat the array as
916 if it were flexible:
917
918 $c->tag('c_message.data', Dimension => '*');
919
920 Now both "c_message" and "c99_message" will behave exactly the same
921 when using "pack" or "unpack". Repeating the above code:
922
923 $uc = $c->unpack('c_message', $data);
924
925 This will result in:
926
927 $uc = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
928
929 But there's more you can do. Even though it probably doesn't make much
930 sense, you can tag a fixed dimension to an array:
931
932 $c->tag('c_message.data', Dimension => '5');
933
934 This will obviously result in:
935
936 $uc = {'count' => 3,'data' => [1,2,3,4,5]};
937
938 A more useful way to use the "Dimension" tag is to set it to the name
939 of a member in the same compound:
940
941 $c->tag('c_message.data', Dimension => 'count');
942
943 Convert::Binary::C will now use the value of that member to determine
944 the size of the array, so unpacking will result in:
945
946 $uc = {'count' => 3,'data' => [1,2,3]};
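The different "Dimension" settings discussed so far can be compared
side by side with a sketch like the following. Byte order and
"IntSize" are set explicitly (an assumption), and the %result hash is
just a helper for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                IntSize   => 4);

$c->parse(<<'ENDC');
struct c_message
{
  unsigned count;
  char data[1];
};
ENDC

my $data = pack 'NC*', 3, 1 .. 8;

# First unpack without any tag, then with each Dimension in turn.
my %result;
$result{untagged} = $c->unpack('c_message', $data)->{data};

for my $dim ('*', '5', 'count') {
    $c->tag('c_message.data', Dimension => $dim);
    $result{$dim} = $c->unpack('c_message', $data)->{data};
}

for my $dim ('untagged', '*', '5', 'count') {
    printf "%-8s => [%s]\n", $dim, join(',', @{ $result{$dim} });
}
```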
947
948 Of course, you can also tag flexible array members. And yes, it's also
949 possible to use more complex member expressions:
950
951 $c->parse(<<ENDC);
952 struct msg_header
953 {
954 unsigned len[2];
955 };
956
957 struct more_complex
958 {
959 struct msg_header hdr;
960 char data[];
961 };
962 ENDC
963
964 $data = pack 'NNC*', 42, 7, 1 .. 10;
965
966 $c->tag('more_complex.data', Dimension => 'hdr.len[1]');
967
968 $u = $c->unpack('more_complex', $data);
969
970 The result will be:
971
972 $u = {
973 'hdr' => {
974 'len' => [
975 42,
976 7
977 ]
978 },
979 'data' => [
980 1,
981 2,
982 3,
983 4,
984 5,
985 6,
986 7
987 ]
988 };
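
For comparison, the same result can be obtained by hand with Perl's
built-in "unpack". This is only a sketch of what the tag saves you from
writing, not how the module works internally. It assumes big-endian
32-bit "unsigned" values, as in the "pack 'NNC*'" call above; the
variable names are illustrative.

```perl
use strict;
use warnings;

# Same test data as above: two big-endian 32-bit lengths, then ten bytes.
my $data = pack 'NNC*', 42, 7, 1 .. 10;

# Read the header first, then use len[1] as the element count for "data".
my @len   = unpack 'N2', $data;
my @bytes = unpack "x8 c$len[1]", $data;   # skip the 8 header bytes

my $u = { hdr => { len => \@len }, data => \@bytes };

print scalar(@{ $u->{data} }), " elements: @{ $u->{data} }\n";
```

With the test data this prints "7 elements: 1 2 3 4 5 6 7", which
matches the "data" member shown above.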
989
990 By the way, it's also possible to tag arrays that are not embedded
991 inside a compound:
992
993 $c->parse(<<ENDC);
994 typedef unsigned short short_array[];
995 ENDC
996
997 $c->tag('short_array', Dimension => '5');
998
999 $u = $c->unpack('short_array', $data);
1000
1001 Resulting in:
1002
1003 $u = [0,42,0,7,258];
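
The values above can be checked with a core "unpack" call. This is a
small sketch that assumes the same $data as before, a big-endian byte
order and a "ShortSize" of 2:

```perl
use strict;
use warnings;

my $data = pack 'NNC*', 42, 7, 1 .. 10;

# Five big-endian 16-bit unsigned values.
my @shorts = unpack 'n5', $data;

print '[', join(',', @shorts), "]\n";   # [0,42,0,7,258]
```

The last value is 258 because the fifth short is made up of the first
two "data" bytes, 0x01 and 0x02.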
1004
1005 The final and most powerful way to define a "Dimension" tag is to pass
1006 it a subroutine reference. The referenced subroutine can execute
1007 whatever code is necessary to determine the size of the tagged array:
1008
1009 sub get_size
1010 {
1011 my $m = shift;
1012 return $m->{hdr}{len}[0] / $m->{hdr}{len}[1];
1013 }
1014
1015 $c->tag('more_complex.data', Dimension => \&get_size);
1016
1017 $u = $c->unpack('more_complex', $data);
1018
As you can guess from the above code, the subroutine is passed a
reference to the hash that stores the already unpacked part of the
compound embedding the tagged array. This is the result:
1022
1023 $u = {
1024 'hdr' => {
1025 'len' => [
1026 42,
1027 7
1028 ]
1029 },
1030 'data' => [
1031 1,
1032 2,
1033 3,
1034 4,
1035 5,
1036 6
1037 ]
1038 };
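
Since the subroutine works on data read from a binary string, it may
be worth guarding against unexpected values. The following is a
hypothetical, more defensive variant of "get_size" (the name
"get_size_checked" is made up for this sketch). How the module treats a
fractional or undefined return value is not covered here, so the sketch
simply avoids returning either.

```perl
use strict;
use warnings;

# Hypothetical, more defensive variant of get_size().
sub get_size_checked
{
  my $m = shift;
  my($total, $unit) = @{$m->{hdr}{len}}[0, 1];
  return 0 unless $unit;          # avoid division by zero
  return int($total / $unit);     # drop any fractional part
}

print get_size_checked({ hdr => { len => [42, 7] } }), "\n";   # 6
print get_size_checked({ hdr => { len => [42, 0] } }), "\n";   # 0
print get_size_checked({ hdr => { len => [43, 7] } }), "\n";   # 6
```

It would be registered in the same way as before, by passing
"\&get_size_checked" as the value of the "Dimension" tag.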
1039
1040 You can also pass custom arguments to the subroutines by using the
1041 "arg" method. This is similar to the functionality offered by the
1042 "Hooks" tag.
1043
Of course, all of this works for the "pack" method as well.
1045
However, the current implementation has at least one shortcoming,
which is why it's experimental: the "Dimension" tag doesn't affect
compound layout. This means that while you can alter the size of an
array in the middle of a compound, the offsets of the members after
that array won't change. I'd rather see the layout adapt dynamically,
so this is what I'm hoping to implement in the future.
1052
1053 The Hooks Tag
1054 Hooks are a special kind of tag that can be extremely useful.
1055
1056 Using hooks, you can easily override the way "pack" and "unpack" handle
1057 data using your own subroutines. If you define hooks for a certain
1058 data type, each time this data type is processed the corresponding hook
1059 will be called to allow you to modify that data.
1060
1061 Basic Hooks
1062
1063 Here's an example. Let's assume the following C code has been parsed:
1064
1065 typedef unsigned int u_32;
1066 typedef u_32 ProtoId;
1067 typedef ProtoId MyProtoId;
1068
1069 struct MsgHeader {
1070 MyProtoId id;
1071 u_32 len;
1072 };
1073
1074 struct String {
1075 u_32 len;
1076 char buf[];
1077 };
1078
1079 You could now use the types above and, for example, unpack binary data
1080 representing a "MsgHeader" like this:
1081
1082 $msg_header = $c->unpack('MsgHeader', $data);
1083
1084 This would give you:
1085
1086 $msg_header = {
1087 'id' => 42,
1088 'len' => 13
1089 };
1090
1091 Instead of dealing with "ProtoId"'s as integers, you would rather like
1092 to have them as clear text. You could provide subroutines to convert
1093 between clear text and integers:
1094
1095 %proto = (
1096 CATS => 1,
1097 DOGS => 42,
1098 HEDGEHOGS => 4711,
1099 );
1100
1101 %rproto = reverse %proto;
1102
1103 sub ProtoId_unpack {
1104 $rproto{$_[0]} || 'unknown protocol'
1105 }
1106
1107 sub ProtoId_pack {
1108 $proto{$_[0]} or die 'unknown protocol'
1109 }
1110
1111 You can now register these subroutines by attaching a "Hooks" tag to
1112 "ProtoId" using the "tag" method:
1113
1114 $c->tag('ProtoId', Hooks => { pack => \&ProtoId_pack,
1115 unpack => \&ProtoId_unpack });
1116
1117 Doing exactly the same unpack on "MsgHeader" again would now return:
1118
1119 $msg_header = {
1120 'id' => 'DOGS',
1121 'len' => 13
1122 };
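
One thing to watch out for in the hook subroutines above: because they
test the looked-up value for truth, a protocol with the id 0 would be
treated as unknown when packing. If that matters, a variant based on
"exists" can be used. This is only a sketch; the "NONE" entry and the
"_strict" names are made up for illustration.

```perl
use strict;
use warnings;

my %proto  = (NONE => 0, CATS => 1, DOGS => 42, HEDGEHOGS => 4711);
my %rproto = reverse %proto;

sub ProtoId_unpack_strict {
  my $id = shift;
  exists $rproto{$id} ? $rproto{$id} : 'unknown protocol';
}

sub ProtoId_pack_strict {
  my $name = shift;
  exists $proto{$name} or die "unknown protocol '$name'\n";
  $proto{$name};
}

print ProtoId_pack_strict('NONE'), "\n";    # 0
print ProtoId_unpack_strict(42), "\n";      # DOGS
```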
1123
1124 Actually, if you don't need the reverse operation, you don't even have
1125 to register a "pack" hook. Or, even better, you can have a more
1126 intelligent "unpack" hook that creates a dual-typed variable:
1127
1128 use Scalar::Util qw(dualvar);
1129
1130 sub ProtoId_unpack2 {
1131 dualvar $_[0], $rproto{$_[0]} || 'unknown protocol'
1132 }
1133
1134 $c->tag('ProtoId', Hooks => { unpack => \&ProtoId_unpack2 });
1135
1136 $msg_header = $c->unpack('MsgHeader', $data);
1137
1138 Just as before, this would print
1139
1140 $msg_header = {
1141 'id' => 'DOGS',
1142 'len' => 13
1143 };
1144
1145 but without requiring a "pack" hook for packing, at least as long as
1146 you keep the variable dual-typed.
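
What "keeping the variable dual-typed" means can be seen without the
module. The following sketch uses only Scalar::Util and the built-in
"pack"; it shows that a numeric template sees the numeric half of the
value, and that a copy made through string interpolation no longer has
it.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

my $id = dualvar 42, 'DOGS';

print "as string: $id\n";               # DOGS
print "as number: ", $id + 0, "\n";     # 42

# A numeric pack template uses the numeric half ...
print unpack('H*', pack('N', $id)), "\n";   # 0000002a

# ... but a stringified copy is a plain string again.
my $copy = "$id";
{
  no warnings 'numeric';
  print "copy as number: ", $copy + 0, "\n";   # 0
}
```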
1147
1148 Hooks are usually called with exactly one argument, which is the data
1149 that should be processed (see "Advanced Hooks" for details on how to
1150 customize hook arguments). They are called in scalar context and
1151 expected to return the processed data.
1152
1153 To get rid of registered hooks, you can either undefine only certain
1154 hooks
1155
1156 $c->tag('ProtoId', Hooks => { pack => undef });
1157
1158 or all hooks:
1159
1160 $c->tag('ProtoId', Hooks => undef);
1161
1162 Of course, hooks are not restricted to handling integer values. You
1163 could just as well attach hooks for the "String" struct from the code
1164 above. A useful example would be to have these hooks:
1165
1166 sub string_unpack {
1167 my $s = shift;
1168 pack "c$s->{len}", @{$s->{buf}};
1169 }
1170
1171 sub string_pack {
1172 my $s = shift;
1173 return {
1174 len => length $s,
1175 buf => [ unpack 'c*', $s ],
1176 }
1177 }
1178
1179 (Don't be confused by the fact that the "unpack" hook uses "pack" and
1180 the "pack" hook uses "unpack". And also see "Advanced Hooks" for a
1181 more clever approach.)
1182
1183 While you would normally get the following output when unpacking a
1184 "String"
1185
1186 $string = {
1187 'len' => 12,
1188 'buf' => [
1189 72,
1190 101,
1191 108,
1192 108,
1193 111,
1194 32,
1195 87,
1196 111,
1197 114,
1198 108,
1199 100,
1200 33
1201 ]
1202 };
1203
1204 you could just register the hooks using
1205
1206 $c->tag('String', Hooks => { pack => \&string_pack,
1207 unpack => \&string_unpack });
1208
1209 and you would get a nice human-readable Perl string:
1210
1211 $string = 'Hello World!';
1212
1213 Packing a string turns out to be just as easy:
1214
1215 use Data::Hexdumper;
1216
1217 $data = $c->pack('String', 'Just another Perl hacker,');
1218
1219 print hexdump(data => $data);
1220
1221 This would print:
1222
1223 0x0000 : 00 00 00 19 4A 75 73 74 20 61 6E 6F 74 68 65 72 : ....Just.another
1224 0x0010 : 20 50 65 72 6C 20 68 61 63 6B 65 72 2C : .Perl.hacker,
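
If Data::Hexdumper is not available, the bytes can be inspected with
core functions only. The sketch below also builds the expected data by
hand instead of calling the module, assuming a big-endian byte order
and a 4-byte "u_32", so it can be used to cross-check the output above.

```perl
use strict;
use warnings;

my $s = 'Just another Perl hacker,';

# Hand-built equivalent of the packed "String": a big-endian 32-bit
# length followed by the characters.
my $data = pack('N', length $s) . $s;

# Print 16 bytes per line as hex.
for (my $off = 0; $off < length($data); $off += 16) {
  my $chunk = substr $data, $off, 16;
  my @hex   = map { sprintf '%02X', ord($_) } split //, $chunk;
  printf "0x%04X : %s\n", $off, join(' ', @hex);
}
```

This prints the same two rows of hex bytes as the dump above, without
the text column on the right.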
1225
1226 If you want to find out if or which hooks are registered for a certain
1227 type, you can also use the "tag" method:
1228
1229 $hooks = $c->tag('String', 'Hooks');
1230
1231 This would return:
1232
1233 $hooks = {
1234 'unpack' => \&string_unpack,
1235 'pack' => \&string_pack
1236 };
1237
1238 Advanced Hooks
1239
It is also possible to combine hooks with the "Format" tag. This can
be useful if you know better than Convert::Binary::C how to interpret
the binary data. In the previous section, we've handled this type
1244
1245 struct String {
1246 u_32 len;
1247 char buf[];
1248 };
1249
1250 with the following hooks:
1251
1252 sub string_unpack {
1253 my $s = shift;
1254 pack "c$s->{len}", @{$s->{buf}};
1255 }
1256
1257 sub string_pack {
1258 my $s = shift;
1259 return {
1260 len => length $s,
1261 buf => [ unpack 'c*', $s ],
1262 }
1263 }
1264
1265 $c->tag('String', Hooks => { pack => \&string_pack,
1266 unpack => \&string_unpack });
1267
1268 As you can see in the hook code, "buf" is expected to be an array of
1269 characters. For the "unpack" case Convert::Binary::C first turns the
1270 binary data into a Perl array, and then the hook packs it back into a
1271 string. The intermediate array creation and destruction is completely
1272 useless. Same thing, of course, for the "pack" case.
1273
1274 Here's a clever way to handle this. Just tag "buf" as binary
1275
1276 $c->tag('String.buf', Format => 'Binary');
1277
1278 and use the following hooks instead:
1279
1280 sub string_unpack2 {
1281 my $s = shift;
1282 substr $s->{buf}, 0, $s->{len};
1283 }
1284
1285 sub string_pack2 {
1286 my $s = shift;
1287 return {
1288 len => length $s,
1289 buf => $s,
1290 }
1291 }
1292
1293 $c->tag('String', Hooks => { pack => \&string_pack2,
1294 unpack => \&string_unpack2 });
1295
1296 This will be exactly equivalent to the old code, but faster and
1297 probably even much easier to understand.
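
To see what the two new hooks do in isolation, they can be called
directly on hand-made values. This is just a sketch that bypasses the
module; the hash passed to "string_unpack2" only simulates what the
hook might receive, and it assumes the buffer can hold more bytes than
"len" says.

```perl
use strict;
use warnings;

sub string_unpack2 {
  my $s = shift;
  substr $s->{buf}, 0, $s->{len};
}

sub string_pack2 {
  my $s = shift;
  return {
    len => length $s,
    buf => $s,
  }
}

my $h = string_pack2('Hello World!');
print "$h->{len}\n";     # 12

# The substr() call trims the buffer to the length stored in "len".
print string_unpack2({ len => 5, buf => "Hello\0\0\0" }), "\n";   # Hello
```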
1298
1299 But hooks are even more powerful. You can customize the arguments that
1300 are passed to your hooks and you can use "arg" to pass certain special
1301 arguments, such as the name of the type that is currently being
1302 processed by the hook.
1303
1304 The following example shows how it is easily possible to peek into the
1305 perl internals using hooks.
1306
1307 use Config;
1308
1309 $c = Convert::Binary::C->new(%CC, OrderMembers => 1);
1310 $c->Include(["$Config{archlib}/CORE", @{$c->Include}]);
1311 $c->parse(<<ENDC);
1312 #include "EXTERN.h"
1313 #include "perl.h"
1314 ENDC
1315
1316 $c->tag($_, Hooks => { unpack_ptr => [\&unpack_ptr,
1317 $c->arg(qw(SELF TYPE DATA))] })
1318 for qw( XPVAV XPVHV );
1319
1320 First, we add the perl core include path and parse perl.h. Then, we add
1321 an "unpack_ptr" hook for a couple of the internal data types.
1322
1323 The "unpack_ptr" and "pack_ptr" hooks are called whenever a pointer to
1324 a certain data structure is processed. This is by far the most
1325 experimental part of the hooks feature, as this includes any kind of
1326 pointer. There's no way for the hook to know the difference between a
1327 plain pointer, or a pointer to a pointer, or a pointer to an array
1328 (this is because the difference doesn't matter anywhere else in
1329 Convert::Binary::C).
1330
1331 But the hook above makes use of another very interesting feature: It
1332 uses "arg" to pass special arguments to the hook subroutine. Usually,
1333 the hook subroutine is simply passed a single data argument. But using
1334 the above definition, it'll get a reference to the calling object
1335 ("SELF"), the name of the type being processed ("TYPE") and the data
1336 ("DATA").
1337
But what does our hook look like?
1339
1340 sub unpack_ptr {
1341 my($self, $type, $ptr) = @_;
1342 $ptr or return '<NULL>';
1343 my $size = $self->sizeof($type);
1344 $self->unpack($type, unpack("P$size", pack('Q', $ptr)));
1345 }
1346
1347 As you can see, the hook is rather simple. First, it receives the
1348 arguments mentioned above. It performs a quick check if the pointer is
1349 "NULL" and shouldn't be processed any further. Next, it determines the
1350 size of the type being processed. And finally, it'll just use the "P"n
1351 unpack template to read from that memory location and recursively call
1352 "unpack" to unpack the type. (And yes, this may of course again call
1353 other hooks.)
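
Note that the "pack('Q', $ptr)" call in the hook assumes 8-byte
pointers. The sketch below shows one possible way to pick the template
from the perl configuration instead, and demonstrates the "P" template
on an ordinary Perl string rather than on perl internals. The variable
names are illustrative.

```perl
use strict;
use warnings;
use Config;

# 'Q' fits 8-byte pointers; fall back to 'L' for 4-byte ones.
my $ptr_template = $Config{ptrsize} == 8 ? 'Q' : 'L';

my $buf = 'Hello, pointer!';

# pack 'P' stores a pointer to the string's buffer; read it back as
# an integer ...
my $addr = unpack $ptr_template, pack('P', $buf);

# ... and turn the integer into bytes again, the way the hook does.
my $len  = length $buf;
my $copy = unpack "P$len", pack($ptr_template, $addr);

print "$copy\n";   # Hello, pointer!
```

This is safe only because $buf is still alive and exactly $len bytes
are read.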
1354
1355 Now, let's test that:
1356
    my $ref = bless [42, 4711, 'foo'], 'Foo::Bar';
1358 my $ptr = hex(("$ref" =~ /\(0x([[:xdigit:]]+)\)$/)[0]);
1359
1360 print Dumper(unpack_ptr($c, 'AV', $ptr));
1361
1362 Just for the fun of it, we create a blessed array reference. But how do
1363 we get a pointer to the corresponding "AV"? This is rather easy, as the
1364 address of the "AV" is just the hex value that appears when using the
1365 array reference in string context. So we just grab that and turn it
1366 into decimal. All that's left to do is just call our hook, as it can
1367 already handle "AV" pointers. And this is what we get:
1368
1369 $VAR1 = {
1370 'sv_any' => {
1371 'xmg_stash' => 0,
1372 'xmg_u' => {
1373 'xmg_magic' => 0,
1374 'xmg_hash_index' => 0
1375 },
1376 'xav_fill' => 2,
1377 'xav_max' => 7,
1378 'xav_alloc' => 0
1379 },
1380 'sv_refcnt' => 1,
1381 'sv_flags' => 536870924,
1382 'sv_u' => {
1383 'svu_pv' => '94716517508048',
1384 'svu_iv' => '94716517508048',
1385 'svu_uv' => '94716517508048',
1386 'svu_nv' => '4.67961773944475e-310',
1387 'svu_rv' => '94716517508048',
1388 'svu_array' => '94716517508048',
1389 'svu_hash' => '94716517508048',
1390 'svu_gp' => '94716517508048',
1391 'svu_fp' => '94716517508048'
1392 }
1393 };
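
As an aside, the address does not have to be parsed out of the
stringified reference. Scalar::Util's "refaddr" returns it directly as
a number. The sketch below compares both approaches; the class name is
made up, and the comparison assumes the class does not overload
stringification.

```perl
use strict;
use warnings;
no warnings 'portable';   # hex() warns about values above 0xffffffff
use Scalar::Util qw(refaddr);

my $ref = bless [1, 2, 3], 'Foo::Bar';

my $from_string  = hex(("$ref" =~ /\(0x([[:xdigit:]]+)\)$/)[0]);
my $from_refaddr = refaddr($ref);

print $from_string == $from_refaddr ? "same address\n" : "different\n";
```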
1394
1395 Even though it is rather easy to do such stuff using "unpack_ptr"
1396 hooks, you should really know what you're doing and do it with extreme
1397 care because of the limitations mentioned above. It's really easy to
1398 run into segmentation faults when you're dereferencing pointers that
1399 point to memory which you don't own.
1400
1401 Performance
1402
Using hooks isn't free. In performance-critical applications you
have to keep in mind that hooks are actually perl subroutines and that
they are called once for every value of a registered type that is being
packed or unpacked. If only about 10% of the values require hooks to be
called, you'll hardly notice the difference (if your hooks are
implemented efficiently, that is). But if all values require hooks to
be called, that alone could easily make packing and unpacking very
slow.
1411
1412 Tag Order
1413 Since it is possible to attach multiple tags to a single type, the
1414 order in which the tags are processed is important. Here's a small
1415 table that shows the processing order.
1416
       pack         unpack
     ---------------------
      Hooks         Format
      Format        ByteOrder
      ByteOrder     Hooks
1422
1423 As a general rule, the "Hooks" tag is always the first thing processed
1424 when packing data, and the last thing processed when unpacking data.
1425
1426 The "Format" and "ByteOrder" tags are exclusive, but when both are
1427 given the "Format" tag wins.
1428
1430 new
1431 "new"
1432 "new" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1433 The constructor is used to create a new Convert::Binary::C
1434 object. You can simply use
1435
1436 $c = Convert::Binary::C->new;
1437
1438 without additional arguments to create an object, or you can
1439 optionally pass any arguments to the constructor that are
1440 described for the "configure" method.
1441
1442 configure
1443 "configure"
1444 "configure" OPTION
1445 "configure" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1446 This method can be used to configure an existing
1447 Convert::Binary::C object or to retrieve its current
1448 configuration.
1449
1450 To configure the object, the list of options consists of key
1451 and value pairs and must therefore contain an even number of
1452 elements. "configure" (and also "new" if used with
1453 configuration options) will throw an exception if you pass an
1454 odd number of elements. Configuration will normally look like
1455 this:
1456
1457 $c->configure(ByteOrder => 'BigEndian', IntSize => 2);
1458
1459 To retrieve the current value of a configuration option, you
1460 must pass a single argument to "configure" that holds the name
1461 of the option, just like
1462
1463 $order = $c->configure('ByteOrder');
1464
1465 If you want to get the values of all configuration options at
1466 once, you can call "configure" without any arguments and it
1467 will return a reference to a hash table that holds the whole
1468 object configuration. This can be conveniently used with the
1469 Data::Dumper module, for example:
1470
1471 use Convert::Binary::C;
1472 use Data::Dumper;
1473
1474 $c = Convert::Binary::C->new(Define => ['DEBUGGING', 'FOO=123'],
1475 Include => ['/usr/include']);
1476
1477 print Dumper($c->configure);
1478
1479 Which will print something like this:
1480
1481 $VAR1 = {
1482 'DisabledKeywords' => [],
1483 'HasCPPComments' => 1,
1484 'UnsignedChars' => 0,
1485 'LongDoubleSize' => 16,
1486 'OrderMembers' => 1,
1487 'CompoundAlignment' => 1,
1488 'UnsignedBitfields' => 0,
1489 'DoubleSize' => 8,
1490 'Assert' => [],
1491 'PointerSize' => 8,
1492 'ByteOrder' => 'LittleEndian',
1493 'Warnings' => 0,
1494 'LongSize' => 8,
1495 'Include' => [
1496 '/usr/include'
1497 ],
1498 'EnumType' => 'Integer',
1499 'EnumSize' => 4,
1500 'ShortSize' => 2,
1501 'IntSize' => 4,
1502 'StdCVersion' => 199901,
1503 'HostedC' => 1,
1504 'Alignment' => 1,
1505 'HasMacroVAARGS' => 1,
1506 'KeywordMap' => {},
1507 'Define' => [
1508 'DEBUGGING',
1509 'FOO=123'
1510 ],
1511 'LongLongSize' => 8,
1512 'CharSize' => 1,
1513 'FloatSize' => 4,
1514 'Bitfields' => {
1515 'Engine' => 'Generic'
1516 }
1517 };
1518
1519 Since you may not always want to write a "configure" call when
1520 you only want to change a single configuration item, you can
1521 use any configuration option name as a method name, like:
1522
1523 $c->ByteOrder('LittleEndian') if $c->IntSize < 4;
1524
1525 (Yes, the example doesn't make very much sense... ;-)
1526
1527 However, you should keep in mind that configuration methods
1528 that can take lists (namely "Include", "Define" and "Assert",
but not "DisabledKeywords") may behave slightly differently from
their "configure" equivalents. If you pass these methods a
1531 single argument that is an array reference, the current list
1532 will be replaced by the new one, which is just the behaviour of
1533 the corresponding "configure" call. So the following are
1534 equivalent:
1535
1536 $c->configure(Define => ['foo', 'bar=123']);
1537 $c->Define(['foo', 'bar=123']);
1538
1539 But if you pass a list of strings instead of an array reference
1540 (which cannot be done when using "configure"), the new list
1541 items are appended to the current list, so
1542
1543 $c = Convert::Binary::C->new(Include => ['/include']);
1544 $c->Include('/usr/include', '/usr/local/include');
1545 print Dumper($c->Include);
1546
1547 $c->Include(['/usr/local/include']);
1548 print Dumper($c->Include);
1549
1550 will first print all three include paths, but finally only
1551 "/usr/local/include" will be configured:
1552
1553 $VAR1 = [
1554 '/include',
1555 '/usr/include',
1556 '/usr/local/include'
1557 ];
1558 $VAR1 = [
1559 '/usr/local/include'
1560 ];
1561
1562 Furthermore, configuration methods can be chained together, as
1563 they return a reference to their object if called as a set
1564 method. So, if you like, you can configure your object like
1565 this:
1566
1567 $c = Convert::Binary::C->new(IntSize => 4)
1568 ->Define(qw( __DEBUG__ DB_LEVEL=3 ))
1569 ->ByteOrder('BigEndian');
1570
1571 $c->configure(EnumType => 'Both', Alignment => 4)
1572 ->Include('/usr/include', '/usr/local/include');
1573
1574 In the example above, "qw( ... )" is the word list quoting
1575 operator. It returns a list of all non-whitespace sequences,
1576 and is especially useful for configuring preprocessor defines
1577 or assertions. The following assignments are equivalent:
1578
1579 @array = ('one', 'two', 'three');
1580 @array = qw(one two three);
1581
1582 You can configure the following options. Unknown options, as
1583 well as invalid values for an option, will cause the object to
1584 throw exceptions.
1585
1586 "IntSize" => 0 | 1 | 2 | 4 | 8
1587 Set the number of bytes that are occupied by an integer.
1588 This is in most cases 2 or 4. If you set it to zero, the
1589 size of an integer on the host system will be used. This is
1590 also the default unless overridden by
1591 "CBC_DEFAULT_INT_SIZE" at compile time.
1592
1593 "CharSize" => 0 | 1 | 2 | 4 | 8
1594 Set the number of bytes that are occupied by a "char".
1595 This rarely needs to be changed, except for some platforms
1596 that don't care about bytes, for example DSPs. If you set
1597 this to zero, the size of a "char" on the host system will
1598 be used. This is also the default unless overridden by
1599 "CBC_DEFAULT_CHAR_SIZE" at compile time.
1600
1601 "ShortSize" => 0 | 1 | 2 | 4 | 8
1602 Set the number of bytes that are occupied by a short
1603 integer. Although integers explicitly declared as "short"
1604 should be always 16 bit, there are compilers that make a
1605 short 8 bit wide. If you set it to zero, the size of a
1606 short integer on the host system will be used. This is also
1607 the default unless overridden by "CBC_DEFAULT_SHORT_SIZE"
1608 at compile time.
1609
1610 "LongSize" => 0 | 1 | 2 | 4 | 8
1611 Set the number of bytes that are occupied by a long
1612 integer. If set to zero, the size of a long integer on the
1613 host system will be used. This is also the default unless
1614 overridden by "CBC_DEFAULT_LONG_SIZE" at compile time.
1615
1616 "LongLongSize" => 0 | 1 | 2 | 4 | 8
1617 Set the number of bytes that are occupied by a long long
1618 integer. If set to zero, the size of a long long integer on
1619 the host system, or 8, will be used. This is also the
1620 default unless overridden by "CBC_DEFAULT_LONG_LONG_SIZE"
1621 at compile time.
1622
1623 "FloatSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1624 Set the number of bytes that are occupied by a single
1625 precision floating point value. If you set it to zero, the
1626 size of a "float" on the host system will be used. This is
1627 also the default unless overridden by
1628 "CBC_DEFAULT_FLOAT_SIZE" at compile time. For details on
1629 floating point support, see "FLOATING POINT VALUES".
1630
1631 "DoubleSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1632 Set the number of bytes that are occupied by a double
1633 precision floating point value. If you set it to zero, the
1634 size of a "double" on the host system will be used. This is
1635 also the default unless overridden by
1636 "CBC_DEFAULT_DOUBLE_SIZE" at compile time. For details on
1637 floating point support, see "FLOATING POINT VALUES".
1638
1639 "LongDoubleSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
Set the number of bytes that are occupied by a "long double"
floating point value. If you set it to zero, the size of a
"long double" on the host system, or 12, will be used. This
is also the default unless overridden by
1644 "CBC_DEFAULT_LONG_DOUBLE_SIZE" at compile time. For details
1645 on floating point support, see "FLOATING POINT VALUES".
1646
1647 "PointerSize" => 0 | 1 | 2 | 4 | 8
1648 Set the number of bytes that are occupied by a pointer.
1649 This is in most cases 2 or 4. If you set it to zero, the
1650 size of a pointer on the host system will be used. This is
1651 also the default unless overridden by
1652 "CBC_DEFAULT_PTR_SIZE" at compile time.
1653
1654 "EnumSize" => -1 | 0 | 1 | 2 | 4 | 8
1655 Set the number of bytes that are occupied by an enumeration
1656 type. On most systems, this is equal to the size of an
1657 integer, which is also the default. However, for some
1658 compilers, the size of an enumeration type depends on the
1659 size occupied by the largest enumerator. So the size may
1660 vary between 1 and 8. If you have
1661
1662 enum foo {
1663 ONE = 100, TWO = 200
1664 };
1665
1666 this will occupy one byte because the enum can be
1667 represented as an unsigned one-byte value. However,
1668
1669 enum foo {
1670 ONE = -100, TWO = 200
1671 };
1672
1673 will occupy two bytes, because the -100 forces the type to
1674 be signed, and 200 doesn't fit into a signed one-byte
1675 value. Therefore, the type used is a signed two-byte
1676 value. If this is the behaviour you need, set the EnumSize
1677 to 0.
1678
1679 Some compilers try to follow this strategy, but don't care
1680 whether the enumeration has signed values or not. They
1681 always declare an enum as signed. On such a compiler, given
1682
1683 enum one { ONE = -100, TWO = 100 };
1684 enum two { ONE = 100, TWO = 200 };
1685
1686 enum "one" will occupy only one byte, while enum "two" will
occupy two bytes, even though it could be represented by an
unsigned one-byte value. If this is the behaviour of your
1689 compiler, set EnumSize to "-1".
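
The two strategies can be summed up in a few lines of Perl. This is a
model of the rules as described above, not the module's own code, and
"enum_size" is a made-up helper name. Its first argument is the
"EnumSize" setting (0 or -1), followed by the enumerator values.

```perl
use strict;
use warnings;

sub enum_size
{
  my($mode, @values) = @_;
  my($min, $max) = (sort { $a <=> $b } @values)[0, -1];

  # EnumSize -1: always signed; EnumSize 0: signed only if needed
  my $signed = $mode == -1 || $min < 0;

  for my $bytes (1, 2, 4, 8) {
    my $bits = 8 * $bytes;
    if ($signed) {
      return $bytes if $min >= -(2 ** ($bits - 1))
                    && $max <= 2 ** ($bits - 1) - 1;
    }
    else {
      return $bytes if $max <= 2 ** $bits - 1;
    }
  }

  die "values do not fit into 8 bytes\n";
}

print enum_size( 0,  100, 200), "\n";   # 1
print enum_size( 0, -100, 200), "\n";   # 2
print enum_size(-1, -100, 100), "\n";   # 1
print enum_size(-1,  100, 200), "\n";   # 2
```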
1690
1691 "Alignment" => 0 | 1 | 2 | 4 | 8 | 16
1692 Set the struct member alignment. This option controls where
1693 padding bytes are inserted between struct members. It
1694 globally sets the alignment for all structs/unions.
1695 However, this can be overridden from within the source code
1696 with the common "pack" pragma as explained in "Supported
1697 pragma directives". The default alignment is 1, which
1698 means no padding bytes are inserted. A setting of 0 means
1699 native alignment, i.e. the alignment of the system that
1700 Convert::Binary::C has been compiled on. You can determine
1701 the native properties using the "native" function.
1702
1703 The "Alignment" option is similar to the "-Zp[n]" option of
1704 the Intel compiler. It globally specifies the maximum
1705 boundary to which struct members are aligned. Consider the
1706 following structure and the sizes of "char", "short",
1707 "long" and "double" being 1, 2, 4 and 8, respectively.
1708
1709 struct align {
1710 char a;
1711 short b, c;
1712 long d;
1713 double e;
1714 };
1715
1716 With an alignment of 1 (the default), the struct members
1717 would be packed tightly:
1718
    0   1   2   3   4   5   6   7   8   9   10  11  12
    +---+---+---+---+---+---+---+---+---+---+---+---+
    | a |   b   |   c   |       d       |          ...
    +---+---+---+---+---+---+---+---+---+---+---+---+

        12  13  14  15  16  17
        +---+---+---+---+---+
    ...            e        |
        +---+---+---+---+---+
1728
1729 With an alignment of 2, the struct members larger than one
1730 byte would be aligned to 2-byte boundaries, which results
1731 in a single padding byte between "a" and "b".
1732
    0   1   2   3   4   5   6   7   8   9   10  11  12
    +---+---+---+---+---+---+---+---+---+---+---+---+
    | a | * |   b   |   c   |       d       |      ...
    +---+---+---+---+---+---+---+---+---+---+---+---+

        12  13  14  15  16  17  18
        +---+---+---+---+---+---+
    ...              e          |
        +---+---+---+---+---+---+
1742
1743 With an alignment of 4, the struct members of size 2 would
1744 be aligned to 2-byte boundaries and larger struct members
1745 would be aligned to 4-byte boundaries:
1746
    0   1   2   3   4   5   6   7   8   9   10  11  12
    +---+---+---+---+---+---+---+---+---+---+---+---+
    | a | * |   b   |   c   | * | * |       d       | ...
    +---+---+---+---+---+---+---+---+---+---+---+---+

        12  13  14  15  16  17  18  19  20
        +---+---+---+---+---+---+---+---+
    ... |               e               |
        +---+---+---+---+---+---+---+---+
1756
1757 This layout of the struct members allows the compiler to
1758 generate optimized code because aligned members can be
1759 accessed more easily by the underlying architecture.
1760
1761 Finally, setting the alignment to 8 will align "double"s to
1762 8-byte boundaries:
1763
    0   1   2   3   4   5   6   7   8   9   10  11  12
    +---+---+---+---+---+---+---+---+---+---+---+---+
    | a | * |   b   |   c   | * | * |       d       | ...
    +---+---+---+---+---+---+---+---+---+---+---+---+

        12  13  14  15  16  17  18  19  20  21  22  23  24
        +---+---+---+---+---+---+---+---+---+---+---+---+
    ... | * | * | * | * |               e               |
        +---+---+---+---+---+---+---+---+---+---+---+---+
1773
Further increasing the alignment does not alter the layout
of our structure, as only members larger than 8 bytes would
be affected.
1777
1778 The alignment of a structure depends on its largest member
1779 and on the setting of the "Alignment" option. With
1780 "Alignment" set to 2, a structure holding a "long" would be
1781 aligned to a 2-byte boundary, while a structure containing
1782 only "char"s would have no alignment restrictions.
1783 (Unfortunately, that's not the whole story. See the
1784 "CompoundAlignment" option for details.)
1785
1786 Here's another example. Assuming 8-byte alignment, the
1787 following two structs will both have a size of 16 bytes:
1788
1789 struct one {
1790 char c;
1791 double d;
1792 };
1793
1794 struct two {
1795 double d;
1796 char c;
1797 };
1798
This is clear for "struct one", because the member "d" has
to be aligned to an 8-byte boundary, and thus 7 padding
bytes are inserted after "c". But for "struct two", the
padding bytes are inserted at the end of the structure,
which may not seem to make sense at first. However, it
makes perfect sense if you think about an array of "struct
two". Each "double" has to be aligned to an 8-byte
boundary, and thus each array element has to occupy
16 bytes. With that in mind, it would be strange if a
"struct two" variable had a different size. And it
would make the widely used construct
1810
1811 struct two array[] = { {1.0, 0}, {2.0, 1} };
1812 int elements = sizeof(array) / sizeof(struct two);
1813
1814 impossible.
1815
1816 The alignment behaviour described here seems to be common
1817 for all compilers. However, not all compilers have an
1818 option to configure their default alignment.
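
The layouts of "struct align" and the sizes of "struct one" and
"struct two" described above can be reproduced with a few lines of
arithmetic. The following is a simplified model of these rules, not
the module's implementation. The helper name "layout" is made up, and
it only handles members of basic types, whose natural alignment is
assumed to equal their size.

```perl
use strict;
use warnings;

# Members are [name, size] pairs of basic types.
sub layout
{
  my($alignment, @members) = @_;
  my($offset, $struct_align, %at) = (0, 1);

  for my $m (@members) {
    my($name, $size) = @$m;
    my $align = $size < $alignment ? $size : $alignment;
    $offset += ($align - $offset % $align) % $align;   # padding before member
    $at{$name} = $offset;
    $offset += $size;
    $struct_align = $align if $align > $struct_align;
  }

  # tail padding, so that array elements stay aligned
  $offset += ($struct_align - $offset % $struct_align) % $struct_align;

  return (\%at, $offset);
}

my @align = ([a => 1], [b => 2], [c => 2], [d => 4], [e => 8]);

for my $alignment (1, 2, 4, 8) {
  my($at, $size) = layout($alignment, @align);
  my @offsets = map { my $n = $_->[0]; "$n=$at->{$n}" } @align;
  printf "Alignment %d: %s, size %d\n", $alignment, "@offsets", $size;
}

printf "struct one: %d bytes\n", (layout(8, [c => 1], [d => 8]))[1];
printf "struct two: %d bytes\n", (layout(8, [d => 8], [c => 1]))[1];
```

For "struct align" this yields sizes of 17, 18, 20 and 24 bytes for
alignments of 1, 2, 4 and 8, and 16 bytes for both "struct one" and
"struct two".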
1819
1820 "CompoundAlignment" => 0 | 1 | 2 | 4 | 8 | 16
1821 Usually, the alignment of a compound (i.e. a "struct" or a
1822 "union") depends only on its largest member and on the
1823 setting of the "Alignment" option. There are, however,
1824 architectures and compilers where compounds can have
1825 different alignment constraints.
1826
1827 For most platforms and compilers, the alignment constraint
1828 for compounds is 1 byte. That is, on most platforms
1829
1830 struct onebyte {
1831 char byte;
1832 };
1833
1834 will have an alignment of 1 and also a size of 1. But if
1835 you take an ARM architecture, the above "struct onebyte"
1836 will have an alignment of 4, and thus also a size of 4.
1837
1838 You can configure this by setting "CompoundAlignment" to 4.
1839 This will ensure that the alignment of compounds is always
1840 4.
1841
1842 Setting "CompoundAlignment" to 0 means native compound
1843 alignment, i.e. the compound alignment of the system that
1844 Convert::Binary::C has been compiled on. You can determine
1845 the native properties using the "native" function.
1846
1847 There are also compilers for certain platforms that allow
1848 you to adjust the compound alignment. If you're not aware
1849 of the fact that your compiler/architecture has a compound
1850 alignment other than 1, strange things can happen. If, for
1851 example, the compound alignment is 2 and you have something
1852 like
1853
1854 typedef unsigned char U8;
1855
1856 struct msg_head {
1857 U8 cmd;
1858 struct {
1859 U8 hi;
1860 U8 low;
1861 } crc16;
1862 U8 len;
1863 };
1864
1865 there will be one padding byte inserted before the embedded
1866 "crc16" struct and after the "len" member, which is most
1867 probably not what was intended:
1868
    0     1     2     3     4     5     6
    +-----+-----+-----+-----+-----+-----+
    | cmd |  *  | hi  | low | len |  *  |
    +-----+-----+-----+-----+-----+-----+
1873
1874 Note that both "#pragma pack" and the "Alignment" option
1875 can override "CompoundAlignment". If you set
1876 "CompoundAlignment" to 4, but "Alignment" to 2, compounds
1877 will actually be aligned on 2-byte boundaries.
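
The effect of the compound alignment on "struct msg_head" can be worked
out by hand. The sketch below is a plain arithmetic model of the layout
described above; the helper names "align_up" and "msg_head_layout" are
made up, and "U8" is assumed to be one byte.

```perl
use strict;
use warnings;

# Round an offset up to the next multiple of $align.
sub align_up
{
  my($offset, $align) = @_;
  return $offset + ($align - $offset % $align) % $align;
}

sub msg_head_layout
{
  my $compound_alignment = shift;
  my($pos, %at) = (0);

  $at{cmd}   = $pos;  $pos += 1;                 # U8 cmd
  $pos = align_up($pos, $compound_alignment);    # padding before the struct
  $at{crc16} = $pos;  $pos += 2;                 # hi and low, one byte each
  $at{len}   = $pos;  $pos += 1;                 # U8 len
  $at{size}  = align_up($pos, $compound_alignment);   # tail padding

  return \%at;
}

for my $ca (1, 2) {
  my $l = msg_head_layout($ca);
  print "CompoundAlignment $ca: cmd=$l->{cmd} crc16=$l->{crc16} ",
        "len=$l->{len} size=$l->{size}\n";
}
```

With a compound alignment of 1 the struct is 4 bytes long; with 2 it
grows to 6 bytes, as in the diagram above.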
1878
1879 "ByteOrder" => 'BigEndian' | 'LittleEndian'
1880 Set the byte order for integers larger than a single byte.
1881 Little endian (Intel, least significant byte first) and big
1882 endian (Motorola, most significant byte first) byte order
1883 are supported. The default byte order is the same as the
1884 byte order of the host system unless overridden by
1885 "CBC_DEFAULT_BYTEORDER" at compile time.
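
The difference between the two byte orders is easy to see with the
built-in "pack", which has the fixed-order templates "N" (big endian)
and "V" (little endian) for 32-bit values. This short sketch is
independent of the module:

```perl
use strict;
use warnings;

my $value = 0x12345678;

my $big    = join ' ', unpack('(H2)*', pack('N', $value));
my $little = join ' ', unpack('(H2)*', pack('V', $value));

print "BigEndian:    $big\n";      # 12 34 56 78
print "LittleEndian: $little\n";   # 78 56 34 12
```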
1886
1887 "EnumType" => 'Integer' | 'String' | 'Both'
1888 This option controls the type that enumeration constants
1889 will have in data structures returned by the "unpack"
1890 method. If you have the following definitions:
1891
1892 typedef enum {
1893 SUNDAY, MONDAY, TUESDAY, WEDNESDAY,
1894 THURSDAY, FRIDAY, SATURDAY
1895 } Weekday;
1896
1897 typedef enum {
1898 JANUARY, FEBRUARY, MARCH, APRIL, MAY, JUNE, JULY,
1899 AUGUST, SEPTEMBER, OCTOBER, NOVEMBER, DECEMBER
1900 } Month;
1901
1902 typedef struct {
1903 int year;
1904 Month month;
1905 int day;
1906 Weekday weekday;
1907 } Date;
1908
1909 and a byte string that holds a packed Date struct, then
1910 you'll get the following results from a call to the
1911 "unpack" method.
1912
1913 "Integer"
Enumeration constants are returned as plain integers.
This is fast, but may not be very useful. It is also
the default.
1917
1918 $date = {
1919 'year' => 2002,
1920 'month' => 0,
1921 'day' => 7,
1922 'weekday' => 1
1923 };
1924
1925 "String"
1926 Enumeration constants are returned as strings. This
1927 will create a string constant for every unpacked
1928 enumeration constant and thus consumes more time and
1929 memory. However, the result may be more useful.
1930
1931 $date = {
1932 'year' => 2002,
1933 'month' => 'JANUARY',
1934 'day' => 7,
1935 'weekday' => 'MONDAY'
1936 };
1937
1938 "Both"
1939 Enumeration constants are returned as double typed
1940 scalars. If evaluated in string context, the
1941 enumeration constant will be a string, if evaluated in
1942 numeric context, the enumeration constant will be an
1943 integer.
1944
1945 $date = $c->EnumType('Both')->unpack('Date', $binary);
1946
1947 printf "Weekday = %s (%d)\n\n", $date->{weekday},
1948 $date->{weekday};
1949
1950 if ($date->{month} == 0) {
1951 print "It's $date->{month}, happy new year!\n\n";
1952 }
1953
1954 print Dumper($date);
1955
1956 This will print:
1957
1958 Weekday = MONDAY (1)
1959
1960 It's JANUARY, happy new year!
1961
1962 $VAR1 = {
1963 'year' => 2002,
1964 'month' => 'JANUARY',
1965 'day' => 7,
1966 'weekday' => 'MONDAY'
1967 };
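If double typed scalars are new to you, the following sketch shows
how such a value behaves in plain Perl. It uses "dualvar" from the
core Scalar::Util module purely as an illustration; it is an
assumption, not a statement about how Convert::Binary::C creates
these scalars internally.

```perl
use strict;
use warnings;
use Scalar::Util qw( dualvar );

# A scalar that is 1 in numeric context and 'MONDAY' in string context.
my $weekday = dualvar(1, 'MONDAY');

printf "Weekday = %s (%d)\n", $weekday, $weekday;

print "numeric comparison matches\n" if $weekday == 1;
print "string comparison matches\n"  if $weekday eq 'MONDAY';
```

This is why the same hash value can be compared with "==" and
interpolated into a string without any explicit conversion.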
1968
1969 "DisabledKeywords" => [ KEYWORDS ]
This option allows you to selectively deactivate certain
keywords in the C parser. Some C compilers don't have the
complete ANSI keyword set; for example, they may not
recognize the keywords "const" or "void". If you do
1974
1975 typedef int void;
1976
1977 on such a compiler, this will usually be ok. But if you
1978 parse this with an ANSI compiler, it will be a syntax
1979 error. To parse the above code correctly, you have to
1980 disable the "void" keyword in the Convert::Binary::C
1981 parser:
1982
1983 $c->DisabledKeywords([qw( void )]);
1984
1985 By default, the Convert::Binary::C parser will recognize
1986 the keywords "inline" and "restrict". If your compiler
1987 doesn't have these new keywords, it usually doesn't matter.
1988 Only if you're using the keywords as identifiers, like in
1989
1990 typedef struct inline {
1991 int a, b;
1992 } restrict;
1993
1994 you'll have to disable these ISO-C99 keywords:
1995
1996 $c->DisabledKeywords([qw( inline restrict )]);
1997
1998 The parser allows you to disable the following keywords:
1999
2000 asm
2001 auto
2002 const
2003 double
2004 enum
2005 extern
2006 float
2007 inline
2008 long
2009 register
2010 restrict
2011 short
2012 signed
2013 static
2014 unsigned
2015 void
2016 volatile
2017
2018 "KeywordMap" => { KEYWORD => TOKEN, ... }
2019 This option allows you to add new keywords to the parser.
2020 These new keywords can either be mapped to existing tokens
2021 or simply ignored. For example, recent versions of the GNU
2022 compiler recognize the keywords "__signed__" and
2023 "__extension__". The first one obviously is a synonym for
2024 "signed", while the second one is only a marker for a
2025 language extension.
2026
2027 Using the preprocessor, you could of course do the
2028 following:
2029
2030 $c->Define(qw( __signed__=signed __extension__= ));
2031
2032 However, the preprocessor symbols could be undefined or
2033 redefined in the code, and
2034
2035 #ifdef __signed__
2036 # undef __signed__
2037 #endif
2038
2039 typedef __extension__ __signed__ long long s_quad;
2040
2041 would generate a parse error, because "__signed__" is an
2042 unexpected identifier.
2043
2044 Instead of utilizing the preprocessor, you'll have to
2045 create mappings for the new keywords directly in the parser
2046 using "KeywordMap". In the above example, you want to map
2047 "__signed__" to the built-in C keyword "signed" and ignore
2048 "__extension__". This could be done with the following
2049 code:
2050
2051 $c->KeywordMap({ __signed__ => 'signed',
2052 __extension__ => undef });
2053
2054 You can specify any valid identifier as hash key, and
2055 either a valid C keyword or "undef" as hash value. Having
2056 configured the object that way, you could parse even
2057
2058 #ifdef __signed__
2059 # undef __signed__
2060 #endif
2061
2062 typedef __extension__ __signed__ long long s_quad;
2063
2064 without problems.
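Put together, the mapping can be checked with a short script. This
is a sketch only; "LongLongSize" is set explicitly so that the
reported size does not depend on the host configuration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(LongLongSize => 8);

# Map __signed__ to the 'signed' token and ignore __extension__.
$c->KeywordMap({ __signed__    => 'signed',
                 __extension__ => undef });

$c->parse(<<'ENDC');
#ifdef __signed__
# undef __signed__
#endif

typedef __extension__ __signed__ long long s_quad;
ENDC

my $kind = $c->def('s_quad');
my $size = $c->sizeof('s_quad');
print "s_quad is a $kind of $size bytes\n";
```

Because the mapping lives in the parser rather than in the
preprocessor, the "#undef" in the parsed code has no effect on it.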
2065
Note that "KeywordMap" and "DisabledKeywords" work together
perfectly. You could, for example, disable the "signed"
keyword, but still have "__signed__" mapped to the original
"signed" token:
2070
2071 $c->configure(DisabledKeywords => [ 'signed' ],
2072 KeywordMap => { __signed__ => 'signed' });
2073
2074 This would allow you to define
2075
2076 typedef __signed__ long signed;
2077
2078 which would normally be a syntax error because "signed"
2079 cannot be used as an identifier.
2080
2081 "UnsignedChars" => 0 | 1
2082 Use this boolean option if you want characters to be
2083 unsigned if specified without an explicit "signed" or
2084 "unsigned" type specifier. By default, characters are
2085 signed.
2086
2087 "UnsignedBitfields" => 0 | 1
2088 Use this boolean option if you want bitfields to be
2089 unsigned if specified without an explicit "signed" or
2090 "unsigned" type specifier. By default, bitfields are
2091 signed.
2092
2093 "Warnings" => 0 | 1
2094 Use this boolean option if you want warnings to be issued
2095 during the parsing of source code. Currently, warnings are
2096 only reported by the preprocessor, so don't expect the
2097 output to cover everything.
2098
2099 By default, warnings are turned off and only errors will be
2100 reported. However, even these errors are turned off if you
2101 run without the "-w" flag.
2102
2103 "HasCPPComments" => 0 | 1
2104 Use this option to turn C++ comments on or off. By default,
2105 C++ comments are enabled. Disabling C++ comments may be
2106 necessary if your code includes strange things like:
2107
2108 one = 4 //* <- divide */ 4;
2109 two = 2;
2110
2111 With C++ comments, the above will be interpreted as
2112
2113 one = 4
2114 two = 2;
2115
2116 which will obviously be a syntax error, but without C++
2117 comments, it will be interpreted as
2118
2119 one = 4 / 4;
2120 two = 2;
2121
2122 which is correct.
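The difference can be made visible with an enumeration whose
initializer contains such a construct. The following is a sketch;
the enum name "numbers" is made up, and the code is arranged so that
it parses with either setting.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Parses either way: as "one = 4" or as "one = 4 / 4".
my $code = <<'ENDC';
enum numbers {
  one = 4 //* <- divide */ 4
  , two = 2
};
ENDC

my %one;
for my $cpp (1, 0) {
    my $c = Convert::Binary::C->new(HasCPPComments => $cpp)->parse($code);
    $one{$cpp} = $c->enum('numbers')->{enumerators}{one};
    print "HasCPPComments => $cpp: one = $one{$cpp}\n";
}
```

With C++ comments enabled, everything after "//" is dropped and
"one" should be reported as 4. With C++ comments disabled, the
division is seen and "one" should be reported as 1.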
2123
2124 "HasMacroVAARGS" => 0 | 1
2125 Use this option to turn the "__VA_ARGS__" macro expansion
2126 on or off. If this is enabled (which is the default), you
2127 can use variable length argument lists in your preprocessor
2128 macros.
2129
2130 #define DEBUG( ... ) fprintf( stderr, __VA_ARGS__ )
2131
2132 There's normally no reason to turn that feature off.
2133
2134 "StdCVersion" => undef | INTEGER
2135 Use this option to change the value of the preprocessor's
2136 predefined "__STDC_VERSION__" macro. When set to "undef",
2137 the macro will not be defined.
2138
2139 "HostedC" => undef | 0 | 1
2140 Use this option to change the value of the preprocessor's
2141 predefined "__STDC_HOSTED__" macro. When set to "undef",
2142 the macro will not be defined.
2143
2144 "Include" => [ INCLUDES ]
2145 Use this option to set the include path for the internal
2146 preprocessor. The option value is a reference to an array
2147 of strings, each string holding a directory that should be
2148 searched for includes.
2149
2150 "Define" => [ DEFINES ]
2151 Use this option to define symbols in the preprocessor. The
2152 option value is, again, a reference to an array of strings.
2153 Each string can be either just a symbol or an assignment to
2154 a symbol. This is completely equivalent to what the "-D"
2155 option does for most preprocessors.
2156
2157 The following will define the symbol "FOO" and define "BAR"
2158 to be 12345:
2159
2160 $c->configure(Define => [qw( FOO BAR=12345 )]);
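A short sketch of how such definitions become visible to the parsed
code follows. The struct "packet" and its member "payload" are
hypothetical names chosen for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;
$c->configure(Define => [qw( FOO BAR=12345 )]);

# FOO guards the definition, BAR supplies the array length.
$c->parse(<<'ENDC');
#ifdef FOO
struct packet { char payload[BAR]; };
#endif
ENDC

my $size = $c->sizeof('packet');
print "sizeof(packet) = $size\n";
```

Since a "char" occupies one byte, the reported size equals the value
that "BAR" was defined to.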
2161
2162 "Assert" => [ ASSERTIONS ]
Use this option to make assertions in the preprocessor. If
you don't know what assertions are, don't be concerned,
since they're deprecated anyway. They are, however, used in
some systems' include files. The value is an array
reference, just like for the macro definitions. Only the
way the assertions are defined is a bit different and
mimics the way they are defined with the "#assert"
directive:
2171
2172 $c->configure(Assert => ['foo(bar)']);
2173
2174 "OrderMembers" => 0 | 1
When using "unpack" on compounds and iterating over the
returned hash, the order of the compound members is
generally not preserved due to the nature of hash tables.
It is not even guaranteed that the order is the same
between different runs of the same program. This can be
very annoying if you simply use Data::Dumper to dump your
data structures and the compound members always show up in
a different order.
2183
2184 By setting "OrderMembers" to a non-zero value, all hashes
2185 returned by "unpack" are tied to a class that preserves the
2186 order of the hash keys. This way, all compound members
2187 will be returned in the correct order just as they are
2188 defined in your C code.
2189
2190 use Convert::Binary::C;
2191 use Data::Dumper;
2192
2193 $c = Convert::Binary::C->new->parse(<<'ENDC');
2194 struct test {
2195 char one;
2196 char two;
2197 struct {
2198 char never;
2199 char change;
2200 char this;
2201 char order;
2202 } three;
2203 char four;
2204 };
2205 ENDC
2206
2207 $data = "Convert";
2208
2209 $u1 = $c->unpack('test', $data);
2210 $c->OrderMembers(1);
2211 $u2 = $c->unpack('test', $data);
2212
2213 print Data::Dumper->Dump([$u1, $u2], [qw(u1 u2)]);
2214
2215 This will print something like:
2216
2217 $u1 = {
2218 'one' => 67,
2219 'two' => 111,
2220 'three' => {
2221 'never' => 110,
2222 'change' => 118,
2223 'this' => 101,
2224 'order' => 114
2225 },
2226 'four' => 116
2227 };
2228 $u2 = {
2229 'one' => 67,
2230 'two' => 111,
2231 'three' => {
2232 'never' => 110,
2233 'change' => 118,
2234 'this' => 101,
2235 'order' => 114
2236 },
2237 'four' => 116
2238 };
2239
2240 To be able to use this option, you have to install one of
2241 the following modules: Tie::Hash::Indexed, Hash::Ordered or
2242 Tie::IxHash. If more than one of these modules is
2243 installed, Convert::Binary::C will use them in that order
2244 of preference.
2245
2246 When using this option, you should keep in mind that tied
2247 hashes are significantly slower and consume more memory
2248 than ordinary hashes, even when the class they're tied to
2249 is implemented efficiently. So don't turn this option on if
2250 you don't have to.
2251
2252 You can also influence hash member ordering by using the
2253 "CBC_ORDER_MEMBERS" environment variable.
2254
2255 "Bitfields" => { OPTION => VALUE, ... }
Use this option to specify and configure a bitfield
layout engine. You can choose an engine by passing its
name to the "Engine" option, like:
2259
2260 $c->configure(Bitfields => { Engine => 'Generic' });
2261
2262 Each engine can have its own set of options, although
2263 currently none of them does.
2264
2265 You can choose between the following bitfield engines:
2266
2267 "Generic"
2268 This engine implements the behaviour of most UNIX C
2269 compilers, including GCC. It does not handle packed
2270 bitfields yet.
2271
2272 "Microsoft"
2273 This engine implements the behaviour of Microsoft's
2274 "cl" compiler. It should be fairly complete and can
2275 handle packed bitfields.
2276
2277 "Simple"
2278 This engine is only used for testing the bitfield
2279 infrastructure in Convert::Binary::C. There's usually
2280 no reason to use it.
2281
2282 You can reconfigure all options even after you have parsed some
2283 code. The changes will be applied to the already parsed
2284 definitions. This works as long as array lengths are not
2285 affected by the changes. If you have Alignment and IntSize set
2286 to 4 and parse code like this
2287
2288 typedef struct {
2289 char abc;
2290 int day;
2291 } foo;
2292
2293 struct bar {
2294 foo zap[2*sizeof(foo)];
2295 };
2296
2297 the array "zap" in "struct bar" will obviously have 16
2298 elements. If you reconfigure the alignment to 1 now, the size
2299 of "foo" is now 5 instead of 8. While the alignment is adjusted
2300 correctly, the number of elements in array "zap" will still be
2301 16 and will not be changed to 10.
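The behaviour described above can be observed with "sizeof". This
sketch parses the code from the example and compares the sizes
before and after the alignment is changed.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(Alignment => 4, IntSize => 4);
$c->parse(<<'ENDC');
typedef struct {
  char abc;
  int  day;
} foo;

struct bar {
  foo zap[2*sizeof(foo)];
};
ENDC

my @before = ($c->sizeof('foo'), $c->sizeof('bar'));

$c->Alignment(1);   # reconfigure after parsing

my @after = ($c->sizeof('foo'), $c->sizeof('bar'));

printf "Alignment 4: foo = %d, bar = %d\n", @before;
printf "Alignment 1: foo = %d, bar = %d\n", @after;
```

The array length of 16 is fixed at parse time, so the size of
"struct bar" goes from 16 * 8 = 128 to 16 * 5 = 80 bytes rather
than to 10 * 5 = 50 bytes.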
2302
2303 parse
2304 "parse" CODE
2305 Parses a string of valid C code. All enumeration, compound and
2306 type definitions are extracted. You can call the "parse" and
2307 "parse_file" methods as often as you like to add further
2308 definitions to the Convert::Binary::C object.
2309
2310 "parse" will throw an exception if an error occurs. On
2311 success, the method returns a reference to its object.
2312
2313 See "Parsing C code" for an example.
2314
2315 parse_file
2316 "parse_file" FILE
2317 Parses a C source file. All enumeration, compound and type
2318 definitions are extracted. You can call the "parse" and
2319 "parse_file" methods as often as you like to add further
2320 definitions to the Convert::Binary::C object.
2321
2322 "parse_file" will search the include path given via the
2323 "Include" option for the file if it cannot find it in the
2324 current directory.
2325
2326 "parse_file" will throw an exception if an error occurs. On
2327 success, the method returns a reference to its object.
2328
2329 See "Parsing C code" for an example.
2330
2331 When calling "parse" or "parse_file" multiple times, you may
2332 use types previously defined, but you are not allowed to
2333 redefine types. The state of the preprocessor is also saved, so
2334 you may also use defines from a previous parse. This works only
2335 as long as the preprocessor is not reset. See "Preprocessor
2336 configuration" for details.
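The following sketch shows both points: a second call to "parse"
using a type from the first call, and an "eval" block catching the
exception thrown for invalid code. The type names are made up for
illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize => 4);

# First call defines a typedef, second call uses it.
$c->parse('typedef unsigned int u32;');
$c->parse('struct header { u32 magic; u32 length; };');

print "sizeof(header) = ", $c->sizeof('header'), "\n";

# A missing closing brace is a syntax error, so parse throws.
my $ok = eval { $c->parse('struct broken { int x;'); 1 };
print "parse failed: $@" unless $ok;
```

Wrapping the call in "eval" keeps a single bad input from
terminating the whole program.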
2337
2338 When you're parsing C source files instead of C header files,
2339 note that local definitions are ignored. This means that type
2340 definitions hidden within functions will not be recognized by
2341 Convert::Binary::C. This is necessary because different
2342 functions (even different blocks within the same function) can
2343 define types with the same name:
2344
2345 void my_func(int i)
2346 {
2347 if (i < 10)
2348 {
2349 enum digit { ONE, TWO, THREE } x = ONE;
2350 printf("%d, %d\n", i, x);
2351 }
2352 else
2353 {
2354 enum digit { THREE, TWO, ONE } x = ONE;
2355 printf("%d, %d\n", i, x);
2356 }
2357 }
2358
2359 The above is a valid piece of C code, but it's not possible for
2360 Convert::Binary::C to distinguish between the different
2361 definitions of "enum digit", as they're only defined locally
2362 within the corresponding block.
2363
2364 clean
2365 "clean" Clears all information that has been collected during previous
2366 calls to "parse" or "parse_file". You can use this method if
2367 you want to parse some entirely different code, but with the
2368 same configuration.
2369
2370 The "clean" method returns a reference to its object.
2371
2372 clone
2373 "clone" Makes the object return an exact independent copy of itself.
2374
2375 $c = Convert::Binary::C->new(Include => ['/usr/include']);
2376 $c->parse_file('definitions.c');
2377 $clone = $c->clone;
2378
Apart from one detail, the above code is technically
equivalent to the code that follows. The detail is that
using "sourcify" and "parse" might alter the order of the
parsed data, which would make methods such as "compound"
return the definitions in a different order.
2383
2384 $c = Convert::Binary::C->new(Include => ['/usr/include']);
2385 $c->parse_file('definitions.c');
2386 $clone = Convert::Binary::C->new(%{$c->configure});
2387 $clone->parse($c->sourcify);
2388
2389 Using "clone" is just a lot faster.
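That the copy is independent can be checked by adding definitions to
the original after cloning. This is a minimal sketch with
hypothetical struct names.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c     = Convert::Binary::C->new->parse('struct point { int x; int y; };');
my $clone = $c->clone;

# Only the original object learns about 'extra'.
$c->parse('struct extra { int z; };');

for my $type (qw( point extra )) {
    my ($orig, $copy) = map { $_->def($type) } $c, $clone;
    printf "%-5s original: %-8s clone: %s\n", $type,
           defined $orig ? "'$orig'" : 'undef',
           defined $copy ? "'$copy'" : 'undef';
}
```

The clone still knows "point", but "extra" is unknown to it, so
"def" returns "undef" there.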
2390
2391 def
2392 "def" NAME
2393 "def" TYPE
2394 If you need to know if a definition for a certain type name
2395 exists, use this method. You pass it the name of an enum,
2396 struct, union or typedef, and it will return a non-empty string
2397 being either "enum", "struct", "union", or "typedef" if there's
2398 a definition for the type in question, an empty string if
2399 there's no such definition, or "undef" if the name is
2400 completely unknown. If the type can be interpreted as a basic
2401 type, "basic" will be returned.
2402
2403 If you pass in a TYPE, the output will be slightly different.
2404 If the specified member exists, the "def" method will return
2405 "member". If the member doesn't exist, or if the type cannot
2406 have members, the empty string will be returned. Again, if the
2407 name of the type is completely unknown, "undef" will be
2408 returned. This may be useful if you want to check if a certain
2409 member exists within a compound, for example.
2410
2411 use Convert::Binary::C;
2412
2413 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2414
2415 typedef struct __not not;
2416 typedef struct __not *ptr;
2417
2418 struct foo {
2419 enum bar *xxx;
2420 };
2421
2422 typedef int quad[4];
2423
2424 ENDC
2425
2426 for my $type (qw( not ptr foo bar xxx foo.xxx foo.abc xxx.yyy
2427 quad quad[3] quad[5] quad[-3] short[1] ),
2428 'unsigned long')
2429 {
2430 my $def = $c->def($type);
2431 printf "%-14s => %s\n",
2432 $type, defined $def ? "'$def'" : 'undef';
2433 }
2434
2435 The following would be returned by the "def" method:
2436
2437 not => ''
2438 ptr => 'typedef'
2439 foo => 'struct'
2440 bar => ''
2441 xxx => undef
2442 foo.xxx => 'member'
2443 foo.abc => ''
2444 xxx.yyy => undef
2445 quad => 'typedef'
2446 quad[3] => 'member'
2447 quad[5] => 'member'
2448 quad[-3] => 'member'
2449 short[1] => undef
2450 unsigned long => 'basic'
2451
2452 So, if "def" returns a non-empty string, you can safely use any
2453 other method with that type's name or with that member
2454 expression.
2455
2456 Concerning arrays, note that the index into an array doesn't
2457 need to be within the bounds of the array's definition, just
2458 like in C. In the above example, "quad[5]" and "quad[-3]" are
2459 valid members of the "quad" array, even though it is declared
2460 to have only four elements.
2461
2462 In cases where the typedef namespace overlaps with the
2463 namespace of enums/structs/unions, the "def" method will give
2464 preference to the typedef and will thus return the string
2465 "typedef". You could however force interpretation as an enum,
2466 struct or union by putting "enum", "struct" or "union" in front
2467 of the type's name.
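A short sketch of such an overlap, with a struct and a typedef that
share the hypothetical name "foo":

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Tags and typedef names live in different namespaces in C.
my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct foo { int a; };
typedef char foo;
ENDC

my %def;
for my $type ('foo', 'struct foo') {
    $def{$type} = $c->def($type);
    printf "%-10s => '%s'\n", $type, $def{$type};
}
```

The plain name is resolved to the typedef, while the prefixed name
selects the struct.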
2468
2469 defined
2470 "defined" MACRO
2471 You can use the "defined" method to find out if a certain macro
2472 is defined, just like you would use the "defined" operator of
2473 the preprocessor. For example, the following code
2474
2475 use Convert::Binary::C;
2476
2477 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2478
2479 #define ADD(a, b) ((a) + (b))
2480
2481 #if 1
2482 # define DEFINED
2483 #else
2484 # define UNDEFINED
2485 #endif
2486
2487 ENDC
2488
2489 for my $macro (qw( ADD DEFINED UNDEFINED )) {
2490 my $not = $c->defined($macro) ? '' : ' not';
2491 print "Macro '$macro' is$not defined.\n";
2492 }
2493
2494 would print:
2495
2496 Macro 'ADD' is defined.
2497 Macro 'DEFINED' is defined.
2498 Macro 'UNDEFINED' is not defined.
2499
2500 You have to keep in mind that this works only as long as the
2501 preprocessor is not reset. See "Preprocessor configuration" for
2502 details.
2503
2504 pack
2505 "pack" TYPE
2506 "pack" TYPE, DATA
2507 "pack" TYPE, DATA, STRING
2508 Use this method to pack a complex data structure into a binary
2509 string according to a type definition that has been previously
2510 parsed. DATA must be a scalar matching the type definition. C
2511 structures and unions are represented by references to Perl
2512 hashes, C arrays by references to Perl arrays.
2513
2514 use Convert::Binary::C;
2515 use Data::Dumper;
2516 use Data::Hexdumper;
2517
2518 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2519 , LongSize => 4
2520 , ShortSize => 2
2521 )
2522 ->parse(<<'ENDC');
2523 struct test {
2524 char ary[3];
2525 union {
2526 short word[2];
2527 long quad;
2528 } uni;
2529 };
2530 ENDC
2531
2532 Hashes don't have to contain a key for each compound member and
2533 arrays may be truncated:
2534
2535 $binary = $c->pack('test', { ary => [1, 2], uni => { quad => 42 } });
2536
Elements not defined in the Perl data structure will be set to
zero in the packed byte string. If you pass "undef" as the
second parameter, or simply omit it, the whole string will be
initialized with zero bytes. On success, the packed byte string
is returned.
2542
2543 print hexdump(data => $binary);
2544
2545 The above code would print:
2546
2547 0x0000 : 01 02 00 00 00 00 2A : ......*
2548
2549 You could also use "unpack" and dump the data structure.
2550
2551 $unpacked = $c->unpack('test', $binary);
2552 print Data::Dumper->Dump([$unpacked], ['unpacked']);
2553
2554 This would print:
2555
2556 $unpacked = {
2557 'ary' => [
2558 1,
2559 2,
2560 0
2561 ],
2562 'uni' => {
2563 'word' => [
2564 0,
2565 42
2566 ],
2567 'quad' => 42
2568 }
2569 };
2570
2571 If TYPE refers to a compound object, you may pack any member of
2572 that compound object. Simply add a member expression to the
2573 type name, just as you would access the member in C:
2574
2575 $array = $c->pack('test.ary', [1, 2, 3]);
2576 print hexdump(data => $array);
2577
2578 $value = $c->pack('test.uni.word[1]', 2);
2579 print hexdump(data => $value);
2580
2581 This would give you:
2582
2583 0x0000 : 01 02 03 : ...
2584 0x0000 : 00 02 : ..
2585
2586 Call "pack" with the optional STRING argument if you want to
2587 use an existing binary string to insert the data. If called in
2588 a void context, "pack" will directly modify the string you
2589 passed as the third argument. Otherwise, a copy of the string
2590 is created, and "pack" will modify and return the copy, so the
2591 original string will remain unchanged.
2592
2593 The 3-argument version may be useful if you want to change only
2594 a few members of a complex data structure without having to
2595 "unpack" everything, change the members, and then "pack" again
2596 (which could waste lots of memory and CPU cycles). So, instead
2597 of doing something like
2598
2599 $test = $c->unpack('test', $binary);
2600 $test->{uni}{quad} = 4711;
2601 $new = $c->pack('test', $test);
2602
to change the "uni.quad" member of $binary, you could simply do
either
2605
2606 $new = $c->pack('test', { uni => { quad => 4711 } }, $binary);
2607
2608 or
2609
2610 $c->pack('test', { uni => { quad => 4711 } }, $binary);
2611
where the latter directly modifies $binary. Besides this
code being a lot shorter (and perhaps even more readable), it
can be significantly faster if you're dealing with really big
data blocks.
2616
If the length of the input string is less than the size
required by the type, the string (or its copy) is extended and
the extended part is initialized to zero. If the length is
more than the size required by the type, the string is kept at
that length and the bytes beyond the type's size are left
untouched. The same applies to a returned copy.
2623
2624 $too_short = pack "C*", (1 .. 4);
2625 $too_long = pack "C*", (1 .. 20);
2626
2627 $c->pack('test', { uni => { quad => 0x4711 } }, $too_short);
2628 print "too_short:\n", hexdump(data => $too_short);
2629
2630 $copy = $c->pack('test', { uni => { quad => 0x4711 } }, $too_long);
2631 print "\ncopy:\n", hexdump(data => $copy);
2632
2633 This would print:
2634
2635 too_short:
2636 0x0000 : 01 02 03 00 00 47 11 : .....G.
2637
2638 copy:
2639 0x0000 : 01 02 03 00 00 47 11 08 09 0A 0B 0C 0D 0E 0F 10 : .....G..........
2640 0x0010 : 11 12 13 14 : ....
2641
2642 unpack
2643 "unpack" TYPE, STRING
2644 Use this method to unpack a binary string and create an
2645 arbitrarily complex Perl data structure based on a previously
2646 parsed type definition.
2647
2648 use Convert::Binary::C;
2649 use Data::Dumper;
2650
2651 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2652 , LongSize => 4
2653 , ShortSize => 2
2654 )
2655 ->parse( <<'ENDC' );
2656 struct test {
2657 char ary[3];
2658 union {
2659 short word[2];
2660 long *quad;
2661 } uni;
2662 };
2663 ENDC
2664
2665 # Generate some binary dummy data
2666 $binary = pack "C*", 1 .. $c->sizeof('test');
2667
2668 On failure, e.g. if the specified type cannot be found, the
2669 method will throw an exception. On success, a reference to a
2670 complex Perl data structure is returned, which can directly be
2671 dumped using the Data::Dumper module:
2672
2673 $unpacked = $c->unpack('test', $binary);
2674 print Dumper($unpacked);
2675
2676 This would print:
2677
2678 $VAR1 = {
2679 'ary' => [
2680 1,
2681 2,
2682 3
2683 ],
2684 'uni' => {
2685 'word' => [
2686 1029,
2687 1543
2688 ],
2689 'quad' => '289644378304612875'
2690 }
2691 };
2692
2693 If TYPE refers to a compound object, you may unpack any member
2694 of that compound object. Simply add a member expression to the
2695 type name, just as you would access the member in C:
2696
2697 $binary2 = substr $binary, $c->offsetof('test', 'uni.word');
2698
2699 $unpack1 = $unpacked->{uni}{word};
2700 $unpack2 = $c->unpack('test.uni.word', $binary2);
2701
2702 print Data::Dumper->Dump([$unpack1, $unpack2], [qw(unpack1 unpack2)]);
2703
2704 You will find that the output is exactly the same for both
2705 $unpack1 and $unpack2:
2706
2707 $unpack1 = [
2708 1029,
2709 1543
2710 ];
2711 $unpack2 = [
2712 1029,
2713 1543
2714 ];
2715
2716 When "unpack" is called in list context, it will unpack as many
2717 elements as possible from STRING, including zero if STRING is
2718 not long enough.
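A minimal sketch of the list context behaviour follows. The typedef
"u16" is a made-up name, and the byte string deliberately carries
one trailing byte that does not form a complete element.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', ShortSize => 2)
                          ->parse('typedef unsigned short u16;');

# Three big-endian 16-bit values plus one stray byte (7 bytes).
my $binary = pack('n*', 1, 2, 3) . "\x04";

my @values = $c->unpack('u16', $binary);
print "unpacked: @values\n";
```

Only the three complete elements are returned; the stray byte at the
end is too short to form a fourth one.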
2719
2720 initializer
2721 "initializer" TYPE
2722 "initializer" TYPE, DATA
The "initializer" method can be used to retrieve an initializer
string for a certain TYPE. This can be useful if you have to
initialize only a couple of members in a huge compound type or
if you simply want to generate initializers automatically.
2727
2728 struct date {
2729 unsigned year : 12;
2730 unsigned month: 4;
2731 unsigned day : 5;
2732 unsigned hour : 5;
2733 unsigned min : 6;
2734 };
2735
2736 typedef struct {
2737 enum { DATE, QWORD } type;
2738 short number;
2739 union {
2740 struct date date;
2741 unsigned long qword;
2742 } choice;
2743 } data;
2744
2745 Given the above code has been parsed
2746
2747 $init = $c->initializer('data');
2748 print "data x = $init;\n";
2749
2750 would print the following:
2751
2752 data x = {
2753 0,
2754 0,
2755 {
2756 {
2757 0,
2758 0,
2759 0,
2760 0,
2761 0
2762 }
2763 }
2764 };
2765
2766 You could directly put that into a C program, although it
2767 probably isn't very useful yet. It becomes more useful if you
2768 actually specify how you want to initialize the type:
2769
2770 $data = {
2771 type => 'QWORD',
2772 choice => {
2773 date => { month => 12, day => 24 },
2774 qword => 4711,
2775 },
2776 stuff => 'yes?',
2777 };
2778
2779 $init = $c->initializer('data', $data);
2780 print "data x = $init;\n";
2781
2782 This would print the following:
2783
2784 data x = {
2785 QWORD,
2786 0,
2787 {
2788 {
2789 0,
2790 12,
2791 24,
2792 0,
2793 0
2794 }
2795 }
2796 };
2797
2798 As only the first member of a "union" can be initialized,
2799 "choice.qword" is ignored. You will not be warned about the
2800 fact that you probably tried to initialize a member other than
2801 the first. This is considered a feature, because it allows you
2802 to use "unpack" to generate the initializer data:
2803
2804 $data = $c->unpack('data', $binary);
2805 $init = $c->initializer('data', $data);
2806
Since "unpack" unpacks all union members, you would otherwise
have to delete all but the first one before feeding the data
into "initializer".
2810
2811 Also, "stuff" is ignored, because it actually isn't a member of
2812 "data". You won't be warned about that either.
2813
2814 sizeof
2815 "sizeof" TYPE
2816 This method will return the size of a C type in bytes. If it
2817 cannot find the type, it will throw an exception.
2818
2819 If the type defines some kind of compound object, you may ask
2820 for the size of a member of that compound object:
2821
2822 $size = $c->sizeof('test.uni.word[1]');
2823
2824 This would set $size to 2.
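The following self-contained sketch reuses the "struct test" from
the "pack" example. "Alignment" is set to 1 here as an assumption so
that no padding is inserted, and an "eval" block shows the exception
for an unknown type.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( Alignment => 1
                               , ShortSize => 2
                               , LongSize  => 4
                               )
                          ->parse(<<'ENDC');
struct test {
  char ary[3];
  union {
    short word[2];
    long  quad;
  } uni;
};
ENDC

my @types = qw( test test.ary test.uni test.uni.word[1] );
my %size  = map { $_ => $c->sizeof($_) } @types;
printf "%-18s => %d\n", $_, $size{$_} for @types;

# An unknown type makes sizeof throw.
my $ok = eval { $c->sizeof('no_such_type'); 1 };
print "sizeof failed: $@" unless $ok;
```

The struct occupies 3 + 4 = 7 bytes, which matches the length of the
packed string shown for "pack".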
2825
2826 typeof
2827 "typeof" TYPE
2828 This method will return the type of a C member. While this
2829 only makes sense for compound types, it's legal to also use it
2830 for non-compound types. If it cannot find the type, it will
2831 throw an exception.
2832
2833 The "typeof" method can be used on any valid member, even on
2834 arrays or unnamed types. It will always return a string that
2835 holds the name (or in case of unnamed types only the class) of
2836 the type, optionally followed by a '*' character to indicate
2837 it's a pointer type, and optionally followed by one or more
2838 array dimensions if it's an array type. If the type is a
2839 bitfield, the type name is followed by a colon and the number
2840 of bits.
2841
2842 struct test {
2843 char ary[3];
2844 union {
2845 short word[2];
2846 long *quad;
2847 } uni;
2848 struct {
2849 unsigned short six:6;
2850 unsigned short ten:10;
2851 } bits;
2852 };
2853
2854 Given the above C code has been parsed, calls to "typeof" would
2855 return the following values:
2856
2857 $c->typeof('test') => 'struct test'
2858 $c->typeof('test.ary') => 'char [3]'
2859 $c->typeof('test.uni') => 'union'
2860 $c->typeof('test.uni.quad') => 'long *'
2861 $c->typeof('test.uni.word') => 'short [2]'
2862 $c->typeof('test.uni.word[1]') => 'short'
2863 $c->typeof('test.bits') => 'struct'
2864 $c->typeof('test.bits.six') => 'unsigned short :6'
2865 $c->typeof('test.bits.ten') => 'unsigned short :10'
2866
2867 offsetof
2868 "offsetof" TYPE, MEMBER
You can use "offsetof" just like the C macro of the same
name. It will simply return the offset (in bytes) of
MEMBER relative to TYPE.
2872
2873 use Convert::Binary::C;
2874
2875 $c = Convert::Binary::C->new( Alignment => 4
2876 , LongSize => 4
2877 , PointerSize => 4
2878 )
2879 ->parse(<<'ENDC');
2880 typedef struct {
2881 char abc;
2882 long day;
2883 int *ptr;
2884 } week;
2885
2886 struct test {
2887 week zap[8];
2888 };
2889 ENDC
2890
2891 @args = (
2892 ['test', 'zap[5].day' ],
2893 ['test.zap[2]', 'day' ],
2894 ['test', 'zap[5].day+1'],
2895 ['test', 'zap[-3].ptr' ],
2896 );
2897
2898 for (@args) {
2899 my $offset = eval { $c->offsetof(@$_) };
2900 printf "\$c->offsetof('%s', '%s') => $offset\n", @$_;
2901 }
2902
2903 The final loop will print:
2904
2905 $c->offsetof('test', 'zap[5].day') => 64
2906 $c->offsetof('test.zap[2]', 'day') => 4
2907 $c->offsetof('test', 'zap[5].day+1') => 65
2908 $c->offsetof('test', 'zap[-3].ptr') => -28
2909
2910 • The first iteration simply shows that the offset of
2911 "zap[5].day" is 64 relative to the beginning of "struct
2912 test".
2913
2914 • You may additionally specify a member for the type passed as
2915 the first argument, as shown in the second iteration.
2916
2917 • The offset suffix is also supported by "offsetof", so the
2918 third iteration will correctly print 65.
2919
2920 • The last iteration demonstrates that even out-of-bounds array
2921 indices are handled correctly, just as they are handled in C.
2922
2923 Unlike the C macro, "offsetof" also works on array types.
2924
2925 $offset = $c->offsetof('test.zap', '[3].ptr+2');
2926 print "offset = $offset";
2927
2928 This will print:
2929
2930 offset = 46
2931
2932 If TYPE is a compound, MEMBER may optionally be prefixed with a
2933 dot, so
2934
2935 printf "offset = %d\n", $c->offsetof('week', 'day');
2936 printf "offset = %d\n", $c->offsetof('week', '.day');
2937
2938 are both equivalent and will print
2939
2940 offset = 4
2941 offset = 4
2942
2943 This allows one to
2944
2945 • use the C macro style, without a leading dot, and
2946
2947 • directly use the output of the "member" method, which
2948 includes a leading dot for compound types, as input for the
2949 MEMBER argument.
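The second point can be sketched as a round trip: "member" turns an
offset into a member expression, and "offsetof" turns that
expression back into the offset. The configuration matches the
example above.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( Alignment   => 4
                               , LongSize    => 4
                               , PointerSize => 4
                               )
                          ->parse(<<'ENDC');
typedef struct {
  char abc;
  long day;
  int *ptr;
} week;
ENDC

my %round_trip;
for my $offset (0, 4, 6, 8) {
    my $member = $c->member('week', $offset);        # e.g. '.day+2'
    $round_trip{$offset} = $c->offsetof('week', $member);
    print "$offset => '$member' => $round_trip{$offset}\n";
}
```

Each offset should map back to itself, including offset 6, which
lies inside "day" and is therefore expressed with an offset suffix.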
2950
2951 member
2952 "member" TYPE
2953 "member" TYPE, OFFSET
2954 You can think of "member" as being the reverse of the
2955 "offsetof" method. However, as this is more complex, there's no
2956 equivalent to "member" in the C language.
2957
2958 Usually this method is used if you want to retrieve the name of
2959 the member that is located at a specific offset of a previously
2960 parsed type.
2961
2962 use Convert::Binary::C;
2963
2964 $c = Convert::Binary::C->new( Alignment => 4
2965 , LongSize => 4
2966 , PointerSize => 4
2967 )
2968 ->parse(<<'ENDC');
2969 typedef struct {
2970 char abc;
2971 long day;
2972 int *ptr;
2973 } week;
2974
2975 struct test {
2976 week zap[8];
2977 };
2978 ENDC
2979
2980 for my $offset (24, 39, 69, 99) {
2981 print "\$c->member('test', $offset)";
2982 my $member = eval { $c->member('test', $offset) };
2983 print $@ ? "\n exception: $@" : " => '$member'\n";
2984 }
2985
2986 This will print:
2987
2988 $c->member('test', 24) => '.zap[2].abc'
2989 $c->member('test', 39) => '.zap[3]+3'
2990 $c->member('test', 69) => '.zap[5].ptr+1'
2991 $c->member('test', 99)
2992 exception: Offset 99 out of range (0 <= offset < 96)
2993
2994 • The output of the first iteration is obvious. The member
2995 "zap[2].abc" is located at offset 24 of "struct test".
2996
2997 • In the second iteration, the offset points into a region of
2998 padding bytes and thus no member of "week" can be named.
2999 Instead of a member name the offset relative to "zap[3]" is
3000 appended.
3001
3002 • In the third iteration, the offset points to "zap[5].ptr".
3003 However, "zap[5].ptr" is located at 68, not at 69, and thus
3004 the remaining offset of 1 is also appended.
3005
3006 • The last iteration causes an exception because the offset of
3007 99 is not valid for "struct test" since the size of "struct
3008 test" is only 96. You might argue that this is inconsistent,
3009 since "offsetof" can also handle out-of-bounds array members.
3010 But as soon as you have more than one level of array nesting,
3011 there's an infinite number of out-of-bounds members for a
3012 single given offset, so it would be impossible to return a
3013 list of all members.
3014
3015 You can additionally specify a member for the type passed as
3016 the first argument:
3017
3018 $member = $c->member('test.zap[2]', 6);
3019 print $member;
3020
3021 This will print:
3022
3023 .day+2
3024
3025 Like "offsetof", "member" also works on array types:
3026
3027 $member = $c->member('test.zap', 42);
3028 print $member;
3029
3030 This will print:
3031
3032 [3].day+2
3033
3034 While the behaviour for "struct"s is quite obvious, the
3035 behaviour for "union"s is rather tricky. As a single offset
3036 usually references more than one member of a union, there are
3037 certain rules that the algorithm uses for determining the best
3038 member.
3039
3040 • The first non-compound member that is referenced without an
3041 offset has the highest priority.
3042
3043 • If no member is referenced without an offset, the first non-
3044 compound member that is referenced with an offset will be
3045 returned.
3046
3047 • Otherwise the first padding region that is encountered will
3048 be taken.
3049
3050 As an example, given 4-byte-alignment and the union
3051
3052 union choice {
3053 struct {
3054 char color[2];
3055 long size;
3056 char taste;
3057 } apple;
3058 char grape[3];
3059 struct {
3060 long weight;
3061 short price[3];
3062 } melon;
3063 };
3064
3065 the "member" method would return what is shown in the Member
3066 column of the following table. The Type column shows the result
3067 of the "typeof" method when passing the corresponding member.
3068
3069 Offset Member Type
3070 --------------------------------------
3071 0 .apple.color[0] 'char'
3072 1 .apple.color[1] 'char'
3073 2 .grape[2] 'char'
3074 3 .melon.weight+3 'long'
3075 4 .apple.size 'long'
3076 5 .apple.size+1 'long'
3077 6 .melon.price[1] 'short'
3078 7 .apple.size+3 'long'
3079 8 .apple.taste 'char'
3080 9 .melon.price[2]+1 'short'
3081 10 .apple+10 'struct'
3082 11 .apple+11 'struct'
3083
3084 It's like having a stack of all the union members and looking
3085 through the stack for the shiniest piece you can see. The
3086 beginning of a member (denoted by uppercase letters) is always
3087 shinier than the rest of a member, while padding regions
3088 (denoted by dashes) aren't shiny at all.
3089
3090 Offset 0 1 2 3 4 5 6 7 8 9 10 11
3091 -------------------------------------------------------
3092 apple (C) (C) - - (S) (s) s (s) (T) - (-) (-)
3093 grape G G (G)
3094 melon W w w (w) P p (P) p P (p) - -
3095
3096 If you look through that stack from top to bottom, you'll end
3097 up at the parenthesized members.
3098
3099 Alternatively, if you're not only interested in the best
3100 member, you can call "member" in list context, which makes it
3101 return all members referenced by the given offset.
3102
3103 Offset Member Type
3104 --------------------------------------
3105 0 .apple.color[0] 'char'
3106 .grape[0] 'char'
3107 .melon.weight 'long'
3108 1 .apple.color[1] 'char'
3109 .grape[1] 'char'
3110 .melon.weight+1 'long'
3111 2 .grape[2] 'char'
3112 .melon.weight+2 'long'
3113 .apple+2 'struct'
3114 3 .melon.weight+3 'long'
3115 .apple+3 'struct'
3116 4 .apple.size 'long'
3117 .melon.price[0] 'short'
3118 5 .apple.size+1 'long'
3119 .melon.price[0]+1 'short'
3120 6 .melon.price[1] 'short'
3121 .apple.size+2 'long'
3122 7 .apple.size+3 'long'
3123 .melon.price[1]+1 'short'
3124 8 .apple.taste 'char'
3125 .melon.price[2] 'short'
3126 9 .melon.price[2]+1 'short'
3127 .apple+9 'struct'
3128 10 .apple+10 'struct'
3129 .melon+10 'struct'
3130 11 .apple+11 'struct'
3131 .melon+11 'struct'
3132
3133 The first member returned is always the best member. The other
3134 members are sorted according to the rules given above. This
3135 means that members referenced without an offset are followed by
3136 members referenced with an offset. Padding regions will be at
3137 the end.
3138
3139 If OFFSET is not given in the method call, "member" will return
3140 a list of all possible members of TYPE.
3141
3142 print "$_\n" for $c->member('choice');
3143
3144 This will print:
3145
3146 .apple.color[0]
3147 .apple.color[1]
3148 .apple.size
3149 .apple.taste
3150 .grape[0]
3151 .grape[1]
3152 .grape[2]
3153 .melon.weight
3154 .melon.price[0]
3155 .melon.price[1]
3156 .melon.price[2]
3157
3158 In scalar context, the number of possible members is returned.
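
The three calling styles can be compared side by side. The following is a minimal sketch using the "choice" union from above; the explicit LongSize and ShortSize settings are assumptions made so that the layout matches the tables regardless of the host platform.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( Alignment => 4
                               , LongSize  => 4
                               , ShortSize => 2
                               )
                          ->parse(<<'ENDC');
union choice {
  struct {
    char  color[2];
    long  size;
    char  taste;
  }       apple;
  char    grape[3];
  struct {
    long  weight;
    short price[3];
  }       melon;
};
ENDC

my $best  = $c->member('choice', 6);   # scalar context: best member only
my @all   = $c->member('choice', 6);   # list context: all members at offset 6
my $count = $c->member('choice');      # scalar context, no OFFSET: member count

print "best member:       $best\n";
print "all members:       @all\n";
print "type of best:      ", $c->typeof("choice$best"), "\n";
print "number of members: $count\n";
```

The first element of @all is always the same string that the scalar-context call returns.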
3159
3160 tag
3161 "tag" TYPE
3162 "tag" TYPE, TAG
3163 "tag" TYPE, TAG1 => VALUE1, TAG2 => VALUE2, ...
3164 The "tag" method can be used to attach properties to a TYPE. It's
3165 a bit like having "configure" for individual types.
3166
3167 See "USING TAGS" for an example.
3168
3169 Note that while you can tag whole types as well as compound
3170 members, it is not possible to tag array members, i.e. you
3171 cannot treat, for example, "a[1]" and "a[2]" differently.
3172
3173 Also note that in code like this
3174
3175 struct test {
3176 int a;
3177 struct {
3178 int x;
3179 } b, c;
3180 };
3181
3182 if you tag "test.b.x", this will also tag "test.c.x"
3183 implicitly.
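
A short sketch of this effect, querying the tag on both members after tagging only one of them. The choice of the "Format" tag is arbitrary.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct test {
  int a;
  struct {
    int x;
  } b, c;
};
ENDC

# Tag only "test.b.x" ...
$c->tag('test.b.x', Format => 'Binary');

# ... then query the tag on every member.
for my $member (qw( test.a test.b.x test.c.x )) {
  my $format = $c->tag($member, 'Format');
  printf "%-8s => %s\n", $member, defined $format ? $format : 'undef';
}
```

If "b" and "c" need different treatment, one option is to give them separately declared struct types.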
3184
3185 It is also possible to tag basic types if you really want to do
3186 that, for example:
3187
3188 $c->tag('int', Format => 'Binary');
3189
3190 To remove a tag from a type, you can either set that tag to
3191 "undef", for example
3192
3193 $c->tag('test', Hooks => undef);
3194
3195 or use "untag".
3196
3197 To see if a tag is attached to a type or to get the value of a
3198 tag, pass only the type and tag name to "tag":
3199
3200 $c->tag('test.a', Format => 'Binary');
3201
3202 $hooks = $c->tag('test.a', 'Hooks');
3203 $format = $c->tag('test.a', 'Format');
3204
3205 This will give you:
3206
3207 $hooks = undef;
3208 $format = 'Binary';
3209
3210 To see which tags are attached to a type, pass only the type.
3211 The "tag" method will now return a hash reference containing
3212 all tags attached to the type:
3213
3214 $tags = $c->tag('test.a');
3215
3216 This will give you:
3217
3218 $tags = {
3219 'Format' => 'Binary'
3220 };
3221
3222 "tag" will throw an exception if an error occurs. If called as
3223 a 'set' method, it will return a reference to its object,
3224 allowing you to chain together consecutive method calls.
3225
3226 Note that when a compound is inlined, tags attached to the
3227 inlined compound are ignored, for example:
3228
3229 $c->parse(<<ENDC);
3230 struct header {
3231 int id;
3232 int len;
3233 unsigned flags;
3234 };
3235
3236 struct message {
3237 struct header;
3238 short samples[32];
3239 };
3240 ENDC
3241
3242 for my $type (qw( header message header.len )) {
3243 $c->tag($type, Hooks => { unpack => sub { print "unpack: $type\n"; @_ } });
3244 }
3245
3246 for my $type (qw( header message )) {
3247 print "[unpacking $type]\n";
3248 $u = $c->unpack($type, $data);
3249 }
3250
3251 This will print:
3252
3253 [unpacking header]
3254 unpack: header.len
3255 unpack: header
3256 [unpacking message]
3257 unpack: header.len
3258 unpack: message
3259
3260 As you can see from the above output, tags attached to members
3261 of inlined compounds ("header.len") are still handled.
3262
3263 The following tags can be configured:
3264
3265 "Format" => 'Binary' | 'String'
3266 The "Format" tag allows you to control the way binary data
3267 is converted by "pack" and "unpack".
3268
3269 If you tag a "TYPE" as "Binary", it will not be converted
3270 at all, i.e. it will be passed through as a binary string.
3271
3272 If you tag it as "String", it will be treated like a null-
3273 terminated C string, i.e. "unpack" will convert the C
3274 string to a Perl string and vice versa.
3275
3276 See "The Format Tag" for an example.
3277
3278 "ByteOrder" => 'BigEndian' | 'LittleEndian'
3279 The "ByteOrder" tag allows you to explicitly set the byte
3280 order of a TYPE.
3281
3282 See "The ByteOrder Tag" for an example.
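
As a minimal sketch, assuming a hypothetical "struct pair" in which one member is stored in network byte order while the object itself is configured as little-endian:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( ByteOrder => 'LittleEndian'
                               , ShortSize => 2
                               )
                          ->parse(<<'ENDC');
struct pair {
  unsigned short host;   /* follows the object's ByteOrder */
  unsigned short net;    /* always big-endian on the wire  */
};
ENDC

# Override the byte order for a single member only.
$c->tag('pair.net', ByteOrder => 'BigEndian');

my $data = "\x01\x02\x01\x02";
my $pair = $c->unpack('pair', $data);

printf "host = 0x%04X, net = 0x%04X\n", $pair->{host}, $pair->{net};

# Packing applies the same per-member byte order in reverse.
print "round trip ok\n" if $c->pack('pair', $pair) eq $data;
```

The same two input bytes are interpreted differently for each member, which is the whole point of the tag.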
3283
3284 "Dimension" => '*'
3285 "Dimension" => VALUE
3286 "Dimension" => MEMBER
3287 "Dimension" => SUB
3288 "Dimension" => [ SUB, ARGS ]
3289 The "Dimension" tag allows you to alter the size of an
3290 array dynamically.
3291
3292 You can tag fixed size arrays as being flexible using '*'.
3293 This is useful if you cannot use flexible array members in
3294 your source code.
3295
3296 $c->tag('type.array', Dimension => '*');
3297
3298 You can also tag an array to have a fixed size different
3299 from the one it was originally declared with.
3300
3301 $c->tag('type.array', Dimension => 42);
3302
3303 If the array is a member of a compound, you can also tag it
3304 to have a size corresponding to the value of another
3305 member in that compound.
3306
3307 $c->tag('type.array', Dimension => 'count');
3308
3309 Finally, you can specify a subroutine that is called when
3310 the size of the array needs to be determined.
3311
3312 $c->tag('type.array', Dimension => \&get_count);
3313
3314 By default, if the array is a compound member, the
3315 subroutine is passed a reference to the hash storing
3316 the data for the compound.
3317
3318 You can also instruct Convert::Binary::C to pass additional
3319 arguments to the subroutine by passing an array reference
3320 instead of the subroutine reference. This array contains
3321 the subroutine reference as well as a list of arguments.
3322 It is possible to define certain special arguments using
3323 the "arg" method.
3324
3325 $c->tag('type.array', Dimension => [\&get_count, $c->arg('SELF'), 42]);
3326
3327 See "The Dimension Tag" for various examples.
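
The MEMBER and SUB variants can be compared with a small sketch. The "struct samples" type and its data are hypothetical; the array is declared with one element and sized at run time from the preceding "count" member.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct samples {
  unsigned char count;
  unsigned char data[1];
};
ENDC

my $binary = "\x03\x0A\x0B\x0C";   # count = 3, followed by three bytes

# Variant 1: take the dimension from another member of the compound.
$c->tag('samples.data', Dimension => 'count');
my $by_member = $c->unpack('samples', $binary);

# Variant 2: compute the dimension in a subroutine, which receives
# a reference to the hash holding the compound's data.
$c->tag('samples.data', Dimension => sub { $_[0]{count} });
my $by_sub = $c->unpack('samples', $binary);

print "member: @{ $by_member->{data} }\n";
print "sub:    @{ $by_sub->{data} }\n";
```

Both variants rely on "count" being located before "data", so that its value is already known when the array is unpacked.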
3328
3329 "Hooks" => { HOOK => SUB, HOOK => [ SUB, ARGS ], ... }, ...
3330 The "Hooks" tag allows you to register subroutines as
3331 hooks.
3332
3333 Hooks are called whenever a certain "TYPE" is packed or
3334 unpacked. Hooks are currently considered an experimental
3335 feature.
3336
3337 "HOOK" can be one of the following:
3338
3339 pack
3340 unpack
3341 pack_ptr
3342 unpack_ptr
3343
3344 "pack" and "unpack" hooks are called when processing their
3345 "TYPE", while "pack_ptr" and "unpack_ptr" hooks are called
3346 when processing pointers to their "TYPE".
3347
3348 "SUB" is a reference to a subroutine that usually takes one
3349 input argument, processes it and returns one output
3350 argument.
3351
3352 Alternatively, you can pass a custom list of arguments to
3353 the hook by using an array reference instead of "SUB" that
3354 holds the subroutine reference in the first element and the
3355 arguments to be passed to the subroutine as the other
3356 elements. This way, you can even pass special arguments to
3357 the hook using the "arg" method.
3358
3359 Here are a few examples for registering hooks:
3360
3361 $c->tag('ObjectType', Hooks => {
3362 pack => \&obj_pack,
3363 unpack => \&obj_unpack
3364 });
3365
3366 $c->tag('ProtocolId', Hooks => {
3367 unpack => sub { $protos[$_[0]] }
3368 });
3369
3370 $c->tag('ProtocolId', Hooks => {
3371 unpack_ptr => [sub {
3372 sprintf "$_[0]:{0x%X}", $_[1]
3373 },
3374 $c->arg('TYPE', 'DATA')
3375 ],
3376 });
3377
3378 Note that the above example registers both an "unpack" hook
3379 and an "unpack_ptr" hook for "ProtocolId" with two separate
3380 calls to "tag". As long as you don't explicitly overwrite a
3381 previously registered hook, it won't be modified or removed
3382 by registering other hooks for the same "TYPE".
3383
3384 To remove all registered hooks for a type, simply remove
3385 the "Hooks" tag:
3386
3387 $c->untag('ProtocolId', 'Hooks');
3388
3389 To remove only a single hook, pass "undef" as "SUB" instead
3390 of a subroutine reference:
3391
3392 $c->tag('ObjectType', Hooks => { pack => undef });
3393
3394 If all hooks are removed, the whole "Hooks" tag is removed.
3395
3396 See "The Hooks Tag" for examples on how to use hooks.
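
The registration snippets above can be turned into a small runnable sketch. The @protos table, the "struct header" type and the input bytes are assumptions made for illustration only.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Hypothetical lookup table: protocol number => name, and its inverse.
my @protos   = qw( ICMP TCP UDP );
my %proto_id = map { $protos[$_] => $_ } 0 .. $#protos;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned char ProtocolId;

struct header {
  ProtocolId    proto;
  unsigned char length;
};
ENDC

# One input argument, one return value for each hook.
$c->tag('ProtocolId', Hooks => {
  unpack => sub { $protos[$_[0]] },     # number -> name
  pack   => sub { $proto_id{$_[0]} },   # name   -> number
});

my $h = $c->unpack('header', "\x01\x14");
print "proto = $h->{proto}, length = $h->{length}\n";

my $bin = $c->pack('header', { proto => 'UDP', length => 8 });
print "packed = ", unpack('H*', $bin), "\n";
```

Because the hooks are attached to the typedef, they apply wherever "ProtocolId" is used, not just in "struct header".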
3397
3398 untag
3399 "untag" TYPE
3400 "untag" TYPE, TAG1, TAG2, ...
3401 Use the "untag" method to remove one, several, or all tags from a
3402 type. If you don't pass any tag names, all tags attached to the
3403 type will be removed. Otherwise only the listed tags will be
3404 removed.
3405
3406 See "USING TAGS" for an example.
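
The life cycle of a tag, from "tag" to "untag", might look like the following sketch. The type and the tags chosen are arbitrary.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct test {
  int  a;
  char name[8];
};
ENDC

# Used as a 'set' method, "tag" returns the object, so calls chain.
$c->tag('test.name', Format => 'String')
  ->tag('test.a', Format => 'Binary', ByteOrder => 'BigEndian');

print "tags on test.a: ", join(', ', sort keys %{ $c->tag('test.a') }), "\n";

# Remove a single tag by name ...
$c->untag('test.a', 'Format');
print "tags on test.a: ", join(', ', sort keys %{ $c->tag('test.a') }), "\n";

# ... or remove every tag attached to the type.
$c->untag('test.a');
print defined $c->tag('test.a', 'ByteOrder') ? "still tagged\n"
                                             : "no ByteOrder tag left\n";
```

Tags on other members, such as "test.name" here, are not affected by untagging "test.a".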
3407
3408 arg
3409 "arg" 'ARG', ...
3410 Creates placeholders for special arguments to be passed to
3411 hooks or other subroutines. These arguments are currently:
3412
3413 "SELF"
3414 A reference to the calling Convert::Binary::C object. This
3415 may be useful if you need to work with the object inside
3416 the subroutine.
3417
3418 "TYPE"
3419 The name of the type that is currently being processed by
3420 the hook.
3421
3422 "DATA"
3423 The data argument that is passed to the subroutine.
3424
3425 "HOOK"
3426 The kind of hook for which the subroutine has been
3427 called, for example "pack" or "unpack_ptr".
3428
3429 "arg" will return a placeholder for each argument it is being
3430 passed. Note that not all arguments may be supported depending
3431 on the context of the subroutine.
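
As a sketch, the placeholders let a single subroutine serve several hooks and report how it was called. The "Level" typedef and the $trace subroutine are hypothetical.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned char Level;

struct reading {
  Level low;
  Level high;
};
ENDC

my @log;

# Arguments arrive in the order of the placeholders given to "arg".
# The subroutine records the call and returns DATA unchanged.
my $trace = sub {
  my ($self, $hook, $type, $data) = @_;
  push @log, sprintf "%s hook for %s via %s", $hook, $type, ref $self;
  return $data;
};

$c->tag('Level', Hooks => {
  pack   => [ $trace, $c->arg('SELF', 'HOOK', 'TYPE', 'DATA') ],
  unpack => [ $trace, $c->arg('SELF', 'HOOK', 'TYPE', 'DATA') ],
});

my $binary  = $c->pack('reading', { low => 1, high => 9 });
my $reading = $c->unpack('reading', $binary);

print "$_\n" for @log;
```

When a custom argument list is used, DATA is only passed if it is requested explicitly, so it has to be part of the list for the hook to see the value.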
3432
3433 dependencies
3434 "dependencies"
3435 After some code has been parsed using either the "parse" or
3436 "parse_file" methods, the "dependencies" method can be used to
3437 retrieve information about all files that the object depends
3438 on, i.e. all files that have been parsed.
3439
3440 In scalar context, the method returns a hash reference. Each
3441 key is the name of a file. The values are again hash
3442 references, each of which holds the size, modification time
3443 (mtime), and change time (ctime) of the file at the moment it
3444 was parsed.
3445
3446 use Convert::Binary::C;
3447 use Data::Dumper;
3448
3449 #----------------------------------------------------------
3450 # Create object, set include path, parse 'string.h' header
3451 #----------------------------------------------------------
3452 my $c = Convert::Binary::C->new
3453 ->Include('/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include',
3454 '/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include-fixed',
3455 '/usr/include')
3456 ->parse_file('string.h');
3457
3458 #----------------------------------------------------------
3459 # Get dependencies of the object, extract dependency files
3460 #----------------------------------------------------------
3461 my $depend = $c->dependencies;
3462 my @files = keys %$depend;
3463
3464 #-----------------------------
3465 # Dump dependencies and files
3466 #-----------------------------
3467 print Data::Dumper->Dump([$depend, \@files],
3468 [qw( depend *files )]);
3469
3470 The above code would print something like this:
3471
3472 $depend = {
3473 '/usr/include/sys/cdefs.h' => {
3474 'size' => 20051,
3475 'mtime' => 1604969938,
3476 'ctime' => 1604969964
3477 },
3478 '/usr/include/gnu/stubs-32.h' => {
3479 'size' => 449,
3480 'mtime' => 1604969908,
3481 'ctime' => 1604969964
3482 },
3483 '/usr/include/bits/wordsize.h' => {
3484 'size' => 442,
3485 'mtime' => 1604969934,
3486 'ctime' => 1604969964
3487 },
3488 '/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include/stddef.h' => {
3489 'size' => 12959,
3490 'mtime' => 1604974286,
3491 'ctime' => 1604975398
3492 },
3493 '/usr/include/stdc-predef.h' => {
3494 'size' => 2290,
3495 'mtime' => 1604969927,
3496 'ctime' => 1604969964
3497 },
3498 '/usr/include/string.h' => {
3499 'size' => 18766,
3500 'mtime' => 1604969936,
3501 'ctime' => 1604969964
3502 },
3503 '/usr/include/bits/types/locale_t.h' => {
3504 'size' => 983,
3505 'mtime' => 1604969927,
3506 'ctime' => 1604969964
3507 },
3508 '/usr/include/bits/long-double.h' => {
3509 'size' => 970,
3510 'mtime' => 1604969933,
3511 'ctime' => 1604969964
3512 },
3513 '/usr/include/bits/libc-header-start.h' => {
3514 'size' => 3288,
3515 'mtime' => 1604969927,
3516 'ctime' => 1604969964
3517 },
3518 '/usr/include/strings.h' => {
3519 'size' => 4753,
3520 'mtime' => 1604969936,
3521 'ctime' => 1604969964
3522 },
3523 '/usr/include/gnu/stubs.h' => {
3524 'size' => 384,
3525 'mtime' => 1604969927,
3526 'ctime' => 1604969964
3527 },
3528 '/usr/include/bits/types/__locale_t.h' => {
3529 'size' => 1722,
3530 'mtime' => 1604969927,
3531 'ctime' => 1604969964
3532 },
3533 '/usr/include/features.h' => {
3534 'size' => 17235,
3535 'mtime' => 1604969927,
3536 'ctime' => 1604969964
3537 }
3538 };
3539 @files = (
3540 '/usr/include/sys/cdefs.h',
3541 '/usr/include/gnu/stubs-32.h',
3542 '/usr/include/bits/wordsize.h',
3543 '/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include/stddef.h',
3544 '/usr/include/stdc-predef.h',
3545 '/usr/include/string.h',
3546 '/usr/include/bits/types/locale_t.h',
3547 '/usr/include/bits/long-double.h',
3548 '/usr/include/bits/libc-header-start.h',
3549 '/usr/include/strings.h',
3550 '/usr/include/gnu/stubs.h',
3551 '/usr/include/bits/types/__locale_t.h',
3552 '/usr/include/features.h'
3553 );
3554
3555 In list context, the method returns the names of all files that
3556 have been parsed, i.e. the following lines are equivalent:
3557
3558 @files = keys %{$c->dependencies};
3559 @files = $c->dependencies;
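
One possible use of this information is to decide whether previously parsed code is out of date. The following sketch writes a hypothetical header to a scratch directory so that it is self-contained; the changed_files subroutine is not part of the module.

```perl
use strict;
use warnings;
use Convert::Binary::C;
use File::Temp qw( tempdir );

# Hypothetical header in a temporary directory.
my $dir = tempdir(CLEANUP => 1);
my $hdr = "$dir/demo.h";
open my $fh, '>', $hdr or die "cannot write $hdr: $!";
print $fh "struct point { int x; int y; };\n";
close $fh;

my $c = Convert::Binary::C->new(Include => [$dir])->parse_file('demo.h');

my $depend = $c->dependencies;   # scalar context: hash reference

# Return the files whose size, mtime or ctime no longer match
# what was recorded when they were parsed.
sub changed_files {
  my ($depend) = @_;
  my @changed;
  for my $file (sort keys %$depend) {
    my ($size, $mtime, $ctime) = (stat $file)[7, 9, 10];
    my $rec = $depend->{$file};
    push @changed, $file
      unless defined $size
         and $size  == $rec->{size}
         and $mtime == $rec->{mtime}
         and $ctime == $rec->{ctime};
  }
  return @changed;
}

my @changed = changed_files($depend);
print @changed ? "reparse needed: @changed\n" : "nothing has changed\n";
```

A file that has been deleted in the meantime makes "stat" return an empty list, so it is reported as changed as well.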
3560
3561 sourcify
3562 "sourcify"
3563 "sourcify" CONFIG
3564 Returns a string that holds the C source code necessary to
3565 represent all parsed C data structures.
3566
3567 use Convert::Binary::C;
3568
3569 $c = Convert::Binary::C->new;
3570 $c->parse(<<'END');
3571
3572 #define ADD(a, b) ((a) + (b))
3573 #define NUMBER 42
3574
3575 typedef struct _mytype mytype;
3576
3577 struct _mytype {
3578 union {
3579 int iCount;
3580 enum count *pCount;
3581 } counter;
3582 #pragma pack( push, 1 )
3583 struct {
3584 char string[NUMBER];
3585 int array[NUMBER/sizeof(int)];
3586 } storage;
3587 #pragma pack( pop )
3588 mytype *next;
3589 };
3590
3591 enum count { ZERO, ONE, TWO, THREE };
3592
3593 END
3594
3595 print $c->sourcify;
3596
3597 The above code would print something like this:
3598
3599 /* typedef predeclarations */
3600
3601 typedef struct _mytype mytype;
3602
3603 /* defined enums */
3604
3605 enum count
3606 {
3607 ZERO,
3608 ONE,
3609 TWO,
3610 THREE
3611 };
3612
3613
3614 /* defined structs and unions */
3615
3616 struct _mytype
3617 {
3618 union
3619 {
3620 int iCount;
3621 enum count *pCount;
3622 } counter;
3623 #pragma pack(push, 1)
3624 struct
3625 {
3626 char string[42];
3627 int array[10];
3628 } storage;
3629 #pragma pack(pop)
3630 mytype *next;
3631 };
3632
3633 The purpose of the "sourcify" method is to enable some kind of
3634 platform-independent caching. The C code generated by
3635 "sourcify" can be parsed by any standard C compiler and, of
3636 course, by the Convert::Binary::C parser. However, the code
3637 may be significantly shorter than the code that was originally
3638 parsed.
3639
3640 When parsing a typical header file, you may easily need to
3641 open dozens of other files that are included from that file,
3642 and end up parsing several hundred kilobytes of C code. Since
3643 most of it is usually preprocessor directives, function
3644 prototypes and comments, the "sourcify" method strips this
3645 down to a few kilobytes. Saving the "sourcify" string and
3646 parsing it next time instead of the original code may be a
3647 lot faster.
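
The caching idea can be sketched as follows. The "struct record" type and the %config settings are assumptions for illustration; in practice the string would be written to a file rather than kept in a variable.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Both objects must share the same configuration.
my %config = ( Alignment => 4, IntSize => 4, PointerSize => 4 );

my $original = Convert::Binary::C->new(%config)->parse(<<'ENDC');
#define NUMBER 42

/* prototypes and comments are not needed for data conversion */
int checksum(const char *buf, int len);

struct record {
  char name[NUMBER];
  int  values[NUMBER/sizeof(int)];
  struct record *next;
};
ENDC

# Reduced source code; this is what would be stored in the cache.
my $cache = $original->sourcify;

# Later: parse the cached string instead of the original code.
my $restored = Convert::Binary::C->new(%config)->parse($cache);

printf "original: sizeof(record) = %d\n", $original->sizeof('record');
printf "restored: sizeof(record) = %d\n", $restored->sizeof('record');
```

The "dependencies" method can help decide when such a cache has to be rebuilt from the original files.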
3648
3649 The "sourcify" method takes a hash reference as an optional
3650 argument. It can be used to tweak the method's output. The
3651 following options can be configured.
3652
3653 "Context" => 0 | 1
3654 Turns preprocessor context information on or off. If this
3655 is turned on, "sourcify" will insert "#line" preprocessor
3656 directives in its output. So in the above example
3657
3658 print $c->sourcify({ Context => 1 });
3659
3660 would print:
3661
3662 /* typedef predeclarations */
3663
3664 typedef struct _mytype mytype;
3665
3666 /* defined enums */
3667
3668
3669 #line 21 "[buffer]"
3670 enum count
3671 {
3672 ZERO,
3673 ONE,
3674 TWO,
3675 THREE
3676 };
3677
3678
3679 /* defined structs and unions */
3680
3681
3682 #line 7 "[buffer]"
3683 struct _mytype
3684 {
3685 #line 8 "[buffer]"
3686 union
3687 {
3688 int iCount;
3689 enum count *pCount;
3690 } counter;
3691 #pragma pack(push, 1)
3692 #line 13 "[buffer]"
3693 struct
3694 {
3695 char string[42];
3696 int array[10];
3697 } storage;
3698 #pragma pack(pop)
3699 mytype *next;
3700 };
3701
3702 Note that "[buffer]" refers to the here-doc buffer when
3703 using "parse".
3704
3705 "Defines" => 0 | 1
3706 Turn this on if you want all the defined macros to be part
3707 of the source code output. Given the example code above
3708
3709 print $c->sourcify({ Defines => 1 });
3710
3711 would print:
3712
3713 /* typedef predeclarations */
3714
3715 typedef struct _mytype mytype;
3716
3717 /* defined enums */
3718
3719 enum count
3720 {
3721 ZERO,
3722 ONE,
3723 TWO,
3724 THREE
3725 };
3726
3727
3728 /* defined structs and unions */
3729
3730 struct _mytype
3731 {
3732 union
3733 {
3734 int iCount;
3735 enum count *pCount;
3736 } counter;
3737 #pragma pack(push, 1)
3738 struct
3739 {
3740 char string[42];
3741 int array[10];
3742 } storage;
3743 #pragma pack(pop)
3744 mytype *next;
3745 };
3746
3747 /* preprocessor defines */
3748
3749 #define ADD(a, b) ((a) + (b))
3750 #define NUMBER 42
3751
3752 The macro definitions always appear at the end of the
3753 source code. The order of the macro definitions is
3754 undefined.
3755
3756 The following methods can be used to retrieve information about the
3757 definitions that have been parsed. The examples given in the
3758 description for "enum", "compound" and "typedef" all assume this piece
3759 of C code has been parsed:
3760
3761 #define ABC_SIZE 2
3762 #define MULTIPLY(x, y) ((x)*(y))
3763
3764 #ifdef ABC_SIZE
3765 # define DEFINED
3766 #else
3767 # define NOT_DEFINED
3768 #endif
3769
3770 typedef unsigned long U32;
3771 typedef void *any;
3772
3773 enum __socket_type
3774 {
3775 SOCK_STREAM = 1,
3776 SOCK_DGRAM = 2,
3777 SOCK_RAW = 3,
3778 SOCK_RDM = 4,
3779 SOCK_SEQPACKET = 5,
3780 SOCK_PACKET = 10
3781 };
3782
3783 struct STRUCT_SV {
3784 void *sv_any;
3785 U32 sv_refcnt;
3786 U32 sv_flags;
3787 };
3788
3789 typedef union {
3790 int abc[ABC_SIZE];
3791 struct xxx {
3792 int a;
3793 int b;
3794 } ab[3][4];
3795 any ptr;
3796 } test;
3797
3798 enum_names
3799 "enum_names"
3800 Returns a list of identifiers of all defined enumeration
3801 objects. Enumeration objects don't necessarily have an
3802 identifier, so something like
3803
3804 enum { A, B, C };
3805
3806 will obviously not appear in the list returned by the
3807 "enum_names" method. Also, enumerations that are not defined
3808 within the source code - like in
3809
3810 struct foo {
3811 enum weekday *pWeekday;
3812 unsigned long year;
3813 };
3814
3815 where only a pointer to the "weekday" enumeration object is
3816 used - will not be returned, even though they have an
3817 identifier. So for the above two enumerations, "enum_names"
3818 will return an empty list:
3819
3820 @names = $c->enum_names;
3821
3822 The only way to retrieve a list of all enumeration identifiers
3823 is to use the "enum" method without additional arguments. You
3824 can get a list of all enumeration objects that have an
3825 identifier by using
3826
3827 @enums = map { $_->{identifier} || () } $c->enum;
3828
3829 but these may not have a definition. Thus, the two arrays would
3830 look like this:
3831
3832 @names = ();
3833 @enums = ('weekday');
3834
3835 The "def" method returns a true value for all identifiers
3836 returned by "enum_names".
3837
3838 enum
3839 "enum"
3840 "enum" LIST
3841 Returns a list of references to hashes containing detailed
3842 information about all enumerations that have been parsed.
3843
3844 If a list of enumeration identifiers is passed to the method,
3845 the returned list will only contain hash references for those
3846 enumerations. The enumeration identifiers may optionally be
3847 prefixed by "enum".
3848
3849 If an enumeration identifier cannot be found, the returned list
3850 will contain an undefined value at that position.
3851
3852 In scalar context, the number of enumerations is returned
3853 unless the method is called with exactly one argument. In
3854 that case, a hash reference holding information for that
3855 enumeration is returned.
3856
3857 The list returned by the "enum" method looks similar to this:
3858
3859 @enum = (
3860 {
3861 'enumerators' => {
3862 'SOCK_STREAM' => 1,
3863 'SOCK_DGRAM' => 2,
3864 'SOCK_PACKET' => 10,
3865 'SOCK_SEQPACKET' => 5,
3866 'SOCK_RDM' => 4,
3867 'SOCK_RAW' => 3
3868 },
3869 'identifier' => '__socket_type',
3870 'size' => 4,
3871 'sign' => 0,
3872 'context' => 'definitions.c(13)'
3873 }
3874 );
3875
3876 "identifier"
3877 holds the enumeration identifier. This key is not present
3878 if the enumeration has no identifier.
3879
3880 "context"
3881 is the context in which the enumeration is defined. This is
3882 the filename followed by the line number in parentheses.
3883
3884 "enumerators"
3885 is a reference to a hash table that holds all enumerators
3886 of the enumeration.
3887
3888 "sign"
3889 is a boolean indicating if the enumeration is signed (i.e.
3890 has negative values).
3891
3892 One useful application may be to create a hash table that holds
3893 all enumerators of all defined enumerations:
3894
3895 %enum = map %{ $_->{enumerators} || {} }, $c->enum;
3896
3897 The %enum hash table would then be:
3898
3899 %enum = (
3900 'SOCK_RDM' => 4,
3901 'SOCK_SEQPACKET' => 5,
3902 'SOCK_PACKET' => 10,
3903 'SOCK_STREAM' => 1,
3904 'SOCK_DGRAM' => 2,
3905 'SOCK_RAW' => 3
3906 );
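
The scalar-context behaviour described above is easy to overlook, so here is a minimal sketch of both cases. It parses only the enumeration from the example code; the %name_of hash is an illustration, not part of the module.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
enum __socket_type
{
  SOCK_STREAM    = 1,
  SOCK_DGRAM     = 2,
  SOCK_RAW       = 3,
  SOCK_RDM       = 4,
  SOCK_SEQPACKET = 5,
  SOCK_PACKET    = 10
};
ENDC

my $count = $c->enum;                    # no argument: number of enumerations
my $info  = $c->enum('__socket_type');   # one argument: hash reference

# Invert the enumerators hash to look up names by value.
my %name_of = reverse %{ $info->{enumerators} };

print "$count enumeration(s) parsed\n";
print "value 5 is $name_of{5}\n";
```

Inverting the hash is only unambiguous as long as no two enumerators share a value.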
3907
3908 compound_names
3909 "compound_names"
3910 Returns a list of identifiers of all structs and unions
3911 (compound data structures) that are defined in the parsed
3912 source code. Like enumerations, compounds don't need to have an
3913 identifier, nor do they need to be defined.
3914
3915 Again, the only way to retrieve information about all struct
3916 and union objects is to call the "compound" method without
3917 any arguments. If you need a list of all struct and union
3918 identifiers, you can use:
3919
3920 @compound = map { $_->{identifier} || () } $c->compound;
3921
3922 The "def" method returns a true value for all identifiers
3923 returned by "compound_names".
3924
3925 If you need the names of only the structs or only the unions,
3926 use the "struct_names" and "union_names" methods respectively.
3927
3928 compound
3929 "compound"
3930 "compound" LIST
3931 Returns a list of references to hashes containing detailed
3932 information about all compounds (structs and unions) that have
3933 been parsed.
3934
3935 If a list of struct/union identifiers is passed to the method,
3936 the returned list will only contain hash references for those
3937 compounds. The identifiers may optionally be prefixed by
3938 "struct" or "union", which limits the search to the specified
3939 kind of compound.
3940
3941 If an identifier cannot be found, the returned list will
3942 contain an undefined value at that position.
3943
3944 In scalar context, the number of compounds is returned unless
3945 the method is called with exactly one argument. In that case,
3946 a hash reference holding information for that compound is
3947 returned.
3948
3949 The list returned by the "compound" method looks similar to
3950 this:
3951
3952 @compound = (
3953 {
3954 'identifier' => 'STRUCT_SV',
3955 'align' => 1,
3956 'declarations' => [
3957 {
3958 'type' => 'void',
3959 'declarators' => [
3960 {
3961 'size' => 8,
3962 'offset' => 0,
3963 'declarator' => '*sv_any'
3964 }
3965 ]
3966 },
3967 {
3968 'type' => 'U32',
3969 'declarators' => [
3970 {
3971 'size' => 8,
3972 'offset' => 8,
3973 'declarator' => 'sv_refcnt'
3974 }
3975 ]
3976 },
3977 {
3978 'type' => 'U32',
3979 'declarators' => [
3980 {
3981 'size' => 8,
3982 'offset' => 16,
3983 'declarator' => 'sv_flags'
3984 }
3985 ]
3986 }
3987 ],
3988 'type' => 'struct',
3989 'size' => 24,
3990 'context' => 'definitions.c(23)',
3991 'pack' => 0
3992 },
3993 {
3994 'identifier' => 'xxx',
3995 'align' => 1,
3996 'declarations' => [
3997 {
3998 'type' => 'int',
3999 'declarators' => [
4000 {
4001 'size' => 4,
4002 'offset' => 0,
4003 'declarator' => 'a'
4004 }
4005 ]
4006 },
4007 {
4008 'type' => 'int',
4009 'declarators' => [
4010 {
4011 'size' => 4,
4012 'offset' => 4,
4013 'declarator' => 'b'
4014 }
4015 ]
4016 }
4017 ],
4018 'type' => 'struct',
4019 'size' => 8,
4020 'context' => 'definitions.c(31)',
4021 'pack' => 0
4022 },
4023 {
4024 'align' => 1,
4025 'declarations' => [
4026 {
4027 'type' => 'int',
4028 'declarators' => [
4029 {
4030 'size' => 8,
4031 'offset' => 0,
4032 'declarator' => 'abc[2]'
4033 }
4034 ]
4035 },
4036 {
4037 'type' => 'struct xxx',
4038 'declarators' => [
4039 {
4040 'size' => 96,
4041 'offset' => 0,
4042 'declarator' => 'ab[3][4]'
4043 }
4044 ]
4045 },
4046 {
4047 'type' => 'any',
4048 'declarators' => [
4049 {
4050 'size' => 8,
4051 'offset' => 0,
4052 'declarator' => 'ptr'
4053 }
4054 ]
4055 }
4056 ],
4057 'type' => 'union',
4058 'size' => 96,
4059 'context' => 'definitions.c(29)',
4060 'pack' => 0
4061 }
4062 );
4063
4064 "identifier"
4065 holds the struct or union identifier. This key is not
4066 present if the compound has no identifier.
4067
4068 "context"
4069 is the context in which the struct or union is defined.
4070 This is the filename followed by the line number in
4071 parentheses.
4072
4073 "type"
4074 is either 'struct' or 'union'.
4075
4076 "size"
4077 is the size of the struct or union.
4078
4079 "align"
4080 is the alignment of the struct or union.
4081
4082 "pack"
4083 is the struct member alignment if the compound is packed,
4084 or zero otherwise.
4085
4086 "declarations"
4087 is an array of hash references describing each struct
4088 declaration:
4089
4090 "type"
4091 is the type of the struct declaration. This may be a
4092 string or a reference to a hash describing the type.
4093
4094 "declarators"
4095 is an array of hashes describing each declarator:
4096
4097 "declarator"
4098 is a string representation of the declarator.
4099
4100 "offset"
4101 is the offset of the struct member represented by
4102 the current declarator relative to the beginning of
4103 the struct or union.
4104
4105 "size"
4106 is the size occupied by the struct member
4107 represented by the current declarator.
4108
4109 It may be useful to have separate lists for structs and unions.
4110 One way to retrieve such lists would be to use
4111
4112 push @{$_->{type} eq 'union' ? \@unions : \@structs}, $_
4113 for $c->compound;
4114
4115 However, using the "struct" and "union" methods is a lot
4116 simpler:
4117
4118 @structs = $c->struct;
4119 @unions = $c->union;
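
The nested structure returned by these methods can be walked to print a simple memory layout. This is a sketch under assumed settings; "struct node" and the @layout array are hypothetical.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( Alignment   => 4
                               , IntSize     => 4
                               , PointerSize => 4
                               )
                          ->parse(<<'ENDC');
struct node {
  char         tag;
  int          value;
  struct node *next;
};
ENDC

# One argument in scalar context: a single hash reference.
my $info = $c->struct('node');

my @layout;
for my $decl (@{ $info->{declarations} }) {
  # "type" is either a plain string or a hash reference.
  my $type = ref $decl->{type} ? '(inline type)' : $decl->{type};
  for my $d (@{ $decl->{declarators} }) {
    push @layout, [ $d->{offset}, $d->{size}, "$type $d->{declarator}" ];
  }
}

print "offset size  member\n";
printf "%6d %4d  %s\n", @$_ for @layout;
print "total size: $info->{size}\n";
```

Gaps between one member's end and the next member's offset correspond to padding bytes.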
4120
4121 struct_names
4122 "struct_names"
Returns a list of all defined struct identifiers. This is
equivalent to calling "compound_names", except that it returns
only struct identifiers and omits union identifiers.
4127
4128 struct
4129 "struct"
4130 "struct" LIST
Like the "compound" method, but restricted to structs.
4132
4133 union_names
4134 "union_names"
Returns a list of all defined union identifiers. This is
equivalent to calling "compound_names", except that it returns
only union identifiers and omits struct identifiers.
4139
4140 union
4141 "union"
4142 "union" LIST
Like the "compound" method, but restricted to unions.
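
As a rough illustration of how the name and detail methods relate, the
sketch below parses three made-up types ("header", "trailer" and
"value" are assumptions for this example) and queries them per kind.
The names are sorted explicitly because no particular order is assumed
here.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Hypothetical types, used only for illustration.
my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct header  { int id; };
struct trailer { int checksum; };
union  value   { int i; char c[4]; };
ENDC

# Identifier lists, one per kind and one for both kinds
my @struct_names   = sort { $a cmp $b } $c->struct_names;
my @union_names    = sort { $a cmp $b } $c->union_names;
my @compound_names = sort { $a cmp $b } $c->compound_names;

# Detailed hashes, already separated by kind
my @structs = $c->struct;
my @unions  = $c->union;

print "structs: @struct_names\n";
print "unions:  @union_names\n";
```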
4144
4145 typedef_names
4146 "typedef_names"
4147 Returns a list of all defined typedef identifiers. Typedefs
4148 that do not specify a type that you could actually work with
4149 will not be returned.
4150
4151 The "def" method returns a true value for all identifiers
4152 returned by "typedef_names".
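
The relationship between "typedef_names" and "def" can be checked with
a short sketch like the following. The two typedefs are taken from the
sample output in this document; everything else is illustrative.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned long U32;
typedef void *any;
ENDC

my @names = sort { $a cmp $b } $c->typedef_names;

# Collect names for which def() does not return a true value;
# according to the text above, this list should stay empty.
my @undefined = grep { !$c->def($_) } @names;

print "typedefs: @names\n";
print "all of them are defined\n" unless @undefined;
```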
4153
4154 typedef
4155 "typedef"
4156 "typedef" LIST
4157 Returns a list of references to hashes containing detailed
4158 information about all typedefs that have been parsed.
4159
4160 If a list of typedef identifiers is passed to the method, the
4161 returned list will only contain hash references for those
4162 typedefs.
4163
4164 If an identifier cannot be found, the returned list will
4165 contain an undefined value at that position.
4166
In scalar context, the number of typedefs is returned unless
exactly one argument is passed to the method. With exactly one
argument, a hash reference holding information for that
typedef is returned.
4171
4172 The list returned by the "typedef" method looks similar to
4173 this:
4174
4175 @typedef = (
4176 {
4177 'type' => 'unsigned long',
4178 'declarator' => 'U32'
4179 },
4180 {
4181 'type' => 'void',
4182 'declarator' => '*any'
4183 },
4184 {
4185 'type' => {
4186 'align' => 1,
4187 'declarations' => [
4188 {
4189 'type' => 'int',
4190 'declarators' => [
4191 {
4192 'size' => 8,
4193 'offset' => 0,
4194 'declarator' => 'abc[2]'
4195 }
4196 ]
4197 },
4198 {
4199 'type' => 'struct xxx',
4200 'declarators' => [
4201 {
4202 'size' => 96,
4203 'offset' => 0,
4204 'declarator' => 'ab[3][4]'
4205 }
4206 ]
4207 },
4208 {
4209 'type' => 'any',
4210 'declarators' => [
4211 {
4212 'size' => 8,
4213 'offset' => 0,
4214 'declarator' => 'ptr'
4215 }
4216 ]
4217 }
4218 ],
4219 'type' => 'union',
4220 'size' => 96,
4221 'context' => 'definitions.c(29)',
4222 'pack' => 0
4223 },
4224 'declarator' => 'test'
4225 }
4226 );
4227
4228 "declarator"
4229 is the type declarator.
4230
4231 "type"
is the type specification. This may be a string or a
reference to a hash describing the type. See "enum" and
"compound" for a description of how to interpret this hash.
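
The calling conventions described for "typedef" can be tried out with a
small sketch. The typedefs are the simple ones from the sample output;
the identifier "no_such_type" is a deliberately unknown name made up
for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned long U32;
typedef void *any;
ENDC

# Scalar context with exactly one argument: a single hash reference
my $u32 = $c->typedef('U32');
print "$u32->{declarator} is $u32->{type}\n";

# List context: one entry per requested identifier,
# with an undefined value for identifiers that cannot be found
my @info = $c->typedef('any', 'no_such_type');

for my $t (@info) {
    next unless defined $t;
    # 'type' is either a string or a hash describing the type
    my $type = ref $t->{type} ? '(hash reference)' : $t->{type};
    print "$t->{declarator} has type $type\n";
}
```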
4235
4236 macro_names
4237 "macro_names"
4238 Returns a list of all defined macro names.
4239
4240 The list returned by the "macro_names" method looks similar to
4241 this:
4242
4243 @macro_names = (
4244 '__STDC_VERSION__',
4245 '__STDC_HOSTED__',
4246 'DEFINED',
4247 'MULTIPLY',
4248 'ABC_SIZE'
4249 );
4250
4251 This works only as long as the preprocessor is not reset. See
4252 "Preprocessor configuration" for details.
4253
4254 macro
4255 "macro"
4256 "macro" LIST
4257 Returns the definitions for all defined macros.
4258
4259 If a list of macro names is passed to the method, the returned
4260 list will only contain the definitions for those macros. For
4261 undefined macros, "undef" will be returned.
4262
4263 The list returned by the "macro" method looks similar to this:
4264
4265 @macro = (
4266 '__STDC_VERSION__ 199901L',
4267 '__STDC_HOSTED__ 1',
4268 'DEFINED',
4269 'MULTIPLY(x, y) ((x)*(y))',
4270 'ABC_SIZE 2'
4271 );
4272
4273 This works only as long as the preprocessor is not reset. See
4274 "Preprocessor configuration" for details.
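
A possible way to combine "macro_names" and "macro" is sketched below.
The macros are the ones from the sample output; "abc_t" and
"NO_SUCH_MACRO" are made-up names. The queries are made right after
parsing, before any configuration change could reset the preprocessor.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
#define ABC_SIZE 2
#define MULTIPLY(x, y) ((x)*(y))
typedef int abc_t[ABC_SIZE];
ENDC

# Build a lookup table of all macro names known after parsing
my %is_defined = map { $_ => 1 } $c->macro_names;

# Request two definitions; the second macro is not defined
my ($abc, $missing) = $c->macro('ABC_SIZE', 'NO_SUCH_MACRO');

print "ABC_SIZE definition: $abc\n" if defined $abc;
print "NO_SUCH_MACRO is not defined\n" unless defined $missing;
```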
4275
4277 You can alternatively call the following functions as methods on
4278 Convert::Binary::C objects.
4279
4280 feature
4281 "feature" STRING
4282 Checks if Convert::Binary::C was built with certain features.
4283 For example,
4284
4285 print "debugging version"
4286 if Convert::Binary::C::feature('debug');
4287
4288 will check if Convert::Binary::C was built with debugging
4289 support enabled. The "feature" function returns 1 if the
4290 feature is enabled, 0 if the feature is disabled, and "undef"
4291 if the feature is unknown. Currently the only features that can
4292 be checked are "ieeefp" and "debug".
4293
4294 You can enable or disable certain features at compile time of
4295 the module by using the
4296
4297 perl Makefile.PL enable-feature disable-feature
4298
4299 syntax.
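
The three possible return values of "feature" can be told apart as in
the following sketch. The name "no_such_feature" is made up to show the
"undef" case.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my %state;

for my $name (qw( ieeefp debug no_such_feature )) {
    # 1 = enabled, 0 = disabled, undef = unknown feature
    $state{$name} = Convert::Binary::C::feature($name);

    printf "%-16s %s\n", $name,
        !defined $state{$name} ? 'unknown'
      : $state{$name}          ? 'enabled'
      :                          'disabled';
}
```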
4300
4301 native
4302 "native"
4303 "native" STRING
4304 Returns the value of a property of the native system that
4305 Convert::Binary::C was built on. For example,
4306
4307 $size = Convert::Binary::C::native('IntSize');
4308
4309 will fetch the size of an "int" on the native system. The
4310 following properties can be queried:
4311
4312 Alignment
4313 ByteOrder
4314 CharSize
4315 CompoundAlignment
4316 DoubleSize
4317 EnumSize
4318 FloatSize
4319 HostedC
4320 IntSize
4321 LongDoubleSize
4322 LongLongSize
4323 LongSize
4324 PointerSize
4325 ShortSize
4326 StdCVersion
4327 UnsignedBitfields
4328 UnsignedChars
4329
4330 You can also call "native" without arguments, in which case it
4331 will return a reference to a hash with all properties, like:
4332
4333 $native = {
4334 'EnumSize' => 4,
4335 'ShortSize' => 2,
4336 'UnsignedChars' => 0,
4337 'IntSize' => 4,
4338 'LongDoubleSize' => 16,
4339 'StdCVersion' => 201710,
4340 'HostedC' => 1,
4341 'CompoundAlignment' => 1,
4342 'UnsignedBitfields' => 0,
4343 'DoubleSize' => 8,
4344 'Alignment' => 16,
4345 'PointerSize' => 8,
4346 'ByteOrder' => 'LittleEndian',
4347 'LongLongSize' => 8,
4348 'CharSize' => 1,
4349 'LongSize' => 8,
4350 'FloatSize' => 4
4351 };
4352
The contents of that hash are suitable for passing to the
"configure" method.
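
As a sketch of that use, the native properties can be handed to the
constructor so that the object mirrors the build host. The typedef
"native_int" is an assumption made up for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Fetch all native properties at once
my $native = Convert::Binary::C::native();

# Configure a new object with them and parse a trivial type
my $c = Convert::Binary::C->new(%$native)->parse(<<'ENDC');
typedef int native_int;
ENDC

printf "int is %d bytes, byte order is %s\n",
    $c->sizeof('native_int'), $native->{ByteOrder};
```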
4355
4357 Like perl itself, Convert::Binary::C can be compiled with debugging
4358 support that can then be selectively enabled at runtime. You can
specify whether you want to build Convert::Binary::C with debugging
support by explicitly giving an argument to Makefile.PL. Use
4361
4362 perl Makefile.PL enable-debug
4363
4364 to enable debugging, or
4365
4366 perl Makefile.PL disable-debug
4367
4368 to disable debugging. The default will depend on how your perl binary
4369 was built. If it was built with "-DDEBUGGING", Convert::Binary::C will
4370 be built with debugging support, too.
4371
4372 Once you have built Convert::Binary::C with debugging support, you can
4373 use the following syntax to enable debug output. Instead of
4374
4375 use Convert::Binary::C;
4376
4377 you simply say
4378
4379 use Convert::Binary::C debug => 'all';
4380
which will enable all debug output. However, I don't recommend
enabling all debug output, because the amount can be fairly large.
4383
4384 Debugging options
4385 Instead of saying "all", you can pass a string that consists of one or
4386 more of the following characters:
4387
4388 m enable memory allocation tracing
4389 M enable memory allocation & assertion tracing
4390
4391 h enable hash table debugging
4392 H enable hash table dumps
4393
4394 d enable debug output from the XS module
4395 c enable debug output from the ctlib
4396 t enable debug output about type objects
4397
4398 l enable debug output from the C lexer
4399 p enable debug output from the C parser
4400 P enable debug output from the C preprocessor
4401 r enable debug output from the #pragma parser
4402
4403 y enable debug output from yacc (bison)
4404
4405 So the following might give you a brief overview of what's going on
4406 inside Convert::Binary::C:
4407
4408 use Convert::Binary::C debug => 'dct';
4409
4410 When you want to debug memory allocation using
4411
4412 use Convert::Binary::C debug => 'm';
4413
4414 you can use the Perl script check_alloc.pl that resides in the
4415 ctlib/util/tool directory to extract statistics about memory usage and
4416 information about memory leaks from the resulting debug output.
4417
4418 Redirecting debug output
4419 By default, all debug output is written to "stderr". You can, however,
4420 redirect the debug output to a file with the "debugfile" option:
4421
4422 use Convert::Binary::C debug => 'dcthHm',
4423 debugfile => './debug.out';
4424
If the file cannot be opened, you'll receive a warning and the output
will be written to "stderr" instead.
4427
4428 Alternatively, you can use the environment variables "CBC_DEBUG_OPT"
4429 and "CBC_DEBUG_FILE" to turn on debug output.
4430
4431 If Convert::Binary::C is built without debugging support, passing the
4432 "debug" or "debugfile" options will cause a warning to be issued. The
4433 corresponding environment variables will simply be ignored.
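
The environment variables can also be set from within a script. The
sketch below assumes that they are read when the module is loaded,
which is why they are set in a "BEGIN" block placed before the "use"
line. The file name is a made-up example. On a build without debugging
support the settings should simply have no effect.

```perl
use strict;
use warnings;

BEGIN {
    # Assumption: the variables are evaluated when the module is loaded,
    # so they have to be in place before the 'use' statement below.
    $ENV{CBC_DEBUG_OPT}  = 'dct'         unless defined $ENV{CBC_DEBUG_OPT};
    $ENV{CBC_DEBUG_FILE} = './debug.out' unless defined $ENV{CBC_DEBUG_FILE};
}

use Convert::Binary::C;

# Report whether this build can produce debug output at all
my $has_debug = Convert::Binary::C::feature('debug');
print "debugging support: ", ($has_debug ? "yes" : "no"), "\n";
```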
4434
4436 "CBC_ORDER_MEMBERS"
4437 Setting this variable to a non-zero value will globally turn on hash
4438 key ordering for compound members. Have a look at the "OrderMembers"
4439 option for details.
4440
4441 Setting the variable to the name of a perl module will additionally use
4442 this module instead of the predefined modules for member ordering to
4443 tie the hashes to.
4444
4445 "CBC_DEBUG_OPT"
4446 If Convert::Binary::C is built with debugging support, you can use this
4447 variable to specify the debugging options.
4448
4449 "CBC_DEBUG_FILE"
4450 If Convert::Binary::C is built with debugging support, you can use this
4451 variable to redirect the debug output to a file.
4452
4453 "CBC_DISABLE_PARSER"
4454 This variable is intended purely for development. Setting it to a non-
4455 zero value disables the Convert::Binary::C parser, which means that no
4456 information is collected from the file or code that is parsed. However,
4457 the preprocessor will run, which is useful for benchmarking the
4458 preprocessor.
4459
4461 Flexible array members are a feature introduced with ISO-C99. It's a
4462 common problem that you have a variable length data field at the end of
4463 a structure, for example an array of characters at the end of a message
4464 struct. ISO-C99 allows you to write this as:
4465
4466 struct message {
4467 long header;
4468 char data[];
4469 };
4470
4471 The advantage is that you clearly indicate that the size of the
4472 appended data is variable, and that the "data" member doesn't
4473 contribute to the size of the "message" structure.
4474
When packing or unpacking data, Convert::Binary::C deals with flexible
array members as if their length were adjustable. For example, "unpack"
will adapt the length of the array depending on the input string:
4478
4479 $msg1 = $c->unpack('message', 'abcdefg');
4480 $msg2 = $c->unpack('message', 'abcdefghijkl');
4481
4482 The following data is unpacked:
4483
4484 $msg1 = {
4485 'header' => 1633837924,
4486 'data' => [
4487 101,
4488 102,
4489 103
4490 ]
4491 };
4492 $msg2 = {
4493 'header' => 1633837924,
4494 'data' => [
4495 101,
4496 102,
4497 103,
4498 104,
4499 105,
4500 106,
4501 107,
4502 108
4503 ]
4504 };
4505
4506 Similarly, pack will adjust the length of the output string according
4507 to the data you feed in:
4508
4509 use Data::Hexdumper;
4510
4511 $msg = {
4512 header => 4711,
4513 data => [0x10, 0x20, 0x30, 0x40, 0x77..0x88],
4514 };
4515
4516 $data = $c->pack('message', $msg);
4517
4518 print hexdump(data => $data);
4519
4520 This would print:
4521
4522 0x0000 : 00 00 12 67 10 20 30 40 77 78 79 7A 7B 7C 7D 7E : ...g..0@wxyz{|}~
4523 0x0010 : 7F 80 81 82 83 84 85 86 87 88 : ..........
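
The behaviour shown above can be reproduced in one self-contained
script. This is only a sketch: the "ByteOrder" and "LongSize" settings
are assumptions chosen to match the sample output, and Perl's built-in
"unpack" with the "H*" template stands in for Data::Hexdumper.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', LongSize => 4)
                          ->parse(<<'ENDC');
struct message {
  long header;
  char data[];
};
ENDC

# The flexible array member does not contribute to the struct size
my $fixed = $c->sizeof('message');

# Everything after the fixed part ends up in 'data'
my $msg   = $c->unpack('message', 'abcdefg');
my $extra = length('abcdefg') - $fixed;

# The length of the packed string follows the number of array elements
my $packed = $c->pack('message', { header => 4711, data => [0x10, 0x20] });
my $hex    = unpack 'H*', $packed;

printf "fixed part: %d bytes, data elements: %d\n",
    $fixed, scalar @{ $msg->{data} };
print "packed: $hex\n";
```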
4524
4525 Incomplete types such as
4526
4527 typedef unsigned long array[];
4528
4529 are handled in exactly the same way. Thus, you can easily
4530
4531 $array = $c->unpack('array', '?'x20);
4532
4533 which will unpack the following array:
4534
4535 $array = [
4536 1061109567,
4537 1061109567,
4538 1061109567,
4539 1061109567,
4540 1061109567
4541 ];
4542
4543 You can also alter the length of an array using the "Dimension" tag.
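
One way the "Dimension" tag might be used is to take the array length
from another member of the same struct. The sketch below assumes that
form of the tag (a member name as the tag value). The struct "counted"
and its members are made up for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', IntSize => 4)
                          ->parse(<<'ENDC');
struct counted {
  unsigned count;
  char data[];
};
ENDC

# Let the 'count' member determine the length of 'data'
$c->tag('counted.data', Dimension => 'count');

# A count of 2, followed by more bytes than the count asks for
my $input = pack('N', 2) . 'abcdef';
my $msg   = $c->unpack('counted', $input);

printf "count=%d, unpacked elements=%d\n",
    $msg->{count}, scalar @{ $msg->{data} };
```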
4544
4546 When using Convert::Binary::C to handle floating point values, you have
4547 to be aware of some limitations.
4548
4549 You're usually safe if all your platforms are using the IEEE floating
4550 point format. During the Convert::Binary::C build process, the "ieeefp"
4551 feature will automatically be enabled if the host is using IEEE
4552 floating point. You can check for this feature at runtime using the
4553 "feature" function:
4554
4555 if (Convert::Binary::C::feature('ieeefp')) {
4556 # do something
4557 }
4558
4559 When IEEE floating point support is enabled, the module can also handle
4560 floating point values of a different byteorder.
4561
4562 If your host platform is not using IEEE floating point, the "ieeefp"
4563 feature will be disabled. Convert::Binary::C then will be more
4564 restrictive, refusing to handle any non-native floating point values.
4565
4566 However, Convert::Binary::C cannot detect the floating point format
4567 used by your target platform. It can only try to prevent problems in
4568 obvious cases. If you know your target platform has a completely
4569 different floating point format, don't use floating point conversion at
4570 all.
4571
4572 Whenever Convert::Binary::C detects that it cannot properly do floating
4573 point value conversion, it will issue a warning and will not attempt to
4574 convert the floating point value.
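
A cautious script can check for the "ieeefp" feature before relying on
floating point conversion. The following sketch packs a single "float"
in big-endian byte order; the struct "sample" and the explicit
"FloatSize" setting are assumptions for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', FloatSize => 4)
                          ->parse('struct sample { float value; };');

my ($hex, $back);

if (Convert::Binary::C::feature('ieeefp')) {
    # 1.5 is exactly representable, so the round trip is lossless
    my $packed = $c->pack('sample', { value => 1.5 });
    $hex  = unpack 'H*', $packed;
    $back = $c->unpack('sample', $packed)->{value};
    print "packed as $hex, unpacked as $back\n";
}
else {
    print "no IEEE floating point support, skipping conversion\n";
}
```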
4575
4577 Bitfield support in Convert::Binary::C is currently in an experimental
4578 state. You are encouraged to test it, but you should not blindly rely
4579 on its results.
4580
You are also encouraged to supply layout algorithms for compilers
whose bitfield implementation is not handled correctly at the moment.
Even better than a plain algorithm is, of course, a patch that adds a
new bitfield layout engine.
4585
4586 While bitfields may not be handled correctly by the conversion routines
4587 yet, they are always parsed correctly. This means that you can reliably
4588 use the declarator fields as returned by the "struct" or "typedef"
4589 methods. Given the following source
4590
4591 struct bitfield {
4592 int seven:7;
4593 int :1;
4594 int four:4, :0;
4595 int integer;
4596 };
4597
4598 a call to "struct" will return
4599
4600 @struct = (
4601 {
4602 'identifier' => 'bitfield',
4603 'align' => 1,
4604 'declarations' => [
4605 {
4606 'type' => 'int',
4607 'declarators' => [
4608 {
4609 'declarator' => 'seven:7'
4610 }
4611 ]
4612 },
4613 {
4614 'type' => 'int',
4615 'declarators' => [
4616 {
4617 'declarator' => ':1'
4618 }
4619 ]
4620 },
4621 {
4622 'type' => 'int',
4623 'declarators' => [
4624 {
4625 'declarator' => 'four:4'
4626 },
4627 {
4628 'declarator' => ':0'
4629 }
4630 ]
4631 },
4632 {
4633 'type' => 'int',
4634 'declarators' => [
4635 {
4636 'size' => 4,
4637 'offset' => 4,
4638 'declarator' => 'integer'
4639 }
4640 ]
4641 }
4642 ],
4643 'type' => 'struct',
4644 'size' => 8,
4645 'context' => 'bitfields.c(1)',
4646 'pack' => 0
4647 }
4648 );
4649
4650 No size/offset keys will currently be returned for bitfield entries.
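
Since the declarator strings are parsed reliably, bitfields can be
picked out by looking at those strings. The sketch below uses the
struct from above and treats a colon in the declarator as the mark of a
bitfield, which is an assumption based on the output shown.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct bitfield {
  int seven:7;
  int :1;
  int four:4, :0;
  int integer;
};
ENDC

my (@bitfields, @regular);

for my $declaration (@{ $c->struct('bitfield')->{declarations} }) {
    for my $d (@{ $declaration->{declarators} }) {
        # A ':' in the declarator string marks a bitfield
        if ($d->{declarator} =~ /:/) {
            push @bitfields, $d->{declarator};
        }
        else {
            push @regular, $d->{declarator};
        }
    }
}

print "bitfields: @bitfields\n";
print "regular:   @regular\n";
```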
4651
4653 Convert::Binary::C was designed to be thread-safe.
4654
4656 If you wish to derive a new class from Convert::Binary::C, this is
4657 relatively easy. Despite their XS implementation, Convert::Binary::C
4658 objects are actually blessed hash references.
4659
4660 The XS data is stored in a read-only hash value for the key that is the
4661 empty string. So it is safe to use any non-empty hash key when deriving
4662 your own class. In addition, Convert::Binary::C does quite a lot of
4663 checks to detect corruption in the object hash.
4664
4665 If you store private data in the hash, you should override the "clone"
4666 method and provide the necessary code to clone your private data.
4667 You'll have to call "SUPER::clone", but this will only clone the
4668 Convert::Binary::C part of the object.
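
A derived class along these lines could look like the following
sketch. The class name "My::Binary::C", the "Label" option and the
"label" hash key are all made up for this example.

```perl
package My::Binary::C;    # hypothetical class name

use strict;
use warnings;
use parent 'Convert::Binary::C';

sub new {
    my ($class, %opt) = @_;
    my $label = delete $opt{Label};    # private option, unknown to the base class
    my $self  = $class->SUPER::new(%opt);
    $self->{label} = $label;           # any non-empty hash key is safe to use
    return $self;
}

sub clone {
    my $self  = shift;
    my $clone = $self->SUPER::clone;   # clones only the Convert::Binary::C part
    $clone->{label} = $self->{label};  # private data has to be copied by hand
    return $clone;
}

package main;

my $c    = My::Binary::C->new(Label => 'wire format', IntSize => 4);
my $copy = $c->clone;

print "label of the clone: $copy->{label}\n";
```

For private data that contains references, the "clone" override would
need a deep copy instead of the plain assignment used here.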
4669
4670 For an example of a derived class, you can have a look at
4671 Convert::Binary::C::Cached.
4672
4674 Convert::Binary::C should build and run on most of the platforms that
4675 Perl runs on:
4676
4677 • Various Linux systems
4678
4679 • Various BSD systems
4680
4681 • HP-UX
4682
4683 • Compaq/HP Tru64 Unix
4684
4685 • Mac-OS X
4686
4687 • Cygwin
4688
4689 • Windows 98/NT/2000/XP
4690
4691 Also, many architectures are supported:
4692
4693 • Various Intel Pentium and Itanium systems
4694
4695 • Various Alpha systems
4696
4697 • HP PA-RISC
4698
4699 • Power-PC
4700
4701 • StrongARM
4702
4703 The module should build with any perl binary from 5.004 up to the
4704 latest development version.
4705
4707 Most of the time when you're really looking for Convert::Binary::C
4708 you'll actually end up finding one of the following modules. Some of
4709 them have different goals, so it's probably worth pointing out the
4710 differences.
4711
4712 C::Include
4713 Like Convert::Binary::C, this module aims at doing conversion from and
4714 to binary data based on C types. However, its configurability is very
4715 limited compared to Convert::Binary::C. Also, it does not parse all C
code correctly. It's slower than Convert::Binary::C and doesn't have a
preprocessor. On the plus side, it's written in pure Perl.
4718
4719 C::DynaLib::Struct
4720 This module doesn't allow you to reuse your C source code. One main
4721 goal of Convert::Binary::C was to avoid code duplication or, even
4722 worse, having to maintain different representations of your data
4723 structures. Like C::Include, C::DynaLib::Struct is rather limited in
4724 its configurability.
4725
4726 Win32::API::Struct
4727 This module has a special purpose. It aims at building structs for
4728 interfacing Perl code with Windows API code.
4729
4731 • Alain Barbet <alian@cpan.org> for testing and debugging support.
4732
4733 • Mitchell N. Charity for giving me pointers into various interesting
4734 directions.
4735
4736 • Alexis Denis for making me improve (externally) and simplify
4737 (internally) floating point support. He can also be blamed
4738 (indirectly) for the "initializer" method, as I need it in my effort
4739 to support bitfields some day.
4740
4741 • Michael J. Hohmann <mjh@scientist.de> for endless discussions on our
4742 way to and back home from work, and for making me think about
4743 supporting "pack" and "unpack" for compound members.
4744
4745 • Thorsten Jens <thojens@gmx.de> for testing the package on various
4746 platforms.
4747
4748 • Mark Overmeer <mark@overmeer.net> for suggesting the module name and
4749 giving invaluable feedback.
4750
4751 • Thomas Pornin <pornin@bolet.org> for his excellent "ucpp"
4752 preprocessor library.
4753
4754 • Marc Rosenthal for his suggestions and support.
4755
4756 • James Roskind, as his C parser was a great starting point to fix all
4757 the problems I had with my original parser based only on the ANSI
4758 ruleset.
4759
4760 • Gisbert W. Selke for spotting some interesting bugs and providing
4761 extensive reports.
4762
4763 • Steffen Zimmermann for a prolific discussion on the cloning
4764 algorithm.
4765
I'm sure there are still lots of bugs in the code for this module. If
you find a bug, if Convert::Binary::C doesn't build on your system, or
if any of its tests fail, please report the issue at
<https://github.com/mhx/Convert-Binary-C/issues>.
4771
Some features in Convert::Binary::C are marked as experimental. This
is most probably for one of the following reasons:
4775
4776 • The feature does not behave in exactly the way that I wish it did,
4777 possibly due to some limitations in the current design of the module.
4778
4779 • The feature hasn't been tested enough and may completely fail to
4780 produce the expected results.
4781
4782 I hope to fix most issues with these experimental features someday, but
4783 this may mean that I have to change the way they currently work in a
4784 way that's not backwards compatible. So if any of these features is
4785 useful to you, you can use it, but you should be aware that the
4786 behaviour or the interface may change in future releases of this
4787 module.
4788
4790 If you're interested in what I currently plan to improve (or fix), have
4791 a look at the TODO file.
4792
4794 Copyright (c) 2002-2020 Marcus Holland-Moritz. All rights reserved.
4795 This program is free software; you can redistribute it and/or modify it
4796 under the same terms as Perl itself.
4797
4798 The "ucpp" library is (c) 1998-2002 Thomas Pornin. For license and
4799 redistribution details refer to ctlib/ucpp/README.
4800
4801 Portions copyright (c) 1989, 1990 James A. Roskind.
4802
4804 See ccconfig, perl, perldata, perlop, perlvar, Data::Dumper and
4805 Scalar::Util.
4806
4807
4808
4809perl v5.36.0 2022-07-22 Convert::Binary::C(3)