Convert::Binary::C(3)  User Contributed Perl Documentation  Convert::Binary::C(3)
2
3
4
NAME
6 Convert::Binary::C - Binary Data Conversion using C Types
7
SYNOPSIS
9 Simple
10 use Convert::Binary::C;
11
12 #---------------------------------------------
13 # Create a new object and parse embedded code
14 #---------------------------------------------
15 my $c = Convert::Binary::C->new->parse(<<ENDC);
16
17 enum Month { JAN, FEB, MAR, APR, MAY, JUN,
18 JUL, AUG, SEP, OCT, NOV, DEC };
19
20 struct Date {
21 int year;
22 enum Month month;
23 int day;
24 };
25
26 ENDC
27
28 #-----------------------------------------------
29 # Pack Perl data structure into a binary string
30 #-----------------------------------------------
31 my $date = { year => 2002, month => 'DEC', day => 24 };
32
33 my $packed = $c->pack('Date', $date);
34
35 Advanced
36 use Convert::Binary::C;
37 use Data::Dumper;
38
39 #---------------------
40 # Create a new object
41 #---------------------
42 my $c = new Convert::Binary::C ByteOrder => 'BigEndian';
43
44 #---------------------------------------------------
45 # Add include paths and global preprocessor defines
46 #---------------------------------------------------
47 $c->Include('/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include',
48 '/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include-fixed',
49 '/usr/include')
50 ->Define(qw( __USE_POSIX __USE_ISOC99=1 ));
51
52 #----------------------------------
53 # Parse the 'time.h' header file
54 #----------------------------------
55 $c->parse_file('time.h');
56
57 #---------------------------------------
58 # See which files the object depends on
59 #---------------------------------------
60 print Dumper([$c->dependencies]);
61
62 #-----------------------------------------------------------
63 # See if struct timespec is defined and dump its definition
64 #-----------------------------------------------------------
65 if ($c->def('struct timespec')) {
66 print Dumper($c->struct('timespec'));
67 }
68
69 #-------------------------------
70 # Create some binary dummy data
71 #-------------------------------
72 my $data = "binary_test_string";
73
74 #--------------------------------------------------------
75 # Unpack $data according to 'struct timespec' definition
76 #--------------------------------------------------------
77 if (length($data) >= $c->sizeof('timespec')) {
78 my $perl = $c->unpack('timespec', $data);
79 print Dumper($perl);
80 }
81
82 #--------------------------------------------------------
83 # See which member lies at offset 5 of 'struct timespec'
84 #--------------------------------------------------------
85 my $member = $c->member('timespec', 5);
86 print "member('timespec', 5) = '$member'\n";
87
DESCRIPTION
89 Convert::Binary::C is a preprocessor and parser for C type definitions.
90 It is highly configurable and supports arbitrarily complex data
91 structures. Its object-oriented interface has "pack" and "unpack"
92 methods that act as replacements for Perl's "pack" and "unpack" and
93 allow one to use C types instead of a string representation of the data
94 structure for conversion of binary data from and to Perl's complex data
95 structures.
96
97 Actually, what Convert::Binary::C does is not very different from what
98 a C compiler does, just that it doesn't compile the source code into an
99 object file or executable, but only parses the code and allows Perl to
100 use the enumerations, structs, unions and typedefs that have been
101 defined within your C source for binary data conversion, similar to
102 Perl's "pack" and "unpack".
103
104 Beyond that, the module offers a lot of convenience methods to retrieve
105 information about the C types that have been parsed.
106
107 Background and History
108 In late 2000 I wrote a real-time debugging interface for an embedded
109 medical device that allowed me to send out data from that device over
110 its integrated Ethernet adapter. The interface was "printf()"-like, so
111 you could easily send out strings or numbers. But you could also send
112 out what I called arbitrary data, which was intended for arbitrary
113 blocks of the device's memory.
114
115 Another part of this real-time debugger was a Perl application running
116 on my workstation that gathered all the messages that were sent out
117 from the embedded device. It printed all the strings and numbers, and
118 hex-dumped the arbitrary data. However, manually parsing a couple of
119 300 byte hex-dumps of a complex C structure is not only frustrating,
120 but also error-prone and time consuming.
121
122 Using "unpack" to retrieve the contents of a C structure works fine for
123 small structures and if you don't have to deal with struct member
124 alignment. But otherwise, maintaining such code can be as awful as
125 deciphering hex-dumps.
126
127 As I didn't find anything to solve my problem on the CPAN, I wrote a
128 little module that translated simple C structs into "unpack" strings.
129 It worked, but it was slow. And since it couldn't deal with struct
130 member alignment, I soon found myself adding padding bytes everywhere.
131 So again, I had to maintain two sources, and changing one of them
132 forced me to touch the other one.
133
134 All in all, this little module seemed to make my task a bit easier, but
135 it was far from being what I was thinking of:
136
137 · A module that could directly use the source I've been coding for the
138 embedded device without any modifications.
139
140 · A module that could be configured to match the properties of the
141 different compilers and target platforms I was using.
142
143 · A module that was fast enough to decode a great amount of binary data
144 even on my slow workstation.
145
146 I didn't know how to accomplish these tasks until I read something
147 about XS. At least, it seemed as if it could solve my performance
148 problems. However, writing a C parser in C isn't easier than it is in
149 Perl. But writing a C preprocessor from scratch is even worse.
150
Fortunately, after a few weeks of searching I found both a lean,
open-source C preprocessor library and a reusable YACC grammar for
ANSI-C. That was the beginning of the development of
Convert::Binary::C in late 2001.
155
I have been using the module successfully in my embedded environment
since long before it appeared on CPAN. From my point of view, it is
exactly what I had in mind. It's fast, flexible, easy to use and
portable. It doesn't require external programs or other Perl modules.
160
161 About this document
This document describes how to use Convert::Binary::C. A lot of
different features are presented, and the example code sometimes uses
Perl's more advanced language elements. If your experience with Perl is
rather limited, you should at least know how to use Perl's very good
documentation system.
167
168 To look up one of the manpages, use the "perldoc" command. For
169 example,
170
171 perldoc perl
172
173 will show you Perl's main manpage. To look up a specific Perl function,
174 use "perldoc -f":
175
176 perldoc -f map
177
178 gives you more information about the "map" function. You can also
179 search the FAQ using "perldoc -q":
180
181 perldoc -q array
182
183 will give you everything you ever wanted to know about Perl arrays. But
184 now, let's go on with some real stuff!
185
186 Why use Convert::Binary::C?
187 Say you want to pack (or unpack) data according to the following C
188 structure:
189
190 struct foo {
191 char ary[3];
192 unsigned short baz;
193 int bar;
194 };
195
196 You could of course use Perl's "pack" and "unpack" functions:
197
198 @ary = (1, 2, 3);
199 $baz = 40000;
200 $bar = -4711;
201 $binary = pack 'c3 S i', @ary, $baz, $bar;
202
203 But this implies that the struct members are byte aligned. If they were
204 long aligned (which is the default for most compilers), you'd have to
205 write
206
207 $binary = pack 'c3 x S x2 i', @ary, $baz, $bar;
208
209 which doesn't really increase readability.
210
211 Now imagine that you need to pack the data for a completely different
212 architecture with different byte order. You would look into the "pack"
213 manpage again and perhaps come up with this:
214
215 $binary = pack 'c3 x n x2 N', @ary, $baz, $bar;
216
However, if you try to unpack $binary again, your signed values will
have turned into unsigned ones.
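The three "pack" templates above can be compared side by side with core Perl alone. The following is only an illustrative sketch: the variable names $tight, $padded, $extra, $be and @back are made up here, and since the size of the "i" format depends on how perl was built, the sketch looks at the padding difference rather than at absolute lengths.

```perl
use strict;
use warnings;

my @ary = (1, 2, 3);
my ($baz, $bar) = (40000, -4711);

# Byte-aligned layout: no padding between the members.
my $tight  = pack 'c3 S i', @ary, $baz, $bar;

# Hand-padded layout: one pad byte after ary, two after baz.
my $padded = pack 'c3 x S x2 i', @ary, $baz, $bar;
my $extra  = length($padded) - length($tight);   # the three pad bytes

# Big-endian layout: 'n' and 'N' are unsigned formats, so the
# sign of $bar does not survive the round trip.
my $be   = pack 'c3 x n x2 N', @ary, $baz, $bar;
my @back = unpack 'c3 x n x2 N', $be;

print "padding bytes: $extra\n";
print "bar after round trip: $back[4]\n";        # 2**32 - 4711
```

Newer perls also offer signed big-endian formats such as "l>", which would keep the sign, but the padding still has to be maintained by hand.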
219
All this can still be managed with Perl. But what if your structures
get more complex? What if you need to support different platforms, or
make changes to the structures? You'll not only have to change the C
source but also dozens of "pack" strings in your Perl code. This is no
fun. And Perl should be fun.
225
226 Now, wouldn't it be great if you could just read in the C source you've
227 already written and use all the types defined there for packing and
228 unpacking? That's what Convert::Binary::C does.
229
230 Creating a Convert::Binary::C object
231 To use Convert::Binary::C just say
232
233 use Convert::Binary::C;
234
235 to load the module. Its interface is completely object oriented, so it
236 doesn't export any functions.
237
238 Next, you need to create a new Convert::Binary::C object. This can be
239 done by either
240
241 $c = Convert::Binary::C->new;
242
243 or
244
245 $c = new Convert::Binary::C;
246
247 You can optionally pass configuration options to the constructor as
248 described in the next section.
249
250 Configuring the object
251 To configure a Convert::Binary::C object, you can either call the
252 "configure" method or directly pass the configuration options to the
253 constructor. If you want to change byte order and alignment, you can
254 use
255
256 $c->configure(ByteOrder => 'LittleEndian',
257 Alignment => 2);
258
259 or you can change the construction code to
260
261 $c = new Convert::Binary::C ByteOrder => 'LittleEndian',
262 Alignment => 2;
263
264 Either way, the object will now know that it should use little endian
265 (Intel) byte order and 2-byte struct member alignment for packing and
266 unpacking.
267
268 Alternatively, you can use the option names as names of methods to
269 configure the object, like:
270
271 $c->ByteOrder('LittleEndian');
272
273 You can also retrieve information about the current configuration of a
274 Convert::Binary::C object. For details, see the section about the
275 "configure" method.
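As a rough sketch of how setting and reading options fit together, the following snippet configures an object in the ways described above and reads the values back. The option values are arbitrary and the variable names $order and $align are only used for illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian');

# Set several options at once; configure returns the object.
$c->configure(ByteOrder => 'LittleEndian', Alignment => 2);

# Read single options back, via configure or via the option method.
my $order = $c->configure('ByteOrder');
my $align = $c->Alignment;

# Option methods used as setters return the object, so they chain.
$c->ByteOrder('BigEndian')->Alignment(4);

print  "before: $order/$align\n";
printf "after:  %s/%d\n", $c->ByteOrder, $c->Alignment;
```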
276
277 Parsing C code
278 Convert::Binary::C allows two ways of parsing C source. Either by
279 parsing external C header or C source files:
280
281 $c->parse_file('header.h');
282
283 Or by parsing C code embedded in your script:
284
285 $c->parse(<<'CCODE');
286 struct foo {
287 char ary[3];
288 unsigned short baz;
289 int bar;
290 };
291 CCODE
292
293 Now the object $c will know everything about "struct foo". The example
294 above uses a so-called here-document. It allows one to easily embed
295 multi-line strings in your code. You can find more about here-documents
296 in perldata or perlop.
297
298 Since the "parse" and "parse_file" methods throw an exception when a
299 parse error occurs, you usually want to catch these in an "eval" block:
300
301 eval { $c->parse_file('header.h') };
302 if ($@) {
303 # handle error appropriately
304 }
305
306 Perl's special $@ variable will contain an empty string (which
307 evaluates to a false value in boolean context) on success or an error
308 string on failure.
309
310 As another feature, "parse" and "parse_file" return a reference to
311 their object on success, just like "configure" does when you're
312 configuring the object. This will allow you to write constructs like
313 this:
314
315 my $c = eval {
316 Convert::Binary::C->new(Include => ['/usr/include'])
317 ->parse_file('header.h')
318 };
319 if ($@) {
320 # handle error appropriately
321 }
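A slightly more defensive variant of the same pattern lets the "eval" block return a true value and tests that instead of $@. The sketch below feeds deliberately broken C code to "parse"; the struct name "broken" and the variables $ok and $error are made up for this example, and the exact wording of the error message is not shown because it may differ between versions.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;

# Deliberately broken input: the member and the struct both lack
# their terminating semicolons.
my $ok    = eval { $c->parse('struct broken { int a }'); 1 };
my $error = $ok ? '' : $@;

if ($ok) {
    print "parsed without errors\n";
}
else {
    print "parse failed: $error\n";
}
```

Testing the return value of "eval" rather than $@ avoids surprises when something else resets $@ between the failure and the check.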
322
323 Packing and unpacking
Convert::Binary::C has two methods, "pack" and "unpack", that act
similarly to the Perl functions of the same name. To perform the
packing described in the example above, you could write:
327
328 $data = {
329 ary => [1, 2, 3],
330 baz => 40000,
331 bar => -4711,
332 };
333 $binary = $c->pack('foo', $data);
334
335 Unpacking will work exactly the same way, just that the "unpack" method
336 will take a byte string as its input and will return a reference to a
337 (possibly very complex) Perl data structure.
338
339 $binary = get_data_from_memory();
340 $data = $c->unpack('foo', $binary);
341
342 You can now easily access all of the values:
343
344 print "foo.ary[1] = $data->{ary}[1]\n";
345
346 Or you can even more conveniently use the Data::Dumper module:
347
348 use Data::Dumper;
349 print Dumper($data);
350
351 The output would look something like this:
352
353 $VAR1 = {
354 'bar' => -271,
355 'baz' => 5000,
356 'ary' => [
357 42,
358 48,
359 100
360 ]
361 };
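To see both directions in one place, here is a self-contained round trip for "struct foo". The configuration (big endian, 2-byte short, 4-byte int, 4-byte alignment) is an assumption chosen for this sketch so that the result doesn't depend on the machine running it.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Assumed target: big endian, 2-byte short, 4-byte int, 4-byte alignment.
my $c = Convert::Binary::C->new(
    ByteOrder => 'BigEndian',
    ShortSize => 2,
    IntSize   => 4,
    Alignment => 4,
);

$c->parse(<<'CCODE');
struct foo {
  char ary[3];
  unsigned short baz;
  int bar;
};
CCODE

my $data   = { ary => [1, 2, 3], baz => 40000, bar => -4711 };
my $binary = $c->pack('foo', $data);     # Perl data -> byte string
my $copy   = $c->unpack('foo', $binary); # byte string -> Perl data

printf "struct foo occupies %d bytes\n", length $binary;
print  "bar = $copy->{bar}, baz = $copy->{baz}, ary[1] = $copy->{ary}[1]\n";
```

With this configuration the padding after "ary" and "baz" is handled by the module, and the signed value of "bar" survives the round trip.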
362
363 Preprocessor configuration
Convert::Binary::C uses Thomas Pornin's "ucpp" as an internal C
preprocessor. It is compliant with ISO-C99, so you don't have to worry
about using even weird preprocessor constructs in your code.
367
368 If your C source contains includes or depends upon preprocessor
369 defines, you may need to configure the internal preprocessor. Use the
370 "Include" and "Define" configuration options for that:
371
372 $c->configure(Include => ['/usr/include',
373 '/home/mhx/include'],
374 Define => [qw( NDEBUG FOO=42 )]);
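The effect of "Define" can be observed directly in the parsed types. In the sketch below, the macros BUFSZ and WITH_CRC and the struct "buffer" are hypothetical names invented for the example; they only illustrate that the defines are visible to the code being parsed.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Both macros are made up for this example.
my $c = Convert::Binary::C->new(Define => ['BUFSZ=16', 'WITH_CRC']);

$c->parse(<<'CCODE');
struct buffer {
  unsigned char data[BUFSZ];
#ifdef WITH_CRC
  unsigned char crc;
#endif
};
CCODE

# 16 data bytes plus the conditionally compiled crc byte.
my $size = $c->sizeof('buffer');
print "sizeof(struct buffer) = $size\n";
```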
375
376 If your code uses system includes, it is most likely that you will need
377 to define the symbols that are usually defined by the compiler.
378
379 On some operating systems, the system includes require the preprocessor
380 to predefine a certain set of assertions. Assertions are supported by
381 "ucpp", and you can define them either in the source code using
382 "#assert" or as a property of the Convert::Binary::C object using
383 "Assert":
384
385 $c->configure(Assert => ['predicate(answer)']);
386
387 Information about defined macros can be retrieved from the preprocessor
388 as long as its configuration isn't changed. The preprocessor is
389 implicitly reset if you change one of the following configuration
390 options:
391
392 Include
393 Define
394 Assert
395 HasCPPComments
396 HasMacroVAARGS
397
398 Supported pragma directives
399 Convert::Binary::C supports the "pack" pragma to locally override
400 struct member alignment. The supported syntax is as follows:
401
402 #pragma pack( ALIGN )
403 Sets the new alignment to ALIGN. If ALIGN is 0, resets the
404 alignment to its original value.
405
406 #pragma pack
407 Resets the alignment to its original value.
408
409 #pragma pack( push, ALIGN )
410 Saves the current alignment on a stack and sets the new alignment
411 to ALIGN. If ALIGN is 0, sets the alignment to the default
412 alignment.
413
414 #pragma pack( pop )
415 Restores the alignment to the last value saved on the stack.
416
417 /* Example assumes sizeof( short ) == 2, sizeof( long ) == 4. */
418
419 #pragma pack(1)
420
421 struct nopad {
422 char a; /* no padding bytes between 'a' and 'b' */
423 long b;
424 };
425
426 #pragma pack /* reset to "native" alignment */
427
428 #pragma pack( push, 2 )
429
430 struct pad {
431 char a; /* one padding byte between 'a' and 'b' */
432 long b;
433
434 #pragma pack( push, 1 )
435
436 struct {
437 char c; /* no padding between 'c' and 'd' */
438 short d;
439 } e; /* sizeof( e ) == 3 */
440
441 #pragma pack( pop ); /* back to pack( 2 ) */
442
443 long f; /* one padding byte between 'e' and 'f' */
444 };
445
446 #pragma pack( pop ); /* back to "native" */
447
448 The "pack" pragma as it is currently implemented only affects the
449 maximum struct member alignment. There are compilers that also allow
450 one to specify the minimum struct member alignment. This is not
451 supported by Convert::Binary::C.
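The padding described in the comments of the example can be checked with "sizeof" and "offsetof". The following sketch parses a condensed version of that code; the type sizes and the default alignment of 4 are assumptions made explicit in the constructor call.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Assumed sizes, matching the comment at the top of the C example.
my $c = Convert::Binary::C->new(ShortSize => 2, LongSize => 4,
                                Alignment => 4);

$c->parse(<<'ENDC');
#pragma pack(1)
struct nopad { char a; long b; };
#pragma pack

#pragma pack( push, 2 )
struct pad {
  char a;
  long b;
#pragma pack( push, 1 )
  struct { char c; short d; } e;
#pragma pack( pop )
  long f;
};
#pragma pack( pop )
ENDC

my $nopad_size = $c->sizeof('nopad');        # char + long, no padding
my $e_size     = $c->sizeof('pad.e');        # char + short, no padding
my $b_offset   = $c->offsetof('pad', 'b');   # one pad byte after 'a'
my $f_offset   = $c->offsetof('pad', 'f');   # one pad byte after 'e'

print "nopad: $nopad_size bytes, pad.e: $e_size bytes\n";
print "pad.b at offset $b_offset, pad.f at offset $f_offset\n";
```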
452
453 Automatic configuration using "ccconfig"
454 As there are over 20 different configuration options, setting all of
455 them correctly can be a lengthy and tedious task.
456
The "ccconfig" script, which is bundled with this module, aims at
automatically determining the correct compiler configuration by testing
the compiler executable. It works for both native and cross compilers.
460
UNDERSTANDING TYPES
462 This section covers one of the fundamental features of
463 Convert::Binary::C. It's how type expressions, referred to as TYPEs in
464 the method reference, are handled by the module.
465
466 Many of the methods, namely "pack", "unpack", "sizeof", "typeof",
467 "member", "offsetof", "def", "initializer" and "tag", are passed a TYPE
468 to operate on as their first argument.
469
470 Standard Types
471 These are trivial. Standard types are simply enum names, struct names,
472 union names, or typedefs. Almost every method that wants a TYPE will
473 accept a standard type.
474
475 For enums, structs and unions, the prefixes "enum", "struct" and
476 "union" are optional. However, if a typedef with the same name exists,
477 like in
478
479 struct foo {
480 int bar;
481 };
482
483 typedef int foo;
484
485 you will have to use the prefix to distinguish between the struct and
486 the typedef. Otherwise, a typedef is always given preference.
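The "def" method makes the difference visible. This sketch uses a small variation of the code above: the typedef is changed to "char" (an assumption made only so that the two types differ in size), and IntSize is set explicitly.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize => 4);

$c->parse(<<'CCODE');
struct foo {
  int bar;
};

typedef char foo;   /* 'char' instead of 'int', for illustration */
CCODE

# Without a prefix, the typedef is given preference.
my $plain_kind  = $c->def('foo');
my $struct_kind = $c->def('struct foo');

my $plain_size  = $c->sizeof('foo');
my $struct_size = $c->sizeof('struct foo');

print "foo:        $plain_kind, $plain_size byte(s)\n";
print "struct foo: $struct_kind, $struct_size byte(s)\n";
```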
487
488 Basic Types
489 Basic types, or atomic types, are "int" or "char", for example. It's
490 possible to use these basic types without having parsed any code. You
491 can simply do
492
493 $c = new Convert::Binary::C;
494 $size = $c->sizeof('unsigned long');
495 $data = $c->pack('short int', 42);
496
497 Even though the above works fine, it is not possible to define more
498 complex types on the fly, so
499
500 $size = $c->sizeof('struct { int a, b; }');
501
502 will result in an error.
503
504 Basic types are not supported by all methods. For example, it makes no
505 sense to use "member" or "offsetof" on a basic type. Using "typeof"
506 isn't very useful, but supported.
507
508 Member Expressions
509 This is by far the most complex part, depending on the complexity of
510 your data structures. Any standard type that defines a compound or an
511 array may be followed by a member expression to select only a certain
512 part of the data type. Say you have parsed the following C code:
513
514 struct foo {
515 long type;
516 struct {
517 short x, y;
518 } array[20];
519 };
520
521 typedef struct foo matrix[8][8];
522
523 You may want to know the size of the "array" member of "struct foo".
524 This is quite easy:
525
526 print $c->sizeof('foo.array'), " bytes";
527
528 will print
529
530 80 bytes
531
532 depending of course on the "ShortSize" you configured.
533
534 If you wanted to unpack only a single column of "matrix", that's easy
535 as well (and of course it doesn't matter which index you use):
536
537 $column = $c->unpack('matrix[2]', $data);
538
Just like in C, it is possible to use out-of-bounds array indices.
This means that, for example, although "array" is declared to have 20
elements, the following code
542
543 $size = $c->sizeof('foo.array[4711]');
544 $offset = $c->offsetof('foo', 'array[-13]');
545
546 is perfectly valid and will result in:
547
548 $size = 4
549 $offset = -48
550
551 Member expressions can be arbitrarily complex:
552
553 $type = $c->typeof('matrix[2][3].array[7].y');
554 print "the type is $type";
555
556 will, for example, print
557
558 the type is short
559
560 Member expressions are also used as the second argument to "offsetof".
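The member expressions discussed in this section can be tried out in one go. The sketch below assumes a 2-byte short, a 4-byte long and 4-byte alignment, which are the sizes the numbers quoted above are based on.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Assumed sizes; the results depend on them.
my $c = Convert::Binary::C->new(ShortSize => 2, LongSize => 4,
                                Alignment => 4);

$c->parse(<<'CCODE');
struct foo {
  long type;
  struct {
    short x, y;
  } array[20];
};

typedef struct foo matrix[8][8];
CCODE

my $array_size = $c->sizeof('foo.array');                   # 20 * 4 bytes
my $elem_size  = $c->sizeof('foo.array[4711]');             # out of bounds
my $neg_offset = $c->offsetof('foo', 'array[-13]');         # 4 - 13 * 4
my $type       = $c->typeof('matrix[2][3].array[7].y');

print "sizeof(foo.array)       = $array_size\n";
print "sizeof(foo.array[4711]) = $elem_size\n";
print "offsetof(array[-13])    = $neg_offset\n";
print "type of the y member    = $type\n";
```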
561
562 Offsets
563 Members returned by the "member" method have an optional offset suffix
564 to indicate that the given offset doesn't point to the start of that
565 member. For example,
566
567 $member = $c->member('matrix', 1431);
568 print $member;
569
570 will print
571
572 [2][1].type+3
573
574 If you would use this as a member expression, like in
575
576 $size = $c->sizeof("matrix $member");
577
578 the offset suffix will simply be ignored. Actually, it will be ignored
579 for all methods if it's used in the first argument.
580
When used in the second argument to "offsetof", it will usually do what
you mean, i.e. the offset suffix, if present, will be considered when
determining the offset. This behaviour ensures that
584
585 $member = $c->member('foo', 43);
586 $offset = $c->offsetof('foo', $member);
587 print "'$member' is located at offset $offset of struct foo";
588
589 will always correctly set $offset:
590
591 '.array[9].y+1' is located at offset 43 of struct foo
592
593 If this is not what you mean, e.g. because you want to know the offset
594 where the member returned by "member" starts, you just have to remove
595 the suffix:
596
597 $member =~ s/\+\d+$//;
598 $offset = $c->offsetof('foo', $member);
599 print "'$member' starts at offset $offset of struct foo";
600
601 This would then print:
602
603 '.array[9].y' starts at offset 42 of struct foo
604
USING TAGS
606 In a nutshell, tags are properties that you can attach to types.
607
608 You can add tags to types using the "tag" method, and remove them using
609 "tag" or "untag", for example:
610
611 # Attach 'Format' and 'Hooks' tags
612 $c->tag('type', Format => 'String', Hooks => { pack => \&rout });
613
614 $c->untag('type', 'Format'); # Remove only 'Format' tag
615 $c->untag('type'); # Remove all tags
616
617 You can also use "tag" to see which tags are attached to a type, for
618 example:
619
620 $tags = $c->tag('type');
621
622 This would give you:
623
624 $tags = {
625 'Hooks' => {
626 'pack' => \&rout
627 },
628 'Format' => 'String'
629 };
630
631 Currently, there are only a couple of different tags that influence the
632 way data is packed and unpacked. There are probably more tags to come
633 in the future.
634
635 The Format Tag
636 One of the tags currently available is the "Format" tag. Using this
637 tag, you can tell a Convert::Binary::C object to pack and unpack a
638 certain data type in a special way.
639
640 For example, if you have a (fixed length) string type
641
642 typedef char str_type[40];
643
644 this type would, by default, be unpacked as an array of "char"s. That's
645 because it is only an array of "char"s, and Convert::Binary::C doesn't
646 know it is actually used as a string.
647
648 But you can tell Convert::Binary::C that "str_type" is a C string using
649 the "Format" tag:
650
651 $c->tag('str_type', Format => 'String');
652
653 This will make "unpack" (and of course also "pack") treat the binary
654 data like a null-terminated C string:
655
656 $binary = "Hello World!\n\0 this is just some dummy data";
657 $hello = $c->unpack('str_type', $binary);
658 print $hello;
659
would thus print:
661
662 Hello World!
663
664 Of course, this also works the other way round:
665
666 use Data::Hexdumper;
667
668 $binary = $c->pack('str_type', "Just another C::B::C hacker");
669 print hexdump(data => $binary);
670
671 would print:
672
673 0x0000 : 4A 75 73 74 20 61 6E 6F 74 68 65 72 20 43 3A 3A : Just.another.C::
674 0x0010 : 42 3A 3A 43 20 68 61 63 6B 65 72 00 00 00 00 00 : B::C.hacker.....
675 0x0020 : 00 00 00 00 00 00 00 00 : ........
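The difference between the tagged and the untagged type can be shown without any extra modules. This is a minimal sketch; the test string and the variable names $raw, $list, $string and $packed are made up for the example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;
$c->parse('typedef char str_type[40];');

# 40 bytes of input: "Hi" followed by NUL padding.
my $raw = "Hi" . "\0" x 38;

# Untagged, the type unpacks as an array of 40 chars.
my $list = $c->unpack('str_type', $raw);

# Tagged as 'String', it unpacks as a NUL-terminated string ...
$c->tag('str_type', Format => 'String');
my $string = $c->unpack('str_type', $raw);

# ... and packing pads the string to the full size of the type.
my $packed = $c->pack('str_type', 'Hello');

printf "untagged: %d elements, first is %d\n", scalar @$list, $list->[0];
printf "tagged:   '%s'\n", $string;
printf "packed:   %d bytes\n", length $packed;
```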
676
If you want Convert::Binary::C to not interpret the binary data at all,
you can set the "Format" tag to "Binary". This might not seem very
useful, as "pack" and "unpack" would just pass through the unmodified
binary data. But you can tag not only whole types, but also compound
members. For example
682
683 $c->parse(<<ENDC);
684 struct packet {
685 unsigned short header;
686 unsigned short flags;
687 unsigned char payload[28];
688 };
689 ENDC
690
691 $c->tag('packet.payload', Format => 'Binary');
692
693 would allow you to write:
694
695 read FILE, $payload, $c->sizeof('packet.payload');
696
697 $packet = {
698 header => 4711,
699 flags => 0xf00f,
700 payload => $payload,
701 };
702
703 $binary = $c->pack('packet', $packet);
704
705 print hexdump(data => $binary);
706
707 This would print something like:
708
709 0x0000 : 12 67 F0 0F 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A : .g..no.no.no.no.
710 0x0010 : 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E : no.no.no.no.no.n
711
712 For obvious reasons, it is not allowed to attach a "Format" tag to
713 bitfield members. Trying to do so will result in an exception being
714 thrown by the "tag" method.
715
716 The ByteOrder Tag
717 The "ByteOrder" tag allows you to override the byte order of certain
718 types or members. The implementation of this tag is considered
719 experimental and may be subject to changes in the future.
720
721 Usually it doesn't make much sense to override the byte order, but
722 there may be applications where a sub-structure is packed in a
723 different byte order than the surrounding structure.
724
725 Take, for example, the following code:
726
727 $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
728 OrderMembers => 1);
729 $c->parse(<<'ENDC');
730
731 typedef unsigned short u_16;
732
733 struct coords_3d {
734 long x, y, z;
735 };
736
737 struct coords_msg {
738 u_16 header;
739 u_16 length;
740 struct coords_3d coords;
741 };
742
743 ENDC
744
745 Assume that while "coords_msg" is big endian, the embedded coordinates
746 "coords_3d" are stored in little endian format for some reason. In C,
747 you'll have to handle this manually.
748
749 But using Convert::Binary::C, you can simply attach a "ByteOrder" tag
750 to either the "coords_3d" structure or to the "coords" member of the
751 "coords_msg" structure. Both will work in this case. The only
752 difference is that if you tag the "coords" member, "coords_3d" will
753 only be treated as little endian if you "pack" or "unpack" the
754 "coords_msg" structure. (BTW, you could also tag all members of
755 "coords_3d" individually, but that would be inefficient.)
756
757 So, let's attach the "ByteOrder" tag to the "coords" member:
758
759 $c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');
760
761 Assume the following binary message:
762
763 0x0000 : 00 2A 00 0C FF FF FF FF 02 00 00 00 2A 00 00 00 : .*..........*...
764
765 If you unpack this message...
766
767 $msg = $c->unpack('coords_msg', $binary);
768
769 ...you will get the following data structure:
770
771 $msg = {
772 'header' => 42,
773 'length' => 12,
774 'coords' => {
775 'x' => -1,
776 'y' => 2,
777 'z' => 42
778 }
779 };
780
781 Without the "ByteOrder" tag, you would get:
782
783 $msg = {
784 'header' => 42,
785 'length' => 12,
786 'coords' => {
787 'x' => -1,
788 'y' => 33554432,
789 'z' => 704643072
790 }
791 };
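The comparison can be reproduced with the sketch below. It sets ShortSize and LongSize explicitly, because the example message assumes a 2-byte short and a 4-byte long, and builds $binary from a hex string with Perl's own "pack"; both are assumptions added for this illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                ShortSize => 2,
                                LongSize  => 4,
                                Alignment => 4);
$c->parse(<<'ENDC');
typedef unsigned short u_16;

struct coords_3d {
  long x, y, z;
};

struct coords_msg {
  u_16 header;
  u_16 length;
  struct coords_3d coords;
};
ENDC

# The 16-byte example message from above.
my $binary = pack 'H*', '002a000cffffffff020000002a000000';

# Everything is read as big endian first ...
my $plain = $c->unpack('coords_msg', $binary);

# ... then the coords member is read as little endian.
$c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');
my $tagged = $c->unpack('coords_msg', $binary);

print "untagged y = $plain->{coords}{y}\n";
print "tagged   y = $tagged->{coords}{y}\n";
```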
792
793 The "ByteOrder" tag is a recursive tag, i.e. it applies to all children
794 of the tagged object recursively. Of course, it is also possible to
795 override a "ByteOrder" tag by attaching another "ByteOrder" tag to a
796 child type. Confused? Here's an example. In addition to tagging the
797 "coords" member as little endian, we now tag "coords_3d.y" as big
798 endian:
799
800 $c->tag('coords_3d.y', ByteOrder => 'BigEndian');
801 $msg = $c->unpack('coords_msg', $binary);
802
803 This will return the following data structure:
804
805 $msg = {
806 'header' => 42,
807 'length' => 12,
808 'coords' => {
809 'x' => -1,
810 'y' => 33554432,
811 'z' => 42
812 }
813 };
814
815 Note that if you tag both a type and a member of that type within a
816 compound, the tag attached to the type itself has higher precedence.
817 Using the example above, if you would attach a "ByteOrder" tag to both
818 "coords_msg.coords" and "coords_3d", the tag attached to "coords_3d"
819 would always win.
820
821 Also note that the "ByteOrder" tag might not work as expected along
822 with bitfields, which is why the implementation is considered
823 experimental. Bitfields are currently not affected by the "ByteOrder"
824 tag at all. This is because the byte order would affect the bitfield
825 layout, and a consistent implementation supporting multiple layouts of
826 the same struct would be quite bulky and probably slow down the whole
827 module.
828
829 If you really need the correct behaviour, you can use the following
830 trick:
831
832 $le = Convert::Binary::C->new(ByteOrder => 'LittleEndian');
833
834 $le->parse(<<'ENDC');
835
836 typedef unsigned short u_16;
837 typedef unsigned long u_32;
838
839 struct message {
840 u_16 header;
841 u_16 length;
842 struct {
843 u_32 a;
844 u_32 b;
845 u_32 c : 7;
846 u_32 d : 5;
847 u_32 e : 20;
848 } data;
849 };
850
851 ENDC
852
853 $be = $le->clone->ByteOrder('BigEndian');
854
855 $le->tag('message.data', Format => 'Binary', Hooks => {
856 unpack => sub { $be->unpack('message.data', @_) },
857 pack => sub { $be->pack('message.data', @_) },
858 });
859
860
861 $msg = $le->unpack('message', $binary);
862
863 This uses the "Format" and "Hooks" tags along with a big endian "clone"
864 of the original little endian object. It attaches hooks to the little
865 endian object and in the hooks it uses the big endian object to "pack"
866 and "unpack" the binary data.
867
868 The Dimension Tag
869 The "Dimension" tag allows you to override the declared dimension of an
870 array for packing or unpacking data. The implementation of this tag is
871 considered very experimental and will definitely change in a future
872 release.
873
874 That being said, the "Dimension" tag is primarily useful to support
875 variable length arrays. Usually, you have to write the following code
876 for such a variable length array in C:
877
878 struct c_message
879 {
880 unsigned count;
881 char data[1];
882 };
883
So, because you cannot declare an empty array, you declare an array
with a single element. If you have an ISO-C99 compliant compiler, you
can write this code instead:
887
888 struct c99_message
889 {
890 unsigned count;
891 char data[];
892 };
893
894 This explicitly tells the compiler that "data" is a flexible array
895 member. Convert::Binary::C already uses this information to handle
896 flexible array members in a special way.
897
898 As you can see in the following example, the two types are treated
899 differently:
900
901 $data = pack 'NC*', 3, 1..8;
902 $uc = $c->unpack('c_message', $data);
903 $uc99 = $c->unpack('c99_message', $data);
904
905 This will result in:
906
907 $uc = {'count' => 3,'data' => [1]};
908 $uc99 = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
909
910 However, only few compilers support ISO-C99, and you probably don't
911 want to change your existing code only to get some extra features when
912 using Convert::Binary::C.
913
914 So it is possible to attach a tag to the "data" member of the
915 "c_message" struct that tells Convert::Binary::C to treat the array as
916 if it were flexible:
917
918 $c->tag('c_message.data', Dimension => '*');
919
920 Now both "c_message" and "c99_message" will behave exactly the same
921 when using "pack" or "unpack". Repeating the above code:
922
923 $uc = $c->unpack('c_message', $data);
924
925 This will result in:
926
927 $uc = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
928
929 But there's more you can do. Even though it probably doesn't make much
930 sense, you can tag a fixed dimension to an array:
931
932 $c->tag('c_message.data', Dimension => '5');
933
934 This will obviously result in:
935
936 $uc = {'count' => 3,'data' => [1,2,3,4,5]};
937
938 A more useful way to use the "Dimension" tag is to set it to the name
939 of a member in the same compound:
940
941 $c->tag('c_message.data', Dimension => 'count');
942
943 Convert::Binary::C will now use the value of that member to determine
944 the size of the array, so unpacking will result in:
945
946 $uc = {'count' => 3,'data' => [1,2,3]};
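The three settings discussed so far can be compared in a single script. The sketch assumes a big endian configuration with a 4-byte int, matching the 'N' format used to build $data; the variable names $default, $flexible and $counted are only illustrative.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Assumed configuration, matching the 'N' used for the count field.
my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', IntSize => 4);

$c->parse(<<'ENDC');
struct c_message
{
  unsigned count;
  char data[1];
};
ENDC

my $data = pack 'NC*', 3, 1 .. 8;

# Declared dimension: a single element.
my $default = $c->unpack('c_message', $data);

# Flexible: consume all remaining data.
$c->tag('c_message.data', Dimension => '*');
my $flexible = $c->unpack('c_message', $data);

# Dimension taken from the 'count' member.
$c->tag('c_message.data', Dimension => 'count');
my $counted = $c->unpack('c_message', $data);

printf "default: %d, flexible: %d, counted: %d element(s)\n",
       scalar @{ $default->{data} },
       scalar @{ $flexible->{data} },
       scalar @{ $counted->{data} };
```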
947
948 Of course, you can also tag flexible array members. And yes, it's also
949 possible to use more complex member expressions:
950
951 $c->parse(<<ENDC);
952 struct msg_header
953 {
954 unsigned len[2];
955 };
956
957 struct more_complex
958 {
959 struct msg_header hdr;
960 char data[];
961 };
962 ENDC
963
964 $data = pack 'NNC*', 42, 7, 1 .. 10;
965
966 $c->tag('more_complex.data', Dimension => 'hdr.len[1]');
967
968 $u = $c->unpack('more_complex', $data);
969
970 The result will be:
971
972 $u = {
973 'hdr' => {
974 'len' => [
975 42,
976 7
977 ]
978 },
979 'data' => [
980 1,
981 2,
982 3,
983 4,
984 5,
985 6,
986 7
987 ]
988 };
989
990 By the way, it's also possible to tag arrays that are not embedded
991 inside a compound:
992
993 $c->parse(<<ENDC);
994 typedef unsigned short short_array[];
995 ENDC
996
997 $c->tag('short_array', Dimension => '5');
998
999 $u = $c->unpack('short_array', $data);
1000
1001 Resulting in:
1002
1003 $u = [0,42,0,7,258];
1004
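To see where these five values come from, here is a minimal sketch
using only Perl's built-in "unpack". It assumes the object is
configured for big-endian byte order and a 2-byte "short", and it
reuses the $data buffer built for the "more_complex" example.

```perl
use strict;
use warnings;

# Same buffer as in the "more_complex" example above.
my $data = pack 'NNC*', 42, 7, 1 .. 10;

# Five 16-bit big-endian unsigned values ('n' template).
my @shorts = unpack 'n5', $data;
print "@shorts\n";
```

The first four shorts are the two halves of each 32-bit length, and
the fifth is made up of the data bytes 1 and 2 (0x0102).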
1005 The final and most powerful way to define a "Dimension" tag is to pass
1006 it a subroutine reference. The referenced subroutine can execute
1007 whatever code is necessary to determine the size of the tagged array:
1008
1009 sub get_size
1010 {
1011 my $m = shift;
1012 return $m->{hdr}{len}[0] / $m->{hdr}{len}[1];
1013 }
1014
1015 $c->tag('more_complex.data', Dimension => \&get_size);
1016
1017 $u = $c->unpack('more_complex', $data);
1018
1019 As you can guess from the above code, the subroutine is passed a
1020 reference to a hash that stores the already unpacked part of the compound
1021 embedding the tagged array. This is the result:
1022
1023 $u = {
1024 'hdr' => {
1025 'len' => [
1026 42,
1027 7
1028 ]
1029 },
1030 'data' => [
1031 1,
1032 2,
1033 3,
1034 4,
1035 5,
1036 6
1037 ]
1038 };
1039
1040 You can also pass custom arguments to the subroutines by using the
1041 "arg" method. This is similar to the functionality offered by the
1042 "Hooks" tag.
1043
1044 Of course, all of this works for the "pack" method as well.
1045
1046 However, the current implementation has at least one shortcoming,
1047 which is why it's experimental: The "Dimension" tag doesn't impact
1048 compound layout. This means that while you can alter the size of an
1049 array in the middle of a compound, the offset of the members after that
1050 array won't be impacted. I'd rather like to see the layout adapt
1051 dynamically, so this is what I'm hoping to implement in the future.
1052
1053 The Hooks Tag
1054 Hooks are a special kind of tag that can be extremely useful.
1055
1056 Using hooks, you can easily override the way "pack" and "unpack" handle
1057 data using your own subroutines. If you define hooks for a certain
1058 data type, each time this data type is processed the corresponding hook
1059 will be called to allow you to modify that data.
1060
1061 Basic Hooks
1062
1063 Here's an example. Let's assume the following C code has been parsed:
1064
1065 typedef unsigned long u_32;
1066 typedef u_32 ProtoId;
1067 typedef ProtoId MyProtoId;
1068
1069 struct MsgHeader {
1070 MyProtoId id;
1071 u_32 len;
1072 };
1073
1074 struct String {
1075 u_32 len;
1076 char buf[];
1077 };
1078
1079 You could now use the types above and, for example, unpack binary data
1080 representing a "MsgHeader" like this:
1081
1082 $msg_header = $c->unpack('MsgHeader', $data);
1083
1084 This would give you:
1085
1086 $msg_header = {
1087 'len' => 13,
1088 'id' => 42
1089 };
1090
1091 Instead of dealing with "ProtoId"s as integers, you might prefer
1092 to have them as clear text. You could provide subroutines to convert
1093 between clear text and integers:
1094
1095 %proto = (
1096 CATS => 1,
1097 DOGS => 42,
1098 HEDGEHOGS => 4711,
1099 );
1100
1101 %rproto = reverse %proto;
1102
1103 sub ProtoId_unpack {
1104 $rproto{$_[0]} || 'unknown protocol'
1105 }
1106
1107 sub ProtoId_pack {
1108 $proto{$_[0]} or die 'unknown protocol'
1109 }
1110
1111 You can now register these subroutines by attaching a "Hooks" tag to
1112 "ProtoId" using the "tag" method:
1113
1114 $c->tag('ProtoId', Hooks => { pack => \&ProtoId_pack,
1115 unpack => \&ProtoId_unpack });
1116
1117 Doing exactly the same unpack on "MsgHeader" again would now return:
1118
1119 $msg_header = {
1120 'len' => 13,
1121 'id' => 'DOGS'
1122 };
1123
1124 Actually, if you don't need the reverse operation, you don't even have
1125 to register a "pack" hook. Or, even better, you can have a more
1126 intelligent "unpack" hook that creates a dual-typed variable:
1127
1128 use Scalar::Util qw(dualvar);
1129
1130 sub ProtoId_unpack2 {
1131 dualvar $_[0], $rproto{$_[0]} || 'unknown protocol'
1132 }
1133
1134 $c->tag('ProtoId', Hooks => { unpack => \&ProtoId_unpack2 });
1135
1136 $msg_header = $c->unpack('MsgHeader', $data);
1137
1138 Just as before, dumping the result would show
1139
1140 $msg_header = {
1141 'len' => 13,
1142 'id' => 'DOGS'
1143 };
1144
1145 but without requiring a "pack" hook for packing, at least as long as
1146 you keep the variable dual-typed.
1147
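The following sketch shows what "dual-typed" means in practice. It
uses only Scalar::Util and the %proto table from above; the helper
name "to_dual" is made up for this illustration and is not part of
Convert::Binary::C.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

my %proto  = (CATS => 1, DOGS => 42, HEDGEHOGS => 4711);
my %rproto = reverse %proto;

# Same idea as ProtoId_unpack2: numeric value plus readable name.
sub to_dual {
    my $num = shift;
    return dualvar $num, $rproto{$num} || 'unknown protocol';
}

my $id = to_dual(42);
print  "as string: $id\n";
printf "as number: %d\n", $id;

# A plain copy keeps both values.
my $copy = $id;
printf "copy as number: %d\n", $copy;
```

Because the numeric value is still there, the scalar can be handed
back for packing without a reverse lookup. Interpolating it into a
new string, however, leaves you with a plain string again.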
1148 Hooks are usually called with exactly one argument, which is the data
1149 that should be processed (see "Advanced Hooks" for details on how to
1150 customize hook arguments). They are called in scalar context and
1151 expected to return the processed data.
1152
1153 To get rid of registered hooks, you can either undefine only certain
1154 hooks
1155
1156 $c->tag('ProtoId', Hooks => { pack => undef });
1157
1158 or all hooks:
1159
1160 $c->tag('ProtoId', Hooks => undef);
1161
1162 Of course, hooks are not restricted to handling integer values. You
1163 could just as well attach hooks for the "String" struct from the code
1164 above. A useful example would be to have these hooks:
1165
1166 sub string_unpack {
1167 my $s = shift;
1168 pack "c$s->{len}", @{$s->{buf}};
1169 }
1170
1171 sub string_pack {
1172 my $s = shift;
1173 return {
1174 len => length $s,
1175 buf => [ unpack 'c*', $s ],
1176 }
1177 }
1178
1179 (Don't be confused by the fact that the "unpack" hook uses "pack" and
1180 the "pack" hook uses "unpack". And also see "Advanced Hooks" for a
1181 more clever approach.)
1182
1183 While you would normally get the following output when unpacking a
1184 "String"
1185
1186 $string = {
1187 'len' => 12,
1188 'buf' => [
1189 72,
1190 101,
1191 108,
1192 108,
1193 111,
1194 32,
1195 87,
1196 111,
1197 114,
1198 108,
1199 100,
1200 33
1201 ]
1202 };
1203
1204 you could just register the hooks using
1205
1206 $c->tag('String', Hooks => { pack => \&string_pack,
1207 unpack => \&string_unpack });
1208
1209 and you would get a nice human-readable Perl string:
1210
1211 $string = 'Hello World!';
1212
1213 Packing a string turns out to be just as easy:
1214
1215 use Data::Hexdumper;
1216
1217 $data = $c->pack('String', 'Just another Perl hacker,');
1218
1219 print hexdump(data => $data);
1220
1221 This would print:
1222
1223 0x0000 : 00 00 00 19 4A 75 73 74 20 61 6E 6F 74 68 65 72 : ....Just.another
1224 0x0010 : 20 50 65 72 6C 20 68 61 63 6B 65 72 2C : .Perl.hacker,
1225
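For comparison, the same byte layout can be produced with Perl's
built-in "pack" alone. This is only a sketch of the wire format, under
the assumption (consistent with the dump above) that "u_32" is four
bytes wide and the byte order is big endian.

```perl
use strict;
use warnings;

my $text = 'Just another Perl hacker,';

# 'N/a*' writes a 32-bit big-endian length followed by the bytes.
my $data = pack 'N/a*', $text;

printf "%d bytes, length prefix %s\n",
    length($data), unpack('H8', $data);

# The same template reads the string back.
my ($back) = unpack 'N/a*', $data;
print "$back\n";
```

The hooks give you the same convenience, but driven by the parsed C
type rather than a hand-written template.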
1226 If you want to find out if or which hooks are registered for a certain
1227 type, you can also use the "tag" method:
1228
1229 $hooks = $c->tag('String', 'Hooks');
1230
1231 This would return:
1232
1233 $hooks = {
1234 'unpack' => \&string_unpack,
1235 'pack' => \&string_pack
1236 };
1237
1238 Advanced Hooks
1239
1240 It is also possible to combine hooks with the "Format" tag. This
1241 can be useful if you know better than Convert::Binary::C how to
1242 interpret the binary data. In the previous section, we've handled this
1243 type
1244
1245 struct String {
1246 u_32 len;
1247 char buf[];
1248 };
1249
1250 with the following hooks:
1251
1252 sub string_unpack {
1253 my $s = shift;
1254 pack "c$s->{len}", @{$s->{buf}};
1255 }
1256
1257 sub string_pack {
1258 my $s = shift;
1259 return {
1260 len => length $s,
1261 buf => [ unpack 'c*', $s ],
1262 }
1263 }
1264
1265 $c->tag('String', Hooks => { pack => \&string_pack,
1266 unpack => \&string_unpack });
1267
1268 As you can see in the hook code, "buf" is expected to be an array of
1269 characters. For the "unpack" case Convert::Binary::C first turns the
1270 binary data into a Perl array, and then the hook packs it back into a
1271 string. The intermediate array creation and destruction is completely
1272 useless. Same thing, of course, for the "pack" case.
1273
1274 Here's a clever way to handle this. Just tag "buf" as binary
1275
1276 $c->tag('String.buf', Format => 'Binary');
1277
1278 and use the following hooks instead:
1279
1280 sub string_unpack2 {
1281 my $s = shift;
1282 substr $s->{buf}, 0, $s->{len};
1283 }
1284
1285 sub string_pack2 {
1286 my $s = shift;
1287 return {
1288 len => length $s,
1289 buf => $s,
1290 }
1291 }
1292
1293 $c->tag('String', Hooks => { pack => \&string_pack2,
1294 unpack => \&string_unpack2 });
1295
1296 This will be exactly equivalent to the old code, but faster and
1297 probably even much easier to understand.
1298
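The equivalence can be checked without any binary data at all. The
sketch below calls both "unpack" hooks directly; the two input hashes
are built by hand here and stand in for what the hooks would receive
without and with the "Binary" format tag.

```perl
use strict;
use warnings;

# Hooks from the text, array-based and Binary-based.
sub string_unpack  { my $s = shift; pack "c$s->{len}", @{$s->{buf}} }
sub string_unpack2 { my $s = shift; substr $s->{buf}, 0, $s->{len} }

my $text = 'Hello World!';

# Hand-built stand-ins for the data passed to each hook.
my $as_array  = { len => length($text), buf => [ unpack 'c*', $text ] };
my $as_binary = { len => length($text), buf => $text };

my $old = string_unpack($as_array);
my $new = string_unpack2($as_binary);
print $old eq $new ? "same result: $new\n" : "results differ\n";
```

The second variant never builds the intermediate array, which is
where the speedup comes from.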
1299 But hooks are even more powerful. You can customize the arguments that
1300 are passed to your hooks and you can use "arg" to pass certain special
1301 arguments, such as the name of the type that is currently being
1302 processed by the hook.
1303
1304 The following example shows how it is easily possible to peek into the
1305 perl internals using hooks.
1306
1307 use Config;
1308
1309 $c = new Convert::Binary::C %CC, OrderMembers => 1;
1310 $c->Include(["$Config{archlib}/CORE", @{$c->Include}]);
1311 $c->parse(<<ENDC);
1312 #include "EXTERN.h"
1313 #include "perl.h"
1314 ENDC
1315
1316 $c->tag($_, Hooks => { unpack_ptr => [\&unpack_ptr,
1317 $c->arg(qw(SELF TYPE DATA))] })
1318 for qw( XPVAV XPVHV );
1319
1320 First, we add the perl core include path and parse perl.h. Then, we add
1321 an "unpack_ptr" hook for a couple of the internal data types.
1322
1323 The "unpack_ptr" and "pack_ptr" hooks are called whenever a pointer to
1324 a certain data structure is processed. This is by far the most
1325 experimental part of the hooks feature, as this includes any kind of
1326 pointer. There's no way for the hook to know the difference between a
1327 plain pointer, or a pointer to a pointer, or a pointer to an array
1328 (this is because the difference doesn't matter anywhere else in
1329 Convert::Binary::C).
1330
1331 But the hook above makes use of another very interesting feature: It
1332 uses "arg" to pass special arguments to the hook subroutine. Usually,
1333 the hook subroutine is simply passed a single data argument. But using
1334 the above definition, it'll get a reference to the calling object
1335 ("SELF"), the name of the type being processed ("TYPE") and the data
1336 ("DATA").
1337
1338 But what does our hook look like?
1339
1340 sub unpack_ptr {
1341 my($self, $type, $ptr) = @_;
1342 $ptr or return '<NULL>';
1343 my $size = $self->sizeof($type);
1344 $self->unpack($type, unpack("P$size", pack('I', $ptr)));
1345 }
1346
1347 As you can see, the hook is rather simple. First, it receives the
1348 arguments mentioned above. It performs a quick check if the pointer is
1349 "NULL" and shouldn't be processed any further. Next, it determines the
1350 size of the type being processed. And finally, it'll just use the "P"n
1351 unpack template to read from that memory location and recursively call
1352 "unpack" to unpack the type. (And yes, this may of course again call
1353 other hooks.)
1354
1355 Now, let's test that:
1356
1357 my $ref = bless ["Boo!"], "Foo::Bar";
1358 my $ptr = hex(("$ref" =~ /\(0x([[:xdigit:]]+)\)$/)[0]);
1359
1360 print Dumper(unpack_ptr($c, 'AV', $ptr));
1361
1362 Just for the fun of it, we create a blessed array reference. But how do
1363 we get a pointer to the corresponding "AV"? This is rather easy, as the
1364 address of the "AV" is just the hex value that appears when using the
1365 array reference in string context. So we just grab that and turn it
1366 into decimal. All that's left to do is just call our hook, as it can
1367 already handle "AV" pointers. And this is what we get:
1368
1369 $VAR1 = {
1370 'sv_any' => {
1371 'xnv_u' => {
1372 'xnv_nv' => '0',
1373 'xgv_stash' => 0,
1374 'xpad_cop_seq' => {
1375 'xlow' => 0,
1376 'xhigh' => 0
1377 },
1378 'xbm_s' => {
1379 'xbm_previous' => 0,
1380 'xbm_flags' => 0,
1381 'xbm_rare' => 0
1382 }
1383 },
1384 'xav_fill' => 2,
1385 'xav_max' => 7,
1386 'xiv_u' => {
1387 'xivu_iv' => 2,
1388 'xivu_uv' => 2,
1389 'xivu_p1' => 2,
1390 'xivu_i32' => 2,
1391 'xivu_namehek' => 2,
1392 'xivu_hv' => 2
1393 },
1394 'xmg_u' => {
1395 'xmg_magic' => 0,
1396 'xmg_ourstash' => 0
1397 },
1398 'xmg_stash' => 0
1399 },
1400 'sv_refcnt' => 1,
1401 'sv_flags' => 536870924,
1402 'sv_u' => {
1403 'svu_pv' => 142054140,
1404 'svu_iv' => 142054140,
1405 'svu_uv' => 142054140,
1406 'svu_rv' => 142054140,
1407 'svu_array' => 142054140,
1408 'svu_hash' => 142054140,
1409 'svu_gp' => 142054140
1410 }
1411 };
1412
1413 Even though it is rather easy to do such stuff using "unpack_ptr"
1414 hooks, you should really know what you're doing and do it with extreme
1415 care because of the limitations mentioned above. It's really easy to
1416 run into segmentation faults when you're dereferencing pointers that
1417 point to memory which you don't own.
1418
1419 Performance
1420
1421 Using hooks isn't free. In performance-critical applications you
1422 have to keep in mind that hooks are actually perl subroutines and that
1423 they are called once for every value of a registered type that is being
1424 packed or unpacked. If only about 10% of the values require hooks to be
1425 called, you'll hardly notice the difference (if your hooks are
1426 implemented efficiently, that is). But if all values would require
1427 hooks to be called, that alone could easily make packing and unpacking
1428 very slow.
1429
1430 Tag Order
1431 Since it is possible to attach multiple tags to a single type, the
1432 order in which the tags are processed is important. Here's a small
1433 table that shows the processing order.
1434
1435 pack unpack
1436 ---------------------
1437 Hooks Format
1438 Format ByteOrder
1439 ByteOrder Hooks
1440
1441 As a general rule, the "Hooks" tag is always the first thing processed
1442 when packing data, and the last thing processed when unpacking data.
1443
1444 The "Format" and "ByteOrder" tags are exclusive, but when both are
1445 given the "Format" tag wins.
1446
1448 new
1449 "new"
1450 "new" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1451 The constructor is used to create a new Convert::Binary::C
1452 object. You can simply use
1453
1454 $c = new Convert::Binary::C;
1455
1456 without additional arguments to create an object, or you can
1457 optionally pass any arguments to the constructor that are
1458 described for the "configure" method.
1459
1460 configure
1461 "configure"
1462 "configure" OPTION
1463 "configure" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1464 This method can be used to configure an existing
1465 Convert::Binary::C object or to retrieve its current
1466 configuration.
1467
1468 To configure the object, the list of options consists of key
1469 and value pairs and must therefore contain an even number of
1470 elements. "configure" (and also "new" if used with
1471 configuration options) will throw an exception if you pass an
1472 odd number of elements. Configuration will normally look like
1473 this:
1474
1475 $c->configure(ByteOrder => 'BigEndian', IntSize => 2);
1476
1477 To retrieve the current value of a configuration option, you
1478 must pass a single argument to "configure" that holds the name
1479 of the option, just like
1480
1481 $order = $c->configure('ByteOrder');
1482
1483 If you want to get the values of all configuration options at
1484 once, you can call "configure" without any arguments and it
1485 will return a reference to a hash table that holds the whole
1486 object configuration. This can be conveniently used with the
1487 Data::Dumper module, for example:
1488
1489 use Convert::Binary::C;
1490 use Data::Dumper;
1491
1492 $c = new Convert::Binary::C Define => ['DEBUGGING', 'FOO=123'],
1493 Include => ['/usr/include'];
1494
1495 print Dumper($c->configure);
1496
1497 Which will print something like this:
1498
1499 $VAR1 = {
1500 'Define' => [
1501 'DEBUGGING',
1502 'FOO=123'
1503 ],
1504 'StdCVersion' => 199901,
1505 'ByteOrder' => 'LittleEndian',
1506 'LongSize' => 4,
1507 'IntSize' => 4,
1508 'HostedC' => 1,
1509 'ShortSize' => 2,
1510 'HasMacroVAARGS' => 1,
1511 'Assert' => [],
1512 'UnsignedChars' => 0,
1513 'DoubleSize' => 8,
1514 'CharSize' => 1,
1515 'EnumType' => 'Integer',
1516 'PointerSize' => 4,
1517 'EnumSize' => 4,
1518 'DisabledKeywords' => [],
1519 'FloatSize' => 4,
1520 'Alignment' => 1,
1521 'LongLongSize' => 8,
1522 'LongDoubleSize' => 12,
1523 'KeywordMap' => {},
1524 'Include' => [
1525 '/usr/include'
1526 ],
1527 'HasCPPComments' => 1,
1528 'Bitfields' => {
1529 'Engine' => 'Generic'
1530 },
1531 'UnsignedBitfields' => 0,
1532 'Warnings' => 0,
1533 'CompoundAlignment' => 1,
1534 'OrderMembers' => 0
1535 };
1536
1537 Since you may not always want to write a "configure" call when
1538 you only want to change a single configuration item, you can
1539 use any configuration option name as a method name, like:
1540
1541 $c->ByteOrder('LittleEndian') if $c->IntSize < 4;
1542
1543 (Yes, the example doesn't make very much sense... ;-)
1544
1545 However, you should keep in mind that configuration methods
1546 that can take lists (namely "Include", "Define" and "Assert",
1547 but not "DisabledKeywords") may behave slightly differently from
1548 their "configure" equivalents. If you pass these methods a
1549 single argument that is an array reference, the current list
1550 will be replaced by the new one, which is just the behaviour of
1551 the corresponding "configure" call. So the following are
1552 equivalent:
1553
1554 $c->configure(Define => ['foo', 'bar=123']);
1555 $c->Define(['foo', 'bar=123']);
1556
1557 But if you pass a list of strings instead of an array reference
1558 (which cannot be done when using "configure"), the new list
1559 items are appended to the current list, so
1560
1561 $c = new Convert::Binary::C Include => ['/include'];
1562 $c->Include('/usr/include', '/usr/local/include');
1563 print Dumper($c->Include);
1564
1565 $c->Include(['/usr/local/include']);
1566 print Dumper($c->Include);
1567
1568 will first print all three include paths, but finally only
1569 "/usr/local/include" will be configured:
1570
1571 $VAR1 = [
1572 '/include',
1573 '/usr/include',
1574 '/usr/local/include'
1575 ];
1576 $VAR1 = [
1577 '/usr/local/include'
1578 ];
1579
1580 Furthermore, configuration methods can be chained together, as
1581 they return a reference to their object if called as a set
1582 method. So, if you like, you can configure your object like
1583 this:
1584
1585 $c = Convert::Binary::C->new(IntSize => 4)
1586 ->Define(qw( __DEBUG__ DB_LEVEL=3 ))
1587 ->ByteOrder('BigEndian');
1588
1589 $c->configure(EnumType => 'Both', Alignment => 4)
1590 ->Include('/usr/include', '/usr/local/include');
1591
1592 In the example above, "qw( ... )" is the word list quoting
1593 operator. It returns a list of all non-whitespace sequences,
1594 and is especially useful for configuring preprocessor defines
1595 or assertions. The following assignments are equivalent:
1596
1597 @array = ('one', 'two', 'three');
1598 @array = qw(one two three);
1599
1600 You can configure the following options. Unknown options, as
1601 well as invalid values for an option, will cause the object to
1602 throw exceptions.
1603
1604 "IntSize" => 0 | 1 | 2 | 4 | 8
1605 Set the number of bytes that are occupied by an integer.
1606 This is in most cases 2 or 4. If you set it to zero, the
1607 size of an integer on the host system will be used. This is
1608 also the default unless overridden by
1609 "CBC_DEFAULT_INT_SIZE" at compile time.
1610
1611 "CharSize" => 0 | 1 | 2 | 4 | 8
1612 Set the number of bytes that are occupied by a "char".
1613 This rarely needs to be changed, except for some platforms
1614 that don't care about bytes, for example DSPs. If you set
1615 this to zero, the size of a "char" on the host system will
1616 be used. This is also the default unless overridden by
1617 "CBC_DEFAULT_CHAR_SIZE" at compile time.
1618
1619 "ShortSize" => 0 | 1 | 2 | 4 | 8
1620 Set the number of bytes that are occupied by a short
1621 integer. Although integers explicitly declared as "short"
1622 should always be at least 16 bits wide, there are compilers that
1623 make a short 8 bits wide. If you set it to zero, the size of a
1624 short integer on the host system will be used. This is also
1625 the default unless overridden by "CBC_DEFAULT_SHORT_SIZE"
1626 at compile time.
1627
1628 "LongSize" => 0 | 1 | 2 | 4 | 8
1629 Set the number of bytes that are occupied by a long
1630 integer. If set to zero, the size of a long integer on the
1631 host system will be used. This is also the default unless
1632 overridden by "CBC_DEFAULT_LONG_SIZE" at compile time.
1633
1634 "LongLongSize" => 0 | 1 | 2 | 4 | 8
1635 Set the number of bytes that are occupied by a long long
1636 integer. If set to zero, the size of a long long integer on
1637 the host system, or 8, will be used. This is also the
1638 default unless overridden by "CBC_DEFAULT_LONG_LONG_SIZE"
1639 at compile time.
1640
1641 "FloatSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1642 Set the number of bytes that are occupied by a single
1643 precision floating point value. If you set it to zero, the
1644 size of a "float" on the host system will be used. This is
1645 also the default unless overridden by
1646 "CBC_DEFAULT_FLOAT_SIZE" at compile time. For details on
1647 floating point support, see "FLOATING POINT VALUES".
1648
1649 "DoubleSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1650 Set the number of bytes that are occupied by a double
1651 precision floating point value. If you set it to zero, the
1652 size of a "double" on the host system will be used. This is
1653 also the default unless overridden by
1654 "CBC_DEFAULT_DOUBLE_SIZE" at compile time. For details on
1655 floating point support, see "FLOATING POINT VALUES".
1656
1657 "LongDoubleSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1658 Set the number of bytes that are occupied by a "long
1659 double" floating point value. If you set it to zero, the
1660 size of a "long double" on the host system, or 12, will be
1661 used. This is also the default unless overridden by
1662 "CBC_DEFAULT_LONG_DOUBLE_SIZE" at compile time. For details
1663 on floating point support, see "FLOATING POINT VALUES".
1664
1665 "PointerSize" => 0 | 1 | 2 | 4 | 8
1666 Set the number of bytes that are occupied by a pointer.
1667 This is in most cases 2 or 4. If you set it to zero, the
1668 size of a pointer on the host system will be used. This is
1669 also the default unless overridden by
1670 "CBC_DEFAULT_PTR_SIZE" at compile time.
1671
1672 "EnumSize" => -1 | 0 | 1 | 2 | 4 | 8
1673 Set the number of bytes that are occupied by an enumeration
1674 type. On most systems, this is equal to the size of an
1675 integer, which is also the default. However, for some
1676 compilers, the size of an enumeration type depends on the
1677 size occupied by the largest enumerator. So the size may
1678 vary between 1 and 8. If you have
1679
1680 enum foo {
1681 ONE = 100, TWO = 200
1682 };
1683
1684 this will occupy one byte because the enum can be
1685 represented as an unsigned one-byte value. However,
1686
1687 enum foo {
1688 ONE = -100, TWO = 200
1689 };
1690
1691 will occupy two bytes, because the -100 forces the type to
1692 be signed, and 200 doesn't fit into a signed one-byte
1693 value. Therefore, the type used is a signed two-byte
1694 value. If this is the behaviour you need, set the EnumSize
1695 to 0.
1696
1697 Some compilers try to follow this strategy, but don't care
1698 whether the enumeration has signed values or not. They
1699 always declare an enum as signed. On such a compiler, given
1700
1701 enum one { ONE = -100, TWO = 100 };
1702 enum two { ONE = 100, TWO = 200 };
1703
1704 enum "one" will occupy only one byte, while enum "two" will
1705 occupy two bytes, even though it could be represented by an
1706 unsigned one-byte value. If this is the behaviour of your
1707 compiler, set EnumSize to "-1".
1708
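The two strategies can be written down as a small function. This is
only a model of the rules described above, not the module's actual
implementation; the function name "enum_size" and its arguments are
made up for this sketch.

```perl
use strict;
use warnings;

# Smallest size in bytes (1, 2, 4 or 8) that can hold every value
# between $min and $max. $mode is 0 or -1, mirroring the two
# EnumSize settings described above.
sub enum_size {
    my ($mode, $min, $max) = @_;
    my $signed = $mode == -1 || $min < 0;
    for my $bytes (1, 2, 4) {
        my $bits = 8 * $bytes;
        my ($lo, $hi) = $signed
            ? (-2**($bits - 1), 2**($bits - 1) - 1)
            : (0, 2**$bits - 1);
        return $bytes if $min >= $lo && $max <= $hi;
    }
    return 8;
}

printf "EnumSize  0: {100,200} -> %d, {-100,200} -> %d\n",
    enum_size(0, 100, 200), enum_size(0, -100, 200);
printf "EnumSize -1: {-100,100} -> %d, {100,200} -> %d\n",
    enum_size(-1, -100, 100), enum_size(-1, 100, 200);
```

The only difference between the two modes is whether an enumeration
without negative values may be treated as unsigned.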
1709 "Alignment" => 0 | 1 | 2 | 4 | 8 | 16
1710 Set the struct member alignment. This option controls where
1711 padding bytes are inserted between struct members. It
1712 globally sets the alignment for all structs/unions.
1713 However, this can be overridden from within the source code
1714 with the common "pack" pragma as explained in "Supported
1715 pragma directives". The default alignment is 1, which
1716 means no padding bytes are inserted. A setting of 0 means
1717 native alignment, i.e. the alignment of the system that
1718 Convert::Binary::C has been compiled on. You can determine
1719 the native properties using the "native" function.
1720
1721 The "Alignment" option is similar to the "-Zp[n]" option of
1722 the Intel compiler. It globally specifies the maximum
1723 boundary to which struct members are aligned. Consider the
1724 following structure and the sizes of "char", "short",
1725 "long" and "double" being 1, 2, 4 and 8, respectively.
1726
1727 struct align {
1728 char a;
1729 short b, c;
1730 long d;
1731 double e;
1732 };
1733
1734 With an alignment of 1 (the default), the struct members
1735 would be packed tightly:
1736
1737 0 1 2 3 4 5 6 7 8 9 10 11 12
1738 +---+---+---+---+---+---+---+---+---+---+---+---+
1739 | a | b | c | d | ...
1740 +---+---+---+---+---+---+---+---+---+---+---+---+
1741
1742 12 13 14 15 16 17
1743 +---+---+---+---+---+
1744 ... e |
1745 +---+---+---+---+---+
1746
1747 With an alignment of 2, the struct members larger than one
1748 byte would be aligned to 2-byte boundaries, which results
1749 in a single padding byte between "a" and "b".
1750
1751 0 1 2 3 4 5 6 7 8 9 10 11 12
1752 +---+---+---+---+---+---+---+---+---+---+---+---+
1753 | a | * | b | c | d | ...
1754 +---+---+---+---+---+---+---+---+---+---+---+---+
1755
1756 12 13 14 15 16 17 18
1757 +---+---+---+---+---+---+
1758 ... e |
1759 +---+---+---+---+---+---+
1760
1761 With an alignment of 4, the struct members of size 2 would
1762 be aligned to 2-byte boundaries and larger struct members
1763 would be aligned to 4-byte boundaries:
1764
1765 0 1 2 3 4 5 6 7 8 9 10 11 12
1766 +---+---+---+---+---+---+---+---+---+---+---+---+
1767 | a | * | b | c | * | * | d | ...
1768 +---+---+---+---+---+---+---+---+---+---+---+---+
1769
1770 12 13 14 15 16 17 18 19 20
1771 +---+---+---+---+---+---+---+---+
1772 ... | e |
1773 +---+---+---+---+---+---+---+---+
1774
1775 This layout of the struct members allows the compiler to
1776 generate optimized code because aligned members can be
1777 accessed more easily by the underlying architecture.
1778
1779 Finally, setting the alignment to 8 will align "double"s to
1780 8-byte boundaries:
1781
1782 0 1 2 3 4 5 6 7 8 9 10 11 12
1783 +---+---+---+---+---+---+---+---+---+---+---+---+
1784 | a | * | b | c | * | * | d | ...
1785 +---+---+---+---+---+---+---+---+---+---+---+---+
1786
1787 12 13 14 15 16 17 18 19 20 21 22 23 24
1788 +---+---+---+---+---+---+---+---+---+---+---+---+
1789 ... | * | * | * | * | e |
1790 +---+---+---+---+---+---+---+---+---+---+---+---+
1791
1792 Further increasing the alignment does not alter the layout
1793 of our structure, as only members larger than 8 bytes would
1794 be affected.
1795
1796 The alignment of a structure depends on its largest member
1797 and on the setting of the "Alignment" option. With
1798 "Alignment" set to 2, a structure holding a "long" would be
1799 aligned to a 2-byte boundary, while a structure containing
1800 only "char"s would have no alignment restrictions.
1801 (Unfortunately, that's not the whole story. See the
1802 "CompoundAlignment" option for details.)
1803
1804 Here's another example. Assuming 8-byte alignment, the
1805 following two structs will both have a size of 16 bytes:
1806
1807 struct one {
1808 char c;
1809 double d;
1810 };
1811
1812 struct two {
1813 double d;
1814 char c;
1815 };
1816
1817 This is clear for "struct one", because the member "d" has
1818 to be aligned to an 8-byte boundary, and thus 7 padding
1819 bytes are inserted after "c". But for "struct two", the
1820 padding bytes are inserted at the end of the structure,
1821 which may not seem to make much sense at first. However, it
1822 makes perfect sense if you think about an array of "struct
1823 two". Each "double" has to be aligned to an 8-byte
1824 boundary, and thus each array element would have to occupy
1825 16 bytes. With that in mind, it would be strange if a
1826 "struct two" variable had a different size. And it
1827 would make the widely used construct
1828
1829 struct two array[] = { {1.0, 0}, {2.0, 1} };
1830 int elements = sizeof(array) / sizeof(struct two);
1831
1832 impossible.
1833
1834 The alignment behaviour described here seems to be common
1835 for all compilers. However, not all compilers have an
1836 option to configure their default alignment.
1837
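The layouts shown for "struct align", and the sizes of "struct one"
and "struct two", can be reproduced with a few lines of Perl. This is
a simplified model, not the module's code: it assumes every member is
a scalar whose natural alignment equals its size, and the function
name "layout" is made up for this sketch.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Lay out members given as [name, size] pairs. Each member is aligned
# to min(size, $alignment); the struct is padded to its own alignment.
sub layout {
    my ($alignment, @members) = @_;
    my ($offset, $struct_align, %at) = (0, 1);
    for my $m (@members) {
        my ($name, $size) = @$m;
        my $align = min($size, $alignment);
        $offset += ($align - $offset % $align) % $align;   # padding
        $at{$name} = $offset;
        $offset += $size;
        $struct_align = max($struct_align, $align);
    }
    $offset += ($struct_align - $offset % $struct_align) % $struct_align;
    return ($offset, \%at);
}

my @align = ([a => 1], [b => 2], [c => 2], [d => 4], [e => 8]);
for my $n (1, 2, 4, 8) {
    my ($size, $at) = layout($n, @align);
    print "Alignment $n: size $size, d at $at->{d}, e at $at->{e}\n";
}

my ($one) = layout(8, [c => 1], [d => 8]);
my ($two) = layout(8, [d => 8], [c => 1]);
print "struct one: $one bytes, struct two: $two bytes\n";
```

For a parsed type, the "sizeof" and "offsetof" methods report the
values the module itself uses.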
1838 "CompoundAlignment" => 0 | 1 | 2 | 4 | 8 | 16
1839 Usually, the alignment of a compound (i.e. a "struct" or a
1840 "union") depends only on its largest member and on the
1841 setting of the "Alignment" option. There are, however,
1842 architectures and compilers where compounds can have
1843 different alignment constraints.
1844
1845 For most platforms and compilers, the alignment constraint
1846 for compounds is 1 byte. That is, on most platforms
1847
1848 struct onebyte {
1849 char byte;
1850 };
1851
1852 will have an alignment of 1 and also a size of 1. But if
1853 you take an ARM architecture, the above "struct onebyte"
1854 will have an alignment of 4, and thus also a size of 4.
1855
1856 You can configure this by setting "CompoundAlignment" to 4.
1857 This will ensure that the alignment of compounds is always
1858 4.
1859
1860 Setting "CompoundAlignment" to 0 means native compound
1861 alignment, i.e. the compound alignment of the system that
1862 Convert::Binary::C has been compiled on. You can determine
1863 the native properties using the "native" function.
1864
1865 There are also compilers for certain platforms that allow
1866 you to adjust the compound alignment. If you're not aware
1867 of the fact that your compiler/architecture has a compound
1868 alignment other than 1, strange things can happen. If, for
1869 example, the compound alignment is 2 and you have something
1870 like
1871
1872 typedef unsigned char U8;
1873
1874 struct msg_head {
1875 U8 cmd;
1876 struct {
1877 U8 hi;
1878 U8 low;
1879 } crc16;
1880 U8 len;
1881 };
1882
1883 there will be one padding byte inserted before the embedded
1884 "crc16" struct and after the "len" member, which is most
1885 probably not what was intended:
1886
1887 0 1 2 3 4 5 6
1888 +-----+-----+-----+-----+-----+-----+
1889 | cmd | * | hi | low | len | * |
1890 +-----+-----+-----+-----+-----+-----+
1891
1892 Note that both "#pragma pack" and the "Alignment" option
1893 can override "CompoundAlignment". If you set
1894 "CompoundAlignment" to 4, but "Alignment" to 2, compounds
1895 will actually be aligned on 2-byte boundaries.
1896
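The effect on "struct msg_head" can be modelled as follows. This is a
simplified sketch under stated assumptions, not the module's
algorithm: the compound alignment is treated as a lower bound for a
compound's own alignment, capped by the "Alignment" setting, and the
function name "compound_layout" is made up for this illustration.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Members are [name, size, alignment]. The compound's own alignment is
# at least $compound, and everything is capped by $alignment.
sub compound_layout {
    my ($alignment, $compound, @members) = @_;
    my ($offset, $own, @map) = (0, min($compound, $alignment));
    for my $m (@members) {
        my ($name, $size, $align) = @$m;
        $align = min($align, $alignment);
        $offset += ($align - $offset % $align) % $align;
        push @map, "$name\@$offset";
        $offset += $size;
        $own = max($own, $align);
    }
    $offset += ($own - $offset % $own) % $own;
    return ($offset, $own, "@map");
}

# Inner anonymous struct { U8 hi; U8 low; } with CompoundAlignment 2.
my ($crc_size, $crc_align) =
    compound_layout(16, 2, [hi => 1, 1], [low => 1, 1]);

my ($size, $align, $map) = compound_layout(16, 2,
    [cmd => 1, 1], [crc16 => $crc_size, $crc_align], [len => 1, 1]);
print "msg_head: $size bytes ($map)\n";

# "struct onebyte" with a compound alignment of 4.
my ($arm_size) = compound_layout(16, 4, [byte => 1, 1]);
print "onebyte with CompoundAlignment 4: $arm_size bytes\n";
```

Passing 2 as the first argument together with a compound alignment of
4 yields a 2-byte "onebyte", which corresponds to the override
described in the last paragraph.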
1897 "ByteOrder" => 'BigEndian' | 'LittleEndian'
1898 Set the byte order for integers larger than a single byte.
1899 Little endian (Intel, least significant byte first) and big
1900 endian (Motorola, most significant byte first) byte order
1901 are supported. The default byte order is the same as the
1902 byte order of the host system unless overridden by
1903 "CBC_DEFAULT_BYTEORDER" at compile time.
1904
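The difference between the two byte orders is easy to see with
Perl's built-in "pack". The sketch below does not use
Convert::Binary::C at all; it only shows the byte sequences that the
two settings correspond to for a 32-bit value.

```perl
use strict;
use warnings;

my $value = 0x12345678;

# 'N' packs 32 bits big endian, 'V' packs them little endian.
my $big    = unpack 'H*', pack 'N', $value;
my $little = unpack 'H*', pack 'V', $value;

print "BigEndian:    $big\n";
print "LittleEndian: $little\n";
```

Data unpacked with the wrong setting still unpacks without errors,
just with byte-swapped values, so it is worth setting "ByteOrder"
explicitly when the data does not come from the host system.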
1905 "EnumType" => 'Integer' | 'String' | 'Both'
1906 This option controls the type that enumeration constants
1907 will have in data structures returned by the "unpack"
1908 method. If you have the following definitions:
1909
1910 typedef enum {
1911 SUNDAY, MONDAY, TUESDAY, WEDNESDAY,
1912 THURSDAY, FRIDAY, SATURDAY
1913 } Weekday;
1914
1915 typedef enum {
1916 JANUARY, FEBRUARY, MARCH, APRIL, MAY, JUNE, JULY,
1917 AUGUST, SEPTEMBER, OCTOBER, NOVEMBER, DECEMBER
1918 } Month;
1919
1920 typedef struct {
1921 int year;
1922 Month month;
1923 int day;
1924 Weekday weekday;
1925 } Date;
1926
1927 and a byte string that holds a packed Date struct, then
1928 you'll get the following results from a call to the
1929 "unpack" method.
1930
1931 "Integer"
1932 Enumeration constants are returned as plain integers.
1933 This is fast, but may not be very useful. It is also
1934 the default.
1935
1936 $date = {
1937 'weekday' => 1,
1938 'month' => 0,
1939 'day' => 7,
1940 'year' => 2002
1941 };
1942
1943 "String"
1944 Enumeration constants are returned as strings. This
1945 will create a string constant for every unpacked
1946 enumeration constant and thus consumes more time and
1947 memory. However, the result may be more useful.
1948
1949 $date = {
1950 'weekday' => 'MONDAY',
1951 'month' => 'JANUARY',
1952 'day' => 7,
1953 'year' => 2002
1954 };
1955
1956 "Both"
 Enumeration constants are returned as dual-typed
 scalars. If evaluated in string context, the
 enumeration constant will be a string; if evaluated in
 numeric context, it will be an integer.
1962
1963 $date = $c->EnumType('Both')->unpack('Date', $binary);
1964
1965 printf "Weekday = %s (%d)\n\n", $date->{weekday},
1966 $date->{weekday};
1967
1968 if ($date->{month} == 0) {
1969 print "It's $date->{month}, happy new year!\n\n";
1970 }
1971
1972 print Dumper($date);
1973
1974 This will print:
1975
1976 Weekday = MONDAY (1)
1977
1978 It's JANUARY, happy new year!
1979
1980 $VAR1 = {
1981 'weekday' => 'MONDAY',
1982 'month' => 'JANUARY',
1983 'day' => 7,
1984 'year' => 2002
1985 };
1986
1987 "DisabledKeywords" => [ KEYWORDS ]
1988 This option allows you to selectively deactivate certain
 keywords in the C parser. Some C compilers don't have the
 complete ANSI keyword set; for example, they may not
 recognize the keywords "const" or "void". If you do
1992
1993 typedef int void;
1994
1995 on such a compiler, this will usually be ok. But if you
1996 parse this with an ANSI compiler, it will be a syntax
1997 error. To parse the above code correctly, you have to
1998 disable the "void" keyword in the Convert::Binary::C
1999 parser:
2000
2001 $c->DisabledKeywords([qw( void )]);
2002
2003 By default, the Convert::Binary::C parser will recognize
2004 the keywords "inline" and "restrict". If your compiler
2005 doesn't have these new keywords, it usually doesn't matter.
2006 Only if you're using the keywords as identifiers, like in
2007
2008 typedef struct inline {
2009 int a, b;
2010 } restrict;
2011
2012 you'll have to disable these ISO-C99 keywords:
2013
2014 $c->DisabledKeywords([qw( inline restrict )]);
2015
2016 The parser allows you to disable the following keywords:
2017
2018 asm
2019 auto
2020 const
2021 double
2022 enum
2023 extern
2024 float
2025 inline
2026 long
2027 register
2028 restrict
2029 short
2030 signed
2031 static
2032 unsigned
2033 void
2034 volatile
2035
2036 "KeywordMap" => { KEYWORD => TOKEN, ... }
2037 This option allows you to add new keywords to the parser.
2038 These new keywords can either be mapped to existing tokens
2039 or simply ignored. For example, recent versions of the GNU
2040 compiler recognize the keywords "__signed__" and
2041 "__extension__". The first one obviously is a synonym for
2042 "signed", while the second one is only a marker for a
2043 language extension.
2044
2045 Using the preprocessor, you could of course do the
2046 following:
2047
2048 $c->Define(qw( __signed__=signed __extension__= ));
2049
2050 However, the preprocessor symbols could be undefined or
2051 redefined in the code, and
2052
2053 #ifdef __signed__
2054 # undef __signed__
2055 #endif
2056
2057 typedef __extension__ __signed__ long long s_quad;
2058
2059 would generate a parse error, because "__signed__" is an
2060 unexpected identifier.
2061
2062 Instead of utilizing the preprocessor, you'll have to
2063 create mappings for the new keywords directly in the parser
2064 using "KeywordMap". In the above example, you want to map
2065 "__signed__" to the built-in C keyword "signed" and ignore
2066 "__extension__". This could be done with the following
2067 code:
2068
2069 $c->KeywordMap({ __signed__ => 'signed',
2070 __extension__ => undef });
2071
2072 You can specify any valid identifier as hash key, and
2073 either a valid C keyword or "undef" as hash value. Having
2074 configured the object that way, you could parse even
2075
2076 #ifdef __signed__
2077 # undef __signed__
2078 #endif
2079
2080 typedef __extension__ __signed__ long long s_quad;
2081
2082 without problems.
2083
 Note that "KeywordMap" and "DisabledKeywords" work together
 perfectly. You could, for example, disable the "signed"
 keyword, but still have "__signed__" mapped to the original
 "signed" token:
2088
2089 $c->configure(DisabledKeywords => [ 'signed' ],
2090 KeywordMap => { __signed__ => 'signed' });
2091
2092 This would allow you to define
2093
2094 typedef __signed__ long signed;
2095
2096 which would normally be a syntax error because "signed"
2097 cannot be used as an identifier.
2098
2099 "UnsignedChars" => 0 | 1
2100 Use this boolean option if you want characters to be
2101 unsigned if specified without an explicit "signed" or
2102 "unsigned" type specifier. By default, characters are
2103 signed.
2104
2105 "UnsignedBitfields" => 0 | 1
2106 Use this boolean option if you want bitfields to be
2107 unsigned if specified without an explicit "signed" or
2108 "unsigned" type specifier. By default, bitfields are
2109 signed.
2110
2111 "Warnings" => 0 | 1
2112 Use this boolean option if you want warnings to be issued
2113 during the parsing of source code. Currently, warnings are
2114 only reported by the preprocessor, so don't expect the
2115 output to cover everything.
2116
2117 By default, warnings are turned off and only errors will be
2118 reported. However, even these errors are turned off if you
2119 run without the "-w" flag.
2120
2121 "HasCPPComments" => 0 | 1
2122 Use this option to turn C++ comments on or off. By default,
2123 C++ comments are enabled. Disabling C++ comments may be
2124 necessary if your code includes strange things like:
2125
2126 one = 4 //* <- divide */ 4;
2127 two = 2;
2128
2129 With C++ comments, the above will be interpreted as
2130
2131 one = 4
2132 two = 2;
2133
2134 which will obviously be a syntax error, but without C++
2135 comments, it will be interpreted as
2136
2137 one = 4 / 4;
2138 two = 2;
2139
2140 which is correct.
2141
2142 "HasMacroVAARGS" => 0 | 1
2143 Use this option to turn the "__VA_ARGS__" macro expansion
2144 on or off. If this is enabled (which is the default), you
2145 can use variable length argument lists in your preprocessor
2146 macros.
2147
2148 #define DEBUG( ... ) fprintf( stderr, __VA_ARGS__ )
2149
2150 There's normally no reason to turn that feature off.
2151
2152 "StdCVersion" => undef | INTEGER
2153 Use this option to change the value of the preprocessor's
2154 predefined "__STDC_VERSION__" macro. When set to "undef",
2155 the macro will not be defined.
2156
2157 "HostedC" => undef | 0 | 1
2158 Use this option to change the value of the preprocessor's
2159 predefined "__STDC_HOSTED__" macro. When set to "undef",
2160 the macro will not be defined.
2161
2162 "Include" => [ INCLUDES ]
2163 Use this option to set the include path for the internal
2164 preprocessor. The option value is a reference to an array
2165 of strings, each string holding a directory that should be
2166 searched for includes.
2167
2168 "Define" => [ DEFINES ]
2169 Use this option to define symbols in the preprocessor. The
2170 option value is, again, a reference to an array of strings.
2171 Each string can be either just a symbol or an assignment to
2172 a symbol. This is completely equivalent to what the "-D"
2173 option does for most preprocessors.
2174
2175 The following will define the symbol "FOO" and define "BAR"
2176 to be 12345:
2177
2178 $c->configure(Define => [qw( FOO BAR=12345 )]);
2179
2180 "Assert" => [ ASSERTIONS ]
2181 Use this option to make assertions in the preprocessor. If
2182 you don't know what assertions are, don't be concerned,
 since they're deprecated anyway. They are, however, used in
 some systems' include files. The value is an array
2185 reference, just like for the macro definitions. Only the
2186 way the assertions are defined is a bit different and
2187 mimics the way they are defined with the "#assert"
2188 directive:
2189
2190 $c->configure(Assert => ['foo(bar)']);
2191
2192 "OrderMembers" => 0 | 1
2193 When using "unpack" on compounds and iterating over the
2194 returned hash, the order of the compound members is
2195 generally not preserved due to the nature of hash tables.
2196 It is not even guaranteed that the order is the same
 between different runs of the same program. This can be
 very annoying if you simply use Data::Dumper to dump your
 data structures and the compound members always show up in
 a different order.
2201
2202 By setting "OrderMembers" to a non-zero value, all hashes
2203 returned by "unpack" are tied to a class that preserves the
2204 order of the hash keys. This way, all compound members
2205 will be returned in the correct order just as they are
2206 defined in your C code.
2207
2208 use Convert::Binary::C;
2209 use Data::Dumper;
2210
2211 $c = Convert::Binary::C->new->parse(<<'ENDC');
2212 struct test {
2213 char one;
2214 char two;
2215 struct {
2216 char never;
2217 char change;
2218 char this;
2219 char order;
2220 } three;
2221 char four;
2222 };
2223 ENDC
2224
2225 $data = "Convert";
2226
2227 $u1 = $c->unpack('test', $data);
2228 $c->OrderMembers(1);
2229 $u2 = $c->unpack('test', $data);
2230
2231 print Data::Dumper->Dump([$u1, $u2], [qw(u1 u2)]);
2232
2233 This will print something like:
2234
2235 $u1 = {
2236 'three' => {
2237 'change' => 118,
2238 'order' => 114,
2239 'this' => 101,
2240 'never' => 110
2241 },
2242 'one' => 67,
2243 'two' => 111,
2244 'four' => 116
2245 };
2246 $u2 = {
2247 'one' => 67,
2248 'two' => 111,
2249 'three' => {
2250 'never' => 110,
2251 'change' => 118,
2252 'this' => 101,
2253 'order' => 114
2254 },
2255 'four' => 116
2256 };
2257
2258 To be able to use this option, you have to install either
2259 the Tie::Hash::Indexed or the Tie::IxHash module. If both
2260 are installed, Convert::Binary::C will give preference to
2261 Tie::Hash::Indexed because it's faster.
2262
2263 When using this option, you should keep in mind that tied
2264 hashes are significantly slower and consume more memory
2265 than ordinary hashes, even when the class they're tied to
2266 is implemented efficiently. So don't turn this option on if
2267 you don't have to.
2268
2269 You can also influence hash member ordering by using the
2270 "CBC_ORDER_MEMBERS" environment variable.
2271
2272 "Bitfields" => { OPTION => VALUE, ... }
 Use this option to specify and configure a bitfield
 layout engine. You can choose an engine by passing its
 name to the "Engine" option, like:
2276
2277 $c->configure(Bitfields => { Engine => 'Generic' });
2278
2279 Each engine can have its own set of options, although
2280 currently none of them does.
2281
2282 You can choose between the following bitfield engines:
2283
2284 "Generic"
2285 This engine implements the behaviour of most UNIX C
2286 compilers, including GCC. It does not handle packed
2287 bitfields yet.
2288
2289 "Microsoft"
2290 This engine implements the behaviour of Microsoft's
2291 "cl" compiler. It should be fairly complete and can
2292 handle packed bitfields.
2293
2294 "Simple"
2295 This engine is only used for testing the bitfield
2296 infrastructure in Convert::Binary::C. There's usually
2297 no reason to use it.
2298
2299 You can reconfigure all options even after you have parsed some
2300 code. The changes will be applied to the already parsed
2301 definitions. This works as long as array lengths are not
2302 affected by the changes. If you have Alignment and IntSize set
2303 to 4 and parse code like this
2304
2305 typedef struct {
2306 char abc;
2307 int day;
2308 } foo;
2309
2310 struct bar {
2311 foo zap[2*sizeof(foo)];
2312 };
2313
2314 the array "zap" in "struct bar" will obviously have 16
2315 elements. If you reconfigure the alignment to 1 now, the size
2316 of "foo" is now 5 instead of 8. While the alignment is adjusted
2317 correctly, the number of elements in array "zap" will still be
2318 16 and will not be changed to 10.
2319
2320 parse
2321 "parse" CODE
2322 Parses a string of valid C code. All enumeration, compound and
2323 type definitions are extracted. You can call the "parse" and
2324 "parse_file" methods as often as you like to add further
2325 definitions to the Convert::Binary::C object.
2326
2327 "parse" will throw an exception if an error occurs. On
2328 success, the method returns a reference to its object.
2329
2330 See "Parsing C code" for an example.
2331
2332 parse_file
2333 "parse_file" FILE
2334 Parses a C source file. All enumeration, compound and type
2335 definitions are extracted. You can call the "parse" and
2336 "parse_file" methods as often as you like to add further
2337 definitions to the Convert::Binary::C object.
2338
2339 "parse_file" will search the include path given via the
2340 "Include" option for the file if it cannot find it in the
2341 current directory.
2342
2343 "parse_file" will throw an exception if an error occurs. On
2344 success, the method returns a reference to its object.
2345
2346 See "Parsing C code" for an example.
2347
2348 When calling "parse" or "parse_file" multiple times, you may
2349 use types previously defined, but you are not allowed to
2350 redefine types. The state of the preprocessor is also saved, so
2351 you may also use defines from a previous parse. This works only
2352 as long as the preprocessor is not reset. See "Preprocessor
2353 configuration" for details.
2354
2355 When you're parsing C source files instead of C header files,
2356 note that local definitions are ignored. This means that type
2357 definitions hidden within functions will not be recognized by
2358 Convert::Binary::C. This is necessary because different
2359 functions (even different blocks within the same function) can
2360 define types with the same name:
2361
2362 void my_func(int i)
2363 {
2364 if (i < 10)
2365 {
2366 enum digit { ONE, TWO, THREE } x = ONE;
2367 printf("%d, %d\n", i, x);
2368 }
2369 else
2370 {
2371 enum digit { THREE, TWO, ONE } x = ONE;
2372 printf("%d, %d\n", i, x);
2373 }
2374 }
2375
2376 The above is a valid piece of C code, but it's not possible for
2377 Convert::Binary::C to distinguish between the different
2378 definitions of "enum digit", as they're only defined locally
2379 within the corresponding block.
2380
2381 clean
 "clean"
 Clears all information that has been collected during previous
 calls to "parse" or "parse_file". You can use this method if
 you want to parse some entirely different code, but with the
 same configuration.
2386
2387 The "clean" method returns a reference to its object.
2388
2389 clone
 "clone"
 Makes the object return an exact independent copy of itself.
2391
2392 $c = new Convert::Binary::C Include => ['/usr/include'];
2393 $c->parse_file('definitions.c');
2394 $clone = $c->clone;
2395
 The above code is mostly equivalent to the following. (The
 equivalence is not exact because using "sourcify" and "parse"
 might alter the order of the parsed data, which would make
 methods such as "compound" return the definitions in a
 different order.)
2400
2401 $c = new Convert::Binary::C Include => ['/usr/include'];
2402 $c->parse_file('definitions.c');
2403 $clone = new Convert::Binary::C %{$c->configure};
2404 $clone->parse($c->sourcify);
2405
2406 Using "clone" is just a lot faster.
2407
2408 def
2409 "def" NAME
2410 "def" TYPE
2411 If you need to know if a definition for a certain type name
2412 exists, use this method. You pass it the name of an enum,
2413 struct, union or typedef, and it will return a non-empty string
2414 being either "enum", "struct", "union", or "typedef" if there's
2415 a definition for the type in question, an empty string if
2416 there's no such definition, or "undef" if the name is
2417 completely unknown. If the type can be interpreted as a basic
2418 type, "basic" will be returned.
2419
2420 If you pass in a TYPE, the output will be slightly different.
2421 If the specified member exists, the "def" method will return
2422 "member". If the member doesn't exist, or if the type cannot
2423 have members, the empty string will be returned. Again, if the
2424 name of the type is completely unknown, "undef" will be
2425 returned. This may be useful if you want to check if a certain
2426 member exists within a compound, for example.
2427
2428 use Convert::Binary::C;
2429
2430 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2431
2432 typedef struct __not not;
2433 typedef struct __not *ptr;
2434
2435 struct foo {
2436 enum bar *xxx;
2437 };
2438
2439 typedef int quad[4];
2440
2441 ENDC
2442
2443 for my $type (qw( not ptr foo bar xxx foo.xxx foo.abc xxx.yyy
2444 quad quad[3] quad[5] quad[-3] short[1] ),
2445 'unsigned long')
2446 {
2447 my $def = $c->def($type);
2448 printf "%-14s => %s\n",
2449 $type, defined $def ? "'$def'" : 'undef';
2450 }
2451
2452 The following would be returned by the "def" method:
2453
2454 not => ''
2455 ptr => 'typedef'
2456 foo => 'struct'
2457 bar => ''
2458 xxx => undef
2459 foo.xxx => 'member'
2460 foo.abc => ''
2461 xxx.yyy => undef
2462 quad => 'typedef'
2463 quad[3] => 'member'
2464 quad[5] => 'member'
2465 quad[-3] => 'member'
2466 short[1] => undef
2467 unsigned long => 'basic'
2468
2469 So, if "def" returns a non-empty string, you can safely use any
2470 other method with that type's name or with that member
2471 expression.
2472
2473 Concerning arrays, note that the index into an array doesn't
2474 need to be within the bounds of the array's definition, just
2475 like in C. In the above example, "quad[5]" and "quad[-3]" are
2476 valid members of the "quad" array, even though it is declared
2477 to have only four elements.
2478
2479 In cases where the typedef namespace overlaps with the
2480 namespace of enums/structs/unions, the "def" method will give
2481 preference to the typedef and will thus return the string
2482 "typedef". You could however force interpretation as an enum,
2483 struct or union by putting "enum", "struct" or "union" in front
2484 of the type's name.
2485
2486 defined
2487 "defined" MACRO
2488 You can use the "defined" method to find out if a certain macro
2489 is defined, just like you would use the "defined" operator of
2490 the preprocessor. For example, the following code
2491
2492 use Convert::Binary::C;
2493
2494 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2495
2496 #define ADD(a, b) ((a) + (b))
2497
2498 #if 1
2499 # define DEFINED
2500 #else
2501 # define UNDEFINED
2502 #endif
2503
2504 ENDC
2505
2506 for my $macro (qw( ADD DEFINED UNDEFINED )) {
2507 my $not = $c->defined($macro) ? '' : ' not';
2508 print "Macro '$macro' is$not defined.\n";
2509 }
2510
2511 would print:
2512
2513 Macro 'ADD' is defined.
2514 Macro 'DEFINED' is defined.
2515 Macro 'UNDEFINED' is not defined.
2516
2517 You have to keep in mind that this works only as long as the
2518 preprocessor is not reset. See "Preprocessor configuration" for
2519 details.
2520
2521 pack
2522 "pack" TYPE
2523 "pack" TYPE, DATA
2524 "pack" TYPE, DATA, STRING
2525 Use this method to pack a complex data structure into a binary
2526 string according to a type definition that has been previously
2527 parsed. DATA must be a scalar matching the type definition. C
2528 structures and unions are represented by references to Perl
2529 hashes, C arrays by references to Perl arrays.
2530
2531 use Convert::Binary::C;
2532 use Data::Dumper;
2533 use Data::Hexdumper;
2534
2535 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2536 , LongSize => 4
2537 , ShortSize => 2
2538 )
2539 ->parse(<<'ENDC');
2540 struct test {
2541 char ary[3];
2542 union {
2543 short word[2];
2544 long quad;
2545 } uni;
2546 };
2547 ENDC
2548
2549 Hashes don't have to contain a key for each compound member and
2550 arrays may be truncated:
2551
2552 $binary = $c->pack('test', { ary => [1, 2], uni => { quad => 42 } });
2553
2554 Elements not defined in the Perl data structure will be set to
 zero in the packed byte string. If you pass "undef" as the
 second parameter, or simply omit it, the whole string will be
 initialized with zero bytes. On success, the packed byte string
 is returned.
2559
2560 print hexdump(data => $binary);
2561
2562 The above code would print:
2563
2564 0x0000 : 01 02 00 00 00 00 2A : ......*
2565
2566 You could also use "unpack" and dump the data structure.
2567
2568 $unpacked = $c->unpack('test', $binary);
2569 print Data::Dumper->Dump([$unpacked], ['unpacked']);
2570
2571 This would print:
2572
2573 $unpacked = {
2574 'uni' => {
2575 'word' => [
2576 0,
2577 42
2578 ],
2579 'quad' => 42
2580 },
2581 'ary' => [
2582 1,
2583 2,
2584 0
2585 ]
2586 };
2587
2588 If TYPE refers to a compound object, you may pack any member of
2589 that compound object. Simply add a member expression to the
2590 type name, just as you would access the member in C:
2591
2592 $array = $c->pack('test.ary', [1, 2, 3]);
2593 print hexdump(data => $array);
2594
2595 $value = $c->pack('test.uni.word[1]', 2);
2596 print hexdump(data => $value);
2597
2598 This would give you:
2599
2600 0x0000 : 01 02 03 : ...
2601 0x0000 : 00 02 : ..
2602
2603 Call "pack" with the optional STRING argument if you want to
2604 use an existing binary string to insert the data. If called in
2605 a void context, "pack" will directly modify the string you
2606 passed as the third argument. Otherwise, a copy of the string
2607 is created, and "pack" will modify and return the copy, so the
2608 original string will remain unchanged.
2609
2610 The 3-argument version may be useful if you want to change only
2611 a few members of a complex data structure without having to
2612 "unpack" everything, change the members, and then "pack" again
2613 (which could waste lots of memory and CPU cycles). So, instead
2614 of doing something like
2615
2616 $test = $c->unpack('test', $binary);
2617 $test->{uni}{quad} = 4711;
2618 $new = $c->pack('test', $test);
2619
 to change the "uni.quad" member of $binary, you could simply do
 either
2622
2623 $new = $c->pack('test', { uni => { quad => 4711 } }, $binary);
2624
2625 or
2626
2627 $c->pack('test', { uni => { quad => 4711 } }, $binary);
2628
 where the latter would directly modify $binary. Besides this
 code being a lot shorter (and perhaps even more readable), it
 can be significantly faster if you're dealing with really big
 data blocks.
2633
2634 If the length of the input string is less than the size
2635 required by the type, the string (or its copy) is extended and
 the extended part is initialized to zero. If the length is
 more than the size required by the type, the string keeps its
 length, and the bytes beyond the size of the type are left
 untouched, both in the original string and in a returned copy.
2640
2641 $too_short = pack "C*", (1 .. 4);
2642 $too_long = pack "C*", (1 .. 20);
2643
2644 $c->pack('test', { uni => { quad => 0x4711 } }, $too_short);
2645 print "too_short:\n", hexdump(data => $too_short);
2646
2647 $copy = $c->pack('test', { uni => { quad => 0x4711 } }, $too_long);
2648 print "\ncopy:\n", hexdump(data => $copy);
2649
2650 This would print:
2651
2652 too_short:
2653 0x0000 : 01 02 03 00 00 47 11 : .....G.
2654
2655 copy:
2656 0x0000 : 01 02 03 00 00 47 11 08 09 0A 0B 0C 0D 0E 0F 10 : .....G..........
2657 0x0010 : 11 12 13 14 : ....
2658
2659 unpack
2660 "unpack" TYPE, STRING
2661 Use this method to unpack a binary string and create an
2662 arbitrarily complex Perl data structure based on a previously
2663 parsed type definition.
2664
2665 use Convert::Binary::C;
2666 use Data::Dumper;
2667
2668 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2669 , LongSize => 4
2670 , ShortSize => 2
2671 )
2672 ->parse( <<'ENDC' );
2673 struct test {
2674 char ary[3];
2675 union {
2676 short word[2];
2677 long *quad;
2678 } uni;
2679 };
2680 ENDC
2681
2682 # Generate some binary dummy data
2683 $binary = pack "C*", 1 .. $c->sizeof('test');
2684
2685 On failure, e.g. if the specified type cannot be found, the
2686 method will throw an exception. On success, a reference to a
2687 complex Perl data structure is returned, which can directly be
2688 dumped using the Data::Dumper module:
2689
2690 $unpacked = $c->unpack('test', $binary);
2691 print Dumper($unpacked);
2692
2693 This would print:
2694
2695 $VAR1 = {
2696 'uni' => {
2697 'word' => [
2698 1029,
2699 1543
2700 ],
2701 'quad' => 67438087
2702 },
2703 'ary' => [
2704 1,
2705 2,
2706 3
2707 ]
2708 };
2709
2710 If TYPE refers to a compound object, you may unpack any member
2711 of that compound object. Simply add a member expression to the
2712 type name, just as you would access the member in C:
2713
2714 $binary2 = substr $binary, $c->offsetof('test', 'uni.word');
2715
2716 $unpack1 = $unpacked->{uni}{word};
2717 $unpack2 = $c->unpack('test.uni.word', $binary2);
2718
2719 print Data::Dumper->Dump([$unpack1, $unpack2], [qw(unpack1 unpack2)]);
2720
2721 You will find that the output is exactly the same for both
2722 $unpack1 and $unpack2:
2723
2724 $unpack1 = [
2725 1029,
2726 1543
2727 ];
2728 $unpack2 = [
2729 1029,
2730 1543
2731 ];
2732
2733 When "unpack" is called in list context, it will unpack as many
2734 elements as possible from STRING, including zero if STRING is
2735 not long enough.
2736
2737 initializer
2738 "initializer" TYPE
2739 "initializer" TYPE, DATA
 The "initializer" method can be used to retrieve an initializer
2741 string for a certain TYPE. This can be useful if you have to
2742 initialize only a couple of members in a huge compound type or
2743 if you simply want to generate initializers automatically.
2744
2745 struct date {
2746 unsigned year : 12;
2747 unsigned month: 4;
2748 unsigned day : 5;
2749 unsigned hour : 5;
2750 unsigned min : 6;
2751 };
2752
2753 typedef struct {
2754 enum { DATE, QWORD } type;
2755 short number;
2756 union {
2757 struct date date;
2758 unsigned long qword;
2759 } choice;
2760 } data;
2761
2762 Given the above code has been parsed
2763
2764 $init = $c->initializer('data');
2765 print "data x = $init;\n";
2766
2767 would print the following:
2768
2769 data x = {
2770 0,
2771 0,
2772 {
2773 {
2774 0,
2775 0,
2776 0,
2777 0,
2778 0
2779 }
2780 }
2781 };
2782
2783 You could directly put that into a C program, although it
2784 probably isn't very useful yet. It becomes more useful if you
2785 actually specify how you want to initialize the type:
2786
2787 $data = {
2788 type => 'QWORD',
2789 choice => {
2790 date => { month => 12, day => 24 },
2791 qword => 4711,
2792 },
2793 stuff => 'yes?',
2794 };
2795
2796 $init = $c->initializer('data', $data);
2797 print "data x = $init;\n";
2798
2799 This would print the following:
2800
2801 data x = {
2802 QWORD,
2803 0,
2804 {
2805 {
2806 0,
2807 12,
2808 24,
2809 0,
2810 0
2811 }
2812 }
2813 };
2814
2815 As only the first member of a "union" can be initialized,
2816 "choice.qword" is ignored. You will not be warned about the
2817 fact that you probably tried to initialize a member other than
2818 the first. This is considered a feature, because it allows you
2819 to use "unpack" to generate the initializer data:
2820
2821 $data = $c->unpack('data', $binary);
2822 $init = $c->initializer('data', $data);
2823
 Since "unpack" unpacks all union members, you would otherwise
 have to delete all but the first one before feeding the data
 into "initializer".
2827
2828 Also, "stuff" is ignored, because it actually isn't a member of
2829 "data". You won't be warned about that either.
2830
2831 sizeof
2832 "sizeof" TYPE
2833 This method will return the size of a C type in bytes. If it
2834 cannot find the type, it will throw an exception.
2835
2836 If the type defines some kind of compound object, you may ask
2837 for the size of a member of that compound object:
2838
2839 $size = $c->sizeof('test.uni.word[1]');
2840
2841 This would set $size to 2.
2842
2843 typeof
2844 "typeof" TYPE
2845 This method will return the type of a C member. While this
2846 only makes sense for compound types, it's legal to also use it
2847 for non-compound types. If it cannot find the type, it will
2848 throw an exception.
2849
2850 The "typeof" method can be used on any valid member, even on
2851 arrays or unnamed types. It will always return a string that
2852 holds the name (or in case of unnamed types only the class) of
2853 the type, optionally followed by a '*' character to indicate
2854 it's a pointer type, and optionally followed by one or more
2855 array dimensions if it's an array type. If the type is a
2856 bitfield, the type name is followed by a colon and the number
2857 of bits.
2858
2859 struct test {
2860 char ary[3];
2861 union {
2862 short word[2];
2863 long *quad;
2864 } uni;
2865 struct {
2866 unsigned short six:6;
2867 unsigned short ten:10;
2868 } bits;
2869 };
2870
2871 Given the above C code has been parsed, calls to "typeof" would
2872 return the following values:
2873
2874 $c->typeof('test') => 'struct test'
2875 $c->typeof('test.ary') => 'char [3]'
2876 $c->typeof('test.uni') => 'union'
2877 $c->typeof('test.uni.quad') => 'long *'
2878 $c->typeof('test.uni.word') => 'short [2]'
2879 $c->typeof('test.uni.word[1]') => 'short'
2880 $c->typeof('test.bits') => 'struct'
2881 $c->typeof('test.bits.six') => 'unsigned short :6'
2882 $c->typeof('test.bits.ten') => 'unsigned short :10'
2883
2884 offsetof
2885 "offsetof" TYPE, MEMBER
 You can use "offsetof" just like the C macro of the same
 name. It will simply return the offset (in bytes) of
 MEMBER relative to TYPE.
2889
2890 use Convert::Binary::C;
2891
2892 $c = Convert::Binary::C->new( Alignment => 4
2893 , LongSize => 4
2894 , PointerSize => 4
2895 )
2896 ->parse(<<'ENDC');
2897 typedef struct {
2898 char abc;
2899 long day;
2900 int *ptr;
2901 } week;
2902
2903 struct test {
2904 week zap[8];
2905 };
2906 ENDC
2907
2908 @args = (
2909 ['test', 'zap[5].day' ],
2910 ['test.zap[2]', 'day' ],
2911 ['test', 'zap[5].day+1'],
2912 ['test', 'zap[-3].ptr' ],
2913 );
2914
2915 for (@args) {
2916 my $offset = eval { $c->offsetof(@$_) };
2917 printf "\$c->offsetof('%s', '%s') => $offset\n", @$_;
2918 }
2919
2920 The final loop will print:
2921
2922 $c->offsetof('test', 'zap[5].day') => 64
2923 $c->offsetof('test.zap[2]', 'day') => 4
2924 $c->offsetof('test', 'zap[5].day+1') => 65
2925 $c->offsetof('test', 'zap[-3].ptr') => -28
2926
2927 · The first iteration simply shows that the offset of
2928 "zap[5].day" is 64 relative to the beginning of "struct
2929 test".
2930
2931 · You may additionally specify a member for the type passed as
2932 the first argument, as shown in the second iteration.
2933
2934 · The offset suffix is also supported by "offsetof", so the
2935 third iteration will correctly print 65.
2936
2937 · The last iteration demonstrates that even out-of-bounds array
2938 indices are handled correctly, just as they are handled in C.
2939
2940 Unlike the C macro, "offsetof" also works on array types.
2941
2942 $offset = $c->offsetof('test.zap', '[3].ptr+2');
2943 print "offset = $offset";
2944
2945 This will print:
2946
2947 offset = 46
2948
2949 If TYPE is a compound, MEMBER may optionally be prefixed with a
2950 dot, so
2951
2952 printf "offset = %d\n", $c->offsetof('week', 'day');
2953 printf "offset = %d\n", $c->offsetof('week', '.day');
2954
2955 are both equivalent and will print
2956
2957 offset = 4
2958 offset = 4
2959
2960 This allows one to
2961
2962 · use the C macro style, without a leading dot, and
2963
2964 · directly use the output of the "member" method, which
2965 includes a leading dot for compound types, as input for the
2966 MEMBER argument.
2967
2968 member
2969 "member" TYPE
2970 "member" TYPE, OFFSET
2971 You can think of "member" as being the reverse of the
2972 "offsetof" method. However, as this is more complex, there's no
2973 equivalent to "member" in the C language.
2974
2975 Usually this method is used if you want to retrieve the name of
2976 the member that is located at a specific offset of a previously
2977 parsed type.
2978
2979 use Convert::Binary::C;
2980
2981 $c = Convert::Binary::C->new( Alignment => 4
2982 , LongSize => 4
2983 , PointerSize => 4
2984 )
2985 ->parse(<<'ENDC');
2986 typedef struct {
2987 char abc;
2988 long day;
2989 int *ptr;
2990 } week;
2991
2992 struct test {
2993 week zap[8];
2994 };
2995 ENDC
2996
2997 for my $offset (24, 39, 69, 99) {
2998 print "\$c->member('test', $offset)";
2999 my $member = eval { $c->member('test', $offset) };
3000 print $@ ? "\n exception: $@" : " => '$member'\n";
3001 }
3002
3003 This will print:
3004
3005 $c->member('test', 24) => '.zap[2].abc'
3006 $c->member('test', 39) => '.zap[3]+3'
3007 $c->member('test', 69) => '.zap[5].ptr+1'
3008 $c->member('test', 99)
3009 exception: Offset 99 out of range (0 <= offset < 96)
3010
3011 · The output of the first iteration is obvious. The member
3012 "zap[2].abc" is located at offset 24 of "struct test".
3013
3014 · In the second iteration, the offset points into a region of
3015 padding bytes and thus no member of "week" can be named.
3016 Instead of a member name the offset relative to "zap[3]" is
3017 appended.
3018
3019 · In the third iteration, the offset points to "zap[5].ptr".
3020 However, "zap[5].ptr" is located at 68, not at 69, and thus
3021 the remaining offset of 1 is also appended.
3022
3023 · The last iteration causes an exception because the offset of
3024 99 is not valid for "struct test" since the size of "struct
3025 test" is only 96. You might argue that this is inconsistent,
3026 since "offsetof" can also handle out-of-bounds array members.
3027 But as soon as you have more than one level of array nesting,
3028 there's an infinite number of out-of-bounds members for a
3029 single given offset, so it would be impossible to return a
3030 list of all members.
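One way to check that "member" really behaves as the reverse of "offsetof" is to walk every valid offset and convert it back. The following is only a sketch under the configuration used in this example; the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( Alignment   => 4,
                                 LongSize    => 4,
                                 PointerSize => 4 )
                          ->parse(<<'ENDC');
typedef struct {
  char abc;
  long day;
  int *ptr;
} week;

struct test {
  week zap[8];
};
ENDC

my $size       = $c->sizeof('test');
my $mismatches = 0;

for my $offset (0 .. $size - 1) {
  # Scalar context: the single best member for this offset,
  # possibly with a "+N" suffix or naming a padding region.
  my $member = $c->member('test', $offset);

  # "offsetof" accepts the leading dot and the "+N" suffix.
  my $back = $c->offsetof('test', $member);
  $mismatches++ if $back != $offset;
}

print "checked $size offsets, $mismatches mismatches\n";

# The first offset past the end is rejected with an exception,
# so guard the call with eval.
my $beyond = eval { $c->member('test', $size) };
print "offset $size: ", (defined $beyond ? $beyond : 'rejected'), "\n";
```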
3031
3032 You can additionally specify a member for the type passed as
3033 the first argument:
3034
3035 $member = $c->member('test.zap[2]', 6);
3036 print $member;
3037
3038 This will print:
3039
3040 .day+2
3041
3042 Like "offsetof", "member" also works on array types:
3043
3044 $member = $c->member('test.zap', 42);
3045 print $member;
3046
3047 This will print:
3048
3049 [3].day+2
3050
3051 While the behaviour for "struct"s is quite obvious, the
3052 behaviour for "union"s is rather tricky. As a single offset
3053 usually references more than one member of a union, there are
3054 certain rules that the algorithm uses for determining the best
3055 member.
3056
3057 · The first non-compound member that is referenced without an
3058 offset has the highest priority.
3059
3060 · If no member is referenced without an offset, the first non-
3061 compound member that is referenced with an offset will be
3062 returned.
3063
3064 · Otherwise the first padding region that is encountered will
3065 be taken.
3066
3067 As an example, given 4-byte-alignment and the union
3068
3069 union choice {
3070 struct {
3071 char color[2];
3072 long size;
3073 char taste;
3074 } apple;
3075 char grape[3];
3076 struct {
3077 long weight;
3078 short price[3];
3079 } melon;
3080 };
3081
3082 the "member" method would return what is shown in the Member
3083 column of the following table. The Type column shows the result
3084 of the "typeof" method when passing the corresponding member.
3085
3086 Offset   Member               Type
3087 --------------------------------------
3088    0     .apple.color[0]      'char'
3089    1     .apple.color[1]      'char'
3090    2     .grape[2]            'char'
3091    3     .melon.weight+3      'long'
3092    4     .apple.size          'long'
3093    5     .apple.size+1        'long'
3094    6     .melon.price[1]      'short'
3095    7     .apple.size+3        'long'
3096    8     .apple.taste         'char'
3097    9     .melon.price[2]+1    'short'
3098   10     .apple+10            'struct'
3099   11     .apple+11            'struct'
3100
3101 It's like having a stack of all the union members and looking
3102 through the stack for the shiniest piece you can see. The
3103 beginning of a member (denoted by uppercase letters) is always
3104 shinier than the rest of a member, while padding regions
3105 (denoted by dashes) aren't shiny at all.
3106
3107 Offset   0   1   2   3   4   5   6   7   8   9  10  11
3108 -------------------------------------------------------
3109 apple   (C) (C)  -   -  (S) (s)  s  (s) (T)  -  (-) (-)
3110 grape    G   G  (G)
3111 melon    W   w   w  (w)  P   p  (P)  p   P  (p)  -   -
3112
3113 If you look through that stack from top to bottom, you'll end
3114 up at the parenthesized members.
3115
3116 Alternatively, if you're not only interested in the best
3117 member, you can call "member" in list context, which makes it
3118 return all members referenced by the given offset.
3119
3120 Offset   Member               Type
3121 --------------------------------------
3122    0     .apple.color[0]      'char'
3123          .grape[0]            'char'
3124          .melon.weight        'long'
3125    1     .apple.color[1]      'char'
3126          .grape[1]            'char'
3127          .melon.weight+1      'long'
3128    2     .grape[2]            'char'
3129          .melon.weight+2      'long'
3130          .apple+2             'struct'
3131    3     .melon.weight+3      'long'
3132          .apple+3             'struct'
3133    4     .apple.size          'long'
3134          .melon.price[0]      'short'
3135    5     .apple.size+1        'long'
3136          .melon.price[0]+1    'short'
3137    6     .melon.price[1]      'short'
3138          .apple.size+2        'long'
3139    7     .apple.size+3        'long'
3140          .melon.price[1]+1    'short'
3141    8     .apple.taste         'char'
3142          .melon.price[2]      'short'
3143    9     .melon.price[2]+1    'short'
3144          .apple+9             'struct'
3145   10     .apple+10            'struct'
3146          .melon+10            'struct'
3147   11     .apple+11            'struct'
3148          .melon+11            'struct'
3149
3150 The first member returned is always the best member. The other
3151 members are sorted according to the rules given above. This
3152 means that members referenced without an offset are followed by
3153 members referenced with an offset. Padding regions will be at
3154 the end.
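The difference between the two calling contexts can be sketched as follows. The code assumes the "choice" union shown above, 4-byte alignment, a 4-byte "long" and a 2-byte "short"; the exact member names depend on that configuration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( Alignment => 4,
                                 LongSize  => 4,
                                 ShortSize => 2 )
                          ->parse(<<'ENDC');
union choice {
  struct {
    char color[2];
    long size;
    char taste;
  } apple;
  char grape[3];
  struct {
    long  weight;
    short price[3];
  } melon;
};
ENDC

for my $offset (0 .. $c->sizeof('choice') - 1) {
  my $best = $c->member('choice', $offset);   # scalar context: best member only
  my @all  = $c->member('choice', $offset);   # list context: best member first

  printf "%2d  best: %-18s all: %s\n", $offset, $best, join(', ', @all);
}
```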
3155
3156 If OFFSET is not given in the method call, "member" will return
3157 a list of all possible members of TYPE.
3158
3159 print "$_\n" for $c->member('choice');
3160
3161 This will print:
3162
3163 .apple.color[0]
3164 .apple.color[1]
3165 .apple.size
3166 .apple.taste
3167 .grape[0]
3168 .grape[1]
3169 .grape[2]
3170 .melon.weight
3171 .melon.price[0]
3172 .melon.price[1]
3173 .melon.price[2]
3174
3175 In scalar context, the number of possible members is returned.
3176
3177 tag
3178 "tag" TYPE
3179 "tag" TYPE, TAG
3180 "tag" TYPE, TAG1 => VALUE1, TAG2 => VALUE2, ...
3181 The "tag" method can be used to tag properties to a TYPE. It's
3182 a bit like having "configure" for individual types.
3183
3184 See "USING TAGS" for an example.
3185
3186 Note that while you can tag whole types as well as compound
3187 members, it is not possible to tag array members, i.e. you
3188 cannot treat, for example, "a[1]" and "a[2]" differently.
3189
3190 Also note that in code like this
3191
3192 struct test {
3193 int a;
3194 struct {
3195 int x;
3196 } b, c;
3197 };
3198
3199 if you tag "test.b.x", this will also tag "test.c.x"
3200 implicitly.
3201
3202 It is also possible to tag basic types if you really want to do
3203 that, for example:
3204
3205 $c->tag('int', Format => 'Binary');
3206
3207 To remove a tag from a type, you can either set that tag to
3208 "undef", for example
3209
3210 $c->tag('test', Hooks => undef);
3211
3212 or use "untag".
3213
3214 To see if a tag is attached to a type or to get the value of a
3215 tag, pass only the type and tag name to "tag":
3216
3217 $c->tag('test.a', Format => 'Binary');
3218
3219 $hooks = $c->tag('test.a', 'Hooks');
3220 $format = $c->tag('test.a', 'Format');
3221
3222 This will give you:
3223
3224 $hooks = undef;
3225 $format = 'Binary';
3226
3227 To see which tags are attached to a type, pass only the type.
3228 The "tag" method will now return a hash reference containing
3229 all tags attached to the type:
3230
3231 $tags = $c->tag('test.a');
3232
3233 This will give you:
3234
3235 $tags = {
3236 'Format' => 'Binary'
3237 };
3238
3239 "tag" will throw an exception if an error occurs. If called as
3240 a 'set' method, it will return a reference to its object,
3241 allowing you to chain together consecutive method calls.
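The set, query and remove forms described above can be combined as in the following sketch. It reuses the "struct test" definition from earlier in this section; nothing beyond the calls already shown is assumed.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct test {
  int a;
  struct {
    int x;
  } b, c;
};
ENDC

# Used as a 'set' method, "tag" returns the object, so calls can be chained.
$c->tag('test.a',   Format => 'Binary')
  ->tag('test.b.x', Format => 'Binary');

# Query a single tag, or all tags as a hash reference.
my $format = $c->tag('test.a', 'Format');
my $tags   = $c->tag('test.a');
print "test.a: Format = $format\n";
print "all tags on test.a: ", join(', ', sort keys %$tags), "\n";

# "b" and "c" share one anonymous struct, so the tag shows up on both.
my $shared = $c->tag('test.c.x', 'Format');
print "test.c.x: Format = ", (defined $shared ? $shared : 'undef'), "\n";

# Setting a tag to undef removes it again.
$c->tag('test.a', Format => undef);
my $after = $c->tag('test.a', 'Format');
print "test.a still tagged? ", (defined $after ? 'yes' : 'no'), "\n";
```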
3242
3243 Note that when a compound is inlined, tags attached to the
3244 inlined compound are ignored, for example:
3245
3246 $c->parse(<<ENDC);
3247 struct header {
3248 int id;
3249 int len;
3250 unsigned flags;
3251 };
3252
3253 struct message {
3254 struct header;
3255 short samples[32];
3256 };
3257 ENDC
3258
3259 for my $type (qw( header message header.len )) {
3260 $c->tag($type, Hooks => { unpack => sub { print "unpack: $type\n"; @_ } });
3261 }
3262
3263 for my $type (qw( header message )) {
3264 print "[unpacking $type]\n";
3265 $u = $c->unpack($type, $data);
3266 }
3267
3268 This will print:
3269
3270 [unpacking header]
3271 unpack: header.len
3272 unpack: header
3273 [unpacking message]
3274 unpack: header.len
3275 unpack: message
3276
3277 As you can see from the above output, tags attached to members
3278 of inlined compounds ("header.len") are still handled.
3279
3280 The following tags can be configured:
3281
3282 "Format" => 'Binary' | 'String'
3283 The "Format" tag allows you to control the way binary data
3284 is converted by "pack" and "unpack".
3285
3286 If you tag a "TYPE" as "Binary", it will not be converted
3287 at all, i.e. it will be passed through as a binary string.
3288
3289 If you tag it as "String", it will be treated like a null-
3290 terminated C string, i.e. "unpack" will convert the C
3291 string to a Perl string and vice versa.
3292
3293 See "The Format Tag" for an example.
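A minimal sketch of both formats, assuming a made-up "record" struct with two character arrays:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct record {
  char name[8];
  char blob[4];
};
ENDC

$c->tag('record.name', Format => 'String')   # null-terminated C string
  ->tag('record.blob', Format => 'Binary');  # passed through untouched

my $data = "perl\0\0\0\0" . "\x01\x02\x03\x04";
my $rec  = $c->unpack('record', $data);

printf "name = '%s' (%d characters)\n", $rec->{name}, length $rec->{name};
printf "blob = %s\n", unpack 'H*', $rec->{blob};
```

Without the tags, both members would be unpacked as arrays of integers, one element per "char".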
3294
3295 "ByteOrder" => 'BigEndian' | 'LittleEndian'
3296 The "ByteOrder" tag allows you to explicitly set the byte
3297 order of a TYPE.
3298
3299 See "The ByteOrder Tag" for an example.
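As an illustration, the following sketch mixes two byte orders within one struct. The "mixed" struct and its member names are invented for this example, and a 4-byte "int" is assumed.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( ByteOrder => 'LittleEndian',
                                 IntSize   => 4 )
                          ->parse(<<'ENDC');
struct mixed {
  unsigned int host_value;
  unsigned int wire_value;
};
ENDC

# Only this member deviates from the object's default byte order.
$c->tag('mixed.wire_value', ByteOrder => 'BigEndian');

# Build test data: 258 as little-endian, then 258 as big-endian.
my $data  = pack 'V N', 258, 258;
my $mixed = $c->unpack('mixed', $data);

print "host_value = $mixed->{host_value}, wire_value = $mixed->{wire_value}\n";
```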
3300
3301 "Dimension" => '*'
3302 "Dimension" => VALUE
3303 "Dimension" => MEMBER
3304 "Dimension" => SUB
3305 "Dimension" => [ SUB, ARGS ]
3306 The "Dimension" tag allows you to alter the size of an
3307 array dynamically.
3308
3309 You can tag fixed size arrays as being flexible using '*'.
3310 This is useful if you cannot use flexible array members in
3311 your source code.
3312
3313 $c->tag('type.array', Dimension => '*');
3314
3315 You can also tag an array to have a fixed size different
3316 from the one it was originally declared with.
3317
3318 $c->tag('type.array', Dimension => 42);
3319
3320 If the array is a member of a compound, you can also tag it
3321 to have a size corresponding to the value of another
3322 member in that compound.
3323
3324 $c->tag('type.array', Dimension => 'count');
3325
3326 Finally, you can specify a subroutine that is called when
3327 the size of the array needs to be determined.
3328
3329 $c->tag('type.array', Dimension => \&get_count);
3330
3331 By default, and if the array is a compound member, that
3332 subroutine will be passed a reference to the hash storing
3333 the data for the compound.
3334
3335 You can also instruct Convert::Binary::C to pass additional
3336 arguments to the subroutine by passing an array reference
3337 instead of the subroutine reference. This array contains
3338 the subroutine reference as well as a list of arguments.
3339 It is possible to define certain special arguments using
3340 the "arg" method.
3341
3342 $c->tag('type.array', Dimension => [\&get_count, $c->arg('SELF'), 42]);
3343
3344 See "The Dimension Tag" for various examples.
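The MEMBER and SUB forms might be used like this. The "message" struct is a hypothetical length-prefixed record; note that "count" precedes the array, so its value is already known when the array is unpacked.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct message {
  unsigned char count;
  unsigned char data[1];
};
ENDC

# Size "data" by the value of the sibling member "count".
$c->tag('message.data', Dimension => 'count');

my $msg = $c->unpack('message', "\x03\x0a\x0b\x0c\xff");
print "count = $msg->{count}, data = [@{ $msg->{data} }]\n";

# The same rule as a subroutine. It receives a reference to the
# hash that stores the data of the enclosing compound.
$c->tag('message.data', Dimension => sub { $_[0]{count} });

my $again = $c->unpack('message', "\x02\x0a\x0b\x0c\xff");
print "count = $again->{count}, data = [@{ $again->{data} }]\n";
```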
3345
3346 "Hooks" => { HOOK => SUB, HOOK => [ SUB, ARGS ], ... }, ...
3347 The "Hooks" tag allows you to register subroutines as
3348 hooks.
3349
3350 Hooks are called whenever a certain "TYPE" is packed or
3351 unpacked. Hooks are currently considered an experimental
3352 feature.
3353
3354 "HOOK" can be one of the following:
3355
3356 pack
3357 unpack
3358 pack_ptr
3359 unpack_ptr
3360
3361 "pack" and "unpack" hooks are called when processing their
3362 "TYPE", while "pack_ptr" and "unpack_ptr" hooks are called
3363 when processing pointers to their "TYPE".
3364
3365 "SUB" is a reference to a subroutine that usually takes one
3366 input argument, processes it and returns one output
3367 argument.
3368
3369 Alternatively, you can pass a custom list of arguments to
3370 the hook by using an array reference instead of "SUB" that
3371 holds the subroutine reference in the first element and the
3372 arguments to be passed to the subroutine as the other
3373 elements. This way, you can even pass special arguments to
3374 the hook using the "arg" method.
3375
3376 Here are a few examples for registering hooks:
3377
3378 $c->tag('ObjectType', Hooks => {
3379 pack => \&obj_pack,
3380 unpack => \&obj_unpack
3381 });
3382
3383 $c->tag('ProtocolId', Hooks => {
3384 unpack => sub { $protos[$_[0]] }
3385 });
3386
3387 $c->tag('ProtocolId', Hooks => {
3388 unpack_ptr => [sub {
3389 sprintf "$_[0]:{0x%X}", $_[1]
3390 },
3391 $c->arg('TYPE', 'DATA')
3392 ],
3393 });
3394
3395 Note that the above example registers both an "unpack" hook
3396 and an "unpack_ptr" hook for "ProtocolId" with two separate
3397 calls to "tag". As long as you don't explicitly overwrite a
3398 previously registered hook, it won't be modified or removed
3399 by registering other hooks for the same "TYPE".
3400
3401 To remove all registered hooks for a type, simply remove
3402 the "Hooks" tag:
3403
3404 $c->untag('ProtocolId', 'Hooks');
3405
3406 To remove only a single hook, pass "undef" as "SUB" instead
3407 of a subroutine reference:
3408
3409 $c->tag('ObjectType', Hooks => { pack => undef });
3410
3411 If all hooks are removed, the whole "Hooks" tag is removed.
3412
3413 See "The Hooks Tag" for examples on how to use hooks.
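A small end-to-end sketch of a "pack"/"unpack" hook pair follows. The @protos table, the %proto_id reverse map and the "packet" struct are assumptions made for this example only.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my @protos   = qw( ICMP TCP UDP );                      # hypothetical lookup table
my %proto_id = map { $protos[$_] => $_ } 0 .. $#protos; # name -> number

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned char ProtocolId;

struct packet {
  ProtocolId    proto;
  unsigned char length;
};
ENDC

$c->tag('ProtocolId', Hooks => {
  unpack => sub { $protos[$_[0]] },     # number -> name after unpacking
  pack   => sub { $proto_id{$_[0]} },   # name -> number before packing
});

my $packet = $c->unpack('packet', "\x01\x14");
print "proto = $packet->{proto}, length = $packet->{length}\n";

my $bytes = $c->pack('packet', { proto => 'UDP', length => 8 });
print "packed: ", unpack('H*', $bytes), "\n";

# Clearing every registered hook also drops the "Hooks" tag itself.
$c->tag('ProtocolId', Hooks => { pack => undef, unpack => undef });
```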
3414
3415 untag
3416 "untag" TYPE
3417 "untag" TYPE, TAG1, TAG2, ...
3418 Use the "untag" method to remove one, more, or all tags from a
3419 type. If you don't pass any tag names, all tags attached to the
3420 type will be removed. Otherwise only the listed tags will be
3421 removed.
3422
3423 See "USING TAGS" for an example.
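For a quick illustration, the sketch below attaches two tags to a member of a made-up "record" struct and then removes them in two steps.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct record {
  char name[8];
};
ENDC

$c->tag('record.name', Format => 'String',
                       Hooks  => { unpack => sub { uc $_[0] } });

print "tags: ", join(', ', sort keys %{ $c->tag('record.name') }), "\n";

# Remove only the listed tag; "Format" stays attached.
$c->untag('record.name', 'Hooks');
my $format_kept = $c->tag('record.name', 'Format');
my $hooks_gone  = $c->tag('record.name', 'Hooks');

# Without tag names, everything attached to the type is removed.
$c->untag('record.name');
my $format_gone = $c->tag('record.name', 'Format');

print "after untag: Format is ", (defined $format_gone ? 'set' : 'not set'), "\n";
```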
3424
3425 arg
3426 "arg" 'ARG', ...
3427 Creates placeholders for special arguments to be passed to
3428 hooks or other subroutines. These arguments are currently:
3429
3430 "SELF"
3431 A reference to the calling Convert::Binary::C object. This
3432 may be useful if you need to work with the object inside
3433 the subroutine.
3434
3435 "TYPE"
3436 The name of the type that is currently being processed by
3437 the hook.
3438
3439 "DATA"
3440 The data argument that is passed to the subroutine.
3441
3442 "HOOK"
3443 The type of the hook as which the subroutine has been
3444 called, for example "pack" or "unpack_ptr".
3445
3446 "arg" will return a placeholder for each argument it is being
3447 passed. Note that not all arguments may be supported depending
3448 on the context of the subroutine.
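The following sketch shows how the placeholders turn into real values when a hook runs. The @seen array is just an illustrative way of recording what the hook received.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned char ProtocolId;
ENDC

my @seen;

$c->tag('ProtocolId', Hooks => {
  unpack => [ sub {
                # Arguments arrive in the order given to "arg" below.
                my ($hook, $type, $data) = @_;
                push @seen, "$hook hook for $type got $data";
                return $data;    # pass the value through unchanged
              },
              $c->arg('HOOK', 'TYPE', 'DATA') ],
});

my $value = $c->unpack('ProtocolId', "\x06");

print "$_\n" for @seen;
print "value = $value\n";
```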
3449
3450 dependencies
3451 "dependencies"
3452 After some code has been parsed using either the "parse" or
3453 "parse_file" methods, the "dependencies" method can be used to
3454 retrieve information about all files that the object depends
3455 on, i.e. all files that have been parsed.
3456
3457 In scalar context, the method returns a hash reference. Each
3458 key is the name of a file. The values are again hash
3459 references, each of which holds the size, modification time
3460 (mtime), and change time (ctime) of the file at the moment it
3461 was parsed.
3462
3463 use Convert::Binary::C;
3464 use Data::Dumper;
3465
3466 #----------------------------------------------------------
3467 # Create object, set include path, parse 'string.h' header
3468 #----------------------------------------------------------
3469 my $c = Convert::Binary::C->new
3470 ->Include('/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include',
3471 '/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include-fixed',
3472 '/usr/include')
3473 ->parse_file('string.h');
3474
3475 #----------------------------------------------------------
3476 # Get dependencies of the object, extract dependency files
3477 #----------------------------------------------------------
3478 my $depend = $c->dependencies;
3479 my @files = keys %$depend;
3480
3481 #-----------------------------
3482 # Dump dependencies and files
3483 #-----------------------------
3484 print Data::Dumper->Dump([$depend, \@files],
3485 [qw( depend *files )]);
3486
3487 The above code would print something like this:
3488
3489 $depend = {
3490 '/usr/include/features.h' => {
3491 'ctime' => 1300268052,
3492 'mtime' => 1300267911,
3493 'size' => 12511
3494 },
3495 '/usr/include/gnu/stubs-32.h' => {
3496 'ctime' => 1300268051,
3497 'mtime' => 1300268010,
3498 'size' => 624
3499 },
3500 '/usr/include/sys/cdefs.h' => {
3501 'ctime' => 1300268051,
3502 'mtime' => 1300267957,
3503 'size' => 13195
3504 },
3505 '/usr/include/gnu/stubs.h' => {
3506 'ctime' => 1300268051,
3507 'mtime' => 1300267911,
3508 'size' => 315
3509 },
3510 '/usr/include/string.h' => {
3511 'ctime' => 1300268052,
3512 'mtime' => 1300267944,
3513 'size' => 22572
3514 },
3515 '/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include/stddef.h' => {
3516 'ctime' => 1300365679,
3517 'mtime' => 1300363914,
3518 'size' => 12542
3519 },
3520 '/usr/include/bits/wordsize.h' => {
3521 'ctime' => 1300268051,
3522 'mtime' => 1300267937,
3523 'size' => 873
3524 },
3525 '/usr/include/xlocale.h' => {
3526 'ctime' => 1300268051,
3527 'mtime' => 1300267915,
3528 'size' => 1764
3529 }
3530 };
3531 @files = (
3532 '/usr/include/features.h',
3533 '/usr/include/gnu/stubs-32.h',
3534 '/usr/include/sys/cdefs.h',
3535 '/usr/include/gnu/stubs.h',
3536 '/usr/include/string.h',
3537 '/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/include/stddef.h',
3538 '/usr/include/bits/wordsize.h',
3539 '/usr/include/xlocale.h'
3540 );
3541
3542 In list context, the method returns the names of all files that
3543 have been parsed, i.e. the following lines are equivalent:
3544
3545 @files = keys %{$c->dependencies};
3546 @files = $c->dependencies;
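One possible use of this information is to decide whether previously parsed files have changed on disk. The following is a sketch only: "is_stale" is a hypothetical helper, not part of the module, and the temporary header exists purely to keep the example self-contained.

```perl
use strict;
use warnings;
use Convert::Binary::C;
use File::Temp qw( tempfile );

# Write a small throwaway header.
my ($fh, $header) = tempfile(SUFFIX => '.h', UNLINK => 1);
print {$fh} "struct point { int x; int y; };\n";
close $fh;

my $c      = Convert::Binary::C->new->parse_file($header);
my $depend = $c->dependencies;          # scalar context: hash reference

# A file counts as changed if it vanished or its size or mtime differ
# from the values recorded at parse time.
sub is_stale {
  my ($depend) = @_;
  for my $file (keys %$depend) {
    my @stat = stat $file or return 1;
    return 1 if $stat[7] != $depend->{$file}{size}
             || $stat[9] != $depend->{$file}{mtime};
  }
  return 0;
}

my $before = is_stale($depend);
print "stale right after parsing: $before\n";

# Appending to the header changes its size.
open $fh, '>>', $header or die "cannot append to $header: $!";
print {$fh} "struct extra { char c; };\n";
close $fh;

my $after = is_stale($depend);
print "stale after editing: $after\n";
```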
3547
3548 sourcify
3549 "sourcify"
3550 "sourcify" CONFIG
3551 Returns a string that holds the C source code necessary to
3552 represent all parsed C data structures.
3553
3554 use Convert::Binary::C;
3555
3556 $c = new Convert::Binary::C;
3557 $c->parse(<<'END');
3558
3559 #define ADD(a, b) ((a) + (b))
3560 #define NUMBER 42
3561
3562 typedef struct _mytype mytype;
3563
3564 struct _mytype {
3565 union {
3566 int iCount;
3567 enum count *pCount;
3568 } counter;
3569 #pragma pack( push, 1 )
3570 struct {
3571 char string[NUMBER];
3572 int array[NUMBER/sizeof(int)];
3573 } storage;
3574 #pragma pack( pop )
3575 mytype *next;
3576 };
3577
3578 enum count { ZERO, ONE, TWO, THREE };
3579
3580 END
3581
3582 print $c->sourcify;
3583
3584 The above code would print something like this:
3585
3586 /* typedef predeclarations */
3587
3588 typedef struct _mytype mytype;
3589
3590 /* defined enums */
3591
3592 enum count
3593 {
3594 ZERO,
3595 ONE,
3596 TWO,
3597 THREE
3598 };
3599
3600
3601 /* defined structs and unions */
3602
3603 struct _mytype
3604 {
3605 union
3606 {
3607 int iCount;
3608 enum count *pCount;
3609 } counter;
3610 #pragma pack(push, 1)
3611 struct
3612 {
3613 char string[42];
3614 int array[10];
3615 } storage;
3616 #pragma pack(pop)
3617 mytype *next;
3618 };
3619
3620 The purpose of the "sourcify" method is to enable some kind of
3621 platform-independent caching. The C code generated by
3622 "sourcify" can be parsed by any standard C compiler, as well as
3623 of course by the Convert::Binary::C parser. However, the code
3624 may be significantly shorter than the code that was originally
3625 parsed.
3626
3627 When parsing a typical header file, it's easily possible that
3628 you need to open dozens of other files that are included from
3629 that file, and end up parsing several hundred kilobytes of C
3630 code. Since most of it is usually preprocessor directives,
3631 function prototypes and comments, the "sourcify" function
3632 strips this down to a few kilobytes. Saving the "sourcify"
3633 string and parsing it next time instead of the original code
3634 may be a lot faster.
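The caching idea can be sketched as follows. For brevity the "sourcify" string is kept in a variable instead of being written to a cache file, and the %config hash and the "storage" struct are assumptions of this example. Both objects must share the same configuration for the layouts to match.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my %config = ( Alignment => 4, IntSize => 4, PointerSize => 4 );

my $original = Convert::Binary::C->new(%config)->parse(<<'ENDC');
#define NUMBER 42

/* Comments and prototypes add bulk that sourcify drops. */
int unused_prototype(int, char **);

struct storage {
  char string[NUMBER];
  int  array[NUMBER/sizeof(int)];
};
ENDC

# In a real program, this string would be saved and reloaded later.
my $cached = $original->sourcify;

# Parse the condensed code with a fresh, identically configured object.
my $restored = Convert::Binary::C->new(%config)->parse($cached);

printf "sizeof(struct storage): original %d, restored %d\n",
  $original->sizeof('storage'), $restored->sizeof('storage');
```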
3635
3636 The "sourcify" method takes a hash reference as an optional
3637 argument. It can be used to tweak the method's output. The
3638 following options can be configured.
3639
3640 "Context" => 0 | 1
3641 Turns preprocessor context information on or off. If this
3642 is turned on, "sourcify" will insert "#line" preprocessor
3643 directives in its output. So in the above example
3644
3645 print $c->sourcify({ Context => 1 });
3646
3647 would print:
3648
3649 /* typedef predeclarations */
3650
3651 typedef struct _mytype mytype;
3652
3653 /* defined enums */
3654
3655
3656 #line 21 "[buffer]"
3657 enum count
3658 {
3659 ZERO,
3660 ONE,
3661 TWO,
3662 THREE
3663 };
3664
3665
3666 /* defined structs and unions */
3667
3668
3669 #line 7 "[buffer]"
3670 struct _mytype
3671 {
3672 #line 8 "[buffer]"
3673 union
3674 {
3675 int iCount;
3676 enum count *pCount;
3677 } counter;
3678 #pragma pack(push, 1)
3679 #line 13 "[buffer]"
3680 struct
3681 {
3682 char string[42];
3683 int array[10];
3684 } storage;
3685 #pragma pack(pop)
3686 mytype *next;
3687 };
3688
3689 Note that "[buffer]" refers to the here-doc buffer when
3690 using "parse".
3691
3692 "Defines" => 0 | 1
3693 Turn this on if you want all the defined macros to be part
3694 of the source code output. Given the example code above
3695
3696 print $c->sourcify({ Defines => 1 });
3697
3698 would print:
3699
3700 /* typedef predeclarations */
3701
3702 typedef struct _mytype mytype;
3703
3704 /* defined enums */
3705
3706 enum count
3707 {
3708 ZERO,
3709 ONE,
3710 TWO,
3711 THREE
3712 };
3713
3714
3715 /* defined structs and unions */
3716
3717 struct _mytype
3718 {
3719 union
3720 {
3721 int iCount;
3722 enum count *pCount;
3723 } counter;
3724 #pragma pack(push, 1)
3725 struct
3726 {
3727 char string[42];
3728 int array[10];
3729 } storage;
3730 #pragma pack(pop)
3731 mytype *next;
3732 };
3733
3734 /* preprocessor defines */
3735
3736 #define ADD(a, b) ((a) + (b))
3737 #define NUMBER 42
3738
3739 The macro definitions always appear at the end of the
3740 source code. The order of the macro definitions is
3741 undefined.
3742
3743 The following methods can be used to retrieve information about the
3744 definitions that have been parsed. The examples given in the
3745 description for "enum", "compound" and "typedef" all assume this piece
3746 of C code has been parsed:
3747
3748 #define ABC_SIZE 2
3749 #define MULTIPLY(x, y) ((x)*(y))
3750
3751 #ifdef ABC_SIZE
3752 # define DEFINED
3753 #else
3754 # define NOT_DEFINED
3755 #endif
3756
3757 typedef unsigned long U32;
3758 typedef void *any;
3759
3760 enum __socket_type
3761 {
3762 SOCK_STREAM = 1,
3763 SOCK_DGRAM = 2,
3764 SOCK_RAW = 3,
3765 SOCK_RDM = 4,
3766 SOCK_SEQPACKET = 5,
3767 SOCK_PACKET = 10
3768 };
3769
3770 struct STRUCT_SV {
3771 void *sv_any;
3772 U32 sv_refcnt;
3773 U32 sv_flags;
3774 };
3775
3776 typedef union {
3777 int abc[ABC_SIZE];
3778 struct xxx {
3779 int a;
3780 int b;
3781 } ab[3][4];
3782 any ptr;
3783 } test;
3784
3785 enum_names
3786 "enum_names"
3787 Returns a list of identifiers of all defined enumeration
3788 objects. Enumeration objects don't necessarily have an
3789 identifier, so something like
3790
3791 enum { A, B, C };
3792
3793 will obviously not appear in the list returned by the
3794 "enum_names" method. Also, enumerations that are not defined
3795 within the source code - like in
3796
3797 struct foo {
3798 enum weekday *pWeekday;
3799 unsigned long year;
3800 };
3801
3802 where only a pointer to the "weekday" enumeration object is
3803 used - will not be returned, even though they have an
3804 identifier. So for the above two enumerations, "enum_names"
3805 will return an empty list:
3806
3807 @names = $c->enum_names;
3808
3809 The only way to retrieve a list of all enumeration identifiers
3810 is to use the "enum" method without additional arguments. You
3811 can get a list of all enumeration objects that have an
3812 identifier by using
3813
3814 @enums = map { $_->{identifier} || () } $c->enum;
3815
3816 but these may not have a definition. Thus, the two arrays would
3817 look like this:
3818
3819 @names = ();
3820 @enums = ('weekday');
3821
3822 The "def" method returns a true value for all identifiers
3823 returned by "enum_names".
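To see the distinction in one place, the sketch below parses the two enumerations discussed above plus a third, fully defined one. The "month" enumeration is added here only for contrast.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
enum { A, B, C };                 /* definition, but no identifier */

struct foo {
  enum weekday *pWeekday;         /* identifier, but no definition */
  unsigned long year;
};

enum month { JAN, FEB, MAR };     /* identifier and definition */
ENDC

my @names = $c->enum_names;
my @enums = map { $_->{identifier} || () } $c->enum;

print "enum_names:      @names\n";
print "with identifier: @{[ sort @enums ]}\n";

# Everything returned by "enum_names" is known to "def".
print "$_ is defined\n" for grep { $c->def($_) } @names;
```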
3824
3825 enum
3826 "enum"
3827 "enum" LIST
3828 Returns a list of references to hashes containing detailed
3829 information about all enumerations that have been parsed.
3830
3831 If a list of enumeration identifiers is passed to the method,
3832 the returned list will only contain hash references for those
3833 enumerations. The enumeration identifiers may optionally be
3834 prefixed by "enum".
3835
3836 If an enumeration identifier cannot be found, the returned list
3837 will contain an undefined value at that position.
3838
3839 In scalar context, the number of enumerations will be returned
3840 as long as the number of arguments to the method call is not 1.
3841 In the latter case, a hash reference holding information for
3842 the enumeration will be returned.
3843
3844 The list returned by the "enum" method looks similar to this:
3845
3846 @enum = (
3847 {
3848 'enumerators' => {
3849 'SOCK_STREAM' => 1,
3850 'SOCK_RAW' => 3,
3851 'SOCK_SEQPACKET' => 5,
3852 'SOCK_RDM' => 4,
3853 'SOCK_PACKET' => 10,
3854 'SOCK_DGRAM' => 2
3855 },
3856 'identifier' => '__socket_type',
3857 'context' => 'definitions.c(13)',
3858 'size' => 4,
3859 'sign' => 0
3860 }
3861 );
3862
3863 "identifier"
3864 holds the enumeration identifier. This key is not present
3865 if the enumeration has no identifier.
3866
3867 "context"
3868 is the context in which the enumeration is defined. This is
3869 the filename followed by the line number in parentheses.
3870
3871 "enumerators"
3872 is a reference to a hash table that holds all enumerators
3873 of the enumeration.
3874
3875 "sign"
3876 is a boolean indicating if the enumeration is signed (i.e.
3877 has negative values).
3878
3879 One useful application may be to create a hash table that holds
3880 all enumerators of all defined enumerations:
3881
3882 %enum = map %{ $_->{enumerators} || {} }, $c->enum;
3883
3884 The %enum hash table would then be:
3885
3886 %enum = (
3887 'SOCK_STREAM' => 1,
3888 'SOCK_RAW' => 3,
3889 'SOCK_SEQPACKET' => 5,
3890 'SOCK_RDM' => 4,
3891 'SOCK_DGRAM' => 2,
3892 'SOCK_PACKET' => 10
3893 );
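The opposite direction, from value to name, may be just as handy when decoding data. The sketch below builds such a reverse table; %name_of is an illustrative name, and the approach assumes that no two enumerators share a value.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
enum __socket_type
{
  SOCK_STREAM    = 1,
  SOCK_DGRAM     = 2,
  SOCK_RAW       = 3,
  SOCK_RDM       = 4,
  SOCK_SEQPACKET = 5,
  SOCK_PACKET    = 10
};
ENDC

# Exactly one argument in scalar context yields a single hash reference.
my $info = $c->enum('__socket_type');

# Swap keys and values of the enumerators hash.
my %name_of = reverse %{ $info->{enumerators} };

printf "%2d => %s\n", $_, $name_of{$_}
  for sort { $a <=> $b } keys %name_of;
```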
3894
3895 compound_names
3896 "compound_names"
3897 Returns a list of identifiers of all structs and unions
3898 (compound data structures) that are defined in the parsed
3899 source code. Like enumerations, compounds don't need to have an
3900 identifier, nor do they need to be defined.
3901
3902 Again, the only way to retrieve information about all struct
3903 and union objects is to call the "compound" method without
3904 any arguments. If you need a list of all struct
3905 and union identifiers, you can use:
3906
3907 @compound = map { $_->{identifier} || () } $c->compound;
3908
3909 The "def" method returns a true value for all identifiers
3910 returned by "compound_names".
3911
3912 If you need the names of only the structs or only the unions,
3913 use the "struct_names" and "union_names" methods respectively.
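How the three name methods relate can be sketched like this. The type names are made up for the example; the typedef'd union has no identifier of its own and is therefore expected to be missing from all three lists.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct point  { int x; int y; };
union  number { int i; float f; };
typedef union { char c; short s; } anon;   /* no identifier */
ENDC

my @compounds = $c->compound_names;
my @structs   = $c->struct_names;
my @unions    = $c->union_names;

print "compounds: @{[ sort @compounds ]}\n";
print "structs:   @structs\n";
print "unions:    @unions\n";
```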
3914
3915 compound
3916 "compound"
3917 "compound" LIST
3918 Returns a list of references to hashes containing detailed
3919 information about all compounds (structs and unions) that have
3920 been parsed.
3921
3922 If a list of struct/union identifiers is passed to the method,
3923 the returned list will only contain hash references for those
3924 compounds. The identifiers may optionally be prefixed by
3925 "struct" or "union", which limits the search to the specified
3926 kind of compound.
3927
3928 If an identifier cannot be found, the returned list will
3929 contain an undefined value at that position.
3930
3931 In scalar context, the number of compounds will be returned as
3932 long as the number of arguments to the method call is not 1. In
3933 the latter case, a hash reference holding information for the
3934 compound will be returned.
3935
3936 The list returned by the "compound" method looks similar to
3937 this:
3938
3939 @compound = (
3940 {
3941 'identifier' => 'STRUCT_SV',
3942 'align' => 1,
3943 'context' => 'definitions.c(23)',
3944 'pack' => 0,
3945 'type' => 'struct',
3946 'declarations' => [
3947 {
3948 'declarators' => [
3949 {
3950 'declarator' => '*sv_any',
3951 'size' => 4,
3952 'offset' => 0
3953 }
3954 ],
3955 'type' => 'void'
3956 },
3957 {
3958 'declarators' => [
3959 {
3960 'declarator' => 'sv_refcnt',
3961 'size' => 4,
3962 'offset' => 4
3963 }
3964 ],
3965 'type' => 'U32'
3966 },
3967 {
3968 'declarators' => [
3969 {
3970 'declarator' => 'sv_flags',
3971 'size' => 4,
3972 'offset' => 8
3973 }
3974 ],
3975 'type' => 'U32'
3976 }
3977 ],
3978 'size' => 12
3979 },
3980 {
3981 'identifier' => 'xxx',
3982 'align' => 1,
3983 'context' => 'definitions.c(31)',
3984 'pack' => 0,
3985 'type' => 'struct',
3986 'declarations' => [
3987 {
3988 'declarators' => [
3989 {
3990 'declarator' => 'a',
3991 'size' => 4,
3992 'offset' => 0
3993 }
3994 ],
3995 'type' => 'int'
3996 },
3997 {
3998 'declarators' => [
3999 {
4000 'declarator' => 'b',
4001 'size' => 4,
4002 'offset' => 4
4003 }
4004 ],
4005 'type' => 'int'
4006 }
4007 ],
4008 'size' => 8
4009 },
4010 {
4011 'align' => 1,
4012 'context' => 'definitions.c(29)',
4013 'pack' => 0,
4014 'type' => 'union',
4015 'declarations' => [
4016 {
4017 'declarators' => [
4018 {
4019 'declarator' => 'abc[2]',
4020 'size' => 8,
4021 'offset' => 0
4022 }
4023 ],
4024 'type' => 'int'
4025 },
4026 {
4027 'declarators' => [
4028 {
4029 'declarator' => 'ab[3][4]',
4030 'size' => 96,
4031 'offset' => 0
4032 }
4033 ],
4034 'type' => 'struct xxx'
4035 },
4036 {
4037 'declarators' => [
4038 {
4039 'declarator' => 'ptr',
4040 'size' => 4,
4041 'offset' => 0
4042 }
4043 ],
4044 'type' => 'any'
4045 }
4046 ],
4047 'size' => 96
4048 }
4049 );
4050
4051 "identifier"
4052 holds the struct or union identifier. This key is not
4053 present if the compound has no identifier.
4054
4055 "context"
4056 is the context in which the struct or union is defined.
4057 This is the filename followed by the line number in
4058 parentheses.
4059
4060 "type"
4061 is either 'struct' or 'union'.
4062
4063 "size"
4064 is the size of the struct or union.
4065
4066 "align"
4067 is the alignment of the struct or union.
4068
4069 "pack"
4070 is the struct member alignment if the compound is packed,
4071 or zero otherwise.
4072
4073 "declarations"
4074 is an array of hash references describing each struct
4075 declaration:
4076
4077 "type"
4078 is the type of the struct declaration. This may be a
4079 string or a reference to a hash describing the type.
4080
4081 "declarators"
4082 is an array of hashes describing each declarator:
4083
4084 "declarator"
4085 is a string representation of the declarator.
4086
4087 "offset"
4088 is the offset of the struct member represented by
4089 the current declarator relative to the beginning of
4090 the struct or union.
4091
4092 "size"
4093 is the size occupied by the struct member
4094 represented by the current declarator.
4095
4096 It may be useful to have separate lists for structs and unions.
4097 One way to retrieve such lists would be to use
4098
4099 push @{$_->{type} eq 'union' ? \@unions : \@structs}, $_
4100 for $c->compound;
4101
4102 However, it is a lot simpler to use the "struct" and "union"
4103 methods:
4104
4105 @structs = $c->struct;
4106 @unions = $c->union;
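The nested hashes described above lend themselves to simple layout reports. The following sketch walks the "declarations" and "declarators" of "STRUCT_SV", assuming 4-byte pointers and a 4-byte "long"; the output format is arbitrary.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( PointerSize => 4, LongSize => 4 )
                          ->parse(<<'ENDC');
typedef unsigned long U32;

struct STRUCT_SV {
  void *sv_any;
  U32   sv_refcnt;
  U32   sv_flags;
};
ENDC

# One argument in scalar context: a single hash reference.
my $sv = $c->struct('STRUCT_SV');

printf "%s %s (size %d)\n", $sv->{type}, $sv->{identifier}, $sv->{size};

for my $declaration (@{ $sv->{declarations} }) {
  # "type" is a plain string here, but may also be a hash reference.
  my $type = ref $declaration->{type} ? $declaration->{type}{type}
                                      : $declaration->{type};

  for my $declarator (@{ $declaration->{declarators} }) {
    printf "  offset %2d, %d bytes: %s %s\n",
      $declarator->{offset}, $declarator->{size},
      $type, $declarator->{declarator};
  }
}
```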
4107
4108 struct_names
4109 "struct_names"
4110 Returns a list of all defined struct identifiers. This is
4111 equivalent to calling "compound_names", just that it only
4112 returns the names of the struct identifiers and doesn't return
4113 the names of the union identifiers.
4114
4115 struct
4116 "struct"
4117 "struct" LIST
4118 Like the "compound" method, but only allows for structs.
4119
4120 union_names
4121 "union_names"
4122 Returns a list of all defined union identifiers. This is
4123 equivalent to calling "compound_names", just that it only
4124 returns the names of the union identifiers and doesn't return
4125 the names of the struct identifiers.
4126
4127 union
4128 "union"
4129 "union" LIST
4130 Like the "compound" method, but only allows for unions.
4131
4132 typedef_names
4133 "typedef_names"
4134 Returns a list of all defined typedef identifiers. Typedefs
4135 that do not specify a type that you could actually work with
4136 will not be returned.
4137
4138 The "def" method returns a true value for all identifiers
4139 returned by "typedef_names".
4140
4141 typedef
4142 "typedef"
4143 "typedef" LIST
4144 Returns a list of references to hashes containing detailed
4145 information about all typedefs that have been parsed.
4146
4147 If a list of typedef identifiers is passed to the method, the
4148 returned list will only contain hash references for those
4149 typedefs.
4150
4151 If an identifier cannot be found, the returned list will
4152 contain an undefined value at that position.
4153
 In scalar context, the method returns the number of typedefs
 unless exactly one argument is passed. With exactly one
 argument, it returns a hash reference holding the information
 for that typedef.
4158
4159 The list returned by the "typedef" method looks similar to
4160 this:
4161
4162 @typedef = (
4163 {
4164 'declarator' => 'U32',
4165 'type' => 'unsigned long'
4166 },
4167 {
4168 'declarator' => '*any',
4169 'type' => 'void'
4170 },
4171 {
4172 'declarator' => 'test',
4173 'type' => {
4174 'align' => 1,
4175 'context' => 'definitions.c(29)',
4176 'pack' => 0,
4177 'type' => 'union',
4178 'declarations' => [
4179 {
4180 'declarators' => [
4181 {
4182 'declarator' => 'abc[2]',
4183 'size' => 8,
4184 'offset' => 0
4185 }
4186 ],
4187 'type' => 'int'
4188 },
4189 {
4190 'declarators' => [
4191 {
4192 'declarator' => 'ab[3][4]',
4193 'size' => 96,
4194 'offset' => 0
4195 }
4196 ],
4197 'type' => 'struct xxx'
4198 },
4199 {
4200 'declarators' => [
4201 {
4202 'declarator' => 'ptr',
4203 'size' => 4,
4204 'offset' => 0
4205 }
4206 ],
4207 'type' => 'any'
4208 }
4209 ],
4210 'size' => 96
4211 }
4212 }
4213 );
4214
4215 "declarator"
4216 is the type declarator.
4217
4218 "type"
 is the type specification. This may be a string or a
 reference to a hash describing the type. See "enum" and
 "compound" for a description of how to interpret this hash.
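The different calling conventions of "typedef" can be sketched as follows. The two typedefs are taken from the sample output above; the identifier "no_such_type" is deliberately left undefined to show the undefined entry in the result list.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned long U32;
typedef void *any;
ENDC

# Scalar context, no arguments: the number of typedefs.
my $count = $c->typedef;

# Scalar context, exactly one argument: a hash reference for that typedef.
my $u32 = $c->typedef('U32');

# List context: one entry per requested identifier, undef if unknown.
# 'no_such_type' is intentionally not defined anywhere.
my @info = $c->typedef('any', 'no_such_type');

print "$count typedefs\n";
print "$u32->{declarator} is $u32->{type}\n";
print "any is declared as $info[0]{declarator}\n";
```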
4222
4223 macro_names
4224 "macro_names"
4225 Returns a list of all defined macro names.
4226
4227 The list returned by the "macro_names" method looks similar to
4228 this:
4229
4230 @macro_names = (
4231 '__STDC_VERSION__',
4232 '__STDC_HOSTED__',
4233 'DEFINED',
4234 'MULTIPLY',
4235 'ABC_SIZE'
4236 );
4237
4238 This works only as long as the preprocessor is not reset. See
4239 "Preprocessor configuration" for details.
4240
4241 macro
4242 "macro"
4243 "macro" LIST
4244 Returns the definitions for all defined macros.
4245
4246 If a list of macro names is passed to the method, the returned
4247 list will only contain the definitions for those macros. For
4248 undefined macros, "undef" will be returned.
4249
4250 The list returned by the "macro" method looks similar to this:
4251
4252 @macro = (
4253 '__STDC_VERSION__ 199901L',
4254 '__STDC_HOSTED__ 1',
4255 'DEFINED',
4256 'MULTIPLY(x, y) ((x)*(y))',
4257 'ABC_SIZE 2'
4258 );
4259
4260 This works only as long as the preprocessor is not reset. See
4261 "Preprocessor configuration" for details.
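A minimal sketch of how "macro_names" and "macro" might be used together, querying right after parsing so that the preprocessor has not been reset. The macro "NOT_DEFINED" is intentionally missing to show the "undef" result.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
#define ABC_SIZE 2
#define MULTIPLY(x, y) ((x)*(y))
ENDC

# Look up specific macros; NOT_DEFINED is deliberately not defined.
my ($size, $mul, $missing) = $c->macro(qw( ABC_SIZE MULTIPLY NOT_DEFINED ));

# Build a lookup table of every macro name the preprocessor knows.
my %is_macro = map { $_ => 1 } $c->macro_names;

print "$size\n";
print "$mul\n";
print "NOT_DEFINED is ", (defined $missing ? 'defined' : 'undefined'), "\n";
```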
4262
FUNCTIONS
4264 You can alternatively call the following functions as methods on
4265 Convert::Binary::C objects.
4266
4267 feature
4268 "feature" STRING
4269 Checks if Convert::Binary::C was built with certain features.
4270 For example,
4271
4272 print "debugging version"
4273 if Convert::Binary::C::feature('debug');
4274
4275 will check if Convert::Binary::C was built with debugging
4276 support enabled. The "feature" function returns 1 if the
4277 feature is enabled, 0 if the feature is disabled, and "undef"
4278 if the feature is unknown. Currently the only features that can
4279 be checked are "ieeefp" and "debug".
4280
4281 You can enable or disable certain features at compile time of
4282 the module by using the
4283
4284 perl Makefile.PL enable-feature disable-feature
4285
4286 syntax.
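The three possible return values of "feature" can be told apart as in the following sketch. The name "no-such-feature" is made up here purely to demonstrate the "undef" result for unknown features.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# 'no-such-feature' is a made-up name to demonstrate the undef result.
my %state;
for my $name (qw( ieeefp debug no-such-feature )) {
    my $enabled = Convert::Binary::C::feature($name);
    $state{$name} = !defined $enabled ? 'unknown'
                  : $enabled          ? 'enabled'
                  :                     'disabled';
    print "$name: $state{$name}\n";
}
```

Checking for definedness first matters, because both 0 and "undef" are false in a plain boolean test.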
4287
4288 native
4289 "native"
4290 "native" STRING
4291 Returns the value of a property of the native system that
4292 Convert::Binary::C was built on. For example,
4293
4294 $size = Convert::Binary::C::native('IntSize');
4295
4296 will fetch the size of an "int" on the native system. The
4297 following properties can be queried:
4298
4299 Alignment
4300 ByteOrder
4301 CharSize
4302 CompoundAlignment
4303 DoubleSize
4304 EnumSize
4305 FloatSize
4306 HostedC
4307 IntSize
4308 LongDoubleSize
4309 LongLongSize
4310 LongSize
4311 PointerSize
4312 ShortSize
4313 StdCVersion
4314 UnsignedBitfields
4315 UnsignedChars
4316
4317 You can also call "native" without arguments, in which case it
4318 will return a reference to a hash with all properties, like:
4319
4320 $native = {
4321 'StdCVersion' => undef,
4322 'ByteOrder' => 'LittleEndian',
4323 'LongSize' => 4,
4324 'IntSize' => 4,
4325 'HostedC' => 1,
4326 'ShortSize' => 2,
4327 'UnsignedChars' => 0,
4328 'DoubleSize' => 8,
4329 'CharSize' => 1,
4330 'EnumSize' => 4,
4331 'PointerSize' => 4,
4332 'FloatSize' => 4,
4333 'LongLongSize' => 8,
4334 'Alignment' => 4,
4335 'LongDoubleSize' => 12,
4336 'UnsignedBitfields' => 0,
4337 'CompoundAlignment' => 1
4338 };
4339
 The contents of that hash can be passed directly to the
 "configure" method.
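A short sketch of both calling styles, and of feeding the property hash back into "configure". It assumes that "sizeof" accepts the basic type "int" without any code having been parsed.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Fetch all native properties at once, and a single one by name.
my $native   = Convert::Binary::C::native();
my $int_size = Convert::Binary::C::native('IntSize');

# Configure an object explicitly with the native settings.
my $c = Convert::Binary::C->new;
$c->configure(%$native);

print "native int size: $int_size\n";
print "byte order:      $native->{ByteOrder}\n";
print "sizeof(int):     ", $c->sizeof('int'), "\n";
```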
4342
DEBUGGING
 Like perl itself, Convert::Binary::C can be compiled with debugging
 support that can then be selectively enabled at runtime. You can
 specify whether you would like to build Convert::Binary::C with
 debugging support by explicitly giving an argument to Makefile.PL. Use
4348
4349 perl Makefile.PL enable-debug
4350
4351 to enable debugging, or
4352
4353 perl Makefile.PL disable-debug
4354
4355 to disable debugging. The default will depend on how your perl binary
4356 was built. If it was built with "-DDEBUGGING", Convert::Binary::C will
4357 be built with debugging support, too.
4358
4359 Once you have built Convert::Binary::C with debugging support, you can
4360 use the following syntax to enable debug output. Instead of
4361
4362 use Convert::Binary::C;
4363
4364 you simply say
4365
4366 use Convert::Binary::C debug => 'all';
4367
 which will enable all debug output. However, I don't recommend
 enabling all debug output, because it can be a fairly large amount.
4370
4371 Debugging options
4372 Instead of saying "all", you can pass a string that consists of one or
4373 more of the following characters:
4374
4375 m enable memory allocation tracing
4376 M enable memory allocation & assertion tracing
4377
4378 h enable hash table debugging
4379 H enable hash table dumps
4380
4381 d enable debug output from the XS module
4382 c enable debug output from the ctlib
4383 t enable debug output about type objects
4384
4385 l enable debug output from the C lexer
4386 p enable debug output from the C parser
4387 P enable debug output from the C preprocessor
4388 r enable debug output from the #pragma parser
4389
4390 y enable debug output from yacc (bison)
4391
4392 So the following might give you a brief overview of what's going on
4393 inside Convert::Binary::C:
4394
4395 use Convert::Binary::C debug => 'dct';
4396
4397 When you want to debug memory allocation using
4398
4399 use Convert::Binary::C debug => 'm';
4400
4401 you can use the Perl script check_alloc.pl that resides in the
4402 ctlib/util/tool directory to extract statistics about memory usage and
4403 information about memory leaks from the resulting debug output.
4404
4405 Redirecting debug output
4406 By default, all debug output is written to "stderr". You can, however,
4407 redirect the debug output to a file with the "debugfile" option:
4408
4409 use Convert::Binary::C debug => 'dcthHm',
4410 debugfile => './debug.out';
4411
 If the file cannot be opened, you'll receive a warning and the output
 will be written to "stderr" instead.
4414
4415 Alternatively, you can use the environment variables "CBC_DEBUG_OPT"
4416 and "CBC_DEBUG_FILE" to turn on debug output.
4417
4418 If Convert::Binary::C is built without debugging support, passing the
4419 "debug" or "debugfile" options will cause a warning to be issued. The
4420 corresponding environment variables will simply be ignored.
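If a script has to run with builds both with and without debugging support, one possible approach is to check the "debug" feature first and only then request debug output. The sketch below assumes that calling "import" at runtime behaves like passing the same options on the "use" line; the file name is just an example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Only request debug output if the module was built with debugging
# support; otherwise passing the options would trigger a warning.
my $have_debug = Convert::Binary::C::feature('debug');

if ($have_debug) {
    # Assumption: import() is what "use Convert::Binary::C debug => ..."
    # ends up calling. './debug.out' is only an example path.
    Convert::Binary::C->import(debug => 'dct', debugfile => './debug.out');
}

print "debugging support: ", ($have_debug ? 'yes' : 'no'), "\n";
```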
4421
ENVIRONMENT
4423 "CBC_ORDER_MEMBERS"
4424 Setting this variable to a non-zero value will globally turn on hash
4425 key ordering for compound members. Have a look at the "OrderMembers"
4426 option for details.
4427
 Setting the variable to the name of a perl module additionally makes
 Convert::Binary::C tie the member hashes to that module instead of one
 of the predefined member-ordering modules.
4431
4432 "CBC_DEBUG_OPT"
4433 If Convert::Binary::C is built with debugging support, you can use this
4434 variable to specify the debugging options.
4435
4436 "CBC_DEBUG_FILE"
4437 If Convert::Binary::C is built with debugging support, you can use this
4438 variable to redirect the debug output to a file.
4439
4440 "CBC_DISABLE_PARSER"
4441 This variable is intended purely for development. Setting it to a non-
4442 zero value disables the Convert::Binary::C parser, which means that no
4443 information is collected from the file or code that is parsed. However,
4444 the preprocessor will run, which is useful for benchmarking the
4445 preprocessor.
4446
FLEXIBLE ARRAY MEMBERS AND INCOMPLETE TYPES
4448 Flexible array members are a feature introduced with ISO-C99. It's a
4449 common problem that you have a variable length data field at the end of
4450 a structure, for example an array of characters at the end of a message
4451 struct. ISO-C99 allows you to write this as:
4452
4453 struct message {
4454 long header;
4455 char data[];
4456 };
4457
4458 The advantage is that you clearly indicate that the size of the
4459 appended data is variable, and that the "data" member doesn't
4460 contribute to the size of the "message" structure.
4461
 When packing or unpacking data, Convert::Binary::C deals with flexible
 array members as if their length were adjustable. For example, "unpack"
 will adapt the length of the array depending on the input string:
4465
4466 $msg1 = $c->unpack('message', 'abcdefg');
4467 $msg2 = $c->unpack('message', 'abcdefghijkl');
4468
4469 The following data is unpacked:
4470
4471 $msg1 = {
4472 'data' => [
4473 101,
4474 102,
4475 103
4476 ],
4477 'header' => 1633837924
4478 };
4479 $msg2 = {
4480 'data' => [
4481 101,
4482 102,
4483 103,
4484 104,
4485 105,
4486 106,
4487 107,
4488 108
4489 ],
4490 'header' => 1633837924
4491 };
4492
4493 Similarly, pack will adjust the length of the output string according
4494 to the data you feed in:
4495
4496 use Data::Hexdumper;
4497
4498 $msg = {
4499 header => 4711,
4500 data => [0x10, 0x20, 0x30, 0x40, 0x77..0x88],
4501 };
4502
4503 $data = $c->pack('message', $msg);
4504
4505 print hexdump(data => $data);
4506
4507 This would print:
4508
4509 0x0000 : 00 00 12 67 10 20 30 40 77 78 79 7A 7B 7C 7D 7E : ...g..0@wxyz{|}~
4510 0x0010 : 7F 80 81 82 83 84 85 86 87 88 : ..........
4511
4512 Incomplete types such as
4513
4514 typedef unsigned long array[];
4515
 are handled in exactly the same way. Thus, you can simply call
4517
4518 $array = $c->unpack('array', '?'x20);
4519
4520 which will unpack the following array:
4521
4522 $array = [
4523 1061109567,
4524 1061109567,
4525 1061109567,
4526 1061109567,
4527 1061109567
4528 ];
4529
4530 You can also alter the length of an array using the "Dimension" tag.
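The behaviour described in this section can be tried out with a self-contained sketch like the one below. The explicit "ByteOrder", "LongSize" and "Alignment" settings are an assumption made here so that the results do not depend on the host platform.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Fixed settings (an assumption for this example) so that the result
# does not depend on the host platform.
my $c = Convert::Binary::C->new(
    ByteOrder => 'BigEndian',
    LongSize  => 4,
    Alignment => 4,
)->parse(<<'ENDC');
struct message {
  long header;
  char data[];
};
typedef unsigned long array[];
ENDC

# The flexible member does not count towards the struct size.
my $size = $c->sizeof('message');

# unpack: the length of 'data' follows the length of the input string.
my $msg = $c->unpack('message', 'abcdefg');

# pack: the output length follows the number of array elements.
my $packed = $c->pack('message', { header => 4711, data => [1, 2, 3] });

# Incomplete array types behave the same way.
my $array = $c->unpack('array', '?' x 20);

printf "sizeof(message) = %d\n", $size;
printf "data elements   = %d\n", scalar @{ $msg->{data} };
printf "packed length   = %d\n", length $packed;
printf "array elements  = %d\n", scalar @$array;
```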
4531
FLOATING POINT VALUES
4533 When using Convert::Binary::C to handle floating point values, you have
4534 to be aware of some limitations.
4535
4536 You're usually safe if all your platforms are using the IEEE floating
4537 point format. During the Convert::Binary::C build process, the "ieeefp"
4538 feature will automatically be enabled if the host is using IEEE
4539 floating point. You can check for this feature at runtime using the
4540 "feature" function:
4541
4542 if (Convert::Binary::C::feature('ieeefp')) {
4543 # do something
4544 }
4545
 When IEEE floating point support is enabled, the module can also handle
 floating point values of a different byte order.
4548
4549 If your host platform is not using IEEE floating point, the "ieeefp"
4550 feature will be disabled. Convert::Binary::C then will be more
4551 restrictive, refusing to handle any non-native floating point values.
4552
4553 However, Convert::Binary::C cannot detect the floating point format
4554 used by your target platform. It can only try to prevent problems in
4555 obvious cases. If you know your target platform has a completely
4556 different floating point format, don't use floating point conversion at
4557 all.
4558
4559 Whenever Convert::Binary::C detects that it cannot properly do floating
4560 point value conversion, it will issue a warning and will not attempt to
4561 convert the floating point value.
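As a hedged sketch of the checks described above, the following code only attempts a big-endian conversion when the "ieeefp" feature is available. The struct name "sample" is made up for this example, and "DoubleSize" is set explicitly so that the packed length is predictable.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Big-endian doubles are non-native on many hosts, so this conversion
# relies on the 'ieeefp' feature. "sample" is a made-up type.
my $c = Convert::Binary::C->new(
    ByteOrder  => 'BigEndian',
    DoubleSize => 8,
)->parse('struct sample { double value; };');

my ($packed, $value);
if (Convert::Binary::C::feature('ieeefp')) {
    $packed = $c->pack('sample', { value => 1.5 });
    $value  = $c->unpack('sample', $packed)->{value};
    printf "bytes: %s\n", unpack('H*', $packed);
    print  "value: $value\n";
}
else {
    print "no IEEE floating point support, skipping conversion\n";
}
```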
4562
BITFIELDS
4564 Bitfield support in Convert::Binary::C is currently in an experimental
4565 state. You are encouraged to test it, but you should not blindly rely
4566 on its results.
4567
 You are also encouraged to supply layout algorithms for compilers
 whose bitfield implementation is not handled correctly at the moment.
 Even better than the plain algorithm is, of course, a patch that adds a
 new bitfield layout engine.
4572
4573 While bitfields may not be handled correctly by the conversion routines
4574 yet, they are always parsed correctly. This means that you can reliably
4575 use the declarator fields as returned by the "struct" or "typedef"
4576 methods. Given the following source
4577
4578 struct bitfield {
4579 int seven:7;
4580 int :1;
4581 int four:4, :0;
4582 int integer;
4583 };
4584
4585 a call to "struct" will return
4586
4587 @struct = (
4588 {
4589 'identifier' => 'bitfield',
4590 'align' => 1,
4591 'context' => 'bitfields.c(1)',
4592 'pack' => 0,
4593 'type' => 'struct',
4594 'declarations' => [
4595 {
4596 'declarators' => [
4597 {
4598 'declarator' => 'seven:7'
4599 }
4600 ],
4601 'type' => 'int'
4602 },
4603 {
4604 'declarators' => [
4605 {
4606 'declarator' => ':1'
4607 }
4608 ],
4609 'type' => 'int'
4610 },
4611 {
4612 'declarators' => [
4613 {
4614 'declarator' => 'four:4'
4615 },
4616 {
4617 'declarator' => ':0'
4618 }
4619 ],
4620 'type' => 'int'
4621 },
4622 {
4623 'declarators' => [
4624 {
4625 'declarator' => 'integer',
4626 'size' => 4,
4627 'offset' => 4
4628 }
4629 ],
4630 'type' => 'int'
4631 }
4632 ],
4633 'size' => 8
4634 }
4635 );
4636
4637 No size/offset keys will currently be returned for bitfield entries.
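Since the declarator strings are reliable, the bit widths can be recovered from them. The sketch below walks the structure returned by "struct" for the example above; the regular expression and the output wording are choices made for this illustration only.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct bitfield {
  int seven:7;
  int :1;
  int four:4, :0;
  int integer;
};
ENDC

# One argument in scalar context: a single hash reference.
my $struct = $c->struct('bitfield');

my @members;
for my $declaration (@{ $struct->{declarations} }) {
    for my $d (@{ $declaration->{declarators} }) {
        # Bitfield declarators look like "name:width" (name may be empty).
        my ($name, $width) = $d->{declarator} =~ /^(\w*):(\d+)$/;
        if (defined $width) {
            my $label = length($name) ? $name : '(unnamed)';
            push @members, "$label is a $width-bit field";
        }
        else {
            push @members, "$d->{declarator} is not a bitfield";
        }
    }
}

print "$_\n" for @members;
```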
4638
MULTITHREADING
4640 Convert::Binary::C was designed to be thread-safe.
4641
INHERITANCE
4643 If you wish to derive a new class from Convert::Binary::C, this is
4644 relatively easy. Despite their XS implementation, Convert::Binary::C
4645 objects are actually blessed hash references.
4646
4647 The XS data is stored in a read-only hash value for the key that is the
4648 empty string. So it is safe to use any non-empty hash key when deriving
4649 your own class. In addition, Convert::Binary::C does quite a lot of
4650 checks to detect corruption in the object hash.
4651
4652 If you store private data in the hash, you should override the "clone"
4653 method and provide the necessary code to clone your private data.
4654 You'll have to call "SUPER::clone", but this will only clone the
4655 Convert::Binary::C part of the object.
4656
4657 For an example of a derived class, you can have a look at
4658 Convert::Binary::C::Cached.
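A minimal sketch of such a derived class could look like this. The package name "My::Labeled" and its private "label" key are hypothetical and exist only for this example; they are not part of the module.

```perl
use strict;
use warnings;

# My::Labeled and its "label" key are hypothetical, for illustration only.
package My::Labeled;

use Convert::Binary::C;
our @ISA = ('Convert::Binary::C');

sub new {
    my ($class, %opt) = @_;
    my $label = delete $opt{label};       # private option, not passed on
    my $self  = $class->SUPER::new(%opt);
    $self->{label} = $label;              # any non-empty key is safe
    return $self;
}

sub clone {
    my $self  = shift;
    my $clone = $self->SUPER::clone;      # clones only the XS part
    $clone->{label} = $self->{label};     # copy the private data by hand
    return $clone;
}

package main;

my $orig = My::Labeled->new(label => 'demo', ByteOrder => 'BigEndian');
my $copy = $orig->clone;

print ref($copy), " / $copy->{label} / ", $copy->ByteOrder, "\n";
```

Without the overridden "clone", the copy would carry the configuration but not the "label" entry.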
4659
PORTABILITY
4661 Convert::Binary::C should build and run on most of the platforms that
4662 Perl runs on:
4663
4664 · Various Linux systems
4665
4666 · Various BSD systems
4667
4668 · HP-UX
4669
4670 · Compaq/HP Tru64 Unix
4671
4672 · Mac-OS X
4673
4674 · Cygwin
4675
4676 · Windows 98/NT/2000/XP
4677
4678 Also, many architectures are supported:
4679
4680 · Various Intel Pentium and Itanium systems
4681
4682 · Various Alpha systems
4683
4684 · HP PA-RISC
4685
4686 · Power-PC
4687
4688 · StrongARM
4689
4690 The module should build with any perl binary from 5.004 up to the
4691 latest development version.
4692
COMPARISON WITH SIMILAR MODULES
4694 Most of the time when you're really looking for Convert::Binary::C
4695 you'll actually end up finding one of the following modules. Some of
4696 them have different goals, so it's probably worth pointing out the
4697 differences.
4698
4699 C::Include
4700 Like Convert::Binary::C, this module aims at doing conversion from and
4701 to binary data based on C types. However, its configurability is very
4702 limited compared to Convert::Binary::C. Also, it does not parse all C
 code correctly. It's slower than Convert::Binary::C and doesn't have a
 preprocessor. On the plus side, it's written in pure Perl.
4705
4706 C::DynaLib::Struct
4707 This module doesn't allow you to reuse your C source code. One main
4708 goal of Convert::Binary::C was to avoid code duplication or, even
4709 worse, having to maintain different representations of your data
4710 structures. Like C::Include, C::DynaLib::Struct is rather limited in
4711 its configurability.
4712
4713 Win32::API::Struct
4714 This module has a special purpose. It aims at building structs for
4715 interfacing Perl code with Windows API code.
4716
CREDITS
4718 · My love Jennifer for always being there, for filling my life with joy
4719 and last but not least for proofreading the documentation.
4720
4721 · Alain Barbet <alian@cpan.org> for testing and debugging support.
4722
4723 · Mitchell N. Charity for giving me pointers into various interesting
4724 directions.
4725
4726 · Alexis Denis for making me improve (externally) and simplify
4727 (internally) floating point support. He can also be blamed
4728 (indirectly) for the "initializer" method, as I need it in my effort
4729 to support bitfields some day.
4730
4731 · Michael J. Hohmann <mjh@scientist.de> for endless discussions on our
4732 way to and back home from work, and for making me think about
4733 supporting "pack" and "unpack" for compound members.
4734
4735 · Thorsten Jens <thojens@gmx.de> for testing the package on various
4736 platforms.
4737
4738 · Mark Overmeer <mark@overmeer.net> for suggesting the module name and
4739 giving invaluable feedback.
4740
4741 · Thomas Pornin <pornin@bolet.org> for his excellent "ucpp"
4742 preprocessor library.
4743
4744 · Marc Rosenthal for his suggestions and support.
4745
4746 · James Roskind, as his C parser was a great starting point to fix all
4747 the problems I had with my original parser based only on the ANSI
4748 ruleset.
4749
4750 · Gisbert W. Selke for spotting some interesting bugs and providing
4751 extensive reports.
4752
4753 · Steffen Zimmermann for a prolific discussion on the cloning
4754 algorithm.
4755
MAILING LIST
4757 There's also a mailing list that you can join:
4758
4759 convert-binary-c@yahoogroups.com
4760
4761 To subscribe, simply send mail to:
4762
4763 convert-binary-c-subscribe@yahoogroups.com
4764
4765 You can use this mailing list for non-bug problems, questions or
4766 discussions.
4767
BUGS
 I'm sure there are still lots of bugs in the code for this module. If
 you find any bugs, if Convert::Binary::C doesn't seem to build on your
 system, or if any of its tests fail, please use the CPAN Request Tracker
 at <http://rt.cpan.org/> to create a ticket for the module.
 Alternatively, just send a mail to <mhx@cpan.org>.
4774
EXPERIMENTAL FEATURES
 Some features in Convert::Binary::C are marked as experimental. This
 is most likely for one of the following reasons:
4778
4779 · The feature does not behave in exactly the way that I wish it did,
4780 possibly due to some limitations in the current design of the module.
4781
4782 · The feature hasn't been tested enough and may completely fail to
4783 produce the expected results.
4784
4785 I hope to fix most issues with these experimental features someday, but
4786 this may mean that I have to change the way they currently work in a
4787 way that's not backwards compatible. So if any of these features is
4788 useful to you, you can use it, but you should be aware that the
4789 behaviour or the interface may change in future releases of this
4790 module.
4791
TODO
4793 If you're interested in what I currently plan to improve (or fix), have
4794 a look at the TODO file.
4795
POSTCARDS
4797 If you're using my module and like it, you can show your appreciation
4798 by sending me a postcard from where you live. I won't urge you to do
4799 it, it's completely up to you. To me, this is just a very nice way of
4800 receiving feedback about my work. Please send your postcard to:
4801
4802 Marcus Holland-Moritz
4803 Kuppinger Weg 28
4804 71116 Gaertringen
4805 GERMANY
4806
 If you feel that sending a postcard is too much effort, you may want
 to rate the module at <http://cpanratings.perl.org/>.
4809
COPYRIGHT
4811 Copyright (c) 2002-2015 Marcus Holland-Moritz. All rights reserved.
4812 This program is free software; you can redistribute it and/or modify it
4813 under the same terms as Perl itself.
4814
4815 The "ucpp" library is (c) 1998-2002 Thomas Pornin. For license and
4816 redistribution details refer to ctlib/ucpp/README.
4817
4818 Portions copyright (c) 1989, 1990 James A. Roskind.
4819
4820 The include files located in tests/include/include, which are used in
4821 some of the test scripts are (c) 1991-1999, 2000, 2001 Free Software
4822 Foundation, Inc. They are neither required to create the binary nor
4823 linked to the source code of this module in any other way.
4824
SEE ALSO
4826 See ccconfig, perl, perldata, perlop, perlvar, Data::Dumper and
4827 Scalar::Util.
4828
4829
4830
4831perl v5.30.1 2020-01-29 Convert::Binary::C(3)