Convert::Binary::C(3)    User Contributed Perl Documentation    Convert::Binary::C(3)

Convert::Binary::C - Binary Data Conversion using C Types
7
Simple
    use Convert::Binary::C;

    #---------------------------------------------
    # Create a new object and parse embedded code
    #---------------------------------------------
    my $c = Convert::Binary::C->new->parse(<<ENDC);

    enum Month { JAN, FEB, MAR, APR, MAY, JUN,
                 JUL, AUG, SEP, OCT, NOV, DEC };

    struct Date {
      int        year;
      enum Month month;
      int        day;
    };

    ENDC

    #-----------------------------------------------
    # Pack Perl data structure into a binary string
    #-----------------------------------------------
    my $date = { year => 2002, month => 'DEC', day => 24 };

    my $packed = $c->pack('Date', $date);
34
Advanced
    use Convert::Binary::C;
    use Data::Dumper;

    #---------------------
    # Create a new object
    #---------------------
    my $c = new Convert::Binary::C ByteOrder => 'BigEndian';

    #---------------------------------------------------
    # Add include paths and global preprocessor defines
    #---------------------------------------------------
    $c->Include('/usr/lib/gcc/i686-pc-linux-gnu/4.1.2/include',
                '/usr/include')
      ->Define(qw( __USE_POSIX __USE_ISOC99=1 ));

    #----------------------------------
    # Parse the 'time.h' header file
    #----------------------------------
    $c->parse_file('time.h');

    #---------------------------------------
    # See which files the object depends on
    #---------------------------------------
    print Dumper([$c->dependencies]);

    #-----------------------------------------------------------
    # See if struct timespec is defined and dump its definition
    #-----------------------------------------------------------
    if ($c->def('struct timespec')) {
      print Dumper($c->struct('timespec'));
    }

    #-------------------------------
    # Create some binary dummy data
    #-------------------------------
    my $data = "binary_test_string";

    #--------------------------------------------------------
    # Unpack $data according to 'struct timespec' definition
    #--------------------------------------------------------
    if (length($data) >= $c->sizeof('timespec')) {
      my $perl = $c->unpack('timespec', $data);
      print Dumper($perl);
    }

    #--------------------------------------------------------
    # See which member lies at offset 5 of 'struct timespec'
    #--------------------------------------------------------
    my $member = $c->member('timespec', 5);
    print "member('timespec', 5) = '$member'\n";
86
Convert::Binary::C is a preprocessor and parser for C type definitions.
It is highly configurable and supports arbitrarily complex data
structures. Its object-oriented interface has "pack" and "unpack"
methods that act as replacements for Perl's "pack" and "unpack" and
allow you to use C types instead of a string representation of the data
structure when converting binary data from and to Perl's complex data
structures.
95
96 Actually, what Convert::Binary::C does is not very different from what
97 a C compiler does, just that it doesn't compile the source code into an
98 object file or executable, but only parses the code and allows Perl to
99 use the enumerations, structs, unions and typedefs that have been
100 defined within your C source for binary data conversion, similar to
101 Perl's "pack" and "unpack".
102
103 Beyond that, the module offers a lot of convenience methods to retrieve
104 information about the C types that have been parsed.
105
106 Background and History
107 In late 2000 I wrote a real-time debugging interface for an embedded
108 medical device that allowed me to send out data from that device over
109 its integrated Ethernet adapter. The interface was "printf()"-like, so
110 you could easily send out strings or numbers. But you could also send
111 out what I called arbitrary data, which was intended for arbitrary
112 blocks of the device's memory.
113
114 Another part of this real-time debugger was a Perl application running
115 on my workstation that gathered all the messages that were sent out
116 from the embedded device. It printed all the strings and numbers, and
117 hex-dumped the arbitrary data. However, manually parsing a couple of
118 300 byte hex-dumps of a complex C structure is not only frustrating,
119 but also error-prone and time consuming.
120
121 Using "unpack" to retrieve the contents of a C structure works fine for
122 small structures and if you don't have to deal with struct member
123 alignment. But otherwise, maintaining such code can be as awful as
124 deciphering hex-dumps.
125
126 As I didn't find anything to solve my problem on the CPAN, I wrote a
127 little module that translated simple C structs into "unpack" strings.
128 It worked, but it was slow. And since it couldn't deal with struct
129 member alignment, I soon found myself adding padding bytes everywhere.
130 So again, I had to maintain two sources, and changing one of them
131 forced me to touch the other one.
132
133 All in all, this little module seemed to make my task a bit easier, but
134 it was far from being what I was thinking of:
135
136 · A module that could directly use the source I've been coding for the
137 embedded device without any modifications.
138
139 · A module that could be configured to match the properties of the
140 different compilers and target platforms I was using.
141
142 · A module that was fast enough to decode a great amount of binary data
143 even on my slow workstation.
144
145 I didn't know how to accomplish these tasks until I read something
146 about XS. At least, it seemed as if it could solve my performance
147 problems. However, writing a C parser in C isn't easier than it is in
148 Perl. But writing a C preprocessor from scratch is even worse.
149
Fortunately enough, after a few weeks of searching I found both a
lean, open-source C preprocessor library and a reusable YACC grammar
for ANSI-C. That was the beginning of the development of
Convert::Binary::C in late 2001.
154
I had been using the module successfully in my embedded environment
long before it appeared on CPAN. From my point of view, it is exactly
what I had in mind. It's fast, flexible, easy to use and portable. It
doesn't require external programs or other Perl modules.
159
160 About this document
This document describes how to use Convert::Binary::C. A lot of
different features are presented, and the example code sometimes uses
Perl's more advanced language elements. If your experience with Perl is
rather limited, you should at least know how to use Perl's very good
documentation system.
166
167 To look up one of the manpages, use the "perldoc" command. For
168 example,
169
170 perldoc perl
171
172 will show you Perl's main manpage. To look up a specific Perl function,
173 use "perldoc -f":
174
175 perldoc -f map
176
177 gives you more information about the "map" function. You can also
178 search the FAQ using "perldoc -q":
179
180 perldoc -q array
181
182 will give you everything you ever wanted to know about Perl arrays. But
183 now, let's go on with some real stuff!
184
185 Why use Convert::Binary::C?
186 Say you want to pack (or unpack) data according to the following C
187 structure:
188
189 struct foo {
190 char ary[3];
191 unsigned short baz;
192 int bar;
193 };
194
195 You could of course use Perl's "pack" and "unpack" functions:
196
197 @ary = (1, 2, 3);
198 $baz = 40000;
199 $bar = -4711;
200 $binary = pack 'c3 S i', @ary, $baz, $bar;
201
202 But this implies that the struct members are byte aligned. If they were
203 long aligned (which is the default for most compilers), you'd have to
204 write
205
206 $binary = pack 'c3 x S x2 i', @ary, $baz, $bar;
207
208 which doesn't really increase readability.
209
210 Now imagine that you need to pack the data for a completely different
211 architecture with different byte order. You would look into the "pack"
212 manpage again and perhaps come up with this:
213
214 $binary = pack 'c3 x n x2 N', @ary, $baz, $bar;
215
However, if you try to unpack $binary again, your signed values have
turned into unsigned ones.
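
The signedness problem can be reproduced with nothing but core Perl. The
following is a minimal sketch (the variable names are illustrative only). It
packs the values with the big-endian template from the text and shows that the
"N" format yields an unsigned value on unpacking, which then has to be
reinterpreted by hand:

```perl
use strict;
use warnings;

my @ary = (1, 2, 3);
my $baz = 40000;
my $bar = -4711;

# Big-endian layout with explicit padding bytes, as in the text above.
my $template = 'c3 x n x2 N';
my $binary   = pack $template, @ary, $baz, $bar;

# 'N' is an unsigned 32-bit format, so the sign of $bar is lost.
my @fields       = unpack $template, $binary;
my $bar_unsigned = $fields[4];

# Reinterpret the same 32 bits as a signed integer.
my $bar_signed = unpack 'l', pack 'L', $bar_unsigned;

print "unsigned: $bar_unsigned, signed: $bar_signed\n";
```

Every signed member of every structure needs this kind of extra step, which is
part of what makes hand-written "pack" templates hard to maintain.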
218
All this can still be managed with Perl. But what if your structures
get more complex? What if you need to support different platforms, or
make changes to the structures? You'll not only have to change the C
source but also dozens of "pack" strings in your Perl code. This is no
fun. And Perl should be fun.
224
225 Now, wouldn't it be great if you could just read in the C source you've
226 already written and use all the types defined there for packing and
227 unpacking? That's what Convert::Binary::C does.
228
229 Creating a Convert::Binary::C object
230 To use Convert::Binary::C just say
231
232 use Convert::Binary::C;
233
234 to load the module. Its interface is completely object oriented, so it
235 doesn't export any functions.
236
237 Next, you need to create a new Convert::Binary::C object. This can be
238 done by either
239
240 $c = Convert::Binary::C->new;
241
242 or
243
244 $c = new Convert::Binary::C;
245
246 You can optionally pass configuration options to the constructor as
247 described in the next section.
248
249 Configuring the object
250 To configure a Convert::Binary::C object, you can either call the
251 "configure" method or directly pass the configuration options to the
252 constructor. If you want to change byte order and alignment, you can
253 use
254
255 $c->configure(ByteOrder => 'LittleEndian',
256 Alignment => 2);
257
258 or you can change the construction code to
259
260 $c = new Convert::Binary::C ByteOrder => 'LittleEndian',
261 Alignment => 2;
262
263 Either way, the object will now know that it should use little endian
264 (Intel) byte order and 2-byte struct member alignment for packing and
265 unpacking.
266
267 Alternatively, you can use the option names as names of methods to
268 configure the object, like:
269
270 $c->ByteOrder('LittleEndian');
271
272 You can also retrieve information about the current configuration of a
273 Convert::Binary::C object. For details, see the section about the
274 "configure" method.
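
As a rough sketch of reading the configuration back, the snippet below assumes
the getter forms documented for the "configure" method: an option method called
without arguments returns the current value, and "configure" called without
arguments returns a hash reference with all options. Treat both as assumptions
to verify against the method reference.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'LittleEndian',
                                Alignment => 2);

# Assumption: an option method without arguments acts as a getter.
my $order = $c->ByteOrder;

# Assumption: configure without arguments returns a hash reference
# holding the complete configuration.
my $config = $c->configure;

print "ByteOrder = $order, Alignment = $config->{Alignment}\n";
```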
275
276 Parsing C code
277 Convert::Binary::C allows two ways of parsing C source. Either by
278 parsing external C header or C source files:
279
280 $c->parse_file('header.h');
281
282 Or by parsing C code embedded in your script:
283
284 $c->parse(<<'CCODE');
285 struct foo {
286 char ary[3];
287 unsigned short baz;
288 int bar;
289 };
290 CCODE
291
Now the object $c will know everything about "struct foo". The example
above uses a so-called here-document, which allows you to easily embed
multi-line strings in your code. You can find more about here-documents
in perldata or perlop.
296
297 Since the "parse" and "parse_file" methods throw an exception when a
298 parse error occurs, you usually want to catch these in an "eval" block:
299
300 eval { $c->parse_file('header.h') };
301 if ($@) {
302 # handle error appropriately
303 }
304
305 Perl's special $@ variable will contain an empty string (which
306 evaluates to a false value in boolean context) on success or an error
307 string on failure.
308
309 As another feature, "parse" and "parse_file" return a reference to
310 their object on success, just like "configure" does when you're
311 configuring the object. This will allow you to write constructs like
312 this:
313
314 my $c = eval {
315 Convert::Binary::C->new(Include => ['/usr/include'])
316 ->parse_file('header.h')
317 };
318 if ($@) {
319 # handle error appropriately
320 }
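
To see the error handling in action, here is a small sketch that feeds the
parser deliberately broken code. The "struct broken" snippet is made up for the
example. Instead of testing $@ directly, it lets the "eval" block return a true
value on success, which is a common and slightly more robust Perl idiom:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;

# The closing brace is missing on purpose, so parse() should throw.
my $ok = eval { $c->parse('struct broken { int a;'); 1 };

if ($ok) {
    print "parsed without errors\n";
}
else {
    print "parse failed: $@";
}
```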
321
322 Packing and unpacking
Convert::Binary::C has two methods, "pack" and "unpack", that act
similarly to the Perl functions of the same name. To perform the
packing described in the example above, you could write:
326
327 $data = {
328 ary => [1, 2, 3],
329 baz => 40000,
330 bar => -4711,
331 };
332 $binary = $c->pack('foo', $data);
333
334 Unpacking will work exactly the same way, just that the "unpack" method
335 will take a byte string as its input and will return a reference to a
336 (possibly very complex) Perl data structure.
337
338 $binary = get_data_from_memory();
339 $data = $c->unpack('foo', $binary);
340
341 You can now easily access all of the values:
342
343 print "foo.ary[1] = $data->{ary}[1]\n";
344
345 Or you can even more conveniently use the Data::Dumper module:
346
347 use Data::Dumper;
348 print Dumper($data);
349
350 The output would look something like this:
351
352 $VAR1 = {
353 'bar' => -271,
354 'baz' => 5000,
355 'ary' => [
356 42,
357 48,
358 100
359 ]
360 };
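
Putting the pieces together, a complete round trip might look like the sketch
below. The configuration values are assumptions chosen to make the layout
predictable ("IntSize" is a configuration option not discussed in this
section); adapt them to your target platform.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Pin down sizes, byte order and alignment so the layout is predictable.
my $c = Convert::Binary::C->new(
    ByteOrder => 'BigEndian',
    Alignment => 4,
    ShortSize => 2,
    IntSize   => 4,
);

$c->parse(<<'CCODE');
struct foo {
  char            ary[3];
  unsigned short  baz;
  int             bar;
};
CCODE

my $data   = { ary => [1, 2, 3], baz => 40000, bar => -4711 };
my $binary = $c->pack('foo', $data);      # Perl data -> byte string
my $back   = $c->unpack('foo', $binary);  # byte string -> Perl data

printf "sizeof(foo) = %d\n", $c->sizeof('foo');
printf "bar = %d, baz = %d\n", $back->{bar}, $back->{baz};
```

Unlike the hand-written "pack" template, the signed member "bar" survives the
round trip without any manual reinterpretation.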
361
362 Preprocessor configuration
Convert::Binary::C uses Thomas Pornin's "ucpp" as an internal C
preprocessor. It is compliant with ISO-C99, so you don't have to worry
about using even weird preprocessor constructs in your code.
366
367 If your C source contains includes or depends upon preprocessor
368 defines, you may need to configure the internal preprocessor. Use the
369 "Include" and "Define" configuration options for that:
370
371 $c->configure(Include => ['/usr/include',
372 '/home/mhx/include'],
373 Define => [qw( NDEBUG FOO=42 )]);
374
375 If your code uses system includes, it is most likely that you will need
376 to define the symbols that are usually defined by the compiler.
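
The effect of a "Define" can be checked with a small self-contained sketch. The
macro name BUF_LEN and the "buffer" typedef are hypothetical and exist only for
this example:

```perl
use strict;
use warnings;
use Convert::Binary::C;

# BUF_LEN is a made-up macro, predefined here as if by the compiler.
my $c = Convert::Binary::C->new(Define => ['BUF_LEN=16']);

$c->parse(<<'ENDC');
#ifndef BUF_LEN
#  define BUF_LEN 4      /* fallback if nothing was predefined */
#endif
typedef char buffer[BUF_LEN];
ENDC

my $size = $c->sizeof('buffer');
print "sizeof(buffer) = $size\n";
```

Because the macro is predefined through the object's configuration, the
fallback in the C code is skipped and the array takes its size from the
"Define" option.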
377
378 On some operating systems, the system includes require the preprocessor
379 to predefine a certain set of assertions. Assertions are supported by
380 "ucpp", and you can define them either in the source code using
381 "#assert" or as a property of the Convert::Binary::C object using
382 "Assert":
383
384 $c->configure(Assert => ['predicate(answer)']);
385
386 Information about defined macros can be retrieved from the preprocessor
387 as long as its configuration isn't changed. The preprocessor is
388 implicitly reset if you change one of the following configuration
389 options:
390
391 Include
392 Define
393 Assert
394 HasCPPComments
395 HasMacroVAARGS
396
397 Supported pragma directives
398 Convert::Binary::C supports the "pack" pragma to locally override
399 struct member alignment. The supported syntax is as follows:
400
401 #pragma pack( ALIGN )
402 Sets the new alignment to ALIGN. If ALIGN is 0, resets the
403 alignment to its original value.
404
405 #pragma pack
406 Resets the alignment to its original value.
407
408 #pragma pack( push, ALIGN )
409 Saves the current alignment on a stack and sets the new alignment
410 to ALIGN. If ALIGN is 0, sets the alignment to the default
411 alignment.
412
413 #pragma pack( pop )
414 Restores the alignment to the last value saved on the stack.
415
416 /* Example assumes sizeof( short ) == 2, sizeof( long ) == 4. */
417
418 #pragma pack(1)
419
420 struct nopad {
421 char a; /* no padding bytes between 'a' and 'b' */
422 long b;
423 };
424
425 #pragma pack /* reset to "native" alignment */
426
427 #pragma pack( push, 2 )
428
429 struct pad {
430 char a; /* one padding byte between 'a' and 'b' */
431 long b;
432
433 #pragma pack( push, 1 )
434
435 struct {
436 char c; /* no padding between 'c' and 'd' */
437 short d;
438 } e; /* sizeof( e ) == 3 */
439
440 #pragma pack( pop ); /* back to pack( 2 ) */
441
442 long f; /* one padding byte between 'e' and 'f' */
443 };
444
445 #pragma pack( pop ); /* back to "native" */
446
The "pack" pragma as it is currently implemented only affects the
maximum struct member alignment. There are compilers that also allow
you to specify the minimum struct member alignment. This is not
supported by Convert::Binary::C.
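
The influence of the pragma can be inspected from Perl with "offsetof" and
"sizeof". The following is a sketch under stated assumptions: the object is
configured with a "native" alignment of 4 and a 4-byte long ("LongSize" is a
configuration option not discussed in this section), and the struct names are
invented for the example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Assumed target properties: 4-byte alignment, 4-byte long.
my $c = Convert::Binary::C->new(Alignment => 4, LongSize => 4);

$c->parse(<<'ENDC');
#pragma pack(1)
struct nopad  { char a; long b; };   /* byte aligned        */
#pragma pack
struct native { char a; long b; };   /* configured alignment */
#pragma pack(push, 2)
struct pad    { char a; long b; };   /* 2-byte aligned       */
#pragma pack(pop)
ENDC

for my $type (qw( nopad native pad )) {
    printf "%-6s offsetof(b) = %d, sizeof = %d\n",
           $type, $c->offsetof($type, 'b'), $c->sizeof($type);
}
```

With these settings, member "b" should end up at offsets 1, 4 and 2
respectively.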
451
452 Automatic configuration using "ccconfig"
453 As there are over 20 different configuration options, setting all of
454 them correctly can be a lengthy and tedious task.
455
The "ccconfig" script, which is bundled with this module, aims at
automatically determining the correct compiler configuration by testing
the compiler executable. It works for both native and cross compilers.
459
461 This section covers one of the fundamental features of
462 Convert::Binary::C. It's how type expressions, referred to as TYPEs in
463 the method reference, are handled by the module.
464
465 Many of the methods, namely "pack", "unpack", "sizeof", "typeof",
466 "member", "offsetof", "def", "initializer" and "tag", are passed a TYPE
467 to operate on as their first argument.
468
469 Standard Types
470 These are trivial. Standard types are simply enum names, struct names,
471 union names, or typedefs. Almost every method that wants a TYPE will
472 accept a standard type.
473
474 For enums, structs and unions, the prefixes "enum", "struct" and
475 "union" are optional. However, if a typedef with the same name exists,
476 like in
477
478 struct foo {
479 int bar;
480 };
481
482 typedef int foo;
483
484 you will have to use the prefix to distinguish between the struct and
485 the typedef. Otherwise, a typedef is always given preference.
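
A quick way to observe this preference is to give the struct and the typedef
different sizes. The sketch below varies the example above for that purpose
(the typedef is a "char" and the struct holds four ints) and assumes a 4-byte
int via the "IntSize" option:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize => 4);

$c->parse(<<'ENDC');
struct foo {
  int bar[4];
};

typedef char foo;
ENDC

my $typedef_size = $c->sizeof('foo');         # resolves to the typedef
my $struct_size  = $c->sizeof('struct foo');  # prefix selects the struct

print "foo: $typedef_size, struct foo: $struct_size\n";
```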
486
487 Basic Types
488 Basic types, or atomic types, are "int" or "char", for example. It's
489 possible to use these basic types without having parsed any code. You
490 can simply do
491
492 $c = new Convert::Binary::C;
493 $size = $c->sizeof('unsigned long');
494 $data = $c->pack('short int', 42);
495
496 Even though the above works fine, it is not possible to define more
497 complex types on the fly, so
498
499 $size = $c->sizeof('struct { int a, b; }');
500
501 will result in an error.
502
503 Basic types are not supported by all methods. For example, it makes no
504 sense to use "member" or "offsetof" on a basic type. Using "typeof"
505 isn't very useful, but supported.
506
507 Member Expressions
508 This is by far the most complex part, depending on the complexity of
509 your data structures. Any standard type that defines a compound or an
510 array may be followed by a member expression to select only a certain
511 part of the data type. Say you have parsed the following C code:
512
513 struct foo {
514 long type;
515 struct {
516 short x, y;
517 } array[20];
518 };
519
520 typedef struct foo matrix[8][8];
521
522 You may want to know the size of the "array" member of "struct foo".
523 This is quite easy:
524
525 print $c->sizeof('foo.array'), " bytes";
526
527 will print
528
529 80 bytes
530
531 depending of course on the "ShortSize" you configured.
532
533 If you wanted to unpack only a single column of "matrix", that's easy
534 as well (and of course it doesn't matter which index you use):
535
536 $column = $c->unpack('matrix[2]', $data);
537
Just like in C, it is possible to use out-of-bounds array indices.
This means that, for example, although "array" is declared to have 20
elements, the following code
541
542 $size = $c->sizeof('foo.array[4711]');
543 $offset = $c->offsetof('foo', 'array[-13]');
544
545 is perfectly valid and will result in:
546
547 $size = 4
548 $offset = -48
549
550 Member expressions can be arbitrarily complex:
551
552 $type = $c->typeof('matrix[2][3].array[7].y');
553 print "the type is $type";
554
555 will, for example, print
556
557 the type is short
558
559 Member expressions are also used as the second argument to "offsetof".
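
The member expressions from this section can be tried out with the following
sketch. It assumes a 2-byte short and a 4-byte long (set through the
"ShortSize" and "LongSize" options) and relies on the default configuration not
inserting any padding for this particular layout:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ShortSize => 2, LongSize => 4);

$c->parse(<<'ENDC');
struct foo {
  long type;
  struct {
    short x, y;
  } array[20];
};

typedef struct foo matrix[8][8];
ENDC

my $array_size  = $c->sizeof('foo.array');                 # 20 * 4 bytes
my $column_size = $c->sizeof('matrix[2]');                 # 8 * sizeof(foo)
my $offset      = $c->offsetof('foo', 'array[3].y');       # 4 + 3*4 + 2
my $type        = $c->typeof('matrix[2][3].array[7].y');

print "array: $array_size, column: $column_size, ",
      "offset: $offset, type: $type\n";
```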
560
561 Offsets
562 Members returned by the "member" method have an optional offset suffix
563 to indicate that the given offset doesn't point to the start of that
564 member. For example,
565
566 $member = $c->member('matrix', 1431);
567 print $member;
568
569 will print
570
571 [2][1].type+3
572
573 If you would use this as a member expression, like in
574
575 $size = $c->sizeof("matrix $member");
576
577 the offset suffix will simply be ignored. Actually, it will be ignored
578 for all methods if it's used in the first argument.
579
When used in the second argument to "offsetof", it will usually do what
you mean, i.e. the offset suffix, if present, will be considered when
determining the offset. This behaviour ensures that
583
584 $member = $c->member('foo', 43);
585 $offset = $c->offsetof('foo', $member);
586 print "'$member' is located at offset $offset of struct foo";
587
588 will always correctly set $offset:
589
590 '.array[9].y+1' is located at offset 43 of struct foo
591
592 If this is not what you mean, e.g. because you want to know the offset
593 where the member returned by "member" starts, you just have to remove
594 the suffix:
595
596 $member =~ s/\+\d+$//;
597 $offset = $c->offsetof('foo', $member);
598 print "'$member' starts at offset $offset of struct foo";
599
600 This would then print:
601
602 '.array[9].y' starts at offset 42 of struct foo
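
For reference, both variants can be combined into one self-contained sketch. It
again assumes a 2-byte short and a 4-byte long, and it works on a copy of the
member string so that the original stays available:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ShortSize => 2, LongSize => 4);

$c->parse(<<'ENDC');
struct foo {
  long type;
  struct {
    short x, y;
  } array[20];
};
ENDC

my $member = $c->member('foo', 43);          # carries a "+1" suffix

# Strip the offset suffix from a copy of the member expression.
(my $start = $member) =~ s/\+\d+$//;

my $exact  = $c->offsetof('foo', $member);   # suffix is considered
my $begin  = $c->offsetof('foo', $start);    # start of the member

print "'$member' is at $exact, '$start' starts at $begin\n";
```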
603
605 In a nutshell, tags are properties that you can attach to types.
606
607 You can add tags to types using the "tag" method, and remove them using
608 "tag" or "untag", for example:
609
610 # Attach 'Format' and 'Hooks' tags
611 $c->tag('type', Format => 'String', Hooks => { pack => \&rout });
612
613 $c->untag('type', 'Format'); # Remove only 'Format' tag
614 $c->untag('type'); # Remove all tags
615
616 You can also use "tag" to see which tags are attached to a type, for
617 example:
618
619 $tags = $c->tag('type');
620
621 This would give you:
622
623 $tags = {
624 'Hooks' => {
625 'pack' => \&rout
626 },
627 'Format' => 'String'
628 };
629
630 Currently, there are only a couple of different tags that influence the
631 way data is packed and unpacked. There are probably more tags to come
632 in the future.
633
634 The Format Tag
635 One of the tags currently available is the "Format" tag. Using this
636 tag, you can tell a Convert::Binary::C object to pack and unpack a
637 certain data type in a special way.
638
639 For example, if you have a (fixed length) string type
640
641 typedef char str_type[40];
642
643 this type would, by default, be unpacked as an array of "char"s. That's
644 because it is only an array of "char"s, and Convert::Binary::C doesn't
645 know it is actually used as a string.
646
647 But you can tell Convert::Binary::C that "str_type" is a C string using
648 the "Format" tag:
649
650 $c->tag('str_type', Format => 'String');
651
652 This will make "unpack" (and of course also "pack") treat the binary
653 data like a null-terminated C string:
654
655 $binary = "Hello World!\n\0 this is just some dummy data";
656 $hello = $c->unpack('str_type', $binary);
657 print $hello;
658
would print:
660
661 Hello World!
662
663 Of course, this also works the other way round:
664
665 use Data::Hexdumper;
666
667 $binary = $c->pack('str_type', "Just another C::B::C hacker");
668 print hexdump(data => $binary);
669
670 would print:
671
672 0x0000 : 4A 75 73 74 20 61 6E 6F 74 68 65 72 20 43 3A 3A : Just.another.C::
673 0x0010 : 42 3A 3A 43 20 68 61 63 6B 65 72 00 00 00 00 00 : B::C.hacker.....
674 0x0020 : 00 00 00 00 00 00 00 00 : ........
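
The difference the tag makes is easy to see when the same data is unpacked
before and after tagging. This is a minimal sketch that uses only core modules
besides Convert::Binary::C:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;
$c->parse('typedef char str_type[40];');

my $binary = "Hello World!\n\0 this is just some dummy data";

# Without a tag, the type is unpacked as an array of 40 chars.
my $as_array = $c->unpack('str_type', $binary);

# With the tag, the same bytes are treated as a C string.
$c->tag('str_type', Format => 'String');
my $as_string = $c->unpack('str_type', $binary);

# Packing pads the string with null bytes up to the size of the type.
my $packed = $c->pack('str_type', 'Just another C::B::C hacker');

printf "%d chars, string '%s', %d bytes packed\n",
       scalar @$as_array, $as_string =~ s/\n//r, length $packed;
```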
675
If you want Convert::Binary::C to not interpret the binary data at all,
you can set the "Format" tag to "Binary". This might not seem very
useful, as "pack" and "unpack" would just pass through the unmodified
binary data. But you can tag not only whole types, but also compound
members. For example
681
682 $c->parse(<<ENDC);
683 struct packet {
684 unsigned short header;
685 unsigned short flags;
686 unsigned char payload[28];
687 };
688 ENDC
689
690 $c->tag('packet.payload', Format => 'Binary');
691
692 would allow you to write:
693
694 read FILE, $payload, $c->sizeof('packet.payload');
695
696 $packet = {
697 header => 4711,
698 flags => 0xf00f,
699 payload => $payload,
700 };
701
702 $binary = $c->pack('packet', $packet);
703
704 print hexdump(data => $binary);
705
706 This would print something like:
707
708 0x0000 : 12 67 F0 0F 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A : .g..no.no.no.no.
709 0x0010 : 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E 6F 0A 6E : no.no.no.no.no.n
710
711 For obvious reasons, it is not allowed to attach a "Format" tag to
712 bitfield members. Trying to do so will result in an exception being
713 thrown by the "tag" method.
714
715 The ByteOrder Tag
716 The "ByteOrder" tag allows you to override the byte order of certain
717 types or members. The implementation of this tag is considered
718 experimental and may be subject to changes in the future.
719
720 Usually it doesn't make much sense to override the byte order, but
721 there may be applications where a sub-structure is packed in a
722 different byte order than the surrounding structure.
723
724 Take, for example, the following code:
725
726 $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
727 OrderMembers => 1);
728 $c->parse(<<'ENDC');
729
730 typedef unsigned short u_16;
731
732 struct coords_3d {
733 long x, y, z;
734 };
735
736 struct coords_msg {
737 u_16 header;
738 u_16 length;
739 struct coords_3d coords;
740 };
741
742 ENDC
743
744 Assume that while "coords_msg" is big endian, the embedded coordinates
745 "coords_3d" are stored in little endian format for some reason. In C,
746 you'll have to handle this manually.
747
748 But using Convert::Binary::C, you can simply attach a "ByteOrder" tag
749 to either the "coords_3d" structure or to the "coords" member of the
750 "coords_msg" structure. Both will work in this case. The only
751 difference is that if you tag the "coords" member, "coords_3d" will
752 only be treated as little endian if you "pack" or "unpack" the
753 "coords_msg" structure. (BTW, you could also tag all members of
754 "coords_3d" individually, but that would be inefficient.)
755
756 So, let's attach the "ByteOrder" tag to the "coords" member:
757
758 $c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');
759
760 Assume the following binary message:
761
762 0x0000 : 00 2A 00 0C FF FF FF FF 02 00 00 00 2A 00 00 00 : .*..........*...
763
764 If you unpack this message...
765
766 $msg = $c->unpack('coords_msg', $binary);
767
768 ...you will get the following data structure:
769
770 $msg = {
771 'header' => 42,
772 'length' => 12,
773 'coords' => {
774 'x' => -1,
775 'y' => 2,
776 'z' => 42
777 }
778 };
779
780 Without the "ByteOrder" tag, you would get:
781
782 $msg = {
783 'header' => 42,
784 'length' => 12,
785 'coords' => {
786 'x' => -1,
787 'y' => 33554432,
788 'z' => 704643072
789 }
790 };
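
The comparison above can be reproduced with the sketch below. It builds the
binary message from its hex representation and unpacks it once without and
once with the tag. The sizes of short and long are pinned to 2 and 4 bytes
through "ShortSize" and "LongSize", which is an assumption about the target
platform:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian',
                                ShortSize => 2,
                                LongSize  => 4);
$c->parse(<<'ENDC');
typedef unsigned short u_16;

struct coords_3d {
  long x, y, z;
};

struct coords_msg {
  u_16 header;
  u_16 length;
  struct coords_3d coords;
};
ENDC

# The 16-byte message shown in the hex dump above.
my $binary = pack 'H*', '002A000CFFFFFFFF020000002A000000';

my $plain = $c->unpack('coords_msg', $binary);    # everything big endian

$c->tag('coords_msg.coords', ByteOrder => 'LittleEndian');
my $tagged = $c->unpack('coords_msg', $binary);   # coords little endian

print "y without tag: $plain->{coords}{y}, ",
      "y with tag: $tagged->{coords}{y}\n";
```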
791
792 The "ByteOrder" tag is a recursive tag, i.e. it applies to all children
793 of the tagged object recursively. Of course, it is also possible to
794 override a "ByteOrder" tag by attaching another "ByteOrder" tag to a
795 child type. Confused? Here's an example. In addition to tagging the
796 "coords" member as little endian, we now tag "coords_3d.y" as big
797 endian:
798
799 $c->tag('coords_3d.y', ByteOrder => 'BigEndian');
800 $msg = $c->unpack('coords_msg', $binary);
801
802 This will return the following data structure:
803
804 $msg = {
805 'header' => 42,
806 'length' => 12,
807 'coords' => {
808 'x' => -1,
809 'y' => 33554432,
810 'z' => 42
811 }
812 };
813
814 Note that if you tag both a type and a member of that type within a
815 compound, the tag attached to the type itself has higher precedence.
816 Using the example above, if you would attach a "ByteOrder" tag to both
817 "coords_msg.coords" and "coords_3d", the tag attached to "coords_3d"
818 would always win.
819
820 Also note that the "ByteOrder" tag might not work as expected along
821 with bitfields, which is why the implementation is considered
822 experimental. Bitfields are currently not affected by the "ByteOrder"
823 tag at all. This is because the byte order would affect the bitfield
824 layout, and a consistent implementation supporting multiple layouts of
825 the same struct would be quite bulky and probably slow down the whole
826 module.
827
828 If you really need the correct behaviour, you can use the following
829 trick:
830
831 $le = Convert::Binary::C->new(ByteOrder => 'LittleEndian');
832
833 $le->parse(<<'ENDC');
834
835 typedef unsigned short u_16;
836 typedef unsigned long u_32;
837
838 struct message {
839 u_16 header;
840 u_16 length;
841 struct {
842 u_32 a;
843 u_32 b;
844 u_32 c : 7;
845 u_32 d : 5;
846 u_32 e : 20;
847 } data;
848 };
849
850 ENDC
851
852 $be = $le->clone->ByteOrder('BigEndian');
853
854 $le->tag('message.data', Format => 'Binary', Hooks => {
855 unpack => sub { $be->unpack('message.data', @_) },
856 pack => sub { $be->pack('message.data', @_) },
857 });
858
859
860 $msg = $le->unpack('message', $binary);
861
862 This uses the "Format" and "Hooks" tags along with a big endian "clone"
863 of the original little endian object. It attaches hooks to the little
864 endian object and in the hooks it uses the big endian object to "pack"
865 and "unpack" the binary data.
866
867 The Dimension Tag
868 The "Dimension" tag allows you to override the declared dimension of an
869 array for packing or unpacking data. The implementation of this tag is
870 considered very experimental and will definitely change in a future
871 release.
872
873 That being said, the "Dimension" tag is primarily useful to support
874 variable length arrays. Usually, you have to write the following code
875 for such a variable length array in C:
876
877 struct c_message
878 {
879 unsigned count;
880 char data[1];
881 };
882
So, because you cannot declare an empty array, you declare an array
with a single element. If you have an ISO-C99 compliant compiler, you
can write this code instead:
886
887 struct c99_message
888 {
889 unsigned count;
890 char data[];
891 };
892
893 This explicitly tells the compiler that "data" is a flexible array
894 member. Convert::Binary::C already uses this information to handle
895 flexible array members in a special way.
896
897 As you can see in the following example, the two types are treated
898 differently:
899
900 $data = pack 'NC*', 3, 1..8;
901 $uc = $c->unpack('c_message', $data);
902 $uc99 = $c->unpack('c99_message', $data);
903
904 This will result in:
905
906 $uc = {'count' => 3,'data' => [1]};
907 $uc99 = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
908
However, only a few compilers support ISO-C99, and you probably don't
want to change your existing code only to get some extra features when
using Convert::Binary::C.
912
913 So it is possible to attach a tag to the "data" member of the
914 "c_message" struct that tells Convert::Binary::C to treat the array as
915 if it were flexible:
916
917 $c->tag('c_message.data', Dimension => '*');
918
919 Now both "c_message" and "c99_message" will behave exactly the same
920 when using "pack" or "unpack". Repeating the above code:
921
922 $uc = $c->unpack('c_message', $data);
923
924 This will result in:
925
926 $uc = {'count' => 3,'data' => [1,2,3,4,5,6,7,8]};
927
928 But there's more you can do. Even though it probably doesn't make much
929 sense, you can tag a fixed dimension to an array:
930
931 $c->tag('c_message.data', Dimension => '5');
932
933 This will obviously result in:
934
935 $uc = {'count' => 3,'data' => [1,2,3,4,5]};
936
937 A more useful way to use the "Dimension" tag is to set it to the name
938 of a member in the same compound:
939
940 $c->tag('c_message.data', Dimension => 'count');
941
942 Convert::Binary::C will now use the value of that member to determine
943 the size of the array, so unpacking will result in:
944
945 $uc = {'count' => 3,'data' => [1,2,3]};
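
As a self-contained sketch of the member-based variant, the code below unpacks
the same data before and after attaching the tag. It assumes a big endian
target with a 4-byte int (set through "ByteOrder" and "IntSize") so that the
layout matches the 'NC*' template used to build the test data:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', IntSize => 4);

$c->parse(<<'ENDC');
struct c_message
{
  unsigned count;
  char data[1];
};
ENDC

# A count of 3 followed by eight data bytes.
my $data = pack 'NC*', 3, 1 .. 8;

my $declared = $c->unpack('c_message', $data);   # uses the declared size

$c->tag('c_message.data', Dimension => 'count');
my $counted = $c->unpack('c_message', $data);    # uses the 'count' member

printf "declared: %d element(s), counted: %d element(s)\n",
       scalar @{$declared->{data}}, scalar @{$counted->{data}};
```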
946
947 Of course, you can also tag flexible array members. And yes, it's also
948 possible to use more complex member expressions:
949
950 $c->parse(<<ENDC);
951 struct msg_header
952 {
953 unsigned len[2];
954 };
955
956 struct more_complex
957 {
958 struct msg_header hdr;
959 char data[];
960 };
961 ENDC
962
963 $data = pack 'NNC*', 42, 7, 1 .. 10;
964
965 $c->tag('more_complex.data', Dimension => 'hdr.len[1]');
966
967 $u = $c->unpack('more_complex', $data);
968
969 The result will be:
970
971 $u = {
972 'hdr' => {
973 'len' => [
974 42,
975 7
976 ]
977 },
978 'data' => [
979 1,
980 2,
981 3,
982 4,
983 5,
984 6,
985 7
986 ]
987 };
988
989 By the way, it's also possible to tag arrays that are not embedded
990 inside a compound:
991
992 $c->parse(<<ENDC);
993 typedef unsigned short short_array[];
994 ENDC
995
996 $c->tag('short_array', Dimension => '5');
997
998 $u = $c->unpack('short_array', $data);
999
1000 Resulting in:
1001
1002 $u = [0,42,0,7,258];
1003
The final and most powerful way to define a "Dimension" tag is to pass
it a subroutine reference. The referenced subroutine can execute
whatever code is necessary to determine the size of the tagged array:
1007
1008 sub get_size
1009 {
1010 my $m = shift;
1011 return $m->{hdr}{len}[0] / $m->{hdr}{len}[1];
1012 }
1013
1014 $c->tag('more_complex.data', Dimension => \&get_size);
1015
1016 $u = $c->unpack('more_complex', $data);
1017
As you can guess from the above code, the subroutine is passed a
reference to a hash that stores the already unpacked part of the
compound embedding the tagged array. This is the result:
1021
1022 $u = {
1023 'hdr' => {
1024 'len' => [
1025 42,
1026 7
1027 ]
1028 },
1029 'data' => [
1030 1,
1031 2,
1032 3,
1033 4,
1034 5,
1035 6
1036 ]
1037 };
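
As a further illustration of a subroutine-based "Dimension" tag, here
is a minimal sketch for a header that stores the payload size in bytes
rather than in elements. The "packet" struct, its member names and the
explicit size options are assumptions made for this example only; they
do not appear elsewhere in this document.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Fix byte order and type sizes so that the example does not depend
# on the host system (assumed values, chosen for this sketch).
my $c = Convert::Binary::C->new(
    ByteOrder => 'BigEndian',
    IntSize   => 4,
    ShortSize => 2,
    Alignment => 1,
);

# Hypothetical message type: 'nbytes' holds the payload size in bytes.
$c->parse(<<'ENDC');
struct packet {
  unsigned       nbytes;
  unsigned short samples[];
};
ENDC

# The subroutine receives the already unpacked part of the struct
# and converts the byte count into an element count.
$c->tag('packet.samples', Dimension => sub {
    my $p = shift;
    return $p->{nbytes} / 2;
});

# Six payload bytes are announced, but four 16-bit values are present.
my $data   = pack 'N n*', 6, 10, 20, 30, 40;
my $packet = $c->unpack('packet', $data);

print "samples: @{$packet->{samples}}\n";
```

With the data above, only the first three samples should end up in the
unpacked array, because the header announces six bytes.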
1038
1039 You can also pass custom arguments to the subroutines by using the
1040 "arg" method. This is similar to the functionality offered by the
1041 "Hooks" tag.
1042
Of course, all of this works for the "pack" method as well.
1044
However, the current implementation has at least one shortcoming,
which is why it's experimental: The "Dimension" tag doesn't affect
compound layout. This means that while you can alter the size of an
array in the middle of a compound, the offsets of the members after
that array won't change. I'd rather like to see the layout adapt
dynamically, so this is what I'm hoping to implement in the future.
1051
1052 The Hooks Tag
1053 Hooks are a special kind of tag that can be extremely useful.
1054
1055 Using hooks, you can easily override the way "pack" and "unpack" handle
1056 data using your own subroutines. If you define hooks for a certain
1057 data type, each time this data type is processed the corresponding hook
1058 will be called to allow you to modify that data.
1059
1060 Basic Hooks
1061
1062 Here's an example. Let's assume the following C code has been parsed:
1063
1064 typedef unsigned long u_32;
1065 typedef u_32 ProtoId;
1066 typedef ProtoId MyProtoId;
1067
1068 struct MsgHeader {
1069 MyProtoId id;
1070 u_32 len;
1071 };
1072
1073 struct String {
1074 u_32 len;
1075 char buf[];
1076 };
1077
1078 You could now use the types above and, for example, unpack binary data
1079 representing a "MsgHeader" like this:
1080
1081 $msg_header = $c->unpack('MsgHeader', $data);
1082
1083 This would give you:
1084
1085 $msg_header = {
1086 'len' => 13,
1087 'id' => 42
1088 };
1089
Instead of dealing with "ProtoId" values as integers, you would rather
like to have them as clear text. You could provide subroutines to
convert between clear text and integers:
1093
1094 %proto = (
1095 CATS => 1,
1096 DOGS => 42,
1097 HEDGEHOGS => 4711,
1098 );
1099
1100 %rproto = reverse %proto;
1101
1102 sub ProtoId_unpack {
1103 $rproto{$_[0]} || 'unknown protocol'
1104 }
1105
1106 sub ProtoId_pack {
1107 $proto{$_[0]} or die 'unknown protocol'
1108 }
1109
1110 You can now register these subroutines by attaching a "Hooks" tag to
1111 "ProtoId" using the "tag" method:
1112
1113 $c->tag('ProtoId', Hooks => { pack => \&ProtoId_pack,
1114 unpack => \&ProtoId_unpack });
1115
1116 Doing exactly the same unpack on "MsgHeader" again would now return:
1117
1118 $msg_header = {
1119 'len' => 13,
1120 'id' => 'DOGS'
1121 };
1122
1123 Actually, if you don't need the reverse operation, you don't even have
1124 to register a "pack" hook. Or, even better, you can have a more
1125 intelligent "unpack" hook that creates a dual-typed variable:
1126
1127 use Scalar::Util qw(dualvar);
1128
1129 sub ProtoId_unpack2 {
1130 dualvar $_[0], $rproto{$_[0]} || 'unknown protocol'
1131 }
1132
1133 $c->tag('ProtoId', Hooks => { unpack => \&ProtoId_unpack2 });
1134
1135 $msg_header = $c->unpack('MsgHeader', $data);
1136
1137 Just as before, this would print
1138
1139 $msg_header = {
1140 'len' => 13,
1141 'id' => 'DOGS'
1142 };
1143
1144 but without requiring a "pack" hook for packing, at least as long as
1145 you keep the variable dual-typed.
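
What "keeping the variable dual-typed" means can be tried out without
Convert::Binary::C at all. The following sketch uses only the core
Scalar::Util module; calling the hook subroutine directly and the
variable names $id, $packed and $copy are choices made for this
illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

my %proto  = (CATS => 1, DOGS => 42, HEDGEHOGS => 4711);
my %rproto = reverse %proto;

# Same idea as ProtoId_unpack2: numeric value plus readable name.
sub ProtoId_unpack2 {
    dualvar $_[0], $rproto{$_[0]} || 'unknown protocol'
}

my $id = ProtoId_unpack2(42);

# String context yields the name, numeric context the integer.
printf "as string: %s, as number: %d\n", $id, $id;

# Numeric operations (and thus packing) still see the integer ...
my $packed = pack 'N', $id;

# ... but a copy made through string interpolation is a plain string
# and has lost its numeric half.
my $copy = "$id";
```

Passing $copy instead of $id back to "pack" would therefore require a
"pack" hook again.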
1146
1147 Hooks are usually called with exactly one argument, which is the data
1148 that should be processed (see "Advanced Hooks" for details on how to
1149 customize hook arguments). They are called in scalar context and
1150 expected to return the processed data.
1151
1152 To get rid of registered hooks, you can either undefine only certain
1153 hooks
1154
1155 $c->tag('ProtoId', Hooks => { pack => undef });
1156
1157 or all hooks:
1158
1159 $c->tag('ProtoId', Hooks => undef);
1160
1161 Of course, hooks are not restricted to handling integer values. You
1162 could just as well attach hooks for the "String" struct from the code
1163 above. A useful example would be to have these hooks:
1164
1165 sub string_unpack {
1166 my $s = shift;
1167 pack "c$s->{len}", @{$s->{buf}};
1168 }
1169
1170 sub string_pack {
1171 my $s = shift;
1172 return {
1173 len => length $s,
1174 buf => [ unpack 'c*', $s ],
1175 }
1176 }
1177
1178 (Don't be confused by the fact that the "unpack" hook uses "pack" and
1179 the "pack" hook uses "unpack". And also see "Advanced Hooks" for a
1180 more clever approach.)
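
Since the two hooks are plain subroutines, they can be checked in
isolation before being registered. The sketch below repeats the hook
definitions from above and calls them directly; the variable names
$struct and $string are only used for this illustration.

```perl
use strict;
use warnings;

# Turns the unpacked struct (a hash) into a Perl string.
sub string_unpack {
    my $s = shift;
    pack "c$s->{len}", @{$s->{buf}};
}

# Turns a Perl string into the hash that represents the struct.
sub string_pack {
    my $s = shift;
    return {
        len => length $s,
        buf => [ unpack 'c*', $s ],
    }
}

# What the "pack" hook hands over to Convert::Binary::C.
my $struct = string_pack('Hello World!');
print "len = $struct->{len}, first char code = $struct->{buf}[0]\n";

# What the "unpack" hook makes of the same structure again.
my $string = string_unpack($struct);
print "$string\n";
```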
1181
1182 While you would normally get the following output when unpacking a
1183 "String"
1184
1185 $string = {
1186 'len' => 12,
1187 'buf' => [
1188 72,
1189 101,
1190 108,
1191 108,
1192 111,
1193 32,
1194 87,
1195 111,
1196 114,
1197 108,
1198 100,
1199 33
1200 ]
1201 };
1202
1203 you could just register the hooks using
1204
1205 $c->tag('String', Hooks => { pack => \&string_pack,
1206 unpack => \&string_unpack });
1207
1208 and you would get a nice human-readable Perl string:
1209
1210 $string = 'Hello World!';
1211
1212 Packing a string turns out to be just as easy:
1213
1214 use Data::Hexdumper;
1215
1216 $data = $c->pack('String', 'Just another Perl hacker,');
1217
1218 print hexdump(data => $data);
1219
1220 This would print:
1221
1222 0x0000 : 00 00 00 19 4A 75 73 74 20 61 6E 6F 74 68 65 72 : ....Just.another
1223 0x0010 : 20 50 65 72 6C 20 68 61 63 6B 65 72 2C : .Perl.hacker,
1224
1225 If you want to find out if or which hooks are registered for a certain
1226 type, you can also use the "tag" method:
1227
1228 $hooks = $c->tag('String', 'Hooks');
1229
1230 This would return:
1231
1232 $hooks = {
1233 'unpack' => \&string_unpack,
1234 'pack' => \&string_pack
1235 };
1236
1237 Advanced Hooks
1238
It is also possible to combine hooks with the "Format" tag. This
can be useful if you know better than Convert::Binary::C how to
interpret the binary data. In the previous section, we've handled this
type
1243
1244 struct String {
1245 u_32 len;
1246 char buf[];
1247 };
1248
1249 with the following hooks:
1250
1251 sub string_unpack {
1252 my $s = shift;
1253 pack "c$s->{len}", @{$s->{buf}};
1254 }
1255
1256 sub string_pack {
1257 my $s = shift;
1258 return {
1259 len => length $s,
1260 buf => [ unpack 'c*', $s ],
1261 }
1262 }
1263
1264 $c->tag('String', Hooks => { pack => \&string_pack,
1265 unpack => \&string_unpack });
1266
1267 As you can see in the hook code, "buf" is expected to be an array of
1268 characters. For the "unpack" case Convert::Binary::C first turns the
1269 binary data into a Perl array, and then the hook packs it back into a
1270 string. The intermediate array creation and destruction is completely
1271 useless. Same thing, of course, for the "pack" case.
1272
1273 Here's a clever way to handle this. Just tag "buf" as binary
1274
1275 $c->tag('String.buf', Format => 'Binary');
1276
1277 and use the following hooks instead:
1278
1279 sub string_unpack2 {
1280 my $s = shift;
1281 substr $s->{buf}, 0, $s->{len};
1282 }
1283
1284 sub string_pack2 {
1285 my $s = shift;
1286 return {
1287 len => length $s,
1288 buf => $s,
1289 }
1290 }
1291
1292 $c->tag('String', Hooks => { pack => \&string_pack2,
1293 unpack => \&string_unpack2 });
1294
1295 This will be exactly equivalent to the old code, but faster and
1296 probably even much easier to understand.
1297
1298 But hooks are even more powerful. You can customize the arguments that
1299 are passed to your hooks and you can use "arg" to pass certain special
1300 arguments, such as the name of the type that is currently being
1301 processed by the hook.
1302
The following example shows how easily you can peek into the
perl internals using hooks.
1305
1306 use Config;
1307
1308 $c = new Convert::Binary::C %CC, OrderMembers => 1;
1309 $c->Include(["$Config{archlib}/CORE", @{$c->Include}]);
1310 $c->parse(<<ENDC);
1311 #include "EXTERN.h"
1312 #include "perl.h"
1313 ENDC
1314
1315 $c->tag($_, Hooks => { unpack_ptr => [\&unpack_ptr,
1316 $c->arg(qw(SELF TYPE DATA))] })
1317 for qw( XPVAV XPVHV );
1318
1319 First, we add the perl core include path and parse perl.h. Then, we add
1320 an "unpack_ptr" hook for a couple of the internal data types.
1321
1322 The "unpack_ptr" and "pack_ptr" hooks are called whenever a pointer to
1323 a certain data structure is processed. This is by far the most
1324 experimental part of the hooks feature, as this includes any kind of
1325 pointer. There's no way for the hook to know the difference between a
1326 plain pointer, or a pointer to a pointer, or a pointer to an array
1327 (this is because the difference doesn't matter anywhere else in
1328 Convert::Binary::C).
1329
1330 But the hook above makes use of another very interesting feature: It
1331 uses "arg" to pass special arguments to the hook subroutine. Usually,
1332 the hook subroutine is simply passed a single data argument. But using
1333 the above definition, it'll get a reference to the calling object
1334 ("SELF"), the name of the type being processed ("TYPE") and the data
1335 ("DATA").
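
Argument customization is not tied to pointer hooks. Here is a much
smaller sketch that shares one "unpack" hook between two typedefs by
asking for "TYPE" and "DATA". The "Celsius", "Percent" and "Reading"
types are hypothetical, and the sketch assumes that "TYPE" is passed
as the plain typedef name.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
    ByteOrder => 'BigEndian',
    ShortSize => 2,
    Alignment => 1,
);

# Hypothetical types, not part of the examples above.
$c->parse(<<'ENDC');
typedef unsigned short Celsius;
typedef unsigned short Percent;

struct Reading {
  Celsius temperature;
  Percent humidity;
};
ENDC

my %unit = (Celsius => 'C', Percent => '%');

# One hook for both typedefs: the type name arrives as first argument,
# the value as second. Unknown type names fall back to '?'.
sub with_unit {
    my ($type, $value) = @_;
    my $unit = exists $unit{$type} ? $unit{$type} : '?';
    return "$value $unit";
}

$c->tag($_, Hooks => { unpack => [\&with_unit, $c->arg(qw(TYPE DATA))] })
    for qw(Celsius Percent);

my $reading = $c->unpack('Reading', pack('nn', 21, 40));

print "$reading->{temperature}, $reading->{humidity}\n";
```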
1336
But what does our hook look like?
1338
1339 sub unpack_ptr {
1340 my($self, $type, $ptr) = @_;
1341 $ptr or return '<NULL>';
1342 my $size = $self->sizeof($type);
1343 $self->unpack($type, unpack("P$size", pack('I', $ptr)));
1344 }
1345
1346 As you can see, the hook is rather simple. First, it receives the
1347 arguments mentioned above. It performs a quick check if the pointer is
1348 "NULL" and shouldn't be processed any further. Next, it determines the
1349 size of the type being processed. And finally, it'll just use the "P"n
1350 unpack template to read from that memory location and recursively call
1351 "unpack" to unpack the type. (And yes, this may of course again call
1352 other hooks.)
1353
1354 Now, let's test that:
1355
1356 my $ref = { foo => 42, bar => 4711 };
1357 my $ptr = hex(("$ref" =~ /\(0x([[:xdigit:]]+)\)$/)[0]);
1358
1359 print Dumper(unpack_ptr($c, 'AV', $ptr));
1360
Just for the fun of it, we create a reference to an anonymous hash and
unpack it using the "AV" type. But how do we get a pointer to the
underlying structure? This is rather easy, as its address is just the
hex value that appears when using the reference in string context. So
we just grab that and convert it from hexadecimal to a number. All
that's left to do is call our hook, as it can already handle "AV"
pointers. And this is what we get:
1367
1368 $VAR1 = {
1369 'sv_any' => {
1370 'xnv_u' => {
1371 'xnv_nv' => '2.18376848395956105e-4933',
1372 'xgv_stash' => 0,
1373 'xpad_cop_seq' => {
1374 'xlow' => 0,
1375 'xhigh' => 139484332
1376 },
1377 'xbm_s' => {
1378 'xbm_previous' => 0,
1379 'xbm_flags' => 172,
1380 'xbm_rare' => 92
1381 }
1382 },
1383 'xav_fill' => 2,
1384 'xav_max' => 7,
1385 'xiv_u' => {
1386 'xivu_iv' => 2,
1387 'xivu_uv' => 2,
1388 'xivu_p1' => 2,
1389 'xivu_i32' => 2,
1390 'xivu_namehek' => 2,
1391 'xivu_hv' => 2
1392 },
1393 'xmg_u' => {
1394 'xmg_magic' => 0,
1395 'xmg_ourstash' => 0
1396 },
1397 'xmg_stash' => 0
1398 },
1399 'sv_refcnt' => 1,
1400 'sv_flags' => 536870924,
1401 'sv_u' => {
1402 'svu_iv' => 139483844,
1403 'svu_uv' => 139483844,
1404 'svu_rv' => 139483844,
1405 'svu_pv' => 139483844,
1406 'svu_array' => 139483844,
1407 'svu_hash' => 139483844,
1408 'svu_gp' => 139483844
1409 }
1410 };
1411
1412 Even though it is rather easy to do such stuff using "unpack_ptr"
1413 hooks, you should really know what you're doing and do it with extreme
1414 care because of the limitations mentioned above. It's really easy to
1415 run into segmentation faults when you're dereferencing pointers that
1416 point to memory which you don't own.
1417
1418 Performance
1419
Hooks don't come for free. In performance-critical applications you
1421 have to keep in mind that hooks are actually perl subroutines and that
1422 they are called once for every value of a registered type that is being
1423 packed or unpacked. If only about 10% of the values require hooks to be
1424 called, you'll hardly notice the difference (if your hooks are
1425 implemented efficiently, that is). But if all values would require
1426 hooks to be called, that alone could easily make packing and unpacking
1427 very slow.
1428
1429 Tag Order
1430 Since it is possible to attach multiple tags to a single type, the
1431 order in which the tags are processed is important. Here's a small
1432 table that shows the processing order.
1433
    pack         unpack
    ---------------------
    Hooks        Format
    Format       ByteOrder
    ByteOrder    Hooks
1439
1440 As a general rule, the "Hooks" tag is always the first thing processed
1441 when packing data, and the last thing processed when unpacking data.
1442
1443 The "Format" and "ByteOrder" tags are exclusive, but when both are
1444 given the "Format" tag wins.
1445
1447 new
1448 "new"
1449 "new" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1450 The constructor is used to create a new Convert::Binary::C
1451 object. You can simply use
1452
1453 $c = new Convert::Binary::C;
1454
1455 without additional arguments to create an object, or you can
1456 optionally pass any arguments to the constructor that are
1457 described for the "configure" method.
1458
1459 configure
1460 "configure"
1461 "configure" OPTION
1462 "configure" OPTION1 => VALUE1, OPTION2 => VALUE2, ...
1463 This method can be used to configure an existing
1464 Convert::Binary::C object or to retrieve its current
1465 configuration.
1466
1467 To configure the object, the list of options consists of key
1468 and value pairs and must therefore contain an even number of
1469 elements. "configure" (and also "new" if used with
1470 configuration options) will throw an exception if you pass an
1471 odd number of elements. Configuration will normally look like
1472 this:
1473
1474 $c->configure(ByteOrder => 'BigEndian', IntSize => 2);
1475
1476 To retrieve the current value of a configuration option, you
1477 must pass a single argument to "configure" that holds the name
1478 of the option, just like
1479
1480 $order = $c->configure('ByteOrder');
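
Both directions, and the exception for an odd number of elements, can
be combined into a short script. This is only a sketch; the variable
names are arbitrary, and the option values are the ones used in the
examples above.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;

# Set two options at once, then read them back one at a time.
$c->configure(ByteOrder => 'BigEndian', IntSize => 2);

my $order    = $c->configure('ByteOrder');
my $int_size = $c->configure('IntSize');
print "ByteOrder = $order, IntSize = $int_size\n";

# An odd number of elements (other than a single option name) throws
# an exception, so guard calls whose argument list is built at runtime.
my $ok = eval { $c->configure(IntSize => 4, 'ByteOrder'); 1 };
print "configure failed: $@" unless $ok;
```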
1481
1482 If you want to get the values of all configuration options at
1483 once, you can call "configure" without any arguments and it
1484 will return a reference to a hash table that holds the whole
1485 object configuration. This can be conveniently used with the
1486 Data::Dumper module, for example:
1487
1488 use Convert::Binary::C;
1489 use Data::Dumper;
1490
1491 $c = new Convert::Binary::C Define => ['DEBUGGING', 'FOO=123'],
1492 Include => ['/usr/include'];
1493
1494 print Dumper($c->configure);
1495
1496 Which will print something like this:
1497
1498 $VAR1 = {
1499 'Define' => [
1500 'DEBUGGING',
1501 'FOO=123'
1502 ],
1503 'StdCVersion' => 199901,
1504 'ByteOrder' => 'LittleEndian',
1505 'LongSize' => 4,
1506 'IntSize' => 4,
1507 'HostedC' => 1,
1508 'ShortSize' => 2,
1509 'HasMacroVAARGS' => 1,
1510 'Assert' => [],
1511 'UnsignedChars' => 0,
1512 'DoubleSize' => 8,
1513 'CharSize' => 1,
1514 'EnumType' => 'Integer',
1515 'PointerSize' => 4,
1516 'EnumSize' => 4,
1517 'DisabledKeywords' => [],
1518 'FloatSize' => 4,
1519 'Alignment' => 1,
1520 'LongLongSize' => 8,
1521 'LongDoubleSize' => 12,
1522 'KeywordMap' => {},
1523 'Include' => [
1524 '/usr/include'
1525 ],
1526 'HasCPPComments' => 1,
1527 'Bitfields' => {
1528 'Engine' => 'Generic'
1529 },
1530 'UnsignedBitfields' => 0,
1531 'Warnings' => 0,
1532 'CompoundAlignment' => 1,
1533 'OrderMembers' => 0
1534 };
1535
1536 Since you may not always want to write a "configure" call when
1537 you only want to change a single configuration item, you can
1538 use any configuration option name as a method name, like:
1539
1540 $c->ByteOrder('LittleEndian') if $c->IntSize < 4;
1541
1542 (Yes, the example doesn't make very much sense... ;-)
1543
However, you should keep in mind that configuration methods
that can take lists (namely "Include", "Define" and "Assert",
but not "DisabledKeywords") may behave slightly differently from
their "configure" equivalent. If you pass these methods a
1548 single argument that is an array reference, the current list
1549 will be replaced by the new one, which is just the behaviour of
1550 the corresponding "configure" call. So the following are
1551 equivalent:
1552
1553 $c->configure(Define => ['foo', 'bar=123']);
1554 $c->Define(['foo', 'bar=123']);
1555
1556 But if you pass a list of strings instead of an array reference
1557 (which cannot be done when using "configure"), the new list
1558 items are appended to the current list, so
1559
1560 $c = new Convert::Binary::C Include => ['/include'];
1561 $c->Include('/usr/include', '/usr/local/include');
1562 print Dumper($c->Include);
1563
1564 $c->Include(['/usr/local/include']);
1565 print Dumper($c->Include);
1566
1567 will first print all three include paths, but finally only
1568 "/usr/local/include" will be configured:
1569
1570 $VAR1 = [
1571 '/include',
1572 '/usr/include',
1573 '/usr/local/include'
1574 ];
1575 $VAR1 = [
1576 '/usr/local/include'
1577 ];
1578
1579 Furthermore, configuration methods can be chained together, as
1580 they return a reference to their object if called as a set
1581 method. So, if you like, you can configure your object like
1582 this:
1583
1584 $c = Convert::Binary::C->new(IntSize => 4)
1585 ->Define(qw( __DEBUG__ DB_LEVEL=3 ))
1586 ->ByteOrder('BigEndian');
1587
1588 $c->configure(EnumType => 'Both', Alignment => 4)
1589 ->Include('/usr/include', '/usr/local/include');
1590
1591 In the example above, "qw( ... )" is the word list quoting
1592 operator. It returns a list of all non-whitespace sequences,
1593 and is especially useful for configuring preprocessor defines
1594 or assertions. The following assignments are equivalent:
1595
1596 @array = ('one', 'two', 'three');
1597 @array = qw(one two three);
1598
1599 You can configure the following options. Unknown options, as
1600 well as invalid values for an option, will cause the object to
1601 throw exceptions.
1602
1603 "IntSize" => 0 | 1 | 2 | 4 | 8
1604 Set the number of bytes that are occupied by an integer.
1605 This is in most cases 2 or 4. If you set it to zero, the
1606 size of an integer on the host system will be used. This is
1607 also the default unless overridden by
1608 "CBC_DEFAULT_INT_SIZE" at compile time.
1609
1610 "CharSize" => 0 | 1 | 2 | 4 | 8
1611 Set the number of bytes that are occupied by a "char".
1612 This rarely needs to be changed, except for some platforms
1613 that don't care about bytes, for example DSPs. If you set
1614 this to zero, the size of a "char" on the host system will
1615 be used. This is also the default unless overridden by
1616 "CBC_DEFAULT_CHAR_SIZE" at compile time.
1617
1618 "ShortSize" => 0 | 1 | 2 | 4 | 8
1619 Set the number of bytes that are occupied by a short
integer. Although integers explicitly declared as "short"
should always be 16 bits wide, there are compilers that make
a short 8 bits wide. If you set it to zero, the size of a
1623 short integer on the host system will be used. This is also
1624 the default unless overridden by "CBC_DEFAULT_SHORT_SIZE"
1625 at compile time.
1626
1627 "LongSize" => 0 | 1 | 2 | 4 | 8
1628 Set the number of bytes that are occupied by a long
1629 integer. If set to zero, the size of a long integer on the
1630 host system will be used. This is also the default unless
1631 overridden by "CBC_DEFAULT_LONG_SIZE" at compile time.
1632
1633 "LongLongSize" => 0 | 1 | 2 | 4 | 8
1634 Set the number of bytes that are occupied by a long long
1635 integer. If set to zero, the size of a long long integer on
1636 the host system, or 8, will be used. This is also the
1637 default unless overridden by "CBC_DEFAULT_LONG_LONG_SIZE"
1638 at compile time.
1639
1640 "FloatSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1641 Set the number of bytes that are occupied by a single
1642 precision floating point value. If you set it to zero, the
1643 size of a "float" on the host system will be used. This is
1644 also the default unless overridden by
1645 "CBC_DEFAULT_FLOAT_SIZE" at compile time. For details on
1646 floating point support, see "FLOATING POINT VALUES".
1647
1648 "DoubleSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
1649 Set the number of bytes that are occupied by a double
1650 precision floating point value. If you set it to zero, the
1651 size of a "double" on the host system will be used. This is
1652 also the default unless overridden by
1653 "CBC_DEFAULT_DOUBLE_SIZE" at compile time. For details on
1654 floating point support, see "FLOATING POINT VALUES".
1655
1656 "LongDoubleSize" => 0 | 1 | 2 | 4 | 8 | 12 | 16
Set the number of bytes that are occupied by a "long
double" floating point value. If you set it to zero, the
size of a "long double" on the host system, or 12, will be
used. This is also the default unless overridden by
"CBC_DEFAULT_LONG_DOUBLE_SIZE" at compile time. For details
on floating point support, see "FLOATING POINT VALUES".
1663
1664 "PointerSize" => 0 | 1 | 2 | 4 | 8
1665 Set the number of bytes that are occupied by a pointer.
1666 This is in most cases 2 or 4. If you set it to zero, the
1667 size of a pointer on the host system will be used. This is
1668 also the default unless overridden by
1669 "CBC_DEFAULT_PTR_SIZE" at compile time.
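
The effect of the size options is easiest to see through "sizeof".
The following sketch uses a hypothetical "sizes" struct and disables
padding with "Alignment" set to 1, so that the member sizes simply add
up. The chosen option values are assumptions for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(
    CharSize  => 1,
    IntSize   => 2,
    LongSize  => 4,
    Alignment => 1,     # no padding, so sizes simply add up
);

# Hypothetical struct with one member per configured basic type.
$c->parse(<<'ENDC');
struct sizes {
  char c;
  int  i;
  long l;
};
ENDC

my $small = $c->sizeof('sizes');   # 1 + 2 + 4 bytes

# Size options can also be changed through the method of the same name.
$c->IntSize(4);
my $large = $c->sizeof('sizes');   # 1 + 4 + 4 bytes

print "IntSize 2: $small bytes, IntSize 4: $large bytes\n";
```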
1670
1671 "EnumSize" => -1 | 0 | 1 | 2 | 4 | 8
1672 Set the number of bytes that are occupied by an enumeration
1673 type. On most systems, this is equal to the size of an
1674 integer, which is also the default. However, for some
1675 compilers, the size of an enumeration type depends on the
1676 size occupied by the largest enumerator. So the size may
1677 vary between 1 and 8. If you have
1678
1679 enum foo {
1680 ONE = 100, TWO = 200
1681 };
1682
1683 this will occupy one byte because the enum can be
1684 represented as an unsigned one-byte value. However,
1685
1686 enum foo {
1687 ONE = -100, TWO = 200
1688 };
1689
1690 will occupy two bytes, because the -100 forces the type to
1691 be signed, and 200 doesn't fit into a signed one-byte
1692 value. Therefore, the type used is a signed two-byte
1693 value. If this is the behaviour you need, set the EnumSize
1694 to 0.
1695
1696 Some compilers try to follow this strategy, but don't care
1697 whether the enumeration has signed values or not. They
1698 always declare an enum as signed. On such a compiler, given
1699
1700 enum one { ONE = -100, TWO = 100 };
1701 enum two { ONE = 100, TWO = 200 };
1702
enum "one" will occupy only one byte, while enum "two" will
occupy two bytes, even though it could be represented by an
unsigned one-byte value. If this is the behaviour of your
compiler, set EnumSize to "-1".
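
The three strategies can be summarized in a few lines of plain Perl.
The "enum_bytes" function below is a hypothetical model of the rules
described above, written for illustration; it is not the code that
Convert::Binary::C uses internally.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Model of the rules described above (not the module's own code):
#   $enum_size  > 0 : fixed size in bytes
#   $enum_size == 0 : smallest type that fits, unsigned if possible
#   $enum_size == -1: smallest type that fits, always signed
sub enum_bytes {
    my ($enum_size, @values) = @_;
    return $enum_size if $enum_size > 0;

    my ($lo, $hi) = (min(@values), max(@values));
    my $signed = $enum_size < 0 || $lo < 0;

    for my $bytes (1, 2, 4) {
        my $bits = 8 * $bytes;
        my ($tmin, $tmax) = $signed
            ? (-2 ** ($bits - 1), 2 ** ($bits - 1) - 1)
            : (0, 2 ** $bits - 1);
        return $bytes if $lo >= $tmin && $hi <= $tmax;
    }
    return 8;    # anything larger needs the full eight bytes
}

printf "EnumSize  0, (100, 200):  %d\n", enum_bytes( 0,  100, 200);
printf "EnumSize  0, (-100, 200): %d\n", enum_bytes( 0, -100, 200);
printf "EnumSize -1, (-100, 100): %d\n", enum_bytes(-1, -100, 100);
printf "EnumSize -1, (100, 200):  %d\n", enum_bytes(-1,  100, 200);
```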
1707
1708 "Alignment" => 0 | 1 | 2 | 4 | 8 | 16
1709 Set the struct member alignment. This option controls where
1710 padding bytes are inserted between struct members. It
1711 globally sets the alignment for all structs/unions.
1712 However, this can be overridden from within the source code
1713 with the common "pack" pragma as explained in "Supported
1714 pragma directives". The default alignment is 1, which
1715 means no padding bytes are inserted. A setting of 0 means
1716 native alignment, i.e. the alignment of the system that
1717 Convert::Binary::C has been compiled on. You can determine
1718 the native properties using the "native" function.
1719
1720 The "Alignment" option is similar to the "-Zp[n]" option of
1721 the Intel compiler. It globally specifies the maximum
1722 boundary to which struct members are aligned. Consider the
1723 following structure and the sizes of "char", "short",
1724 "long" and "double" being 1, 2, 4 and 8, respectively.
1725
1726 struct align {
1727 char a;
1728 short b, c;
1729 long d;
1730 double e;
1731 };
1732
1733 With an alignment of 1 (the default), the struct members
1734 would be packed tightly:
1735
    0   1   2   3   4   5   6   7   8   9   10  11  12
    +---+---+---+---+---+---+---+---+---+---+---+---+
    | a |   b   |   c   |       d       |           ...
    +---+---+---+---+---+---+---+---+---+---+---+---+

    12  13  14  15  16  17
    +---+---+---+---+---+
...           e         |
    +---+---+---+---+---+
1745
1746 With an alignment of 2, the struct members larger than one
1747 byte would be aligned to 2-byte boundaries, which results
1748 in a single padding byte between "a" and "b".
1749
    0   1   2   3   4   5   6   7   8   9   10  11  12
    +---+---+---+---+---+---+---+---+---+---+---+---+
    | a | * |   b   |   c   |       d       |       ...
    +---+---+---+---+---+---+---+---+---+---+---+---+

    12  13  14  15  16  17  18
    +---+---+---+---+---+---+
...             e           |
    +---+---+---+---+---+---+
1759
1760 With an alignment of 4, the struct members of size 2 would
1761 be aligned to 2-byte boundaries and larger struct members
1762 would be aligned to 4-byte boundaries:
1763
    0   1   2   3   4   5   6   7   8   9   10  11  12
    +---+---+---+---+---+---+---+---+---+---+---+---+
    | a | * |   b   |   c   | * | * |       d       | ...
    +---+---+---+---+---+---+---+---+---+---+---+---+

    12  13  14  15  16  17  18  19  20
    +---+---+---+---+---+---+---+---+
... |               e               |
    +---+---+---+---+---+---+---+---+
1773
1774 This layout of the struct members allows the compiler to
1775 generate optimized code because aligned members can be
1776 accessed more easily by the underlying architecture.
1777
1778 Finally, setting the alignment to 8 will align "double"s to
1779 8-byte boundaries:
1780
    0   1   2   3   4   5   6   7   8   9   10  11  12
    +---+---+---+---+---+---+---+---+---+---+---+---+
    | a | * |   b   |   c   | * | * |       d       | ...
    +---+---+---+---+---+---+---+---+---+---+---+---+

    12  13  14  15  16  17  18  19  20  21  22  23  24
    +---+---+---+---+---+---+---+---+---+---+---+---+
... | * | * | * | * |               e               |
    +---+---+---+---+---+---+---+---+---+---+---+---+
1790
Further increasing the alignment does not alter the layout
of our structure, as only members larger than 8 bytes would
be affected.
1794
1795 The alignment of a structure depends on its largest member
1796 and on the setting of the "Alignment" option. With
1797 "Alignment" set to 2, a structure holding a "long" would be
1798 aligned to a 2-byte boundary, while a structure containing
1799 only "char"s would have no alignment restrictions.
1800 (Unfortunately, that's not the whole story. See the
1801 "CompoundAlignment" option for details.)
1802
1803 Here's another example. Assuming 8-byte alignment, the
1804 following two structs will both have a size of 16 bytes:
1805
1806 struct one {
1807 char c;
1808 double d;
1809 };
1810
1811 struct two {
1812 double d;
1813 char c;
1814 };
1815
This is clear for "struct one", because the member "d" has
to be aligned to an 8-byte boundary, and thus 7 padding
bytes are inserted after "c". But for "struct two", the
padding bytes are inserted at the end of the structure,
which may not seem to make sense at first. However, it
makes perfect sense if you think about an array of "struct
two". Each "double" has to be aligned to an 8-byte
boundary, and thus each array element has to occupy
16 bytes. With that in mind, it would be strange if a
"struct two" variable had a different size. And it
1826 would make the widely used construct
1827
1828 struct two array[] = { {1.0, 0}, {2.0, 1} };
1829 int elements = sizeof(array) / sizeof(struct two);
1830
1831 impossible.
1832
1833 The alignment behaviour described here seems to be common
1834 for all compilers. However, not all compilers have an
1835 option to configure their default alignment.
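
The layouts shown for the "Alignment" option follow from a simple
rule that can be modelled in plain Perl. The "layout" function below
is a hypothetical helper written for this illustration, not part of
Convert::Binary::C. It assumes that every member is a scalar whose
natural alignment equals its size.

```perl
use strict;
use warnings;

# Simplified model: each member is a scalar whose natural alignment
# equals its size; the 'Alignment' setting caps that value.
sub layout {
    my ($alignment, @members) = @_;      # members: [ name => size ]
    my ($offset, $struct_align) = (0, 1);
    my %offset_of;

    for my $member (@members) {
        my ($name, $size) = @$member;
        my $align = $size < $alignment ? $size : $alignment;
        $struct_align = $align if $align > $struct_align;

        # Insert padding so that the member starts on its boundary.
        $offset += ($align - $offset % $align) % $align;
        $offset_of{$name} = $offset;
        $offset += $size;
    }

    # Pad the end so that array elements stay aligned as well.
    $offset += ($struct_align - $offset % $struct_align) % $struct_align;
    return ($offset, \%offset_of);
}

# struct align from above: char a; short b, c; long d; double e;
my @align = ([a => 1], [b => 2], [c => 2], [d => 4], [e => 8]);

my %size_for;
for my $alignment (1, 2, 4, 8) {
    my ($size, $off) = layout($alignment, @align);
    $size_for{$alignment} = $size;
    printf "Alignment %d: size %2d, d at %2d, e at %2d\n",
        $alignment, $size, $off->{d}, $off->{e};
}

# struct one and struct two with 8-byte alignment.
my ($size_one) = layout(8, [c => 1], [d => 8]);
my ($size_two) = layout(8, [d => 8], [c => 1]);
print "struct one: $size_one bytes, struct two: $size_two bytes\n";
```

For the example structures, the model reproduces the sizes from the
diagrams (17, 18, 20 and 24 bytes) and 16 bytes for both "struct one"
and "struct two".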
1836
1837 "CompoundAlignment" => 0 | 1 | 2 | 4 | 8 | 16
1838 Usually, the alignment of a compound (i.e. a "struct" or a
1839 "union") depends only on its largest member and on the
1840 setting of the "Alignment" option. There are, however,
1841 architectures and compilers where compounds can have
1842 different alignment constraints.
1843
1844 For most platforms and compilers, the alignment constraint
1845 for compounds is 1 byte. That is, on most platforms
1846
1847 struct onebyte {
1848 char byte;
1849 };
1850
1851 will have an alignment of 1 and also a size of 1. But if
1852 you take an ARM architecture, the above "struct onebyte"
1853 will have an alignment of 4, and thus also a size of 4.
1854
1855 You can configure this by setting "CompoundAlignment" to 4.
1856 This will ensure that the alignment of compounds is always
1857 4.
1858
1859 Setting "CompoundAlignment" to 0 means native compound
1860 alignment, i.e. the compound alignment of the system that
1861 Convert::Binary::C has been compiled on. You can determine
1862 the native properties using the "native" function.
1863
1864 There are also compilers for certain platforms that allow
1865 you to adjust the compound alignment. If you're not aware
1866 of the fact that your compiler/architecture has a compound
1867 alignment other than 1, strange things can happen. If, for
1868 example, the compound alignment is 2 and you have something
1869 like
1870
1871 typedef unsigned char U8;
1872
1873 struct msg_head {
1874 U8 cmd;
1875 struct {
1876 U8 hi;
1877 U8 low;
1878 } crc16;
1879 U8 len;
1880 };
1881
1882 there will be one padding byte inserted before the embedded
1883 "crc16" struct and after the "len" member, which is most
1884 probably not what was intended:
1885
    0     1     2     3     4     5     6
    +-----+-----+-----+-----+-----+-----+
    | cmd |  *  | hi  | low | len |  *  |
    +-----+-----+-----+-----+-----+-----+
1890
1891 Note that both "#pragma pack" and the "Alignment" option
1892 can override "CompoundAlignment". If you set
1893 "CompoundAlignment" to 4, but "Alignment" to 2, compounds
1894 will actually be aligned on 2-byte boundaries.
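
The "msg_head" example can be checked with "sizeof" and "offsetof".
The sketch below sets "Alignment" to 4 (an assumption for this
example) so that it does not cap the compound alignment, and then
compares a "CompoundAlignment" of 2 with a "CompoundAlignment" of 1.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# 'Alignment' must be at least as large as 'CompoundAlignment',
# otherwise it would cap the compound alignment again.
my $c = Convert::Binary::C->new(CharSize => 1, Alignment => 4);

$c->parse(<<'ENDC');
typedef unsigned char U8;

struct msg_head {
  U8 cmd;
  struct {
    U8 hi;
    U8 low;
  } crc16;
  U8 len;
};
ENDC

# Collect size and member offsets for both settings.
my %result;
for my $ca (2, 1) {
    $c->CompoundAlignment($ca);
    $result{$ca} = {
        size  => $c->sizeof('msg_head'),
        crc16 => $c->offsetof('msg_head', 'crc16'),
        len   => $c->offsetof('msg_head', 'len'),
    };
    printf "CompoundAlignment %d: size %d, crc16 at %d, len at %d\n",
        $ca, @{ $result{$ca} }{qw(size crc16 len)};
}
```

According to the layout shown above, a compound alignment of 2 should
give a size of 6 bytes, while a compound alignment of 1 should pack
the four members into 4 bytes.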
1895
1896 "ByteOrder" => 'BigEndian' | 'LittleEndian'
1897 Set the byte order for integers larger than a single byte.
1898 Little endian (Intel, least significant byte first) and big
1899 endian (Motorola, most significant byte first) byte order
1900 are supported. The default byte order is the same as the
1901 byte order of the host system unless overridden by
1902 "CBC_DEFAULT_BYTEORDER" at compile time.
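
A quick way to see the difference is to pack the same value with both
settings and look at the bytes. The "u_32" typedef and the "IntSize"
of 4 are assumptions made for this sketch.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize => 4);
$c->parse('typedef unsigned int u_32;');

# Pack the same integer with both byte orders and show the bytes
# as a hex string.
my %hex;
for my $order (qw(BigEndian LittleEndian)) {
    $c->ByteOrder($order);
    $hex{$order} = unpack 'H*', $c->pack('u_32', 0x12345678);
    print "$order: $hex{$order}\n";
}
```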
1903
1904 "EnumType" => 'Integer' | 'String' | 'Both'
1905 This option controls the type that enumeration constants
1906 will have in data structures returned by the "unpack"
1907 method. If you have the following definitions:
1908
1909 typedef enum {
1910 SUNDAY, MONDAY, TUESDAY, WEDNESDAY,
1911 THURSDAY, FRIDAY, SATURDAY
1912 } Weekday;
1913
1914 typedef enum {
1915 JANUARY, FEBRUARY, MARCH, APRIL, MAY, JUNE, JULY,
1916 AUGUST, SEPTEMBER, OCTOBER, NOVEMBER, DECEMBER
1917 } Month;
1918
1919 typedef struct {
1920 int year;
1921 Month month;
1922 int day;
1923 Weekday weekday;
1924 } Date;
1925
1926 and a byte string that holds a packed Date struct, then
1927 you'll get the following results from a call to the
1928 "unpack" method.
1929
1930 "Integer"
Enumeration constants are returned as plain integers.
This is fast, but may not be very useful. It is also
the default.
1934
1935 $date = {
1936 'weekday' => 1,
1937 'month' => 0,
1938 'day' => 7,
1939 'year' => 2002
1940 };
1941
1942 "String"
1943 Enumeration constants are returned as strings. This
1944 will create a string constant for every unpacked
1945 enumeration constant and thus consumes more time and
1946 memory. However, the result may be more useful.
1947
1948 $date = {
1949 'weekday' => 'MONDAY',
1950 'month' => 'JANUARY',
1951 'day' => 7,
1952 'year' => 2002
1953 };
1954
1955 "Both"
1956 Enumeration constants are returned as dual-typed
1957 scalars. If evaluated in string context, the
1958 enumeration constant will be a string, if evaluated in
1959 numeric context, the enumeration constant will be an
1960 integer.
1961
1962 $date = $c->EnumType('Both')->unpack('Date', $binary);
1963
1964 printf "Weekday = %s (%d)\n\n", $date->{weekday},
1965 $date->{weekday};
1966
1967 if ($date->{month} == 0) {
1968 print "It's $date->{month}, happy new year!\n\n";
1969 }
1970
1971 print Dumper($date);
1972
1973 This will print:
1974
1975 Weekday = MONDAY (1)
1976
1977 It's JANUARY, happy new year!
1978
1979 $VAR1 = {
1980 'weekday' => 'MONDAY',
1981 'month' => 'JANUARY',
1982 'day' => 7,
1983 'year' => 2002
1984 };
1985
1986 "DisabledKeywords" => [ KEYWORDS ]
1987 This option allows you to selectively deactivate certain
1988 keywords in the C parser. Some C compilers don't have the
1989 complete ANSI keyword set; for example, they may not
1990 recognize the keywords "const" or "void". If you do
1991
1992 typedef int void;
1993
1994 on such a compiler, this will usually be ok. But if you
1995 parse this with an ANSI compiler, it will be a syntax
1996 error. To parse the above code correctly, you have to
1997 disable the "void" keyword in the Convert::Binary::C
1998 parser:
1999
2000 $c->DisabledKeywords([qw( void )]);
2001
2002 By default, the Convert::Binary::C parser will recognize
2003 the keywords "inline" and "restrict". If your compiler
2004 doesn't have these new keywords, it usually doesn't matter.
2005 Only if you're using the keywords as identifiers, like in
2006
2007 typedef struct inline {
2008 int a, b;
2009 } restrict;
2010
2011 you'll have to disable these ISO-C99 keywords:
2012
2013 $c->DisabledKeywords([qw( inline restrict )]);
2014
2015 The parser allows you to disable the following keywords:
2016
2017 asm
2018 auto
2019 const
2020 double
2021 enum
2022 extern
2023 float
2024 inline
2025 long
2026 register
2027 restrict
2028 short
2029 signed
2030 static
2031 unsigned
2032 void
2033 volatile
2034
2035 "KeywordMap" => { KEYWORD => TOKEN, ... }
2036 This option allows you to add new keywords to the parser.
2037 These new keywords can either be mapped to existing tokens
2038 or simply ignored. For example, recent versions of the GNU
2039 compiler recognize the keywords "__signed__" and
2040 "__extension__". The first one obviously is a synonym for
2041 "signed", while the second one is only a marker for a
2042 language extension.
2043
2044 Using the preprocessor, you could of course do the
2045 following:
2046
2047 $c->Define(qw( __signed__=signed __extension__= ));
2048
2049 However, the preprocessor symbols could be undefined or
2050 redefined in the code, and
2051
2052 #ifdef __signed__
2053 # undef __signed__
2054 #endif
2055
2056 typedef __extension__ __signed__ long long s_quad;
2057
2058 would generate a parse error, because "__signed__" is an
2059 unexpected identifier.
2060
2061 Instead of utilizing the preprocessor, you'll have to
2062 create mappings for the new keywords directly in the parser
2063 using "KeywordMap". In the above example, you want to map
2064 "__signed__" to the built-in C keyword "signed" and ignore
2065 "__extension__". This could be done with the following
2066 code:
2067
2068 $c->KeywordMap({ __signed__ => 'signed',
2069 __extension__ => undef });
2070
2071 You can specify any valid identifier as hash key, and
2072 either a valid C keyword or "undef" as hash value. Having
2073 configured the object that way, you could parse even
2074
2075 #ifdef __signed__
2076 # undef __signed__
2077 #endif
2078
2079 typedef __extension__ __signed__ long long s_quad;
2080
2081 without problems.
2082
2083 Note that "KeywordMap" and "DisabledKeywords" work
2084 together perfectly. You could, for example, disable the "signed"
2085 keyword, but still have "__signed__" mapped to the original
2086 "signed" token:
2087
2088 $c->configure(DisabledKeywords => [ 'signed' ],
2089 KeywordMap => { __signed__ => 'signed' });
2090
2091 This would allow you to define
2092
2093 typedef __signed__ long signed;
2094
2095 which would normally be a syntax error because "signed"
2096 cannot be used as an identifier.
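
Putting the pieces together, a complete script for the "s_quad" case might look like the sketch below. It only uses the mapping described above; the "eval" wrapper is an addition so that a parse error would be reported instead of ending the script.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Map __signed__ to 'signed' and ignore __extension__ in the parser.
my $c = Convert::Binary::C->new(
  KeywordMap => { __signed__ => 'signed', __extension__ => undef },
);

# The #undef would defeat a preprocessor-based workaround,
# but it does not affect keyword mappings in the parser.
my $ok = eval {
  $c->parse(<<'ENDC');
#ifdef __signed__
# undef __signed__
#endif

typedef __extension__ __signed__ long long s_quad;
ENDC
  1;
};

print $ok ? "parsed, s_quad is a " . $c->def('s_quad') . "\n"
          : "parse failed: $@";
```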
2097
2098 "UnsignedChars" => 0 | 1
2099 Use this boolean option if you want characters to be
2100 unsigned if specified without an explicit "signed" or
2101 "unsigned" type specifier. By default, characters are
2102 signed.
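
The effect is easiest to see when unpacking a byte with the high bit set. In this sketch the typedef name "byte" is an assumption made for the example:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse('typedef char byte;');

# Unpack the same byte with plain char treated as signed, then unsigned.
my %value;
for my $flag (0, 1) {
  $c->UnsignedChars($flag);
  $value{$flag} = $c->unpack('byte', "\xFF");
  print "UnsignedChars => $flag: $value{$flag}\n";
}
```

With signed characters the byte 0xFF should be read as -1, with unsigned characters as 255.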
2103
2104 "UnsignedBitfields" => 0 | 1
2105 Use this boolean option if you want bitfields to be
2106 unsigned if specified without an explicit "signed" or
2107 "unsigned" type specifier. By default, bitfields are
2108 signed.
2109
2110 "Warnings" => 0 | 1
2111 Use this boolean option if you want warnings to be issued
2112 during the parsing of source code. Currently, warnings are
2113 only reported by the preprocessor, so don't expect the
2114 output to cover everything.
2115
2116 By default, warnings are turned off and only errors will be
2117 reported. However, even these errors are turned off if you
2118 run without the "-w" flag.
2119
2120 "HasCPPComments" => 0 | 1
2121 Use this option to turn C++ comments on or off. By default,
2122 C++ comments are enabled. Disabling C++ comments may be
2123 necessary if your code includes strange things like:
2124
2125 one = 4 //* <- divide */ 4;
2126 two = 2;
2127
2128 With C++ comments, the above will be interpreted as
2129
2130 one = 4
2131 two = 2;
2132
2133 which will obviously be a syntax error, but without C++
2134 comments, it will be interpreted as
2135
2136 one = 4 / 4;
2137 two = 2;
2138
2139 which is correct.
2140
2141 "HasMacroVAARGS" => 0 | 1
2142 Use this option to turn the "__VA_ARGS__" macro expansion
2143 on or off. If this is enabled (which is the default), you
2144 can use variable length argument lists in your preprocessor
2145 macros.
2146
2147 #define DEBUG( ... ) fprintf( stderr, __VA_ARGS__ )
2148
2149 There's normally no reason to turn that feature off.
2150
2151 "StdCVersion" => undef | INTEGER
2152 Use this option to change the value of the preprocessor's
2153 predefined "__STDC_VERSION__" macro. When set to "undef",
2154 the macro will not be defined.
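
One way to observe the option is to parse code that tests the macro. The following is a minimal sketch; the type names "c99_mode" and "c89_mode" are made up for the example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $code = <<'ENDC';
#if __STDC_VERSION__ >= 199901L
typedef int c99_mode;
#else
typedef int c89_mode;
#endif
ENDC

# Parse the same code with the macro set and with the macro undefined.
my %seen;
for my $version (199901, undef) {
  my $c = Convert::Binary::C->new(StdCVersion => $version);
  $c->parse($code);
  my $label = defined $version ? $version : 'undef';
  $seen{$label} = $c->def('c99_mode') ? 'c99_mode' : 'c89_mode';
  print "StdCVersion => $label defines $seen{$label}\n";
}
```

When the macro is not defined, the preprocessor treats it as 0 inside "#if", so the "#else" branch is taken.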
2155
2156 "HostedC" => undef | 0 | 1
2157 Use this option to change the value of the preprocessor's
2158 predefined "__STDC_HOSTED__" macro. When set to "undef",
2159 the macro will not be defined.
2160
2161 "Include" => [ INCLUDES ]
2162 Use this option to set the include path for the internal
2163 preprocessor. The option value is a reference to an array
2164 of strings, each string holding a directory that should be
2165 searched for includes.
2166
2167 "Define" => [ DEFINES ]
2168 Use this option to define symbols in the preprocessor. The
2169 option value is, again, a reference to an array of strings.
2170 Each string can be either just a symbol or an assignment to
2171 a symbol. This is completely equivalent to what the "-D"
2172 option does for most preprocessors.
2173
2174 The following will define the symbol "FOO" and define "BAR"
2175 to be 12345:
2176
2177 $c->configure(Define => [qw( FOO BAR=12345 )]);
2178
2179 "Assert" => [ ASSERTIONS ]
2180 Use this option to make assertions in the preprocessor. If
2181 you don't know what assertions are, don't be concerned,
2182 since they're deprecated anyway. They are, however, used in
2183 some systems' include files. The value is an array
2184 reference, just like for the macro definitions. Only the
2185 way the assertions are defined is a bit different and
2186 mimics the way they are defined with the "#assert"
2187 directive:
2188
2189 $c->configure(Assert => ['foo(bar)']);
2190
2191 "OrderMembers" => 0 | 1
2192 When using "unpack" on compounds and iterating over the
2193 returned hash, the order of the compound members is
2194 generally not preserved due to the nature of hash tables.
2195 It is not even guaranteed that the order is the same
2196 between different runs of the same program. This can be
2197 very annoying if you simply use Data::Dumper to dump your data
2198 structures and the compound members always show up in a
2199 different order.
2200
2201 By setting "OrderMembers" to a non-zero value, all hashes
2202 returned by "unpack" are tied to a class that preserves the
2203 order of the hash keys. This way, all compound members
2204 will be returned in the correct order just as they are
2205 defined in your C code.
2206
2207 use Convert::Binary::C;
2208 use Data::Dumper;
2209
2210 $c = Convert::Binary::C->new->parse(<<'ENDC');
2211 struct test {
2212 char one;
2213 char two;
2214 struct {
2215 char never;
2216 char change;
2217 char this;
2218 char order;
2219 } three;
2220 char four;
2221 };
2222 ENDC
2223
2224 $data = "Convert";
2225
2226 $u1 = $c->unpack('test', $data);
2227 $c->OrderMembers(1);
2228 $u2 = $c->unpack('test', $data);
2229
2230 print Data::Dumper->Dump([$u1, $u2], [qw(u1 u2)]);
2231
2232 This will print something like:
2233
2234 $u1 = {
2235 'three' => {
2236 'change' => 118,
2237 'order' => 114,
2238 'this' => 101,
2239 'never' => 110
2240 },
2241 'one' => 67,
2242 'two' => 111,
2243 'four' => 116
2244 };
2245 $u2 = {
2246 'one' => 67,
2247 'two' => 111,
2248 'three' => {
2249 'never' => 110,
2250 'change' => 118,
2251 'this' => 101,
2252 'order' => 114
2253 },
2254 'four' => 116
2255 };
2256
2257 To be able to use this option, you have to install either
2258 the Tie::Hash::Indexed or the Tie::IxHash module. If both
2259 are installed, Convert::Binary::C will give preference to
2260 Tie::Hash::Indexed because it's faster.
2261
2262 When using this option, you should keep in mind that tied
2263 hashes are significantly slower and consume more memory
2264 than ordinary hashes, even when the class they're tied to
2265 is implemented efficiently. So don't turn this option on if
2266 you don't have to.
2267
2268 You can also influence hash member ordering by using the
2269 "CBC_ORDER_MEMBERS" environment variable.
2270
2271 "Bitfields" => { OPTION => VALUE, ... }
2272 Use this option to specify and configure a bitfield
2273 layouting engine. You can choose an engine by passing its
2274 name to the "Engine" option, like:
2275
2276 $c->configure(Bitfields => { Engine => 'Generic' });
2277
2278 Each engine can have its own set of options, although
2279 currently none of them does.
2280
2281 You can choose between the following bitfield engines:
2282
2283 "Generic"
2284 This engine implements the behaviour of most UNIX C
2285 compilers, including GCC. It does not handle packed
2286 bitfields yet.
2287
2288 "Microsoft"
2289 This engine implements the behaviour of Microsoft's
2290 "cl" compiler. It should be fairly complete and can
2291 handle packed bitfields.
2292
2293 "Simple"
2294 This engine is only used for testing the bitfield
2295 infrastructure in Convert::Binary::C. There's usually
2296 no reason to use it.
2297
2298 You can reconfigure all options even after you have parsed some
2299 code. The changes will be applied to the already parsed
2300 definitions. This works as long as array lengths are not
2301 affected by the changes. If you have Alignment and IntSize set
2302 to 4 and parse code like this
2303
2304 typedef struct {
2305 char abc;
2306 int day;
2307 } foo;
2308
2309 struct bar {
2310 foo zap[2*sizeof(foo)];
2311 };
2312
2313 the array "zap" in "struct bar" will obviously have 16
2314 elements. If you reconfigure the alignment to 1 now, the size
2315 of "foo" is now 5 instead of 8. While the alignment is adjusted
2316 correctly, the number of elements in array "zap" will still be
2317 16 and will not be changed to 10.
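
The following sketch reproduces this scenario and reports the sizes before and after the reconfiguration:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(Alignment => 4, IntSize => 4);
$c->parse(<<'ENDC');
typedef struct {
  char abc;
  int  day;
} foo;

struct bar {
  foo zap[2*sizeof(foo)];
};
ENDC

my @before = ($c->sizeof('foo'), $c->sizeof('bar'));

# Reconfigure after parsing; the array length of zap stays as parsed.
$c->Alignment(1);
my @after = ($c->sizeof('foo'), $c->sizeof('bar'));

printf "Alignment 4: sizeof(foo) = %d, sizeof(bar) = %d\n", @before;
printf "Alignment 1: sizeof(foo) = %d, sizeof(bar) = %d\n", @after;
```

Going by the description above, "struct bar" should shrink from 16 * 8 to 16 * 5 bytes, not to 10 * 5 bytes.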
2318
2319 parse
2320 "parse" CODE
2321 Parses a string of valid C code. All enumeration, compound and
2322 type definitions are extracted. You can call the "parse" and
2323 "parse_file" methods as often as you like to add further
2324 definitions to the Convert::Binary::C object.
2325
2326 "parse" will throw an exception if an error occurs. On
2327 success, the method returns a reference to its object.
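
Since errors are reported as exceptions, a script that must survive bad input can wrap the call in "eval". A minimal sketch, using deliberately broken code:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new;

# The closing brace and semicolon are missing on purpose.
my $ok = eval { $c->parse('struct broken { int a;'); 1 };

print $ok ? "parsed without errors\n" : "parse failed: $@";
```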
2328
2329 See "Parsing C code" for an example.
2330
2331 parse_file
2332 "parse_file" FILE
2333 Parses a C source file. All enumeration, compound and type
2334 definitions are extracted. You can call the "parse" and
2335 "parse_file" methods as often as you like to add further
2336 definitions to the Convert::Binary::C object.
2337
2338 "parse_file" will search the include path given via the
2339 "Include" option for the file if it cannot find it in the
2340 current directory.
2341
2342 "parse_file" will throw an exception if an error occurs. On
2343 success, the method returns a reference to its object.
2344
2345 See "Parsing C code" for an example.
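
The include path search can be tried out without touching any system headers. The sketch below writes a small header to a temporary directory; the file name "cbc_demo_types.h" and the type "demo_pair" are assumptions made for the example.

```perl
use strict;
use warnings;
use Convert::Binary::C;
use File::Spec;
use File::Temp qw( tempdir );

# Write a small header into a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'cbc_demo_types.h');
open my $fh, '>', $file or die "Cannot write $file: $!";
print $fh "typedef struct { char tag; char value; } demo_pair;\n";
close $fh or die "Cannot close $file: $!";

# Only the bare file name is passed; the directory comes from Include.
my $c  = Convert::Binary::C->new(Include => [$dir]);
my $ok = eval { $c->parse_file('cbc_demo_types.h'); 1 };

print $ok ? "demo_pair is a " . $c->def('demo_pair') . "\n"
          : "parse_file failed: $@";
```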
2346
2347 When calling "parse" or "parse_file" multiple times, you may
2348 use types previously defined, but you are not allowed to
2349 redefine types. The state of the preprocessor is also saved, so
2350 you may also use defines from a previous parse. This works only
2351 as long as the preprocessor is not reset. See "Preprocessor
2352 configuration" for details.
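
A short sketch of two consecutive "parse" calls, where the second call relies on a typedef and a macro from the first one (the names "COUNT", "counter" and "stats" are made up for the example):

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize => 4);

# First call: a macro and a typedef.
$c->parse(<<'ENDC');
#define COUNT 3
typedef int counter;
ENDC

# Second call: reuses both the typedef and the macro from the first call.
$c->parse(<<'ENDC');
struct stats {
  counter hits[COUNT];
};
ENDC

my $size = $c->sizeof('stats');
print "sizeof(struct stats) = $size\n";
```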
2353
2354 When you're parsing C source files instead of C header files,
2355 note that local definitions are ignored. This means that type
2356 definitions hidden within functions will not be recognized by
2357 Convert::Binary::C. This is necessary because different
2358 functions (even different blocks within the same function) can
2359 define types with the same name:
2360
2361 void my_func(int i)
2362 {
2363 if (i < 10)
2364 {
2365 enum digit { ONE, TWO, THREE } x = ONE;
2366 printf("%d, %d\n", i, x);
2367 }
2368 else
2369 {
2370 enum digit { THREE, TWO, ONE } x = ONE;
2371 printf("%d, %d\n", i, x);
2372 }
2373 }
2374
2375 The above is a valid piece of C code, but it's not possible for
2376 Convert::Binary::C to distinguish between the different
2377 definitions of "enum digit", as they're only defined locally
2378 within the corresponding block.
2379
2380 clean
2381 "clean" Clears all information that has been collected during previous
2382 calls to "parse" or "parse_file". You can use this method if
2383 you want to parse some entirely different code, but with the
2384 same configuration.
2385
2386 The "clean" method returns a reference to its object.
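
For example, "clean" makes it possible to reuse a type name with a different definition while keeping the configuration. A minimal sketch, with a made-up typedef name:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(IntSize => 4);

$c->parse('typedef int value;');
my $first = $c->sizeof('value');

# Without clean, the second typedef would redefine 'value'.
$c->clean->parse('typedef char value;');
my $second = $c->sizeof('value');

print "before clean: $first, after clean: $second\n";
print "IntSize is still ", $c->IntSize, "\n";
```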
2387
2388 clone
2389 "clone" Makes the object return an exact independent copy of itself.
2390
2391 $c = new Convert::Binary::C Include => ['/usr/include'];
2392 $c->parse_file('definitions.c');
2393 $clone = $c->clone;
2394
2395 The above code is mostly equivalent to the code below. The
2396 difference is that using "sourcify" and "parse" might alter the
2397 order of the parsed data, which would make methods such as
2398 "compound" return the definitions in a different order:
2399
2400 $c = new Convert::Binary::C Include => ['/usr/include'];
2401 $c->parse_file('definitions.c');
2402 $clone = new Convert::Binary::C %{$c->configure};
2403 $clone->parse($c->sourcify);
2404
2405 Using "clone" is just a lot faster.
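
That the copy is independent can be illustrated with a sketch like this one, which changes only the clone and then compares both objects (the typedef names are assumptions):

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian');
$c->parse('typedef int original;');

my $clone = $c->clone;

# Changes to the clone should not leak back into the original.
$clone->ByteOrder('LittleEndian');
$clone->parse('typedef int extra;');

printf "original: ByteOrder=%s, knows 'extra': %s\n",
       $c->ByteOrder, $c->def('extra') ? 'yes' : 'no';
printf "clone:    ByteOrder=%s, knows 'extra': %s\n",
       $clone->ByteOrder, $clone->def('extra') ? 'yes' : 'no';
```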
2406
2407 def
2408 "def" NAME
2409 "def" TYPE
2410 If you need to know if a definition for a certain type name
2411 exists, use this method. You pass it the name of an enum,
2412 struct, union or typedef, and it will return a non-empty string
2413 being either "enum", "struct", "union", or "typedef" if there's
2414 a definition for the type in question, an empty string if
2415 there's no such definition, or "undef" if the name is
2416 completely unknown. If the type can be interpreted as a basic
2417 type, "basic" will be returned.
2418
2419 If you pass in a TYPE, the output will be slightly different.
2420 If the specified member exists, the "def" method will return
2421 "member". If the member doesn't exist, or if the type cannot
2422 have members, the empty string will be returned. Again, if the
2423 name of the type is completely unknown, "undef" will be
2424 returned. This may be useful if you want to check if a certain
2425 member exists within a compound, for example.
2426
2427 use Convert::Binary::C;
2428
2429 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2430
2431 typedef struct __not not;
2432 typedef struct __not *ptr;
2433
2434 struct foo {
2435 enum bar *xxx;
2436 };
2437
2438 typedef int quad[4];
2439
2440 ENDC
2441
2442 for my $type (qw( not ptr foo bar xxx foo.xxx foo.abc xxx.yyy
2443 quad quad[3] quad[5] quad[-3] short[1] ),
2444 'unsigned long')
2445 {
2446 my $def = $c->def($type);
2447 printf "%-14s => %s\n",
2448 $type, defined $def ? "'$def'" : 'undef';
2449 }
2450
2451 The following would be returned by the "def" method:
2452
2453 not => ''
2454 ptr => 'typedef'
2455 foo => 'struct'
2456 bar => ''
2457 xxx => undef
2458 foo.xxx => 'member'
2459 foo.abc => ''
2460 xxx.yyy => undef
2461 quad => 'typedef'
2462 quad[3] => 'member'
2463 quad[5] => 'member'
2464 quad[-3] => 'member'
2465 short[1] => undef
2466 unsigned long => 'basic'
2467
2468 So, if "def" returns a non-empty string, you can safely use any
2469 other method with that type's name or with that member
2470 expression.
2471
2472 Concerning arrays, note that the index into an array doesn't
2473 need to be within the bounds of the array's definition, just
2474 like in C. In the above example, "quad[5]" and "quad[-3]" are
2475 valid members of the "quad" array, even though it is declared
2476 to have only four elements.
2477
2478 In cases where the typedef namespace overlaps with the
2479 namespace of enums/structs/unions, the "def" method will give
2480 preference to the typedef and will thus return the string
2481 "typedef". You could however force interpretation as an enum,
2482 struct or union by putting "enum", "struct" or "union" in front
2483 of the type's name.
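
A small sketch of such an overlap, using the made-up name "item" for both a struct and a typedef:

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct item { int id; };
typedef int item;
ENDC

# The bare name resolves to the typedef, the prefixed name to the struct.
my %kind = map { $_ => $c->def($_) } ('item', 'struct item');

printf "%-12s => '%s'\n", $_, $kind{$_} for 'item', 'struct item';
```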
2484
2485 defined
2486 "defined" MACRO
2487 You can use the "defined" method to find out if a certain macro
2488 is defined, just like you would use the "defined" operator of
2489 the preprocessor. For example, the following code
2490
2491 use Convert::Binary::C;
2492
2493 my $c = Convert::Binary::C->new->parse(<<'ENDC');
2494
2495 #define ADD(a, b) ((a) + (b))
2496
2497 #if 1
2498 # define DEFINED
2499 #else
2500 # define UNDEFINED
2501 #endif
2502
2503 ENDC
2504
2505 for my $macro (qw( ADD DEFINED UNDEFINED )) {
2506 my $not = $c->defined($macro) ? '' : ' not';
2507 print "Macro '$macro' is$not defined.\n";
2508 }
2509
2510 would print:
2511
2512 Macro 'ADD' is defined.
2513 Macro 'DEFINED' is defined.
2514 Macro 'UNDEFINED' is not defined.
2515
2516 You have to keep in mind that this works only as long as the
2517 preprocessor is not reset. See "Preprocessor configuration" for
2518 details.
2519
2520 pack
2521 "pack" TYPE
2522 "pack" TYPE, DATA
2523 "pack" TYPE, DATA, STRING
2524 Use this method to pack a complex data structure into a binary
2525 string according to a type definition that has been previously
2526 parsed. DATA must be a scalar matching the type definition. C
2527 structures and unions are represented by references to Perl
2528 hashes, C arrays by references to Perl arrays.
2529
2530 use Convert::Binary::C;
2531 use Data::Dumper;
2532 use Data::Hexdumper;
2533
2534 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2535 , LongSize => 4
2536 , ShortSize => 2
2537 )
2538 ->parse(<<'ENDC');
2539 struct test {
2540 char ary[3];
2541 union {
2542 short word[2];
2543 long quad;
2544 } uni;
2545 };
2546 ENDC
2547
2548 Hashes don't have to contain a key for each compound member and
2549 arrays may be truncated:
2550
2551 $binary = $c->pack('test', { ary => [1, 2], uni => { quad => 42 } });
2552
2553 Elements not defined in the Perl data structure will be set to
2554 zero in the packed byte string. If you pass "undef" as the
2555 second parameter, or simply omit it, the whole string will be
2556 initialized with zero bytes. On success, the packed byte string
2557 is returned.
2558
2559 print hexdump(data => $binary);
2560
2561 The above code would print:
2562
2563 0x0000 : 01 02 00 00 00 00 2A : ......*
2564
2565 You could also use "unpack" and dump the data structure.
2566
2567 $unpacked = $c->unpack('test', $binary);
2568 print Data::Dumper->Dump([$unpacked], ['unpacked']);
2569
2570 This would print:
2571
2572 $unpacked = {
2573 'uni' => {
2574 'word' => [
2575 0,
2576 42
2577 ],
2578 'quad' => 42
2579 },
2580 'ary' => [
2581 1,
2582 2,
2583 0
2584 ]
2585 };
2586
2587 If TYPE refers to a compound object, you may pack any member of
2588 that compound object. Simply add a member expression to the
2589 type name, just as you would access the member in C:
2590
2591 $array = $c->pack('test.ary', [1, 2, 3]);
2592 print hexdump(data => $array);
2593
2594 $value = $c->pack('test.uni.word[1]', 2);
2595 print hexdump(data => $value);
2596
2597 This would give you:
2598
2599 0x0000 : 01 02 03 : ...
2600 0x0000 : 00 02 : ..
2601
2602 Call "pack" with the optional STRING argument if you want to
2603 use an existing binary string to insert the data. If called in
2604 a void context, "pack" will directly modify the string you
2605 passed as the third argument. Otherwise, a copy of the string
2606 is created, and "pack" will modify and return the copy, so the
2607 original string will remain unchanged.
2608
2609 The 3-argument version may be useful if you want to change only
2610 a few members of a complex data structure without having to
2611 "unpack" everything, change the members, and then "pack" again
2612 (which could waste lots of memory and CPU cycles). So, instead
2613 of doing something like
2614
2615 $test = $c->unpack('test', $binary);
2616 $test->{uni}{quad} = 4711;
2617 $new = $c->pack('test', $test);
2618
2619 to change the "uni.quad" member of $binary, you could simply do
2620 either
2621
2622 $new = $c->pack('test', { uni => { quad => 4711 } }, $binary);
2623
2624 or
2625
2626 $c->pack('test', { uni => { quad => 4711 } }, $binary);
2627
2628 where the latter would directly modify $binary. Besides this
2629 code being a lot shorter (and perhaps even more readable), it
2630 can be significantly faster if you're dealing with really big
2631 data blocks.
2632
2633 If the length of the input string is less than the size
2634 required by the type, the string (or its copy) is extended and
2635 the extended part is initialized to zero. If the length is
2636 more than the size required by the type, the string keeps its
2637 length, and a returned copy has that same length with the
2638 trailing bytes preserved.
2639
2640 $too_short = pack "C*", (1 .. 4);
2641 $too_long = pack "C*", (1 .. 20);
2642
2643 $c->pack('test', { uni => { quad => 0x4711 } }, $too_short);
2644 print "too_short:\n", hexdump(data => $too_short);
2645
2646 $copy = $c->pack('test', { uni => { quad => 0x4711 } }, $too_long);
2647 print "\ncopy:\n", hexdump(data => $copy);
2648
2649 This would print:
2650
2651 too_short:
2652 0x0000 : 01 02 03 00 00 47 11 : .....G.
2653
2654 copy:
2655 0x0000 : 01 02 03 00 00 47 11 08 09 0A 0B 0C 0D 0E 0F 10 : .....G..........
2656 0x0010 : 11 12 13 14 : ....
2657
2658 unpack
2659 "unpack" TYPE, STRING
2660 Use this method to unpack a binary string and create an
2661 arbitrarily complex Perl data structure based on a previously
2662 parsed type definition.
2663
2664 use Convert::Binary::C;
2665 use Data::Dumper;
2666
2667 $c = Convert::Binary::C->new( ByteOrder => 'BigEndian'
2668 , LongSize => 4
2669 , ShortSize => 2
2670 )
2671 ->parse( <<'ENDC' );
2672 struct test {
2673 char ary[3];
2674 union {
2675 short word[2];
2676 long *quad;
2677 } uni;
2678 };
2679 ENDC
2680
2681 # Generate some binary dummy data
2682 $binary = pack "C*", 1 .. $c->sizeof('test');
2683
2684 On failure, e.g. if the specified type cannot be found, the
2685 method will throw an exception. On success, a reference to a
2686 complex Perl data structure is returned, which can directly be
2687 dumped using the Data::Dumper module:
2688
2689 $unpacked = $c->unpack('test', $binary);
2690 print Dumper($unpacked);
2691
2692 This would print:
2693
2694 $VAR1 = {
2695 'uni' => {
2696 'word' => [
2697 1029,
2698 1543
2699 ],
2700 'quad' => 67438087
2701 },
2702 'ary' => [
2703 1,
2704 2,
2705 3
2706 ]
2707 };
2708
2709 If TYPE refers to a compound object, you may unpack any member
2710 of that compound object. Simply add a member expression to the
2711 type name, just as you would access the member in C:
2712
2713 $binary2 = substr $binary, $c->offsetof('test', 'uni.word');
2714
2715 $unpack1 = $unpacked->{uni}{word};
2716 $unpack2 = $c->unpack('test.uni.word', $binary2);
2717
2718 print Data::Dumper->Dump([$unpack1, $unpack2], [qw(unpack1 unpack2)]);
2719
2720 You will find that the output is exactly the same for both
2721 $unpack1 and $unpack2:
2722
2723 $unpack1 = [
2724 1029,
2725 1543
2726 ];
2727 $unpack2 = [
2728 1029,
2729 1543
2730 ];
2731
2732 When "unpack" is called in list context, it will unpack as many
2733 elements as possible from STRING, including zero if STRING is
2734 not long enough.
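
As an illustration of list context, the following sketch unpacks a string that holds three complete 16-bit values plus one leftover byte. The typedef name "u16" is an assumption for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', ShortSize => 2);
$c->parse('typedef unsigned short u16;');

# Seven bytes: three complete 16-bit values plus one leftover byte.
my $data = pack('n3', 10, 20, 30) . "\x01";

# List context: one element per complete u16 found in the string.
my @values = $c->unpack('u16', $data);
print "unpacked ", scalar(@values), " values: @values\n";
```

The leftover byte is too short for another element and should simply be skipped.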
2735
2736 initializer
2737 "initializer" TYPE
2738 "initializer" TYPE, DATA
2739 The "initializer" method can be used to retrieve an initializer
2740 string for a certain TYPE. This can be useful if you have to
2741 initialize only a couple of members in a huge compound type or
2742 if you simply want to generate initializers automatically.
2743
2744 struct date {
2745 unsigned year : 12;
2746 unsigned month: 4;
2747 unsigned day : 5;
2748 unsigned hour : 5;
2749 unsigned min : 6;
2750 };
2751
2752 typedef struct {
2753 enum { DATE, QWORD } type;
2754 short number;
2755 union {
2756 struct date date;
2757 unsigned long qword;
2758 } choice;
2759 } data;
2760
2761 Given the above code has been parsed
2762
2763 $init = $c->initializer('data');
2764 print "data x = $init;\n";
2765
2766 would print the following:
2767
2768 data x = {
2769 0,
2770 0,
2771 {
2772 {
2773 0,
2774 0,
2775 0,
2776 0,
2777 0
2778 }
2779 }
2780 };
2781
2782 You could directly put that into a C program, although it
2783 probably isn't very useful yet. It becomes more useful if you
2784 actually specify how you want to initialize the type:
2785
2786 $data = {
2787 type => 'QWORD',
2788 choice => {
2789 date => { month => 12, day => 24 },
2790 qword => 4711,
2791 },
2792 stuff => 'yes?',
2793 };
2794
2795 $init = $c->initializer('data', $data);
2796 print "data x = $init;\n";
2797
2798 This would print the following:
2799
2800 data x = {
2801 QWORD,
2802 0,
2803 {
2804 {
2805 0,
2806 12,
2807 24,
2808 0,
2809 0
2810 }
2811 }
2812 };
2813
2814 As only the first member of a "union" can be initialized,
2815 "choice.qword" is ignored. You will not be warned about the
2816 fact that you probably tried to initialize a member other than
2817 the first. This is considered a feature, because it allows you
2818 to use "unpack" to generate the initializer data:
2819
2820 $data = $c->unpack('data', $binary);
2821 $init = $c->initializer('data', $data);
2822
2823 Since "unpack" unpacks all union members, you would otherwise
2824 have to delete all but the first one before feeding it
2825 into "initializer".
2826
2827 Also, "stuff" is ignored, because it actually isn't a member of
2828 "data". You won't be warned about that either.
2829
2830 sizeof
2831 "sizeof" TYPE
2832 This method will return the size of a C type in bytes. If it
2833 cannot find the type, it will throw an exception.
2834
2835 If the type defines some kind of compound object, you may ask
2836 for the size of a member of that compound object:
2837
2838 $size = $c->sizeof('test.uni.word[1]');
2839
2840 This would set $size to 2.
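
Because an unknown type results in an exception, it can be convenient to wrap "sizeof" in "eval" when the type name comes from user input. A minimal sketch with a made-up struct "point":

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new(ShortSize => 2);
$c->parse('struct point { short x; short y; };');

# Query a type, a member, and a name that was never defined.
my %size;
for my $type ('point', 'point.y', 'no_such_type') {
  my $size = eval { $c->sizeof($type) };
  $size{$type} = $size;
  if (defined $size) { print "sizeof($type) = $size\n" }
  else               { print "sizeof($type) failed: $@" }
}
```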
2841
2842 typeof
2843 "typeof" TYPE
2844 This method will return the type of a C member. While this
2845 only makes sense for compound types, it's legal to also use it
2846 for non-compound types. If it cannot find the type, it will
2847 throw an exception.
2848
2849 The "typeof" method can be used on any valid member, even on
2850 arrays or unnamed types. It will always return a string that
2851 holds the name (or in case of unnamed types only the class) of
2852 the type, optionally followed by a '*' character to indicate
2853 it's a pointer type, and optionally followed by one or more
2854 array dimensions if it's an array type. If the type is a
2855 bitfield, the type name is followed by a colon and the number
2856 of bits.
2857
2858 struct test {
2859 char ary[3];
2860 union {
2861 short word[2];
2862 long *quad;
2863 } uni;
2864 struct {
2865 unsigned short six:6;
2866 unsigned short ten:10;
2867 } bits;
2868 };
2869
2870 Given the above C code has been parsed, calls to "typeof" would
2871 return the following values:
2872
2873 $c->typeof('test') => 'struct test'
2874 $c->typeof('test.ary') => 'char [3]'
2875 $c->typeof('test.uni') => 'union'
2876 $c->typeof('test.uni.quad') => 'long *'
2877 $c->typeof('test.uni.word') => 'short [2]'
2878 $c->typeof('test.uni.word[1]') => 'short'
2879 $c->typeof('test.bits') => 'struct'
2880 $c->typeof('test.bits.six') => 'unsigned short :6'
2881 $c->typeof('test.bits.ten') => 'unsigned short :10'
2882
2883 offsetof
2884 "offsetof" TYPE, MEMBER
2885 You can use "offsetof" just like the C macro of the same
2886 name. It will simply return the offset (in bytes) of
2887 MEMBER relative to TYPE.
2888
2889 use Convert::Binary::C;
2890
2891 $c = Convert::Binary::C->new( Alignment => 4
2892 , LongSize => 4
2893 , PointerSize => 4
2894 )
2895 ->parse(<<'ENDC');
2896 typedef struct {
2897 char abc;
2898 long day;
2899 int *ptr;
2900 } week;
2901
2902 struct test {
2903 week zap[8];
2904 };
2905 ENDC
2906
2907 @args = (
2908 ['test', 'zap[5].day' ],
2909 ['test.zap[2]', 'day' ],
2910 ['test', 'zap[5].day+1'],
2911 ['test', 'zap[-3].ptr' ],
2912 );
2913
2914 for (@args) {
2915 my $offset = eval { $c->offsetof(@$_) };
2916 printf "\$c->offsetof('%s', '%s') => $offset\n", @$_;
2917 }
2918
2919 The final loop will print:
2920
2921 $c->offsetof('test', 'zap[5].day') => 64
2922 $c->offsetof('test.zap[2]', 'day') => 4
2923 $c->offsetof('test', 'zap[5].day+1') => 65
2924 $c->offsetof('test', 'zap[-3].ptr') => -28
2925
2926 · The first iteration simply shows that the offset of
2927 "zap[5].day" is 64 relative to the beginning of "struct
2928 test".
2929
2930 · You may additionally specify a member for the type passed as
2931 the first argument, as shown in the second iteration.
2932
2933 · The offset suffix is also supported by "offsetof", so the
2934 third iteration will correctly print 65.
2935
2936 · The last iteration demonstrates that even out-of-bounds array
2937 indices are handled correctly, just as they are handled in C.
2938
2939 Unlike the C macro, "offsetof" also works on array types.
2940
2941 $offset = $c->offsetof('test.zap', '[3].ptr+2');
2942 print "offset = $offset";
2943
2944 This will print:
2945
2946 offset = 46
2947
2948 If TYPE is a compound, MEMBER may optionally be prefixed with a
2949 dot, so
2950
2951 printf "offset = %d\n", $c->offsetof('week', 'day');
2952 printf "offset = %d\n", $c->offsetof('week', '.day');
2953
2954 are both equivalent and will print
2955
2956 offset = 4
2957 offset = 4
2958
2959 This allows you to
2960
2961 · use the C macro style, without a leading dot, and
2962
2963 · directly use the output of the "member" method, which
2964 includes a leading dot for compound types, as input for the
2965 MEMBER argument.
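
The second point can be sketched as a round trip: ask "member" for the name at a given offset and feed the result straight back into "offsetof". The sketch reuses the "week" type from above.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new( Alignment   => 4
                               , LongSize    => 4
                               , PointerSize => 4
                               );
$c->parse(<<'ENDC');
typedef struct {
  char abc;
  long day;
  int *ptr;
} week;
ENDC

# member() returns the name with a leading dot; offsetof() accepts it as is.
my $member = $c->member('week', 4);
my $offset = $c->offsetof('week', $member);
print "offset 4 is '$member', which is at offset $offset\n";
```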
2966
2967 member
2968 "member" TYPE
2969 "member" TYPE, OFFSET
2970 You can think of "member" as being the reverse of the
2971 "offsetof" method. However, as this is more complex, there's no
2972 equivalent to "member" in the C language.
2973
2974 Usually this method is used if you want to retrieve the name of
2975 the member that is located at a specific offset of a previously
2976 parsed type.
2977
2978 use Convert::Binary::C;
2979
2980 $c = Convert::Binary::C->new( Alignment => 4
2981 , LongSize => 4
2982 , PointerSize => 4
2983 )
2984 ->parse(<<'ENDC');
2985 typedef struct {
2986 char abc;
2987 long day;
2988 int *ptr;
2989 } week;
2990
2991 struct test {
2992 week zap[8];
2993 };
2994 ENDC
2995
2996 for my $offset (24, 39, 69, 99) {
2997 print "\$c->member('test', $offset)";
2998 my $member = eval { $c->member('test', $offset) };
2999 print $@ ? "\n exception: $@" : " => '$member'\n";
3000 }
3001
3002 This will print:
3003
3004 $c->member('test', 24) => '.zap[2].abc'
3005 $c->member('test', 39) => '.zap[3]+3'
3006 $c->member('test', 69) => '.zap[5].ptr+1'
3007 $c->member('test', 99)
3008 exception: Offset 99 out of range (0 <= offset < 96)
3009
3010 · The output of the first iteration is obvious. The member
3011 "zap[2].abc" is located at offset 24 of "struct test".
3012
3013 · In the second iteration, the offset points into a region of
3014 padding bytes and thus no member of "week" can be named.
3015 Instead of a member name the offset relative to "zap[3]" is
3016 appended.
3017
3018 · In the third iteration, the offset points to "zap[5].ptr".
3019 However, "zap[5].ptr" is located at 68, not at 69, and thus
3020 the remaining offset of 1 is also appended.
3021
3022 · The last iteration causes an exception because the offset of
3023 99 is not valid for "struct test" since the size of "struct
3024 test" is only 96. You might argue that this is inconsistent,
3025 since "offsetof" can also handle out-of-bounds array members.
3026 But as soon as you have more than one level of array nesting,
3027 there's an infinite number of out-of-bounds members for a
3028 single given offset, so it would be impossible to return a
3029 list of all members.
3030
3031 You can additionally specify a member for the type passed as
3032 the first argument:
3033
3034 $member = $c->member('test.zap[2]', 6);
3035 print $member;
3036
3037 This will print:
3038
3039 .day+2
3040
3041 Like "offsetof", "member" also works on array types:
3042
3043 $member = $c->member('test.zap', 42);
3044 print $member;
3045
3046 This will print:
3047
3048 [3].day+2
3049
3050 While the behaviour for "struct"s is quite obvious, the
3051 behaviour for "union"s is rather tricky. As a single offset
3052 usually references more than one member of a union, there are
3053 certain rules that the algorithm uses for determining the best
3054 member.
3055
3056 · The first non-compound member that is referenced without an
3057 offset has the highest priority.
3058
3059 · If no member is referenced without an offset, the first non-
3060 compound member that is referenced with an offset will be
3061 returned.
3062
3063 · Otherwise the first padding region that is encountered will
3064 be taken.
3065
3066 As an example, given 4-byte-alignment and the union
3067
3068 union choice {
3069 struct {
3070 char color[2];
3071 long size;
3072 char taste;
3073 } apple;
3074 char grape[3];
3075 struct {
3076 long weight;
3077 short price[3];
3078 } melon;
3079 };
3080
3081 the "member" method would return what is shown in the Member
3082 column of the following table. The Type column shows the result
3083 of the "typeof" method when passing the corresponding member.
3084
3085 Offset   Member               Type
3086 --------------------------------------
3087    0     .apple.color[0]      'char'
3088    1     .apple.color[1]      'char'
3089    2     .grape[2]            'char'
3090    3     .melon.weight+3      'long'
3091    4     .apple.size          'long'
3092    5     .apple.size+1        'long'
3093    6     .melon.price[1]      'short'
3094    7     .apple.size+3        'long'
3095    8     .apple.taste         'char'
3096    9     .melon.price[2]+1    'short'
3097   10     .apple+10            'struct'
3098   11     .apple+11            'struct'
3099
3100 It's like having a stack of all the union members and looking
3101 through the stack for the shiniest piece you can see. The
3102 beginning of a member (denoted by uppercase letters) is always
3103 shinier than the rest of a member, while padding regions
3104 (denoted by dashes) aren't shiny at all.
3105
3106 Offset   0   1   2   3   4   5   6   7   8   9  10  11
3107 -------------------------------------------------------
3108 apple   (C) (C)  -   -  (S) (s)  s  (s) (T)  -  (-) (-)
3109 grape    G   G  (G)
3110 melon    W   w   w  (w)  P   p  (P)  p   P  (p)  -   -
3111
3112 If you look through that stack from top to bottom, you'll end
3113 up at the parenthesized members.
3114
3115 Alternatively, if you're not only interested in the best
3116 member, you can call "member" in list context, which makes it
3117 return all members referenced by the given offset.
3118
3119 Offset   Member               Type
3120 --------------------------------------
3121    0     .apple.color[0]      'char'
3122          .grape[0]            'char'
3123          .melon.weight        'long'
3124    1     .apple.color[1]      'char'
3125          .grape[1]            'char'
3126          .melon.weight+1      'long'
3127    2     .grape[2]            'char'
3128          .melon.weight+2      'long'
3129          .apple+2             'struct'
3130    3     .melon.weight+3      'long'
3131          .apple+3             'struct'
3132    4     .apple.size          'long'
3133          .melon.price[0]      'short'
3134    5     .apple.size+1        'long'
3135          .melon.price[0]+1    'short'
3136    6     .melon.price[1]      'short'
3137          .apple.size+2        'long'
3138    7     .apple.size+3        'long'
3139          .melon.price[1]+1    'short'
3140    8     .apple.taste         'char'
3141          .melon.price[2]      'short'
3142    9     .melon.price[2]+1    'short'
3143          .apple+9             'struct'
3144   10     .apple+10            'struct'
3145          .melon+10            'struct'
3146   11     .apple+11            'struct'
3147          .melon+11            'struct'
3148
3149 The first member returned is always the best member. The other
3150 members are sorted according to the rules given above. This
3151 means that members referenced without an offset are followed by
3152 members referenced with an offset. Padding regions will be at
3153 the end.
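The difference between scalar and list context can be tried out with a short sketch like the one below. It assumes the "choice" union from above with 4-byte alignment, a 4-byte "long" and a 2-byte "short"; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Layout assumptions: 4-byte alignment, 4-byte long, 2-byte short.
my $c = Convert::Binary::C->new(Alignment => 4, LongSize => 4, ShortSize => 2)
                          ->parse(<<'ENDC');
union choice {
  struct {
    char color[2];
    long size;
    char taste;
  } apple;
  char grape[3];
  struct {
    long weight;
    short price[3];
  } melon;
};
ENDC

my $best = $c->member('choice', 6);   # scalar context: only the best member
my @all  = $c->member('choice', 6);   # list context: all members at offset 6

print "best: $best\n";
print "all:  @all\n";
```

According to the tables above, the best member at offset 6 should be ".melon.price[1]", and the list should start with that same member.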
3154
3155 If OFFSET is not given in the method call, "member" will return
3156 a list of all possible members of TYPE.
3157
3158 print "$_\n" for $c->member('choice');
3159
3160 This will print:
3161
3162 .apple.color[0]
3163 .apple.color[1]
3164 .apple.size
3165 .apple.taste
3166 .grape[0]
3167 .grape[1]
3168 .grape[2]
3169 .melon.weight
3170 .melon.price[0]
3171 .melon.price[1]
3172 .melon.price[2]
3173
3174 In scalar context, the number of possible members is returned.
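Combining the member list with "offsetof" gives a quick overview of a type's layout. The following is a sketch under the same assumptions as the "choice" example (4-byte alignment, 4-byte "long", 2-byte "short"); the %offset_of hash is introduced here for illustration only.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Layout assumptions: 4-byte alignment, 4-byte long, 2-byte short.
my $c = Convert::Binary::C->new(Alignment => 4, LongSize => 4, ShortSize => 2)
                          ->parse(<<'ENDC');
union choice {
  struct {
    char color[2];
    long size;
    char taste;
  } apple;
  char grape[3];
  struct {
    long weight;
    short price[3];
  } melon;
};
ENDC

my $count = $c->member('choice');     # scalar context: number of members
my %offset_of;                        # illustrative: member name => offset

for my $member ($c->member('choice')) {
  $offset_of{$member} = $c->offsetof('choice', $member);
  printf "%-18s offset %2d\n", $member, $offset_of{$member};
}

print "$count members in total\n";
```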
3175
3176 tag
3177 "tag" TYPE
3178 "tag" TYPE, TAG
3179 "tag" TYPE, TAG1 => VALUE1, TAG2 => VALUE2, ...
3180 The "tag" method can be used to attach properties to a TYPE.
3181 It's a bit like having "configure" for individual types.
3182
3183 See "USING TAGS" for an example.
3184
3185 Note that while you can tag whole types as well as compound
3186 members, it is not possible to tag array members, i.e. you
3187 cannot treat, for example, "a[1]" and "a[2]" differently.
3188
3189 Also note that in code like this
3190
3191 struct test {
3192 int a;
3193 struct {
3194 int x;
3195 } b, c;
3196 };
3197
3198 if you tag "test.b.x", this will also tag "test.c.x"
3199 implicitly.
3200
3201 It is also possible to tag basic types if you really want to do
3202 that, for example:
3203
3204 $c->tag('int', Format => 'Binary');
3205
3206 To remove a tag from a type, you can either set that tag to
3207 "undef", for example
3208
3209 $c->tag('test', Hooks => undef);
3210
3211 or use "untag".
3212
3213 To see if a tag is attached to a type or to get the value of a
3214 tag, pass only the type and tag name to "tag":
3215
3216 $c->tag('test.a', Format => 'Binary');
3217
3218 $hooks = $c->tag('test.a', 'Hooks');
3219 $format = $c->tag('test.a', 'Format');
3220
3221 This will give you:
3222
3223 $hooks = undef;
3224 $format = 'Binary';
3225
3226 To see which tags are attached to a type, pass only the type.
3227 The "tag" method will now return a hash reference containing
3228 all tags attached to the type:
3229
3230 $tags = $c->tag('test.a');
3231
3232 This will give you:
3233
3234 $tags = {
3235 'Format' => 'Binary'
3236 };
3237
3238 "tag" will throw an exception if an error occurs. If called as
3239 a 'set' method, it will return a reference to its object,
3240 allowing you to chain together consecutive method calls.
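The set, query and remove forms described above can be put together in one short sketch. It reuses the "struct test" definition shown earlier; the variable names are chosen for this illustration only.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct test {
  int a;
  struct {
    int x;
  } b, c;
};
ENDC

# 'set' calls return the object, so they can be chained.
$c->tag('test.a',   Format => 'Binary')
  ->tag('test.b.x', Format => 'Binary');

my $format = $c->tag('test.a', 'Format');     # value of a single tag
my $shared = $c->tag('test.c.x', 'Format');   # b and c share one declaration
my $tags   = $c->tag('test.a');               # hash reference with all tags

$c->tag('test.a', Format => undef);           # remove the tag again
my $removed = $c->tag('test.a', 'Format');

print "format:  $format\n";
print "shared:  ", defined $shared ? $shared : 'undef', "\n";
print "removed: ", defined $removed ? $removed : 'undef', "\n";
```

Since "test.b" and "test.c" are declared with the same anonymous struct, the tag set on "test.b.x" is expected to show up for "test.c.x" as well.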
3241
3242 Note that when a compound is inlined, tags attached to the
3243 inlined compound are ignored, for example:
3244
3245 $c->parse(<<ENDC);
3246 struct header {
3247 int id;
3248 int len;
3249 unsigned flags;
3250 };
3251
3252 struct message {
3253 struct header;
3254 short samples[32];
3255 };
3256 ENDC
3257
3258 for my $type (qw( header message header.len )) {
3259 $c->tag($type, Hooks => { unpack => sub { print "unpack: $type\n"; @_ } });
3260 }
3261
3262 for my $type (qw( header message )) {
3263 print "[unpacking $type]\n";
3264 $u = $c->unpack($type, $data);
3265 }
3266
3267 This will print:
3268
3269 [unpacking header]
3270 unpack: header.len
3271 unpack: header
3272 [unpacking message]
3273 unpack: header.len
3274 unpack: message
3275
3276 As you can see from the above output, tags attached to members
3277 of inlined compounds ("header.len") are still handled.
3278
3279 The following tags can be configured:
3280
3281 "Format" => 'Binary' | 'String'
3282 The "Format" tag allows you to control the way binary data
3283 is converted by "pack" and "unpack".
3284
3285 If you tag a "TYPE" as "Binary", it will not be converted
3286 at all, i.e. it will be passed through as a binary string.
3287
3288 If you tag it as "String", it will be treated like a null-
3289 terminated C string, i.e. "unpack" will convert the C
3290 string to a Perl string and vice versa.
3291
3292 See "The Format Tag" for an example.
3293
3294 "ByteOrder" => 'BigEndian' | 'LittleEndian'
3295 The "ByteOrder" tag allows you to explicitly set the byte
3296 order of a TYPE.
3297
3298 See "The ByteOrder Tag" for an example.
3299
3300 "Dimension" => '*'
3301 "Dimension" => VALUE
3302 "Dimension" => MEMBER
3303 "Dimension" => SUB
3304 "Dimension" => [ SUB, ARGS ]
3305 The "Dimension" tag allows you to alter the size of an
3306 array dynamically.
3307
3308 You can tag fixed size arrays as being flexible using '*'.
3309 This is useful if you cannot use flexible array members in
3310 your source code.
3311
3312 $c->tag('type.array', Dimension => '*');
3313
3314 You can also tag an array to have a fixed size different
3315 from the one it was originally declared with.
3316
3317 $c->tag('type.array', Dimension => 42);
3318
3319 If the array is a member of a compound, you can also tag it
3320 to have a size corresponding to the value of another
3321 member in that compound.
3322
3323 $c->tag('type.array', Dimension => 'count');
3324
3325 Finally, you can specify a subroutine that is called when
3326 the size of the array needs to be determined.
3327
3328 $c->tag('type.array', Dimension => \&get_count);
3329
3330 By default, and if the array is a compound member, that
3331 subroutine will be passed a reference to the hash storing
3332 the data for the compound.
3333
3334 You can also instruct Convert::Binary::C to pass additional
3335 arguments to the subroutine by passing an array reference
3336 instead of the subroutine reference. This array contains
3337 the subroutine reference as well as a list of arguments.
3338 It is possible to define certain special arguments using
3339 the "arg" method.
3340
3341 $c->tag('type.array', Dimension => [\&get_count, $c->arg('SELF'), 42]);
3342
3343 See "The Dimension Tag" for various examples.
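As a small illustration of the MEMBER and SUB forms, consider a length-prefixed packet. The "packet" struct, its field names and the clamping rule in the subroutine are all made up for this sketch; they are not taken from the module's own examples.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Hypothetical type: a one-byte count followed by that many data bytes.
my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct packet {
  unsigned char count;
  unsigned char data[1];
};
ENDC

my $raw = "\x03\x0A\x14\x1E";   # count = 3, data = 10, 20, 30

# MEMBER form: the array size follows the value of 'count'.
$c->tag('packet.data', Dimension => 'count');
my $by_member = $c->unpack('packet', $raw);
print "by member: @{ $by_member->{data} }\n";

# SUB form: the subroutine receives the hash holding the compound's
# data and returns the size; here it clamps the size to at most 2.
$c->tag('packet.data', Dimension => sub {
  my ($packet) = @_;
  return $packet->{count} > 2 ? 2 : $packet->{count};
});
my $by_sub = $c->unpack('packet', $raw);
print "by sub:    @{ $by_sub->{data} }\n";
```

With the MEMBER form all three data bytes should be unpacked, while the clamping subroutine should limit the array to two elements.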
3344
3345 "Hooks" => { HOOK => SUB, HOOK => [ SUB, ARGS ], ... }, ...
3346 The "Hooks" tag allows you to register subroutines as
3347 hooks.
3348
3349 Hooks are called whenever a certain "TYPE" is packed or
3350 unpacked. Hooks are currently considered an experimental
3351 feature.
3352
3353 "HOOK" can be one of the following:
3354
3355 pack
3356 unpack
3357 pack_ptr
3358 unpack_ptr
3359
3360 "pack" and "unpack" hooks are called when processing their
3361 "TYPE", while "pack_ptr" and "unpack_ptr" hooks are called
3362 when processing pointers to their "TYPE".
3363
3364 "SUB" is a reference to a subroutine that usually takes one
3365 input argument, processes it and returns one output
3366 argument.
3367
3368 Alternatively, you can pass a custom list of arguments to
3369 the hook by using an array reference instead of "SUB" that
3370 holds the subroutine reference in the first element and the
3371 arguments to be passed to the subroutine as the other
3372 elements. This way, you can even pass special arguments to
3373 the hook using the "arg" method.
3374
3375 Here are a few examples for registering hooks:
3376
3377 $c->tag('ObjectType', Hooks => {
3378 pack => \&obj_pack,
3379 unpack => \&obj_unpack
3380 });
3381
3382 $c->tag('ProtocolId', Hooks => {
3383 unpack => sub { $protos[$_[0]] }
3384 });
3385
3386 $c->tag('ProtocolId', Hooks => {
3387 unpack_ptr => [sub {
3388 sprintf "$_[0]:{0x%X}", $_[1]
3389 },
3390 $c->arg('TYPE', 'DATA')
3391 ],
3392 });
3393
3394 Note that the above example registers both an "unpack" hook
3395 and an "unpack_ptr" hook for "ProtocolId" with two separate
3396 calls to "tag". As long as you don't explicitly overwrite a
3397 previously registered hook, it won't be modified or removed
3398 by registering other hooks for the same "TYPE".
3399
3400 To remove all registered hooks for a type, simply remove
3401 the "Hooks" tag:
3402
3403 $c->untag('ProtocolId', 'Hooks');
3404
3405 To remove only a single hook, pass "undef" as "SUB" instead
3406 of a subroutine reference:
3407
3408 $c->tag('ObjectType', Hooks => { pack => undef });
3409
3410 If all hooks are removed, the whole "Hooks" tag is removed.
3411
3412 See "The Hooks Tag" for examples on how to use hooks.
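A complete, if minimal, round trip with "pack" and "unpack" hooks might look like the following sketch. The "frame" struct, the @protos table and the %proto_id lookup are assumptions made for illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Hypothetical protocol table: index is the numeric id.
my @protos   = qw( ICMP TCP UDP );
my %proto_id = map { $protos[$_] => $_ } 0 .. $#protos;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned char ProtocolId;

struct frame {
  ProtocolId    proto;
  unsigned char ttl;
};
ENDC

$c->tag('ProtocolId', Hooks => {
  unpack => sub { $protos[$_[0]] },     # number -> name after unpacking
  pack   => sub { $proto_id{$_[0]} },   # name -> number before packing
});

my $frame  = $c->unpack('frame', "\x01\x40");
my $binary = $c->pack('frame', { proto => 'UDP', ttl => 64 });

print "proto = $frame->{proto}, ttl = $frame->{ttl}\n";
print "packed = ", unpack('H*', $binary), "\n";
```

Because the hooks are attached to the typedef rather than to a particular struct member, they should apply wherever "ProtocolId" is used.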
3413
3414 untag
3415 "untag" TYPE
3416 "untag" TYPE, TAG1, TAG2, ...
3417 Use the "untag" method to remove one, more, or all tags from a
3418 type. If you don't pass any tag names, all tags attached to the
3419 type will be removed. Otherwise only the listed tags will be
3420 removed.
3421
3422 See "USING TAGS" for an example.
3423
3424 arg
3425 "arg" 'ARG', ...
3426 Creates placeholders for special arguments to be passed to
3427 hooks or other subroutines. These arguments are currently:
3428
3429 "SELF"
3430 A reference to the calling Convert::Binary::C object. This
3431 may be useful if you need to work with the object inside
3432 the subroutine.
3433
3434 "TYPE"
3435 The name of the type that is currently being processed by
3436 the hook.
3437
3438 "DATA"
3439 The data argument that is passed to the subroutine.
3440
3441 "HOOK"
3442 The kind of hook the subroutine is being called for,
3443 for example "pack" or "unpack_ptr".
3444
3445 "arg" will return a placeholder for each argument it is
3446 passed. Note that, depending on the context of the subroutine,
3447 not all arguments may be supported.
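The following sketch shows how the placeholders end up as ordinary arguments of a hook. The "frame" struct and the @log array are made up for this example; the hook simply records how it was called and passes the data through unchanged.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned char ProtocolId;

struct frame {
  ProtocolId    proto;
  unsigned char ttl;
};
ENDC

my @log;   # illustrative: collects one line per hook invocation

$c->tag('ProtocolId', Hooks => {
  unpack => [ sub {
                my ($hook, $type, $data) = @_;   # same order as in arg()
                push @log, "$hook hook called for $type";
                return $data;                    # leave the value untouched
              },
              $c->arg('HOOK', 'TYPE', 'DATA') ],
});

my $frame = $c->unpack('frame', "\x01\x40");
print "$_\n" for @log;
```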
3448
3449 dependencies
3450 "dependencies"
3451 After some code has been parsed using either the "parse" or
3452 "parse_file" methods, the "dependencies" method can be used to
3453 retrieve information about all files that the object depends
3454 on, i.e. all files that have been parsed.
3455
3456 In scalar context, the method returns a hash reference. Each
3457 key is the name of a file. The values are again hash
3458 references, each of which holds the size, modification time
3459 (mtime), and change time (ctime) of the file at the moment it
3460 was parsed.
3461
3462 use Convert::Binary::C;
3463 use Data::Dumper;
3464
3465 #----------------------------------------------------------
3466 # Create object, set include path, parse 'string.h' header
3467 #----------------------------------------------------------
3468 my $c = Convert::Binary::C->new
3469 ->Include('/usr/lib/gcc/i686-pc-linux-gnu/4.1.2/include',
3470 '/usr/include')
3471 ->parse_file('string.h');
3472
3473 #----------------------------------------------------------
3474 # Get dependencies of the object, extract dependency files
3475 #----------------------------------------------------------
3476 my $depend = $c->dependencies;
3477 my @files = keys %$depend;
3478
3479 #-----------------------------
3480 # Dump dependencies and files
3481 #-----------------------------
3482 print Data::Dumper->Dump([$depend, \@files],
3483 [qw( depend *files )]);
3484
3485 The above code would print something like this:
3486
3487 $depend = {
3488 '/usr/include/features.h' => {
3489 'ctime' => 1196609327,
3490 'mtime' => 1196609232,
3491 'size' => 11688
3492 },
3493 '/usr/include/gnu/stubs-32.h' => {
3494 'ctime' => 1196609327,
3495 'mtime' => 1196609305,
3496 'size' => 624
3497 },
3498 '/usr/include/sys/cdefs.h' => {
3499 'ctime' => 1196609327,
3500 'mtime' => 1196609269,
3501 'size' => 11773
3502 },
3503 '/usr/include/gnu/stubs.h' => {
3504 'ctime' => 1196609327,
3505 'mtime' => 1196609232,
3506 'size' => 315
3507 },
3508 '/usr/lib/gcc/i686-pc-linux-gnu/4.1.2/include/stddef.h' => {
3509 'ctime' => 1203359674,
3510 'mtime' => 1203357922,
3511 'size' => 12695
3512 },
3513 '/usr/include/string.h' => {
3514 'ctime' => 1196609327,
3515 'mtime' => 1196609262,
3516 'size' => 16438
3517 },
3518 '/usr/include/bits/wordsize.h' => {
3519 'ctime' => 1196609327,
3520 'mtime' => 1196609257,
3521 'size' => 873
3522 }
3523 };
3524 @files = (
3525 '/usr/include/features.h',
3526 '/usr/include/gnu/stubs-32.h',
3527 '/usr/include/sys/cdefs.h',
3528 '/usr/include/gnu/stubs.h',
3529 '/usr/lib/gcc/i686-pc-linux-gnu/4.1.2/include/stddef.h',
3530 '/usr/include/string.h',
3531 '/usr/include/bits/wordsize.h'
3532 );
3533
3534 In list context, the method returns the names of all files that
3535 have been parsed, i.e. the following lines are equivalent:
3536
3537 @files = keys %{$c->dependencies};
3538 @files = $c->dependencies;
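One possible use of this information is to check whether any parsed file has changed since it was parsed. The sketch below writes a throwaway header so that it does not rely on system headers; the "unchanged" helper is hypothetical and not part of the module.

```perl
use strict;
use warnings;
use Convert::Binary::C;
use File::Temp qw( tempfile );

# Write a small temporary header file to parse.
my ($fh, $header) = tempfile(SUFFIX => '.h', UNLINK => 1);
print $fh "struct point { int x; int y; };\n";
close $fh;

my $c      = Convert::Binary::C->new->parse_file($header);
my $depend = $c->dependencies;   # scalar context: hash reference

# Hypothetical helper: true if every file still has the recorded
# size, mtime and ctime.
sub unchanged {
  my ($deps) = @_;
  for my $file (keys %$deps) {
    my @st = stat $file or return 0;             # file vanished
    my ($size, $mtime, $ctime) = @st[7, 9, 10];
    my $old = $deps->{$file};
    return 0 if $size  != $old->{size}
             || $mtime != $old->{mtime}
             || $ctime != $old->{ctime};
  }
  return 1;
}

print unchanged($depend) ? "up to date\n" : "needs re-parsing\n";
```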
3539
3540 sourcify
3541 "sourcify"
3542 "sourcify" CONFIG
3543 Returns a string that holds the C source code necessary to
3544 represent all parsed C data structures.
3545
3546 use Convert::Binary::C;
3547
3548 $c = new Convert::Binary::C;
3549 $c->parse(<<'END');
3550
3551 #define ADD(a, b) ((a) + (b))
3552 #define NUMBER 42
3553
3554 typedef struct _mytype mytype;
3555
3556 struct _mytype {
3557 union {
3558 int iCount;
3559 enum count *pCount;
3560 } counter;
3561 #pragma pack( push, 1 )
3562 struct {
3563 char string[NUMBER];
3564 int array[NUMBER/sizeof(int)];
3565 } storage;
3566 #pragma pack( pop )
3567 mytype *next;
3568 };
3569
3570 enum count { ZERO, ONE, TWO, THREE };
3571
3572 END
3573
3574 print $c->sourcify;
3575
3576 The above code would print something like this:
3577
3578 /* typedef predeclarations */
3579
3580 typedef struct _mytype mytype;
3581
3582 /* defined enums */
3583
3584 enum count
3585 {
3586 ZERO,
3587 ONE,
3588 TWO,
3589 THREE
3590 };
3591
3592
3593 /* defined structs and unions */
3594
3595 struct _mytype
3596 {
3597 union
3598 {
3599 int iCount;
3600 enum count *pCount;
3601 } counter;
3602 #pragma pack(push, 1)
3603 struct
3604 {
3605 char string[42];
3606 int array[10];
3607 } storage;
3608 #pragma pack(pop)
3609 mytype *next;
3610 };
3611
3612 The purpose of the "sourcify" method is to enable some kind of
3613 platform-independent caching. The C code generated by
3614 "sourcify" can be parsed by any standard C compiler, as well as
3615 of course by the Convert::Binary::C parser. However, the code
3616 may be significantly shorter than the code that was originally
3617 parsed.
3618
3619 When parsing a typical header file, it's easily possible that
3620 you need to open dozens of other files that are included from
3621 that file, and end up parsing several hundred kilobytes of C
3622 code. Since most of it is usually preprocessor directives,
3623 function prototypes and comments, the "sourcify" function
3624 strips this down to a few kilobytes. Saving the "sourcify"
3625 string and parsing it next time instead of the original code
3626 may be a lot faster.
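The caching idea can be sketched as follows. The example keeps the "sourcify" string in a variable instead of a file, and it assumes that both objects use the same configuration; the type definitions are shortened for illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
#define NUMBER 42

typedef struct _mytype mytype;

struct _mytype {
  int     count;
  char    string[NUMBER];
  mytype *next;
};
ENDC

# In practice this string would be written to a cache file.
my $cache = $c->sourcify;

# Later: parse the much shorter cached code instead of the original.
my $fast = Convert::Binary::C->new->parse($cache);

my $size_orig   = $c->sizeof('mytype');
my $size_cached = $fast->sizeof('mytype');

print $size_orig == $size_cached ? "same layout\n" : "layout differs\n";
```

Note that configuration options such as "Alignment" or "ByteOrder" belong to the object rather than to the source code, so a cache like this would have to apply them again when creating the second object.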
3627
3628 The "sourcify" method takes a hash reference as an optional
3629 argument. It can be used to tweak the method's output. The
3630 following options can be configured.
3631
3632 "Context" => 0 | 1
3633 Turns preprocessor context information on or off. If this
3634 is turned on, "sourcify" will insert "#line" preprocessor
3635 directives in its output. So in the above example
3636
3637 print $c->sourcify({ Context => 1 });
3638
3639 would print:
3640
3641 /* typedef predeclarations */
3642
3643 typedef struct _mytype mytype;
3644
3645 /* defined enums */
3646
3647
3648 #line 21 "[buffer]"
3649 enum count
3650 {
3651 ZERO,
3652 ONE,
3653 TWO,
3654 THREE
3655 };
3656
3657
3658 /* defined structs and unions */
3659
3660
3661 #line 7 "[buffer]"
3662 struct _mytype
3663 {
3664 #line 8 "[buffer]"
3665 union
3666 {
3667 int iCount;
3668 enum count *pCount;
3669 } counter;
3670 #pragma pack(push, 1)
3671 #line 13 "[buffer]"
3672 struct
3673 {
3674 char string[42];
3675 int array[10];
3676 } storage;
3677 #pragma pack(pop)
3678 mytype *next;
3679 };
3680
3681 Note that "[buffer]" refers to the here-doc buffer when
3682 using "parse".
3683
3684 "Defines" => 0 | 1
3685 Turn this on if you want all the defined macros to be part
3686 of the source code output. Given the example code above
3687
3688 print $c->sourcify({ Defines => 1 });
3689
3690 would print:
3691
3692 /* typedef predeclarations */
3693
3694 typedef struct _mytype mytype;
3695
3696 /* defined enums */
3697
3698 enum count
3699 {
3700 ZERO,
3701 ONE,
3702 TWO,
3703 THREE
3704 };
3705
3706
3707 /* defined structs and unions */
3708
3709 struct _mytype
3710 {
3711 union
3712 {
3713 int iCount;
3714 enum count *pCount;
3715 } counter;
3716 #pragma pack(push, 1)
3717 struct
3718 {
3719 char string[42];
3720 int array[10];
3721 } storage;
3722 #pragma pack(pop)
3723 mytype *next;
3724 };
3725
3726 /* preprocessor defines */
3727
3728 #define ADD(a, b) ((a) + (b))
3729 #define NUMBER 42
3730
3731 The macro definitions always appear at the end of the
3732 source code. The order of the macro definitions is
3733 undefined.
3734
3735 The following methods can be used to retrieve information about the
3736 definitions that have been parsed. The examples given in the
3737 description for "enum", "compound" and "typedef" all assume this piece
3738 of C code has been parsed:
3739
3740 #define ABC_SIZE 2
3741 #define MULTIPLY(x, y) ((x)*(y))
3742
3743 #ifdef ABC_SIZE
3744 # define DEFINED
3745 #else
3746 # define NOT_DEFINED
3747 #endif
3748
3749 typedef unsigned long U32;
3750 typedef void *any;
3751
3752 enum __socket_type
3753 {
3754 SOCK_STREAM = 1,
3755 SOCK_DGRAM = 2,
3756 SOCK_RAW = 3,
3757 SOCK_RDM = 4,
3758 SOCK_SEQPACKET = 5,
3759 SOCK_PACKET = 10
3760 };
3761
3762 struct STRUCT_SV {
3763 void *sv_any;
3764 U32 sv_refcnt;
3765 U32 sv_flags;
3766 };
3767
3768 typedef union {
3769 int abc[ABC_SIZE];
3770 struct xxx {
3771 int a;
3772 int b;
3773 } ab[3][4];
3774 any ptr;
3775 } test;
3776
3777 enum_names
3778 "enum_names"
3779 Returns a list of identifiers of all defined enumeration
3780 objects. Enumeration objects don't necessarily have an
3781 identifier, so something like
3782
3783 enum { A, B, C };
3784
3785 will obviously not appear in the list returned by the
3786 "enum_names" method. Also, enumerations that are not defined
3787 within the source code - like in
3788
3789 struct foo {
3790 enum weekday *pWeekday;
3791 unsigned long year;
3792 };
3793
3794 where only a pointer to the "weekday" enumeration object is
3795 used - will not be returned, even though they have an
3796 identifier. So for the above two enumerations, "enum_names"
3797 will return an empty list:
3798
3799 @names = $c->enum_names;
3800
3801 The only way to retrieve a list of all enumeration identifiers
3802 is to use the "enum" method without additional arguments. You
3803 can get a list of all enumeration objects that have an
3804 identifier by using
3805
3806 @enums = map { $_->{identifier} || () } $c->enum;
3807
3808 but these may not have a definition. Thus, the two arrays would
3809 look like this:
3810
3811 @names = ();
3812 @enums = ('weekday');
3813
3814 The "def" method returns a true value for all identifiers
3815 returned by "enum_names".
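The distinction between defined, anonymous and merely referenced enumerations can be explored with a sketch like this one. The "color" enumeration is added here for illustration, so that "enum_names" has something to return.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
enum color { RED, GREEN, BLUE };   /* defined, has an identifier     */
enum { A, B, C };                  /* defined, but anonymous         */

struct foo {
  enum weekday *pWeekday;          /* only referenced, never defined */
  unsigned long year;
};
ENDC

my @names = $c->enum_names;
my @enums = map { $_->{identifier} || () } $c->enum;

print "enum_names:            @names\n";
print "identifiers from enum: @enums\n";
print "$_ is defined\n" for grep { $c->def($_) } @names;
```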
3816
3817 enum
3818 "enum"
3819 "enum" LIST
3820 Returns a list of references to hashes containing detailed
3821 information about all enumerations that have been parsed.
3822
3823 If a list of enumeration identifiers is passed to the method,
3824 the returned list will only contain hash references for those
3825 enumerations. The enumeration identifiers may optionally be
3826 prefixed by "enum".
3827
3828 If an enumeration identifier cannot be found, the returned list
3829 will contain an undefined value at that position.
3830
3831 In scalar context, the number of enumerations will be returned
3832 as long as the number of arguments to the method call is not 1.
3833 In the latter case, a hash reference holding information for
3834 the enumeration will be returned.
3835
3836 The list returned by the "enum" method looks similar to this:
3837
3838 @enum = (
3839 {
3840 'enumerators' => {
3841 'SOCK_STREAM' => 1,
3842 'SOCK_RAW' => 3,
3843 'SOCK_SEQPACKET' => 5,
3844 'SOCK_RDM' => 4,
3845 'SOCK_PACKET' => 10,
3846 'SOCK_DGRAM' => 2
3847 },
3848 'identifier' => '__socket_type',
3849 'context' => 'definitions.c(13)',
3850 'size' => 4,
3851 'sign' => 0
3852 }
3853 );
3854
3855 "identifier"
3856 holds the enumeration identifier. This key is not present
3857 if the enumeration has no identifier.
3858
3859 "context"
3860 is the context in which the enumeration is defined. This is
3861 the filename followed by the line number in parentheses.
3862
3863 "enumerators"
3864 is a reference to a hash table that holds all enumerators
3865 of the enumeration.
3866
3867 "sign"
3868 is a boolean indicating if the enumeration is signed (i.e.
3869 has negative values).
3870
3871 One useful application may be to create a hash table that holds
3872 all enumerators of all defined enumerations:
3873
3874 %enum = map %{ $_->{enumerators} || {} }, $c->enum;
3875
3876 The %enum hash table would then be:
3877
3878 %enum = (
3879 'SOCK_STREAM' => 1,
3880 'SOCK_RAW' => 3,
3881 'SOCK_SEQPACKET' => 5,
3882 'SOCK_RDM' => 4,
3883 'SOCK_DGRAM' => 2,
3884 'SOCK_PACKET' => 10
3885 );
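The same hash can be turned around to map numeric values back to enumerator names. This is a sketch for a single enumeration; it relies on the values being unique, and the %name_of hash is a name chosen for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
enum __socket_type
{
  SOCK_STREAM    = 1,
  SOCK_DGRAM     = 2,
  SOCK_RAW       = 3,
  SOCK_RDM       = 4,
  SOCK_SEQPACKET = 5,
  SOCK_PACKET    = 10
};
ENDC

# Scalar context with exactly one argument: a single hash reference.
my $enum = $c->enum('__socket_type');

# Swap keys and values to look up names by value.
my %name_of = reverse %{ $enum->{enumerators} };

print "2 is $name_of{2}\n";
```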
3886
3887 compound_names
3888 "compound_names"
3889 Returns a list of identifiers of all structs and unions
3890 (compound data structures) that are defined in the parsed
3891 source code. Like enumerations, compounds don't need to have an
3892 identifier, nor do they need to be defined.
3893
3894 Again, the only way to retrieve information about all struct
3895 and union objects is to call the "compound" method without
3896 any arguments. If you need a list of all struct and union
3897 identifiers, you can use:
3898
3899 @compound = map { $_->{identifier} || () } $c->compound;
3900
3901 The "def" method returns a true value for all identifiers
3902 returned by "compound_names".
3903
3904 If you need the names of only the structs or only the unions,
3905 use the "struct_names" and "union_names" methods respectively.
3906
3907 compound
3908 "compound"
3909 "compound" LIST
3910 Returns a list of references to hashes containing detailed
3911 information about all compounds (structs and unions) that have
3912 been parsed.
3913
3914 If a list of struct/union identifiers is passed to the method,
3915 the returned list will only contain hash references for those
3916 compounds. The identifiers may optionally be prefixed by
3917 "struct" or "union", which limits the search to the specified
3918 kind of compound.
3919
3920 If an identifier cannot be found, the returned list will
3921 contain an undefined value at that position.
3922
3923 In scalar context, the number of compounds will be returned as
3924 long as the number of arguments to the method call is not 1. In
3925 the latter case, a hash reference holding information for the
3926 compound will be returned.
3927
3928 The list returned by the "compound" method looks similar to
3929 this:
3930
3931 @compound = (
3932 {
3933 'identifier' => 'STRUCT_SV',
3934 'align' => 1,
3935 'context' => 'definitions.c(23)',
3936 'pack' => 0,
3937 'type' => 'struct',
3938 'declarations' => [
3939 {
3940 'declarators' => [
3941 {
3942 'declarator' => '*sv_any',
3943 'size' => 4,
3944 'offset' => 0
3945 }
3946 ],
3947 'type' => 'void'
3948 },
3949 {
3950 'declarators' => [
3951 {
3952 'declarator' => 'sv_refcnt',
3953 'size' => 4,
3954 'offset' => 4
3955 }
3956 ],
3957 'type' => 'U32'
3958 },
3959 {
3960 'declarators' => [
3961 {
3962 'declarator' => 'sv_flags',
3963 'size' => 4,
3964 'offset' => 8
3965 }
3966 ],
3967 'type' => 'U32'
3968 }
3969 ],
3970 'size' => 12
3971 },
3972 {
3973 'identifier' => 'xxx',
3974 'align' => 1,
3975 'context' => 'definitions.c(31)',
3976 'pack' => 0,
3977 'type' => 'struct',
3978 'declarations' => [
3979 {
3980 'declarators' => [
3981 {
3982 'declarator' => 'a',
3983 'size' => 4,
3984 'offset' => 0
3985 }
3986 ],
3987 'type' => 'int'
3988 },
3989 {
3990 'declarators' => [
3991 {
3992 'declarator' => 'b',
3993 'size' => 4,
3994 'offset' => 4
3995 }
3996 ],
3997 'type' => 'int'
3998 }
3999 ],
4000 'size' => 8
4001 },
4002 {
4003 'align' => 1,
4004 'context' => 'definitions.c(29)',
4005 'pack' => 0,
4006 'type' => 'union',
4007 'declarations' => [
4008 {
4009 'declarators' => [
4010 {
4011 'declarator' => 'abc[2]',
4012 'size' => 8,
4013 'offset' => 0
4014 }
4015 ],
4016 'type' => 'int'
4017 },
4018 {
4019 'declarators' => [
4020 {
4021 'declarator' => 'ab[3][4]',
4022 'size' => 96,
4023 'offset' => 0
4024 }
4025 ],
4026 'type' => 'struct xxx'
4027 },
4028 {
4029 'declarators' => [
4030 {
4031 'declarator' => 'ptr',
4032 'size' => 4,
4033 'offset' => 0
4034 }
4035 ],
4036 'type' => 'any'
4037 }
4038 ],
4039 'size' => 96
4040 }
4041 );
4042
4043 "identifier"
4044 holds the struct or union identifier. This key is not
4045 present if the compound has no identifier.
4046
4047 "context"
4048 is the context in which the struct or union is defined.
4049 This is the filename followed by the line number in
4050 parentheses.
4051
4052 "type"
4053 is either 'struct' or 'union'.
4054
4055 "size"
4056 is the size of the struct or union.
4057
4058 "align"
4059 is the alignment of the struct or union.
4060
4061 "pack"
4062 is the struct member alignment if the compound is packed,
4063 or zero otherwise.
4064
4065 "declarations"
4066 is an array of hash references describing each struct
4067 declaration:
4068
4069 "type"
4070 is the type of the struct declaration. This may be a
4071 string or a reference to a hash describing the type.
4072
4073 "declarators"
4074 is an array of hashes describing each declarator:
4075
4076 "declarator"
4077 is a string representation of the declarator.
4078
4079 "offset"
4080 is the offset of the struct member represented by
4081 the current declarator relative to the beginning of
4082 the struct or union.
4083
4084 "size"
4085 is the size occupied by the struct member
4086 represented by the current declarator.
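Walking this structure is straightforward. The sketch below prints one line per declarator of "STRUCT_SV", assuming 4-byte pointers and a 4-byte "long"; the %offset_of hash exists only for this illustration.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Layout assumptions: 4-byte pointers, 4-byte long.
my $c = Convert::Binary::C->new(PointerSize => 4, LongSize => 4)
                          ->parse(<<'ENDC');
typedef unsigned long U32;

struct STRUCT_SV {
  void *sv_any;
  U32   sv_refcnt;
  U32   sv_flags;
};
ENDC

# Scalar context with exactly one argument: a single hash reference.
my $sv = $c->compound('STRUCT_SV');
my %offset_of;   # illustrative: declarator => offset

for my $decl (@{ $sv->{declarations} }) {
  # 'type' is either a plain string or a hash describing a nested type.
  my $type = ref $decl->{type} ? '(nested definition)' : $decl->{type};
  for my $d (@{ $decl->{declarators} }) {
    $offset_of{ $d->{declarator} } = $d->{offset};
    printf "%-20s %-10s offset %2d, size %2d\n",
           $type, $d->{declarator}, $d->{offset}, $d->{size};
  }
}
```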
4087
4088 It may be useful to have separate lists for structs and unions.
4089 One way to retrieve such lists would be to use
4090
4091 push @{$_->{type} eq 'union' ? \@unions : \@structs}, $_
4092 for $c->compound;
4093
4094 However, it is a lot simpler to use the "struct" and "union"
4095 methods:
4096
4097 @structs = $c->struct;
4098 @unions = $c->union;
4099
4100 struct_names
4101 "struct_names"
4102 Returns a list of all defined struct identifiers. This is
4103 equivalent to calling "compound_names", just that it only
4104 returns the names of the struct identifiers and doesn't return
4105 the names of the union identifiers.
4106
4107 struct
4108 "struct"
4109 "struct" LIST
4110 Like the "compound" method, but only allows for structs.
4111
4112 union_names
4113 "union_names"
4114 Returns a list of all defined union identifiers. This is
4115 equivalent to calling "compound_names", just that it only
4116 returns the names of the union identifiers and doesn't return
4117 the names of the struct identifiers.
4118
4119 union
4120 "union"
4121 "union" LIST
4122 Like the "compound" method, but only allows for unions.
4123
4124 typedef_names
4125 "typedef_names"
4126 Returns a list of all defined typedef identifiers. Typedefs
4127 that do not specify a type that you could actually work with
4128 will not be returned.
4129
4130 The "def" method returns a true value for all identifiers
4131 returned by "typedef_names".
4132
4133 typedef
4134 "typedef"
4135 "typedef" LIST
4136 Returns a list of references to hashes containing detailed
4137 information about all typedefs that have been parsed.
4138
4139 If a list of typedef identifiers is passed to the method, the
4140 returned list will only contain hash references for those
4141 typedefs.
4142
4143 If an identifier cannot be found, the returned list will
4144 contain an undefined value at that position.
4145
In scalar context, the number of typedefs will be returned
unless exactly one argument is passed to the method. In that
case, a hash reference holding information for the typedef
will be returned.
4150
4151 The list returned by the "typedef" method looks similar to
4152 this:
4153
4154 @typedef = (
4155 {
4156 'declarator' => 'U32',
4157 'type' => 'unsigned long'
4158 },
4159 {
4160 'declarator' => '*any',
4161 'type' => 'void'
4162 },
4163 {
4164 'declarator' => 'test',
4165 'type' => {
4166 'align' => 1,
4167 'context' => 'definitions.c(29)',
4168 'pack' => 0,
4169 'type' => 'union',
4170 'declarations' => [
4171 {
4172 'declarators' => [
4173 {
4174 'declarator' => 'abc[2]',
4175 'size' => 8,
4176 'offset' => 0
4177 }
4178 ],
4179 'type' => 'int'
4180 },
4181 {
4182 'declarators' => [
4183 {
4184 'declarator' => 'ab[3][4]',
4185 'size' => 96,
4186 'offset' => 0
4187 }
4188 ],
4189 'type' => 'struct xxx'
4190 },
4191 {
4192 'declarators' => [
4193 {
4194 'declarator' => 'ptr',
4195 'size' => 4,
4196 'offset' => 0
4197 }
4198 ],
4199 'type' => 'any'
4200 }
4201 ],
4202 'size' => 96
4203 }
4204 }
4205 );
4206
4207 "declarator"
4208 is the type declarator.
4209
4210 "type"
is the type specification. This may be a string or a
reference to a hash describing the type. See "enum" and
"compound" for a description of how to interpret this hash.
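
The context rules described for "typedef" can be tried out with a short sketch. It reuses the "U32" and "any" typedefs from the sample output; "no_such_type" is a deliberately undefined name chosen for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
typedef unsigned long U32;
typedef void *any;
ENDC

# Scalar context, no arguments: the number of typedefs
my $count = $c->typedef;

# Scalar context, exactly one argument: a hash reference
my $u32 = $c->typedef('U32');

# List context: one entry per requested name, undef for unknown names
my @info = $c->typedef('U32', 'no_such_type');

print "typedefs parsed: $count\n";
print "U32 is '$u32->{type}'\n";
print "second entry is ", (defined $info[1] ? 'defined' : 'undef'), "\n";

# def() should be true for everything that typedef_names returns
print "def('$_') is true\n" for grep { $c->def($_) } $c->typedef_names;
```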
4214
4215 macro_names
4216 "macro_names"
4217 Returns a list of all defined macro names.
4218
4219 The list returned by the "macro_names" method looks similar to
4220 this:
4221
4222 @macro_names = (
4223 '__STDC_VERSION__',
4224 '__STDC_HOSTED__',
4225 'DEFINED',
4226 'MULTIPLY',
4227 'ABC_SIZE'
4228 );
4229
4230 This works only as long as the preprocessor is not reset. See
4231 "Preprocessor configuration" for details.
4232
4233 macro
4234 "macro"
4235 "macro" LIST
4236 Returns the definitions for all defined macros.
4237
4238 If a list of macro names is passed to the method, the returned
4239 list will only contain the definitions for those macros. For
4240 undefined macros, "undef" will be returned.
4241
4242 The list returned by the "macro" method looks similar to this:
4243
4244 @macro = (
4245 '__STDC_VERSION__ 199901L',
4246 '__STDC_HOSTED__ 1',
4247 'DEFINED',
4248 'MULTIPLY(x, y) ((x)*(y))',
4249 'ABC_SIZE 2'
4250 );
4251
4252 This works only as long as the preprocessor is not reset. See
4253 "Preprocessor configuration" for details.
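
A minimal sketch of both macro methods follows. The two macros are taken from the sample output above, while "NOT_DEFINED_ANYWHERE" is a made-up name used to show the "undef" result. The preprocessor is assumed not to have been reset after parsing.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
#define ABC_SIZE 2
#define MULTIPLY(x, y) ((x)*(y))
ENDC

# Build a lookup table of all macro names known to the preprocessor
my %is_macro = map { $_ => 1 } $c->macro_names;

# Request two definitions; the second name is intentionally undefined
my ($abc, $missing) = $c->macro('ABC_SIZE', 'NOT_DEFINED_ANYWHERE');

print "ABC_SIZE is defined as: $abc\n";
print "MULTIPLY is ", ($is_macro{MULTIPLY} ? '' : 'not '), "a known macro\n";
print "NOT_DEFINED_ANYWHERE is ",
      (defined $missing ? 'defined' : 'undefined'), "\n";
```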
4254
4256 You can alternatively call the following functions as methods on
4257 Convert::Binary::C objects.
4258
4259 feature
4260 "feature" STRING
4261 Checks if Convert::Binary::C was built with certain features.
4262 For example,
4263
4264 print "debugging version"
4265 if Convert::Binary::C::feature('debug');
4266
4267 will check if Convert::Binary::C was built with debugging
4268 support enabled. The "feature" function returns 1 if the
4269 feature is enabled, 0 if the feature is disabled, and "undef"
4270 if the feature is unknown. Currently the only features that can
4271 be checked are "ieeefp" and "debug".
4272
4273 You can enable or disable certain features at compile time of
4274 the module by using the
4275
4276 perl Makefile.PL enable-feature disable-feature
4277
4278 syntax.
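
The three possible return values of "feature" can be told apart as in the following sketch. The name "no-such-feature" is invented here to provoke the "undef" result.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my %state;

for my $name (qw( ieeefp debug no-such-feature )) {
  my $f = Convert::Binary::C::feature($name);

  # undef: unknown feature, 0: disabled, 1: enabled
  $state{$name} = !defined $f ? 'unknown'
                : $f          ? 'enabled'
                :               'disabled';

  printf "%-16s %s\n", $name, $state{$name};
}
```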
4279
4280 native
4281 "native"
4282 "native" STRING
4283 Returns the value of a property of the native system that
4284 Convert::Binary::C was built on. For example,
4285
4286 $size = Convert::Binary::C::native('IntSize');
4287
4288 will fetch the size of an "int" on the native system. The
4289 following properties can be queried:
4290
4291 Alignment
4292 ByteOrder
4293 CharSize
4294 CompoundAlignment
4295 DoubleSize
4296 EnumSize
4297 FloatSize
4298 HostedC
4299 IntSize
4300 LongDoubleSize
4301 LongLongSize
4302 LongSize
4303 PointerSize
4304 ShortSize
4305 StdCVersion
4306 UnsignedBitfields
4307 UnsignedChars
4308
4309 You can also call "native" without arguments, in which case it
4310 will return a reference to a hash with all properties, like:
4311
4312 $native = {
4313 'StdCVersion' => undef,
4314 'ByteOrder' => 'LittleEndian',
4315 'LongSize' => 4,
4316 'IntSize' => 4,
4317 'HostedC' => 1,
4318 'ShortSize' => 2,
4319 'UnsignedChars' => 0,
4320 'DoubleSize' => 8,
4321 'CharSize' => 1,
4322 'EnumSize' => 4,
4323 'PointerSize' => 4,
4324 'FloatSize' => 4,
4325 'LongLongSize' => 8,
4326 'Alignment' => 4,
4327 'LongDoubleSize' => 12,
4328 'UnsignedBitfields' => 0,
4329 'CompoundAlignment' => 1
4330 };
4331
The contents of that hash are suitable for passing to the
"configure" method.
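
One possible use, sketched below, is to bring an object that was created with non-native settings back to the native configuration. The typedef name "plain_int" and the initial "IntSize" of 2 are assumptions made only for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Start with a deliberately non-native setting ...
my $c = Convert::Binary::C->new(IntSize => 2);

# ... then apply all native properties in one go.
my $native = Convert::Binary::C::native();
$c->configure(%$native);

# 'plain_int' is a hypothetical typedef used to query the size of an int.
$c->parse('typedef int plain_int;');

printf "native int size: %d, configured int size: %d\n",
       Convert::Binary::C::native('IntSize'), $c->sizeof('plain_int');
```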
4334
Like perl itself, Convert::Binary::C can be compiled with debugging
support that can then be selectively enabled at runtime. You can
specify whether you would like to build Convert::Binary::C with
debugging support by explicitly giving an argument to Makefile.PL. Use
4340
4341 perl Makefile.PL enable-debug
4342
4343 to enable debugging, or
4344
4345 perl Makefile.PL disable-debug
4346
4347 to disable debugging. The default will depend on how your perl binary
4348 was built. If it was built with "-DDEBUGGING", Convert::Binary::C will
4349 be built with debugging support, too.
4350
4351 Once you have built Convert::Binary::C with debugging support, you can
4352 use the following syntax to enable debug output. Instead of
4353
4354 use Convert::Binary::C;
4355
4356 you simply say
4357
4358 use Convert::Binary::C debug => 'all';
4359
which will enable all debug output. However, I don't recommend
enabling all debug output, because it can produce a fairly large
amount of data.
4362
4363 Debugging options
4364 Instead of saying "all", you can pass a string that consists of one or
4365 more of the following characters:
4366
4367 m enable memory allocation tracing
4368 M enable memory allocation & assertion tracing
4369
4370 h enable hash table debugging
4371 H enable hash table dumps
4372
4373 d enable debug output from the XS module
4374 c enable debug output from the ctlib
4375 t enable debug output about type objects
4376
4377 l enable debug output from the C lexer
4378 p enable debug output from the C parser
4379 P enable debug output from the C preprocessor
4380 r enable debug output from the #pragma parser
4381
4382 y enable debug output from yacc (bison)
4383
4384 So the following might give you a brief overview of what's going on
4385 inside Convert::Binary::C:
4386
4387 use Convert::Binary::C debug => 'dct';
4388
4389 When you want to debug memory allocation using
4390
4391 use Convert::Binary::C debug => 'm';
4392
4393 you can use the Perl script check_alloc.pl that resides in the
4394 ctlib/util/tool directory to extract statistics about memory usage and
4395 information about memory leaks from the resulting debug output.
4396
4397 Redirecting debug output
4398 By default, all debug output is written to "stderr". You can, however,
4399 redirect the debug output to a file with the "debugfile" option:
4400
4401 use Convert::Binary::C debug => 'dcthHm',
4402 debugfile => './debug.out';
4403
If the file cannot be opened, you'll receive a warning and the output
will fall back to "stderr".
4406
4407 Alternatively, you can use the environment variables "CBC_DEBUG_OPT"
4408 and "CBC_DEBUG_FILE" to turn on debug output.
4409
4410 If Convert::Binary::C is built without debugging support, passing the
4411 "debug" or "debugfile" options will cause a warning to be issued. The
4412 corresponding environment variables will simply be ignored.
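
To avoid that warning in code that may run against either kind of build, the options can be passed only when the "debug" feature is present. The sketch below relies on the general Perl rule that "use Module LIST" calls the module's "import" method with LIST; calling "import" explicitly at runtime is an assumption of this example, not something this documentation prescribes.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $debug_on = 0;

if (Convert::Binary::C::feature('debug')) {
  # Roughly the runtime counterpart of
  #   use Convert::Binary::C debug => 'dct', debugfile => './debug.out';
  Convert::Binary::C->import(debug => 'dct', debugfile => './debug.out');
  $debug_on = 1;
}

print $debug_on ? "debug output goes to ./debug.out\n"
                : "built without debugging support\n";
```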
4413
4415 "CBC_ORDER_MEMBERS"
4416 Setting this variable to a non-zero value will globally turn on hash
4417 key ordering for compound members. Have a look at the "OrderMembers"
4418 option for details.
4419
Setting the variable to the name of a perl module will additionally
cause the hashes to be tied to that module instead of one of the
predefined modules for member ordering.
4423
4424 "CBC_DEBUG_OPT"
4425 If Convert::Binary::C is built with debugging support, you can use this
4426 variable to specify the debugging options.
4427
4428 "CBC_DEBUG_FILE"
4429 If Convert::Binary::C is built with debugging support, you can use this
4430 variable to redirect the debug output to a file.
4431
4432 "CBC_DISABLE_PARSER"
This variable is intended purely for development. Setting it to a
non-zero value disables the Convert::Binary::C parser, which means that
no information is collected from the file or code that is parsed.
However, the preprocessor will still run, which is useful for
benchmarking the preprocessor.
4438
4440 Flexible array members are a feature introduced with ISO-C99. It's a
4441 common problem that you have a variable length data field at the end of
4442 a structure, for example an array of characters at the end of a message
4443 struct. ISO-C99 allows you to write this as:
4444
4445 struct message {
4446 long header;
4447 char data[];
4448 };
4449
4450 The advantage is that you clearly indicate that the size of the
4451 appended data is variable, and that the "data" member doesn't
4452 contribute to the size of the "message" structure.
4453
When packing or unpacking data, Convert::Binary::C deals with flexible
array members as if their length were adjustable. For example, "unpack"
will adapt the length of the array depending on the input string:
4457
4458 $msg1 = $c->unpack('message', 'abcdefg');
4459 $msg2 = $c->unpack('message', 'abcdefghijkl');
4460
4461 The following data is unpacked:
4462
4463 $msg1 = {
4464 'data' => [
4465 101,
4466 102,
4467 103
4468 ],
4469 'header' => 1633837924
4470 };
4471 $msg2 = {
4472 'data' => [
4473 101,
4474 102,
4475 103,
4476 104,
4477 105,
4478 106,
4479 107,
4480 108
4481 ],
4482 'header' => 1633837924
4483 };
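
The values above can be reproduced with a self-contained sketch. The "BigEndian" byte order and a "LongSize" of 4 are assumptions inferred from the header value shown, since 'abcd' read as a big-endian 32-bit integer is 0x61626364, which is 1633837924.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Assumed configuration: big-endian, 4-byte long
my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', LongSize => 4);

$c->parse(<<'ENDC');
struct message {
  long header;
  char data[];
};
ENDC

my $msg = $c->unpack('message', 'abcdefg');

# Cross-check the header with Perl's built-in unpack ('N' = 32-bit big-endian)
my $header_check = unpack('N', 'abcd');

printf "header = %d (built-in unpack gives %d)\n",
       $msg->{header}, $header_check;
print  "data   = @{ $msg->{data} }\n";

# The flexible array member does not contribute to the struct size
printf "sizeof(message) = %d\n", $c->sizeof('message');
```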
4484
4485 Similarly, pack will adjust the length of the output string according
4486 to the data you feed in:
4487
4488 use Data::Hexdumper;
4489
4490 $msg = {
4491 header => 4711,
4492 data => [0x10, 0x20, 0x30, 0x40, 0x77..0x88],
4493 };
4494
4495 $data = $c->pack('message', $msg);
4496
4497 print hexdump(data => $data);
4498
4499 This would print:
4500
4501 0x0000 : 00 00 12 67 10 20 30 40 77 78 79 7A 7B 7C 7D 7E : ...g..0@wxyz{|}~
4502 0x0010 : 7F 80 81 82 83 84 85 86 87 88 : ..........
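
If Data::Hexdumper is not at hand, the same bytes can be inspected with Perl's built-in "unpack". This is a minimal sketch that again assumes a big-endian configuration with 4-byte longs, which matches the dump above.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Assumed configuration: big-endian, 4-byte long
my $c = Convert::Binary::C->new(ByteOrder => 'BigEndian', LongSize => 4);
$c->parse('struct message { long header; char data[]; };');

my $data = $c->pack('message', {
  header => 4711,
  data   => [0x10, 0x20, 0x30, 0x40, 0x77..0x88],
});

# 4 header bytes plus 22 data bytes
printf "packed length: %d bytes\n", length $data;

# Print every byte as two hex digits
print join(' ', map { sprintf '%02X', $_ } unpack('C*', $data)), "\n";
```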
4503
4504 Incomplete types such as
4505
4506 typedef unsigned long array[];
4507
are handled in exactly the same way. Thus, you can simply write
4509
4510 $array = $c->unpack('array', '?'x20);
4511
4512 which will unpack the following array:
4513
4514 $array = [
4515 1061109567,
4516 1061109567,
4517 1061109567,
4518 1061109567,
4519 1061109567
4520 ];
4521
4522 You can also alter the length of an array using the "Dimension" tag.
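
A short sketch shows how the element count follows the length of the input string. It assumes a "LongSize" of 4, which is what the five elements above imply for a 20 byte string. Since all four bytes of each element are identical, the byte order does not affect the result.

```perl
use strict;
use warnings;
use Convert::Binary::C;

# Assumed configuration: 4-byte long
my $c = Convert::Binary::C->new(LongSize => 4);
$c->parse('typedef unsigned long array[];');

my $five = $c->unpack('array', '?' x 20);   # 20 bytes / 4 = 5 elements
my $two  = $c->unpack('array', '?' x 8);    #  8 bytes / 4 = 2 elements

printf "%d elements from 20 bytes, %d elements from 8 bytes\n",
       scalar @$five, scalar @$two;

# '?' is 0x3F, so every element is 0x3F3F3F3F
printf "each element is 0x%08X\n", $five->[0];
```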
4523
4525 When using Convert::Binary::C to handle floating point values, you have
4526 to be aware of some limitations.
4527
4528 You're usually safe if all your platforms are using the IEEE floating
4529 point format. During the Convert::Binary::C build process, the "ieeefp"
4530 feature will automatically be enabled if the host is using IEEE
4531 floating point. You can check for this feature at runtime using the
4532 "feature" function:
4533
4534 if (Convert::Binary::C::feature('ieeefp')) {
4535 # do something
4536 }
4537
4538 When IEEE floating point support is enabled, the module can also handle
4539 floating point values of a different byteorder.
4540
4541 If your host platform is not using IEEE floating point, the "ieeefp"
4542 feature will be disabled. Convert::Binary::C then will be more
4543 restrictive, refusing to handle any non-native floating point values.
4544
4545 However, Convert::Binary::C cannot detect the floating point format
4546 used by your target platform. It can only try to prevent problems in
4547 obvious cases. If you know your target platform has a completely
4548 different floating point format, don't use floating point conversion at
4549 all.
4550
4551 Whenever Convert::Binary::C detects that it cannot properly do floating
4552 point value conversion, it will issue a warning and will not attempt to
4553 convert the floating point value.
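
The following sketch guards a floating point conversion with the "ieeefp" feature check and packs the same value in both byte orders. The typedef name "dbl" and the "DoubleSize" of 8 are assumptions of this example. In IEEE double precision, 1.5 is 0x3FF8000000000000.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my (%hex, %back);

if (Convert::Binary::C::feature('ieeefp')) {
  # 'dbl' is a hypothetical typedef; DoubleSize 8 is assumed.
  my $c = Convert::Binary::C->new(DoubleSize => 8);
  $c->parse('typedef double dbl;');

  for my $order (qw( BigEndian LittleEndian )) {
    $c->configure(ByteOrder => $order);

    my $bin = $c->pack('dbl', 1.5);
    $hex{$order}  = unpack('H*', $bin);       # raw bytes as hex digits
    $back{$order} = $c->unpack('dbl', $bin);  # round trip

    printf "%-12s %s -> %s\n", $order, $hex{$order}, $back{$order};
  }
}
else {
  print "no IEEE floating point support, skipping conversion\n";
}
```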
4554
4556 Bitfield support in Convert::Binary::C is currently in an experimental
4557 state. You are encouraged to test it, but you should not blindly rely
4558 on its results.
4559
You are also encouraged to supply layout algorithms for compilers
whose bitfield implementation is not handled correctly at the moment.
Even better than a plain algorithm is, of course, a patch that adds a
new bitfield layout engine.
4564
4565 While bitfields may not be handled correctly by the conversion routines
4566 yet, they are always parsed correctly. This means that you can reliably
4567 use the declarator fields as returned by the "struct" or "typedef"
4568 methods. Given the following source
4569
4570 struct bitfield {
4571 int seven:7;
4572 int :1;
4573 int four:4, :0;
4574 int integer;
4575 };
4576
4577 a call to "struct" will return
4578
4579 @struct = (
4580 {
4581 'identifier' => 'bitfield',
4582 'align' => 1,
4583 'context' => 'bitfields.c(1)',
4584 'pack' => 0,
4585 'type' => 'struct',
4586 'declarations' => [
4587 {
4588 'declarators' => [
4589 {
4590 'declarator' => 'seven:7'
4591 }
4592 ],
4593 'type' => 'int'
4594 },
4595 {
4596 'declarators' => [
4597 {
4598 'declarator' => ':1'
4599 }
4600 ],
4601 'type' => 'int'
4602 },
4603 {
4604 'declarators' => [
4605 {
4606 'declarator' => 'four:4'
4607 },
4608 {
4609 'declarator' => ':0'
4610 }
4611 ],
4612 'type' => 'int'
4613 },
4614 {
4615 'declarators' => [
4616 {
4617 'declarator' => 'integer',
4618 'size' => 4,
4619 'offset' => 4
4620 }
4621 ],
4622 'type' => 'int'
4623 }
4624 ],
4625 'size' => 8
4626 }
4627 );
4628
4629 No size/offset keys will currently be returned for bitfield entries.
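
Code that walks such a result should therefore check for the keys before using them. The sketch below flattens the declarators of the struct shown above and tells bitfields from ordinary members. Recognising bitfields by a trailing ":N" in the declarator string is a convention assumed for this example.

```perl
use strict;
use warnings;
use Convert::Binary::C;

my $c = Convert::Binary::C->new->parse(<<'ENDC');
struct bitfield {
  int seven:7;
  int :1;
  int four:4, :0;
  int integer;
};
ENDC

my $info = $c->struct('bitfield');

# Flatten all declarators of all declarations into one list
my @declarators = map { @{ $_->{declarators} } } @{ $info->{declarations} };

# Assumed convention: a bitfield declarator ends in ':<width>'
my @bitfields = grep { $_->{declarator} =~ /:\d+$/ } @declarators;

for my $d (@declarators) {
  printf "%-8s %s\n", $d->{declarator},
         exists $d->{offset} ? "offset $d->{offset}, size $d->{size}"
                             : 'no size/offset reported';
}
```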
4630
4632 Convert::Binary::C was designed to be thread-safe.
4633
4635 If you wish to derive a new class from Convert::Binary::C, this is
4636 relatively easy. Despite their XS implementation, Convert::Binary::C
4637 objects are actually blessed hash references.
4638
4639 The XS data is stored in a read-only hash value for the key that is the
4640 empty string. So it is safe to use any non-empty hash key when deriving
4641 your own class. In addition, Convert::Binary::C does quite a lot of
4642 checks to detect corruption in the object hash.
4643
4644 If you store private data in the hash, you should override the "clone"
4645 method and provide the necessary code to clone your private data.
4646 You'll have to call "SUPER::clone", but this will only clone the
4647 Convert::Binary::C part of the object.
4648
4649 For an example of a derived class, you can have a look at
4650 Convert::Binary::C::Cached.
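
A minimal sketch of such a derived class follows. The package name "My::Converter", the "label" option and the hash key of the same name are all hypothetical; only "new", "clone" and the use of a non-empty hash key follow the description above.

```perl
package My::Converter;     # hypothetical subclass name

use strict;
use warnings;
use Convert::Binary::C;

our @ISA = ('Convert::Binary::C');

sub new {
  my ($class, %args) = @_;
  my $label = delete $args{label};     # private option, not for the base class
  my $self  = $class->SUPER::new(%args);
  $self->{label} = $label;             # any non-empty key is safe to use
  return $self;
}

sub clone {
  my $self  = shift;
  my $clone = $self->SUPER::clone;     # clones only the Convert::Binary::C part
  $clone->{label} = $self->{label};    # private data has to be copied by hand
  return $clone;
}

package main;

my $orig = My::Converter->new(label => 'wire format', ByteOrder => 'BigEndian');
my $copy = $orig->clone;

print "label of the clone: $copy->{label}\n";
```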
4651
4653 Convert::Binary::C should build and run on most of the platforms that
4654 Perl runs on:
4655
4656 · Various Linux systems
4657
4658 · Various BSD systems
4659
4660 · HP-UX
4661
4662 · Compaq/HP Tru64 Unix
4663
· Mac OS X
4665
4666 · Cygwin
4667
4668 · Windows 98/NT/2000/XP
4669
4670 Also, many architectures are supported:
4671
4672 · Various Intel Pentium and Itanium systems
4673
4674 · Various Alpha systems
4675
4676 · HP PA-RISC
4677
· PowerPC
4679
4680 · StrongARM
4681
4682 The module should build with any perl binary from 5.004 up to the
4683 latest development version.
4684
4686 Most of the time when you're really looking for Convert::Binary::C
4687 you'll actually end up finding one of the following modules. Some of
4688 them have different goals, so it's probably worth pointing out the
4689 differences.
4690
4691 C::Include
Like Convert::Binary::C, this module aims at doing conversion from and
to binary data based on C types. However, its configurability is very
limited compared to Convert::Binary::C. Also, it does not parse all C
code correctly. It's slower than Convert::Binary::C and doesn't have a
preprocessor. On the plus side, it's written in pure Perl.
4697
4698 C::DynaLib::Struct
4699 This module doesn't allow you to reuse your C source code. One main
4700 goal of Convert::Binary::C was to avoid code duplication or, even
4701 worse, having to maintain different representations of your data
4702 structures. Like C::Include, C::DynaLib::Struct is rather limited in
4703 its configurability.
4704
4705 Win32::API::Struct
4706 This module has a special purpose. It aims at building structs for
4707 interfacing Perl code with Windows API code.
4708
4710 · My love Jennifer for always being there, for filling my life with joy
4711 and last but not least for proofreading the documentation.
4712
4713 · Alain Barbet <alian@cpan.org> for testing and debugging support.
4714
4715 · Mitchell N. Charity for giving me pointers into various interesting
4716 directions.
4717
4718 · Alexis Denis for making me improve (externally) and simplify
4719 (internally) floating point support. He can also be blamed
4720 (indirectly) for the "initializer" method, as I need it in my effort
4721 to support bitfields some day.
4722
4723 · Michael J. Hohmann <mjh@scientist.de> for endless discussions on our
4724 way to and back home from work, and for making me think about
4725 supporting "pack" and "unpack" for compound members.
4726
4727 · Thorsten Jens <thojens@gmx.de> for testing the package on various
4728 platforms.
4729
4730 · Mark Overmeer <mark@overmeer.net> for suggesting the module name and
4731 giving invaluable feedback.
4732
4733 · Thomas Pornin <pornin@bolet.org> for his excellent "ucpp"
4734 preprocessor library.
4735
4736 · Marc Rosenthal for his suggestions and support.
4737
4738 · James Roskind, as his C parser was a great starting point to fix all
4739 the problems I had with my original parser based only on the ANSI
4740 ruleset.
4741
4742 · Gisbert W. Selke for spotting some interesting bugs and providing
4743 extensive reports.
4744
4745 · Steffen Zimmermann for a prolific discussion on the cloning
4746 algorithm.
4747
4749 There's also a mailing list that you can join:
4750
4751 convert-binary-c@yahoogroups.com
4752
4753 To subscribe, simply send mail to:
4754
4755 convert-binary-c-subscribe@yahoogroups.com
4756
4757 You can use this mailing list for non-bug problems, questions or
4758 discussions.
4759
I'm sure there are still lots of bugs in the code for this module. If
you find any bugs, if Convert::Binary::C doesn't seem to build on your
system, or if any of its tests fail, please use the CPAN Request Tracker
at <http://rt.cpan.org/> to create a ticket for the module.
Alternatively, just send a mail to <mhx@cpan.org>.
4766
Some features in Convert::Binary::C are marked as experimental. This
is most probably for one of the following reasons:
4770
4771 · The feature does not behave in exactly the way that I wish it did,
4772 possibly due to some limitations in the current design of the module.
4773
4774 · The feature hasn't been tested enough and may completely fail to
4775 produce the expected results.
4776
4777 I hope to fix most issues with these experimental features someday, but
4778 this may mean that I have to change the way they currently work in a
4779 way that's not backwards compatible. So if any of these features is
4780 useful to you, you can use it, but you should be aware that the
4781 behaviour or the interface may change in future releases of this
4782 module.
4783
4785 If you're interested in what I currently plan to improve (or fix), have
4786 a look at the TODO file.
4787
4789 If you're using my module and like it, you can show your appreciation
4790 by sending me a postcard from where you live. I won't urge you to do
4791 it, it's completely up to you. To me, this is just a very nice way of
4792 receiving feedback about my work. Please send your postcard to:
4793
4794 Marcus Holland-Moritz
4795 Kuppinger Weg 28
4796 71116 Gaertringen
4797 GERMANY
4798
If you feel that sending a postcard is too much effort, you may want
to rate the module at <http://cpanratings.perl.org/>.
4801
4803 Copyright (c) 2002-2009 Marcus Holland-Moritz. All rights reserved.
4804 This program is free software; you can redistribute it and/or modify it
4805 under the same terms as Perl itself.
4806
4807 The "ucpp" library is (c) 1998-2002 Thomas Pornin. For license and
4808 redistribution details refer to ctlib/ucpp/README.
4809
4810 Portions copyright (c) 1989, 1990 James A. Roskind.
4811
The include files located in tests/include/include, which are used in
some of the test scripts, are (c) 1991-1999, 2000, 2001 Free Software
Foundation, Inc. They are neither required to create the binary nor
linked to the source code of this module in any other way.
4816
4818 See ccconfig, perl, perldata, perlop, perlvar, Data::Dumper and
4819 Scalar::Util.
4820
4821
4822
4823perl v5.12.0 2009-04-18 Convert::Binary::C(3)