Convert::BinHex(3)    User Contributed Perl Documentation   Convert::BinHex(3)

NAME

       Convert::BinHex - extract data from Macintosh BinHex files

       ALPHA WARNING: this code is currently in its Alpha release.  Things may
       change drastically until the interface is hammered out: if you have
       suggestions or objections, please speak up now!

SYNOPSIS

       Simple functions:

           use Convert::BinHex qw(binhex_crc macbinary_crc);

           # Compute HQX7-style CRC for data, pumping in old CRC if desired:
           $crc = binhex_crc($data, $crc);

           # Compute the MacBinary-II-style CRC for the data:
           $crc = macbinary_crc($data, $crc);

       Hex to bin, low-level interface.  Conversion is actually done via an
       object ("Convert::BinHex::Hex2Bin") which keeps internal conversion
       state:

           # Create and use a "translator" object:
           my $H2B = Convert::BinHex->hex2bin;    # get a converter object
           while (<STDIN>) {
               print STDOUT $H2B->next($_);       # convert some more input
           }
           print STDOUT $H2B->done;               # no more input: finish up

       Hex to bin, OO interface.  The following operations must be done in the
       order shown!

           # Read data in piecemeal:
           $HQX = Convert::BinHex->open(FH=>\*STDIN) || die "open: $!";
           $HQX->read_header;                  # read header info
           @data = $HQX->read_data;            # read in all the data
           @rsrc = $HQX->read_resource;        # read in all the resource

       Bin to hex, low-level interface.  Conversion is actually done via an
       object ("Convert::BinHex::Bin2Hex") which keeps internal conversion
       state:

           # Create and use a "translator" object:
           my $B2H = Convert::BinHex->bin2hex;    # get a converter object
           while (<STDIN>) {
               print STDOUT $B2H->next($_);       # convert some more input
           }
           print STDOUT $B2H->done;               # no more input: finish up

       Bin to hex, file interface.  Yes, you can convert to BinHex as well as
       from it!

           # Create new, empty object:
           my $HQX = Convert::BinHex->new;

           # Set header attributes:
           $HQX->filename("logo.gif");
           $HQX->type("GIFA");
           $HQX->creator("CNVS");

           # Give it the data and resource forks (either can be absent):
           $HQX->data(Path => "/path/to/data");       # here, data is on disk
           $HQX->resource(Data => $resourcefork);     # here, resource is in core

           # Output as a BinHex stream, complete with leading comment:
           $HQX->encode(\*STDOUT);

       PLANNED!!!! Bin to hex, "CAP" interface.  Thanks to Ken Lunde for
       suggesting this.

           # Create new, empty object from CAP tree:
           my $HQX = Convert::BinHex->from_cap("/path/to/root/file");
           $HQX->encode(\*STDOUT);

DESCRIPTION

       BinHex is a format used by Macintosh for transporting Mac files safely
       through electronic mail, as short-lined, 7-bit, semi-compressed data
       streams.  This module provides a means of converting those data streams
       back into binary data.

FORMAT

       (Some text taken from RFC-1741.)  Files on the Macintosh consist of two
       parts, called forks:

       Data fork
           The actual data included in the file.  The Data fork is typically
           the only meaningful part of a Macintosh file on a non-Macintosh
           computer system.  For example, if a Macintosh user wants to send a
           file of data to a user on an IBM-PC, she would only send the Data
           fork.

       Resource fork
           Contains a collection of arbitrary attribute/value pairs, including
           program segments, icon bitmaps, and parametric values.

       Additional information regarding Macintosh files is stored by the
       Finder in a hidden file, called the "Desktop Database".

       Because of the complications in storing different parts of a Macintosh
       file in a non-Macintosh filesystem that only handles consecutive data
       in one part, it is common to convert the Macintosh file into some other
       format before transferring it over the network.  The BinHex format
       squashes that data into transmittable ASCII as follows:

       1.  The file is output as a byte stream consisting of some basic header
           information (filename, type, creator), then the data fork, then the
           resource fork.

       2.  The byte stream is compressed by looking for series of duplicated
           bytes and representing them using a special binary escape sequence
           (of course, any occurrences of the escape character must also be
           escaped).

       3.  The compressed stream is encoded via the "6/8 hemiola" common to
           base64 and uuencode: each group of three 8-bit bytes (24 bits) is
           chopped into four 6-bit numbers, which are used as indexes into an
           ASCII "alphabet".  (I assume that leftover bytes are zero-padded;
           documentation is thin).
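
       As a rough illustration of step 3, the sketch below chops a byte
       string into 6-bit numbers.  It is not code from this module:
       six_bit_groups() is a hypothetical helper, and the zero-padding of a
       short final group follows the assumption stated above rather than a
       confirmed rule.  The mapping from each number to a character of the
       BinHex alphabet is left out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Convert::BinHex): split a byte string
# into the 6-bit numbers that would index the encoding alphabet.
sub six_bit_groups {
    my ($bytes) = @_;
    my $bits = unpack('B*', $bytes);          # bytes as a string of "0"/"1" characters
    $bits .= '0' x (-length($bits) % 6);      # ASSUMPTION: zero-pad a short final group
    return map { oct("0b$_") } $bits =~ /(.{6})/g;
}

my @groups = six_bit_groups("Mac");           # three bytes -> four 6-bit numbers
print "@groups\n";                            # prints "19 22 5 35"
```

       Three input bytes always yield four numbers, which is why the encoded
       form is about a third larger than the compressed stream.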

FUNCTIONS

       CRC computation

       macbinary_crc DATA, SEED
           Compute the MacBinary-II-style CRC for the given DATA, with the CRC
           seeded to SEED.  Normally, you start with a SEED of 0, and you pump
           in the previous CRC as the SEED if you're handling a lot of data
           one chunk at a time.  That is:

               $crc = 0;
               while (<STDIN>) {
                   $crc = macbinary_crc($_, $crc);
               }

           Note: Extracted from the mcvert utility (Doug Moore, April '87),
           using a "magic array" algorithm by Jim Van Verth for efficiency.
           Converted to Perl5 by Eryq.  Untested.

       binhex_crc DATA, SEED
           Compute the HQX-style CRC for the given DATA, with the CRC seeded
           to SEED.  Normally, you start with a SEED of 0, and you pump in the
           previous CRC as the SEED if you're handling a lot of data one chunk
           at a time.  That is:

               $crc = 0;
               while (<STDIN>) {
                   $crc = binhex_crc($_, $crc);
               }

           Note: Extracted from the mcvert utility (Doug Moore, April '87),
           using a "magic array" algorithm by Jim Van Verth for efficiency.
           Converted to Perl5 by Eryq.
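
       The chunk-at-a-time pattern works because the CRC returned for one
       chunk captures everything seen so far, so it can seed the next call.
       The sketch below shows this with a stand-in routine: crc16_sketch()
       is a hypothetical, plain bitwise CRC-16 (polynomial 0x1021) written
       for illustration only.  It is an assumption that it resembles the
       routines above in spirit; it is not claimed to return the same values
       as binhex_crc() or macbinary_crc().

```perl
use strict;
use warnings;

# Stand-in for illustration only: a plain bitwise CRC-16 (polynomial 0x1021).
# ASSUMPTION: this is not the module's table-driven code and is not claimed
# to return the same values as binhex_crc() or macbinary_crc().
sub crc16_sketch {
    my ($data, $crc) = @_;
    $crc = 0 unless defined $crc;             # a SEED of 0 starts a fresh CRC
    for my $byte (unpack 'C*', $data) {
        $crc ^= $byte << 8;                   # fold the next byte into the high bits
        for (1 .. 8) {
            $crc = ($crc & 0x8000) ? (($crc << 1) ^ 0x1021) : ($crc << 1);
            $crc &= 0xFFFF;                   # keep the CRC to 16 bits
        }
    }
    return $crc;
}

my $whole = crc16_sketch("hello, world");     # all the data in one call

my $chunked = 0;                              # same data, one chunk at a time
$chunked = crc16_sketch($_, $chunked) for ("hello", ", ", "world");

print $whole == $chunked ? "match\n" : "MISMATCH\n";   # prints "match"
```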

OO INTERFACE

       Conversion

       bin2hex
           Class method, constructor.  Return a converter object.  Just
           creates a new instance of "Convert::BinHex::Bin2Hex"; see that
           class for details.

       hex2bin
           Class method, constructor.  Return a converter object.  Just
           creates a new instance of "Convert::BinHex::Hex2Bin"; see that
           class for details.

       Construction

       new PARAMHASH
           Class method, constructor.  Return a handle on a BinHex'able
           entity.  In general, the data and resource forks for such an entity
           are stored in native (binary) format.

           Parameters in the PARAMHASH are the same as header-oriented method
           names, and may be used to set attributes:

               $HQX = Convert::BinHex->new(filename => "icon.gif",
                                           type     => "GIFB",
                                           creator  => "CNVS");

       open PARAMHASH
           Class method, constructor.  Return a handle on a new BinHex'ed
           stream, for parsing.  Params are:

           Data
               Input a HEX stream from the given data.  This can be a scalar,
               or a reference to an array of scalars.

           Expr
               Input a HEX stream from any open()able expression.  It will be
               opened and binmode'd, and the filehandle will be closed either
               on a "close()" or when the object is destructed.

           FH  Input a HEX stream from the given filehandle.

           NoComment
               If true, the parser should not attempt to skip a leading "(This
               file...)" comment.  That means that the first nonwhite
               characters encountered must be the binhex'ed data.

       Get/set header information

       creator [VALUE]
           Instance method.  Get/set the creator of the file.  This is a four-
           character string (though I don't know if it's guaranteed to be
           printable ASCII!)  that serves as part of the Macintosh's version
           of a MIME "content-type".

           For example, a document created by "Canvas" might have creator
           "CNVS".

       data [PARAMHASH]
           Instance method.  Get/set the data fork.  Any arguments are passed
           into the new() method of "Convert::BinHex::Fork".

       filename [VALUE]
           Instance method.  Get/set the name of the file.

       flags [VALUE]
           Instance method.  Return the flags, as an integer.  Use bitmasking
           to get at the values you need.

       header_as_string
           Return a stringified version of the header that you might use for
           logging/debugging purposes.  It looks like this:

               X-HQX-Software: BinHex 4.0 (Convert::BinHex 1.102)
               X-HQX-Filename: Something_new.eps
               X-HQX-Version: 0
               X-HQX-Type: EPSF
               X-HQX-Creator: ART5
               X-HQX-Data-Length: 49731
               X-HQX-Rsrc-Length: 23096

           As some of you might have guessed, this is RFC-822-style, and may
           be easily plunked down into the middle of a mail header, or split
           into lines, etc.
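
           For instance, a string in this format can be split back into a
           hash with a few lines of plain Perl.  This is only a sketch: the
           sample text is copied from the listing above, and %hdr is a
           hypothetical variable name.

```perl
use strict;
use warnings;

# Sample text in the format shown above; in real use it would come from
# $HQX->header_as_string.
my $header = <<'EOH';
X-HQX-Software: BinHex 4.0 (Convert::BinHex 1.102)
X-HQX-Filename: Something_new.eps
X-HQX-Version: 0
X-HQX-Type: EPSF
X-HQX-Creator: ART5
X-HQX-Data-Length: 49731
X-HQX-Rsrc-Length: 23096
EOH

# Split each "Name: value" line at the first colon; skip anything else.
my %hdr;
for my $line (split /\n/, $header) {
    my ($name, $value) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
    $hdr{$name} = $value;
}

print "$hdr{'X-HQX-Filename'} ($hdr{'X-HQX-Type'})\n";   # Something_new.eps (EPSF)
```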

       requires [VALUE]
           Instance method.  Get/set the software version required to convert
           this file, as extracted from the comment that preceded the actual
           binhex'ed data; e.g.:

               (This file must be converted with BinHex 4.0)

           In this case, after parsing in the comment, the code:

               $HQX->requires;

           would get back "4.0".
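
           The extraction can be pictured with a small regular expression.
           This is a sketch of the idea, not the module's parser:
           required_version() is a hypothetical helper, and the pattern
           assumes the comment has exactly the form shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the version number out of a BinHex comment line.
# Returns undef if the line does not look like the comment.
sub required_version {
    my ($line) = @_;
    return $line =~ /\(This file must be converted with BinHex\s+([\d.]+)\)/
        ? $1
        : undef;
}

my $version = required_version("(This file must be converted with BinHex 4.0)");
print "$version\n";    # prints "4.0"
```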

       resource [PARAMHASH]
           Instance method.  Get/set the resource fork.  Any arguments are
           passed into the new() method of "Convert::BinHex::Fork".

       type [VALUE]
           Instance method.  Get/set the type of the file.  This is a four-
           character string (though I don't know if it's guaranteed to be
           printable ASCII!)  that serves as part of the Macintosh's version
           of a MIME "content-type".

           For example, a GIF89a file might have type "GF89".

       version [VALUE]
           Instance method.  Get/set the version, as an integer.

       Decode, high-level

       read_comment
           Instance method.  Skip past the opening comment in the file, which
           is of the form:

              (This file must be converted with BinHex 4.0)

           As per RFC-1741, this comment must immediately precede the BinHex
           data, and any text before it will be ignored.

           You don't need to invoke this method yourself; "read_header()" will
           do it for you.  After the call, the version number in the comment
           is accessible via the "requires()" method.

       read_header
           Instance method.  Read in the BinHex file header.  You must do this
           first!

       read_data [NBYTES]
           Instance method.  Read information from the data fork.  Use it in
           an array context to slurp all the data into an array of scalars:

               @data = $HQX->read_data;

           Or use it in a scalar context to get the data piecemeal:

               while (defined($data = $HQX->read_data)) {
                  # do stuff with $data
               }

           The NBYTES to read defaults to 2048.

       read_resource [NBYTES]
           Instance method.  Read in all/some of the resource fork.  See
           "read_data()" for usage.

       Encode, high-level

       encode OUT
           Encode the object as a BinHex stream to the given output handle
           OUT.  OUT can be a filehandle, or any blessed object that responds
           to a "print()" message.

           The leading comment is output, using the "requires()" attribute.

SUBMODULES

       Convert::BinHex::Bin2Hex

       A BINary-to-HEX converter.  This kind of conversion requires a certain
       amount of state information; it cannot be done by just calling a simple
       function repeatedly.  Use it like this:

           # Create and use a "translator" object:
           my $B2H = Convert::BinHex->bin2hex;    # get a converter object
           while (<STDIN>) {
               print STDOUT $B2H->next($_);          # convert some more input
           }
           print STDOUT $B2H->done;               # no more input: finish up

           # Re-use the object:
           $B2H->rewind;                 # ready for more action!
           while (<MOREIN>) { ...

       On each iteration, "next()" (and "done()") may return either a decent-
       sized non-empty string (indicating that more converted data is ready
       for you) or an empty string (indicating that the converter is waiting
       to amass more input in its private buffers before handing you more
       stuff to output).

       Note that "done()" always converts and hands you whatever is left.

       This may have been a good approach.  It may not.  Someday, the
       converter may also allow you to give it an object that responds to
       read(), or a FileHandle, and it will do all the nasty buffer-filling on
       its own, serving you stuff line by line:

           # Someday, maybe...
           my $B2H = Convert::BinHex->bin2hex(\*STDIN);
           while (defined($_ = $B2H->getline)) {
               print STDOUT $_;
           }

       Someday, maybe.  Feel free to voice your opinions.

       Convert::BinHex::Hex2Bin

       A HEX-to-BINary converter.  This kind of conversion requires a certain
       amount of state information; it cannot be done by just calling a simple
       function repeatedly.  Use it like this:

           # Create and use a "translator" object:
           my $H2B = Convert::BinHex->hex2bin;    # get a converter object
           while (<STDIN>) {
               print STDOUT $H2B->next($_);          # convert some more input
           }
           print STDOUT $H2B->done;               # no more input: finish up

           # Re-use the object:
           $H2B->rewind;                 # ready for more action!
           while (<MOREIN>) { ...

       On each iteration, "next()" (and "done()") may return either a decent-
       sized non-empty string (indicating that more converted data is ready
       for you) or an empty string (indicating that the converter is waiting
       to amass more input in its private buffers before handing you more
       stuff to output).

       Note that "done()" always converts and hands you whatever is left.

       Note that this converter does not find the initial "BinHex version"
       comment.  You have to skip that yourself.  It only handles data between
       the opening and closing ":".

       Convert::BinHex::Fork

       A fork in a Macintosh file.

           # How to get them...
           $data_fork = $HQX->data;      # get the data fork
           $rsrc_fork = $HQX->resource;  # get the resource fork

           # Make a new fork:
           $FORK = Convert::BinHex::Fork->new(Path => "/tmp/file.data");
           $FORK = Convert::BinHex::Fork->new(Data => $scalar);
           $FORK = Convert::BinHex::Fork->new(Data => \@array_of_scalars);

           # Get/set the length of the data fork:
           $len = $FORK->length;
           $FORK->length(170);        # this overrides the REAL value: be careful!

           # Get/set the path to the underlying data (if in a disk file):
           $path = $FORK->path;
           $FORK->path("/tmp/file.data");

           # Get/set the in-core data itself, which may be a scalar or an arrayref:
           $data = $FORK->data;
           $FORK->data($scalar);
           $FORK->data(\@array_of_scalars);

           # Get/set the CRC:
           $crc = $FORK->crc;
           $FORK->crc($crc);

UNDER THE HOOD

       Design issues

       BinHex needs a stateful parser
           Unlike its cousins base64 and uuencode, BinHex format is not
           amenable to being parsed line-by-line.  There appears to be no
           guarantee that lines contain 4n encoded characters... and even if
           there is one, the BinHex compression algorithm interferes: even
           when you can decode one line at a time, you can't necessarily
           decompress a line at a time.

           For example: a decoded line ending with the byte "\x90" (the escape
           or "mark" character) is ambiguous: depending on the next decoded
           byte, it could mean a literal "\x90" (if the next byte is a
           "\x00"), or it could mean n-1 more repetitions of the previous
           character (if the next byte is some nonzero "n").

           For this reason, a BinHex parser has to be somewhat stateful: you
           cannot have code like this:

               #### NO! #### NO! #### NO! #### NO! #### NO! ####
               while (<STDIN>) {            # read HEX
                   print hexbin($_);          # convert and write BIN
               }

           unless something is happening "behind the scenes" to keep track of
           what was last done.  The dangerous thing, however, is that this
           approach will seem to work, if you only test it on BinHex files
           which do not use compression and which have 4n HEX characters on
           each line.

           Since we have to be stateful anyway, we use the parser object to
           keep our state.
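
           The mark-byte ambiguity can be made concrete with a small
           decompressor that remembers its state between calls.  This is a
           sketch of the rule described above, not the module's own code:
           make_rle_expander() is a hypothetical helper, and it assumes its
           input has already been decoded from HEX to bytes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Convert::BinHex): returns a closure that
# expands the run-length encoding one chunk at a time.
sub make_rle_expander {
    my $last    = '';   # last literal byte seen (the byte a run repeats)
    my $pending = 0;    # true if the previous byte was the "\x90" mark
    return sub {
        my ($chunk) = @_;
        my $out = '';
        for my $byte (split //, $chunk) {
            if ($pending) {
                $pending = 0;
                if ($byte eq "\x00") {          # mark + 0: a literal "\x90"
                    $last = "\x90";
                    $out .= $last;
                } else {                        # mark + n: n-1 more copies
                    $out .= $last x (ord($byte) - 1);
                }
            } elsif ($byte eq "\x90") {
                $pending = 1;                   # must wait for the next byte
            } else {
                $last = $byte;
                $out .= $byte;
            }
        }
        return $out;
    };
}

# The mark falls at the very end of the first chunk:
my $expand = make_rle_expander();
my $bin    = $expand->("ab\x90");
$bin      .= $expand->("\x04c");
print "$bin\n";                                 # prints "abbbbc"
```

           Because $last and $pending live in the closure, the second call
           can finish the run that the first call left open.  A stateless
           function would have to guess.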

       We need to handle large input files
           Solutions that demand reading everything into core don't cut it in
           my book.  The first MPEG file that comes along can louse up your
           whole day.  So, there are no size limitations in this module: the
           data is read on-demand, and filehandles are always an option.

       Boy, is this slow!
           A lot of the byte-level manipulation that has to go on,
           particularly the CRC computing (which involves intensive bit-
           shifting and masking) slows this module down significantly.  What
           is needed perhaps is an optional extension library where the slow
           pieces can be done more quickly... a Convert::BinHex::CRC, if you
           will.  Volunteers, anyone?

           Even considering that, however, it's slower than I'd like.  I'm
           sure many improvements can be made in the HEX-to-BIN end of things.
           No doubt I'll attempt some as time goes on...

       How it works

       Since BinHex is a layered format, consisting of...

             A Macintosh file [the "BIN"]...
                Encoded as a structured 8-bit bytestream, then...
                   Compressed to reduce duplicate bytes, then...
                      Encoded as 7-bit ASCII [the "HEX"]

       ...there is a layered parsing algorithm to reverse the process.
       Basically, it works in a similar fashion to stdio's fread():

              0. There is an internal buffer of decompressed (BIN) data,
                 initially empty.
              1. Application asks to read() n bytes of data from object
              2. If the buffer is not full enough to accommodate the request:
                   2a. The read() method grabs the next available chunk of input
                       data (the HEX).
                   2b. HEX data is converted and decompressed into as many BIN
                       bytes as possible.
                   2c. BIN bytes are added to the read() buffer.
                   2d. Go back to step 2a. until the buffer is full enough
                       or we hit end-of-input.

       The conversion-and-decompression algorithms need their own internal
       buffers and state (since the next input chunk may not contain all the
       data needed for a complete conversion/decompression operation).  These
       are maintained in the object, so parsing two different input streams
       simultaneously is possible.
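
       The buffer-filling loop in steps 0 to 2d can be sketched in a few
       lines.  None of the names below come from the module: make_reader()
       is a hypothetical helper, the input source is a plain list of
       chunks, and uc() stands in for the real conversion and decompression
       step so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical sketch of steps 0-2d; none of these names come from the module.
# $source->() returns the next chunk of HEX input, or undef at end-of-input.
# $convert->($hex) returns whatever BIN bytes that chunk yields (possibly '').
sub make_reader {
    my ($source, $convert) = @_;
    my $buffer = '';                 # step 0: BIN data, initially empty
    my $eof    = 0;
    return sub {
        my ($nbytes) = @_;           # step 1: caller asks for $nbytes bytes
        while (length($buffer) < $nbytes && !$eof) {    # step 2
            my $hex = $source->();                      # step 2a
            if (defined $hex) {
                $buffer .= $convert->($hex);            # steps 2b and 2c
            } else {
                $eof = 1;                               # hit end-of-input
            }
        }                                               # step 2d: loop again
        return undef if $buffer eq '';                  # nothing left to give
        return substr($buffer, 0, $nbytes, '');         # up to $nbytes bytes
    };
}

# Stand-in input and converter: uc() merely marks which data was "converted".
my @chunks = ("ab", "cde");
my $read   = make_reader(sub { shift @chunks }, sub { uc $_[0] });

my $first  = $read->(4);
my $second = $read->(4);
print "$first\n";     # prints "ABCD"
print "$second\n";    # prints "E"
```

       Each reader closes over its own buffer, which mirrors the point above
       about keeping state in the object: two readers built this way do not
       interfere with each other.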

WARNINGS

       Only handles "Hqx7" files, as per RFC-1741.

       Remember that Macintosh text files use "\r" as end-of-line: this means
       that if you want a textual file to look normal on a non-Mac system, you
       probably want to do this to the data:

           # Get the data, and output it according to normal conventions:
           foreach ($HQX->read_data) { s/\r/\n/g; print }

CHANGE LOG

       Current version: $Id: BinHex.pm,v 1.119 1997/06/28 05:12:42 eryq Exp $

       Version 1.118
           Ready to go public (with Paul's version, patched for native Mac
           support)!  Warnings have been suppressed in a few places where
           undefined values appear.

       Version 1.115
           Fixed another bug in comp2bin, related to the MARK falling on a
           boundary between inputs.  Added testing code.

       Version 1.114
           Added BIN-to-HEX conversion.  Eh.  It's a start.  Also, a lot of
           documentation additions and cleanups.  Some methods were also
           renamed.

       Version 1.103
           Fixed bug in decompression (wasn't saving last character).  Fixed
           "NoComment" bug.

       Version 1.102
           Initial release.

AUTHOR AND CREDITS

       Written by Eryq, http://www.enteract.com/~eryq / eryq@enteract.com

       Support for native-Mac conversion, plus invaluable contributions in
       Alpha Testing, plus a few patches, plus the baseline binhex/debinhex
       programs, were provided by Paul J. Schinder (NASA/GSFC).

       Ken Lunde (Adobe) suggested incorporating the CAP file representation.

TERMS AND CONDITIONS

       Copyright (c) 1997 by Eryq.  All rights reserved.  This program is free
       software; you can redistribute it and/or modify it under the same terms
       as Perl itself.

       This software comes with NO WARRANTY of any kind.  See the COPYING file
       in the distribution for details.



perl v5.8.8                       1997-06-28                Convert::BinHex(3)