IO::Uncompress::Gunzip(3pm)  Perl Programmers Reference Guide  IO::Uncompress::Gunzip(3pm)

NAME
6 IO::Uncompress::Gunzip - Read RFC 1952 files/buffers
7
SYNOPSIS
    use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

    my $status = gunzip $input => $output [,OPTS]
        or die "gunzip failed: $GunzipError\n";

    my $z = new IO::Uncompress::Gunzip $input [OPTS]
        or die "gunzip failed: $GunzipError\n";

    $status = $z->read($buffer)
    $status = $z->read($buffer, $length)
    $status = $z->read($buffer, $length, $offset)
    $line = $z->getline()
    $char = $z->getc()
    $char = $z->ungetc()
    $char = $z->opened()

    $status = $z->inflateSync()

    $data = $z->trailingData()
    $status = $z->nextStream()
    $data = $z->getHeaderInfo()
    $z->tell()
    $z->seek($position, $whence)
    $z->binmode()
    $z->fileno()
    $z->eof()
    $z->close()

    $GunzipError ;

    # IO::File mode

    <$z>
    read($z, $buffer);
    read($z, $buffer, $length);
    read($z, $buffer, $length, $offset);
    tell($z)
    seek($z, $position, $whence)
    binmode($z)
    fileno($z)
    eof($z)
    close($z)
51
DESCRIPTION
53 This module provides a Perl interface that allows the reading of
54 files/buffers that conform to RFC 1952.
55
56 For writing RFC 1952 files/buffers, see the companion module
57 IO::Compress::Gzip.
58
Functional Interface
60 A top-level function, "gunzip", is provided to carry out "one-shot"
61 uncompression between buffers and/or files. For finer control over the
62 uncompression process, see the "OO Interface" section.
63
64 use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;
65
66 gunzip $input => $output [,OPTS]
67 or die "gunzip failed: $GunzipError\n";
68
The functional interface needs Perl 5.005 or better.
70
71 gunzip $input => $output [, OPTS]
72 "gunzip" expects at least two parameters, $input and $output.
73
74 The $input parameter
75
76 The parameter, $input, is used to define the source of the compressed
77 data.
78
79 It can take one of the following forms:
80
81 A filename
82 If the $input parameter is a simple scalar, it is assumed to be a
83 filename. This file will be opened for reading and the input data
84 will be read from it.
85
86 A filehandle
87 If the $input parameter is a filehandle, the input data will be
88 read from it. The string '-' can be used as an alias for standard
89 input.
90
91 A scalar reference
92 If $input is a scalar reference, the input data will be read from
93 $$input.
94
95 An array reference
96 If $input is an array reference, each element in the array must be
97 a filename.
98
99 The input data will be read from each file in turn.
100
101 The complete array will be walked to ensure that it only contains
102 valid filenames before any data is uncompressed.
103
104 An Input FileGlob string
105 If $input is a string that is delimited by the characters "<" and
106 ">" "gunzip" will assume that it is an input fileglob string. The
107 input is the list of files that match the fileglob.
108
109 If the fileglob does not match any files ...
110
111 See File::GlobMapper for more details.
112
113 If the $input parameter is any other type, "undef" will be returned.
114
115 The $output parameter
116
117 The parameter $output is used to control the destination of the
118 uncompressed data. This parameter can take one of these forms.
119
120 A filename
121 If the $output parameter is a simple scalar, it is assumed to be a
122 filename. This file will be opened for writing and the
123 uncompressed data will be written to it.
124
125 A filehandle
126 If the $output parameter is a filehandle, the uncompressed data
127 will be written to it. The string '-' can be used as an alias for
128 standard output.
129
130 A scalar reference
131 If $output is a scalar reference, the uncompressed data will be
132 stored in $$output.
133
134 An Array Reference
135 If $output is an array reference, the uncompressed data will be
136 pushed onto the array.
137
138 An Output FileGlob
139 If $output is a string that is delimited by the characters "<" and
140 ">" "gunzip" will assume that it is an output fileglob string. The
141 output is the list of files that match the fileglob.
142
When $output is a fileglob string, $input must also be a fileglob
string. Anything else is an error.
145
146 If the $output parameter is any other type, "undef" will be returned.
147
148 Notes
149 When $input maps to multiple compressed files/buffers and $output is a
150 single file/buffer, after uncompression $output will contain a
151 concatenation of all the uncompressed data from each of the input
152 files/buffers.
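
As a minimal sketch of the scalar reference forms of $input and $output
described above, the following round-trips a string through memory. The
sample string and variable names are illustrative only, and
IO::Compress::Gzip is used here solely to manufacture some gzip input.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Build some gzip data in memory to act as the input buffer
# (illustrative sample data, not part of the module).
my $original = "hello, gzip\n";
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

# Scalar reference in, scalar reference out.
my $restored;
gunzip \$compressed => \$restored
    or die "gunzip failed: $GunzipError\n";

print $restored;
```

Because both ends are buffers, no files are opened and nothing needs to
be closed afterwards.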
153
154 Optional Parameters
155 Unless specified below, the optional parameters for "gunzip", "OPTS",
156 are the same as those used with the OO interface defined in the
157 "Constructor Options" section below.
158
159 "AutoClose => 0|1"
160 This option applies to any input or output data streams to
161 "gunzip" that are filehandles.
162
163 If "AutoClose" is specified, and the value is true, it will result
164 in all input and/or output filehandles being closed once "gunzip"
165 has completed.
166
167 This parameter defaults to 0.
168
169 "BinModeOut => 0|1"
170 When writing to a file or filehandle, set "binmode" before writing
171 to the file.
172
173 Defaults to 0.
174
175 "Append => 0|1"
176 The behaviour of this option is dependent on the type of output
177 data stream.
178
179 · A Buffer
180
If "Append" is enabled, all uncompressed data will be appended
to the end of the output buffer. Otherwise the output buffer
will be cleared before any uncompressed data is written to
it.
185
186 · A Filename
187
188 If "Append" is enabled, the file will be opened in append
189 mode. Otherwise the contents of the file, if any, will be
190 truncated before any uncompressed data is written to it.
191
192 · A Filehandle
193
194 If "Append" is enabled, the filehandle will be positioned to
195 the end of the file via a call to "seek" before any
196 uncompressed data is written to it. Otherwise the file
197 pointer will not be moved.
198
199 When "Append" is specified, and set to true, it will append all
200 uncompressed data to the output data stream.
201
So when the output is a filehandle it will carry out a seek to the
eof before writing any uncompressed data. If the output is a
filename, it will be opened for appending. If the output is a
buffer, all uncompressed data will be appended to the existing
buffer.
207
208 Conversely when "Append" is not specified, or it is present and is
209 set to false, it will operate as follows.
210
211 When the output is a filename, it will truncate the contents of
212 the file before writing any uncompressed data. If the output is a
213 filehandle its position will not be changed. If the output is a
214 buffer, it will be wiped before any uncompressed data is output.
215
216 Defaults to 0.
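
The buffer case of "Append" can be sketched as follows. The variable
names and sample strings are hypothetical, and IO::Compress::Gzip is
only used to create input for the demonstration.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Illustrative input: one small gzip stream held in memory.
my $payload = "payload\n";
my $compressed;
gzip \$payload => \$compressed
    or die "gzip failed: $GzipError\n";

# With Append enabled the existing buffer contents are kept.
my $appended = "existing\n";
gunzip \$compressed => \$appended, Append => 1
    or die "gunzip failed: $GunzipError\n";

# Without Append the output buffer is cleared first.
my $replaced = "existing\n";
gunzip \$compressed => \$replaced
    or die "gunzip failed: $GunzipError\n";

print "appended: $appended";
print "replaced: $replaced";
```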
217
218 "MultiStream => 0|1"
219 If the input file/buffer contains multiple compressed data
220 streams, this option will uncompress the whole lot as a single
221 data stream.
222
223 Defaults to 0.
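
A short sketch of the effect of "MultiStream", assuming an input buffer
made by concatenating two independently compressed gzip streams (the
data and names below are illustrative):

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Two separate gzip streams joined back to back (sample data).
my ($first, $second) = ("first\n", "second\n");
my ($gz1, $gz2);
gzip \$first  => \$gz1 or die "gzip failed: $GzipError\n";
gzip \$second => \$gz2 or die "gzip failed: $GzipError\n";
my $input = $gz1 . $gz2;

# MultiStream => 1 treats the concatenation as one data stream.
my $all;
gunzip \$input => \$all, MultiStream => 1
    or die "gunzip failed: $GunzipError\n";

# With the default (MultiStream => 0) uncompression stops at the
# end of the first stream.
my $one;
gunzip \$input => \$one
    or die "gunzip failed: $GunzipError\n";

print "with MultiStream:    ", length($all), " bytes\n";
print "without MultiStream: ", length($one), " bytes\n";
```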
224
225 "TrailingData => $scalar"
226 Returns the data, if any, that is present immediately after the
227 compressed data stream once uncompression is complete.
228
229 This option can be used when there is useful information
230 immediately following the compressed data stream, and you don't
231 know the length of the compressed data stream.
232
233 If the input is a buffer, "trailingData" will return everything
234 from the end of the compressed data stream to the end of the
235 buffer.
236
237 If the input is a filehandle, "trailingData" will return the data
238 that is left in the filehandle input buffer once the end of the
239 compressed data stream has been reached. You can then use the
240 filehandle to read the rest of the input file.
241
242 Don't bother using "trailingData" if the input is a filename.
243
244 If you know the length of the compressed data stream before you
245 start uncompressing, you can avoid having to use "trailingData" by
246 setting the "InputLength" option.
247
248 Examples
249 To read the contents of the file "file1.txt.gz" and write the
250 uncompressed data to the file "file1.txt".
251
252 use strict ;
253 use warnings ;
254 use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;
255
256 my $input = "file1.txt.gz";
257 my $output = "file1.txt";
258 gunzip $input => $output
259 or die "gunzip failed: $GunzipError\n";
260
261 To read from an existing Perl filehandle, $input, and write the
262 uncompressed data to a buffer, $buffer.
263
264 use strict ;
265 use warnings ;
266 use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;
267 use IO::File ;
268
269 my $input = new IO::File "<file1.txt.gz"
270 or die "Cannot open 'file1.txt.gz': $!\n" ;
271 my $buffer ;
272 gunzip $input => \$buffer
273 or die "gunzip failed: $GunzipError\n";
274
To uncompress all files in the directory "/my/home" that match
"*.txt.gz" and store the uncompressed data in the same directory
277
278 use strict ;
279 use warnings ;
280 use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;
281
282 gunzip '</my/home/*.txt.gz>' => '</my/home/#1.txt>'
283 or die "gunzip failed: $GunzipError\n";
284
and if you want to uncompress each file one at a time, this will do the
trick

    use strict ;
    use warnings ;
    use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

    for my $input ( glob "/my/home/*.txt.gz" )
    {
        my $output = $input;
        # Strip only a literal ".gz" suffix at the end of the name.
        $output =~ s/\.gz$// ;
        gunzip $input => $output
            or die "Error uncompressing '$input': $GunzipError\n";
    }
299
OO Interface
301 Constructor
302 The format of the constructor for IO::Uncompress::Gunzip is shown below
303
304 my $z = new IO::Uncompress::Gunzip $input [OPTS]
305 or die "IO::Uncompress::Gunzip failed: $GunzipError\n";
306
307 Returns an "IO::Uncompress::Gunzip" object on success and undef on
308 failure. The variable $GunzipError will contain an error message on
309 failure.
310
311 If you are running Perl 5.005 or better the object, $z, returned from
312 IO::Uncompress::Gunzip can be used exactly like an IO::File filehandle.
313 This means that all normal input file operations can be carried out
314 with $z. For example, to read a line from a compressed file/buffer you
315 can use either of these forms
316
317 $line = $z->getline();
318 $line = <$z>;
319
320 The mandatory parameter $input is used to determine the source of the
321 compressed data. This parameter can take one of three forms.
322
323 A filename
324 If the $input parameter is a scalar, it is assumed to be a
325 filename. This file will be opened for reading and the compressed
326 data will be read from it.
327
328 A filehandle
329 If the $input parameter is a filehandle, the compressed data will
330 be read from it. The string '-' can be used as an alias for
331 standard input.
332
333 A scalar reference
If $input is a scalar reference, the compressed data will be read
from $$input.
336
337 Constructor Options
338 The option names defined below are case insensitive and can be
339 optionally prefixed by a '-'. So all of the following are valid
340
341 -AutoClose
342 -autoclose
343 AUTOCLOSE
344 autoclose
345
346 OPTS is a combination of the following options:
347
348 "AutoClose => 0|1"
349 This option is only valid when the $input parameter is a
350 filehandle. If specified, and the value is true, it will result in
351 the file being closed once either the "close" method is called or
352 the IO::Uncompress::Gunzip object is destroyed.
353
354 This parameter defaults to 0.
355
356 "MultiStream => 0|1"
357 Allows multiple concatenated compressed streams to be treated as a
358 single compressed stream. Decompression will stop once either the
359 end of the file/buffer is reached, an error is encountered
360 (premature eof, corrupt compressed data) or the end of a stream is
361 not immediately followed by the start of another stream.
362
363 This parameter defaults to 0.
364
365 "Prime => $string"
366 This option will uncompress the contents of $string before
367 processing the input file/buffer.
368
369 This option can be useful when the compressed data is embedded in
370 another file/data structure and it is not possible to work out
371 where the compressed data begins without having to read the first
372 few bytes. If this is the case, the uncompression can be primed
373 with these bytes using this option.
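
One way "Prime" might be used is sketched below: the first two bytes of
a file are read to look at the gzip magic number, then handed back to
the module so that uncompression starts from the true beginning of the
stream. The temporary file, the two-byte sniff and all variable names
are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Create a small gzip file to read back (illustrative data).
my $original = "primed data\n";
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} $compressed;
close $tmp_fh or die "Cannot close '$tmp_name': $!\n";

# Consume the first two bytes before the module sees the handle.
open my $fh, '<', $tmp_name or die "Cannot open '$tmp_name': $!\n";
binmode $fh;
read($fh, my $magic, 2) == 2 or die "Short read on '$tmp_name'\n";

# Give those bytes back through Prime.
my $z = IO::Uncompress::Gunzip->new($fh, Prime => $magic)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

my $restored = do { local $/; $z->getline() };
$z->close();
close $fh;

print $restored;
```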
374
375 "Transparent => 0|1"
376 If this option is set and the input file/buffer is not compressed
377 data, the module will allow reading of it anyway.
378
In addition, if the input file/buffer does contain compressed data
and there is non-compressed data immediately following it, setting
this option will make this module treat the whole file/buffer as
a single data stream.
383
384 This option defaults to 1.
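
The first behaviour described above can be sketched with input that is
not gzip data at all. The sample string and variable names are
illustrative.

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Plain, uncompressed input (sample data).
my $plain = "this is not gzip data\n";

# Transparent defaults to 1, so the data is passed through as-is.
my $copy;
gunzip \$plain => \$copy
    or die "gunzip failed: $GunzipError\n";

# With Transparent => 0 the same input is rejected.
my $rejected;
my $ok = gunzip \$plain => \$rejected, Transparent => 0;
print "rejected: $GunzipError\n" unless $ok;
```

Setting "Transparent" to 0 is a reasonable choice when uncompressed
input should be treated as an error rather than silently copied.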
385
386 "BlockSize => $num"
387 When reading the compressed input data, IO::Uncompress::Gunzip
388 will read it in blocks of $num bytes.
389
390 This option defaults to 4096.
391
392 "InputLength => $size"
393 When present this option will limit the number of compressed bytes
394 read from the input file/buffer to $size. This option can be used
395 in the situation where there is useful data directly after the
396 compressed data stream and you know beforehand the exact length of
397 the compressed data stream.
398
399 This option is mostly used when reading from a filehandle, in
400 which case the file pointer will be left pointing to the first
401 byte directly after the compressed data stream.
402
403 This option defaults to off.
404
405 "Append => 0|1"
406 This option controls what the "read" method does with uncompressed
407 data.
408
409 If set to 1, all uncompressed data will be appended to the output
410 parameter of the "read" method.
411
412 If set to 0, the contents of the output parameter of the "read"
413 method will be overwritten by the uncompressed data.
414
415 Defaults to 0.
416
417 "Strict => 0|1"
418 This option controls whether the extra checks defined below are
419 used when carrying out the decompression. When Strict is on, the
420 extra tests are carried out, when Strict is off they are not.
421
422 The default for this option is off.
423
424 1. If the FHCRC bit is set in the gzip FLG header byte, the
425 CRC16 bytes in the header must match the crc16 value of the
426 gzip header actually read.
427
428 2. If the gzip header contains a name field (FNAME) it consists
429 solely of ISO 8859-1 characters.
430
431 3. If the gzip header contains a comment field (FCOMMENT) it
432 consists solely of ISO 8859-1 characters plus line-feed.
433
434 4. If the gzip FEXTRA header field is present it must conform to
435 the sub-field structure as defined in RFC 1952.
436
437 5. The CRC32 and ISIZE trailer fields must be present.
438
439 6. The value of the CRC32 field read must match the crc32 value
440 of the uncompressed data actually contained in the gzip file.
441
7. The value of the ISIZE field read must match the length of
the uncompressed data actually read from the file.
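
Check 6 can be illustrated with a sketch that damages the CRC32 trailer
field of an otherwise valid stream. The byte-flipping and the variable
names are part of the example only; the layout relied on (CRC32 in the
first four of the last eight bytes) comes from RFC 1952.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# A valid gzip stream (sample data).
my $original = "check me\n";
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

# Corrupt the first byte of the CRC32 trailer field.
my $corrupt = $compressed;
my $byte = ord substr($corrupt, -8, 1);
substr($corrupt, -8, 1) = chr($byte ^ 0xFF);

# Strict is off by default, so the bad CRC32 is not checked.
my $lax;
my $lax_ok = gunzip \$corrupt => \$lax;

# With Strict on, the mismatch is reported as a failure.
my $checked;
my $strict_ok = gunzip \$corrupt => \$checked, Strict => 1;

print "lax:    ", ($lax_ok    ? "ok" : "failed"), "\n";
print "strict: ", ($strict_ok ? "ok" : "failed: $GunzipError"), "\n";
```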
444
"ParseExtra => 0|1"
If the gzip FEXTRA header field is present and this option is set,
it will force the module to check that it conforms to the sub-field
structure as defined in RFC 1952.

If the "Strict" option is on, it will automatically enable this
option.
449
450 Defaults to 0.
451
452 Examples
453 TODO
454
Methods
456 read
457 Usage is
458
459 $status = $z->read($buffer)
460
Reads a block of compressed data (the size of the compressed block is
determined by the "BlockSize" option in the constructor), uncompresses
it and writes any uncompressed data into $buffer. If the "Append"
parameter is set in the constructor, the uncompressed data will be
appended to the $buffer parameter. Otherwise $buffer will be
overwritten.
467
468 Returns the number of uncompressed bytes written to $buffer, zero if
469 eof or a negative number on error.
470
471 read
472 Usage is
473
474 $status = $z->read($buffer, $length)
475 $status = $z->read($buffer, $length, $offset)
476
477 $status = read($z, $buffer, $length)
478 $status = read($z, $buffer, $length, $offset)
479
480 Attempt to read $length bytes of uncompressed data into $buffer.
481
The main difference between this form of the "read" method and the
previous one is that this one will attempt to return exactly $length
bytes. It returns fewer only if end-of-file or an IO error is
encountered.
486
487 Returns the number of uncompressed bytes written to $buffer, zero if
488 eof or a negative number on error.
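
A minimal sketch of a read loop using the $length form follows. The
chunk size of 4 and the sample data are arbitrary choices for the
example.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Ten bytes of sample data, compressed in memory.
my $original = "abcdefghij";
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::Gunzip->new(\$compressed)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

# Ask for 4 uncompressed bytes at a time; the final read is short.
my @chunks;
my $buffer = '';
my $status;
while (($status = $z->read($buffer, 4)) > 0) {
    push @chunks, $buffer;
}
die "read failed: $GunzipError\n" if $status < 0;
$z->close();

print join("|", @chunks), "\n";
```

Checking for a negative status after the loop distinguishes a read
error from a normal end of file, which returns zero.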
489
490 getline
491 Usage is
492
493 $line = $z->getline()
494 $line = <$z>
495
496 Reads a single line.
497
This method fully supports the use of the variable $/ (or
$INPUT_RECORD_SEPARATOR or $RS when "English" is in use) to determine
what constitutes an end of line. Paragraph mode, record mode and file
slurp mode are all supported.
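
The sketch below reads a compressed buffer line by line with the
default $/, then reads it again in file slurp mode by localising $/ to
undef. The sample data and variable names are illustrative.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Three lines of sample data, compressed in memory.
my $original = "alpha\nbeta\ngamma\n";
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

# Line-at-a-time reading with the default $/.
my $z = IO::Uncompress::Gunzip->new(\$compressed)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";
my @lines;
while (defined(my $line = $z->getline())) {
    chomp $line;
    push @lines, $line;
}
$z->close();

# File slurp mode: $/ is undef for the duration of the do block.
my $y = IO::Uncompress::Gunzip->new(\$compressed)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";
my $everything = do { local $/; $y->getline() };
$y->close();

print scalar(@lines), " lines, ", length($everything), " bytes\n";
```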
502
503 getc
504 Usage is
505
506 $char = $z->getc()
507
508 Read a single character.
509
510 ungetc
511 Usage is
512
513 $char = $z->ungetc($string)
514
515 inflateSync
516 Usage is
517
518 $status = $z->inflateSync()
519
520 TODO
521
522 getHeaderInfo
523 Usage is
524
525 $hdr = $z->getHeaderInfo();
526 @hdrs = $z->getHeaderInfo();
527
This method returns either a hash reference (in scalar context) or a
list of hash references (in array context) that contains information
about each of the header fields in the compressed data stream(s).
531
532 Name The contents of the Name header field, if present. If no name is
533 present, the value will be undef. Note this is different from a
534 zero length name, which will return an empty string.
535
536 Comment
537 The contents of the Comment header field, if present. If no
538 comment is present, the value will be undef. Note this is
539 different from a zero length comment, which will return an empty
540 string.
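
As a sketch of reading the two fields documented above, the example
below first writes a stream with a name and comment using the "Name"
and "Comment" options of IO::Compress::Gzip, then reads them back. The
field values are made up for the example.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Create a gzip stream whose header carries a name and a comment
# (illustrative values).
my $original = "some data\n";
my $compressed;
gzip \$original => \$compressed,
     Name    => "hello.txt",
     Comment => "demo file"
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::Gunzip->new(\$compressed)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

# Scalar context: a hash reference describing the header.
my $hdr = $z->getHeaderInfo();
my $name    = $hdr->{Name};
my $comment = $hdr->{Comment};
$z->close();

print "Name:    ", (defined $name    ? $name    : "(none)"), "\n";
print "Comment: ", (defined $comment ? $comment : "(none)"), "\n";
```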
541
542 tell
543 Usage is
544
545 $z->tell()
546 tell $z
547
548 Returns the uncompressed file offset.
549
550 eof
551 Usage is
552
553 $z->eof();
554 eof($z);
555
556 Returns true if the end of the compressed input stream has been
557 reached.
558
559 seek
560 $z->seek($position, $whence);
561 seek($z, $position, $whence);
562
563 Provides a sub-set of the "seek" functionality, with the restriction
564 that it is only legal to seek forward in the input file/buffer. It is
565 a fatal error to attempt to seek backward.
566
The $whence parameter takes one of the usual values, namely SEEK_SET,
SEEK_CUR or SEEK_END.
569
570 Returns 1 on success, 0 on failure.
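
A forward seek can be sketched as follows. The offsets and sample data
are arbitrary, and the SEEK_SET constant is taken from the core Fcntl
module.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Sample data, compressed in memory.
my $original = "hello world\n";
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::Gunzip->new(\$compressed)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

# Skip the first six uncompressed bytes ("hello ").
$z->seek(6, SEEK_SET)
    or die "seek failed\n";

# Read the next five uncompressed bytes.
my $word = '';
$z->read($word, 5);

# tell reports the offset within the uncompressed data.
my $offset = $z->tell();
$z->close();

print "$word at offset $offset\n";
```

Offsets refer to the uncompressed data, so the skipped bytes still have
to be uncompressed and discarded.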
571
572 binmode
573 Usage is
574
575 $z->binmode
576 binmode $z ;
577
578 This is a noop provided for completeness.
579
580 opened
581 $z->opened()
582
Returns true if the object currently refers to an open file/buffer.
584
585 autoflush
586 my $prev = $z->autoflush()
587 my $prev = $z->autoflush(EXPR)
588
589 If the $z object is associated with a file or a filehandle, this method
590 returns the current autoflush setting for the underlying filehandle. If
591 "EXPR" is present, and is non-zero, it will enable flushing after every
592 write/print operation.
593
594 If $z is associated with a buffer, this method has no effect and always
595 returns "undef".
596
597 Note that the special variable $| cannot be used to set or retrieve the
598 autoflush setting.
599
600 input_line_number
601 $z->input_line_number()
602 $z->input_line_number(EXPR)
603
604 Returns the current uncompressed line number. If "EXPR" is present it
605 has the effect of setting the line number. Note that setting the line
606 number does not change the current position within the file/buffer
607 being read.
608
The contents of $/ are used to determine what constitutes a line
terminator.
611
612 fileno
613 $z->fileno()
614 fileno($z)
615
616 If the $z object is associated with a file or a filehandle, "fileno"
617 will return the underlying file descriptor. Once the "close" method is
618 called "fileno" will return "undef".
619
If the $z object is associated with a buffer, this method will
return "undef".
622
623 close
624 $z->close() ;
625 close $z ;
626
Closes the input file/buffer.
628
629 For most versions of Perl this method will be automatically invoked if
630 the IO::Uncompress::Gunzip object is destroyed (either explicitly or by
631 the variable with the reference to the object going out of scope). The
632 exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these
633 cases, the "close" method will be called automatically, but not until
634 global destruction of all live objects when the program is terminating.
635
636 Therefore, if you want your scripts to be able to run on all versions
637 of Perl, you should call "close" explicitly and not rely on automatic
638 closing.
639
640 Returns true on success, otherwise 0.
641
642 If the "AutoClose" option has been enabled when the
643 IO::Uncompress::Gunzip object was created, and the object is associated
644 with a file, the underlying file will also be closed.
645
646 nextStream
647 Usage is
648
649 my $status = $z->nextStream();
650
651 Skips to the next compressed data stream in the input file/buffer. If a
652 new compressed data stream is found, the eof marker will be cleared and
653 $. will be reset to 0.
654
655 Returns 1 if a new stream was found, 0 if none was found, and -1 if an
656 error was encountered.
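
One way to walk every stream in an input that holds several of them is
sketched below. The two-stream input and the variable names are
assumptions for the example.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Two gzip streams joined back to back (sample data).
my ($first, $second) = ("first\n", "second\n");
my ($gz1, $gz2);
gzip \$first  => \$gz1 or die "gzip failed: $GzipError\n";
gzip \$second => \$gz2 or die "gzip failed: $GzipError\n";
my $input = $gz1 . $gz2;

my $z = IO::Uncompress::Gunzip->new(\$input)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

# Read each stream to its end, then move on to the next one.
my @streams;
for (my $status = 1; $status > 0; $status = $z->nextStream()) {
    my $content = '';
    my $buffer  = '';
    while (($status = $z->read($buffer)) > 0) {
        $content .= $buffer;
    }
    die "read failed: $GunzipError\n" if $status < 0;
    push @streams, $content;
}
$z->close();

print scalar(@streams), " streams found\n";
```

Unlike "MultiStream", this keeps the contents of each stream separate.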
657
658 trailingData
659 Usage is
660
661 my $data = $z->trailingData();
662
663 Returns the data, if any, that is present immediately after the
664 compressed data stream once uncompression is complete. It only makes
665 sense to call this method once the end of the compressed data stream
666 has been encountered.
667
This method can be used when there is useful information immediately
following the compressed data stream, and you don't know the length of
the compressed data stream.
671
672 If the input is a buffer, "trailingData" will return everything from
673 the end of the compressed data stream to the end of the buffer.
674
675 If the input is a filehandle, "trailingData" will return the data that
676 is left in the filehandle input buffer once the end of the compressed
677 data stream has been reached. You can then use the filehandle to read
678 the rest of the input file.
679
680 Don't bother using "trailingData" if the input is a filename.
681
682 If you know the length of the compressed data stream before you start
683 uncompressing, you can avoid having to use "trailingData" by setting
684 the "InputLength" option in the constructor.
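
The buffer case can be sketched like this. The trailing string is made
up for the example, and "Transparent" is turned off here only as a
cautious choice so that nothing other than gzip data is accepted as
the stream.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# A gzip stream followed by unrelated bytes (sample data).
my $original = "compressed part\n";
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";
my $input = $compressed . "TRAILER";

my $z = IO::Uncompress::Gunzip->new(\$input, Transparent => 0)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

# Read to the end of the compressed data stream first.
my $content = '';
my $buffer  = '';
my $status;
while (($status = $z->read($buffer)) > 0) {
    $content .= $buffer;
}
die "read failed: $GunzipError\n" if $status < 0;

# Whatever follows the stream in the buffer is now available.
my $trailing = $z->trailingData();
$z->close();

print "trailing data: $trailing\n";
```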
685
Importing
No symbolic constants are required by IO::Uncompress::Gunzip at
present.
689
690 :all Imports "gunzip" and $GunzipError. Same as doing this
691
692 use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;
693
EXAMPLES
695 Working with Net::FTP
696 See IO::Uncompress::Gunzip::FAQ
697
SEE ALSO
699 Compress::Zlib, IO::Compress::Gzip, IO::Compress::Deflate,
700 IO::Uncompress::Inflate, IO::Compress::RawDeflate,
701 IO::Uncompress::RawInflate, IO::Compress::Bzip2,
702 IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma,
703 IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop,
704 IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf,
705 IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress
706
707 Compress::Zlib::FAQ
708
709 File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib
710
711 For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html,
712 http://www.faqs.org/rfcs/rfc1951.html and
713 http://www.faqs.org/rfcs/rfc1952.html
714
715 The zlib compression library was written by Jean-loup Gailly
716 gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.
717
718 The primary site for the zlib compression library is
719 http://www.zlib.org.
720
721 The primary site for gzip is http://www.gzip.org.
722
AUTHOR
724 This module was written by Paul Marquess, pmqs@cpan.org.
725
MODIFICATION HISTORY
727 See the Changes file.
728
COPYRIGHT AND LICENSE
730 Copyright (c) 2005-2010 Paul Marquess. All rights reserved.
731
732 This program is free software; you can redistribute it and/or modify it
733 under the same terms as Perl itself.
734

perl v5.12.4                      2011-06-07      IO::Uncompress::Gunzip(3pm)