IO::Uncompress::AnyUncompress(3)    User Contributed Perl Documentation    IO::Uncompress::AnyUncompress(3)

NAME
6 IO::Uncompress::AnyUncompress - Uncompress gzip, zip, bzip2 or lzop
7 file/buffer

SYNOPSIS
10 use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;
11
12 my $status = anyuncompress $input => $output [,OPTS]
13 or die "anyuncompress failed: $AnyUncompressError\n";
14
15 my $z = new IO::Uncompress::AnyUncompress $input [OPTS]
16 or die "anyuncompress failed: $AnyUncompressError\n";
17
18 $status = $z->read($buffer)
19 $status = $z->read($buffer, $length)
20 $status = $z->read($buffer, $length, $offset)
21 $line = $z->getline()
22 $char = $z->getc()
23 $char = $z->ungetc()
24 $char = $z->opened()
25
26 $data = $z->trailingData()
27 $status = $z->nextStream()
28 $data = $z->getHeaderInfo()
29 $z->tell()
30 $z->seek($position, $whence)
31 $z->binmode()
32 $z->fileno()
33 $z->eof()
34 $z->close()
35
36 $AnyUncompressError ;
37
38 # IO::File mode
39
40 <$z>
41 read($z, $buffer);
42 read($z, $buffer, $length);
43 read($z, $buffer, $length, $offset);
44 tell($z)
45 seek($z, $position, $whence)
46 binmode($z)
47 fileno($z)
48 eof($z)
49 close($z)

DESCRIPTION
52 This module provides a Perl interface that allows the reading of
53 files/buffers that have been compressed with a variety of compression
54 libraries.
55
56 The formats supported are:
57
58 RFC 1950
59 RFC 1951 (optionally)
60 gzip (RFC 1952)
61 zip
62 bzip2
63 lzop
64 lzf
65 lzma
66 lzip
67 xz
68
69 The module will auto-detect which, if any, of the supported compression
70 formats is being used.
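
As a rough illustration of the auto-detection described above, the sketch
below compresses the same text in three formats and passes each result to
"anyuncompress" without saying which format it is. It assumes that the
IO::Compress::Gzip, IO::Compress::Deflate and IO::Compress::Bzip2 modules
are installed; the variable names are illustrative only.

```perl
use strict;
use warnings;
use IO::Compress::Gzip    qw(gzip $GzipError);
use IO::Compress::Deflate qw(deflate $DeflateError);
use IO::Compress::Bzip2   qw(bzip2 $Bzip2Error);
use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError);

my $text = "the quick brown fox\n" x 3;

# Compress the same text with three different formats.
my ($gz, $zlib, $bz);
gzip(\$text => \$gz)      or die "gzip failed: $GzipError\n";
deflate(\$text => \$zlib) or die "deflate failed: $DeflateError\n";
bzip2(\$text => \$bz)     or die "bzip2 failed: $Bzip2Error\n";

my %compressed = (gzip => \$gz, zlib => \$zlib, bzip2 => \$bz);

# The same call handles every buffer; the format is never named.
my %results;
for my $format (sort keys %compressed) {
    my $output;
    anyuncompress($compressed{$format} => \$output)
        or die "anyuncompress failed for $format: $AnyUncompressError\n";
    $results{$format} = $output;
}

print "$_: ", length($results{$_}), " bytes recovered\n"
    for sort keys %results;
```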

Functional Interface
73 A top-level function, "anyuncompress", is provided to carry out "one-
74 shot" uncompression between buffers and/or files. For finer control
75 over the uncompression process, see the "OO Interface" section.
76
77 use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;
78
79 anyuncompress $input_filename_or_reference => $output_filename_or_reference [,OPTS]
80 or die "anyuncompress failed: $AnyUncompressError\n";
81
The functional interface needs Perl 5.005 or better.
83
84 anyuncompress $input_filename_or_reference => $output_filename_or_reference
85 [, OPTS]
86 "anyuncompress" expects at least two parameters,
87 $input_filename_or_reference and $output_filename_or_reference.
88
89 The $input_filename_or_reference parameter
90
91 The parameter, $input_filename_or_reference, is used to define the
92 source of the compressed data.
93
94 It can take one of the following forms:
95
96 A filename
If the $input_filename_or_reference parameter is a simple
98 scalar, it is assumed to be a filename. This file will be opened
99 for reading and the input data will be read from it.
100
101 A filehandle
102 If the $input_filename_or_reference parameter is a filehandle, the
103 input data will be read from it. The string '-' can be used as an
104 alias for standard input.
105
106 A scalar reference
107 If $input_filename_or_reference is a scalar reference, the input
108 data will be read from $$input_filename_or_reference.
109
110 An array reference
111 If $input_filename_or_reference is an array reference, each
112 element in the array must be a filename.
113
114 The input data will be read from each file in turn.
115
116 The complete array will be walked to ensure that it only contains
117 valid filenames before any data is uncompressed.
118
119 An Input FileGlob string
120 If $input_filename_or_reference is a string that is delimited by
the characters "<" and ">", "anyuncompress" will assume that it is
122 an input fileglob string. The input is the list of files that
123 match the fileglob.
124
125 See File::GlobMapper for more details.
126
127 If the $input_filename_or_reference parameter is any other type,
128 "undef" will be returned.
129
130 The $output_filename_or_reference parameter
131
132 The parameter $output_filename_or_reference is used to control the
133 destination of the uncompressed data. This parameter can take one of
134 these forms.
135
136 A filename
137 If the $output_filename_or_reference parameter is a simple scalar,
138 it is assumed to be a filename. This file will be opened for
139 writing and the uncompressed data will be written to it.
140
141 A filehandle
142 If the $output_filename_or_reference parameter is a filehandle,
143 the uncompressed data will be written to it. The string '-' can
144 be used as an alias for standard output.
145
146 A scalar reference
147 If $output_filename_or_reference is a scalar reference, the
148 uncompressed data will be stored in
149 $$output_filename_or_reference.
150
151 An Array Reference
152 If $output_filename_or_reference is an array reference, the
153 uncompressed data will be pushed onto the array.
154
155 An Output FileGlob
156 If $output_filename_or_reference is a string that is delimited by
the characters "<" and ">", "anyuncompress" will assume that it is
158 an output fileglob string. The output is the list of files that
159 match the fileglob.
160
When $output_filename_or_reference is a fileglob string,
162 $input_filename_or_reference must also be a fileglob string.
163 Anything else is an error.
164
165 See File::GlobMapper for more details.
166
167 If the $output_filename_or_reference parameter is any other type,
168 "undef" will be returned.
169
170 Notes
171 When $input_filename_or_reference maps to multiple compressed
172 files/buffers and $output_filename_or_reference is a single
173 file/buffer, after uncompression $output_filename_or_reference will
174 contain a concatenation of all the uncompressed data from each of the
175 input files/buffers.
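
The concatenation behaviour in the note above can be sketched as follows.
This example assumes IO::Compress::Gzip and File::Temp are available; the
temporary file names are made up for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError);

# Create two small compressed files in a scratch directory.
my $dir   = tempdir(CLEANUP => 1);
my @parts = ("first\n", "second\n");
my @files;
for my $i (0 .. $#parts) {
    my $name = "$dir/part$i.gz";
    gzip(\$parts[$i] => $name) or die "gzip failed: $GzipError\n";
    push @files, $name;
}

# An array reference of filenames as input, a single buffer as output:
# the buffer receives the uncompressed data of each file in turn.
my $combined;
anyuncompress(\@files => \$combined)
    or die "anyuncompress failed: $AnyUncompressError\n";

print $combined;
```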
176
177 Optional Parameters
178 Unless specified below, the optional parameters for "anyuncompress",
179 "OPTS", are the same as those used with the OO interface defined in the
180 "Constructor Options" section below.
181
182 "AutoClose => 0|1"
183 This option applies to any input or output data streams to
184 "anyuncompress" that are filehandles.
185
186 If "AutoClose" is specified, and the value is true, it will result
187 in all input and/or output filehandles being closed once
188 "anyuncompress" has completed.
189
190 This parameter defaults to 0.
191
192 "BinModeOut => 0|1"
193 This option is now a no-op. All files will be written in binmode.
194
195 "Append => 0|1"
196 The behaviour of this option is dependent on the type of output
197 data stream.
198
199 · A Buffer
200
If "Append" is enabled, all uncompressed data will be appended
202 to the end of the output buffer. Otherwise the output buffer
203 will be cleared before any uncompressed data is written to
204 it.
205
206 · A Filename
207
208 If "Append" is enabled, the file will be opened in append
209 mode. Otherwise the contents of the file, if any, will be
210 truncated before any uncompressed data is written to it.
211
212 · A Filehandle
213
214 If "Append" is enabled, the filehandle will be positioned to
215 the end of the file via a call to "seek" before any
216 uncompressed data is written to it. Otherwise the file
217 pointer will not be moved.
218
219 When "Append" is specified, and set to true, it will append all
220 uncompressed data to the output data stream.
221
222 So when the output is a filehandle it will carry out a seek to the
223 eof before writing any uncompressed data. If the output is a
224 filename, it will be opened for appending. If the output is a
225 buffer, all uncompressed data will be appended to the existing
226 buffer.
227
228 Conversely when "Append" is not specified, or it is present and is
229 set to false, it will operate as follows.
230
231 When the output is a filename, it will truncate the contents of
232 the file before writing any uncompressed data. If the output is a
233 filehandle its position will not be changed. If the output is a
234 buffer, it will be wiped before any uncompressed data is output.
235
236 Defaults to 0.
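
A minimal sketch of the buffer case, assuming IO::Compress::Gzip is
available to produce the test input. The same compressed data is written
to one buffer with "Append" enabled and to another with it left at its
default.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError);

my $text = "new text\n";
my $compressed;
gzip(\$text => \$compressed) or die "gzip failed: $GzipError\n";

# With Append enabled the existing buffer contents are kept.
my $appended = "existing text\n";
anyuncompress(\$compressed => \$appended, Append => 1)
    or die "anyuncompress failed: $AnyUncompressError\n";

# Without Append the buffer is cleared before data is written to it.
my $replaced = "existing text\n";
anyuncompress(\$compressed => \$replaced)
    or die "anyuncompress failed: $AnyUncompressError\n";

print "appended: $appended";
print "replaced: $replaced";
```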
237
238 "MultiStream => 0|1"
239 If the input file/buffer contains multiple compressed data
240 streams, this option will uncompress the whole lot as a single
241 data stream.
242
243 Defaults to 0.
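
The effect of "MultiStream" can be sketched with two gzip streams joined
in one buffer. The example assumes IO::Compress::Gzip is available; the
input is built by simple string concatenation.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError);

# Build an input buffer holding two complete gzip streams back to back.
my ($alpha, $beta) = ("alpha\n", "beta\n");
my ($gz1, $gz2);
gzip(\$alpha => \$gz1) or die "gzip failed: $GzipError\n";
gzip(\$beta  => \$gz2) or die "gzip failed: $GzipError\n";
my $input = $gz1 . $gz2;

# Default behaviour: uncompression stops after the first stream.
my $first_only;
anyuncompress(\$input => \$first_only)
    or die "anyuncompress failed: $AnyUncompressError\n";

# MultiStream => 1: both streams are uncompressed as one.
my $all;
anyuncompress(\$input => \$all, MultiStream => 1)
    or die "anyuncompress failed: $AnyUncompressError\n";

print "default:     $first_only";
print "MultiStream: $all";
```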
244
245 "TrailingData => $scalar"
246 Returns the data, if any, that is present immediately after the
247 compressed data stream once uncompression is complete.
248
249 This option can be used when there is useful information
250 immediately following the compressed data stream, and you don't
251 know the length of the compressed data stream.
252
253 If the input is a buffer, "trailingData" will return everything
254 from the end of the compressed data stream to the end of the
255 buffer.
256
257 If the input is a filehandle, "trailingData" will return the data
258 that is left in the filehandle input buffer once the end of the
259 compressed data stream has been reached. You can then use the
260 filehandle to read the rest of the input file.
261
262 Don't bother using "trailingData" if the input is a filename.
263
264 If you know the length of the compressed data stream before you
265 start uncompressing, you can avoid having to use "trailingData" by
266 setting the "InputLength" option.
267
268 Examples
269 To read the contents of the file "file1.txt.Compressed" and write the
270 uncompressed data to the file "file1.txt".
271
272 use strict ;
273 use warnings ;
274 use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;
275
276 my $input = "file1.txt.Compressed";
277 my $output = "file1.txt";
278 anyuncompress $input => $output
279 or die "anyuncompress failed: $AnyUncompressError\n";
280
281 To read from an existing Perl filehandle, $input, and write the
282 uncompressed data to a buffer, $buffer.
283
284 use strict ;
285 use warnings ;
286 use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;
287 use IO::File ;
288
289 my $input = new IO::File "<file1.txt.Compressed"
290 or die "Cannot open 'file1.txt.Compressed': $!\n" ;
291 my $buffer ;
292 anyuncompress $input => \$buffer
293 or die "anyuncompress failed: $AnyUncompressError\n";
294
295 To uncompress all files in the directory "/my/home" that match
"*.txt.Compressed" and store the uncompressed data in the same directory
297
298 use strict ;
299 use warnings ;
300 use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;
301
302 anyuncompress '</my/home/*.txt.Compressed>' => '</my/home/#1.txt>'
303 or die "anyuncompress failed: $AnyUncompressError\n";
304
and if you want to uncompress each file one at a time, this will do the
trick
307
308 use strict ;
309 use warnings ;
310 use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;
311
for my $input ( glob "/my/home/*.txt.Compressed" )
{
    my $output = $input;
    # strip the literal ".Compressed" suffix to form the output name
    $output =~ s/\.Compressed$// ;
    anyuncompress $input => $output
        or die "Error uncompressing '$input': $AnyUncompressError\n";
}

OO Interface
321 Constructor
322 The format of the constructor for IO::Uncompress::AnyUncompress is
323 shown below
324
325 my $z = new IO::Uncompress::AnyUncompress $input [OPTS]
326 or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";
327
328 Returns an "IO::Uncompress::AnyUncompress" object on success and undef
329 on failure. The variable $AnyUncompressError will contain an error
330 message on failure.
331
If you are running Perl 5.005 or better, the object, $z, returned from
333 IO::Uncompress::AnyUncompress can be used exactly like an IO::File
334 filehandle. This means that all normal input file operations can be
335 carried out with $z. For example, to read a line from a compressed
336 file/buffer you can use either of these forms
337
338 $line = $z->getline();
339 $line = <$z>;
340
341 The mandatory parameter $input is used to determine the source of the
342 compressed data. This parameter can take one of three forms.
343
344 A filename
345 If the $input parameter is a scalar, it is assumed to be a
346 filename. This file will be opened for reading and the compressed
347 data will be read from it.
348
349 A filehandle
350 If the $input parameter is a filehandle, the compressed data will
351 be read from it. The string '-' can be used as an alias for
352 standard input.
353
354 A scalar reference
355 If $input is a scalar reference, the compressed data will be read
356 from $$input.
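
As a small sketch of the scalar reference form, the following reads a
compressed buffer line by line through the object. It assumes
IO::Compress::Gzip is available to create the input, and it uses the
"Class->new" spelling of the constructor call, which is equivalent to the
"new Class" form shown above.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $text = "line one\nline two\nline three\n";
my $compressed;
gzip(\$text => \$compressed) or die "gzip failed: $GzipError\n";

# $input is a scalar reference, so data is read from $$input.
my $z = IO::Uncompress::AnyUncompress->new(\$compressed)
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

my @lines;
while (defined(my $line = $z->getline())) {
    push @lines, $line;
}
$z->close();

print scalar(@lines), " lines read\n";
```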
357
358 Constructor Options
359 The option names defined below are case insensitive and can be
360 optionally prefixed by a '-'. So all of the following are valid
361
362 -AutoClose
363 -autoclose
364 AUTOCLOSE
365 autoclose
366
367 OPTS is a combination of the following options:
368
369 "AutoClose => 0|1"
370 This option is only valid when the $input parameter is a
371 filehandle. If specified, and the value is true, it will result in
372 the file being closed once either the "close" method is called or
373 the IO::Uncompress::AnyUncompress object is destroyed.
374
375 This parameter defaults to 0.
376
377 "MultiStream => 0|1"
378 Allows multiple concatenated compressed streams to be treated as a
379 single compressed stream. Decompression will stop once either the
380 end of the file/buffer is reached, an error is encountered
381 (premature eof, corrupt compressed data) or the end of a stream is
382 not immediately followed by the start of another stream.
383
384 This parameter defaults to 0.
385
386 "Prime => $string"
387 This option will uncompress the contents of $string before
388 processing the input file/buffer.
389
390 This option can be useful when the compressed data is embedded in
391 another file/data structure and it is not possible to work out
392 where the compressed data begins without having to read the first
393 few bytes. If this is the case, the uncompression can be primed
394 with these bytes using this option.
395
396 "Transparent => 0|1"
397 If this option is set and the input file/buffer is not compressed
398 data, the module will allow reading of it anyway.
399
400 In addition, if the input file/buffer does contain compressed data
401 and there is non-compressed data immediately following it, setting
402 this option will make this module treat the whole file/buffer as a
403 single data stream.
404
405 This option defaults to 1.
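
The first case above can be sketched as follows. The input here is
deliberately not compressed: with the default setting it is passed
through unchanged, and with "Transparent => 0" the constructor is
expected to fail.

```perl
use strict;
use warnings;
use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError);

my $plain = "this text was never compressed\n";

# Transparent defaults to 1, so non-compressed input is read as-is.
my $copy;
anyuncompress(\$plain => \$copy)
    or die "anyuncompress failed: $AnyUncompressError\n";

# With Transparent => 0 the same input is rejected by the constructor.
my $z = IO::Uncompress::AnyUncompress->new(\$plain, Transparent => 0);
my $rejected = defined $z ? 0 : 1;

print "copied unchanged\n"              if $copy eq $plain;
print "rejected: $AnyUncompressError\n" if $rejected;
```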
406
407 "BlockSize => $num"
408 When reading the compressed input data,
409 IO::Uncompress::AnyUncompress will read it in blocks of $num
410 bytes.
411
412 This option defaults to 4096.
413
414 "InputLength => $size"
415 When present this option will limit the number of compressed bytes
416 read from the input file/buffer to $size. This option can be used
417 in the situation where there is useful data directly after the
418 compressed data stream and you know beforehand the exact length of
419 the compressed data stream.
420
421 This option is mostly used when reading from a filehandle, in
422 which case the file pointer will be left pointing to the first
423 byte directly after the compressed data stream.
424
425 This option defaults to off.
426
427 "Append => 0|1"
428 This option controls what the "read" method does with uncompressed
429 data.
430
431 If set to 1, all uncompressed data will be appended to the output
432 parameter of the "read" method.
433
434 If set to 0, the contents of the output parameter of the "read"
435 method will be overwritten by the uncompressed data.
436
437 Defaults to 0.
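
A minimal sketch of "Append => 1" used with repeated calls to "read",
assuming IO::Compress::Gzip is available to create the input. Each call
adds to $buffer rather than replacing it, so the loop body can stay empty.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $text = join '', map { "line $_\n" } 1 .. 1000;
my $compressed;
gzip(\$text => \$compressed) or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyUncompress->new(\$compressed, Append => 1)
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

# Every successful read appends to $buffer; 0 means eof, < 0 an error.
my $buffer = '';
my $status;
1 while ($status = $z->read($buffer)) > 0;
die "read failed: $AnyUncompressError\n" if $status < 0;
$z->close();

print length($buffer), " bytes accumulated\n";
```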
438
439 "Strict => 0|1"
440 This option controls whether the extra checks defined below are
441 used when carrying out the decompression. When Strict is on, the
442 extra tests are carried out, when Strict is off they are not.
443
444 The default for this option is off.
445
446 "RawInflate => 0|1"
447 When auto-detecting the compressed format, try to test for raw-
448 deflate (RFC 1951) content using the "IO::Uncompress::RawInflate"
449 module.
450
451 The reason this is not default behaviour is because RFC 1951
452 content can only be detected by attempting to uncompress it. This
process is error-prone and can result in false positives.
454
455 Defaults to 0.
456
457 "UnLzma => 0|1"
458 When auto-detecting the compressed format, try to test for
459 lzma_alone content using the "IO::Uncompress::UnLzma" module.
460
461 The reason this is not default behaviour is because lzma_alone
462 content can only be detected by attempting to uncompress it. This
process is error-prone and can result in false positives.
464
465 Defaults to 0.
466
467 Examples
468 TODO

Methods
471 read
472 Usage is
473
474 $status = $z->read($buffer)
475
476 Reads a block of compressed data (the size of the compressed block is
determined by the "BlockSize" option in the constructor), uncompresses it
478 and writes any uncompressed data into $buffer. If the "Append"
479 parameter is set in the constructor, the uncompressed data will be
480 appended to the $buffer parameter. Otherwise $buffer will be
481 overwritten.
482
483 Returns the number of uncompressed bytes written to $buffer, zero if
484 eof or a negative number on error.
485
486 read
487 Usage is
488
489 $status = $z->read($buffer, $length)
490 $status = $z->read($buffer, $length, $offset)
491
492 $status = read($z, $buffer, $length)
493 $status = read($z, $buffer, $length, $offset)
494
495 Attempt to read $length bytes of uncompressed data into $buffer.
496
The main difference between this form of the "read" method and the
previous one is that this one will attempt to return exactly $length
bytes. It will return fewer bytes only if end-of-file or an IO error is
encountered.
501
502 Returns the number of uncompressed bytes written to $buffer, zero if
503 eof or a negative number on error.
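
The fixed-length form can be sketched by reading a short buffer in
8-byte chunks. The example assumes IO::Compress::Gzip is available; only
the final chunk is expected to be shorter than requested.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $text = "0123456789abcdefghij";    # 20 bytes of uncompressed data
my $compressed;
gzip(\$text => \$compressed) or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyUncompress->new(\$compressed)
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

# Ask for 8 bytes at a time; $chunk is overwritten on each call
# because Append is left at its default of 0.
my ($chunk, $status, @chunks);
while (($status = $z->read($chunk, 8)) > 0) {
    push @chunks, $chunk;
}
die "read failed: $AnyUncompressError\n" if $status < 0;
$z->close();

print "chunk sizes: ", join(', ', map { length } @chunks), "\n";
```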
504
505 getline
506 Usage is
507
508 $line = $z->getline()
509 $line = <$z>
510
511 Reads a single line.
512
513 This method fully supports the use of the variable $/ (or
514 $INPUT_RECORD_SEPARATOR or $RS when "English" is in use) to determine
515 what constitutes an end of line. Paragraph mode, record mode and file
516 slurp mode are all supported.
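
A short sketch of how $/ affects "getline", assuming IO::Compress::Gzip
is available to create the input. The first object is read with ";" as
the record terminator, the second in file slurp mode.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $text = "name=alice;name=bob;name=carol;";
my $compressed;
gzip(\$text => \$compressed) or die "gzip failed: $GzipError\n";

# Custom terminator: each call to getline returns one ";"-ended record.
my @records;
{
    local $/ = ";";
    my $z = IO::Uncompress::AnyUncompress->new(\$compressed)
        or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";
    while (defined(my $record = $z->getline())) {
        chomp $record;              # chomp honours the local $/
        push @records, $record;
    }
    $z->close();
}

# File slurp mode: with $/ undefined, getline returns everything at once.
my $everything;
{
    local $/ = undef;
    my $z = IO::Uncompress::AnyUncompress->new(\$compressed)
        or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";
    $everything = $z->getline();
    $z->close();
}

print join(' | ', @records), "\n";
```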
517
518 getc
519 Usage is
520
521 $char = $z->getc()
522
523 Read a single character.
524
525 ungetc
526 Usage is
527
528 $char = $z->ungetc($string)
529
530 getHeaderInfo
531 Usage is
532
533 $hdr = $z->getHeaderInfo();
534 @hdrs = $z->getHeaderInfo();
535
536 This method returns either a hash reference (in scalar context) or a
list of hash references (in array context) that contains information
538 about each of the header fields in the compressed data stream(s).
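
The sketch below inspects the header of a gzip stream. The keys present
in the hash depend on the format detected; the "Name" key used here is an
assumption taken from the gzip header fields documented for
IO::Uncompress::Gunzip, and the "Name" option of IO::Compress::Gzip is
used to set it.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $text = "some notes\n";
my $compressed;
gzip(\$text => \$compressed, Name => "notes.txt")
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyUncompress->new(\$compressed)
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

# Scalar context: a hash reference describing the current stream.
my $hdr = $z->getHeaderInfo();

# "Name" is assumed to be present because the input is a gzip stream.
my $name = defined $hdr->{Name} ? $hdr->{Name} : '(none)';
print "original name: $name\n";
print "header fields: ", join(', ', sort keys %$hdr), "\n";

$z->close();
```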
539
540 tell
541 Usage is
542
543 $z->tell()
544 tell $z
545
546 Returns the uncompressed file offset.
547
548 eof
549 Usage is
550
551 $z->eof();
552 eof($z);
553
554 Returns true if the end of the compressed input stream has been
555 reached.
556
557 seek
558 $z->seek($position, $whence);
559 seek($z, $position, $whence);
560
561 Provides a sub-set of the "seek" functionality, with the restriction
562 that it is only legal to seek forward in the input file/buffer. It is
563 a fatal error to attempt to seek backward.
564
565 Note that the implementation of "seek" in this module does not provide
566 true random access to a compressed file/buffer. It works by
567 uncompressing data from the current offset in the file/buffer until it
568 reaches the uncompressed offset specified in the parameters to "seek".
569 For very small files this may be acceptable behaviour. For large files
570 it may cause an unacceptable delay.
571
The $whence parameter takes one of the usual values, namely SEEK_SET,
573 SEEK_CUR or SEEK_END.
574
575 Returns 1 on success, 0 on failure.
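
A forward seek can be sketched as follows. The example assumes
IO::Compress::Gzip is available and takes the SEEK_SET constant from the
Fcntl module; the input consists of fixed-width 5-byte records so that
the target offset is easy to compute.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# 100 records of 5 bytes each: "0000\n", "0001\n", ...
my $text = join '', map { sprintf "%04d\n", $_ } 0 .. 99;
my $compressed;
gzip(\$text => \$compressed) or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyUncompress->new(\$compressed)
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

# Skip forward to record 42. The data in between is still uncompressed
# internally; it is just not returned to the caller.
$z->seek(5 * 42, SEEK_SET) or die "seek failed\n";

my $offset = $z->tell();
my $record = $z->getline();
$z->close();

print "offset $offset holds record $record";
```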
576
577 binmode
578 Usage is
579
580 $z->binmode
581 binmode $z ;
582
This is a no-op provided for completeness.
584
585 opened
586 $z->opened()
587
Returns true if the object currently refers to an opened file/buffer.
589
590 autoflush
591 my $prev = $z->autoflush()
592 my $prev = $z->autoflush(EXPR)
593
594 If the $z object is associated with a file or a filehandle, this method
595 returns the current autoflush setting for the underlying filehandle. If
596 "EXPR" is present, and is non-zero, it will enable flushing after every
597 write/print operation.
598
599 If $z is associated with a buffer, this method has no effect and always
600 returns "undef".
601
602 Note that the special variable $| cannot be used to set or retrieve the
603 autoflush setting.
604
605 input_line_number
606 $z->input_line_number()
607 $z->input_line_number(EXPR)
608
609 Returns the current uncompressed line number. If "EXPR" is present it
610 has the effect of setting the line number. Note that setting the line
611 number does not change the current position within the file/buffer
612 being read.
613
614 The contents of $/ are used to determine what constitutes a line
615 terminator.
616
617 fileno
618 $z->fileno()
619 fileno($z)
620
621 If the $z object is associated with a file or a filehandle, "fileno"
622 will return the underlying file descriptor. Once the "close" method is
623 called "fileno" will return "undef".
624
625 If the $z object is associated with a buffer, this method will return
626 "undef".
627
628 close
629 $z->close() ;
630 close $z ;
631
Closes the input file/buffer.
633
634 For most versions of Perl this method will be automatically invoked if
635 the IO::Uncompress::AnyUncompress object is destroyed (either
636 explicitly or by the variable with the reference to the object going
637 out of scope). The exceptions are Perl versions 5.005 through 5.00504
638 and 5.8.0. In these cases, the "close" method will be called
639 automatically, but not until global destruction of all live objects
640 when the program is terminating.
641
642 Therefore, if you want your scripts to be able to run on all versions
643 of Perl, you should call "close" explicitly and not rely on automatic
644 closing.
645
646 Returns true on success, otherwise 0.
647
648 If the "AutoClose" option has been enabled when the
649 IO::Uncompress::AnyUncompress object was created, and the object is
650 associated with a file, the underlying file will also be closed.
651
652 nextStream
653 Usage is
654
655 my $status = $z->nextStream();
656
657 Skips to the next compressed data stream in the input file/buffer. If a
658 new compressed data stream is found, the eof marker will be cleared and
659 $. will be reset to 0.
660
661 Returns 1 if a new stream was found, 0 if none was found, and -1 if an
662 error was encountered.
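
One possible way to walk every stream in an input buffer with
"nextStream" is sketched below. It assumes IO::Compress::Gzip is
available; the input is three gzip streams joined by string
concatenation, and each stream is collected separately.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# Build a buffer containing three gzip streams back to back.
my $input = '';
for my $word (qw(alpha beta gamma)) {
    my $data = "$word\n";
    my $gz;
    gzip(\$data => \$gz) or die "gzip failed: $GzipError\n";
    $input .= $gz;
}

my $z = IO::Uncompress::AnyUncompress->new(\$input)
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

# Read the current stream to its end, then move on to the next one.
my @streams;
my $status;
for ($status = 1; $status > 0; $status = $z->nextStream()) {
    my $content = '';
    my $buffer;
    while (($status = $z->read($buffer)) > 0) {
        $content .= $buffer;
    }
    last if $status < 0;
    push @streams, $content;
}
die "error reading input: $AnyUncompressError\n" if $status < 0;
$z->close();

print scalar(@streams), " streams found\n";
```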
663
664 trailingData
665 Usage is
666
667 my $data = $z->trailingData();
668
669 Returns the data, if any, that is present immediately after the
670 compressed data stream once uncompression is complete. It only makes
671 sense to call this method once the end of the compressed data stream
672 has been encountered.
673
This method can be used when there is useful information immediately
675 following the compressed data stream, and you don't know the length of
676 the compressed data stream.
677
678 If the input is a buffer, "trailingData" will return everything from
679 the end of the compressed data stream to the end of the buffer.
680
681 If the input is a filehandle, "trailingData" will return the data that
682 is left in the filehandle input buffer once the end of the compressed
683 data stream has been reached. You can then use the filehandle to read
684 the rest of the input file.
685
686 Don't bother using "trailingData" if the input is a filename.
687
688 If you know the length of the compressed data stream before you start
689 uncompressing, you can avoid having to use "trailingData" by setting
690 the "InputLength" option in the constructor.
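
The buffer case can be sketched as follows, assuming IO::Compress::Gzip
is available. The string "TRAILER" is an arbitrary marker appended after
the compressed stream for the purpose of the example.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $text = "payload\n";
my $compressed;
gzip(\$text => \$compressed) or die "gzip failed: $GzipError\n";

# Non-compressed data placed directly after the compressed stream.
my $input = $compressed . "TRAILER";

my $z = IO::Uncompress::AnyUncompress->new(\$input)
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

# Read to the end of the compressed stream first.
my $output = '';
my ($buffer, $status);
while (($status = $z->read($buffer)) > 0) {
    $output .= $buffer;
}
die "read failed: $AnyUncompressError\n" if $status < 0;

# Everything from the end of the stream to the end of the buffer.
my $trailing = $z->trailingData();
$z->close();

print "uncompressed: $output";
print "trailing:     $trailing\n";
```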

Importing
No symbolic constants are required by
IO::Uncompress::AnyUncompress at present.
695
696 :all Imports "anyuncompress" and $AnyUncompressError. Same as doing
697 this
698
699 use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

SEE ALSO
703 Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip,
704 IO::Compress::Deflate, IO::Uncompress::Inflate,
705 IO::Compress::RawDeflate, IO::Uncompress::RawInflate,
706 IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma,
707 IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz,
708 IO::Compress::Lzip, IO::Uncompress::UnLzip, IO::Compress::Lzop,
709 IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf,
710 IO::Compress::Zstd, IO::Uncompress::UnZstd, IO::Uncompress::AnyInflate
711
712 IO::Compress::FAQ
713
714 File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

AUTHOR
717 This module was written by Paul Marquess, "pmqs@cpan.org".

MODIFICATION HISTORY
720 See the Changes file.

COPYRIGHT AND LICENSE
723 Copyright (c) 2005-2019 Paul Marquess. All rights reserved.
724
725 This program is free software; you can redistribute it and/or modify it
726 under the same terms as Perl itself.

perl v5.28.1    2019-01-05    IO::Uncompress::AnyUncompress(3)