IO::Uncompress::AnyUncompress(3)    User Contributed Perl Documentation    IO::Uncompress::AnyUncompress(3)

NAME
       IO::Uncompress::AnyUncompress - Uncompress gzip, zip, bzip2, zstd, xz,
       lzma, lzip, lzf or lzop file/buffer

SYNOPSIS
           use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

           my $status = anyuncompress $input => $output [,OPTS]
               or die "anyuncompress failed: $AnyUncompressError\n";

           my $z = IO::Uncompress::AnyUncompress->new( $input [OPTS] )
               or die "anyuncompress failed: $AnyUncompressError\n";

           $status = $z->read($buffer)
           $status = $z->read($buffer, $length)
           $status = $z->read($buffer, $length, $offset)
           $line = $z->getline()
           $char = $z->getc()
           $char = $z->ungetc($string)
           $char = $z->opened()

           $data = $z->trailingData()
           $status = $z->nextStream()
           $data = $z->getHeaderInfo()
           $z->tell()
           $z->seek($position, $whence)
           $z->binmode()
           $z->fileno()
           $z->eof()
           $z->close()

           $AnyUncompressError ;

           # IO::File mode

           <$z>
           read($z, $buffer);
           read($z, $buffer, $length);
           read($z, $buffer, $length, $offset);
           tell($z)
           seek($z, $position, $whence)
           binmode($z)
           fileno($z)
           eof($z)
           close($z)

DESCRIPTION
       This module provides a Perl interface that allows the reading of
       files/buffers that have been compressed with a variety of compression
       libraries.

       The formats supported are:

       RFC 1950
       RFC 1951 (optionally)
       gzip (RFC 1952)
       zip
       zstd (Zstandard)
       bzip2
       lzop
       lzf
       lzma
       lzip
       xz

       The module will auto-detect which, if any, of the supported compression
       formats is being used.
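
As a minimal sketch of the auto-detection described above, the following compresses the same text with two different compressors and hands both results to the same "anyuncompress" call. It assumes IO::Compress::Gzip and IO::Compress::Bzip2 are installed (they are used here only to manufacture sample input); the variable names are illustrative.

```perl
use strict;
use warnings;
use IO::Compress::Gzip  qw(gzip  $GzipError);
use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error);
use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError);

my $text = "the same payload, two formats\n";

# Manufacture sample input: the same text in two compressed formats.
gzip  \$text => \my $gz  or die "gzip failed: $GzipError\n";
bzip2 \$text => \my $bz2 or die "bzip2 failed: $Bzip2Error\n";

# The caller never names the format; it is detected from the data.
my %result;
for my $case ( [ gzip => \$gz ], [ bzip2 => \$bz2 ] ) {
    my ($name, $compressed) = @$case;
    anyuncompress $compressed => \my $plain
        or die "anyuncompress failed on $name: $AnyUncompressError\n";
    $result{$name} = $plain;
}

print "gzip ok\n"  if $result{gzip}  eq $text;
print "bzip2 ok\n" if $result{bzip2} eq $text;
```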

Functional Interface
       A top-level function, "anyuncompress", is provided to carry out
       "one-shot" uncompression between buffers and/or files. For finer
       control over the uncompression process, see the "OO Interface" section.

           use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

           anyuncompress $input_filename_or_reference => $output_filename_or_reference [,OPTS]
               or die "anyuncompress failed: $AnyUncompressError\n";

       The functional interface needs Perl 5.005 or better.

   anyuncompress $input_filename_or_reference => $output_filename_or_reference [, OPTS]
       "anyuncompress" expects at least two parameters,
       $input_filename_or_reference and $output_filename_or_reference, and
       zero or more optional parameters (see "Optional Parameters").

       The $input_filename_or_reference parameter

       The parameter, $input_filename_or_reference, is used to define the
       source of the compressed data.

       It can take one of the following forms:

       A filename
            If the $input_filename_or_reference parameter is a simple scalar,
            it is assumed to be a filename. This file will be opened for
            reading and the input data will be read from it.

       A filehandle
            If the $input_filename_or_reference parameter is a filehandle, the
            input data will be read from it. The string '-' can be used as an
            alias for standard input.

       A scalar reference
            If $input_filename_or_reference is a scalar reference, the input
            data will be read from $$input_filename_or_reference.

       An array reference
            If $input_filename_or_reference is an array reference, each
            element in the array must be a filename.

            The input data will be read from each file in turn.

            The complete array will be walked to ensure that it only contains
            valid filenames before any data is uncompressed.

       An Input FileGlob string
            If $input_filename_or_reference is a string that is delimited by
            the characters "<" and ">", "anyuncompress" will assume that it is
            an input fileglob string. The input is the list of files that
            match the fileglob.

            See File::GlobMapper for more details.

       If the $input_filename_or_reference parameter is any other type,
       "undef" will be returned.

       The $output_filename_or_reference parameter

       The parameter $output_filename_or_reference is used to control the
       destination of the uncompressed data. This parameter can take one of
       these forms:

       A filename
            If the $output_filename_or_reference parameter is a simple scalar,
            it is assumed to be a filename. This file will be opened for
            writing and the uncompressed data will be written to it.

       A filehandle
            If the $output_filename_or_reference parameter is a filehandle,
            the uncompressed data will be written to it. The string '-' can
            be used as an alias for standard output.

       A scalar reference
            If $output_filename_or_reference is a scalar reference, the
            uncompressed data will be stored in
            $$output_filename_or_reference.

       An Array Reference
            If $output_filename_or_reference is an array reference, the
            uncompressed data will be pushed onto the array.

       An Output FileGlob
            If $output_filename_or_reference is a string that is delimited by
            the characters "<" and ">", "anyuncompress" will assume that it is
            an output fileglob string. The output is the list of files that
            match the fileglob.

            When $output_filename_or_reference is a fileglob string,
            $input_filename_or_reference must also be a fileglob string.
            Anything else is an error.

            See File::GlobMapper for more details.

       If the $output_filename_or_reference parameter is any other type,
       "undef" will be returned.

   Notes
       When $input_filename_or_reference maps to multiple compressed
       files/buffers and $output_filename_or_reference is a single
       file/buffer, after uncompression $output_filename_or_reference will
       contain a concatenation of all the uncompressed data from each of the
       input files/buffers.
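
The note above can be sketched as follows: an array reference of filenames is used as the input and a scalar reference as the single output, so the output buffer should end up holding the uncompressed contents of each file in turn. The temporary directory, the file names and the use of IO::Compress::Gzip to create the sample files are assumptions made for illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError);

# Create two small compressed files in a scratch directory.
my $dir = tempdir( CLEANUP => 1 );
my @files;
for my $part ( [ "part1.gz", "hello " ], [ "part2.gz", "world\n" ] ) {
    my ($name, $text) = @$part;
    my $path = "$dir/$name";
    gzip \$text => $path or die "gzip failed: $GzipError\n";
    push @files, $path;
}

# Many inputs (array reference of filenames), one output (buffer).
my $combined;
anyuncompress \@files => \$combined
    or die "anyuncompress failed: $AnyUncompressError\n";

print $combined;
```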

   Optional Parameters
       The optional parameters for the one-shot function "anyuncompress" are
       (for the most part) identical to those used with the OO interface
       defined in the "Constructor Options" section. The exceptions are listed
       below:

       "AutoClose => 0|1"
            This option applies to any input or output data streams to
            "anyuncompress" that are filehandles.

            If "AutoClose" is specified, and the value is true, it will result
            in all input and/or output filehandles being closed once
            "anyuncompress" has completed.

            This parameter defaults to 0.

       "BinModeOut => 0|1"
            This option is now a no-op. All files will be written in binmode.

       "Append => 0|1"
            The behaviour of this option is dependent on the type of output
            data stream.

            •    A Buffer

                 If "Append" is enabled, all uncompressed data will be appended
                 to the end of the output buffer. Otherwise the output buffer
                 will be cleared before any uncompressed data is written to
                 it.

            •    A Filename

                 If "Append" is enabled, the file will be opened in append
                 mode. Otherwise the contents of the file, if any, will be
                 truncated before any uncompressed data is written to it.

            •    A Filehandle

                 If "Append" is enabled, the filehandle will be positioned to
                 the end of the file via a call to "seek" before any
                 uncompressed data is written to it. Otherwise the file
                 pointer will not be moved.

            When "Append" is specified, and set to true, it will append all
            uncompressed data to the output data stream.

            So when the output is a filehandle it will carry out a seek to the
            eof before writing any uncompressed data. If the output is a
            filename, it will be opened for appending. If the output is a
            buffer, all uncompressed data will be appended to the existing
            buffer.

            Conversely when "Append" is not specified, or it is present and is
            set to false, it will operate as follows.

            When the output is a filename, it will truncate the contents of
            the file before writing any uncompressed data. If the output is a
            filehandle its position will not be changed. If the output is a
            buffer, it will be wiped before any uncompressed data is output.

            Defaults to 0.

       "MultiStream => 0|1"
            If the input file/buffer contains multiple compressed data
            streams, this option will uncompress the whole lot as a single
            data stream.

            Defaults to 0.

       "TrailingData => $scalar"
            Returns the data, if any, that is present immediately after the
            compressed data stream once uncompression is complete.

            This option can be used when there is useful information
            immediately following the compressed data stream, and you don't
            know the length of the compressed data stream.

            If the input is a buffer, "trailingData" will return everything
            from the end of the compressed data stream to the end of the
            buffer.

            If the input is a filehandle, "trailingData" will return the data
            that is left in the filehandle input buffer once the end of the
            compressed data stream has been reached. You can then use the
            filehandle to read the rest of the input file.

            Don't bother using "trailingData" if the input is a filename.

            If you know the length of the compressed data stream before you
            start uncompressing, you can avoid having to use "trailingData" by
            setting the "InputLength" option.
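
A short sketch of two of the one-shot options described above, "MultiStream" and "Append", using in-memory buffers. The input is built by joining two gzip streams; IO::Compress::Gzip and all variable names are assumptions used only to set up the illustration.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError);

# Build an input buffer that holds two gzip streams back to back.
my ($first, $second) = ("alpha\n", "beta\n");
gzip \$first  => \my $gz1 or die "gzip failed: $GzipError\n";
gzip \$second => \my $gz2 or die "gzip failed: $GzipError\n";
my $two_streams = $gz1 . $gz2;

# MultiStream => 0 (the default): uncompression stops after one stream.
anyuncompress \$two_streams => \my $only_first
    or die "anyuncompress failed: $AnyUncompressError\n";

# MultiStream => 1: all streams are uncompressed as a single one.
anyuncompress \$two_streams => \my $both, MultiStream => 1
    or die "anyuncompress failed: $AnyUncompressError\n";

# Append => 1: existing contents of the output buffer are kept.
my $log = "header\n";
anyuncompress \$gz1 => \$log, Append => 1
    or die "anyuncompress failed: $AnyUncompressError\n";

print "first only: $only_first";
print "both: $both";
print "appended: $log";
```

Without "Append", the third call would have cleared $log before writing to it.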

   OneShot Examples
       To read the contents of the file "file1.txt.Compressed" and write the
       uncompressed data to the file "file1.txt":

           use strict ;
           use warnings ;
           use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

           my $input = "file1.txt.Compressed";
           my $output = "file1.txt";
           anyuncompress $input => $output
               or die "anyuncompress failed: $AnyUncompressError\n";

       To read from an existing Perl filehandle, $input, and write the
       uncompressed data to a buffer, $buffer:

           use strict ;
           use warnings ;
           use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;
           use IO::File ;

           my $input = IO::File->new( "<file1.txt.Compressed" )
               or die "Cannot open 'file1.txt.Compressed': $!\n" ;
           my $buffer ;
           anyuncompress $input => \$buffer
               or die "anyuncompress failed: $AnyUncompressError\n";

       To uncompress all files in the directory "/my/home" that match
       "*.txt.Compressed" and store the uncompressed data in the same
       directory:

           use strict ;
           use warnings ;
           use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

           anyuncompress '</my/home/*.txt.Compressed>' => '</my/home/#1.txt>'
               or die "anyuncompress failed: $AnyUncompressError\n";

       and if you want to uncompress each file one at a time, this will do the
       trick:

           use strict ;
           use warnings ;
           use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

           for my $input ( glob "/my/home/*.txt.Compressed" )
           {
               my $output = $input;
               # Strip only the literal ".Compressed" suffix.
               $output =~ s/\.Compressed$// ;
               anyuncompress $input => $output
                   or die "Error uncompressing '$input': $AnyUncompressError\n";
           }

OO Interface
   Constructor
       The format of the constructor for IO::Uncompress::AnyUncompress is
       shown below:

           my $z = IO::Uncompress::AnyUncompress->new( $input [OPTS] )
               or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

       The constructor takes one mandatory parameter, $input, defined below,
       and zero or more "OPTS", defined in "Constructor Options".

       Returns an "IO::Uncompress::AnyUncompress" object on success and undef
       on failure. The variable $AnyUncompressError will contain an error
       message on failure.

       If you are running Perl 5.005 or better, the object, $z, returned from
       IO::Uncompress::AnyUncompress can be used exactly like an IO::File
       filehandle. This means that all normal input file operations can be
       carried out with $z. For example, to read a line from a compressed
       file/buffer you can use either of these forms:

           $line = $z->getline();
           $line = <$z>;

       Below is a simple example of using the OO interface to read the
       compressed file "myfile.Compressed" and write its contents to stdout.

           my $filename = "myfile.Compressed";
           my $z = IO::Uncompress::AnyUncompress->new($filename)
               or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

           while (<$z>) {
               print $_;
           }
           $z->close();

       See "EXAMPLES" for further examples.

       The mandatory parameter $input is used to determine the source of the
       compressed data. This parameter can take one of three forms.

       A filename
            If the $input parameter is a scalar, it is assumed to be a
            filename. This file will be opened for reading and the compressed
            data will be read from it.

       A filehandle
            If the $input parameter is a filehandle, the compressed data will
            be read from it. The string '-' can be used as an alias for
            standard input.

       A scalar reference
            If $input is a scalar reference, the compressed data will be read
            from $$input.
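
The scalar reference form of $input can be illustrated with a buffer that is compressed in memory and then read back line by line. This is a sketch only: IO::Compress::Gzip is assumed to be available to create the sample data, and the variable names are arbitrary.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $text = "line one\nline two\n";
gzip \$text => \my $compressed or die "gzip failed: $GzipError\n";

# $input is a scalar reference: compressed data is read from $compressed.
my $z = IO::Uncompress::AnyUncompress->new( \$compressed )
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

my @lines;
while ( defined( my $line = $z->getline() ) ) {
    push @lines, $line;
}
$z->close();

print scalar(@lines), " lines read\n";
```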

   Constructor Options
       The option names defined below are case insensitive and can be
       optionally prefixed by a '-'. So all of the following are valid:

           -AutoClose
           -autoclose
           AUTOCLOSE
           autoclose

       OPTS is a combination of the following options:

       "AutoClose => 0|1"
            This option is only valid when the $input parameter is a
            filehandle. If specified, and the value is true, it will result in
            the file being closed once either the "close" method is called or
            the IO::Uncompress::AnyUncompress object is destroyed.

            This parameter defaults to 0.

       "MultiStream => 0|1"
            Allows multiple concatenated compressed streams to be treated as a
            single compressed stream. Decompression will stop once either the
            end of the file/buffer is reached, an error is encountered
            (premature eof, corrupt compressed data) or the end of a stream is
            not immediately followed by the start of another stream.

            This parameter defaults to 0.

       "Prime => $string"
            This option will uncompress the contents of $string before
            processing the input file/buffer.

            This option can be useful when the compressed data is embedded in
            another file/data structure and it is not possible to work out
            where the compressed data begins without having to read the first
            few bytes. If this is the case, the uncompression can be primed
            with these bytes using this option.

       "Transparent => 0|1"
            If this option is set and the input file/buffer is not compressed
            data, the module will allow reading of it anyway.

            In addition, if the input file/buffer does contain compressed data
            and there is non-compressed data immediately following it, setting
            this option will make this module treat the whole file/buffer as a
            single data stream.

            This option defaults to 1.

       "BlockSize => $num"
            When reading the compressed input data,
            IO::Uncompress::AnyUncompress will read it in blocks of $num
            bytes.

            This option defaults to 4096.

       "InputLength => $size"
            When present this option will limit the number of compressed bytes
            read from the input file/buffer to $size. This option can be used
            in the situation where there is useful data directly after the
            compressed data stream and you know beforehand the exact length of
            the compressed data stream.

            This option is mostly used when reading from a filehandle, in
            which case the file pointer will be left pointing to the first
            byte directly after the compressed data stream.

            This option defaults to off.

       "Append => 0|1"
            This option controls what the "read" method does with uncompressed
            data.

            If set to 1, all uncompressed data will be appended to the output
            parameter of the "read" method.

            If set to 0, the contents of the output parameter of the "read"
            method will be overwritten by the uncompressed data.

            Defaults to 0.

       "Strict => 0|1"
            This option controls whether the extra checks defined below are
            used when carrying out the decompression. When Strict is on, the
            extra tests are carried out; when Strict is off they are not.

            The default for this option is off.

       "RawInflate => 0|1"
            When auto-detecting the compressed format, try to test for
            raw-deflate (RFC 1951) content using the
            "IO::Uncompress::RawInflate" module.

            The reason this is not default behaviour is that RFC 1951
            content can only be detected by attempting to uncompress it. This
            process is error prone and can result in false positives.

            Defaults to 0.

       "UnLzma => 0|1"
            When auto-detecting the compressed format, try to test for
            lzma_alone content using the "IO::Uncompress::UnLzma" module.

            The reason this is not default behaviour is that lzma_alone
            content can only be detected by attempting to uncompress it. This
            process is error prone and can result in false positives.

            Defaults to 0.
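
The following sketch exercises two of the constructor options above, "Transparent" and "Append", against in-memory buffers. The sample strings and variable names are assumptions, and IO::Compress::Gzip is used only to produce compressed input.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $plain = "not compressed at all\n";

# Transparent defaults to 1, so input that is not compressed can still
# be read. Append => 1 makes each read() add to $passthrough.
my $z = IO::Uncompress::AnyUncompress->new( \$plain, Append => 1 )
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";
my $passthrough = '';
1 while $z->read($passthrough) > 0;
$z->close();

# With Transparent => 0 the constructor is expected to refuse the same
# input and return undef.
my $strict   = IO::Uncompress::AnyUncompress->new( \$plain, Transparent => 0 );
my $rejected = defined $strict ? 0 : 1;

# Append => 1 with real compressed data: existing buffer contents stay.
my $text = "payload\n";
gzip \$text => \my $gz or die "gzip failed: $GzipError\n";
my $z2 = IO::Uncompress::AnyUncompress->new( \$gz, Append => 1 )
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";
my $out = "prefix: ";
1 while $z2->read($out) > 0;
$z2->close();

print $passthrough, "rejected=$rejected\n", $out;
```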

Methods
   read
       Usage is

           $status = $z->read($buffer)

       Reads a block of compressed data (the size of the compressed block is
       determined by the "BlockSize" option in the constructor), uncompresses
       it and writes any uncompressed data into $buffer. If the "Append"
       parameter is set in the constructor, the uncompressed data will be
       appended to the $buffer parameter. Otherwise $buffer will be
       overwritten.

       Returns the number of uncompressed bytes written to $buffer, zero if
       eof or a negative number on error.

   read
       Usage is

           $status = $z->read($buffer, $length)
           $status = $z->read($buffer, $length, $offset)

           $status = read($z, $buffer, $length)
           $status = read($z, $buffer, $length, $offset)

       Attempt to read $length bytes of uncompressed data into $buffer.

       The main difference between this form of the "read" method and the
       previous one is that this one will attempt to return exactly $length
       bytes. The only circumstances in which it will not do so are when
       end-of-file or an IO error is encountered.

       Returns the number of uncompressed bytes written to $buffer, zero if
       eof or a negative number on error.
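
A small sketch of the fixed-length form of "read" described above: the first call asks for fewer bytes than are available, the second asks for more than remain, and the third is made at end-of-file. The sample text and variable names are illustrative, and IO::Compress::Gzip is assumed to be available to create the input.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $text = "hello world";
gzip \$text => \my $gz or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyUncompress->new( \$gz )
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

# Ask for exactly 5 uncompressed bytes.
my $head;
my $got_head = $z->read($head, 5);

# Ask for more than is left; only the remaining bytes come back.
my $tail;
my $got_tail = $z->read($tail, 100);

# At end-of-file the return value is zero.
my $none;
my $at_eof = $z->read($none, 10);

$z->close();

print "$got_head bytes: '$head'\n";
print "$got_tail bytes: '$tail'\n";
print "status at eof: $at_eof\n";
```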

   getline
       Usage is

           $line = $z->getline()
           $line = <$z>

       Reads a single line.

       This method fully supports the use of the variable $/ (or
       $INPUT_RECORD_SEPARATOR or $RS when "English" is in use) to determine
       what constitutes an end of line. Paragraph mode, record mode and file
       slurp mode are all supported.
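
To illustrate how $/ changes what "getline" returns, the sketch below reads the same compressed buffer three times: in the default line mode, in paragraph mode ($/ set to the empty string) and in slurp mode ($/ set to undef). The helper "open_sample" and the sample text are hypothetical names introduced for this example.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $text = "a1\na2\n\nb1\nb2\n";
gzip \$text => \my $gz or die "gzip failed: $GzipError\n";

# Hypothetical helper: open a fresh reader on the compressed buffer.
sub open_sample {
    return IO::Uncompress::AnyUncompress->new( \$gz )
        || die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";
}

# Default line mode: one newline-terminated line.
my $line = open_sample()->getline();

# Paragraph mode: everything up to and including the blank line.
my $paragraph;
{
    local $/ = "";
    $paragraph = open_sample()->getline();
}

# Slurp mode: the whole uncompressed stream in one call.
my $everything;
{
    local $/ = undef;
    $everything = open_sample()->getline();
}

print "line: $line";
print "paragraph starts with: ", ( split /\n/, $paragraph )[0], "\n";
print "slurped ", length($everything), " bytes\n";
```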

   getc
       Usage is

           $char = $z->getc()

       Read a single character.

   ungetc
       Usage is

           $char = $z->ungetc($string)

   getHeaderInfo
       Usage is

           $hdr  = $z->getHeaderInfo();
           @hdrs = $z->getHeaderInfo();

       This method returns either a hash reference (in scalar context) or a
       list of hash references (in array context) that contains information
       about each of the header fields in the compressed data stream(s).

   tell
       Usage is

           $z->tell()
           tell $z

       Returns the uncompressed file offset.

   eof
       Usage is

           $z->eof();
           eof($z);

       Returns true if the end of the compressed input stream has been
       reached.

   seek
           $z->seek($position, $whence);
           seek($z, $position, $whence);

       Provides a sub-set of the "seek" functionality, with the restriction
       that it is only legal to seek forward in the input file/buffer. It is
       a fatal error to attempt to seek backward.

       Note that the implementation of "seek" in this module does not provide
       true random access to a compressed file/buffer. It works by
       uncompressing data from the current offset in the file/buffer until it
       reaches the uncompressed offset specified in the parameters to "seek".
       For very small files this may be acceptable behaviour. For large files
       it may cause an unacceptable delay.

       The $whence parameter takes one of the usual values, namely SEEK_SET,
       SEEK_CUR or SEEK_END.

       Returns 1 on success, 0 on failure.
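
The forward-only behaviour of "seek", together with "tell", can be sketched like this. The SEEK_SET constant is taken from the core Fcntl module; the sample data and variable names are assumptions. The backward seek is wrapped in "eval" because it is documented above as a fatal error.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $text = "0123456789abcdef";
gzip \$text => \my $gz or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyUncompress->new( \$gz )
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

# Skip forward to uncompressed offset 10; the skipped data is
# uncompressed and discarded behind the scenes.
$z->seek(10, SEEK_SET) or die "seek failed\n";
my $pos_after_seek = $z->tell();

my $chunk;
$z->read($chunk, 3);
my $pos_after_read = $z->tell();

# Seeking backward is fatal, so trap it.
my $backward_ok = eval { $z->seek(0, SEEK_SET); 1 };
my $backward_error = $@;

$z->close();

print "offset $pos_after_seek, read '$chunk', now at $pos_after_read\n";
print "backward seek refused\n" unless $backward_ok;
```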

   binmode
       Usage is

           $z->binmode
           binmode $z ;

       This is a no-op provided for completeness.

   opened
           $z->opened()

       Returns true if the object currently refers to an open file/buffer.

   autoflush
           my $prev = $z->autoflush()
           my $prev = $z->autoflush(EXPR)

       If the $z object is associated with a file or a filehandle, this method
       returns the current autoflush setting for the underlying filehandle. If
       "EXPR" is present, and is non-zero, it will enable flushing after every
       write/print operation.

       If $z is associated with a buffer, this method has no effect and always
       returns "undef".

       Note that the special variable $| cannot be used to set or retrieve the
       autoflush setting.

   input_line_number
           $z->input_line_number()
           $z->input_line_number(EXPR)

       Returns the current uncompressed line number. If "EXPR" is present it
       has the effect of setting the line number. Note that setting the line
       number does not change the current position within the file/buffer
       being read.

       The contents of $/ are used to determine what constitutes a line
       terminator.

   fileno
           $z->fileno()
           fileno($z)

       If the $z object is associated with a file or a filehandle, "fileno"
       will return the underlying file descriptor. Once the "close" method is
       called "fileno" will return "undef".

       If the $z object is associated with a buffer, this method will return
       "undef".

   close
           $z->close() ;
           close $z ;

       Closes the input file/buffer.

       For most versions of Perl this method will be automatically invoked if
       the IO::Uncompress::AnyUncompress object is destroyed (either
       explicitly or by the variable with the reference to the object going
       out of scope). The exceptions are Perl versions 5.005 through 5.00504
       and 5.8.0. In these cases, the "close" method will be called
       automatically, but not until global destruction of all live objects
       when the program is terminating.

       Therefore, if you want your scripts to be able to run on all versions
       of Perl, you should call "close" explicitly and not rely on automatic
       closing.

       Returns true on success, otherwise 0.

       If the "AutoClose" option has been enabled when the
       IO::Uncompress::AnyUncompress object was created, and the object is
       associated with a file, the underlying file will also be closed.

   nextStream
       Usage is

           my $status = $z->nextStream();

       Skips to the next compressed data stream in the input file/buffer. If a
       new compressed data stream is found, the eof marker will be cleared and
       $. will be reset to 0.

       Returns 1 if a new stream was found, 0 if none was found, and -1 if an
       error was encountered.
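
One way to use "nextStream" is to walk an input that holds several compressed streams and collect each stream separately, instead of merging them with "MultiStream". The loop below is a sketch under that assumption; the two-stream buffer is built with IO::Compress::Gzip and the variable names are illustrative.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# Two gzip streams joined into a single input buffer.
my ($first, $second) = ("alpha\n", "beta\n");
gzip \$first  => \my $gz1 or die "gzip failed: $GzipError\n";
gzip \$second => \my $gz2 or die "gzip failed: $GzipError\n";
my $input = $gz1 . $gz2;

my $z = IO::Uncompress::AnyUncompress->new( \$input )
    or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

my @streams;
my $status;
for ( $status = 1; $status > 0; $status = $z->nextStream() ) {
    my $content = '';
    my $buffer;
    # Drain the current stream; read() returns 0 at its end.
    while ( ( $status = $z->read($buffer) ) > 0 ) {
        $content .= $buffer;
    }
    last if $status < 0;
    push @streams, $content;
}
die "Error reading input: $AnyUncompressError\n" if $status < 0;
$z->close();

print scalar(@streams), " streams found\n";
```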

   trailingData
       Usage is

           my $data = $z->trailingData();

       Returns the data, if any, that is present immediately after the
       compressed data stream once uncompression is complete. It only makes
       sense to call this method once the end of the compressed data stream
       has been encountered.

       This method can be used when there is useful information immediately
       following the compressed data stream, and you don't know the length of
       the compressed data stream.

       If the input is a buffer, "trailingData" will return everything from
       the end of the compressed data stream to the end of the buffer.

       If the input is a filehandle, "trailingData" will return the data that
       is left in the filehandle input buffer once the end of the compressed
       data stream has been reached. You can then use the filehandle to read
       the rest of the input file.

       Don't bother using "trailingData" if the input is a filename.

       If you know the length of the compressed data stream before you start
       uncompressing, you can avoid having to use "trailingData" by setting
       the "InputLength" option in the constructor.

Importing
       No symbolic constants are required by IO::Uncompress::AnyUncompress at
       present.

       :all Imports "anyuncompress" and $AnyUncompressError. Same as doing
            this:

                use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

SUPPORT
       General feedback/questions/bug reports should be sent to
       <https://github.com/pmqs/IO-Compress/issues> (preferred) or
       <https://rt.cpan.org/Public/Dist/Display.html?Name=IO-Compress>.

SEE ALSO
       Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip,
       IO::Compress::Deflate, IO::Uncompress::Inflate,
       IO::Compress::RawDeflate, IO::Uncompress::RawInflate,
       IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma,
       IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz,
       IO::Compress::Lzip, IO::Uncompress::UnLzip, IO::Compress::Lzop,
       IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf,
       IO::Compress::Zstd, IO::Uncompress::UnZstd, IO::Uncompress::AnyInflate

       IO::Compress::FAQ

       File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

AUTHOR
       This module was written by Paul Marquess, "pmqs@cpan.org".

MODIFICATION HISTORY
       See the Changes file.

COPYRIGHT AND LICENSE
       Copyright (c) 2005-2023 Paul Marquess. All rights reserved.

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.38.0                      2023-07-26      IO::Uncompress::AnyUncompress(3)