1 IO::Uncompress::Unzip(3)  User Contributed Perl Documentation  IO::Uncompress::Unzip(3)
2
3
4
5 NAME
6 IO::Uncompress::Unzip - Read zip files/buffers
7
8 SYNOPSIS
9 use IO::Uncompress::Unzip qw(unzip $UnzipError) ;
10
11 my $status = unzip $input => $output [,OPTS]
12 or die "unzip failed: $UnzipError\n";
13
14 my $z = new IO::Uncompress::Unzip $input [OPTS]
15 or die "unzip failed: $UnzipError\n";
16
17 $status = $z->read($buffer)
18 $status = $z->read($buffer, $length)
19 $status = $z->read($buffer, $length, $offset)
20 $line = $z->getline()
21 $char = $z->getc()
22 $char = $z->ungetc($string)
23 $status = $z->opened()
24
25 $status = $z->inflateSync()
26
27 $data = $z->trailingData()
28 $status = $z->nextStream()
29 $data = $z->getHeaderInfo()
30 $z->tell()
31 $z->seek($position, $whence)
32 $z->binmode()
33 $z->fileno()
34 $z->eof()
35 $z->close()
36
37 $UnzipError ;
38
39 # IO::File mode
40
41 <$z>
42 read($z, $buffer);
43 read($z, $buffer, $length);
44 read($z, $buffer, $length, $offset);
45 tell($z)
46 seek($z, $position, $whence)
47 binmode($z)
48 fileno($z)
49 eof($z)
50 close($z)
51
52 DESCRIPTION
53 This module provides a Perl interface that allows the reading of zip
54 files/buffers.
55
56 For writing zip files/buffers, see the companion module
57 IO::Compress::Zip.
58
59 Functional Interface
60 A top-level function, "unzip", is provided to carry out "one-shot"
61 uncompression between buffers and/or files. For finer control over the
62 uncompression process, see the "OO Interface" section.
63
64 use IO::Uncompress::Unzip qw(unzip $UnzipError) ;
65
66 unzip $input_filename_or_reference => $output_filename_or_reference [,OPTS]
67 or die "unzip failed: $UnzipError\n";
68
69 The functional interface needs Perl 5.005 or better.
70
71 unzip $input_filename_or_reference => $output_filename_or_reference [,
72 OPTS]
73 "unzip" expects at least two parameters, $input_filename_or_reference
74 and $output_filename_or_reference and zero or more optional parameters
75 (see "Optional Parameters").
76
77 The $input_filename_or_reference parameter
78
79 The parameter, $input_filename_or_reference, is used to define the
80 source of the compressed data.
81
82 It can take one of the following forms:
83
84 A filename
85 If the $input_filename_or_reference parameter is a simple scalar,
86 it is assumed to be a filename. This file will be opened for
87 reading and the input data will be read from it.
88
89 A filehandle
90 If the $input_filename_or_reference parameter is a filehandle, the
91 input data will be read from it. The string '-' can be used as an
92 alias for standard input.
93
94 A scalar reference
95 If $input_filename_or_reference is a scalar reference, the input
96 data will be read from $$input_filename_or_reference.
97
98 An array reference
99 If $input_filename_or_reference is an array reference, each
100 element in the array must be a filename.
101
102 The input data will be read from each file in turn.
103
104 The complete array will be walked to ensure that it only contains
105 valid filenames before any data is uncompressed.
106
107 An Input FileGlob string
108 If $input_filename_or_reference is a string that is delimited by
109 the characters "<" and ">", "unzip" will assume that it is an input
110 fileglob string. The input is the list of files that match the
111 fileglob.
112
113 See File::GlobMapper for more details.
114
115 If the $input_filename_or_reference parameter is any other type,
116 "undef" will be returned.
117
118 The $output_filename_or_reference parameter
119
120 The parameter $output_filename_or_reference is used to control the
121 destination of the uncompressed data. This parameter can take one of
122 these forms.
123
124 A filename
125 If the $output_filename_or_reference parameter is a simple scalar,
126 it is assumed to be a filename. This file will be opened for
127 writing and the uncompressed data will be written to it.
128
129 A filehandle
130 If the $output_filename_or_reference parameter is a filehandle,
131 the uncompressed data will be written to it. The string '-' can
132 be used as an alias for standard output.
133
134 A scalar reference
135 If $output_filename_or_reference is a scalar reference, the
136 uncompressed data will be stored in
137 $$output_filename_or_reference.
138
139 An Array Reference
140 If $output_filename_or_reference is an array reference, the
141 uncompressed data will be pushed onto the array.
142
143 An Output FileGlob
144 If $output_filename_or_reference is a string that is delimited by
145 the characters "<" and ">", "unzip" will assume that it is an
146 output fileglob string. The output is the list of files that match
147 the fileglob.
148
149 When $output_filename_or_reference is a fileglob string,
150 $input_filename_or_reference must also be a fileglob string.
151 Anything else is an error.
152
153 See File::GlobMapper for more details.
154
155 If the $output_filename_or_reference parameter is any other type,
156 "undef" will be returned.
157
158 Notes
159 When $input_filename_or_reference maps to multiple compressed
160 files/buffers and $output_filename_or_reference is a single
161 file/buffer, after uncompression $output_filename_or_reference will
162 contain a concatenation of all the uncompressed data from each of the
163 input files/buffers.
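
The concatenation behaviour described in the note above can be sketched as follows. This is only an illustration: the temporary directory, the file names and the sample text are made up, and IO::Compress::Zip is used solely to build two small input archives.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Zip qw(zip $ZipError);
use IO::Uncompress::Unzip qw(unzip $UnzipError);

# Build two single-member zip files in a scratch directory (hypothetical names).
my $dir   = tempdir(CLEANUP => 1);
my @texts = ("alpha\n", "beta\n");
my @zipfiles;
for my $i (0 .. $#texts) {
    my $file = "$dir/part$i.zip";
    zip \$texts[$i] => $file, Name => "part$i.txt"
        or die "zip failed: $ZipError\n";
    push @zipfiles, $file;
}

# An array reference of filenames as input, a single buffer as output:
# the uncompressed data from each input file is concatenated in order.
my $combined;
unzip \@zipfiles => \$combined
    or die "unzip failed: $UnzipError\n";

print $combined;
```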
164
165 Optional Parameters
166 The optional parameters for the one-shot function "unzip" are (for the
167 most part) identical to those used with the OO interface defined in the
168 "Constructor Options" section. The exceptions are listed below.
169
170 "AutoClose => 0|1"
171 This option applies to any input or output data streams to "unzip"
172 that are filehandles.
173
174 If "AutoClose" is specified, and the value is true, it will result
175 in all input and/or output filehandles being closed once "unzip"
176 has completed.
177
178 This parameter defaults to 0.
179
180 "BinModeOut => 0|1"
181 This option is now a no-op. All files will be written in binmode.
182
183 "Append => 0|1"
184 The behaviour of this option is dependent on the type of output
185 data stream.
186
187 · A Buffer
188
189 If "Append" is enabled, all uncompressed data will be appended
190 to the end of the output buffer. Otherwise the output buffer
191 will be cleared before any uncompressed data is written to
192 it.
193
194 · A Filename
195
196 If "Append" is enabled, the file will be opened in append
197 mode. Otherwise the contents of the file, if any, will be
198 truncated before any uncompressed data is written to it.
199
200 · A Filehandle
201
202 If "Append" is enabled, the filehandle will be positioned to
203 the end of the file via a call to "seek" before any
204 uncompressed data is written to it. Otherwise the file
205 pointer will not be moved.
206
207 When "Append" is specified, and set to true, it will append all
208 uncompressed data to the output data stream.
209
210 So when the output is a filehandle it will carry out a seek to the
211 eof before writing any uncompressed data. If the output is a
212 filename, it will be opened for appending. If the output is a
213 buffer, all uncompressed data will be appended to the existing
214 buffer.
215
216 Conversely when "Append" is not specified, or it is present and is
217 set to false, it will operate as follows.
218
219 When the output is a filename, it will truncate the contents of
220 the file before writing any uncompressed data. If the output is a
221 filehandle its position will not be changed. If the output is a
222 buffer, it will be wiped before any uncompressed data is output.
223
224 Defaults to 0.
225
226 "MultiStream => 0|1"
227 If the input file/buffer contains multiple compressed data
228 streams, this option will uncompress the whole lot as a single
229 data stream.
230
231 Defaults to 0.
232
233 "TrailingData => $scalar"
234 Returns the data, if any, that is present immediately after the
235 compressed data stream once uncompression is complete.
236
237 This option can be used when there is useful information
238 immediately following the compressed data stream, and you don't
239 know the length of the compressed data stream.
240
241 If the input is a buffer, "trailingData" will return everything
242 from the end of the compressed data stream to the end of the
243 buffer.
244
245 If the input is a filehandle, "trailingData" will return the data
246 that is left in the filehandle input buffer once the end of the
247 compressed data stream has been reached. You can then use the
248 filehandle to read the rest of the input file.
249
250 Don't bother using "trailingData" if the input is a filename.
251
252 If you know the length of the compressed data stream before you
253 start uncompressing, you can avoid having to use "trailingData" by
254 setting the "InputLength" option.
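
As a rough illustration of the "Append" and "MultiStream" options with the one-shot interface, the sketch below builds a two-member archive in memory and unzips it three ways. The member names and contents are invented for the example, and IO::Compress::Zip is used only to create the test data.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw($ZipError);
use IO::Uncompress::Unzip qw(unzip $UnzipError);

# Create an in-memory zip archive with two members (hypothetical names).
my $zipped;
my $w = IO::Compress::Zip->new(\$zipped, Name => "first.txt")
    or die "zip failed: $ZipError\n";
$w->print("first\n");
$w->newStream(Name => "second.txt");
$w->print("second\n");
$w->close();

# Default behaviour: only the first member is uncompressed.
my $first_only;
unzip \$zipped => \$first_only
    or die "unzip failed: $UnzipError\n";

# MultiStream => 1: every member is uncompressed into one data stream.
my $everything;
unzip \$zipped => \$everything, MultiStream => 1
    or die "unzip failed: $UnzipError\n";

# Append => 1 with a buffer: existing buffer contents are kept.
my $log = "header\n";
unzip \$zipped => \$log, Append => 1
    or die "unzip failed: $UnzipError\n";
```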
255
256 Examples
257 Say you have a zip file, "file1.zip", that only contains a single
258 member, you can read it and write the uncompressed data to the file
259 "file1.txt" like this.
260
261 use strict ;
262 use warnings ;
263 use IO::Uncompress::Unzip qw(unzip $UnzipError) ;
264
265 my $input = "file1.zip";
266 my $output = "file1.txt";
267 unzip $input => $output
268 or die "unzip failed: $UnzipError\n";
269
270 If you have a zip file that contains multiple members and want to read
271 a specific member from the file, say "data1", use the "Name" option.
272
273 use strict ;
274 use warnings ;
275 use IO::Uncompress::Unzip qw(unzip $UnzipError) ;
276
277 my $input = "file1.zip";
278 my $output = "file1.txt";
279 unzip $input => $output, Name => "data1"
280 or die "unzip failed: $UnzipError\n";
281
282 Alternatively, if you want to read the "data1" member into memory, use
283 a scalar reference for the "output" parameter.
284
285 use strict ;
286 use warnings ;
287 use IO::Uncompress::Unzip qw(unzip $UnzipError) ;
288
289 my $input = "file1.zip";
290 my $output ;
291 unzip $input => \$output, Name => "data1"
292 or die "unzip failed: $UnzipError\n";
293 # $output now contains the uncompressed data
294
295 To read from an existing Perl filehandle, $input, and write the
296 uncompressed data to a buffer, $buffer, use this form.
297
298 use strict ;
299 use warnings ;
300 use IO::Uncompress::Unzip qw(unzip $UnzipError) ;
301 use IO::File ;
302
303 my $input = new IO::File "<file1.zip"
304 or die "Cannot open 'file1.zip': $!\n" ;
305 my $buffer ;
306 unzip $input => \$buffer
307 or die "unzip failed: $UnzipError\n";
308
309 OO Interface
310 Constructor
311 The format of the constructor for IO::Uncompress::Unzip is shown below:
312
313 my $z = new IO::Uncompress::Unzip $input [OPTS]
314 or die "IO::Uncompress::Unzip failed: $UnzipError\n";
315
316 Returns an "IO::Uncompress::Unzip" object on success and undef on
317 failure. The variable $UnzipError will contain an error message on
318 failure.
319
320 If you are running Perl 5.005 or better the object, $z, returned from
321 IO::Uncompress::Unzip can be used exactly like an IO::File filehandle.
322 This means that all normal input file operations can be carried out
323 with $z. For example, to read a line from a compressed file/buffer you
324 can use either of these forms:
325
326 $line = $z->getline();
327 $line = <$z>;
328
329 The mandatory parameter $input is used to determine the source of the
330 compressed data. This parameter can take one of three forms.
331
332 A filename
333 If the $input parameter is a scalar, it is assumed to be a
334 filename. This file will be opened for reading and the compressed
335 data will be read from it.
336
337 A filehandle
338 If the $input parameter is a filehandle, the compressed data will
339 be read from it. The string '-' can be used as an alias for
340 standard input.
341
342 A scalar reference
343 If $input is a scalar reference, the compressed data will be read
344 from $$input.
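
A minimal sketch of the constructor with a scalar reference as input, reading lines with both forms mentioned above. The archive is built in memory with IO::Compress::Zip purely as test data; the member name and text are assumptions for the example.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# Build a small single-member archive in memory (hypothetical content).
my $data = "line one\nline two\n";
my $zipped;
zip \$data => \$zipped, Name => "lines.txt"
    or die "zip failed: $ZipError\n";

# $input is a scalar reference, so the compressed data is read from $zipped.
my $z = IO::Uncompress::Unzip->new(\$zipped)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";

my $first  = $z->getline();   # method form
my $second = <$z>;            # filehandle form
$z->close();

print $first, $second;
```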
345
346 Constructor Options
347 The option names defined below are case insensitive and can be
348 optionally prefixed by a '-'. So all of the following are valid:
349
350 -AutoClose
351 -autoclose
352 AUTOCLOSE
353 autoclose
354
355 OPTS is a combination of the following options:
356
357 "Name => "membername""
358 Open "membername" from the zip file for reading.
359
360 "Efs => 0|1"
361 When this option is set to true AND the zip archive being read has
362 the "Language Encoding Flag" (EFS) set, the member name is assumed
363 to be encoded in UTF-8.
364
365 If the member name in the zip archive is not valid UTF-8 when this
366 option is true, the script will die with an error message.
367
368 Note that this option only works with Perl 5.8.4 or better.
369
370 This option defaults to false.
371
372 "AutoClose => 0|1"
373 This option is only valid when the $input parameter is a
374 filehandle. If specified, and the value is true, it will result in
375 the file being closed once either the "close" method is called or
376 the IO::Uncompress::Unzip object is destroyed.
377
378 This parameter defaults to 0.
379
380 "MultiStream => 0|1"
381 Treats the complete zip file/buffer as a single compressed data
382 stream. When reading in multi-stream mode each member of the zip
383 file/buffer will be uncompressed in turn until the end of the
384 file/buffer is encountered.
385
386 This parameter defaults to 0.
387
388 "Prime => $string"
389 This option will uncompress the contents of $string before
390 processing the input file/buffer.
391
392 This option can be useful when the compressed data is embedded in
393 another file/data structure and it is not possible to work out
394 where the compressed data begins without having to read the first
395 few bytes. If this is the case, the uncompression can be primed
396 with these bytes using this option.
397
398 "Transparent => 0|1"
399 If this option is set and the input file/buffer is not compressed
400 data, the module will allow reading of it anyway.
401
402 In addition, if the input file/buffer does contain compressed data
403 and there is non-compressed data immediately following it, setting
404 this option will make this module treat the whole file/buffer as a
405 single data stream.
406
407 This option defaults to 1.
408
409 "BlockSize => $num"
410 When reading the compressed input data, IO::Uncompress::Unzip will
411 read it in blocks of $num bytes.
412
413 This option defaults to 4096.
414
415 "InputLength => $size"
416 When present this option will limit the number of compressed bytes
417 read from the input file/buffer to $size. This option can be used
418 in the situation where there is useful data directly after the
419 compressed data stream and you know beforehand the exact length of
420 the compressed data stream.
421
422 This option is mostly used when reading from a filehandle, in
423 which case the file pointer will be left pointing to the first
424 byte directly after the compressed data stream.
425
426 This option defaults to off.
427
428 "Append => 0|1"
429 This option controls what the "read" method does with uncompressed
430 data.
431
432 If set to 1, all uncompressed data will be appended to the output
433 parameter of the "read" method.
434
435 If set to 0, the contents of the output parameter of the "read"
436 method will be overwritten by the uncompressed data.
437
438 Defaults to 0.
439
440 "Strict => 0|1"
441 This option controls whether the extra checks defined below are
442 used when carrying out the decompression. When Strict is on, the
443 extra tests are carried out, when Strict is off they are not.
444
445 The default for this option is off.
446
447 Examples
448 TODO
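
Until examples are written for this section, here is one possible sketch that combines the "Name" and "Append" constructor options. It is an illustration only: the archive, member names and contents are made up, and IO::Compress::Zip is used just to create the input.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw($ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# Create an in-memory archive with two members (hypothetical names).
my $zipped;
my $w = IO::Compress::Zip->new(\$zipped, Name => "first.txt")
    or die "zip failed: $ZipError\n";
$w->print("first\n");
$w->newStream(Name => "second.txt");
$w->print("second\n");
$w->close();

# Name selects the member to open; Append makes read() add to $content
# instead of overwriting it on every call.
my $z = IO::Uncompress::Unzip->new(\$zipped,
                                   Name   => "second.txt",
                                   Append => 1)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";

my $content = "";
my $status;
1 while ($status = $z->read($content)) > 0;
die "read failed: $UnzipError\n" if $status < 0;
$z->close();

print $content;
```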
449
450 Methods
451 read
452 Usage is
453
454 $status = $z->read($buffer)
455
456 Reads a block of compressed data (the size of the compressed block is
457 determined by the "BlockSize" option in the constructor), uncompresses it
458 and writes any uncompressed data into $buffer. If the "Append"
459 parameter is set in the constructor, the uncompressed data will be
460 appended to the $buffer parameter. Otherwise $buffer will be
461 overwritten.
462
463 Returns the number of uncompressed bytes written to $buffer, zero if
464 eof or a negative number on error.
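
The usual way to drain a stream with this form of "read" is a loop on the return value. A minimal sketch, using an in-memory archive with made-up contents built by IO::Compress::Zip:

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

my $data = "Hello, world!\n" x 100;
my $zipped;
zip \$data => \$zipped, Name => "hello.txt"
    or die "zip failed: $ZipError\n";

my $z = IO::Uncompress::Unzip->new(\$zipped)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";

# Each call overwrites $chunk (Append is off), so collect the pieces.
my ($chunk, $content) = ("", "");
my $status;
while (($status = $z->read($chunk)) > 0) {
    $content .= $chunk;
}
die "read failed: $UnzipError\n" if $status < 0;
$z->close();
```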
465
466 read
467 Usage is
468
469 $status = $z->read($buffer, $length)
470 $status = $z->read($buffer, $length, $offset)
471
472 $status = read($z, $buffer, $length)
473 $status = read($z, $buffer, $length, $offset)
474
475 Attempt to read $length bytes of uncompressed data into $buffer.
476
477 The main difference between this form of the "read" method and the
478 previous one is that this one will attempt to return exactly $length
479 bytes. It returns fewer only if end-of-file or an I/O error is
480 encountered.
481
482 Returns the number of uncompressed bytes written to $buffer, zero if
483 eof or a negative number on error.
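
A short sketch of the $length and $offset forms. The sample data is invented, and the second call assumes that $offset behaves as it does for Perl's built-in "read", that is, the new bytes are placed at that position in $buffer.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

my $data = "Hello, world!\n";
my $zipped;
zip \$data => \$zipped, Name => "hello.txt"
    or die "zip failed: $ZipError\n";

my $z = IO::Uncompress::Unzip->new(\$zipped)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";

my $buffer;
my $got   = $z->read($buffer, 5);      # ask for exactly 5 bytes
my $first = $buffer;                   # keep a copy of the first result

my $more  = $z->read($buffer, 2, 5);   # 2 more bytes, stored at offset 5
$z->close();
```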
484
485 getline
486 Usage is
487
488 $line = $z->getline()
489 $line = <$z>
490
491 Reads a single line.
492
493 This method fully supports the use of the variable $/ (or
494 $INPUT_RECORD_SEPARATOR or $RS when "English" is in use) to determine
495 what constitutes an end of line. Paragraph mode, record mode and file
496 slurp mode are all supported.
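
To illustrate how $/ changes what "getline" returns, the sketch below reads the same made-up archive twice: once line by line, and once in file slurp mode with $/ undefined.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

my $data = "one\ntwo\nthree\n";
my $zipped;
zip \$data => \$zipped, Name => "numbers.txt"
    or die "zip failed: $ZipError\n";

# Default $/ ("\n"): one line per call.
my $z = IO::Uncompress::Unzip->new(\$zipped)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";
my @lines;
while (defined(my $line = $z->getline())) {
    push @lines, $line;
}
$z->close();

# Slurp mode: with $/ undefined, a single call returns everything.
my $z2 = IO::Uncompress::Unzip->new(\$zipped)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";
my $all = do { local $/; $z2->getline() };
$z2->close();
```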
497
498 getc
499 Usage is
500
501 $char = $z->getc()
502
503 Read a single character.
504
505 ungetc
506 Usage is
507
508 $char = $z->ungetc($string)
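
The text above does not describe "ungetc", so the following sketch rests on an assumption: that it pushes $string back onto the uncompressed input so that later reads see it again, as "ungetc" does for IO::Handle. The sample data is made up.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

my $data = "Hello, world!\n";
my $zipped;
zip \$data => \$zipped, Name => "hello.txt"
    or die "zip failed: $ZipError\n";

my $z = IO::Uncompress::Unzip->new(\$zipped)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";

# Peek at the first character, then put it back (assumed behaviour).
my $char = $z->getc();
$z->ungetc($char);

# The next read should start from the pushed-back character.
my $buffer;
$z->read($buffer, 5);
$z->close();
```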
509
510 inflateSync
511 Usage is
512
513 $status = $z->inflateSync()
514
515 TODO
516
517 getHeaderInfo
518 Usage is
519
520 $hdr = $z->getHeaderInfo();
521 @hdrs = $z->getHeaderInfo();
522
523 This method returns either a hash reference (in scalar context) or a
524 list of hash references (in array context) that contains information
525 about each of the header fields in the compressed data stream(s).
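
A minimal sketch of the scalar-context form. Only the "Name" field is used here, since it is the one field this document shows elsewhere; the archive and its member name are invented for the example.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

my $data = "Hello, world!\n";
my $zipped;
zip \$data => \$zipped, Name => "hello.txt"
    or die "zip failed: $ZipError\n";

my $z = IO::Uncompress::Unzip->new(\$zipped)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";

# Scalar context: a hash reference describing the current member.
my $hdr = $z->getHeaderInfo();
print "Member name: $hdr->{Name}\n";
$z->close();
```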
526
527 tell
528 Usage is
529
530 $z->tell()
531 tell $z
532
533 Returns the uncompressed file offset.
534
535 eof
536 Usage is
537
538 $z->eof();
539 eof($z);
540
541 Returns true if the end of the compressed input stream has been
542 reached.
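
The following sketch shows "tell" and "eof" together on a small, made-up archive: the offset is checked after a fixed-length read, and "eof" is checked once "read" has reported that no data is left.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

my $data = "Hello, world!\n";
my $zipped;
zip \$data => \$zipped, Name => "hello.txt"
    or die "zip failed: $ZipError\n";

my $z = IO::Uncompress::Unzip->new(\$zipped)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";

my $buffer;
$z->read($buffer, 5);
my $offset = $z->tell();    # uncompressed bytes consumed so far

# Drain the rest of the stream, then ask whether we are at the end.
my $status;
1 while ($status = $z->read($buffer)) > 0;
die "read failed: $UnzipError\n" if $status < 0;
my $at_end = $z->eof();
$z->close();
```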
543
544 seek
545 $z->seek($position, $whence);
546 seek($z, $position, $whence);
547
548 Provides a sub-set of the "seek" functionality, with the restriction
549 that it is only legal to seek forward in the input file/buffer. It is
550 a fatal error to attempt to seek backward.
551
552 Note that the implementation of "seek" in this module does not provide
553 true random access to a compressed file/buffer. It works by
554 uncompressing data from the current offset in the file/buffer until it
555 reaches the uncompressed offset specified in the parameters to "seek".
556 For very small files this may be acceptable behaviour. For large files
557 it may cause an unacceptable delay.
558
559 The $whence parameter takes one of the usual values, namely SEEK_SET,
560 SEEK_CUR or SEEK_END.
561
562 Returns 1 on success, 0 on failure.
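
A small sketch of a forward seek with SEEK_SET, followed by a read from the new position. The data is made up; the constant is imported from the core Fcntl module.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use IO::Compress::Zip qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

my $data = "Hello, world!\n";
my $zipped;
zip \$data => \$zipped, Name => "hello.txt"
    or die "zip failed: $ZipError\n";

my $z = IO::Uncompress::Unzip->new(\$zipped)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";

# Skip forward to uncompressed offset 7; the skipped data is still
# uncompressed internally, it is just discarded.
my $ok = $z->seek(7, SEEK_SET);

my $buffer;
$z->read($buffer, 5);
$z->close();
```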
563
564 binmode
565 Usage is
566
567 $z->binmode
568 binmode $z ;
569
570 This is a no-op provided for completeness.
571
572 opened
573 $z->opened()
574
575 Returns true if the object currently refers to an opened file/buffer.
576
577 autoflush
578 my $prev = $z->autoflush()
579 my $prev = $z->autoflush(EXPR)
580
581 If the $z object is associated with a file or a filehandle, this method
582 returns the current autoflush setting for the underlying filehandle. If
583 "EXPR" is present, and is non-zero, it will enable flushing after every
584 write/print operation.
585
586 If $z is associated with a buffer, this method has no effect and always
587 returns "undef".
588
589 Note that the special variable $| cannot be used to set or retrieve the
590 autoflush setting.
591
592 input_line_number
593 $z->input_line_number()
594 $z->input_line_number(EXPR)
595
596 Returns the current uncompressed line number. If "EXPR" is present it
597 has the effect of setting the line number. Note that setting the line
598 number does not change the current position within the file/buffer
599 being read.
600
601 The contents of $/ are used to determine what constitutes a line
602 terminator.
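
For illustration, the line counter can be inspected after a couple of "getline" calls. The archive contents below are invented for the example.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

my $data = "one\ntwo\nthree\n";
my $zipped;
zip \$data => \$zipped, Name => "numbers.txt"
    or die "zip failed: $ZipError\n";

my $z = IO::Uncompress::Unzip->new(\$zipped)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";

# Read two of the three lines, then ask how many lines have been seen.
$z->getline();
$z->getline();
my $lineno = $z->input_line_number();
$z->close();
```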
603
604 fileno
605 $z->fileno()
606 fileno($z)
607
608 If the $z object is associated with a file or a filehandle, "fileno"
609 will return the underlying file descriptor. Once the "close" method is
610 called "fileno" will return "undef".
611
612 If the $z object is associated with a buffer, this method will return
613 "undef".
614
615 close
616 $z->close() ;
617 close $z ;
618
619 Closes the input file/buffer.
620
621 For most versions of Perl this method will be automatically invoked if
622 the IO::Uncompress::Unzip object is destroyed (either explicitly or by
623 the variable with the reference to the object going out of scope). The
624 exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these
625 cases, the "close" method will be called automatically, but not until
626 global destruction of all live objects when the program is terminating.
627
628 Therefore, if you want your scripts to be able to run on all versions
629 of Perl, you should call "close" explicitly and not rely on automatic
630 closing.
631
632 Returns true on success, otherwise 0.
633
634 If the "AutoClose" option has been enabled when the
635 IO::Uncompress::Unzip object was created, and the object is associated
636 with a file, the underlying file will also be closed.
637
638 nextStream
639 Usage is
640
641 my $status = $z->nextStream();
642
643 Skips to the next compressed data stream in the input file/buffer. If a
644 new compressed data stream is found, the eof marker will be cleared and
645 $. will be reset to 0.
646
647 If trailing data is present immediately after the zip archive and the
648 "Transparent" option is enabled, this method will consider that
649 trailing data to be another member of the zip archive.
650
651 Returns 1 if a new stream was found, 0 if none was found, and -1 if an
652 error was encountered.
653
654 trailingData
655 Usage is
656
657 my $data = $z->trailingData();
658
659 Returns the data, if any, that is present immediately after the
660 compressed data stream once uncompression is complete. It only makes
661 sense to call this method once the end of the compressed data stream
662 has been encountered.
663
664 This method can be used when there is useful information immediately
665 following the compressed data stream, and you don't know the length of
666 the compressed data stream.
667
668 If the input is a buffer, "trailingData" will return everything from
669 the end of the compressed data stream to the end of the buffer.
670
671 If the input is a filehandle, "trailingData" will return the data that
672 is left in the filehandle input buffer once the end of the compressed
673 data stream has been reached. You can then use the filehandle to read
674 the rest of the input file.
675
676 Don't bother using "trailingData" if the input is a filename.
677
678 If you know the length of the compressed data stream before you start
679 uncompressing, you can avoid having to use "trailingData" by setting
680 the "InputLength" option in the constructor.
681
682 Importing
683 No symbolic constants are required by IO::Uncompress::Unzip at
684 present.
685
686 :all Imports "unzip" and $UnzipError. Same as doing this:
687
688 use IO::Uncompress::Unzip qw(unzip $UnzipError) ;
689
690 EXAMPLES
691 Working with Net::FTP
692 See IO::Compress::FAQ
693
694 Walking through a zip file
695 The code below can be used to traverse a zip file, one compressed data
696 stream at a time.
697
698 use IO::Uncompress::Unzip qw($UnzipError);
699
700 my $zipfile = "somefile.zip";
701 my $u = new IO::Uncompress::Unzip $zipfile
702 or die "Cannot open $zipfile: $UnzipError";
703
704 my $status;
705 for ($status = 1; $status > 0; $status = $u->nextStream())
706 {
707
708 my $name = $u->getHeaderInfo()->{Name};
709 warn "Processing member $name\n" ;
710
711 my $buff;
712 while (($status = $u->read($buff)) > 0) {
713 # Do something here
714 }
715
716 last if $status < 0;
717 }
718
719 die "Error processing $zipfile: $UnzipError\n"
720 if $status < 0 ;
721
722 Each individual compressed data stream is read until the logical end-
723 of-file is reached. Then "nextStream" is called. This will skip to the
724 start of the next compressed data stream and clear the end-of-file
725 flag.
726
727 It is also worth noting that "nextStream" can be called at any time --
728 you don't have to wait until you have exhausted a compressed data
729 stream before skipping to the next one.
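
A variation on the walk above, sketched under the same pattern, collects each member's uncompressed data into a hash keyed by member name. The archive is built in memory with IO::Compress::Zip, and the member names and contents are made up for the example.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw($ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# Create an in-memory archive with two members (hypothetical names).
my $zipped;
my $w = IO::Compress::Zip->new(\$zipped, Name => "first.txt")
    or die "zip failed: $ZipError\n";
$w->print("first\n");
$w->newStream(Name => "second.txt");
$w->print("second\n");
$w->close();

my $u = IO::Uncompress::Unzip->new(\$zipped)
    or die "Cannot open archive: $UnzipError\n";

my %contents;
my $status;
for ($status = 1; $status > 0; $status = $u->nextStream()) {
    my $name = $u->getHeaderInfo()->{Name};
    $contents{$name} = "";

    # Read this member to its logical end-of-file.
    my $buff;
    while (($status = $u->read($buff)) > 0) {
        $contents{$name} .= $buff;
    }
    last if $status < 0;
}
die "Error processing archive: $UnzipError\n" if $status < 0;
$u->close();
```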
730
731 Unzipping a complete zip file to disk
732 Daniel S. Sterling has written a script that uses
733 "IO::Uncompress::Unzip" to read a zip file and unzip its contents to
734 disk.
735
736 The script is available from <https://gist.github.com/eqhmcow/5389877>
737
738 SUPPORT
739 General feedback/questions/bug reports should be sent to
740 <https://github.com/pmqs/IO-Compress/issues> (preferred) or
741 <https://rt.cpan.org/Public/Dist/Display.html?Name=IO-Compress>.
742
743 SEE ALSO
744 Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip,
745 IO::Compress::Deflate, IO::Uncompress::Inflate,
746 IO::Compress::RawDeflate, IO::Uncompress::RawInflate,
747 IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma,
748 IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz,
749 IO::Compress::Lzip, IO::Uncompress::UnLzip, IO::Compress::Lzop,
750 IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf,
751 IO::Compress::Zstd, IO::Uncompress::UnZstd, IO::Uncompress::AnyInflate,
752 IO::Uncompress::AnyUncompress
753
754 IO::Compress::FAQ
755
756 File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib
757
758 For RFC 1950, 1951 and 1952 see
759 <http://www.faqs.org/rfcs/rfc1950.html>,
760 <http://www.faqs.org/rfcs/rfc1951.html> and
761 <http://www.faqs.org/rfcs/rfc1952.html>
762
763 The zlib compression library was written by Jean-loup Gailly
764 "gzip@prep.ai.mit.edu" and Mark Adler "madler@alumni.caltech.edu".
765
766 The primary site for the zlib compression library is
767 <http://www.zlib.org>.
768
769 The primary site for gzip is <http://www.gzip.org>.
770
771 AUTHOR
772 This module was written by Paul Marquess, "pmqs@cpan.org".
773
774 MODIFICATION HISTORY
775 See the Changes file.
776
777 COPYRIGHT AND LICENSE
778 Copyright (c) 2005-2019 Paul Marquess. All rights reserved.
779
780 This program is free software; you can redistribute it and/or modify it
781 under the same terms as Perl itself.
782
783
784
785 perl v5.30.1  2020-01-30  IO::Uncompress::Unzip(3)