1Spreadsheet::ParseExcelU(s3e)r Contributed Perl DocumentaStpiroenadsheet::ParseExcel(3)
2
3
4
6 Spreadsheet::ParseExcel - Extract information from an Excel file.
7
9 #!/usr/bin/perl -w
10
11 use strict;
12 use Spreadsheet::ParseExcel;
13
14 my $parser = Spreadsheet::ParseExcel->new();
15 my $workbook = $parser->Parse('Book1.xls');
16
17 for my $worksheet ( $workbook->worksheets() ) {
18
19 my ( $row_min, $row_max ) = $worksheet->row_range();
20 my ( $col_min, $col_max ) = $worksheet->col_range();
21
22 for my $row ( $row_min .. $row_max ) {
23 for my $col ( $col_min .. $col_max ) {
24
25 my $cell = $worksheet->get_cell( $row, $col );
26 next unless $cell;
27
28 print "Row, Col = ($row, $col)\n";
29 print "Value = ", $cell->value(), "\n";
30 print "Unformatted = ", $cell->unformatted(), "\n";
31 print "\n";
32 }
33 }
34 }
35
37 The Spreadsheet::ParseExcel module can be used to read information from
38 an Excel 95-2003 file.
39
41 new()
42 The "new()" method is used to create a new "Spreadsheet::ParseExcel"
43 parser object.
44
45 my $parser = Spreadsheet::ParseExcel->new();
46
47 As an advanced feature it is also possible to pass a call-back handler
48 to the parser to control the parsing of the spreadsheet.
49
50 $parser = Spreadsheet::ParseExcel->new(
51 [ CellHandler => \&cell_handler,
52 NotSetCell => 1,
53 ]);
54
55 The call-back can be used to ignore certain cells or to reduce memory
56 usage. See the section "Reducing the memory usage of
57 Spreadsheet::ParseExcel" for more information.
58
59 Parse($filename, [$formatter])
60 The Parser "Parse()" method return a "Workbook" object.
61
62 my $parser = Spreadsheet::ParseExcel->new();
63 my $workbook = $parser->Parse('Book1.xls');
64
65 If an error occurs "Parse()" returns "undef".
66
67 The $filename parameter is generally the file to be parsed. However, it
68 can also be a filehandle or a scalar reference.
69
70 The optional $formatter array ref can be an reference to a "Formatter
71 Class" to format the value of cells.
72
73 ColorIdxToRGB()
74 The "ColorIdxToRGB()" method returns a RGB string corresponding to a
75 specified color index. The RGB string has 6 characters, representing
76 the RGB hex value, for example 'FF0000'. The color index is generally
77 obtained from a FONT object.
78
79 $RGB = $parser->ColorIdxToRGB($color_index);
80
82 A "Spreadsheet::ParseExcel::Workbook" is created via the
83 "Spreadsheet::ParseExcel" "Parse()" method:
84
85 my $parser = Spreadsheet::ParseExcel->new();
86 my $workbook = $parser->Parse('Book1.xls');
87
88 The Workbook class has methods and properties that are outlined in the
89 following sections.
90
92 Parse()
93 As a syntactic shorthand you can create a Parser and Workbook object in
94 one go using the Workbook "Parse()" method. The following examples are
95 equivalent:
96
97 # Method 1
98 my $parser = Spreadsheet::ParseExcel->new();
99 my $workbook = $parser->Parse('Book1.xls');
100
101 # Method 2
102 my $workbook = Spreadsheet::ParseExcel::Workbook->Parse('Book1.xls');
103
104 worksheets()
105 Returns an array of "Worksheet" objects. This was most commonly used to
106 iterate over the worksheets in a workbook:
107
108 for my $worksheet ( $workbook->worksheets() ) {
109 ...
110 }
111
112 Worksheet()
113 The "Worksheet()" method returns a single "Worksheet" object using
114 either its name or index:
115
116 $worksheet = $workbook->Worksheet('Sheet1');
117 $worksheet = $workbook->Worksheet(0);
118
119 Returns "undef" if the sheet name or index doesn't exist.
120
122 A workbook object exposes a number of properties as shown below:
123
124 $workbook->{Worksheet }->[$index]
125 $workbook->{File}
126 $workbook->{Author}
127 $workbook->{Flg1904}
128 $workbook->{Version}
129 $workbook->{SheetCount}
130 $workbook->{PrintArea }->[$index]
131 $workbook->{PrintTitle}->[$index]
132
133 These properties are generally only of interest to advanced users.
134 Casual users can skip this section.
135
136 $workbook->{Worksheet}->[$index]
137 Returns an array of "Worksheet" objects. This was most commonly used to
138 iterate over the worksheets in a workbook:
139
140 for my $worksheet (@{$workbook->{Worksheet}}) {
141 ...
142 }
143
144 It is now deprecated, use worksheets()) instead.
145
146 $workbook->{File}
147 Returns the name of the Excel file.
148
149 $workbook->{Author}
150 Returns the author of the Excel file.
151
152 $workbook->{Flg1904}
153 Returns true if the Excel file is using the 1904 date epoch instead of
154 the 1900 epoch. The Windows version of Excel generally uses the 1900
155 epoch while the Mac version of Excel generally uses the 1904 epoch.
156
157 $workbook->{Version}
158 Returns the version of the Excel file.
159
160 $workbook->{SheetCount}
161 Returns the numbers of "Worksheet" objects in the Workbook.
162
163 $workbook->{PrintArea}->[$index]
164 Returns an array ref of print areas. Each print area is as follows:
165
166 [ $start_row, $start_col, $end_row, $end_col]
167
168 $workbook->{PrintTitle}->[$index]
169 Returns an array ref of print title hash refs. Each print title is as
170 follows:
171
172 {
173 Row => [$start_row, $end_row],
174 Column => [$start_col, $end_col]
175 }
176
178 The "Spreadsheet::ParseExcel::Worksheet" class has the following
179 methods and properties.
180
182 get_cell($row, $col)
183 Return the "Cell" object at row $row and column $col if it is defined.
184 Otherwise returns undef.
185
186 my $cell = $worksheet->get_cell($row, $col);
187
188 row_range()
189 Return a two-element list "($min, $max)" containing the minimum and
190 maximum defined rows in the worksheet. If there is no row defined $max
191 is smaller than $min.
192
193 my ( $row_min, $row_max ) = $worksheet->row_range();
194
195 col_range()
196 Return a two-element list "($min, $max)" containing the minimum and
197 maximum of defined columns in the worksheet. If there is no column
198 defined $max is smaller than $min.
199
200 my ( $col_min, $col_max ) = $worksheet->col_range();
201
203 A worksheet object exposes a number of properties as shown below:
204
205 $worksheet->{Name}
206 $worksheet->{DefRowHeight}
207 $worksheet->{DefColWidth}
208 $worksheet->{RowHeight}->[$row]
209 $worksheet->{ColWidth}->[$col]
210 $worksheet->{Cells}->[$row]->[$col]
211 $worksheet->{Landscape}
212 $worksheet->{Scale}
213 $worksheet->{PageFit}
214 $worksheet->{FitWidth}
215 $worksheet->{FitHeight}
216 $worksheet->{PaperSize}
217 $worksheet->{PageStart}
218 $worksheet->{UsePage}
219 $worksheet->{$margin}
220 $worksheet->{HCenter}
221 $worksheet->{VCenter}
222 $worksheet->{Header}
223 $worksheet->{Footer}
224 $worksheet->{PrintGrid}
225 $worksheet->{PrintHeaders}
226 $worksheet->{NoColor}
227 $worksheet->{Draft}
228 $worksheet->{Notes}
229 $worksheet->{LeftToRight}
230 $worksheet->{HPageBreak}
231 $worksheet->{VPageBreak}
232 $worksheet->{MergedArea}
233
234 These properties are generally only of interest to advanced users.
235 Casual users can skip this section.
236
237 $worksheet->{Name}
238 Returns the name of the worksheet such as 'Sheet1'.
239
240 $worksheet->{DefRowHeight}
241 Returns default height of the rows in the worksheet.
242
243 $worksheet->{DefColWidth}
244 Returns default width of columns in the worksheet.
245
246 $worksheet->{RowHeight}->[$row]
247 Returns an array of row heights.
248
249 $worksheet->{ColWidth}->[$col]
250 Returns array of column widths. A value of "undef" means the column has
251 the "DefColWidth".
252
253 $worksheet->{Cells}->[$row]->[$col]
254 Returns array of "Cell" objects in the worksheet.
255
256 my $cell = $worksheet->{Cells}->[$row]->[$col];
257
258 $worksheet->{Landscape}
259 Returns 0 for horizontal or 1 for vertical.
260
261 $worksheet->{Scale}
262 Returns the worksheet print scale.
263
264 $worksheet->{PageFit}
265 Returns true if the "fit to" print option is set.
266
267 $worksheet->{FitWidth}
268 Return the number of pages in the "fit to width" option.
269
270 $worksheet->{FitHeight}
271 Return the number of pages in the "fit to height" option.
272
273 $worksheet->{PaperSize}
274 Returns the printer paper size. The value corresponds to the formats
275 shown below:
276
277 Index Paper format Paper size
278 ===== ============ ==========
279 0 Printer default -
280 1 Letter 8 1/2 x 11 in
281 2 Letter Small 8 1/2 x 11 in
282 3 Tabloid 11 x 17 in
283 4 Ledger 17 x 11 in
284 5 Legal 8 1/2 x 14 in
285 6 Statement 5 1/2 x 8 1/2 in
286 7 Executive 7 1/4 x 10 1/2 in
287 8 A3 297 x 420 mm
288 9 A4 210 x 297 mm
289 10 A4 Small 210 x 297 mm
290 11 A5 148 x 210 mm
291 12 B4 250 x 354 mm
292 13 B5 182 x 257 mm
293 14 Folio 8 1/2 x 13 in
294 15 Quarto 215 x 275 mm
295 16 - 10x14 in
296 17 - 11x17 in
297 18 Note 8 1/2 x 11 in
298 19 Envelope 9 3 7/8 x 8 7/8
299 20 Envelope 10 4 1/8 x 9 1/2
300 21 Envelope 11 4 1/2 x 10 3/8
301 22 Envelope 12 4 3/4 x 11
302 23 Envelope 14 5 x 11 1/2
303 24 C size sheet -
304 25 D size sheet -
305 26 E size sheet -
306 27 Envelope DL 110 x 220 mm
307 28 Envelope C3 324 x 458 mm
308 29 Envelope C4 229 x 324 mm
309 30 Envelope C5 162 x 229 mm
310 31 Envelope C6 114 x 162 mm
311 32 Envelope C65 114 x 229 mm
312 33 Envelope B4 250 x 353 mm
313 34 Envelope B5 176 x 250 mm
314 35 Envelope B6 176 x 125 mm
315 36 Envelope 110 x 230 mm
316 37 Monarch 3.875 x 7.5 in
317 38 Envelope 3 5/8 x 6 1/2 in
318 39 Fanfold 14 7/8 x 11 in
319 40 German Std Fanfold 8 1/2 x 12 in
320 41 German Legal Fanfold 8 1/2 x 13 in
321 256 User defined
322
323 The two most common paper sizes are "1 = "US Letter"" and "9 = A4".
324
325 $worksheet->{PageStart}
326 Returns the page number where printing starts.
327
328 $worksheet->{UsePage}
329 Returns whether a user defined start page is in use.
330
331 $worksheet->{$margin}
332 Returns the worksheet margin for left, right, top, bottom, header and
333 footer where $margin has one of the following values:
334
335 LeftMargin
336 RightMargin
337 TopMargin
338 BottomMargin
339 HeaderMargin
340 FooterMargin
341
342 $worksheet->{HCenter}
343 Returns true if the "Center horizontally when Printing" option is set.
344
345 $worksheet->{VCenter}
346 Returns true if the "Center vertically when Printing" option is set.
347
348 $worksheet->{Header}
349 Returns the print header string. This can contain control codes for
350 alignment and font properties. Refer to the Excel on-line help on
351 headers and footers or to the Spreadsheet::WriteExcel documentation for
352 "set_header()".
353
354 $worksheet->{Footer}
355 Returns the print footer string. This can contain control codes for
356 alignment and font properties. Refer to the Excel on-line help on
357 headers and footers or to the Spreadsheet::WriteExcel documentation for
358 "set_header()".
359
360 $worksheet->{PrintGrid}
361 Returns true if Print with gridlines is set.
362
363 $worksheet->{PrintHeaders}
364 Returns true if Print with headings is set.
365
366 $worksheet->{NoColor}
367 Returns true if Print in black and white is set.
368
369 $worksheet->{Draft}
370 Returns true if the "draft mode" print option is set.
371
372 $worksheet->{Notes}
373 Returns true if print with notes option is set.
374
375 $worksheet->{LeftToRight}
376 Returns the print order for the worksheet. Returns 0 for "left to
377 right" printing and 1 for "top down" printing.
378
379 $worksheet->{HPageBreak}
380 Return an array ref of horizontal page breaks.
381
382 $worksheet->{VPageBreak}
383 Return an array ref of vertical page breaks.
384
385 $worksheet->{MergedArea}
386 Return an array ref of merged areas. Each merged area is:
387
388 [ $start_row, $start_col, $end_row, $end_col]
389
391 The "Spreadsheet::ParseExcel::Cell" class has the following methods and
392 properties.
393
395 value()
396 Formatted value of the cell.
397
398 unformatted()
399 Unformatted value of the cell.
400
402 $cell->{Val}
403 $cell->{Type}
404 $cell->{Code}
405 $cell->{Format}
406 $cell->{Merged}
407 $cell->{Rich}
408
409 $cell->{Val}
410 Returns the unformatted value of the cell. This is Deprecated, use
411 "$cell->unformatted()" instead.
412
413 $cell->{Type}
414 Returns the type of cell such as "Text", "Numeric" or "Date".
415
416 If the type was detected as "Numeric", and the Cell Format matches
417 "m{^[dmy][-\\/dmy]*$}", it will be treated as a "Date" type.
418
419 $cell->{Code}
420 Returns the character encoding of the cell. It is either "undef",
421 "ucs2" or "_native_".
422
423 If "undef" then the character encoding seems to be "ascii".
424
425 If "_native_" it means that cell seems to be 'sjis' or something
426 similar.
427
428 $cell->{Format}
429 Returns the "Format" object for the cell.
430
431 $cell->{Merged}
432 Returns true if the cell is merged.
433
434 $cell->{Rich}
435 Returns an array ref of font information about each string block in a
436 "rich", i.e. multi-format, string. Each entry has the form:
437
438 [ $start_position>, $font_object ]
439
440 For more information refer to the example program "sample/dmpExR.pl".
441
443 The "Spreadsheet::ParseExcel::Format" class has the following
444 properties:
445
446 Format properties
447 $format->{Font}
448 $format->{AlignH}
449 $format->{AlignV}
450 $format->{Indent}
451 $format->{Wrap}
452 $format->{Shrink}
453 $format->{Rotate}
454 $format->{JustLast}
455 $format->{ReadDir}
456 $format->{BdrStyle}
457 $format->{BdrColor}
458 $format->{BdrDiag}
459 $format->{Fill}
460 $format->{Lock}
461 $format->{Hidden}
462 $format->{Style}
463
464 These properties are generally only of interest to advanced users.
465 Casual users can skip this section.
466
467 $format->{Font}
468 Returns the "Font" object for the Format.
469
470 $format->{AlignH}
471 Returns the horizontal alignment of the format where the value has the
472 following meaning:
473
474 0 => No alignment
475 1 => Left
476 2 => Center
477 3 => Right
478 4 => Fill
479 5 => Justify
480 6 => Center across
481 7 => Distributed/Equal spaced
482
483 $format->{AlignV}
484 Returns the vertical alignment of the format where the value has the
485 following meaning:
486
487 0 => Top
488 1 => Center
489 2 => Bottom
490 3 => Justify
491 4 => Distributed/Equal spaced
492
493 $format->{Indent}
494 Returns the indent level of the "Left" horizontal alignment.
495
496 $format->{Wrap}
497 Returns true if textwrap is on.
498
499 $format->{Shrink}
500 Returns true if "Shrink to fit" is set for the format.
501
502 $format->{Rotate}
503 Returns the text rotation. In Excel97+, it returns the angle in degrees
504 of the text rotation.
505
506 In Excel95 or earlier it returns a value as follows:
507
508 0 => No rotation
509 1 => Top down
510 2 => 90 degrees anti-clockwise,
511 3 => 90 clockwise
512
513 $format->{JustLast}
514 Return true if the "justify last" property is set for the format.
515
516 $format->{ReadDir}
517 Returns the direction that the text is read from.
518
519 $format->{BdrStyle}
520 Returns an array ref of border styles as follows:
521
522 [ $left, $right, $top, $bottom ]
523
524 $format->{BdrColor}
525 Returns an array ref of border color indexes as follows:
526
527 [ $left, $right, $top, $bottom ]
528
529 $format->{BdrDiag}
530 Returns an array ref of diagonal border kind, style and color index as
531 follows:
532
533 [$kind, $style, $color ]
534
535 Where kind is:
536
537 0 => None
538 1 => Right-Down
539 2 => Right-Up
540 3 => Both
541
542 $format->{Fill}
543 Returns an array ref of fill pattern and color indexes as follows:
544
545 [ $pattern, $front_color, $back_color ]
546
547 $format->{Lock}
548 Returns true if the cell is locked.
549
550 $format->{Hidden}
551 Returns true if the cell is Hidden.
552
553 $format->{Style}
554 Returns true if the format is a Style format.
555
557 Spreadsheet::ParseExcel::Font
558
559 Format class has these properties:
560
562 $font->{Name}
563 $font->{Bold}
564 $font->{Italic}
565 $font->{Height}
566 $font->{Underline}
567 $font->{UnderlineStyle}
568 $font->{Color}
569 $font->{Strikeout}
570 $font->{Super}
571
572 $font->{Name}
573 Returns the name of the font, for example 'Arial'.
574
575 $font->{Bold}
576 Returns true if the font is bold.
577
578 $font->{Italic}
579 Returns true if the font is italic.
580
581 $font->{Height}
582 Returns the size (height) of the font.
583
584 $font->{Underline}
585 Returns true if the font in underlined.
586
587 $font->{UnderlineStyle}
588 Returns the style of an underlined font where the value has the
589 following meaning:
590
591 0 => None
592 1 => Single
593 2 => Double
594 33 => Single accounting
595 34 => Double accounting
596
597 $font->{Color}
598 Returns the color index for the font. The index can be converted to a
599 RGB string using the "ColorIdxToRGB()" Parser method.
600
601 $font->{Strikeout}
602 Returns true if the font has the strikeout property set.
603
604 $font->{Super}
605 Returns one of the following values if the superscript or subscript
606 property of the font is set:
607
608 0 => None
609 1 => Superscript
610 2 => Subscript
611
613 Spreadsheet::ParseExcel::Fmt*
614
615 Formatter class will convert cell data.
616
617 Spreadsheet::ParseExcel includes 2 formatter classes. "FmtDefault" and
618 "FmtJapanese". It is also possible to create a user defined formatting
619 class.
620
621 The formatter class "Spreadsheet::ParseExcel::Fmt*" should provide the
622 following functions:
623
624 ChkType($self, $is_numeric, $format_index)
625 Method to check the the type of data in the cell. Should return "Date",
626 "Numeric" or "Text". It is passed the following parameters:
627
628 $self
629 A scalar reference to the Formatter object.
630
631 $is_numeric
632 If true, the value seems to be number.
633
634 $format_index
635 The index number for the cell Format object.
636
637 TextFmt($self, $string_data, $string_encoding)
638 Converts the string data in the cell into the correct encoding. It is
639 passed the following parameters:
640
641 $self
642 A scalar reference to the Formatter object.
643
644 $string_data
645 The original string/text data.
646
647 $string_encoding
648 The character encoding of original string/text.
649
650 ValFmt($self, $cell, $workbook)
651 Convert the original unformatted cell value into the appropriate
652 formatted value. For instance turn a number into a formatted date. It
653 is passed the following parameters:
654
655 $self
656 A scalar reference to the Formatter object.
657
658 $cell
659 A scalar reference to the Cell object.
660
661 $workbook
662 A scalar reference to the Workbook object.
663
664 FmtString($self, $cell, $workbook)
665 Get the format string for the Cell. It is passed the following
666 parameters:
667
668 $self
669 A scalar reference to the Formatter object.
670
671 $cell
672 A scalar reference to the Cell object.
673
674 $workbook
675 A scalar reference to the Workbook object.
676
678 In some cases a "Spreadsheet::ParseExcel" application may consume a lot
679 of memory when processing a large Excel file and, as a result, may fail
680 to complete. The following explains why this can occur and how to
681 resolve it.
682
683 "Spreadsheet::ParseExcel" processes an Excel file in two stages. In the
684 first stage it extracts the Excel binary stream from the OLE container
685 file using "OLE::Storage_Lite". In the second stage it parses the
686 binary stream to read workbook, worksheet and cell data which it then
687 stores in memory. The majority of the memory usage is required for
688 storing cell data.
689
690 The reason for this is that as the Excel file is parsed and each cell
691 is encountered a cell handling function creates a relatively large
692 nested cell object that contains the cell value and all of the data
693 that relates to the cell formatting. For large files (a 10MB Excel file
694 on a 256MB system) this overhead can cause the system to grind to a
695 halt.
696
697 However, in a lot of cases when an Excel file is being processed the
698 only information that is required are the cell values. In these cases
699 it is possible to avoid most of the memory overhead by specifying your
700 own cell handling function and by telling Spreadsheet::ParseExcel not
701 to store the parsed cell data. This is achieved by passing a cell
702 handler function to "new()" when creating the parse object. Here is an
703 example.
704
705 #!/usr/bin/perl -w
706
707 use strict;
708 use Spreadsheet::ParseExcel;
709
710 my $parser = Spreadsheet::ParseExcel->new(
711 CellHandler => \&cell_handler,
712 NotSetCell => 1
713 );
714
715 my $workbook = $parser->Parse('file.xls');
716
717 sub cell_handler {
718
719 my $workbook = $_[0];
720 my $sheet_index = $_[1];
721 my $row = $_[2];
722 my $col = $_[3];
723 my $cell = $_[4];
724
725 # Do something useful with the formatted cell value
726 print $cell->value(), "\n";
727
728 }
729
730 The user specified cell handler is passed as a code reference to
731 "new()" along with the parameter "NotSetCell" which tells
732 Spreadsheet::ParseExcel not to store the parsed cell. Note, you don't
733 have to iterate over the rows and columns, this happens automatically
734 as part of the parsing.
735
736 The cell handler is passed 5 arguments. The first, $workbook, is a
737 reference to the "Spreadsheet::ParseExcel::Workbook" object that
738 represent the parsed workbook. This can be used to access any of the
739 "Spreadsheet::ParseExcel::Workbook" methods, see "Workbook". The second
740 $sheet_index is the zero-based index of the worksheet being parsed. The
741 third and fourth, $row and $col, are the zero-based row and column
742 number of the cell. The fifth, $cell, is a reference to the
743 "Spreadsheet::ParseExcel::Cell" object. This is used to extract the
744 data from the cell. See "Cell" for more information.
745
746 This technique can be useful if you are writing an Excel to database
747 filter since you can put your DB calls in the cell handler.
748
749 If you don't want all of the data in the spreadsheet you can add some
750 control logic to the cell handler. For example we can extend the
751 previous example so that it only prints the first 10 rows of the first
752 two worksheets in the parsed workbook by adding some "if()" statements
753 to the cell handler:
754
755 #!/usr/bin/perl -w
756
757 use strict;
758 use Spreadsheet::ParseExcel;
759
760 my $parser = Spreadsheet::ParseExcel->new(
761 CellHandler => \&cell_handler,
762 NotSetCell => 1
763 );
764
765 my $workbook = $parser->Parse('file.xls');
766
767 sub cell_handler {
768
769 my $workbook = $_[0];
770 my $sheet_index = $_[1];
771 my $row = $_[2];
772 my $col = $_[3];
773 my $cell = $_[4];
774
775 # Skip some worksheets and rows (inefficiently).
776 return if $sheet_index >= 3;
777 return if $row >= 10;
778
779 # Do something with the formatted cell value
780 print $cell->value(), "\n";
781
782 }
783
784 However, this still processes the entire workbook. If you wish to save
785 some additional processing time you can abort the parsing after you
786 have read the data that you want, using the workbook "ParseAbort"
787 method:
788
789 #!/usr/bin/perl -w
790
791 use strict;
792 use Spreadsheet::ParseExcel;
793
794 my $parser = Spreadsheet::ParseExcel->new(
795 CellHandler => \&cell_handler,
796 NotSetCell => 1
797 );
798
799 my $workbook = $parser->Parse('file.xls');
800
801 sub cell_handler {
802
803 my $workbook = $_[0];
804 my $sheet_index = $_[1];
805 my $row = $_[2];
806 my $col = $_[3];
807 my $cell = $_[4];
808
809 # Skip some worksheets and rows (more efficiently).
810 if ( $sheet_index >= 1 and $row >= 10 ) {
811 $workbook->ParseAbort(1);
812 return;
813 }
814
815 # Do something with the formatted cell value
816 print $cell->value(), "\n";
817
818 }
819
821 · Issues reported by users:
822 http://rt.cpan.org/Public/Dist/Display.html?Name=Spreadsheet-ParseExcel
823
824 · This module cannot read the values of formulas from files created
825 with Spreadsheet::WriteExcel unless the user specified the values
826 when creating the file (which is generally not the case). The
827 reason for this is that Spreadsheet::WriteExcel writes the formula
828 but not the formula result since it isn't in a position to
829 calculate arbitrary Excel formulas without access to Excel's
830 formula engine.
831
832 · If Excel has date fields where the specified format is equal to the
833 system-default for the short-date locale, Excel does not store the
834 format, but defaults to an internal format which is system
835 dependent. In these cases ParseExcel uses the date format
836 'yyyy-mm-dd'.
837
839 Bugs can be reported via rt.cpan.org. See the following for
840 instructions on bug reporting for Spreadsheet::ParseExcel
841
842 http://rt.cpan.org/Public/Dist/Display.html?Name=Spreadsheet-ParseExcel
843
845 · xls2csv by Ken Prows
846 (http://search.cpan.org/~ken/xls2csv-1.06/script/xls2csv).
847
848 · xls2csv and xlscat by H.Merijn Brand (these utilities are part of
849 Spreadsheet::Read, see below).
850
851 · excel2txt by Ken Youens-Clark,
852 (http://search.cpan.org/~kclark/excel2txt/excel2txt). This is an
853 excellent example of an Excel filter using Spreadsheet::ParseExcel.
854 It can produce CSV, Tab delimited, Html, XML and Yaml.
855
856 · XLSperl by Jon Allen
857 (http://search.cpan.org/~jonallen/XLSperl/bin/XLSperl). This
858 application allows you to use Perl "one-liners" with Microsoft
859 Excel files.
860
861 · Spreadsheet::XLSX
862 (http://search.cpan.org/~dmow/Spreadsheet-XLSX/lib/Spreadsheet/XLSX.pm)
863 by Dmitry Ovsyanko. A module with a similar interface to
864 Spreadsheet::ParseExcel for parsing Excel 2007 XLSX OpenXML files.
865
866 · Spreadsheet::Read
867 (http://search.cpan.org/~hmbrand/Spreadsheet-Read/Read.pm) by
868 H.Merijn Brand. A single interface for reading several different
869 spreadsheet formats.
870
871 · Spreadsheet::WriteExcel
872 (http://search.cpan.org/~jmcnamara/Spreadsheet-WriteExcel/lib/Spreadsheet/WriteExcel.pm).
873 A perl module for creating new Excel files.
874
875 · Spreadsheet::ParseExcel::SaveParser
876 (http://search.cpan.org/~jmcnamara/Spreadsheet-ParseExcel/lib/Spreadsheet/ParseExcel/SaveParser.pm).
877 This is a combination of Spreadsheet::ParseExcel and
878 Spreadsheet::WriteExcel and it allows you to "rewrite" an Excel
879 file. See the following example
880 (http://search.cpan.org/~jmcnamara/Spreadsheet-WriteExcel/lib/Spreadsheet/WriteExcel.pm#MODIFYING_AND_REWRITING_EXCEL_FILES).
881 It is part of the Spreadsheet::ParseExcel distro.
882
883 · Text::CSV_XS
884 (http://search.cpan.org/~hmbrand/Text-CSV_XS/CSV_XS.pm) by H.Merijn
885 Brand. A fast and rigorous module for reading and writing CSV data.
886 Don't consider rolling your own CSV handling, use this module
887 instead.
888
890 There is a Google group for discussing and asking questions about
891 Spreadsheet::ParseExcel. This is a good place to search to see if your
892 question has been asked before:
893 http://groups-beta.google.com/group/spreadsheet-parseexcel/
894
896 If you'd care to donate to the Spreadsheet::ParseExcel project, you can
897 do so via PayPal: http://tinyurl.com/7ayes
898
900 · The current maintenance work is directed towards making the
901 documentation more useful, improving and simplifying the API, and
902 improving the maintainability of the code base. After that new
903 features will be added.
904
905 · Fix open bugs and documentation for SaveParser.
906
907 · Add Formula support, Hyperlink support, Named Range support.
908
909 · Improve Spreadsheet::ParseExcel::SaveParser compatibility with
910 Spreadsheet::WriteExcel.
911
912 · Improve Unicode and other encoding support. This will probably
913 require dropping support for perls prior to 5.8+.
914
916 From Kawai Takanori:
917
918 First of all, I would like to acknowledge the following valuable
919 programs and modules: XHTML, OLE::Storage and Spreadsheet::WriteExcel.
920
921 In no particular order: Yamaji Haruna, Simamoto Takesi, Noguchi Harumi,
922 Ikezawa Kazuhiro, Suwazono Shugo, Hirofumi Morisada, Michael Edwards,
923 Kim Namusk, Slaven Rezic, Grant Stevens, H.Merijn Brand and many many
924 people + Kawai Mikako.
925
927 Because this software is licensed free of charge, there is no warranty
928 for the software, to the extent permitted by applicable law. Except
929 when otherwise stated in writing the copyright holders and/or other
930 parties provide the software "as is" without warranty of any kind,
931 either expressed or implied, including, but not limited to, the implied
932 warranties of merchantability and fitness for a particular purpose. The
933 entire risk as to the quality and performance of the software is with
934 you. Should the software prove defective, you assume the cost of all
935 necessary servicing, repair, or correction.
936
937 In no event unless required by applicable law or agreed to in writing
938 will any copyright holder, or any other party who may modify and/or
939 redistribute the software as permitted by the above licence, be liable
940 to you for damages, including any general, special, incidental, or
941 consequential damages arising out of the use or inability to use the
942 software (including but not limited to loss of data or data being
943 rendered inaccurate or losses sustained by you or third parties or a
944 failure of the software to operate with any other software), even if
945 such holder or other party has been advised of the possibility of such
946 damages.
947
949 Either the Perl Artistic Licence
950 http://dev.perl.org/licenses/artistic.html or the GPL
951 http://www.opensource.org/licenses/gpl-license.php
952
954 Current maintainer 0.40+: John McNamara jmcnamara@cpan.org
955
956 Maintainer 0.27-0.33: Gabor Szabo szabgab@cpan.org
957
958 Original author: Kawai Takanori (Hippo2000) kwitknr@cpan.org
959
961 Copyright (c) 2009 John McNamara
962
963 Copyright (c) 2006-2008 Gabor Szabo
964
965 Copyright (c) 2000-2006 Kawai Takanori
966
967 All rights reserved. This is free software. You may distribute under
968 the terms of either the GNU General Public License or the Artistic
969 License.
970
971
972
973perl v5.12.0 2010-05-06 Spreadsheet::ParseExcel(3)