Text::CSV_PP(3) User Contributed Perl Documentation Text::CSV_PP(3)

6 Text::CSV_PP - Text::CSV_XS compatible pure-Perl module
7
9 use Text::CSV_PP;
10
11 $csv = Text::CSV_PP->new(); # create a new object
12 # If you want to handle non-ascii char.
13 $csv = Text::CSV_PP->new({binary => 1});
14
15 $status = $csv->combine(@columns); # combine columns into a string
16 $line = $csv->string(); # get the combined string
17
18 $status = $csv->parse($line); # parse a CSV string into fields
19 @columns = $csv->fields(); # get the parsed fields
20
21 $status = $csv->status (); # get the most recent status
22 $bad_argument = $csv->error_input (); # get the most recent bad argument
23 $diag = $csv->error_diag (); # if an error occurred, explains WHY
24
25 $status = $csv->print ($io, $colref); # Write an array of fields
26 # immediately to a file $io
27 $colref = $csv->getline ($io); # Read a line from file $io,
28 # parse it and return an array
29 # ref of fields
30 $csv->column_names (@names); # Set column names for getline_hr ()
31 $ref = $csv->getline_hr ($io); # getline (), but returns a hashref
32 $eof = $csv->eof (); # Indicate if last parse or
33 # getline () hit End Of File
34
35 $csv->types(\@t_array); # Set column types
36
Text::CSV_PP is a pure-Perl module that provides facilities for the
composition and decomposition of comma-separated values. It is
(almost) compatible with the much faster Text::CSV_XS, and is mainly
used as its fallback when you use the Text::CSV module without having
installed Text::CSV_XS. Unless you have a reason to use this module
directly, use Text::CSV for the speed boost and portability (or maybe
Text::CSV_XS when you write a one-off script and don't need to care
about portability).
46
47 The following caveats are taken from the doc of Text::CSV_XS.
48
49 Embedded newlines
50 Important Note: The default behavior is to accept only ASCII
51 characters in the range from 0x20 (space) to 0x7E (tilde). This means
52 that the fields can not contain newlines. If your data contains
53 newlines embedded in fields, or characters above 0x7E (tilde), or
54 binary data, you must set "binary => 1" in the call to "new". To cover
55 the widest range of parsing options, you will always want to set
56 binary.
57
But you still have the problem that you have to pass a correct line to
the "parse" method, which is more complicated than typical usage
suggests:
61
62 my $csv = Text::CSV_PP->new ({ binary => 1, eol => $/ });
63 while (<>) { # WRONG!
64 $csv->parse ($_);
65 my @fields = $csv->fields ();
66 }
67
This will break, as the "while" might read broken lines: it does not
care about the quoting. If you need to support embedded newlines, the
way to go is to not pass "eol" to the parser (it accepts "\n", "\r",
and "\r\n" by default) and then use "getline":
72
73 my $csv = Text::CSV_PP->new ({ binary => 1 });
74 open my $fh, "<", $file or die "$file: $!";
75 while (my $row = $csv->getline ($fh)) {
76 my @fields = @$row;
77 }
78
79 The old(er) way of using global file handles is still supported
80
81 while (my $row = $csv->getline (*ARGV)) { ... }
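
As a minimal sketch of the "getline" approach described above, the following reads two records from an in-memory handle, the first of which has a newline embedded in a quoted field. The sample data and variable names are assumptions made for illustration only.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# Hypothetical sample data: record 1 has a newline inside a quoted field.
my $data = qq{1,"line one\nline two",3\n4,five,6\n};
open my $fh, "<", \$data or die "in-memory handle: $!";

# No "eol" is passed, so the parser accepts "\n", "\r", and "\r\n".
my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });
my @rows;
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;    # each $row is an array ref of fields
}
close $fh;

printf "%d records read\n", scalar @rows;
```

Because the parser itself reads from the handle, it can keep reading when a quoted field is still open at a line break, which a plain "while (<>)" loop cannot do.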
82
83 Unicode
84 Unicode is only tested to work with perl-5.8.2 and up.
85
86 See also "BOM".
87
88 The simplest way to ensure the correct encoding is used for in- and
89 output is by either setting layers on the filehandles, or setting the
90 "encoding" argument for "csv".
91
92 open my $fh, "<:encoding(UTF-8)", "in.csv" or die "in.csv: $!";
93 or
94 my $aoa = csv (in => "in.csv", encoding => "UTF-8");
95
96 open my $fh, ">:encoding(UTF-8)", "out.csv" or die "out.csv: $!";
97 or
98 csv (in => $aoa, out => "out.csv", encoding => "UTF-8");
99
On parsing (both for "getline" and "parse"), if the source is marked
as being UTF8, then all fields that are marked binary will also be
marked UTF8.

On combining ("print" and "combine"): if any of the combining fields
was marked UTF8, the resulting string will be marked as UTF8. Note
however that any field before the first field marked UTF8 that
contains 8-bit characters not upgraded to UTF8 will remain "bytes" in
the resulting string too, possibly causing unexpected errors. If you
pass data of different encodings, or you don't know whether the
encodings differ, force the data to be upgraded before you pass it
on:
112
113 $csv->print ($fh, [ map { utf8::upgrade (my $x = $_); $x } @data ]);
114
115 For complete control over encoding, please use Text::CSV::Encoded:
116
117 use Text::CSV::Encoded;
118 my $csv = Text::CSV::Encoded->new ({
119 encoding_in => "iso-8859-1", # the encoding comes into Perl
120 encoding_out => "cp1252", # the encoding comes out of Perl
121 });
122
123 $csv = Text::CSV::Encoded->new ({ encoding => "utf8" });
124 # combine () and print () accept *literally* utf8 encoded data
125 # parse () and getline () return *literally* utf8 encoded data
126
127 $csv = Text::CSV::Encoded->new ({ encoding => undef }); # default
128 # combine () and print () accept UTF8 marked data
129 # parse () and getline () return UTF8 marked data
130
131 BOM
132 BOM (or Byte Order Mark) handling is available only inside the
133 "header" method. This method supports the following encodings:
134 "utf-8", "utf-1", "utf-32be", "utf-32le", "utf-16be", "utf-16le",
135 "utf-ebcdic", "scsu", "bocu-1", and "gb-18030". See Wikipedia
136 <https://en.wikipedia.org/wiki/Byte_order_mark>.
137
138 If a file has a BOM, the easiest way to deal with that is
139
140 my $aoh = csv (in => $file, detect_bom => 1);
141
142 All records will be encoded based on the detected BOM.
143
144 This implies a call to the "header" method, which defaults to also
145 set the "column_names". So this is not the same as
146
147 my $aoh = csv (in => $file, headers => "auto");
148
which only reads the first record to set "column_names" but ignores
any BOM that may be present.
151
153 This section is taken from Text::CSV_XS.
154
155 version
156 (Class method) Returns the current module version.
157
158 new
159 (Class method) Returns a new instance of class Text::CSV_PP. The
160 attributes are described by the (optional) hash ref "\%attr".
161
162 my $csv = Text::CSV_PP->new ({ attributes ... });
163
164 The following attributes are available:
165
166 eol
167
168 my $csv = Text::CSV_PP->new ({ eol => $/ });
169 $csv->eol (undef);
170 my $eol = $csv->eol;
171
172 The end-of-line string to add to rows for "print" or the record
173 separator for "getline".
174
When not passed in a parser instance, the default behavior is to
accept "\n", "\r", and "\r\n", so it is probably safer to not specify
"eol" at all. Passing "undef" or the empty string behaves the same.
178
179 When not passed in a generating instance, records are not terminated
180 at all, so it is probably wise to pass something you expect. A safe
181 choice for "eol" on output is either $/ or "\r\n".
182
183 Common values for "eol" are "\012" ("\n" or Line Feed), "\015\012"
184 ("\r\n" or Carriage Return, Line Feed), and "\015" ("\r" or Carriage
185 Return). The "eol" attribute cannot exceed 7 (ASCII) characters.
186
If both $/ and "eol" equal "\015", lines that end in only a Carriage
Return without Line Feed will be parsed correctly.
189
190 sep_char
191
192 my $csv = Text::CSV_PP->new ({ sep_char => ";" });
193 $csv->sep_char (";");
194 my $c = $csv->sep_char;
195
The char used to separate fields, by default a comma (","). Limited
to a single-byte character, usually in the range from 0x20 (space) to
0x7E (tilde). When longer sequences are required, use "sep".
199
200 The separation character can not be equal to the quote character or to
201 the escape character.
202
203 sep
204
205 my $csv = Text::CSV_PP->new ({ sep => "\N{FULLWIDTH COMMA}" });
206 $csv->sep (";");
207 my $sep = $csv->sep;
208
209 The chars used to separate fields, by default undefined. Limited to 8
210 bytes.
211
212 When set, overrules "sep_char". If its length is one byte it acts as
213 an alias to "sep_char".
214
215 quote_char
216
217 my $csv = Text::CSV_PP->new ({ quote_char => "'" });
218 $csv->quote_char (undef);
219 my $c = $csv->quote_char;
220
221 The character to quote fields containing blanks or binary data, by
222 default the double quote character ("""). A value of undef suppresses
223 quote chars (for simple cases only). Limited to a single-byte
224 character, usually in the range from 0x20 (space) to 0x7E (tilde).
225 When longer sequences are required, use "quote".
226
227 "quote_char" can not be equal to "sep_char".
228
229 quote
230
231 my $csv = Text::CSV_PP->new ({ quote => "\N{FULLWIDTH QUOTATION MARK}" });
232 $csv->quote ("'");
233 my $quote = $csv->quote;
234
235 The chars used to quote fields, by default undefined. Limited to 8
236 bytes.
237
238 When set, overrules "quote_char". If its length is one byte it acts as
239 an alias to "quote_char".
240
241 escape_char
242
243 my $csv = Text::CSV_PP->new ({ escape_char => "\\" });
244 $csv->escape_char (":");
245 my $c = $csv->escape_char;
246
247 The character to escape certain characters inside quoted fields.
248 This is limited to a single-byte character, usually in the range
249 from 0x20 (space) to 0x7E (tilde).
250
251 The "escape_char" defaults to being the double-quote mark ("""). In
252 other words the same as the default "quote_char". This means that
253 doubling the quote mark in a field escapes it:
254
255 "foo","bar","Escape ""quote mark"" with two ""quote marks""","baz"
256
257 If you change the "quote_char" without changing the
258 "escape_char", the "escape_char" will still be the double-quote
259 ("""). If instead you want to escape the "quote_char" by doubling it
260 you will need to also change the "escape_char" to be the same as what
261 you have changed the "quote_char" to.
262
Setting "escape_char" to "undef" or "" will disable escaping completely
and is greatly discouraged. This will also disable "escape_null".
265
266 The escape character can not be equal to the separation character.
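
The interaction between "quote_char" and "escape_char" can be sketched as follows. The example changes both attributes to the single quote so that a quote inside a field is escaped by doubling it. The field values are made up for illustration.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# Use the single quote both as quote_char and as escape_char.
my $csv = Text::CSV_PP->new ({ quote_char => "'", escape_char => "'" })
    or die "" . Text::CSV_PP->error_diag ();

$csv->combine ("it's", "plain")
    or die "combine failed on: " . $csv->error_input ();
my $line = $csv->string ();      # the quote inside the field is doubled
print "$line\n";                 # 'it''s',plain

$csv->parse ($line) or die "parse failed: " . $csv->error_diag ();
my @fields = $csv->fields ();    # back to ("it's", "plain")
```

Had only "quote_char" been changed, the embedded quote would have been escaped with the default double-quote instead of being doubled.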
267
268 binary
269
270 my $csv = Text::CSV_PP->new ({ binary => 1 });
271 $csv->binary (0);
272 my $f = $csv->binary;
273
274 If this attribute is 1, you may use binary characters in quoted
275 fields, including line feeds, carriage returns and "NULL" bytes. (The
276 latter could be escaped as ""0".) By default this feature is off.
277
278 If a string is marked UTF8, "binary" will be turned on automatically
279 when binary characters other than "CR" and "NL" are encountered. Note
280 that a simple string like "\x{00a0}" might still be binary, but not
281 marked UTF8, so setting "{ binary => 1 }" is still a wise option.
282
283 strict
284
285 my $csv = Text::CSV_PP->new ({ strict => 1 });
286 $csv->strict (0);
287 my $f = $csv->strict;
288
289 If this attribute is set to 1, any row that parses to a different
290 number of fields than the previous row will cause the parser to throw
291 error 2014.
292
293 formula_handling
294
295 formula
296
297 my $csv = Text::CSV_PP->new ({ formula => "none" });
298 $csv->formula ("none");
299 my $f = $csv->formula;
300
301 This defines the behavior of fields containing formulas. As formulas
302 are considered dangerous in spreadsheets, this attribute can define an
303 optional action to be taken if a field starts with an equal sign ("=").
304
305 For purpose of code-readability, this can also be written as
306
307 my $csv = Text::CSV_PP->new ({ formula_handling => "none" });
308 $csv->formula_handling ("none");
309 my $f = $csv->formula_handling;
310
311 Possible values for this attribute are
312
313 none
314 Take no specific action. This is the default.
315
316 $csv->formula ("none");
317
318 die
319 Cause the process to "die" whenever a leading "=" is encountered.
320
321 $csv->formula ("die");
322
323 croak
324 Cause the process to "croak" whenever a leading "=" is encountered.
325 (See Carp)
326
327 $csv->formula ("croak");
328
329 diag
330 Report position and content of the field whenever a leading "=" is
331 found. The value of the field is unchanged.
332
333 $csv->formula ("diag");
334
335 empty
336 Replace the content of fields that start with a "=" with the empty
337 string.
338
339 $csv->formula ("empty");
340 $csv->formula ("");
341
342 undef
343 Replace the content of fields that start with a "=" with "undef".
344
345 $csv->formula ("undef");
346 $csv->formula (undef);
347
All other values will give a warning and then fall back to "diag".
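
A short sketch of the "empty" action, assuming an installed release that supports the "formula" attribute as documented above. The input line is hypothetical.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# Replace the content of any field that starts with "=" by the empty string.
my $csv = Text::CSV_PP->new ({ formula => "empty" })
    or die "" . Text::CSV_PP->error_diag ();

$csv->parse (q{1,=2+3,4}) or die "parse failed: " . $csv->error_diag ();
my @fields = $csv->fields ();

# Show each field in brackets so that the emptied field stays visible.
print join ("", map { "[$_]" } @fields), "\n";
```

Using "die" or "croak" instead makes the same input abort processing, which may be preferable when formulas are never expected in the data.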
349
350 decode_utf8
351
352 my $csv = Text::CSV_PP->new ({ decode_utf8 => 1 });
353 $csv->decode_utf8 (0);
354 my $f = $csv->decode_utf8;
355
This attribute defaults to TRUE.
357
358 While parsing, fields that are valid UTF-8, are automatically set to
359 be UTF-8, so that
360
361 $csv->parse ("\xC4\xA8\n");
362
363 results in
364
365 PV("\304\250"\0) [UTF8 "\x{128}"]
366
367 Sometimes it might not be a desired action. To prevent those upgrades,
368 set this attribute to false, and the result will be
369
370 PV("\304\250"\0)
371
372 auto_diag
373
374 my $csv = Text::CSV_PP->new ({ auto_diag => 1 });
375 $csv->auto_diag (2);
376 my $l = $csv->auto_diag;
377
Setting this attribute to a number between 1 and 9 causes "error_diag"
to be automatically called in void context upon errors.
380
381 In case of error "2012 - EOF", this call will be void.
382
383 If "auto_diag" is set to a numeric value greater than 1, it will "die"
384 on errors instead of "warn". If set to anything unrecognized, it will
385 be silently ignored.
386
387 Future extensions to this feature will include more reliable auto-
388 detection of "autodie" being active in the scope of which the error
389 occurred which will increment the value of "auto_diag" with 1 the
390 moment the error is detected.
391
392 diag_verbose
393
394 my $csv = Text::CSV_PP->new ({ diag_verbose => 1 });
395 $csv->diag_verbose (2);
396 my $l = $csv->diag_verbose;
397
398 Set the verbosity of the output triggered by "auto_diag". Currently
399 only adds the current input-record-number (if known) to the
400 diagnostic output with an indication of the position of the error.
401
402 blank_is_undef
403
404 my $csv = Text::CSV_PP->new ({ blank_is_undef => 1 });
405 $csv->blank_is_undef (0);
406 my $f = $csv->blank_is_undef;
407
408 Under normal circumstances, "CSV" data makes no distinction between
409 quoted- and unquoted empty fields. These both end up in an empty
410 string field once read, thus
411
412 1,"",," ",2
413
414 is read as
415
416 ("1", "", "", " ", "2")
417
418 When writing "CSV" files with either "always_quote" or "quote_empty"
419 set, the unquoted empty field is the result of an undefined value.
420 To enable this distinction when reading "CSV" data, the
421 "blank_is_undef" attribute will cause unquoted empty fields to be set
422 to "undef", causing the above to be parsed as
423
424 ("1", "", undef, " ", "2")
425
Note that this is specifically important when loading "CSV" fields
into a database that allows "NULL" values, as the perl equivalent for
"NULL" is "undef" in DBI land.
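
The distinction can be checked with a few lines of Perl. This sketch parses the sample line from above with "blank_is_undef" enabled and prints which fields come back undefined. The output format is an arbitrary choice made for illustration.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ blank_is_undef => 1 })
    or die "" . Text::CSV_PP->error_diag ();

$csv->parse (q{1,"",," ",2}) or die "parse failed: " . $csv->error_diag ();
my @fields = $csv->fields ();

# Print defined fields in brackets and undefined fields as the word undef.
print join ("|", map { defined $_ ? "[$_]" : "undef" } @fields), "\n";
# [1]|[]|undef|[ ]|[2]
```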
429
430 empty_is_undef
431
432 my $csv = Text::CSV_PP->new ({ empty_is_undef => 1 });
433 $csv->empty_is_undef (0);
434 my $f = $csv->empty_is_undef;
435
436 Going one step further than "blank_is_undef", this attribute
437 converts all empty fields to "undef", so
438
439 1,"",," ",2
440
441 is read as
442
443 (1, undef, undef, " ", 2)
444
Note that this affects only fields that are originally empty, not
fields that are empty after stripping allowed whitespace. YMMV.
447
448 allow_whitespace
449
450 my $csv = Text::CSV_PP->new ({ allow_whitespace => 1 });
451 $csv->allow_whitespace (0);
452 my $f = $csv->allow_whitespace;
453
454 When this option is set to true, the whitespace ("TAB"'s and
455 "SPACE"'s) surrounding the separation character is removed when
456 parsing. If either "TAB" or "SPACE" is one of the three characters
457 "sep_char", "quote_char", or "escape_char" it will not be considered
458 whitespace.
459
460 Now lines like:
461
462 1 , "foo" , bar , 3 , zapp
463
464 are parsed as valid "CSV", even though it violates the "CSV" specs.
465
466 Note that all whitespace is stripped from both start and end of
467 each field. That would make it more than a feature to enable parsing
468 bad "CSV" lines, as
469
470 1, 2.0, 3, ape , monkey
471
472 will now be parsed as
473
474 ("1", "2.0", "3", "ape", "monkey")
475
476 even if the original line was perfectly acceptable "CSV".
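
The effect can be sketched by parsing the first sample line from above with "allow_whitespace" enabled. The separator used to print the result is an arbitrary choice.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ allow_whitespace => 1 })
    or die "" . Text::CSV_PP->error_diag ();

# Whitespace around the separators, and around the quoted field, is dropped.
$csv->parse (q{1 , "foo" , bar , 3 , zapp})
    or die "parse failed: " . $csv->error_diag ();
my @fields = $csv->fields ();

print join ("|", @fields), "\n";    # 1|foo|bar|3|zapp
```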
477
478 allow_loose_quotes
479
480 my $csv = Text::CSV_PP->new ({ allow_loose_quotes => 1 });
481 $csv->allow_loose_quotes (0);
482 my $f = $csv->allow_loose_quotes;
483
484 By default, parsing unquoted fields containing "quote_char" characters
485 like
486
487 1,foo "bar" baz,42
488
489 would result in parse error 2034. Though it is still bad practice to
490 allow this format, we cannot help the fact that some vendors
491 make their applications spit out lines styled this way.
492
493 If there is really bad "CSV" data, like
494
495 1,"foo "bar" baz",42
496
497 or
498
499 1,""foo bar baz"",42
500
501 there is a way to get this data-line parsed and leave the quotes inside
502 the quoted field as-is. This can be achieved by setting
503 "allow_loose_quotes" AND making sure that the "escape_char" is not
504 equal to "quote_char".
505
506 allow_loose_escapes
507
508 my $csv = Text::CSV_PP->new ({ allow_loose_escapes => 1 });
509 $csv->allow_loose_escapes (0);
510 my $f = $csv->allow_loose_escapes;
511
512 Parsing fields that have "escape_char" characters that escape
513 characters that do not need to be escaped, like:
514
515 my $csv = Text::CSV_PP->new ({ escape_char => "\\" });
516 $csv->parse (qq{1,"my bar\'s",baz,42});
517
518 would result in parse error 2025. Though it is bad practice to allow
519 this format, this attribute enables you to treat all escape character
520 sequences equal.
521
522 allow_unquoted_escape
523
524 my $csv = Text::CSV_PP->new ({ allow_unquoted_escape => 1 });
525 $csv->allow_unquoted_escape (0);
526 my $f = $csv->allow_unquoted_escape;
527
A backward compatibility issue where "escape_char" differs from
"quote_char" prevents "escape_char" from being in the first position
of a field. If "quote_char" is equal to the default """ and
"escape_char" is set to "\", this would be illegal:
532
533 1,\0,2
534
535 Setting this attribute to 1 might help to overcome issues with
536 backward compatibility and allow this style.
537
538 always_quote
539
540 my $csv = Text::CSV_PP->new ({ always_quote => 1 });
541 $csv->always_quote (0);
542 my $f = $csv->always_quote;
543
By default the generated fields are quoted only if they need to be,
for example if they contain the separator character. If you set this
attribute to 1 then all defined fields will be quoted. ("undef" fields
are not quoted, see "blank_is_undef"). This often makes it easier to
handle exported data in external applications.
549
550 quote_space
551
552 my $csv = Text::CSV_PP->new ({ quote_space => 1 });
553 $csv->quote_space (0);
554 my $f = $csv->quote_space;
555
By default, a space in a field would trigger quotation. As no rule
exists that requires this in "CSV", nor any that forbids it, the
default is true for safety. You can exclude the space from this
trigger by setting this attribute to 0.
560
561 quote_empty
562
563 my $csv = Text::CSV_PP->new ({ quote_empty => 1 });
564 $csv->quote_empty (0);
565 my $f = $csv->quote_empty;
566
567 By default the generated fields are quoted only if they need to be.
568 An empty (defined) field does not need quotation. If you set this
569 attribute to 1 then empty defined fields will be quoted. ("undef"
570 fields are not quoted, see "blank_is_undef"). See also "always_quote".
571
572 quote_binary
573
574 my $csv = Text::CSV_PP->new ({ quote_binary => 1 });
575 $csv->quote_binary (0);
576 my $f = $csv->quote_binary;
577
578 By default, all "unsafe" bytes inside a string cause the combined
579 field to be quoted. By setting this attribute to 0, you can disable
580 that trigger for bytes >= 0x7F.
581
582 escape_null
583
584 my $csv = Text::CSV_PP->new ({ escape_null => 1 });
585 $csv->escape_null (0);
586 my $f = $csv->escape_null;
587
588 By default, a "NULL" byte in a field would be escaped. This option
589 enables you to treat the "NULL" byte as a simple binary character in
590 binary mode (the "{ binary => 1 }" is set). The default is true. You
591 can prevent "NULL" escapes by setting this attribute to 0.
592
593 When the "escape_char" attribute is set to undefined, this attribute
594 will be set to false.
595
596 The default setting will encode "=\x00=" as
597
598 "="0="
599
With "escape_null" set to 0, this will result in
601
602 "=\x00="
603
604 The default when using the "csv" function is "false".
605
606 For backward compatibility reasons, the deprecated old name
607 "quote_null" is still recognized.
608
609 keep_meta_info
610
611 my $csv = Text::CSV_PP->new ({ keep_meta_info => 1 });
612 $csv->keep_meta_info (0);
613 my $f = $csv->keep_meta_info;
614
615 By default, the parsing of input records is as simple and fast as
616 possible. However, some parsing information - like quotation of the
617 original field - is lost in that process. Setting this flag to true
618 enables retrieving that information after parsing with the methods
619 "meta_info", "is_quoted", and "is_binary" described below. Default is
620 false for performance.
621
If you set this attribute to a value greater than 9, then you can
control output quotation style like it was used in the input of the
last parsed record (unless quotation was added because of other
reasons).
626
    my $csv = Text::CSV_PP->new ({
        binary         => 1,
        keep_meta_info => 1,
        quote_space    => 0,
        });

    # parse () returns a status; the fields are fetched with fields ()
    $csv->parse (q{1,,"", ," ",f,"g","h""h",help,"help"});
    my @row = $csv->fields ();

    $csv->print (*STDOUT, \@row);
    # 1,,, , ,f,g,"h""h",help,help
    $csv->keep_meta_info (11);
    $csv->print (*STDOUT, \@row);
    # 1,,"", ," ",f,"g","h""h",help,"help"
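
As a small illustration of what "keep_meta_info" retains, the following sketch asks "is_quoted" about each field of a parsed record. The input line is hypothetical, and field indexes are 0-based.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ keep_meta_info => 1 })
    or die "" . Text::CSV_PP->error_diag ();

$csv->parse (q{1,"two",3}) or die "parse failed: " . $csv->error_diag ();

# Record for every field whether it was quoted in the input.
my @quoted = map { $csv->is_quoted ($_) ? 1 : 0 } 0 .. 2;
print "@quoted\n";    # 0 1 0
```

Without "keep_meta_info" this information is discarded during parsing, so the calls would not report the second field as quoted.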
640
641 undef_str
642
643 my $csv = Text::CSV_PP->new ({ undef_str => "\\N" });
644 $csv->undef_str (undef);
645 my $s = $csv->undef_str;
646
647 This attribute optionally defines the output of undefined fields. The
648 value passed is not changed at all, so if it needs quotation, the
649 quotation needs to be included in the value of the attribute. Use with
650 caution, as passing a value like ",",,,,""" will for sure mess up
651 your output. The default for this attribute is "undef", meaning no
652 special treatment.
653
654 This attribute is useful when exporting CSV data to be imported in
655 custom loaders, like for MySQL, that recognize special sequences for
656 "NULL" data.
657
658 verbatim
659
660 my $csv = Text::CSV_PP->new ({ verbatim => 1 });
661 $csv->verbatim (0);
662 my $f = $csv->verbatim;
663
664 This is a quite controversial attribute to set, but makes some hard
665 things possible.
666
667 The rationale behind this attribute is to tell the parser that the
668 normally special characters newline ("NL") and Carriage Return ("CR")
669 will not be special when this flag is set, and be dealt with as being
670 ordinary binary characters. This will ease working with data with
671 embedded newlines.
672
673 When "verbatim" is used with "getline", "getline" auto-"chomp"'s
674 every line.
675
676 Imagine a file format like
677
678 M^^Hans^Janssen^Klas 2\n2A^Ja^11-06-2007#\r\n
679
where the line ending is a very specific "#\r\n", and the sep_char is
a "^" (caret). None of the fields is quoted, but embedded binary
data is likely to be present. With the specific line ending, this
should not be too hard to detect.

By default, Text::CSV_PP's parse function is instructed to only know
about "\n" and "\r" to be legal line endings, and so has to deal with
the embedded newline as a real "end-of-line", so it can scan the next
line if binary is true, and the newline is inside a quoted field. With
this option, we tell "parse" to parse the line as if "\n" is nothing
more than a binary character.

For "parse" this means that the parser no longer has any notion of
line endings, and "getline" "chomp"s line endings on reading.
694
695 types
696
697 A set of column types; the attribute is immediately passed to the
698 "types" method.
699
700 callbacks
701
702 See the "Callbacks" section below.
703
704 accessors
705
706 To sum it up,
707
708 $csv = Text::CSV_PP->new ();
709
710 is equivalent to
711
712 $csv = Text::CSV_PP->new ({
713 eol => undef, # \r, \n, or \r\n
714 sep_char => ',',
715 sep => undef,
716 quote_char => '"',
717 quote => undef,
718 escape_char => '"',
719 binary => 0,
720 decode_utf8 => 1,
721 auto_diag => 0,
722 diag_verbose => 0,
723 blank_is_undef => 0,
724 empty_is_undef => 0,
725 allow_whitespace => 0,
726 allow_loose_quotes => 0,
727 allow_loose_escapes => 0,
728 allow_unquoted_escape => 0,
729 always_quote => 0,
730 quote_empty => 0,
731 quote_space => 1,
732 escape_null => 1,
733 quote_binary => 1,
734 keep_meta_info => 0,
735 verbatim => 0,
736 undef_str => undef,
737 types => undef,
738 callbacks => undef,
739 });
740
741 For all of the above mentioned flags, an accessor method is available
742 where you can inquire the current value, or change the value
743
744 my $quote = $csv->quote_char;
745 $csv->binary (1);
746
747 It is not wise to change these settings halfway through writing "CSV"
748 data to a stream. If however you want to create a new stream using the
749 available "CSV" object, there is no harm in changing them.
750
751 If the "new" constructor call fails, it returns "undef", and makes
752 the fail reason available through the "error_diag" method.
753
754 $csv = Text::CSV_PP->new ({ ecs_char => 1 }) or
755 die "".Text::CSV_PP->error_diag ();
756
757 "error_diag" will return a string like
758
759 "INI - Unknown attribute 'ecs_char'"
760
761 known_attributes
762 @attr = Text::CSV_PP->known_attributes;
763 @attr = Text::CSV_PP::known_attributes;
764 @attr = $csv->known_attributes;
765
766 This method will return an ordered list of all the supported
767 attributes as described above. This can be useful for knowing what
768 attributes are valid in classes that use or extend Text::CSV_PP.
769
770 print
771 $status = $csv->print ($fh, $colref);
772
773 Similar to "combine" + "string" + "print", but much more efficient.
774 It expects an array ref as input (not an array!) and the resulting
775 string is not really created, but immediately written to the $fh
776 object, typically an IO handle or any other object that offers a
777 "print" method.
778
779 For performance reasons "print" does not create a result string, so
780 all "string", "status", "fields", and "error_input" methods will return
781 undefined information after executing this method.
782
783 If $colref is "undef" (explicit, not through a variable argument) and
784 "bind_columns" was used to specify fields to be printed, it is
785 possible to make performance improvements, as otherwise data would have
786 to be copied as arguments to the method call:
787
788 $csv->bind_columns (\($foo, $bar));
789 $status = $csv->print ($fh, undef);
790
791 A short benchmark
792
793 my @data = ("aa" .. "zz");
794 $csv->bind_columns (\(@data));
795
796 $csv->print ($fh, [ @data ]); # 11800 recs/sec
797 $csv->print ($fh, \@data ); # 57600 recs/sec
798 $csv->print ($fh, undef ); # 48500 recs/sec
799
800 say
801 $status = $csv->say ($fh, $colref);
802
803 Like "print", but "eol" defaults to "$\".
804
805 print_hr
806 $csv->print_hr ($fh, $ref);
807
808 Provides an easy way to print a $ref (as fetched with "getline_hr")
809 provided the column names are set with "column_names".
810
811 It is just a wrapper method with basic parameter checks over
812
813 $csv->print ($fh, [ map { $ref->{$_} } $csv->column_names ]);
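
A minimal sketch of "print_hr" in use, writing a header line and one record to an in-memory handle. The column names and the record are invented for illustration.

```perl
use strict;
use warnings;
use IO::Handle;     # gives lexical handles a print method on older perls
use Text::CSV_PP;

my $out = "";
open my $fh, ">", \$out or die "in-memory handle: $!";

my $csv = Text::CSV_PP->new ({ binary => 1, eol => "\n" })
    or die "" . Text::CSV_PP->error_diag ();
$csv->column_names (qw( code name price ));

$csv->print    ($fh, [ $csv->column_names ]);    # header line
$csv->print_hr ($fh, { code => "A1", name => "Widget", price => "9.95" });
close $fh;

print $out;
```

The hash keys are looked up in the order given to "column_names", so the order of the keys in the hash itself does not matter.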
814
815 combine
816 $status = $csv->combine (@fields);
817
818 This method constructs a "CSV" record from @fields, returning success
819 or failure. Failure can result from lack of arguments or an argument
820 that contains an invalid character. Upon success, "string" can be
821 called to retrieve the resultant "CSV" string. Upon failure, the
822 value returned by "string" is undefined and "error_input" could be
823 called to retrieve the invalid argument.
824
825 string
826 $line = $csv->string ();
827
828 This method returns the input to "parse" or the resultant "CSV"
829 string of "combine", whichever was called more recently.
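
The "combine" and "string" pair can be sketched as a round trip with "parse" and "fields". The field values below are arbitrary. With the default attributes, fields containing a space or a quote are quoted, and embedded quotes are doubled.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ()
    or die "" . Text::CSV_PP->error_diag ();

my @in = ("a", "b c", 'say "hi"');
$csv->combine (@in) or die "combine failed on: " . $csv->error_input ();
my $line = $csv->string ();
print "$line\n";    # a,"b c","say ""hi"""

# Parsing the generated line gives the original fields back.
$csv->parse ($line) or die "parse failed: " . $csv->error_diag ();
my @out = $csv->fields ();
```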
830
831 getline
832 $colref = $csv->getline ($fh);
833
834 This is the counterpart to "print", as "parse" is the counterpart to
835 "combine": it parses a row from the $fh handle using the "getline"
836 method associated with $fh and parses this row into an array ref.
837 This array ref is returned by the function or "undef" for failure.
838 When $fh does not support "getline", you are likely to hit errors.
839
840 When fields are bound with "bind_columns" the return value is a
841 reference to an empty list.
842
843 The "string", "fields", and "status" methods are meaningless again.
844
845 getline_all
846 $arrayref = $csv->getline_all ($fh);
847 $arrayref = $csv->getline_all ($fh, $offset);
848 $arrayref = $csv->getline_all ($fh, $offset, $length);
849
850 This will return a reference to a list of getline ($fh) results. In
851 this call, "keep_meta_info" is disabled. If $offset is negative, as
852 with "splice", only the last "abs ($offset)" records of $fh are taken
853 into consideration.
854
855 Given a CSV file with 10 lines:
856
    lines call
    ----- ---------------------------------------------------------
    0..9  $csv->getline_all ($fh)         # all
    0..9  $csv->getline_all ($fh,  0)     # all
    8..9  $csv->getline_all ($fh,  8)     # start at 8
    -     $csv->getline_all ($fh,  0,  0) # start at 0 first 0 rows
    0..4  $csv->getline_all ($fh,  0,  5) # start at 0 first 5 rows
    4..5  $csv->getline_all ($fh,  4,  2) # start at 4 first 2 rows
    8..9  $csv->getline_all ($fh, -2)     # last 2 rows
    6..7  $csv->getline_all ($fh, -4,  2) # first 2 of last 4 rows
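
One row of the table above can be tried out with a short script. This sketch builds a hypothetical 10-line input in memory and fetches two records starting at offset 4.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# Hypothetical input: 10 lines, "row0,0" through "row9,9".
my $data = join "", map { "row$_,$_\n" } 0 .. 9;
open my $fh, "<", \$data or die "in-memory handle: $!";

my $csv = Text::CSV_PP->new ({ binary => 1 })
    or die "" . Text::CSV_PP->error_diag ();

# Skip 4 records, then return the next 2.
my $rows = $csv->getline_all ($fh, 4, 2);
close $fh;

print join (" ", map { $_->[0] } @$rows), "\n";    # row4 row5
```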
867
868 getline_hr
869 The "getline_hr" and "column_names" methods work together to allow you
870 to have rows returned as hashrefs. You must call "column_names" first
871 to declare your column names.
872
873 $csv->column_names (qw( code name price description ));
874 $hr = $csv->getline_hr ($fh);
875 print "Price for $hr->{name} is $hr->{price} EUR\n";
876
877 "getline_hr" will croak if called before "column_names".
878
Note that "getline_hr" creates a hashref for every row and will be
much slower than the combined use of "bind_columns" and "getline",
which still offers the same ease of use of a hashref inside the loop:
882
883 my @cols = @{$csv->getline ($fh)};
884 $csv->column_names (@cols);
885 while (my $row = $csv->getline_hr ($fh)) {
886 print $row->{price};
887 }
888
889 Could easily be rewritten to the much faster:
890
891 my @cols = @{$csv->getline ($fh)};
892 my $row = {};
893 $csv->bind_columns (\@{$row}{@cols});
894 while ($csv->getline ($fh)) {
895 print $row->{price};
896 }
897
Your mileage may vary for the size of the data and the number of rows.
With perl-5.14.2 the comparison for a 100_000 line file with 14 columns:

                Rate hashrefs getlines
    hashrefs 1.00/s       --     -76%
    getlines 4.15/s     313%       --
904
905 getline_hr_all
906 $arrayref = $csv->getline_hr_all ($fh);
907 $arrayref = $csv->getline_hr_all ($fh, $offset);
908 $arrayref = $csv->getline_hr_all ($fh, $offset, $length);
909
910 This will return a reference to a list of getline_hr ($fh) results.
911 In this call, "keep_meta_info" is disabled.
912
913 parse
914 $status = $csv->parse ($line);
915
916 This method decomposes a "CSV" string into fields, returning success
917 or failure. Failure can result from a missing argument or from an
918 improperly formatted "CSV" string. Upon success, "fields" can be
919 called to retrieve the decomposed fields. Upon failure, calling "fields"
920 will return undefined data and "error_input" can be called to
921 retrieve the invalid argument.
922
923 You may use the "types" method for setting column types. See the
924 description of "types" below.
925
926 The $line argument is supposed to be a simple scalar. Everything else
927 is supposed to croak and set error 1500.
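
The success and failure paths described above could be handled along these lines. This is a sketch only; the sample line is an assumption chosen to show a quoted field that contains the separator.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv  = Text::CSV_PP->new ({ binary => 1 });
my $line = q{1,"Widget, large",9.95};    # illustrative input

# parse () returns a true value on success; fields () then holds the result.
my @fields;
if ($csv->parse ($line)) {
    @fields = $csv->fields ();
    print "Field $_: $fields[$_]\n" for 0 .. $#fields;
    }
else {
    # error_input () returns the argument that could not be parsed.
    warn "Could not parse: ", $csv->error_input (), "\n";
    }
```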
928
929 fragment
930 This function tries to implement RFC7111 (URI Fragment Identifiers for
931 the text/csv Media Type) - http://tools.ietf.org/html/rfc7111
932
933 my $AoA = $csv->fragment ($fh, $spec);
934
935 In specifications, "*" is used to specify the last item, a dash ("-")
936 to indicate a range. All indices are 1-based: the first row or
937 column has index 1. Selections can be combined with the semi-colon
938 (";").
939
940 When using this method in combination with "column_names", the
941 returned reference will point to a list of hashes instead of a list
942 of lists. A disjointed cell-based combined selection might return
943 rows with different number of columns making the use of hashes
944 unpredictable.
945
946 $csv->column_names ("Name", "Age");
947 my $AoH = $csv->fragment ($fh, "col=3;8");
948
949 If the "after_parse" callback is active, it is also called on every
950 line parsed and skipped before the fragment.
951
952 row
953 row=4
954 row=5-7
955 row=6-*
956 row=1-2;4;6-*
957
958 col
959 col=2
960 col=1-3
961 col=4-*
962 col=1-2;4;7-*
963
964 cell
965 In cell-based selection, the comma (",") is used to pair row and
966 column
967
968 cell=4,1
969
970 The range operator ("-") using "cell"s can be used to define top-left
971 and bottom-right "cell" location
972
973 cell=3,1-4,6
974
975 The "*" is only allowed in the second part of a pair
976
977 cell=3,2-*,2 # row 3 till end, only column 2
978 cell=3,2-3,* # column 2 till end, only row 3
979 cell=3,2-*,* # strip row 1 and 2, and column 1
980
981 Cells and cell ranges may be combined with ";", possibly resulting in
982 rows with different number of columns
983
984 cell=1,1-2,2;3,3-4,4;1,4;4,1
985
986 Disjointed selections will only return selected cells. The cells
987 that are not specified will not be included in the returned
988 set, not even as "undef". As an example given a "CSV" like
989
990 11,12,13,...19
991 21,22,...28,29
992 : :
993 91,...97,98,99
994
995 with "cell=1,1-2,2;3,3-4,4;1,4;4,1" will return:
996
997 11,12,14
998 21,22
999 33,34
1000 41,43,44
1001
1002 Overlapping cell-specs will return those cells only once, so
1003 "cell=1,1-3,3;2,2-4,4;2,3;4,2" will return:
1004
1005 11,12,13
1006 21,22,23,24
1007 31,32,33,34
1008 42,43,44
1009
1010 RFC7111 <http://tools.ietf.org/html/rfc7111> does not allow different
1011 types of specs to be combined (either "row" or "col" or "cell").
1012 Passing an invalid fragment specification will croak and set error
1013 2013.
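
A short sketch of a cell-based selection, using a small grid in the style of the examples above. The data and the in-memory handle are assumptions for illustration.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# A 4x4 grid where each value encodes its row and column number.
my $data = "11,12,13,14\n21,22,23,24\n31,32,33,34\n41,42,43,44\n";
open my $fh, "<", \$data or die "Cannot open in-memory handle: $!";

my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });

# Top-left cell is row 2, column 1; bottom-right is row 3, column 2.
# All indices are 1-based.
my $aoa = $csv->fragment ($fh, "cell=2,1-3,2");

print join (",", @$_), "\n" for @$aoa;
close $fh;
```

Since no column names were set, the result is a list of lists holding only the selected cells.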
1014
1015 column_names
1016 Set the "keys" that will be used in the "getline_hr" calls. If no
1017 keys (column names) are passed, it will return the current setting as a
1018 list.
1019
1020 "column_names" accepts a list of scalars (the column names) or a
1021 single array_ref, so you can pass the return value from "getline" too:
1022
1023 $csv->column_names ($csv->getline ($fh));
1024
1025 "column_names" does no checking on duplicates at all, which might lead
1026 to unexpected results. Undefined entries will be replaced with the
1027 string "\cAUNDEF\cA", so
1028
1029 $csv->column_names (undef, "", "name", "name");
1030 $hr = $csv->getline_hr ($fh);
1031
1032 Will set "$hr->{"\cAUNDEF\cA"}" to the 1st field, "$hr->{""}" to the
1033 2nd field, and "$hr->{name}" to the 4th field, discarding the 3rd
1034 field.
1035
1036 "column_names" croaks on invalid arguments.
1037
1038 header
1039 This method does NOT work in perl-5.6.x
1040
1041 Parse the CSV header and set "sep", column_names and encoding.
1042
1043 my @hdr = $csv->header ($fh);
1044 $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1045 $csv->header ($fh, { detect_bom => 1, munge_column_names => "lc" });
1046
1047 The first argument should be a file handle.
1048
1049 This method resets some object properties, as it is supposed to be
1050 invoked only once per file or stream. It will leave attributes
1051 "column_names" and "bound_columns" alone if setting column names is
1052 disabled. Reading headers on previously processed objects might fail on
1053 perl-5.8.0 and older.
1054
1055 Assuming that the file opened for parsing has a header, and the header
1056 does not contain problematic characters like embedded newlines, read
1057 the first line from the open handle then auto-detect whether the header
1058 separates the column names with a character from the allowed separator
1059 list.
1060
1061 If any of the allowed separators matches, and none of the other
1062 allowed separators match, set "sep" to that separator for the
1063 current CSV_PP instance and use it to parse the first line, map the
1064 resulting fields to lowercase, and use them to set the instance "column_names":
1065
1066 my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });
1067 open my $fh, "<", "file.csv";
1068 binmode $fh; # for Windows
1069 $csv->header ($fh);
1070 while (my $row = $csv->getline_hr ($fh)) {
1071 ...
1072 }
1073
1074 If the header is empty, contains more than one unique separator out of
1075 the allowed set, contains empty fields, or contains identical fields
1076 (after folding), it will croak with error 1010, 1011, 1012, or 1013
1077 respectively.
1078
1079 If the header contains embedded newlines or is not valid CSV in any
1080 other way, this method will croak and leave the parse error untouched.
1081
1082 A successful call to "header" will always set the "sep" of the $csv
1083 object. This behavior can not be disabled.
1084
1085 return value
1086
1087 On error this method will croak.
1088
1089 In list context, the headers will be returned whether they are used to
1090 set "column_names" or not.
1091
1092 In scalar context, the instance itself is returned. Note: the values
1093 as found in the header will effectively be lost if "set_column_names"
1094 is false.
1095
1096 Options
1097
1098 sep_set
1099 $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1100
1101 The list of legal separators defaults to "[ ";", "," ]" and can be
1102 changed by this option. As this is probably the most often used
1103 option, it can be passed on its own as an unnamed argument:
1104
1105 $csv->header ($fh, [ ";", ",", "|", "\t", "::", "\x{2063}" ]);
1106
1107 Multi-byte sequences are allowed, both multi-character and
1108 Unicode. See "sep".
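
As a hedged illustration of separator detection, the sketch below offers "|" as an additional candidate and lets "header" pick the separator that is actually in use. The pipe-separated sample data and the in-memory handle are assumptions.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# Illustrative pipe-separated data with a header line.
my $data = "Code|Name|Price\n1|pc|850\n2|mouse|5\n";
open my $fh, "<", \$data or die "Cannot open in-memory handle: $!";

my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });

# In list context header () returns the (munged) column names.
my @hdr   = $csv->header ($fh, { sep_set => [ ";", ",", "|" ] });
my $first = $csv->getline_hr ($fh);

print "Columns: @hdr\n";
print "First product: $first->{name}\n";
close $fh;
```

The column names come back in lower case here because no "munge_column_names" option was passed and lower-casing is the default.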
1109
1110 detect_bom
1111 $csv->header ($fh, { detect_bom => 1 });
1112
1113 The default behavior is to detect if the header line starts with a
1114 BOM. If the header has a BOM, use that to set the encoding of $fh.
1115 This default behavior can be disabled by passing a false value to
1116 "detect_bom".
1117
1118 Supported encodings from BOM are: UTF-8, UTF-16BE, UTF-16LE,
1119 UTF-32BE, and UTF-32LE. BOMs also exist for UTF-1, UTF-EBCDIC, SCSU,
1120 BOCU-1, and GB-18030, but Encode does not (yet) support them. UTF-7 is
1121 not supported.
1122
1123 If a supported BOM was detected at the start of the stream, the encoding
1124 is stored in the object attribute "ENCODING".
1125
1126 my $enc = $csv->{ENCODING};
1127
1128 The encoding is used with "binmode" on $fh.
1129
1130 If the handle was opened in a (correct) encoding, this method will
1131 not alter the encoding, as it checks the leading bytes of the first
1132 line. In case the stream starts with a decoded BOM ("U+FEFF"),
1133 "{ENCODING}" will be "" (empty) instead of the default "undef".
1134
1135 munge_column_names
1136 This option offers the means to modify the column names into
1137 something that is most useful to the application. The default is to
1138 map all column names to lower case.
1139
1140 $csv->header ($fh, { munge_column_names => "lc" });
1141
1142 The following values are available:
1143
1144 lc - lower case
1145 uc - upper case
1146 none - do not change
1147 \%hash - supply a mapping
1148 \&cb - supply a callback
1149
1150 Literal:
1151
1152 $csv->header ($fh, { munge_column_names => "none" });
1153
1154 Hash:
1155
1156 $csv->header ($fh, { munge_column_names => { foo => "sombrero" }});
1157
1158 If a value does not exist, the original value is used unchanged.
1159
1160 Callback:
1161
1162 $csv->header ($fh, { munge_column_names => sub { fc } });
1163 $csv->header ($fh, { munge_column_names => sub { "column_".$col++ } });
1164 $csv->header ($fh, { munge_column_names => sub { lc (s/\W+/_/gr) } });
1165
1166 As this callback is called in a "map", you can use $_ directly.
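
A sketch of the callback form in use, turning free-form header labels into identifier-like keys. The sample header labels are assumptions, and the "/r" modifier requires perl 5.14 or newer.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# Illustrative data whose header contains spaces and punctuation.
my $data = "First Name,Last Name,E-Mail\nAnna,Smith,anna\@example.com\n";
open my $fh, "<", \$data or die "Cannot open in-memory handle: $!";

my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });

# Replace every run of non-word characters with "_" and fold to lower case.
my @hdr = $csv->header ($fh, {
    munge_column_names => sub { lc (s/\W+/_/gr) },
    });
my $row = $csv->getline_hr ($fh);

print "Columns: @hdr\n";
print "Mail: $row->{e_mail}\n";
close $fh;
```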
1167
1168 set_column_names
1169 $csv->header ($fh, { set_column_names => 1 });
1170
1171 The default is to set the instance's column names using
1172 "column_names" if the method is successful, so subsequent calls to
1173 "getline_hr" can return a hash. Setting the column names can be
1174 disabled by using a false value for this option.
1175
1176 As described in "return value" above, content is lost in scalar
1177 context.
1178
1179 Validation
1180
1181 When receiving CSV files from external sources, this method can be
1182 used to protect against changes in the layout by restricting to known
1183 headers (and typos in the header fields).
1184
1185 my %known = (
1186 "record key" => "c_rec",
1187 "rec id" => "c_rec",
1188 "id_rec" => "c_rec",
1189 "kode" => "code",
1190 "code" => "code",
1191 "vaule" => "value",
1192 "value" => "value",
1193 );
1194 my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });
1195 open my $fh, "<", $source or die "$source: $!";
1196 $csv->header ($fh, { munge_column_names => sub {
1197 s/\s+$//;
1198 s/^\s+//;
1199 $known{lc $_} or die "Unknown column '$_' in $source";
1200 }});
1201 while (my $row = $csv->getline_hr ($fh)) {
1202 say join "\t", $row->{c_rec}, $row->{code}, $row->{value};
1203 }
1204
1205 bind_columns
1206 Takes a list of scalar references to be used for output with "print"
1207 or to store in the fields fetched by "getline". When you do not pass
1208 enough references to store the fetched fields in, "getline" will fail
1209 with error 3006. If you pass more than there are fields to return,
1210 the content of the remaining references is left untouched.
1211
1212 $csv->bind_columns (\$code, \$name, \$price, \$description);
1213 while ($csv->getline ($fh)) {
1214 print "The price of a $name is \x{20ac} $price\n";
1215 }
1216
1217 To reset or clear all column binding, call "bind_columns" with the
1218 single argument "undef". This will also clear column names.
1219
1220 $csv->bind_columns (undef);
1221
1222 If no arguments are passed at all, "bind_columns" will return the list
1223 of current bindings or "undef" if no binds are active.
1224
1225 Note that in parsing with "bind_columns", the fields are set on the
1226 fly. That implies that if the third field of a row causes an error
1227 (or this row has just two fields where the previous row had more), the
1228 first two fields already have been assigned the values of the current
1229 row, while the rest of the fields will still hold the values of the
1230 previous row. If you want the parser to fail in these cases, use the
1231 "strict" attribute.
1232
1233 eof
1234 $eof = $csv->eof ();
1235
1236 If "parse" or "getline" was used with an IO stream, this method will
1237 return true (1) if the last call hit end of file, otherwise it will
1238 return false (''). This is useful to see the difference between a
1239 failure and end of file.
1240
1241 Note that if the parsing of the last line caused an error, "eof" is
1242 still true. That means that if you are not using "auto_diag", an idiom
1243 like
1244
1245 while (my $row = $csv->getline ($fh)) {
1246 # ...
1247 }
1248 $csv->eof or $csv->error_diag;
1249
1250 will not report the error. You would have to change that to
1251
1252 while (my $row = $csv->getline ($fh)) {
1253 # ...
1254 }
1255 +$csv->error_diag and $csv->error_diag;
1256
1257 types
1258 $csv->types (\@tref);
1259
1260 This method is used to force that (all) columns are of a given type.
1261 For example, if you have an integer column, two columns with
1262 doubles and a string column, then you might do a
1263
1264 $csv->types ([Text::CSV_PP::IV (),
1265 Text::CSV_PP::NV (),
1266 Text::CSV_PP::NV (),
1267 Text::CSV_PP::PV ()]);
1268
1269 Column types are used only for decoding columns while parsing, in
1270 other words by the "parse" and "getline" methods.
1271
1272 You can unset column types by doing a
1273
1274 $csv->types (undef);
1275
1276 or fetch the current type settings with
1277
1278 $types = $csv->types ();
1279
1280 IV Set field type to integer.
1281
1282 NV Set field type to numeric/float.
1283
1284 PV Set field type to string.
1285
1286 fields
1287 @columns = $csv->fields ();
1288
1289 This method returns the input to "combine" or the resultant
1290 decomposed fields of a successful "parse", whichever was called more
1291 recently.
1292
1293 Note that the return value is undefined after using "getline", which
1294 does not fill the data structures returned by "parse".
1295
1296 meta_info
1297 @flags = $csv->meta_info ();
1298
1299 This method returns the "flags" of the input to "combine" or the flags
1300 of the resultant decomposed fields of "parse", whichever was called
1301 more recently.
1302
1303 For each field, a meta_info field will hold flags that inform
1304 something about the field returned by the "fields" method or
1305 passed to the "combine" method. The flags are bit-wise-"or"'d like:
1306
1307 0x0001
1308 The field was quoted.
1309
1310 0x0002
1311 The field was binary.
1312
1313 See the "is_***" methods below.
1314
1315 is_quoted
1316 my $quoted = $csv->is_quoted ($column_idx);
1317
1318 Where $column_idx is the (zero-based) index of the column in the
1319 last result of "parse".
1320
1321 This returns a true value if the data in the indicated column was
1322 enclosed in "quote_char" quotes. This might be important for fields
1323 where content ",20070108," is to be treated as a numeric value, and
1324 where ","20070108"," is explicitly marked as character string data.
1325
1326 This method is only valid when "keep_meta_info" is set to a true value.
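
To make the distinction concrete, the following sketch parses a line in which the same digits appear once unquoted and once quoted. The input line is an assumption for illustration.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# keep_meta_info must be true, otherwise is_quoted () is not valid.
my $csv = Text::CSV_PP->new ({ binary => 1, keep_meta_info => 1 });

$csv->parse (q{20070108,"20070108",text})
    or die "Parse failed: " . $csv->error_diag ();

my @fields = $csv->fields ();
my @quoted = map { $csv->is_quoted ($_) ? 1 : 0 } 0 .. $#fields;

print "Field $_ ($fields[$_]) quoted: $quoted[$_]\n" for 0 .. $#fields;
```

Only the second field is reported as quoted, even though the first two fields have identical content.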
1327
1328 is_binary
1329 my $binary = $csv->is_binary ($column_idx);
1330
1331 Where $column_idx is the (zero-based) index of the column in the
1332 last result of "parse".
1333
1334 This returns a true value if the data in the indicated column contained
1335 any byte in the range "[\x00-\x08,\x10-\x1F,\x7F-\xFF]".
1336
1337 This method is only valid when "keep_meta_info" is set to a true value.
1338
1339 is_missing
1340 my $missing = $csv->is_missing ($column_idx);
1341
1342 Where $column_idx is the (zero-based) index of the column in the
1343 last result of "getline_hr".
1344
1345 $csv->keep_meta_info (1);
1346 while (my $hr = $csv->getline_hr ($fh)) {
1347 $csv->is_missing (0) and next; # This was an empty line
1348 }
1349
1350 When using "getline_hr", it is impossible to tell if the parsed
1351 fields are "undef" because they were not filled in the "CSV" stream
1352 or because they were not read at all, as all the fields defined by
1353 "column_names" are set in the hash-ref. If you still need to know if
1354 all fields in each row are provided, you should enable "keep_meta_info"
1355 so you can check the flags.
1356
1357 If "keep_meta_info" is "false", "is_missing" will always return
1358 "undef", regardless of $column_idx being valid or not. If this
1359 attribute is "true" it will return either 0 (the field is present) or 1
1360 (the field is missing).
1361
1362 A special case is the empty line. If the line is completely empty -
1363 after dealing with the flags - this is still a valid CSV line: it is a
1364 record of just one single empty field. However, if "keep_meta_info" is
1365 set, invoking "is_missing" with index 0 will now return true.
1366
1367 status
1368 $status = $csv->status ();
1369
1370 This method returns the status of the last invoked "combine" or "parse"
1371 call. Status is success (true: 1) or failure (false: "undef" or 0).
1372
1373 error_input
1374 $bad_argument = $csv->error_input ();
1375
1376 This method returns the erroneous argument (if it exists) of "combine"
1377 or "parse", whichever was called more recently. If the last
1378 invocation was successful, "error_input" will return "undef".
1379
1380 error_diag
1381 Text::CSV_PP->error_diag ();
1382 $csv->error_diag ();
1383 $error_code = 0 + $csv->error_diag ();
1384 $error_str = "" . $csv->error_diag ();
1385 ($cde, $str, $pos, $rec, $fld) = $csv->error_diag ();
1386
1387 If (and only if) an error occurred, this function returns the
1388 diagnostics of that error.
1389
1390 If called in void context, this will print the internal error code and
1391 the associated error message to STDERR.
1392
1393 If called in list context, this will return the error code and the
1394 error message in that order. If the last error was from parsing, the
1395 rest of the values returned are a best guess at the location within
1396 the line that was being parsed. Their values are 1-based. The
1397 position currently is index of the byte at which the parsing failed in
1398 the current record. It might change to be the index of the current
1399 character in a later release. The record number is the index of the record
1400 parsed by the csv instance. The field number is the index of the field
1401 the parser thinks it is currently trying to parse. See
1402 examples/csv-check for how this can be used.
1403
1404 If called in scalar context, it will return the diagnostics in a
1405 single scalar, a-la $!. It will contain the error code in numeric
1406 context, and the diagnostics message in string context.
1407
1408 When called as a class method or a direct function call, the
1409 diagnostics are that of the last "new" call.
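
The list and scalar contexts described above can be compared with a sketch like the one below. The malformed input (a quoted field that is never closed) is an assumption chosen to provoke a parse error.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# auto_diag is left off, so the error is not reported automatically.
my $csv = Text::CSV_PP->new ();

# The quoted field is never terminated, so parse () is expected to fail.
my $status = $csv->parse (q{1,"unterminated});

# List context: error code, message, and a best guess at the position.
my ($code, $msg, $pos) = $csv->error_diag ();

# Scalar context: a single value, numeric code and message in one.
my $diag = $csv->error_diag ();

unless ($status) {
    printf "Error %d: %s (position %s)\n", $code, $msg, $pos // "unknown";
    printf "Scalar context gives code %d and text '%s'\n", 0 + $diag, "" . $diag;
    }
```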
1410
1411 record_number
1412 $recno = $csv->record_number ();
1413
1414 Returns the number of records parsed by this csv instance. This value should be
1415 more accurate than $. when embedded newlines come in play. Records
1416 written by this instance are not counted.
1417
1418 SetDiag
1419 $csv->SetDiag (0);
1420
1421 Use to reset the diagnostics if you are dealing with errors.
1422
1424 This section is also taken from Text::CSV_XS.
1425
1426 csv
1427 This function is not exported by default and should be explicitly
1428 requested:
1429
1430 use Text::CSV_PP qw( csv );
1431
1432 This is a high-level function that aims at simple (user) interfaces.
1433 This can be used to read/parse a "CSV" file or stream (the default
1434 behavior) or to produce a file or write to a stream (define the "out"
1435 attribute). It returns an array- or hash-reference on parsing (or
1436 "undef" on fail) or the numeric value of "error_diag" on writing.
1437 When this function fails you can get to the error using the class call
1438 to "error_diag"
1439
1440 my $aoa = csv (in => "test.csv") or
1441 die Text::CSV_PP->error_diag;
1442
1443 This function takes the arguments as key-value pairs. This can be
1444 passed as a list or as an anonymous hash:
1445
1446 my $aoa = csv ( in => "test.csv", sep_char => ";");
1447 my $aoh = csv ({ in => $fh, headers => "auto" });
1448
1449 The arguments passed consist of two parts: the arguments to "csv"
1450 itself and the optional attributes to the "CSV" object used inside
1451 the function as enumerated and explained in "new".
1452
1453 If not overridden, the default options used for CSV are
1454
1455 auto_diag => 1
1456 escape_null => 0
1457
1458 The option that is always set and cannot be altered is
1459
1460 binary => 1
1461
1462 As this function will likely be used in one-liners, it allows "quote"
1463 to be abbreviated as "quo", and "escape_char" to be abbreviated as
1464 "esc" or "escape".
1465
1466 Alternative invocations:
1467
1468 my $aoa = Text::CSV_PP::csv (in => "file.csv");
1469
1470 my $csv = Text::CSV_PP->new ();
1471 my $aoa = $csv->csv (in => "file.csv");
1472
1473 In the latter case, the object attributes are used from the existing
1474 object and the attribute arguments in the function call are ignored:
1475
1476 my $csv = Text::CSV_PP->new ({ sep_char => ";" });
1477 my $aoh = $csv->csv (in => "file.csv", bom => 1);
1478
1479 will parse using ";" as "sep_char", not ",".
1480
1481 in
1482
1483 Used to specify the source. "in" can be a file name (e.g. "file.csv"),
1484 which will be opened for reading and closed when finished, a file
1485 handle (e.g. $fh or "FH"), a reference to a glob (e.g. "\*ARGV"),
1486 the glob itself (e.g. *STDIN), or a reference to a scalar (e.g.
1487 "\q{1,2,"csv"}").
1488
1489 When used with "out", "in" should be a reference to a CSV structure
1490 (AoA or AoH) or a CODE-ref that returns an array-reference or a hash-
1491 reference. The code-ref will be invoked with no arguments.
1492
1493 my $aoa = csv (in => "file.csv");
1494
1495 open my $fh, "<", "file.csv";
1496 my $aoa = csv (in => $fh);
1497
1498 my $csv = [ [qw( Foo Bar )], [ 1, 2 ], [ 2, 3 ]];
1499 my $err = csv (in => $csv, out => "file.csv");
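
The scalar-reference form of "in" is handy for small experiments, as it needs no file at all. A minimal sketch, with the sample data being an assumption:

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

# Illustrative CSV content held in a plain scalar.
my $data = "code,product,price\n1,pc,850\n2,keyboard,12\n";

# Without "headers" the result is an array of arrays, header row included.
my $aoa = csv (in => \$data);

# With headers => "auto" the first line supplies the hash keys.
my $aoh = csv (in => \$data, headers => "auto");

print scalar (@$aoa), " rows as arrays, ", scalar (@$aoh), " rows as hashes\n";
print "Second product: $aoh->[1]{product}\n";
```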
1500
1501 If called in void context without the "out" attribute, the resulting
1502 ref will be used as input to a subsequent call to csv:
1503
1504 csv (in => "file.csv", filter => { 2 => sub { length > 2 }})
1505
1506 will be a shortcut to
1507
1508 csv (in => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}))
1509
1510 where, in the absence of the "out" attribute, this is a shortcut to
1511
1512 csv (in => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}),
1513 out => *STDOUT)
1514
1515 out
1516
1517 csv (in => $aoa, out => "file.csv");
1518 csv (in => $aoa, out => $fh);
1519 csv (in => $aoa, out => STDOUT);
1520 csv (in => $aoa, out => *STDOUT);
1521 csv (in => $aoa, out => \*STDOUT);
1522 csv (in => $aoa, out => \my $data);
1523 csv (in => $aoa, out => undef);
1524 csv (in => $aoa, out => \"skip");
1525
1526 In output mode, the default CSV options when producing CSV are
1527
1528 eol => "\r\n"
1529
1530 The "fragment" attribute is ignored in output mode.
1531
1532 "out" can be a file name (e.g. "file.csv"), which will be opened for
1533 writing and closed when finished, a file handle (e.g. $fh or "FH"), a
1534 reference to a glob (e.g. "\*STDOUT"), the glob itself (e.g. *STDOUT),
1535 or a reference to a scalar (e.g. "\my $data").
1536
1537 csv (in => sub { $sth->fetch }, out => "dump.csv");
1538 csv (in => sub { $sth->fetchrow_hashref }, out => "dump.csv",
1539 headers => $sth->{NAME_lc});
1540
1541 When a code-ref is used for "in", the output is generated per
1542 invocation, so no buffering is involved. This implies that there is no
1543 size restriction on the number of records. The "csv" function ends when
1544 the coderef returns a false value.
1545
1546 If "out" is set to a reference of the literal string "skip", the output
1547 will be suppressed completely, which might be useful in combination
1548 with a filter for side effects only.
1549
1550 my %cache;
1551 csv (in => "dump.csv",
1552 out => \"skip",
1553 on_in => sub { $cache{$_[1][1]}++ });
1554
1555 Currently, setting "out" to any false value ("undef", "", 0) will be
1556 equivalent to "\"skip"".
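
As a sketch of output mode, the following writes a small array of arrays into a scalar instead of a file, which makes the default "\r\n" line ending visible. The sample rows are assumptions for illustration.

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

# Illustrative rows; the first row serves as the header line.
my $aoa = [
    [qw( code product )],
    [ 1, "pc"       ],
    [ 2, "keyboard" ],
    ];

# Passing a reference to a scalar collects the generated CSV in $data.
csv (in => $aoa, out => \my $data);

print $data;
```

Passing an explicit "eol" attribute to "csv" would replace the default line ending.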
1557
1558 encoding
1559
1560 If passed, it should be an encoding accepted by the ":encoding()"
1561 option to "open". There is no default value. This attribute does not
1562 work in perl 5.6.x. "encoding" can be abbreviated to "enc" for ease of
1563 use in command line invocations.
1564
1565 If "encoding" is set to the literal value "auto", the method "header"
1566 will be invoked on the opened stream to check if there is a BOM and set
1567 the encoding accordingly. This is equal to passing a true value in
1568 the option "detect_bom".
1569
1570 detect_bom
1571
1572 If "detect_bom" is given, the method "header" will be invoked on
1573 the opened stream to check if there is a BOM and set the encoding
1574 accordingly.
1575
1576 "detect_bom" can be abbreviated to "bom".
1577
1578 This is the same as setting "encoding" to "auto".
1579
1580 Note that as the method "header" is invoked, its default is to also
1581 set the headers.
1582
1583 headers
1584
1585 If this attribute is not given, the default behavior is to produce an
1586 array of arrays.
1587
1588 If "headers" is supplied, it should be an anonymous list of column
1589 names, an anonymous hashref, a coderef, or a literal flag: "auto",
1590 "lc", "uc", or "skip".
1591
1592 skip
1593 When "skip" is used, the header will not be included in the output.
1594
1595 my $aoa = csv (in => $fh, headers => "skip");
1596
1597 auto
1598 If "auto" is used, the first line of the "CSV" source will be read as
1599 the list of field headers and used to produce an array of hashes.
1600
1601 my $aoh = csv (in => $fh, headers => "auto");
1602
1603 lc
1604 If "lc" is used, the first line of the "CSV" source will be read as
1605 the list of field headers mapped to lower case and used to produce
1606 an array of hashes. This is a variation of "auto".
1607
1608 my $aoh = csv (in => $fh, headers => "lc");
1609
1610 uc
1611 If "uc" is used, the first line of the "CSV" source will be read as
1612 the list of field headers mapped to upper case and used to produce
1613 an array of hashes. This is a variation of "auto".
1614
1615 my $aoh = csv (in => $fh, headers => "uc");
1616
1617 CODE
1618 If a coderef is used, the first line of the "CSV" source will be
1619 read as the list of mangled field headers in which each field is
1620 passed as the only argument to the coderef. This list is used to
1621 produce an array of hashes.
1622
1623 my $aoh = csv (in => $fh,
1624 headers => sub { lc ($_[0]) =~ s/kode/code/gr });
1625
1626 This example is a variation of using "lc" where all occurrences of
1627 "kode" are replaced with "code".
1628
1629 ARRAY
1630 If "headers" is an anonymous list, the entries in the list will be
1631 used as field names. The first line is considered data instead of
1632 headers.
1633
1634 my $aoh = csv (in => $fh, headers => [qw( Foo Bar )]);
1635 csv (in => $aoa, out => $fh, headers => [qw( code description price )]);
1636
1637 HASH
1638 If "headers" is a hash reference, this implies "auto", but header
1639 fields that exist as a key in the hashref will be replaced by the
1640 value for that key. Given a CSV file like
1641
1642 post-kode,city,name,id number,fubble
1643 1234AA,Duckstad,Donald,13,"X313DF"
1644
1645 using
1646
1647 csv (headers => { "post-kode" => "pc", "id number" => "ID" }, ...
1648
1649 will return an entry like
1650
1651 { pc => "1234AA",
1652 city => "Duckstad",
1653 name => "Donald",
1654 ID => "13",
1655 fubble => "X313DF",
1656 }
1657
1658 See also "munge_column_names" and "set_column_names".
1659
1660 munge_column_names
1661
1662 If "munge_column_names" is set, the method "header" is invoked on
1663 the opened stream with all matching arguments to detect and set the
1664 headers.
1665
1666 "munge_column_names" can be abbreviated to "munge".
1667
1668 key
1669
1670 If passed, will default "headers" to "auto" and return a hashref
1671 instead of an array of hashes.
1672
1673 my $ref = csv (in => "test.csv", key => "code");
1674
1675 with test.csv like
1676
1677 code,product,price,color
1678 1,pc,850,gray
1679 2,keyboard,12,white
1680 3,mouse,5,black
1681
1682 will return
1683
1684 { 1 => {
1685 code => 1,
1686 color => 'gray',
1687 price => 850,
1688 product => 'pc'
1689 },
1690 2 => {
1691 code => 2,
1692 color => 'white',
1693 price => 12,
1694 product => 'keyboard'
1695 },
1696 3 => {
1697 code => 3,
1698 color => 'black',
1699 price => 5,
1700 product => 'mouse'
1701 }
1702 }
1703
1704 The "key" attribute can be combined with "headers" for "CSV" data that
1705 has no header line, like
1706
1707 my $ref = csv (
1708 in => "foo.csv",
1709 headers => [qw( c_foo foo bar description stock )],
1710 key => "c_foo",
1711 );
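
A runnable variant of the combination above, reading from a scalar so that no file is needed. The column names and data are assumptions made for this sketch.

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

# Illustrative data without a header line.
my $data = "1,pc,850\n2,keyboard,12\n3,mouse,5\n";

# "headers" names the columns; "key" selects the column to index by.
my $ref = csv (
    in      => \$data,
    headers => [qw( code product price )],
    key     => "code",
    );

print "Product 2 is $ref->{2}{product}\n";
```

Rows that share the same key value would overwrite each other in the resulting hash, so the key column should be unique.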
1712
1713 keep_headers
1714
1715 When using hashes, keep the column names in the arrayref passed, so
1716 all headers are available after the call in the original order.
1717
1718 my $aoh = csv (in => "file.csv", keep_headers => \my @hdr);
1719
1720 This attribute can be abbreviated to "kh" or passed as
1721 "keep_column_names".
1722
1723 This attribute implies a default of "auto" for the "headers" attribute.
1724
1725 fragment
1726
1727 Only output the fragment as defined in the "fragment" method. This
1728 option is ignored when generating "CSV". See "out".
1729
1730 Combining all of them could give something like
1731
1732 use Text::CSV_PP qw( csv );
1733 my $aoh = csv (
1734 in => "test.txt",
1735 encoding => "utf-8",
1736 headers => "auto",
1737 sep_char => "|",
1738 fragment => "row=3;6-9;15-*",
1739 );
1740 say $aoh->[15]{Foo};
1741
1742 sep_set
1743
1744 If "sep_set" is set, the method "header" is invoked on the opened
1745 stream to detect and set "sep_char" with the given set.
1746
1747 "sep_set" can be abbreviated to "seps".
1748
1749 Note that as the "header" method is invoked, its default is to also
1750 set the headers.
1751
1752 set_column_names
1753
1754 If "set_column_names" is passed, the method "header" is invoked on
1755 the opened stream with all arguments meant for "header".
1756
1757 If "set_column_names" is passed as a false value, the content of the
1758 first row is only preserved if the output is AoA:
1759
1760 With an input-file like
1761
1762 bAr,foo
1763 1,2
1764 3,4,5
1765
1766 This call
1767
1768 my $aoa = csv (in => $file, set_column_names => 0);
1769
1770 will result in
1771
1772 [[ "bar", "foo" ],
1773 [ "1", "2" ],
1774 [ "3", "4", "5" ]]
1775
1776 and
1777
1778 my $aoa = csv (in => $file, set_column_names => 0, munge => "none");
1779
1780 will result in
1781
1782 [[ "bAr", "foo" ],
1783 [ "1", "2" ],
1784 [ "3", "4", "5" ]]
1785
1786 Callbacks
1787 Callbacks enable actions triggered from the inside of Text::CSV_PP.
1788
1789 While most of what this enables can easily be done in an unrolled
1790 loop as described in the "SYNOPSIS", callbacks can be used to meet
1791 special demands or enhance the "csv" function.
1792
1793 error
1794 $csv->callbacks (error => sub { $csv->SetDiag (0) });
1795
1796 The "error" callback is invoked when an error occurs, but only
1797 when "auto_diag" is set to a true value. A callback is invoked with
1798 the values returned by "error_diag":
1799
1800 my ($c, $s);
1801
1802 sub ignore3006
1803 {
1804 my ($err, $msg, $pos, $recno, $fldno) = @_;
1805 if ($err == 3006) {
1806 # ignore this error
1807 ($c, $s) = (undef, undef);
1808 Text::CSV_PP->SetDiag (0);
1809 }
1810 # Any other error
1811 return;
1812 } # ignore3006
1813
1814 $csv->callbacks (error => \&ignore3006);
1815 $csv->bind_columns (\$c, \$s);
1816 while ($csv->getline ($fh)) {
1817 # Error 3006 will not stop the loop
1818 }
1819
1820 after_parse
1821 $csv->callbacks (after_parse => sub { push @{$_[1]}, "NEW" });
1822 while (my $row = $csv->getline ($fh)) {
1823 $row->[-1] eq "NEW";
1824 }
1825
1826 This callback is invoked after parsing with "getline" only if no
1827 error occurred. The callback is invoked with two arguments: the
1828 current "CSV" parser object and an array reference to the fields
1829 parsed.
1830
1831 The return code of the callback is ignored unless it is a reference
1832 to the string "skip", in which case the record will be skipped in
1833 "getline_all".
1834
1835 sub add_from_db
1836 {
1837 my ($csv, $row) = @_;
1838 $sth->execute ($row->[4]);
1839 push @$row, $sth->fetchrow_array;
1840 } # add_from_db
1841
1842 my $aoa = csv (in => "file.csv", callbacks => {
1843 after_parse => \&add_from_db });
1844
1845 This hook can be used for validation:
1846
1847 FAIL
1848 Die if any of the records does not validate a rule:
1849
1850 after_parse => sub {
1851 $_[1][4] =~ m/^[0-9]{4}\s?[A-Z]{2}$/ or
1852 die "5th field does not have a valid Dutch zipcode";
1853 }
1854
1855 DEFAULT
1856 Replace invalid fields with a default value:
1857
1858 after_parse => sub { $_[1][2] =~ m/^\d+$/ or $_[1][2] = 0 }
1859
1860 SKIP
1861 Skip records that have invalid fields (only applies to
1862 "getline_all"):
1863
1864 after_parse => sub { $_[1][0] =~ m/^\d+$/ or return \"skip"; }
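
A minimal, self-contained sketch of the SKIP variant, assuming the input is passed in memory as a scalar reference. The sample data and variable names are made up for illustration:

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

# Hypothetical sample data: the second record has a non-numeric first field.
my $data = "1,ok\nx,bad\n2,ok\n";

# csv () reads all records in one go, so returning \"skip" from
# after_parse is expected to drop the offending record.
my $aoa = csv (
    in        => \$data,
    callbacks => {
        after_parse => sub { $_[1][0] =~ m/^\d+$/ or return \"skip"; },
        },
    );

printf "%d records kept\n", scalar @$aoa;
```

The same hook used with a plain "getline" loop would not skip anything, as the return value is only honored by "getline_all".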
1865
1866 before_print
1867 my $idx = 1;
1868 $csv->callbacks (before_print => sub { $_[1][0] = $idx++ });
1869 $csv->print (*STDOUT, [ 0, $_ ]) for @members;
1870
1871 This callback is invoked before printing with "print" only if no
1872 error occurred. The callback is invoked with two arguments: the
1873 current "CSV" parser object and an array reference to the fields
1874 passed.
1875
1876 The return code of the callback is ignored.
1877
1878 sub max_4_fields
1879 {
1880 my ($csv, $row) = @_;
1881 @$row > 4 and splice @$row, 4;
1882 } # max_4_fields
1883
csv (in => csv (in => "file.csv"), out => *STDOUT,
    callbacks => { before_print => \&max_4_fields });
1886
1887 This callback is not active for "combine".
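
The difference between "print" and "combine" can be checked with a small sketch like the following. It assumes an in-memory file handle as the output target; the member names are hypothetical:

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ binary => 1, eol => "\n" });

# Number each printed record in its first field.
my $idx = 1;
$csv->callbacks (before_print => sub { $_[1][0] = $idx++ });

# Print to an in-memory file handle so the result can be inspected.
open my $fh, ">", \my $out or die "Cannot open in-memory handle: $!";
$csv->print ($fh, [ 0, $_ ]) for qw( alice bob );
close $fh;

# combine () does not go through print (), so the hook should not fire
# here and the first field should keep its original value.
$csv->combine (0, "carol") or die "combine failed";
my $line = $csv->string;

print $out, $line;
```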
1888
1889 Callbacks for csv ()
1890
The "csv" function supports some callbacks that are not integrated in
the XS internals and are available only through the "csv" function.
1893
1894 csv (in => "file.csv",
1895 callbacks => {
1896 filter => { 6 => sub { $_ > 15 } }, # first
1897 after_parse => sub { say "AFTER PARSE"; }, # first
1898 after_in => sub { say "AFTER IN"; }, # second
1899 on_in => sub { say "ON IN"; }, # third
1900 },
1901 );
1902
1903 csv (in => $aoh,
1904 out => "file.csv",
1905 callbacks => {
1906 on_in => sub { say "ON IN"; }, # first
1907 before_out => sub { say "BEFORE OUT"; }, # second
1908 before_print => sub { say "BEFORE PRINT"; }, # third
1909 },
1910 );
1911
1912 filter
1913 This callback can be used to filter records. It is called just after
1914 a new record has been scanned. The callback accepts a:
1915
1916 hashref
1917 The keys are the index to the row (the field name or field number,
1918 1-based) and the values are subs to return a true or false value.
1919
1920 csv (in => "file.csv", filter => {
1921 3 => sub { m/a/ }, # third field should contain an "a"
1922 5 => sub { length > 4 }, # length of the 5th field minimal 5
1923 });
1924
1925 csv (in => "file.csv", filter => { foo => sub { $_ > 4 }});
1926
1927 If the keys to the filter hash contain any character that is not a
1928 digit it will also implicitly set "headers" to "auto" unless
1929 "headers" was already passed as argument. When headers are
1930 active, returning an array of hashes, the filter is not applicable
1931 to the header itself.
1932
1933 All sub results should match, as in AND.
1934
1935 The context of the callback sets $_ localized to the field
1936 indicated by the filter. The two arguments are as with all other
1937 callbacks, so the other fields in the current row can be seen:
1938
1939 filter => { 3 => sub { $_ > 100 ? $_[1][1] =~ m/A/ : $_[1][6] =~ m/B/ }}
1940
1941 If the context is set to return a list of hashes ("headers" is
1942 defined), the current record will also be available in the
1943 localized %_:
1944
1945 filter => { 3 => sub { $_ > 100 && $_{foo} =~ m/A/ && $_{bar} < 1000 }}
1946
1947 If the filter is used to alter the content by changing $_, make
1948 sure that the sub returns true in order not to have that record
1949 skipped:
1950
1951 filter => { 2 => sub { $_ = uc }}
1952
will upper-case the second field, and then skip the record if the
resulting content evaluates to false. To always accept, end with truth:
1955
1956 filter => { 2 => sub { $_ = uc; 1 }}
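
As a hedged illustration of combining both uses, the sketch below alters the field and also filters on the altered value. The sample data is hypothetical and is passed in memory as a scalar reference:

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

# Hypothetical sample data with a short second field in record 2.
my $data = "1,apple\n2,kiwi\n3,banana\n";

my $aoa = csv (
    in     => \$data,
    filter => {
        # Upper-case the 2nd field, then accept the record only if
        # that field is longer than 4 characters.
        2 => sub { $_ = uc; length ($_) > 4 },
        },
    );

print join (",", @$_), "\n" for @$aoa;
```

Records that pass keep the modified field, because the value of $_ is written back to the row once the sub has returned true.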
1957
1958 coderef
1959 csv (in => "file.csv", filter => sub { $n++; 0; });
1960
1961 If the argument to "filter" is a coderef, it is an alias or
1962 shortcut to a filter on column 0:
1963
1964 csv (filter => sub { $n++; 0 });
1965
1966 is equal to
1967
csv (filter => { 0 => sub { $n++; 0 } });
1969
1970 filter-name
1971 csv (in => "file.csv", filter => "not_blank");
1972 csv (in => "file.csv", filter => "not_empty");
1973 csv (in => "file.csv", filter => "filled");
1974
These are predefined filters.
1976
1977 Given a file like (line numbers prefixed for doc purpose only):
1978
1979 1:1,2,3
1980 2:
1981 3:,
1982 4:""
1983 5:,,
1984 6:, ,
1985 7:"",
1986 8:" "
1987 9:4,5,6
1988
1989 not_blank
1990 Filter out the blank lines
1991
1992 This filter is a shortcut for
1993
1994 filter => { 0 => sub { @{$_[1]} > 1 or
1995 defined $_[1][0] && $_[1][0] ne "" } }
1996
Due to the implementation, it is currently impossible to also
filter lines that consist only of a quoted empty field. These
lines are also considered blank lines.
2000
2001 With the given example, lines 2 and 4 will be skipped.
2002
2003 not_empty
2004 Filter out lines where all the fields are empty.
2005
2006 This filter is a shortcut for
2007
2008 filter => { 0 => sub { grep { defined && $_ ne "" } @{$_[1]} } }
2009
A space is not regarded as being empty, so given the example data,
lines 2, 3, 4, 5, and 7 are skipped.
2012
2013 filled
2014 Filter out lines that have no visible data
2015
2016 This filter is a shortcut for
2017
2018 filter => { 0 => sub { grep { defined && m/\S/ } @{$_[1]} } }
2019
This filter rejects all lines that do not have at least one field
containing a non-whitespace character.
2022
2023 With the given example data, this filter would skip lines 2
2024 through 8.
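
The three predefined filters can be compared side by side with a sketch like this one. It assumes the nine sample lines above are fed to "csv" in memory (without the line-number prefixes) and only counts how many records each filter keeps:

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

# The nine sample lines from the text, without the line-number prefixes.
my $data = join "", map { "$_\n" }
    '1,2,3', '', ',', '""', ',,', ', ,', '"",', '" "', '4,5,6';

my %kept;
for my $name (qw( not_blank not_empty filled )) {
    # A plain string selects one of the predefined filters by name.
    my $aoa = csv (in => \$data, filter => $name);
    $kept{$name} = scalar @$aoa;
    print "$name keeps $kept{$name} of 9 records\n";
}
```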
2025
2026 after_in
2027 This callback is invoked for each record after all records have been
2028 parsed but before returning the reference to the caller. The hook is
2029 invoked with two arguments: the current "CSV" parser object and a
2030 reference to the record. The reference can be a reference to a
2031 HASH or a reference to an ARRAY as determined by the arguments.
2032
2033 This callback can also be passed as an attribute without the
2034 "callbacks" wrapper.
2035
2036 before_out
2037 This callback is invoked for each record before the record is
2038 printed. The hook is invoked with two arguments: the current "CSV"
2039 parser object and a reference to the record. The reference can be a
2040 reference to a HASH or a reference to an ARRAY as determined by the
2041 arguments.
2042
2043 This callback can also be passed as an attribute without the
2044 "callbacks" wrapper.
2045
2046 This callback makes the row available in %_ if the row is a hashref.
2047 In this case %_ is writable and will change the original row.
2048
2049 on_in
2050 This callback acts exactly as the "after_in" or the "before_out"
2051 hooks.
2052
2053 This callback can also be passed as an attribute without the
2054 "callbacks" wrapper.
2055
2056 This callback makes the row available in %_ if the row is a hashref.
2057 In this case %_ is writable and will change the original row. So e.g.
2058 with
2059
2060 my $aoh = csv (
2061 in => \"foo\n1\n2\n",
2062 headers => "auto",
2063 on_in => sub { $_{bar} = 2; },
2064 );
2065
2066 $aoh will be:
2067
[ { foo => 1,
    bar => 2,
  },
  { foo => 2,
    bar => 2,
  },
]
2075
2076 csv
2077 The function "csv" can also be called as a method or with an
2078 existing Text::CSV_PP object. This could help if the function is to
2079 be invoked a lot of times and the overhead of creating the object
2080 internally over and over again would be prevented by passing an
2081 existing instance.
2082
2083 my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });
2084
2085 my $aoa = $csv->csv (in => $fh);
2086 my $aoa = csv (in => $fh, csv => $csv);
2087
Both act the same. Running this 20000 times on a 20-line CSV file
showed a 53% speedup.
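
A minimal sketch of reusing one parser object across many small inputs. The in-memory strings are hypothetical stand-ins for separate files:

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });

# Hypothetical in-memory inputs standing in for many small files.
my @inputs = ("a,b\n1,2\n", "c,d\n3,4\n", "e,f\n5,6\n");

my $total = 0;
for my $data (@inputs) {
    # Method call: the existing object is reused instead of having
    # csv () construct a new parser on every call.
    my $aoa = $csv->csv (in => \$data);
    $total += @$aoa;
}
print "parsed $total records\n";
```

Any gain depends on how small the inputs are; for large files the cost of constructing the object is likely to be negligible.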
2090
2092 This section is also taken from Text::CSV_XS.
2093
2094 Still under construction ...
2095
2096 If an error occurs, "$csv->error_diag" can be used to get information
2097 on the cause of the failure. Note that for speed reasons the internal
2098 value is never cleared on success, so using the value returned by
2099 "error_diag" in normal cases - when no error occurred - may cause
2100 unexpected results.
2101
2102 If the constructor failed, the cause can be found using "error_diag" as
2103 a class method, like "Text::CSV_PP->error_diag".
2104
2105 The "$csv->error_diag" method is automatically invoked upon error when
the constructor was called with "auto_diag" set to 1 or 2, or when
autodie is in effect. When set to 1, this will cause a "warn" with the
error message; when set to 2, it will "die". "2012 - EOF" is excluded
2109 from "auto_diag" reports.
2110
2111 Errors can be (individually) caught using the "error" callback.
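
Both ways of calling "error_diag" can be sketched as follows. The invalid attribute combination and the unterminated input line are made-up examples chosen to trigger an error:

```perl
use strict;
use warnings;
use Text::CSV_PP;

# Constructor failure: sep_char equal to quote_char is rejected, so
# new () returns undef and the class method reports the cause.
my $bad = Text::CSV_PP->new ({ sep_char => ",", quote_char => "," });
my ($new_code, $new_msg) = Text::CSV_PP->error_diag;
defined $bad or print "new failed: $new_code $new_msg\n";

# Parse failure: the quoted field is never closed.
my $csv = Text::CSV_PP->new ({ binary => 1 });
my ($code, $msg, $pos) = (0, "", 0);
unless ($csv->parse (q{1,"unterminated})) {
    # Only consult error_diag after a failure; it is not reset on success.
    ($code, $msg, $pos) = $csv->error_diag;
    print "parse failed: $code $msg (position $pos)\n";
}
```

In list context "error_diag" returns the error code, the message, and the position, followed by the record and field numbers, which are not used here.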
2112
2113 The errors as described below are available. I have tried to make the
2114 error itself explanatory enough, but more descriptions will be added.
2115 For most of these errors, the first three capitals describe the error
2116 category:
2117
2118 · INI
2119
2120 Initialization error or option conflict.
2121
2122 · ECR
2123
2124 Carriage-Return related parse error.
2125
2126 · EOF
2127
2128 End-Of-File related parse error.
2129
2130 · EIQ
2131
2132 Parse error inside quotation.
2133
2134 · EIF
2135
2136 Parse error inside field.
2137
2138 · ECB
2139
2140 Combine error.
2141
2142 · EHR
2143
2144 HashRef parse related error.
2145
2146 And below should be the complete list of error codes that can be
2147 returned:
2148
2149 · 1001 "INI - sep_char is equal to quote_char or escape_char"
2150
2151 The separation character cannot be equal to the quotation
2152 character or to the escape character, as this would invalidate all
2153 parsing rules.
2154
2155 · 1002 "INI - allow_whitespace with escape_char or quote_char SP or
2156 TAB"
2157
2158 Using the "allow_whitespace" attribute when either "quote_char" or
2159 "escape_char" is equal to "SPACE" or "TAB" is too ambiguous to
2160 allow.
2161
2162 · 1003 "INI - \r or \n in main attr not allowed"
2163
2164 Using default "eol" characters in either "sep_char", "quote_char",
2165 or "escape_char" is not allowed.
2166
2167 · 1004 "INI - callbacks should be undef or a hashref"
2168
The "callbacks" attribute only accepts "undef" or a hash
reference.
2171
2172 · 1005 "INI - EOL too long"
2173
2174 The value passed for EOL is exceeding its maximum length (16).
2175
2176 · 1006 "INI - SEP too long"
2177
2178 The value passed for SEP is exceeding its maximum length (16).
2179
2180 · 1007 "INI - QUOTE too long"
2181
2182 The value passed for QUOTE is exceeding its maximum length (16).
2183
2184 · 1008 "INI - SEP undefined"
2185
2186 The value passed for SEP should be defined and not empty.
2187
2188 · 1010 "INI - the header is empty"
2189
2190 The header line parsed in the "header" is empty.
2191
2192 · 1011 "INI - the header contains more than one valid separator"
2193
2194 The header line parsed in the "header" contains more than one
2195 (unique) separator character out of the allowed set of separators.
2196
2197 · 1012 "INI - the header contains an empty field"
2198
The header line parsed in the "header" contains an empty field.
2200
2201 · 1013 "INI - the header contains nun-unique fields"
2202
2203 The header line parsed in the "header" contains at least two
2204 identical fields.
2205
2206 · 1014 "INI - header called on undefined stream"
2207
The header line cannot be parsed from an undefined source.
2209
2210 · 1500 "PRM - Invalid/unsupported argument(s)"
2211
2212 Function or method called with invalid argument(s) or parameter(s).
2213
2214 · 1501 "PRM - The key attribute is passed as an unsupported type"
2215
2216 The "key" attribute is of an unsupported type.
2217
2218 · 2010 "ECR - QUO char inside quotes followed by CR not part of EOL"
2219
2220 When "eol" has been set to anything but the default, like
2221 "\r\t\n", and the "\r" is following the second (closing)
2222 "quote_char", where the characters following the "\r" do not make up
2223 the "eol" sequence, this is an error.
2224
2225 · 2011 "ECR - Characters after end of quoted field"
2226
2227 Sequences like "1,foo,"bar"baz,22,1" are not allowed. "bar" is a
2228 quoted field and after the closing double-quote, there should be
2229 either a new-line sequence or a separation character.
2230
2231 · 2012 "EOF - End of data in parsing input stream"
2232
Self-explanatory. End-of-file was reached while parsing a stream. Can
2234 happen only when reading from streams with "getline", as using
2235 "parse" is done on strings that are not required to have a trailing
2236 "eol".
2237
2238 · 2013 "INI - Specification error for fragments RFC7111"
2239
2240 Invalid specification for URI "fragment" specification.
2241
2242 · 2014 "ENF - Inconsistent number of fields"
2243
2244 Inconsistent number of fields under strict parsing.
2245
2246 · 2021 "EIQ - NL char inside quotes, binary off"
2247
2248 Sequences like "1,"foo\nbar",22,1" are allowed only when the binary
2249 option has been selected with the constructor.
2250
2251 · 2022 "EIQ - CR char inside quotes, binary off"
2252
2253 Sequences like "1,"foo\rbar",22,1" are allowed only when the binary
2254 option has been selected with the constructor.
2255
2256 · 2023 "EIQ - QUO character not allowed"
2257
2258 Sequences like ""foo "bar" baz",qu" and "2023,",2008-04-05,"Foo,
2259 Bar",\n" will cause this error.
2260
2261 · 2024 "EIQ - EOF cannot be escaped, not even inside quotes"
2262
2263 The escape character is not allowed as last character in an input
2264 stream.
2265
2266 · 2025 "EIQ - Loose unescaped escape"
2267
2268 An escape character should escape only characters that need escaping.
2269
2270 Allowing the escape for other characters is possible with the
2271 attribute "allow_loose_escape".
2272
2273 · 2026 "EIQ - Binary character inside quoted field, binary off"
2274
Binary characters are not allowed by default. An exception is made
for fields that contain valid UTF-8, which will automatically be
upgraded. Set "binary" to 1 to accept binary data.
2279
2280 · 2027 "EIQ - Quoted field not terminated"
2281
2282 When parsing a field that started with a quotation character, the
2283 field is expected to be closed with a quotation character. When the
2284 parsed line is exhausted before the quote is found, that field is not
2285 terminated.
2286
2287 · 2030 "EIF - NL char inside unquoted verbatim, binary off"
2288
2289 · 2031 "EIF - CR char is first char of field, not part of EOL"
2290
2291 · 2032 "EIF - CR char inside unquoted, not part of EOL"
2292
2293 · 2034 "EIF - Loose unescaped quote"
2294
2295 · 2035 "EIF - Escaped EOF in unquoted field"
2296
2297 · 2036 "EIF - ESC error"
2298
2299 · 2037 "EIF - Binary character in unquoted field, binary off"
2300
2301 · 2110 "ECB - Binary character in Combine, binary off"
2302
2303 · 2200 "EIO - print to IO failed. See errno"
2304
2305 · 3001 "EHR - Unsupported syntax for column_names ()"
2306
2307 · 3002 "EHR - getline_hr () called before column_names ()"
2308
2309 · 3003 "EHR - bind_columns () and column_names () fields count
2310 mismatch"
2311
2312 · 3004 "EHR - bind_columns () only accepts refs to scalars"
2313
2314 · 3006 "EHR - bind_columns () did not pass enough refs for parsed
2315 fields"
2316
2317 · 3007 "EHR - bind_columns needs refs to writable scalars"
2318
2319 · 3008 "EHR - unexpected error in bound fields"
2320
2321 · 3009 "EHR - print_hr () called before column_names ()"
2322
2323 · 3010 "EHR - print_hr () called with invalid arguments"
2324
2326 Text::CSV_XS, Text::CSV
2327
2328 Older versions took many regexp from
2329 <http://www.din.or.jp/~ohzaki/perl.htm>
2330
2332 Kenichi Ishigaki, <ishigaki[at]cpan.org> Makamaka Hannyaharamitu,
2333 <makamaka[at]cpan.org>
2334
2335 Text::CSV_XS was written by <joe[at]ispsoft.de> and maintained by
2336 <h.m.brand[at]xs4all.nl>.
2337
2338 Text::CSV was written by <alan[at]mfgrtl.com>.
2339
2341 Copyright 2017- by Kenichi Ishigaki, <ishigaki[at]cpan.org> Copyright
2342 2005-2015 by Makamaka Hannyaharamitu, <makamaka[at]cpan.org>
2343
2344 Most of the code and doc is directly taken from the pure perl part of
2345 Text::CSV_XS.
2346
2347 Copyright (C) 2007-2016 H.Merijn Brand. All rights reserved.
2348 Copyright (C) 1998-2001 Jochen Wiedmann. All rights reserved.
2349 Copyright (C) 1997 Alan Citterman. All rights reserved.
2350
2351 This library is free software; you can redistribute it and/or modify it
2352 under the same terms as Perl itself.
2353
2354
2355
2356perl v5.28.0 2018-08-17 Text::CSV_PP(3)