Text::CSV_PP(3)       User Contributed Perl Documentation      Text::CSV_PP(3)

NAME

6       Text::CSV_PP - Text::CSV_XS compatible pure-Perl module
7

SYNOPSIS

9        use Text::CSV_PP;
10
11        $csv = Text::CSV_PP->new();     # create a new object
12        # If you want to handle non-ascii char.
13        $csv = Text::CSV_PP->new({binary => 1});
14
15        $status = $csv->combine(@columns);    # combine columns into a string
16        $line   = $csv->string();             # get the combined string
17
18        $status  = $csv->parse($line);        # parse a CSV string into fields
19        @columns = $csv->fields();            # get the parsed fields
20
21        $status       = $csv->status ();      # get the most recent status
22        $bad_argument = $csv->error_input (); # get the most recent bad argument
23        $diag         = $csv->error_diag ();  # if an error occurred, explains WHY
24
25        $status = $csv->print ($io, $colref); # Write an array of fields
26                                              # immediately to a file $io
27        $colref = $csv->getline ($io);        # Read a line from file $io,
28                                              # parse it and return an array
29                                              # ref of fields
30        $csv->column_names (@names);          # Set column names for getline_hr ()
31        $ref = $csv->getline_hr ($io);        # getline (), but returns a hashref
32        $eof = $csv->eof ();                  # Indicate if last parse or
33                                              # getline () hit End Of File
34
35        $csv->types(\@t_array);               # Set column types
36

DESCRIPTION

       Text::CSV_PP is a pure-Perl module that provides facilities for the
       composition and decomposition of comma-separated values. It is
       (almost) compatible with the much faster Text::CSV_XS, and is mainly
       used as the fallback when you use the Text::CSV module without having
       installed Text::CSV_XS. Unless you have a specific reason to use this
       module directly, use Text::CSV for speed and portability (or
       Text::CSV_XS when you write a one-off script and don't need to care
       about portability).
46
47       The following caveats are taken from the doc of Text::CSV_XS.
48
49   Embedded newlines
50       Important Note:  The default behavior is to accept only ASCII
51       characters in the range from 0x20 (space) to 0x7E (tilde).   This means
52       that the fields can not contain newlines. If your data contains
53       newlines embedded in fields, or characters above 0x7E (tilde), or
54       binary data, you must set "binary => 1" in the call to "new". To cover
55       the widest range of parsing options, you will always want to set
56       binary.
57
       You still have to pass a complete record to the "parse" method,
       which is harder than it looks with the usual line-by-line reading
       idiom:
61
62        my $csv = Text::CSV_PP->new ({ binary => 1, eol => $/ });
63        while (<>) {           #  WRONG!
64            $csv->parse ($_);
65            my @fields = $csv->fields ();
66            }
67
       This will break, as the "while" might read broken lines: it does not
       care about the quoting. If you need to support embedded newlines, the
       way to go is to not pass "eol" to the parser (it accepts "\n", "\r",
       and "\r\n" by default) and then
72
73        my $csv = Text::CSV_PP->new ({ binary => 1 });
74        open my $fh, "<", $file or die "$file: $!";
75        while (my $row = $csv->getline ($fh)) {
76            my @fields = @$row;
77            }
78
79       The old(er) way of using global file handles is still supported
80
81        while (my $row = $csv->getline (*ARGV)) { ... }
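
       As a minimal sketch of the "getline" approach above, the following
       reads from an in-memory file handle (an assumption made only to keep
       the example self-contained; the data and variable names are
       illustrative). It shows that a quoted field spanning two physical
       lines still comes back as one field of one record.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# In-memory "file": the second record has a quoted field with a newline in it.
my $data = qq{id,note\n1,"first line\nsecond line"\n};
open my $fh, "<", \$data or die "in-memory handle: $!";

# No eol is passed, so the parser accepts "\n", "\r", and "\r\n".
my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });

my @rows;
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;
}
close $fh;

# The data holds three physical lines but only two CSV records.
printf "%d records read\n", scalar @rows;
```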
82
83   Unicode
84       Unicode is only tested to work with perl-5.8.2 and up.
85
86       See also "BOM".
87
88       The simplest way to ensure the correct encoding is used for  in- and
89       output is by either setting layers on the filehandles, or setting the
90       "encoding" argument for "csv".
91
92        open my $fh, "<:encoding(UTF-8)", "in.csv"  or die "in.csv: $!";
93       or
94        my $aoa = csv (in => "in.csv",     encoding => "UTF-8");
95
96        open my $fh, ">:encoding(UTF-8)", "out.csv" or die "out.csv: $!";
97       or
98        csv (in => $aoa, out => "out.csv", encoding => "UTF-8");
99
       On parsing (both for "getline" and "parse"), if the source is marked
       as being UTF8, then all fields that are marked binary will also be
       marked UTF8.

       On combining ("print" and "combine"): if any of the combined fields
       was marked UTF8, the resulting string will be marked as UTF8. Note
       however that any fields before the first field marked UTF8 that
       contain 8-bit characters and were not upgraded to UTF8 will be
       "bytes" in the resulting string too, possibly causing unexpected
       errors. If you pass data of different encodings, or you don't know
       whether the encodings differ, force the data to be upgraded before
       you pass it on:
112
113        $csv->print ($fh, [ map { utf8::upgrade (my $x = $_); $x } @data ]);
114
115       For complete control over encoding, please use Text::CSV::Encoded:
116
117        use Text::CSV::Encoded;
118        my $csv = Text::CSV::Encoded->new ({
119            encoding_in  => "iso-8859-1", # the encoding comes into   Perl
120            encoding_out => "cp1252",     # the encoding comes out of Perl
121            });
122
123        $csv = Text::CSV::Encoded->new ({ encoding  => "utf8" });
124        # combine () and print () accept *literally* utf8 encoded data
125        # parse () and getline () return *literally* utf8 encoded data
126
127        $csv = Text::CSV::Encoded->new ({ encoding  => undef }); # default
128        # combine () and print () accept UTF8 marked data
129        # parse () and getline () return UTF8 marked data
130
131   BOM
132       BOM  (or Byte Order Mark)  handling is available only inside the
133       "header" method.   This method supports the following encodings:
134       "utf-8", "utf-1", "utf-32be", "utf-32le", "utf-16be", "utf-16le",
135       "utf-ebcdic", "scsu", "bocu-1", and "gb-18030". See Wikipedia
136       <https://en.wikipedia.org/wiki/Byte_order_mark>.
137
138       If a file has a BOM, the easiest way to deal with that is
139
140        my $aoh = csv (in => $file, detect_bom => 1);
141
142       All records will be encoded based on the detected BOM.
143
144       This implies a call to the  "header"  method,  which defaults to also
145       set the "column_names". So this is not the same as
146
147        my $aoh = csv (in => $file, headers => "auto");
148
       which only reads the first record to set "column_names" but ignores
       the meaning of any BOM that may be present.
151

METHODS

153       This section is taken from Text::CSV_XS.
154
155   version
156       (Class method) Returns the current module version.
157
158   new
159       (Class method) Returns a new instance of class Text::CSV_PP. The
160       attributes are described by the (optional) hash ref "\%attr".
161
162        my $csv = Text::CSV_PP->new ({ attributes ... });
163
164       The following attributes are available:
165
166       eol
167
168        my $csv = Text::CSV_PP->new ({ eol => $/ });
169                  $csv->eol (undef);
170        my $eol = $csv->eol;
171
172       The end-of-line string to add to rows for "print" or the record
173       separator for "getline".
174
       When not passed in a parser instance, the default behavior is to
       accept "\n", "\r", and "\r\n", so it is probably safer to not specify
       "eol" at all. Passing "undef" or the empty string has the same effect.
178
179       When not passed in a generating instance,  records are not terminated
180       at all, so it is probably wise to pass something you expect. A safe
181       choice for "eol" on output is either $/ or "\r\n".
182
183       Common values for "eol" are "\012" ("\n" or Line Feed),  "\015\012"
184       ("\r\n" or Carriage Return, Line Feed),  and "\015"  ("\r" or Carriage
185       Return). The "eol" attribute cannot exceed 7 (ASCII) characters.
186
       If both $/ and "eol" equal "\015", lines that end in only a Carriage
       Return without a Line Feed will be parsed correctly.
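
       The effect of "eol" on generated records can be sketched as follows.
       This is only an illustration; the two object names are made up for
       the example. Without "eol" the combined string is not terminated,
       with "eol" the terminator is appended to it.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# No eol: the generated record is not terminated at all.
my $bare = Text::CSV_PP->new ({ binary => 1 });
$bare->combine ("a", "b") or die "combine failed";
my $without_eol = $bare->string;

# Explicit eol: every generated record ends in CR LF.
my $crlf = Text::CSV_PP->new ({ binary => 1, eol => "\r\n" });
$crlf->combine ("a", "b") or die "combine failed";
my $with_eol = $crlf->string;

printf "%d bytes without eol, %d bytes with eol\n",
    length $without_eol, length $with_eol;
```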
189
190       sep_char
191
192        my $csv = Text::CSV_PP->new ({ sep_char => ";" });
193                $csv->sep_char (";");
194        my $c = $csv->sep_char;
195
       The character used to separate fields, by default a comma (",").
       Limited to a single-byte character, usually in the range from 0x20
       (space) to 0x7E (tilde). When longer sequences are required, use "sep".
199
200       The separation character can not be equal to the quote character  or to
201       the escape character.
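
       A short sketch of parsing semicolon-separated data (the input line
       is invented for the example). Once "sep_char" is ";", a comma is an
       ordinary character and needs no quoting.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ binary => 1, sep_char => ";" });

# The comma in the second field is plain data here.
$csv->parse ("1;two, with comma;3") or die "" . $csv->error_diag ();
my @fields = $csv->fields ();

print join ("|", @fields), "\n";
```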
202
203       sep
204
205        my $csv = Text::CSV_PP->new ({ sep => "\N{FULLWIDTH COMMA}" });
206                  $csv->sep (";");
207        my $sep = $csv->sep;
208
209       The chars used to separate fields, by default undefined. Limited to 8
210       bytes.
211
212       When set, overrules "sep_char".  If its length is one byte it acts as
213       an alias to "sep_char".
214
215       quote_char
216
217        my $csv = Text::CSV_PP->new ({ quote_char => "'" });
218                $csv->quote_char (undef);
219        my $c = $csv->quote_char;
220
221       The character to quote fields containing blanks or binary data,  by
222       default the double quote character (""").  A value of undef suppresses
223       quote chars (for simple cases only). Limited to a single-byte
224       character, usually in the range from  0x20 (space) to  0x7E (tilde).
225       When longer sequences are required, use "quote".
226
227       "quote_char" can not be equal to "sep_char".
228
229       quote
230
231        my $csv = Text::CSV_PP->new ({ quote => "\N{FULLWIDTH QUOTATION MARK}" });
232                    $csv->quote ("'");
233        my $quote = $csv->quote;
234
235       The chars used to quote fields, by default undefined. Limited to 8
236       bytes.
237
238       When set, overrules "quote_char". If its length is one byte it acts as
239       an alias to "quote_char".
240
241       escape_char
242
243        my $csv = Text::CSV_PP->new ({ escape_char => "\\" });
244                $csv->escape_char (":");
245        my $c = $csv->escape_char;
246
247       The character to  escape  certain characters inside quoted fields.
248       This is limited to a  single-byte  character,  usually  in the  range
249       from  0x20 (space) to 0x7E (tilde).
250
251       The "escape_char" defaults to being the double-quote mark ("""). In
252       other words the same as the default "quote_char". This means that
253       doubling the quote mark in a field escapes it:
254
255        "foo","bar","Escape ""quote mark"" with two ""quote marks""","baz"
256
257       If  you  change  the   "quote_char"  without  changing  the
258       "escape_char",  the  "escape_char" will still be the double-quote
259       (""").  If instead you want to escape the  "quote_char" by doubling it
260       you will need to also change the  "escape_char"  to be the same as what
261       you have changed the "quote_char" to.
262
       Setting "escape_char" to "undef" or "" will disable escaping completely
       and is greatly discouraged. This will also disable "escape_null".
265
266       The escape character can not be equal to the separation character.
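
       The difference between the default doubling style and a backslash
       "escape_char" can be sketched like this. The field content and
       variable names are illustrative only.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $field = 'say "hi"';

# Default: escape_char equals quote_char, so embedded quotes are doubled.
my $doubling = Text::CSV_PP->new ({ binary => 1 });
$doubling->combine ($field) or die "combine failed";
my $doubled_style = $doubling->string;

# Backslash as escape_char: embedded quotes are prefixed with a backslash.
my $backslash = Text::CSV_PP->new ({ binary => 1, escape_char => "\\" });
$backslash->combine ($field) or die "combine failed";
my $backslash_style = $backslash->string;

# Parsing with the same settings recovers the original field.
$backslash->parse ($backslash_style) or die "" . $backslash->error_diag ();
my ($round_trip) = $backslash->fields ();

print "$doubled_style\n$backslash_style\n";
```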
267
268       binary
269
270        my $csv = Text::CSV_PP->new ({ binary => 1 });
271                $csv->binary (0);
272        my $f = $csv->binary;
273
274       If this attribute is 1,  you may use binary characters in quoted
275       fields, including line feeds, carriage returns and "NULL" bytes. (The
276       latter could be escaped as ""0".) By default this feature is off.
277
278       If a string is marked UTF8,  "binary" will be turned on automatically
279       when binary characters other than "CR" and "NL" are encountered.   Note
280       that a simple string like "\x{00a0}" might still be binary, but not
281       marked UTF8, so setting "{ binary => 1 }" is still a wise option.
282
283       strict
284
285        my $csv = Text::CSV_PP->new ({ strict => 1 });
286                $csv->strict (0);
287        my $f = $csv->strict;
288
289       If this attribute is set to 1, any row that parses to a different
290       number of fields than the previous row will cause the parser to throw
291       error 2014.
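
       A sketch of "strict" in action, assuming an in-memory file handle
       with made-up data: the first record has three fields, the second
       only two, so reading the second record fails with error 2014.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $data = "a,b,c\n1,2\n";
open my $fh, "<", \$data or die "in-memory handle: $!";

my $csv = Text::CSV_PP->new ({ binary => 1, strict => 1 });

my $first  = $csv->getline ($fh);   # three fields: accepted
my $second = $csv->getline ($fh);   # two fields: rejected

# In list context error_diag returns the error code first.
my ($code, $message) = $csv->error_diag ();
close $fh;

print "second record rejected with error $code\n" unless $second;
```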
292
293       formula_handling
294
295       formula
296
297        my $csv = Text::CSV_PP->new ({ formula => "none" });
298                $csv->formula ("none");
299        my $f = $csv->formula;
300
301       This defines the behavior of fields containing formulas. As formulas
302       are considered dangerous in spreadsheets, this attribute can define an
303       optional action to be taken if a field starts with an equal sign ("=").
304
305       For purpose of code-readability, this can also be written as
306
307        my $csv = Text::CSV_PP->new ({ formula_handling => "none" });
308                $csv->formula_handling ("none");
309        my $f = $csv->formula_handling;
310
311       Possible values for this attribute are
312
313       none
314         Take no specific action. This is the default.
315
316          $csv->formula ("none");
317
318       die
319         Cause the process to "die" whenever a leading "=" is encountered.
320
321          $csv->formula ("die");
322
323       croak
324         Cause the process to "croak" whenever a leading "=" is encountered.
325         (See Carp)
326
327          $csv->formula ("croak");
328
329       diag
330         Report position and content of the field whenever a leading  "=" is
331         found.  The value of the field is unchanged.
332
333          $csv->formula ("diag");
334
335       empty
336         Replace the content of fields that start with a "=" with the empty
337         string.
338
339          $csv->formula ("empty");
340          $csv->formula ("");
341
342       undef
343         Replace the content of fields that start with a "=" with "undef".
344
345          $csv->formula ("undef");
346          $csv->formula (undef);
347
       All other values will give a warning and then fall back to "diag".
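
       As an illustration of one of the actions above, the following
       sketch parses an invented line with "formula" set to "empty", so
       that the field starting with "=" comes back as the empty string.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ binary => 1, formula => "empty" });

# The second field starts with "=" and is therefore treated as a formula.
$csv->parse ("1,=SUM(A1:A3),x") or die "" . $csv->error_diag ();
my @fields = $csv->fields ();

printf "formula field is now '%s'\n", $fields[1];
```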
349
350       decode_utf8
351
352        my $csv = Text::CSV_PP->new ({ decode_utf8 => 1 });
353                $csv->decode_utf8 (0);
354        my $f = $csv->decode_utf8;
355
       This attribute defaults to TRUE.
357
358       While parsing,  fields that are valid UTF-8, are automatically set to
359       be UTF-8, so that
360
361         $csv->parse ("\xC4\xA8\n");
362
363       results in
364
365         PV("\304\250"\0) [UTF8 "\x{128}"]
366
367       Sometimes it might not be a desired action.  To prevent those upgrades,
368       set this attribute to false, and the result will be
369
370         PV("\304\250"\0)
371
372       auto_diag
373
374        my $csv = Text::CSV_PP->new ({ auto_diag => 1 });
375                $csv->auto_diag (2);
376        my $l = $csv->auto_diag;
377
       Setting this attribute to a number between 1 and 9 causes "error_diag"
       to be automatically called in void context upon errors.
380
381       In case of error "2012 - EOF", this call will be void.
382
383       If "auto_diag" is set to a numeric value greater than 1, it will "die"
384       on errors instead of "warn".  If set to anything unrecognized,  it will
385       be silently ignored.
386
       Future extensions to this feature will include more reliable
       auto-detection of "autodie" being active in the scope in which the
       error occurred, which will increment the value of "auto_diag" by 1
       the moment the error is detected.
391
392       diag_verbose
393
394        my $csv = Text::CSV_PP->new ({ diag_verbose => 1 });
395                $csv->diag_verbose (2);
396        my $l = $csv->diag_verbose;
397
398       Set the verbosity of the output triggered by "auto_diag".   Currently
399       only adds the current  input-record-number  (if known)  to the
400       diagnostic output with an indication of the position of the error.
401
402       blank_is_undef
403
404        my $csv = Text::CSV_PP->new ({ blank_is_undef => 1 });
405                $csv->blank_is_undef (0);
406        my $f = $csv->blank_is_undef;
407
408       Under normal circumstances, "CSV" data makes no distinction between
409       quoted- and unquoted empty fields.  These both end up in an empty
410       string field once read, thus
411
412        1,"",," ",2
413
414       is read as
415
416        ("1", "", "", " ", "2")
417
418       When writing  "CSV" files with either  "always_quote" or  "quote_empty"
419       set, the unquoted  empty field is the result of an undefined value.
420       To enable this distinction when  reading "CSV"  data,  the
421       "blank_is_undef"  attribute will cause  unquoted empty fields to be set
422       to "undef", causing the above to be parsed as
423
424        ("1", "", undef, " ", "2")
425
426       note that this is specifically important when loading  "CSV" fields
427       into a database that allows "NULL" values,  as the perl equivalent for
428       "NULL" is "undef" in DBI land.
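
       The distinction described above can be checked with a small sketch
       that parses the same line twice, once with default settings and
       once with "blank_is_undef" (object and variable names are
       illustrative).

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $line = q{1,"",," ",2};

# Default: quoted and unquoted empty fields both become "".
my $plain = Text::CSV_PP->new ({ binary => 1 });
$plain->parse ($line) or die "" . $plain->error_diag ();
my @plain_fields = $plain->fields ();

# blank_is_undef: only the unquoted empty field becomes undef.
my $blank = Text::CSV_PP->new ({ binary => 1, blank_is_undef => 1 });
$blank->parse ($line) or die "" . $blank->error_diag ();
my @blank_fields = $blank->fields ();

print join (",", map { defined $_ ? "'$_'" : "undef" } @blank_fields), "\n";
```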
429
430       empty_is_undef
431
432        my $csv = Text::CSV_PP->new ({ empty_is_undef => 1 });
433                $csv->empty_is_undef (0);
434        my $f = $csv->empty_is_undef;
435
436       Going one  step  further  than  "blank_is_undef",  this attribute
437       converts all empty fields to "undef", so
438
439        1,"",," ",2
440
441       is read as
442
443        (1, undef, undef, " ", 2)
444
       Note that this affects only fields that are originally empty, not
       fields that are empty after stripping allowed whitespace. YMMV.
447
448       allow_whitespace
449
450        my $csv = Text::CSV_PP->new ({ allow_whitespace => 1 });
451                $csv->allow_whitespace (0);
452        my $f = $csv->allow_whitespace;
453
454       When this option is set to true,  the whitespace  ("TAB"'s and
455       "SPACE"'s) surrounding  the  separation character  is removed when
456       parsing.  If either "TAB" or "SPACE" is one of the three characters
457       "sep_char", "quote_char", or "escape_char" it will not be considered
458       whitespace.
459
460       Now lines like:
461
462        1 , "foo" , bar , 3 , zapp
463
464       are parsed as valid "CSV", even though it violates the "CSV" specs.
465
466       Note that  all  whitespace is stripped from both  start and  end of
467       each field.  That would make it  more than a feature to enable parsing
468       bad "CSV" lines, as
469
470        1,   2.0,  3,   ape  , monkey
471
472       will now be parsed as
473
474        ("1", "2.0", "3", "ape", "monkey")
475
476       even if the original line was perfectly acceptable "CSV".
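
       A minimal sketch of the first example line above being parsed with
       "allow_whitespace" enabled:

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ allow_whitespace => 1 });

# Whitespace around the separators is stripped before the fields are stored.
$csv->parse (q{1 , "foo" , bar , 3 , zapp}) or die "" . $csv->error_diag ();
my @fields = $csv->fields ();

print join ("|", @fields), "\n";
```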
477
478       allow_loose_quotes
479
480        my $csv = Text::CSV_PP->new ({ allow_loose_quotes => 1 });
481                $csv->allow_loose_quotes (0);
482        my $f = $csv->allow_loose_quotes;
483
484       By default, parsing unquoted fields containing "quote_char" characters
485       like
486
487        1,foo "bar" baz,42
488
489       would result in parse error 2034.  Though it is still bad practice to
490       allow this format,  we  cannot  help  the  fact  that  some  vendors
491       make  their applications spit out lines styled this way.
492
493       If there is really bad "CSV" data, like
494
495        1,"foo "bar" baz",42
496
497       or
498
499        1,""foo bar baz"",42
500
501       there is a way to get this data-line parsed and leave the quotes inside
502       the quoted field as-is.  This can be achieved by setting
503       "allow_loose_quotes" AND making sure that the "escape_char" is  not
504       equal to "quote_char".
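
       The first case above (a quote character inside an unquoted field)
       can be sketched as follows: the default parser rejects the line
       with error 2034, while a parser with "allow_loose_quotes" keeps the
       quotes as data. The object names are illustrative.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $line = q{1,foo "bar" baz,42};

# Default settings: the line is rejected.
my $default    = Text::CSV_PP->new ();
my $default_ok = $default->parse ($line);
my ($code)     = $default->error_diag ();

# allow_loose_quotes: the quotes stay part of the unquoted field.
my $loose = Text::CSV_PP->new ({ allow_loose_quotes => 1 });
$loose->parse ($line) or die "" . $loose->error_diag ();
my @fields = $loose->fields ();

print "default parser failed with $code; loose parser got: $fields[1]\n";
```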
505
506       allow_loose_escapes
507
508        my $csv = Text::CSV_PP->new ({ allow_loose_escapes => 1 });
509                $csv->allow_loose_escapes (0);
510        my $f = $csv->allow_loose_escapes;
511
512       Parsing fields  that  have  "escape_char"  characters that escape
513       characters that do not need to be escaped, like:
514
515        my $csv = Text::CSV_PP->new ({ escape_char => "\\" });
516        $csv->parse (qq{1,"my bar\'s",baz,42});
517
518       would result in parse error 2025.   Though it is bad practice to allow
519       this format,  this attribute enables you to treat all escape character
520       sequences equal.
521
522       allow_unquoted_escape
523
524        my $csv = Text::CSV_PP->new ({ allow_unquoted_escape => 1 });
525                $csv->allow_unquoted_escape (0);
526        my $f = $csv->allow_unquoted_escape;
527
528       A backward compatibility issue where "escape_char" differs from
529       "quote_char"  prevents  "escape_char" to be in the first position of a
530       field.  If "quote_char" is equal to the default """ and "escape_char"
531       is set to "\", this would be illegal:
532
533        1,\0,2
534
535       Setting this attribute to 1  might help to overcome issues with
536       backward compatibility and allow this style.
537
538       always_quote
539
540        my $csv = Text::CSV_PP->new ({ always_quote => 1 });
541                $csv->always_quote (0);
542        my $f = $csv->always_quote;
543
544       By default the generated fields are quoted only if they need to be.
545       For example, if they contain the separator character. If you set this
546       attribute to 1 then all defined fields will be quoted. ("undef" fields
547       are not quoted, see "blank_is_undef"). This makes it quite often easier
548       to handle exported data in external applications.
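
       To make the difference concrete, this sketch combines the same
       invented record with and without "always_quote". Note how the
       "undef" field stays unquoted in both cases.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my @record = (1, "a b", undef, "");

# Default: only the field containing a space needs quotes.
my $minimal = Text::CSV_PP->new ({ binary => 1 });
$minimal->combine (@record) or die "combine failed";
my $minimal_line = $minimal->string;

# always_quote: every defined field is quoted, undef is left bare.
my $always = Text::CSV_PP->new ({ binary => 1, always_quote => 1 });
$always->combine (@record) or die "combine failed";
my $always_line = $always->string;

print "$minimal_line\n$always_line\n";
```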
549
550       quote_space
551
552        my $csv = Text::CSV_PP->new ({ quote_space => 1 });
553                $csv->quote_space (0);
554        my $f = $csv->quote_space;
555
       By default, a space in a field triggers quotation. As no rule in
       "CSV" requires this, nor forbids it, the default is true for safety.
       You can exclude the space from this trigger by setting this attribute
       to 0.
560
561       quote_empty
562
563        my $csv = Text::CSV_PP->new ({ quote_empty => 1 });
564                $csv->quote_empty (0);
565        my $f = $csv->quote_empty;
566
567       By default the generated fields are quoted only if they need to be.
568       An empty (defined) field does not need quotation. If you set this
569       attribute to 1 then empty defined fields will be quoted.  ("undef"
570       fields are not quoted, see "blank_is_undef"). See also "always_quote".
571
572       quote_binary
573
574        my $csv = Text::CSV_PP->new ({ quote_binary => 1 });
575                $csv->quote_binary (0);
576        my $f = $csv->quote_binary;
577
578       By default,  all "unsafe" bytes inside a string cause the combined
579       field to be quoted.  By setting this attribute to 0, you can disable
580       that trigger for bytes >= 0x7F.
581
582       escape_null
583
584        my $csv = Text::CSV_PP->new ({ escape_null => 1 });
585                $csv->escape_null (0);
586        my $f = $csv->escape_null;
587
588       By default, a "NULL" byte in a field would be escaped. This option
589       enables you to treat the  "NULL"  byte as a simple binary character in
590       binary mode (the "{ binary => 1 }" is set).  The default is true.  You
591       can prevent "NULL" escapes by setting this attribute to 0.
592
593       When the "escape_char" attribute is set to undefined,  this attribute
594       will be set to false.
595
596       The default setting will encode "=\x00=" as
597
598        "="0="
599
       With "escape_null" set to 0, this will result in
601
602        "=\x00="
603
604       The default when using the "csv" function is "false".
605
606       For backward compatibility reasons,  the deprecated old name
607       "quote_null" is still recognized.
608
609       keep_meta_info
610
611        my $csv = Text::CSV_PP->new ({ keep_meta_info => 1 });
612                $csv->keep_meta_info (0);
613        my $f = $csv->keep_meta_info;
614
615       By default, the parsing of input records is as simple and fast as
616       possible.  However,  some parsing information - like quotation of the
617       original field - is lost in that process.  Setting this flag to true
618       enables retrieving that information after parsing with  the methods
619       "meta_info",  "is_quoted", and "is_binary" described below.  Default is
620       false for performance.
621
       If you set this attribute to a value greater than 9, then you can
       control the output quotation style so that it matches the input of
       the last parsed record (unless quotation was added for other
       reasons).
626
627        my $csv = Text::CSV_PP->new ({
628           binary         => 1,
629           keep_meta_info => 1,
630           quote_space    => 0,
631           });
632
        $csv->parse (q{1,,"", ," ",f,"g","h""h",help,"help"});
        my @row = $csv->fields ();
634
635        $csv->print (*STDOUT, \@row);
636        # 1,,, , ,f,g,"h""h",help,help
637        $csv->keep_meta_info (11);
638        $csv->print (*STDOUT, \@row);
639        # 1,,"", ," ",f,"g","h""h",help,"help"
640
641       undef_str
642
643        my $csv = Text::CSV_PP->new ({ undef_str => "\\N" });
644                $csv->undef_str (undef);
645        my $s = $csv->undef_str;
646
647       This attribute optionally defines the output of undefined fields. The
648       value passed is not changed at all, so if it needs quotation, the
649       quotation needs to be included in the value of the attribute.  Use with
650       caution, as passing a value like  ",",,,,"""  will for sure mess up
651       your output. The default for this attribute is "undef", meaning no
652       special treatment.
653
654       This attribute is useful when exporting  CSV data  to be imported in
655       custom loaders, like for MySQL, that recognize special sequences for
656       "NULL" data.
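
       A short sketch of "undef_str" when generating data for a loader
       that reads "\N" as "NULL" (the record is invented for the example):

```perl
use strict;
use warnings;
use Text::CSV_PP;

# "\\N" in Perl source is the two characters backslash and N.
my $csv = Text::CSV_PP->new ({ binary => 1, undef_str => "\\N" });

$csv->combine (1, undef, "x") or die "combine failed";
my $line = $csv->string;

print "$line\n";
```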
657
658       verbatim
659
660        my $csv = Text::CSV_PP->new ({ verbatim => 1 });
661                $csv->verbatim (0);
662        my $f = $csv->verbatim;
663
664       This is a quite controversial attribute to set,  but makes some hard
665       things possible.
666
667       The rationale behind this attribute is to tell the parser that the
668       normally special characters newline ("NL") and Carriage Return ("CR")
669       will not be special when this flag is set,  and be dealt with  as being
670       ordinary binary characters. This will ease working with data with
671       embedded newlines.
672
673       When  "verbatim"  is used with  "getline",  "getline"  auto-"chomp"'s
674       every line.
675
676       Imagine a file format like
677
678        M^^Hans^Janssen^Klas 2\n2A^Ja^11-06-2007#\r\n
679
       where the line ending is a very specific "#\r\n", and the sep_char is
       a "^" (caret). None of the fields is quoted, but embedded binary
       data is likely to be present. With the specific line ending, this
       should not be too hard to detect.

       By default, the "parse" function of Text::CSV_PP only knows "\n" and
       "\r" as legal line endings, and so has to treat the embedded newline
       as a real end-of-line, so that it can scan the next line if binary is
       true and the newline is inside a quoted field. With this option, we
       tell "parse" to parse the line as if "\n" is nothing more than a
       binary character.

       For "parse" this means that the parser no longer has any idea about
       line endings, and "getline" "chomp"s line endings on reading.
694
695       types
696
697       A set of column types; the attribute is immediately passed to the
698       "types" method.
699
700       callbacks
701
702       See the "Callbacks" section below.
703
704       accessors
705
706       To sum it up,
707
708        $csv = Text::CSV_PP->new ();
709
710       is equivalent to
711
712        $csv = Text::CSV_PP->new ({
713            eol                   => undef, # \r, \n, or \r\n
714            sep_char              => ',',
715            sep                   => undef,
716            quote_char            => '"',
717            quote                 => undef,
718            escape_char           => '"',
719            binary                => 0,
720            decode_utf8           => 1,
721            auto_diag             => 0,
722            diag_verbose          => 0,
723            blank_is_undef        => 0,
724            empty_is_undef        => 0,
725            allow_whitespace      => 0,
726            allow_loose_quotes    => 0,
727            allow_loose_escapes   => 0,
728            allow_unquoted_escape => 0,
729            always_quote          => 0,
730            quote_empty           => 0,
731            quote_space           => 1,
732            escape_null           => 1,
733            quote_binary          => 1,
734            keep_meta_info        => 0,
735            verbatim              => 0,
736            undef_str             => undef,
737            types                 => undef,
738            callbacks             => undef,
739            });
740
       For each of the flags mentioned above, an accessor method is
       available with which you can query or change the current value:
743
744        my $quote = $csv->quote_char;
745        $csv->binary (1);
746
747       It is not wise to change these settings halfway through writing "CSV"
748       data to a stream. If however you want to create a new stream using the
749       available "CSV" object, there is no harm in changing them.
750
       If the "new" constructor call fails, it returns "undef" and makes
       the reason for the failure available through the "error_diag" method.
753
754        $csv = Text::CSV_PP->new ({ ecs_char => 1 }) or
755            die "".Text::CSV_PP->error_diag ();
756
757       "error_diag" will return a string like
758
759        "INI - Unknown attribute 'ecs_char'"
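
       The same check can be written without "die" when the caller wants
       to report the problem itself. This is only a sketch; the variable
       names are illustrative and "ecs_char" is the deliberately
       misspelled attribute from the example above.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# "ecs_char" is not a known attribute, so the constructor returns undef.
my $csv = Text::CSV_PP->new ({ ecs_char => 1 });

# As a class method, error_diag reports why the last constructor call failed.
my $reason = defined $csv ? "" : "" . Text::CSV_PP->error_diag ();

print "constructor failed: $reason\n" unless defined $csv;
```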
760
761   known_attributes
762        @attr = Text::CSV_PP->known_attributes;
763        @attr = Text::CSV_PP::known_attributes;
764        @attr = $csv->known_attributes;
765
766       This method will return an ordered list of all the supported
767       attributes as described above.   This can be useful for knowing what
768       attributes are valid in classes that use or extend Text::CSV_PP.
769
770   print
771        $status = $csv->print ($fh, $colref);
772
773       Similar to  "combine" + "string" + "print",  but much more efficient.
774       It expects an array ref as input  (not an array!)  and the resulting
775       string is not really  created,  but  immediately  written  to the  $fh
776       object, typically an IO handle or any other object that offers a
777       "print" method.
778
779       For performance reasons  "print"  does not create a result string,  so
780       all "string", "status", "fields", and "error_input" methods will return
781       undefined information after executing this method.
782
783       If $colref is "undef"  (explicit,  not through a variable argument) and
784       "bind_columns"  was used to specify fields to be printed,  it is
785       possible to make performance improvements, as otherwise data would have
786       to be copied as arguments to the method call:
787
788        $csv->bind_columns (\($foo, $bar));
789        $status = $csv->print ($fh, undef);
790
791       A short benchmark
792
793        my @data = ("aa" .. "zz");
794        $csv->bind_columns (\(@data));
795
796        $csv->print ($fh, [ @data ]);   # 11800 recs/sec
797        $csv->print ($fh,  \@data  );   # 57600 recs/sec
798        $csv->print ($fh,   undef  );   # 48500 recs/sec
799
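       A small sketch of both call styles,  assuming an in-memory file handle
       as the target and "eol" set to a newline so each record ends a line.
       The variable names are illustrative only.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ binary => 1, eol => "\n" })
    or die "" . Text::CSV_PP->error_diag ();

# Write into a scalar instead of a real file.
open my $fh, ">", \my $buffer or die "Cannot open in-memory handle: $!";

# Explicit array ref: fields are quoted only where needed.
$csv->print ($fh, [ "x", "y,z" ]) or die "print failed";

# Bound columns: pass an explicit undef so the bound scalars are used.
my ($foo, $bar) = ("foo", "bar");
$csv->bind_columns (\($foo, $bar));
$csv->print ($fh, undef) or die "print failed";

close $fh;
print $buffer;
```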
800   say
801        $status = $csv->say ($fh, $colref);
802
803       Like "print", but "eol" defaults to "$\".
804
805   print_hr
806        $csv->print_hr ($fh, $ref);
807
808       Provides an easy way  to print a  $ref  (as fetched with "getline_hr")
809       provided the column names are set with "column_names".
810
811       It is just a wrapper method with basic parameter checks over
812
813        $csv->print ($fh, [ map { $ref->{$_} } $csv->column_names ]);
814
815   combine
816        $status = $csv->combine (@fields);
817
818       This method constructs a "CSV" record from  @fields,  returning success
819       or failure.   Failure can result from lack of arguments or an argument
820       that contains an invalid character.   Upon success,  "string" can be
821       called to retrieve the resultant "CSV" string.  Upon failure,  the
822       value returned by "string" is undefined and "error_input" could be
823       called to retrieve the invalid argument.
824
825   string
826        $line = $csv->string ();
827
828       This method returns the input to  "parse"  or the resultant "CSV"
829       string of "combine", whichever was called more recently.
830
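       The "combine" and "string" pair can be sketched as follows.  The field
       values are made up for illustration;  on failure the offending
       argument is fetched with "error_input".

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ binary => 1 })
    or die "" . Text::CSV_PP->error_diag ();

my @fields = ("plain", "with,comma", 'with"quote');
my $line;

if ($csv->combine (@fields)) {
    # Only valid after a successful combine.
    $line = $csv->string ();
    print "$line\n";    # plain,"with,comma","with""quote"
    }
else {
    my $bad = $csv->error_input ();
    warn "combine failed on argument: ", (defined $bad ? $bad : "undef"), "\n";
    }
```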
831   getline
832        $colref = $csv->getline ($fh);
833
834       This is the counterpart to  "print",  as "parse"  is the counterpart to
835       "combine":  it parses a row from the $fh  handle using the "getline"
836       method associated with $fh  and parses this row into an array ref.
837       The function returns this array ref, or "undef" on failure.
838       When $fh does not support "getline", you are likely to hit errors.
839
840       When fields are bound with "bind_columns" the return value is a
841       reference to an empty list.
842
843       As with "print", "string", "fields", and "status" are meaningless here.
844
845   getline_all
846        $arrayref = $csv->getline_all ($fh);
847        $arrayref = $csv->getline_all ($fh, $offset);
848        $arrayref = $csv->getline_all ($fh, $offset, $length);
849
850       This will return a reference to a list of getline ($fh) results.  In
851       this call, "keep_meta_info" is disabled.  If $offset is negative, as
852       with "splice", only the last  "abs ($offset)" records of $fh are taken
853       into consideration.
854
855       Given a CSV file with 10 lines:
856
857        lines call
858        ----- ---------------------------------------------------------
859        0..9  $csv->getline_all ($fh)         # all
860        0..9  $csv->getline_all ($fh,  0)     # all
861        8..9  $csv->getline_all ($fh,  8)     # start at 8
862        -     $csv->getline_all ($fh,  0,  0) # start at 0 first 0 rows
863        0..4  $csv->getline_all ($fh,  0,  5) # start at 0 first 5 rows
864        4..5  $csv->getline_all ($fh,  4,  2) # start at 4 first 2 rows
865        8..9  $csv->getline_all ($fh, -2)     # last 2 rows
866        6..7  $csv->getline_all ($fh, -4,  2) # first 2 of last  4 rows
867
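       The table above can be tried out with a sketch like the following.  It
       assumes a 10-line in-memory "file" whose rows are named row0 .. row9;
       the helper "slice_rows" is hypothetical and reopens the data for each
       call,  because every call consumes the handle.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# Ten single-field records: row0 .. row9
my $data = join "", map { "row$_\n" } 0 .. 9;

# Hypothetical helper: run getline_all with the given offset/length
# on a fresh handle and return the first field of each record.
sub slice_rows {
    my @args = @_;
    my $csv = Text::CSV_PP->new ({ binary => 1 })
        or die "" . Text::CSV_PP->error_diag ();
    open my $fh, "<", \$data or die "Cannot open in-memory handle: $!";
    my $rows = $csv->getline_all ($fh, @args);
    close $fh;
    return join ",", map { $_->[0] } @$rows;
    }

print slice_rows (),       "\n";   # all ten records
print slice_rows (4, 2),   "\n";   # records 4..5
print slice_rows (-2),     "\n";   # records 8..9
print slice_rows (-4, 2),  "\n";   # records 6..7
```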
868   getline_hr
869       The "getline_hr" and "column_names" methods work together  to allow you
870       to have rows returned as hashrefs.  You must call "column_names" first
871       to declare your column names.
872
873        $csv->column_names (qw( code name price description ));
874        $hr = $csv->getline_hr ($fh);
875        print "Price for $hr->{name} is $hr->{price} EUR\n";
876
877       "getline_hr" will croak if called before "column_names".
878
879       Note that  "getline_hr"  creates a hashref for every row and will be
880       much slower than the combined use of "bind_columns"  and "getline",
881       which still offers the same ease of use of a hashref inside the loop:
882
883        my @cols = @{$csv->getline ($fh)};
884        $csv->column_names (@cols);
885        while (my $row = $csv->getline_hr ($fh)) {
886            print $row->{price};
887            }
888
889       Could easily be rewritten to the much faster:
890
891        my @cols = @{$csv->getline ($fh)};
892        my $row = {};
893        $csv->bind_columns (\@{$row}{@cols});
894        while ($csv->getline ($fh)) {
895            print $row->{price};
896            }
897
898       Your mileage may vary for the size of the data and the number of rows.
899       With perl-5.14.2 the comparison for a 100_000 line file with 14 columns:
900
901                   Rate hashrefs getlines
902        hashrefs 1.00/s       --     -76%
903        getlines 4.15/s     313%       --
904
905   getline_hr_all
906        $arrayref = $csv->getline_hr_all ($fh);
907        $arrayref = $csv->getline_hr_all ($fh, $offset);
908        $arrayref = $csv->getline_hr_all ($fh, $offset, $length);
909
910       This will return a reference to a list of   getline_hr ($fh) results.
911       In this call, "keep_meta_info" is disabled.
912
913   parse
914        $status = $csv->parse ($line);
915
916       This method decomposes a  "CSV"  string into fields,  returning success
917       or failure.   Failure can result from a lack of argument  or from the
918       given "CSV" string being improperly formatted.  Upon success, "fields" can be
919       called to retrieve the decomposed fields. Upon failure calling "fields"
920       will return undefined data and  "error_input"  can be called to
921       retrieve  the invalid argument.
922
923       You may use the "types"  method for setting column types.  See the
924       description of "types" below.
925
926       The $line argument is supposed to be a simple scalar. Everything else
927       is supposed to croak and set error 1500.
928
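       A short sketch of the success and failure paths of "parse".  The input
       strings are invented for illustration;  the second one has a quoted
       field that is never closed, so parsing is expected to fail.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ binary => 1 })
    or die "" . Text::CSV_PP->error_diag ();

# Success: the quoted field keeps its embedded comma.
my @columns;
if ($csv->parse ('1,"two, with comma",3')) {
    @columns = $csv->fields ();
    print scalar @columns, " fields, second is '$columns[1]'\n";
    }

# Failure: the quote is never terminated.
my $ok = $csv->parse ('1,"unterminated');
unless ($ok) {
    my $bad = $csv->error_input ();
    print "Could not parse: $bad\n" if defined $bad;
    }
```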
929   fragment
930       This function tries to implement RFC7111  (URI Fragment Identifiers for
931       the text/csv Media Type) - http://tools.ietf.org/html/rfc7111
932
933        my $AoA = $csv->fragment ($fh, $spec);
934
935       In specifications,  "*" is used to specify the last item, a dash ("-")
936       to indicate a range.   All indices are 1-based:  the first row or
937       column has index 1. Selections can be combined with the semi-colon
938       (";").
939
940       When using this method in combination with  "column_names",  the
941       returned reference  will point to a  list of hashes  instead of a  list
942       of lists.  A disjointed  cell-based combined selection  might return
943       rows with a different number of columns, making the use of hashes
944       unpredictable.
945
946        $csv->column_names ("Name", "Age");
947        my $AoH = $csv->fragment ($fh, "col=3;8");
948
949       If the "after_parse" callback is active,  it is also called on every
950       line parsed and skipped before the fragment.
951
952       row
953          row=4
954          row=5-7
955          row=6-*
956          row=1-2;4;6-*
957
958       col
959          col=2
960          col=1-3
961          col=4-*
962          col=1-2;4;7-*
963
964       cell
965         In cell-based selection, the comma (",") is used to pair row and
966         column
967
968          cell=4,1
969
970         The range operator ("-") using "cell"s can be used to define top-left
971         and bottom-right "cell" location
972
973          cell=3,1-4,6
974
975         The "*" is only allowed in the second part of a pair
976
977          cell=3,2-*,2    # row 3 till end, only column 2
978          cell=3,2-3,*    # column 2 till end, only row 3
979          cell=3,2-*,*    # strip row 1 and 2, and column 1
980
981         Cells and cell ranges may be combined with ";", possibly resulting in
982         rows with different number of columns
983
984          cell=1,1-2,2;3,3-4,4;1,4;4,1
985
986         Disjointed selections will only return selected cells.   The cells
987         that are not  specified  will  not  be  included  in the  returned
988         set,  not even as "undef".  As an example given a "CSV" like
989
990          11,12,13,...19
991          21,22,...28,29
992          :            :
993          91,...97,98,99
994
995         with "cell=1,1-2,2;3,3-4,4;1,4;4,1" will return:
996
997          11,12,14
998          21,22
999          33,34
1000          41,43,44
1001
1002         Overlapping cell-specs will return those cells only once, so
1003         "cell=1,1-3,3;2,2-4,4;2,3;4,2" will return:
1004
1005          11,12,13
1006          21,22,23,24
1007          31,32,33,34
1008          42,43,44
1009
1010       RFC7111 <http://tools.ietf.org/html/rfc7111> does  not  allow different
1011       types of specs to be combined   (either "row" or "col" or "cell").
1012       Passing an invalid fragment specification will croak and set error
1013       2013.
1014
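       A minimal sketch of row-based and column-based selection,  assuming a
       small 3 x 3 in-memory grid.  The helper "pick" is hypothetical;  it
       reopens the data for every call and flattens the result for display.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $data = "11,12,13\n21,22,23\n31,32,33\n";

# Hypothetical helper: apply a fragment spec to a fresh handle and
# return each selected row as a comma-joined string.
sub pick {
    my $spec = shift;
    my $csv = Text::CSV_PP->new ({ binary => 1 })
        or die "" . Text::CSV_PP->error_diag ();
    open my $fh, "<", \$data or die "Cannot open in-memory handle: $!";
    my $aoa = $csv->fragment ($fh, $spec);
    close $fh;
    return map { join ",", @$_ } @$aoa;
    }

# Indices are 1-based: rows 2 through 3
print "$_\n" for pick ("row=2-3");

# Columns 1 and 3 of every row
print "$_\n" for pick ("col=1;3");
```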
1015   column_names
1016       Set the "keys" that will be used in the  "getline_hr"  calls.  If no
1017       keys (column names) are passed, it will return the current setting as a
1018       list.
1019
1020       "column_names" accepts a list of scalars  (the column names)  or a
1021       single array_ref, so you can pass the return value from "getline" too:
1022
1023        $csv->column_names ($csv->getline ($fh));
1024
1025       "column_names" does no checking on duplicates at all, which might lead
1026       to unexpected results.   Undefined entries will be replaced with the
1027       string "\cAUNDEF\cA", so
1028
1029        $csv->column_names (undef, "", "name", "name");
1030        $hr = $csv->getline_hr ($fh);
1031
1032       Will set "$hr->{"\cAUNDEF\cA"}" to the 1st field,  "$hr->{""}" to the
1033       2nd field, and "$hr->{name}" to the 4th field,  discarding the 3rd
1034       field.
1035
1036       "column_names" croaks on invalid arguments.
1037
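       A sketch of using the first record as column names,  assuming a small
       in-memory data set with made-up content:

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $data = "code,name,price\n1,apple,0.50\n";

my $csv = Text::CSV_PP->new ({ binary => 1 })
    or die "" . Text::CSV_PP->error_diag ();
open my $fh, "<", \$data or die "Cannot open in-memory handle: $!";

# getline returns an array ref, which column_names accepts directly.
$csv->column_names ($csv->getline ($fh));

# Without arguments the current names are returned as a list.
my @names = $csv->column_names;

my $hr = $csv->getline_hr ($fh);
print "$hr->{name} costs $hr->{price}\n";
close $fh;
```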
1038   header
1039       This method does NOT work in perl-5.6.x
1040
1041       Parse the CSV header and set "sep", column_names and encoding.
1042
1043        my @hdr = $csv->header ($fh);
1044        $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1045        $csv->header ($fh, { detect_bom => 1, munge_column_names => "lc" });
1046
1047       The first argument should be a file handle.
1048
1049       This method resets some object properties,  as it is supposed to be
1050       invoked only once per file or stream.  It will leave attributes
1051       "column_names" and "bound_columns" alone if setting column names is
1052       disabled. Reading headers on previously processed objects might fail on
1053       perl-5.8.0 and older.
1054
1055       Assuming that the file opened for parsing has a header, and the header
1056       does not contain problematic characters like embedded newlines,   read
1057       the first line from the open handle then auto-detect whether the header
1058       separates the column names with a character from the allowed separator
1059       list.
1060
1061       If any of the allowed separators matches,  and none of the other
1062       allowed separators match,  set  "sep"  to that  separator  for the
1063       current CSV_PP instance and use it to parse the first line, map those
1064       to lowercase, and use that to set the instance "column_names":
1065
1066        my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });
1067        open my $fh, "<", "file.csv";
1068        binmode $fh; # for Windows
1069        $csv->header ($fh);
1070        while (my $row = $csv->getline_hr ($fh)) {
1071            ...
1072            }
1073
1074       If the header is empty,  contains more than one unique separator out of
1075       the allowed set,  contains empty fields,   or contains identical fields
1076       (after folding), it will croak with error 1010, 1011, 1012, or 1013
1077       respectively.
1078
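       As a hedged illustration of separator detection and the default
       lower-casing,  the following sketch feeds a semicolon-separated
       in-memory sample (invented data) to "header":

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $data = "Code;Name;Price\n1;Apple;0.50\n";

my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 })
    or die "" . Text::CSV_PP->error_diag ();
open my $fh, "<", \$data or die "Cannot open in-memory handle: $!";

# Only ";" out of the allowed set occurs in the header line.
my @hdr = $csv->header ($fh, { sep_set => [ ";", ",", "|" ] });
print "columns: @hdr\n";

# The detected separator is now used for the remaining records.
my $row = $csv->getline_hr ($fh);
print "name: $row->{name}\n";
close $fh;
```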
1079       If the header contains embedded newlines or is not valid  CSV  in any
1080       other way, this method will croak and leave the parse error untouched.
1081
1082       A successful call to "header"  will always set the  "sep"  of the $csv
1083       object. This behavior can not be disabled.
1084
1085       return value
1086
1087       On error this method will croak.
1088
1089       In list context,  the headers will be returned whether they are used to
1090       set "column_names" or not.
1091
1092       In scalar context, the instance itself is returned.  Note: the values
1093       as found in the header will effectively be  lost if  "set_column_names"
1094       is false.
1095
1096       Options
1097
1098       sep_set
1099          $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1100
1101         The list of legal separators defaults to "[ ";", "," ]" and can be
1102         changed by this option.  As this is probably the most often used
1103         option,  it can be passed on its own as an unnamed argument:
1104
1105          $csv->header ($fh, [ ";", ",", "|", "\t", "::", "\x{2063}" ]);
1106
1107         Multi-byte  sequences are allowed,  both multi-character and
1108         Unicode.  See "sep".
1109
1110       detect_bom
1111          $csv->header ($fh, { detect_bom => 1 });
1112
1113         The default behavior is to detect if the header line starts with a
1114         BOM.  If the header has a BOM, use that to set the encoding of $fh.
1115         This default behavior can be disabled by passing a false value to
1116         "detect_bom".
1117
1118         Supported encodings from BOM are: UTF-8, UTF-16BE, UTF-16LE,
1119         UTF-32BE,  and UTF-32LE. BOMs also exist for UTF-1, UTF-EBCDIC, SCSU,
1120         BOCU-1,  and GB-18030, but Encode does not (yet) support those. UTF-7 is not
1121         supported.
1122
1123         If a supported BOM was detected as start of the stream, it is stored
1124         in the object attribute "ENCODING".
1125
1126          my $enc = $csv->{ENCODING};
1127
1128         The encoding is used with "binmode" on $fh.
1129
1130         If the handle was opened in a (correct) encoding,  this method will
1131         not alter the encoding, as it checks the leading bytes of the first
1132         line. In case the stream starts with a decoded BOM ("U+FEFF"),
1133         "{ENCODING}" will be "" (empty) instead of the default "undef".
1134
1135       munge_column_names
1136         This option offers the means to modify the column names into
1137         something that is most useful to the application.   The default is to
1138         map all column names to lower case.
1139
1140          $csv->header ($fh, { munge_column_names => "lc" });
1141
1142         The following values are available:
1143
1144           lc     - lower case
1145           uc     - upper case
1146           none   - do not change
1147           \%hash - supply a mapping
1148           \&cb   - supply a callback
1149
1150         Literal:
1151
1152          $csv->header ($fh, { munge_column_names => "none" });
1153
1154         Hash:
1155
1156          $csv->header ($fh, { munge_column_names => { foo => "sombrero" }});
1157
1158         If a value does not exist, the original value is used unchanged.
1159
1160         Callback:
1161
1162          $csv->header ($fh, { munge_column_names => sub { fc } });
1163          $csv->header ($fh, { munge_column_names => sub { "column_".$col++ } });
1164          $csv->header ($fh, { munge_column_names => sub { lc (s/\W+/_/gr) } });
1165
1166         As this callback is called in a "map", you can use $_ directly.
1167
1168       set_column_names
1169          $csv->header ($fh, { set_column_names => 1 });
1170
1171         The default is to set the instance's column names using
1172         "column_names" if the method is successful,  so subsequent calls to
1173         "getline_hr" can return a hash. Setting the header can be disabled
1174         by using a false value for this option.
1175
1176         As described in "return value" above, content is lost in scalar
1177         context.
1178
1179       Validation
1180
1181       When receiving CSV files from external sources,  this method can be
1182       used to protect against changes in the layout by restricting to known
1183       headers  (and typos in the header fields).
1184
1185        my %known = (
1186            "record key" => "c_rec",
1187            "rec id"     => "c_rec",
1188            "id_rec"     => "c_rec",
1189            "kode"       => "code",
1190            "code"       => "code",
1191            "vaule"      => "value",
1192            "value"      => "value",
1193            );
1194        my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });
1195        open my $fh, "<", $source or die "$source: $!";
1196        $csv->header ($fh, { munge_column_names => sub {
1197            s/\s+$//;
1198            s/^\s+//;
1199            $known{lc $_} or die "Unknown column '$_' in $source";
1200            }});
1201        while (my $row = $csv->getline_hr ($fh)) {
1202            say join "\t", $row->{c_rec}, $row->{code}, $row->{value};
1203            }
1204
1205   bind_columns
1206       Takes a list of scalar references to be used for output with  "print"
1207       or to store the fields fetched by "getline".  When you do not pass
1208       enough references to store the fetched fields in, "getline" will fail
1209       with error 3006.  If you pass more than there are fields to return,
1210       the content of the remaining references is left untouched.
1211
1212        $csv->bind_columns (\$code, \$name, \$price, \$description);
1213        while ($csv->getline ($fh)) {
1214            print "The price of a $name is \x{20ac} $price\n";
1215            }
1216
1217       To reset or clear all column binding, call "bind_columns" with the
1218       single argument "undef". This will also clear column names.
1219
1220        $csv->bind_columns (undef);
1221
1222       If no arguments are passed at all, "bind_columns" will return the list
1223       of current bindings or "undef" if no binds are active.
1224
1225       Note that in parsing with  "bind_columns",  the fields are set on the
1226       fly.  That implies that if the third field of a row causes an error
1227       (or this row has just two fields where the previous row had more),  the
1228       first two fields already have been assigned the values of the current
1229       row, while the rest of the fields will still hold the values of the
1230       previous row.  If you want the parser to fail in these cases, use the
1231       "strict" attribute.
1232
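       A self-contained sketch of reading with bound columns,  assuming
       invented in-memory data.  "getline" returns a true value for each
       record while the bound scalars receive the field values.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $data = "1,apple,0.50\n2,pear,0.75\n";

my $csv = Text::CSV_PP->new ({ binary => 1 })
    or die "" . Text::CSV_PP->error_diag ();
open my $fh, "<", \$data or die "Cannot open in-memory handle: $!";

my ($code, $name, $price);
$csv->bind_columns (\$code, \$name, \$price);

my @seen;
while ($csv->getline ($fh)) {
    # The bound scalars now hold the fields of the current record.
    push @seen, "$code:$name:$price";
    }
close $fh;

print "$_\n" for @seen;
```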
1233   eof
1234        $eof = $csv->eof ();
1235
1236       If "parse" or  "getline"  was used with an IO stream,  this method will
1237       return true (1) if the last call hit end of file,  otherwise it will
1238       return false ('').  This is useful to see the difference between a
1239       failure and end of file.
1240
1241       Note that if the parsing of the last line caused an error,  "eof" is
1242       still true.  That means that if you are not using "auto_diag", an idiom
1243       like
1244
1245        while (my $row = $csv->getline ($fh)) {
1246            # ...
1247            }
1248        $csv->eof or $csv->error_diag;
1249
1250       will not report the error. You would have to change that to
1251
1252        while (my $row = $csv->getline ($fh)) {
1253            # ...
1254            }
1255        +$csv->error_diag and $csv->error_diag;
1256
1257   types
1258        $csv->types (\@tref);
1259
1260       This method is used to force that  (all)  columns are of a given type.
1261       For example, if you have an integer column,  two  columns  with
1262       doubles  and a string column, then you might do a
1263
1264        $csv->types ([Text::CSV_PP::IV (),
1265                      Text::CSV_PP::NV (),
1266                      Text::CSV_PP::NV (),
1267                      Text::CSV_PP::PV ()]);
1268
1269       Column types are used only for decoding columns while parsing,  in
1270       other words by the "parse" and "getline" methods.
1271
1272       You can unset column types by doing a
1273
1274        $csv->types (undef);
1275
1276       or fetch the current type settings with
1277
1278        $types = $csv->types ();
1279
1280       IV  Set field type to integer.
1281
1282       NV  Set field type to numeric/float.
1283
1284       PV  Set field type to string.
1285
1286   fields
1287        @columns = $csv->fields ();
1288
1289       This method returns the input to   "combine"  or the resultant
1290       decomposed fields of a successful "parse", whichever was called more
1291       recently.
1292
1293       Note that the return value is undefined after using "getline", which
1294       does not fill the data structures returned by "parse".
1295
1296   meta_info
1297        @flags = $csv->meta_info ();
1298
1299       This method returns the "flags" of the input to "combine" or the flags
1300       of the resultant  decomposed fields of  "parse",   whichever was called
1301       more recently.
1302
1303       For each field,  a meta_info field will hold  flags that  inform
1304       something about  the  field  returned  by  the  "fields"  method or
1305       passed to  the "combine" method. The flags are bit-wise-"or"'d like:
1306
1307       0x0001
1308         The field was quoted.
1309
1310       0x0002
1311         The field was binary.
1312
1313       See the "is_***" methods below.
1314
1315   is_quoted
1316        my $quoted = $csv->is_quoted ($column_idx);
1317
1318       Where  $column_idx is the  (zero-based)  index of the column in the
1319       last result of "parse".
1320
1321       This returns a true value  if the data in the indicated column was
1322       enclosed in "quote_char" quotes.  This might be important for fields
1323       where content ",20070108," is to be treated as a numeric value,  and
1324       where ","20070108"," is explicitly marked as character string data.
1325
1326       This method is only valid when "keep_meta_info" is set to a true value.
1327
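       The distinction between quoted and unquoted content can be sketched as
       follows,  assuming "keep_meta_info" is enabled in the constructor.
       The input line mirrors the 20070108 example above.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ binary => 1, keep_meta_info => 1 })
    or die "" . Text::CSV_PP->error_diag ();

$csv->parse ('20070108,"20070108"') or die "parse failed";
my @fields = $csv->fields ();

# One entry per field: 1 if the field was quoted in the input, else 0.
my @quoted = map { $csv->is_quoted ($_) ? 1 : 0 } 0 .. $#fields;

# The same information is available as bit 0x0001 in meta_info.
my @flags = $csv->meta_info ();

for my $i (0 .. $#fields) {
    printf "field %d: %s (%s)\n", $i, $fields[$i],
        $quoted[$i] ? "string" : "possibly numeric";
    }
```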
1328   is_binary
1329        my $binary = $csv->is_binary ($column_idx);
1330
1331       Where  $column_idx is the  (zero-based)  index of the column in the
1332       last result of "parse".
1333
1334       This returns a true value if the data in the indicated column contained
1335       any byte in the range "[\x00-\x08,\x10-\x1F,\x7F-\xFF]".
1336
1337       This method is only valid when "keep_meta_info" is set to a true value.
1338
1339   is_missing
1340        my $missing = $csv->is_missing ($column_idx);
1341
1342       Where  $column_idx is the  (zero-based)  index of the column in the
1343       last result of "getline_hr".
1344
1345        $csv->keep_meta_info (1);
1346        while (my $hr = $csv->getline_hr ($fh)) {
1347            $csv->is_missing (0) and next; # This was an empty line
1348            }
1349
1350       When using  "getline_hr",  it is impossible to tell if the  parsed
1351       fields are "undef" because they were not filled in the "CSV" stream
1352       or because they were not read at all, as all the fields defined by
1353       "column_names" are set in the hash-ref.    If you still need to know if
1354       all fields in each row are provided, you should enable "keep_meta_info"
1355       so you can check the flags.
1356
1357       If  "keep_meta_info"  is "false",  "is_missing"  will always return
1358       "undef", regardless of $column_idx being valid or not. If this
1359       attribute is "true" it will return either 0 (the field is present) or 1
1360       (the field is missing).
1361
1362       A special case is the empty line.  If the line is completely empty -
1363       after dealing with the flags - this is still a valid CSV line:  it is a
1364       record of just one single empty field. However, if "keep_meta_info" is
1365       set, invoking "is_missing" with index 0 will now return true.
1366
1367   status
1368        $status = $csv->status ();
1369
1370       This method returns the status of the last invoked "combine" or "parse"
1371       call. Status is success (true: 1) or failure (false: "undef" or 0).
1372
1373   error_input
1374        $bad_argument = $csv->error_input ();
1375
1376       This method returns the erroneous argument (if it exists) of "combine"
1377       or "parse",  whichever was called more recently.  If the last
1378       invocation was successful, "error_input" will return "undef".
1379
1380   error_diag
1381        Text::CSV_PP->error_diag ();
1382        $csv->error_diag ();
1383        $error_code               = 0  + $csv->error_diag ();
1384        $error_str                = "" . $csv->error_diag ();
1385        ($cde, $str, $pos, $rec, $fld) = $csv->error_diag ();
1386
1387       If (and only if) an error occurred,  this function returns  the
1388       diagnostics of that error.
1389
1390       If called in void context,  this will print the internal error code and
1391       the associated error message to STDERR.
1392
1393       If called in list context,  this will return  the error code  and the
1394       error message in that order.  If the last error was from parsing, the
1395       rest of the values returned are a best guess at the location  within
1396       the line  that was being parsed. Their values are 1-based.  The
1397       position currently is index of the byte at which the parsing failed in
1398       the current record. It might change to be the index of the current
1399       character in a later release. The record is the index of the record
1400       parsed by the csv instance. The field number is the index of the field
1401       the parser thinks it is currently  trying to  parse. See
1402       examples/csv-check for how this can be used.
1403
1404       If called in  scalar context,  it will return  the diagnostics  in a
1405       single scalar, a-la $!.  It will contain the error code in numeric
1406       context, and the diagnostics message in string context.
1407
1408       When called as a class method or a  direct function call,  the
1409       diagnostics are that of the last "new" call.
1410
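       The three calling contexts can be sketched like this.  The input line
       is invented and contains a quoted field that is never closed,  so
       "parse" is expected to fail;  "auto_diag" is left off so nothing is
       printed automatically.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ binary => 1 })
    or die "" . Text::CSV_PP->error_diag ();

my $ok = $csv->parse ('1,"unterminated');

# List context: code, message, and a best guess at the location.
my ($cde, $str, $pos, $rec, $fld) = $csv->error_diag ();

# Scalar context: numeric code or message, depending on usage.
my $code_only = 0  + $csv->error_diag ();
my $text_only = "" . $csv->error_diag ();

print "parse failed with error $cde: $str\n" unless $ok;
```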
1411   record_number
1412        $recno = $csv->record_number ();
1413
1414       Returns the number of records parsed by this csv instance.  This value should be
1415       more accurate than $. when embedded newlines come into play. Records
1416       written by this instance are not counted.
1417
1418   SetDiag
1419        $csv->SetDiag (0);
1420
1421       Use to reset the diagnostics if you are dealing with errors.
1422

FUNCTIONS

1424       This section is also taken from Text::CSV_XS.
1425
1426   csv
1427       This function is not exported by default and should be explicitly
1428       requested:
1429
1430        use Text::CSV_PP qw( csv );
1431
1432       This is a high-level function that aims at simple (user) interfaces.
1433       This can be used to read/parse a "CSV" file or stream (the default
1434       behavior) or to produce a file or write to a stream (define the  "out"
1435       attribute).  It returns an array- or hash-reference on parsing (or
1436       "undef" on fail) or the numeric value of  "error_diag"  on writing.
1437       When this function fails you can get to the error using the class call
1438       to "error_diag"
1439
1440        my $aoa = csv (in => "test.csv") or
1441            die Text::CSV_PP->error_diag;
1442
1443       This function takes the arguments as key-value pairs. This can be
1444       passed as a list or as an anonymous hash:
1445
1446        my $aoa = csv (  in => "test.csv", sep_char => ";");
1447        my $aoh = csv ({ in => $fh, headers => "auto" });
1448
1449       The arguments passed consist of two parts:  the arguments to "csv"
1450       itself and the optional attributes to the  "CSV"  object used inside
1451       the function as enumerated and explained in "new".
1452
1453       If not overridden, the default options used for CSV are
1454
1455        auto_diag   => 1
1456        escape_null => 0
1457
1458       The option that is always set and cannot be altered is
1459
1460        binary      => 1
1461
1462       As this function will likely be used in one-liners,  it allows  "quote"
1463       to be abbreviated as "quo",  and  "escape_char" to be abbreviated as
1464       "esc" or "escape".
1465
1466       Alternative invocations:
1467
1468        my $aoa = Text::CSV_PP::csv (in => "file.csv");
1469
1470        my $csv = Text::CSV_PP->new ();
1471        my $aoa = $csv->csv (in => "file.csv");
1472
1473       In the latter case, the object attributes are used from the existing
1474       object and the attribute arguments in the function call are ignored:
1475
1476        my $csv = Text::CSV_PP->new ({ sep_char => ";" });
1477        my $aoh = $csv->csv (in => "file.csv", bom => 1);
1478
1479       will parse using ";" as "sep_char", not ",".
1480
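       A minimal sketch of the function interface in parse mode,  assuming a
       reference to a scalar as the source (invented content) so that no
       file is needed:

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

my $input = "code,name\n1,apple\n2,pear\n";

# Default: an array of arrays, header line included.
my $aoa = csv (in => \$input)
    or die "" . Text::CSV_PP->error_diag ();

# With headers => "auto": an array of hashes keyed on the first line.
my $aoh = csv (in => \$input, headers => "auto")
    or die "" . Text::CSV_PP->error_diag ();

print scalar @$aoa, " records as arrays\n";
print "second name: $aoh->[1]{name}\n";
```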
1481       in
1482
1483       Used to specify the source.  "in" can be a file name (e.g. "file.csv"),
1484       which will be  opened for reading  and closed when finished,  a file
1485       handle (e.g.  $fh or "FH"),  a reference to a glob (e.g. "\*ARGV"),
1486       the glob itself (e.g. *STDIN), or a reference to a scalar (e.g.
1487       "\q{1,2,"csv"}").
1488
1489       When used with "out", "in" should be a reference to a CSV structure
1490       (AoA or AoH)  or a CODE-ref that returns an array-reference or a hash-
1491       reference.  The code-ref will be invoked with no arguments.
1492
1493        my $aoa = csv (in => "file.csv");
1494
1495        open my $fh, "<", "file.csv";
1496        my $aoa = csv (in => $fh);
1497
1498        my $csv = [ [qw( Foo Bar )], [ 1, 2 ], [ 2, 3 ]];
1499        my $err = csv (in => $csv, out => "file.csv");
1500
1501       If called in void context without the "out" attribute, the resulting
1502       ref will be used as input to a subsequent call to csv:
1503
1504        csv (in => "file.csv", filter => { 2 => sub { length > 2 }})
1505
1506       will be a shortcut to
1507
1508        csv (in => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}))
1509
1510       where, in the absence of the "out" attribute, this is a shortcut to
1511
1512        csv (in  => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}),
1513             out => *STDOUT)
1514
1515       out
1516
1517        csv (in => $aoa, out => "file.csv");
1518        csv (in => $aoa, out => $fh);
1519        csv (in => $aoa, out =>   STDOUT);
1520        csv (in => $aoa, out =>  *STDOUT);
1521        csv (in => $aoa, out => \*STDOUT);
1522        csv (in => $aoa, out => \my $data);
1523        csv (in => $aoa, out =>  undef);
1524        csv (in => $aoa, out => \"skip");
1525
1526       In output mode, the default CSV options when producing CSV are
1527
1528        eol       => "\r\n"
1529
1530       The "fragment" attribute is ignored in output mode.
1531
1532       "out" can be a file name  (e.g.  "file.csv"),  which will be opened for
1533       writing and closed when finished,  a file handle (e.g. $fh or "FH"),  a
1534       reference to a glob (e.g. "\*STDOUT"),  the glob itself (e.g. *STDOUT),
1535       or a reference to a scalar (e.g. "\my $data").
1536
1537        csv (in => sub { $sth->fetch },            out => "dump.csv");
1538        csv (in => sub { $sth->fetchrow_hashref }, out => "dump.csv",
1539             headers => $sth->{NAME_lc});
1540
1541       When a code-ref is used for "in", the output is generated  per
1542       invocation, so no buffering is involved. This implies that there is no
1543       size restriction on the number of records. The "csv" function ends when
1544       the coderef returns a false value.
1545
1546       If "out" is set to a reference of the literal string "skip", the output
1547       will be suppressed completely,  which might be useful in combination
1548       with a filter for side effects only.
1549
1550        my %cache;
1551        csv (in    => "dump.csv",
1552             out   => \"skip",
1553             on_in => sub { $cache{$_[1][1]}++ });
1554
1555       Currently,  setting "out" to any false value  ("undef", "", 0) will be
1556       equivalent to "\"skip"".
1557
1558       encoding
1559
1560       If passed,  it should be an encoding accepted by the  ":encoding()"
1561       option to "open". There is no default value. This attribute does not
1562       work in perl 5.6.x.  "encoding" can be abbreviated to "enc" for ease of
1563       use in command line invocations.
1564
1565       If "encoding" is set to the literal value "auto", the method "header"
1566       will be invoked on the opened stream to check if there is a BOM and set
1567       the encoding accordingly.   This is equal to passing a true value in
1568       the option "detect_bom".
1569
1570       detect_bom
1571
1572       If  "detect_bom"  is given, the method  "header"  will be invoked on
1573       the opened stream to check if there is a BOM and set the encoding
1574       accordingly.
1575
1576       "detect_bom" can be abbreviated to "bom".
1577
1578       This is the same as setting "encoding" to "auto".
1579
1580       Note that as the method  "header" is invoked,  its default is to also
1581       set the headers.
1582
1583       headers
1584
1585       If this attribute is not given, the default behavior is to produce an
1586       array of arrays.
1587
1588       If "headers" is supplied,  it should be an anonymous list of column
1589       names, an anonymous hashref, a coderef, or a literal flag:  "auto",
1590       "lc", "uc", or "skip".
1591
1592       skip
1593         When "skip" is used, the header will not be included in the output.
1594
1595          my $aoa = csv (in => $fh, headers => "skip");
1596
1597       auto
1598         If "auto" is used, the first line of the "CSV" source will be read as
1599         the list of field headers and used to produce an array of hashes.
1600
1601          my $aoh = csv (in => $fh, headers => "auto");
1602
1603       lc
1604         If "lc" is used,  the first line of the  "CSV" source will be read as
1605         the list of field headers mapped to  lower case and used to produce
1606         an array of hashes. This is a variation of "auto".
1607
1608          my $aoh = csv (in => $fh, headers => "lc");
1609
1610       uc
1611         If "uc" is used,  the first line of the  "CSV" source will be read as
1612         the list of field headers mapped to  upper case and used to produce
1613         an array of hashes. This is a variation of "auto".
1614
1615          my $aoh = csv (in => $fh, headers => "uc");
1616
1617       CODE
1618         If a coderef is used,  the first line of the  "CSV" source will be
1619         read as the list of mangled field headers in which each field is
1620         passed as the only argument to the coderef. This list is used to
1621         produce an array of hashes.
1622
1623          my $aoh = csv (in      => $fh,
1624                         headers => sub { lc ($_[0]) =~ s/kode/code/gr });
1625
1626         This example is a variation of using "lc" where all occurrences of
1627         "kode" are replaced with "code".
1628
1629       ARRAY
1630         If  "headers"  is an anonymous list,  the entries in the list will be
1631         used as field names. The first line is considered data instead of
1632         headers.
1633
1634          my $aoh = csv (in => $fh, headers => [qw( Foo Bar )]);
1635          csv (in => $aoa, out => $fh, headers => [qw( code description price )]);
1636
1637       HASH
1638         If "headers" is a hash reference, this implies "auto", but header
1639         fields that exist as a key in the hashref will be replaced by the
1640         value for that key. Given a CSV file like
1641
1642          post-kode,city,name,id number,fubble
1643          1234AA,Duckstad,Donald,13,"X313DF"
1644
1645         using
1646
1647          csv (headers => { "post-kode" => "pc", "id number" => "ID" }, ...
1648
1649         will return an entry like
1650
1651          { pc     => "1234AA",
1652            city   => "Duckstad",
1653            name   => "Donald",
1654            ID     => "13",
1655            fubble => "X313DF",
1656            }
1657
1658       See also "munge_column_names" and "set_column_names".
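
       To contrast two of the variants above, this sketch reads the same
       in-memory data once with "lc" and once with an anonymous list. The
       data and the column names "id" and "label" are invented for the
       example.

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

my $data = "Code,Name\n1,pc\n2,mouse\n";

# "lc": the first line is consumed and its fields become lower-cased keys.
my $aoh = csv (in => \$data, headers => "lc");

# ARRAY: the names are supplied, so the first line is treated as data.
my $all = csv (in => \$data, headers => [qw( id label )]);

printf "lc: %d records, list: %d records\n", scalar @$aoh, scalar @$all;
```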
1659
1660       munge_column_names
1661
1662       If "munge_column_names" is set,  the method  "header"  is invoked on
1663       the opened stream with all matching arguments to detect and set the
1664       headers.
1665
1666       "munge_column_names" can be abbreviated to "munge".
1667
1668       key
1669
1670       If passed,  will default  "headers"  to "auto" and return a hashref
1671       instead of an array of hashes.
1672
1673        my $ref = csv (in => "test.csv", key => "code");
1674
1675       with test.csv like
1676
1677        code,product,price,color
1678        1,pc,850,gray
1679        2,keyboard,12,white
1680        3,mouse,5,black
1681
1682       will return
1683
1684         { 1   => {
1685               code    => 1,
1686               color   => 'gray',
1687               price   => 850,
1688               product => 'pc'
1689               },
1690           2   => {
1691               code    => 2,
1692               color   => 'white',
1693               price   => 12,
1694               product => 'keyboard'
1695               },
1696           3   => {
1697               code    => 3,
1698               color   => 'black',
1699               price   => 5,
1700               product => 'mouse'
1701               }
1702           }
1703
1704       The "key" attribute can be combined with "headers" for "CSV" data that
1705       has no header line, like
1706
1707        my $ref = csv (
1708            in      => "foo.csv",
1709            headers => [qw( c_foo foo bar description stock )],
1710            key     =>     "c_foo",
1711            );
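
       A short sketch of how the resulting hashref might be used, reading
       from an in-memory string instead of a file. The data is a shortened
       copy of the test.csv shown above.

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

my $ref = csv (
    in  => \"code,product,price\n1,pc,850\n2,keyboard,12\n",
    key => "code",
    );

# Records are now addressed by the value of their "code" field.
print $ref->{2}{product}, " costs ", $ref->{2}{price}, "\n";
```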
1712
1713       keep_headers
1714
1715       When using hashes,  store the column names in the arrayref passed,  so
1716       all headers are available after the call in the original order.
1717
1718        my $aoh = csv (in => "file.csv", keep_headers => \my @hdr);
1719
1720       This attribute can be abbreviated to "kh" or passed as
1721       "keep_column_names".
1722
1723       This attribute implies a default of "auto" for the "headers" attribute.
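
       Since hash keys have no order, the preserved list is what makes it
       possible to print fields in file order again. A minimal sketch, with
       made-up in-memory data:

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

my $aoh = csv (
    in           => \"zip,city,name\n1234AA,Duckstad,Donald\n",
    keep_headers => \my @hdr,
    );

# @hdr holds the column names in their original order.
print join (",", map { $aoh->[0]{$_} } @hdr), "\n";
```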
1724
1725       fragment
1726
1727       Only output the fragment as defined in the "fragment" method. This
1728       option is ignored when generating "CSV". See "out".
1729
1730       Combining all of them could give something like
1731
1732        use Text::CSV_PP qw( csv );
1733        my $aoh = csv (
1734            in       => "test.txt",
1735            encoding => "utf-8",
1736            headers  => "auto",
1737            sep_char => "|",
1738            fragment => "row=3;6-9;15-*",
1739            );
1740        say $aoh->[15]{Foo};
1741
1742       sep_set
1743
1744       If "sep_set" is set, the method "header" is invoked on the opened
1745       stream to detect and set "sep_char" with the given set.
1746
1747       "sep_set" can be abbreviated to "seps".
1748
1749       Note that as the  "header" method is invoked,  its default is to also
1750       set the headers.
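
       The following sketch assumes semicolon-separated input and lets
       "sep_set" choose between two candidate separators. Because "header"
       also sets the column names, the result is expected to be an array of
       hashes. The data is invented for the example.

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

my $aoh = csv (
    in      => \"code;name\n1;pc\n2;mouse\n",
    sep_set => [ ";", "," ],    # candidates to look for in the header line
    );

print $aoh->[0]{name}, "\n";
```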
1751
1752       set_column_names
1753
1754       If  "set_column_names" is passed,  the method "header" is invoked on
1755       the opened stream with all arguments meant for "header".
1756
1757       If "set_column_names" is passed as a false value, the content of the
1758       first row is only preserved if the output is AoA:
1759
1760       With an input-file like
1761
1762        bAr,foo
1763        1,2
1764        3,4,5
1765
1766       This call
1767
1768        my $aoa = csv (in => $file, set_column_names => 0);
1769
1770       will result in
1771
1772        [[ "bar", "foo"     ],
1773         [ "1",   "2"       ],
1774         [ "3",   "4",  "5" ]]
1775
1776       and
1777
1778        my $aoa = csv (in => $file, set_column_names => 0, munge => "none");
1779
1780       will result in
1781
1782        [[ "bAr", "foo"     ],
1783         [ "1",   "2"       ],
1784         [ "3",   "4",  "5" ]]
1785
1786   Callbacks
1787       Callbacks enable actions triggered from the inside of Text::CSV_PP.
1788
1789       While most of what this enables  can easily be done in an  unrolled
1790       loop as described in the "SYNOPSIS",  callbacks can be used to meet
1791       special demands or enhance the "csv" function.
1792
1793       error
1794          $csv->callbacks (error => sub { $csv->SetDiag (0) });
1795
1796         The "error"  callback is invoked when an error occurs,  but  only
1797         when "auto_diag" is set to a true value. A callback is invoked with
1798         the values returned by "error_diag":
1799
1800          my ($c, $s);
1801
1802          sub ignore3006
1803          {
1804              my ($err, $msg, $pos, $recno, $fldno) = @_;
1805              if ($err == 3006) {
1806                  # ignore this error
1807                  ($c, $s) = (undef, undef);
1808                  Text::CSV_PP->SetDiag (0);
1809                  }
1810              # Any other error
1811              return;
1812              } # ignore3006
1813
1814          $csv->callbacks (error => \&ignore3006);
1815          $csv->bind_columns (\$c, \$s);
1816          while ($csv->getline ($fh)) {
1817              # Error 3006 will not stop the loop
1818              }
1819
1820       after_parse
1821          $csv->callbacks (after_parse => sub { push @{$_[1]}, "NEW" });
1822          while (my $row = $csv->getline ($fh)) {
1823              $row->[-1] eq "NEW";
1824              }
1825
1826         This callback is invoked after parsing with  "getline"  only if no
1827         error occurred.  The callback is invoked with two arguments:   the
1828         current "CSV" parser object and an array reference to the fields
1829         parsed.
1830
1831         The return code of the callback is ignored  unless it is a reference
1832         to the string "skip", in which case the record will be skipped in
1833         "getline_all".
1834
1835          sub add_from_db
1836          {
1837              my ($csv, $row) = @_;
1838              $sth->execute ($row->[4]);
1839              push @$row, $sth->fetchrow_array;
1840              } # add_from_db
1841
1842          my $aoa = csv (in => "file.csv", callbacks => {
1843              after_parse => \&add_from_db });
1844
1845         This hook can be used for validation:
1846
1847         FAIL
1848           Die if any of the records does not validate a rule:
1849
1850            after_parse => sub {
1851                $_[1][4] =~ m/^[0-9]{4}\s?[A-Z]{2}$/ or
1852                    die "5th field does not have a valid Dutch zipcode";
1853                }
1854
1855         DEFAULT
1856           Replace invalid fields with a default value:
1857
1858            after_parse => sub { $_[1][2] =~ m/^\d+$/ or $_[1][2] = 0 }
1859
1860         SKIP
1861           Skip records that have invalid fields (only applies to
1862           "getline_all"):
1863
1864            after_parse => sub { $_[1][0] =~ m/^\d+$/ or return \"skip"; }
1865
1866       before_print
1867          my $idx = 1;
1868          $csv->callbacks (before_print => sub { $_[1][0] = $idx++ });
1869          $csv->print (*STDOUT, [ 0, $_ ]) for @members;
1870
1871         This callback is invoked  before printing with  "print"  only if no
1872         error occurred.  The callback is invoked with two arguments:  the
1873         current  "CSV" parser object and an array reference to the fields
1874         passed.
1875
1876         The return code of the callback is ignored.
1877
1878          sub max_4_fields
1879          {
1880              my ($csv, $row) = @_;
1881              @$row > 4 and splice @$row, 4;
1882              } # max_4_fields
1883
1884          csv (in => csv (in => "file.csv"), out => *STDOUT,
1885              callbacks => { before_print => \&max_4_fields });
1886
1887         This callback is not active for "combine".
1888
1889       Callbacks for csv ()
1890
1891       The "csv" function allows for some callbacks that do not integrate in XS
1892       internals but are only available through the "csv" function.
1893
1894         csv (in        => "file.csv",
1895              callbacks => {
1896                  filter       => { 6 => sub { $_ > 15 } },    # first
1897                  after_parse  => sub { say "AFTER PARSE";  }, # first
1898                  after_in     => sub { say "AFTER IN";     }, # second
1899                  on_in        => sub { say "ON IN";        }, # third
1900                  },
1901              );
1902
1903         csv (in        => $aoh,
1904              out       => "file.csv",
1905              callbacks => {
1906                  on_in        => sub { say "ON IN";        }, # first
1907                  before_out   => sub { say "BEFORE OUT";   }, # second
1908                  before_print => sub { say "BEFORE PRINT"; }, # third
1909                  },
1910              );
1911
1912       filter
1913         This callback can be used to filter records.  It is called just after
1914         a new record has been scanned.  The callback accepts a:
1915
1916         hashref
1917           The keys are the index to the row (the field name or field number,
1918           1-based) and the values are subs to return a true or false value.
1919
1920            csv (in => "file.csv", filter => {
1921                       3 => sub { m/a/ },       # third field should contain an "a"
1922                       5 => sub { length > 4 }, # length of the 5th field minimal 5
1923                       });
1924
1925            csv (in => "file.csv", filter => { foo => sub { $_ > 4 }});
1926
1927           If the keys to the filter hash contain any character that is not a
1928           digit it will also implicitly set "headers" to "auto"  unless
1929           "headers"  was already passed as argument.  When headers are
1930           active, returning an array of hashes, the filter is not applicable
1931           to the header itself.
1932
1933           All sub results should match, as in AND.
1934
1935           The context of the callback sets  $_ localized to the field
1936           indicated by the filter. The two arguments are as with all other
1937           callbacks, so the other fields in the current row can be seen:
1938
1939            filter => { 3 => sub { $_ > 100 ? $_[1][1] =~ m/A/ : $_[1][6] =~ m/B/ }}
1940
1941           If the context is set to return a list of hashes  ("headers" is
1942           defined), the current record will also be available in the
1943           localized %_:
1944
1945            filter => { 3 => sub { $_ > 100 && $_{foo} =~ m/A/ && $_{bar} < 1000  }}
1946
1947           If the filter is used to alter the content by changing $_,  make
1948           sure that the sub returns true in order not to have that record
1949           skipped:
1950
1951            filter => { 2 => sub { $_ = uc }}
1952
1953           will upper-case the second field, and then skip it if the resulting
1954           content evaluates to false. To always accept, end with truth:
1955
1956            filter => { 2 => sub { $_ = uc; 1 }}
1957
1958         coderef
1959            csv (in => "file.csv", filter => sub { $n++; 0; });
1960
1961           If the argument to "filter" is a coderef,  it is an alias or
1962           shortcut to a filter on column 0:
1963
1964            csv (filter => sub { $n++; 0 });
1965
1966           is equal to
1967
1968            csv (filter => { 0 => sub { $n++; 0 } });
1969
1970         filter-name
1971            csv (in => "file.csv", filter => "not_blank");
1972            csv (in => "file.csv", filter => "not_empty");
1973            csv (in => "file.csv", filter => "filled");
1974
1975           These are predefined filters
1976
1977           Given a file like (line numbers prefixed for doc purpose only):
1978
1979            1:1,2,3
1980            2:
1981            3:,
1982            4:""
1983            5:,,
1984            6:, ,
1985            7:"",
1986            8:" "
1987            9:4,5,6
1988
1989           not_blank
1990             Filter out the blank lines
1991
1992             This filter is a shortcut for
1993
1994              filter => { 0 => sub { @{$_[1]} > 1 or
1995                          defined $_[1][0] && $_[1][0] ne "" } }
1996
1997             Due to the implementation,  it is currently impossible to also
1998             filter lines that consist only of a quoted empty field. These
1999             lines are also considered blank lines.
2000
2001             With the given example, lines 2 and 4 will be skipped.
2002
2003           not_empty
2004             Filter out lines where all the fields are empty.
2005
2006             This filter is a shortcut for
2007
2008              filter => { 0 => sub { grep { defined && $_ ne "" } @{$_[1]} } }
2009
2010             A space is not regarded as being empty, so given the example data,
2011             lines 2, 3, 4, 5, and 7 are skipped.
2012
2013           filled
2014             Filter out lines that have no visible data
2015
2016             This filter is a shortcut for
2017
2018              filter => { 0 => sub { grep { defined && m/\S/ } @{$_[1]} } }
2019
2020             This filter rejects all lines that do not have at least one field
2021             that contains a non-whitespace character.
2022
2023             With the given example data, this filter would skip lines 2
2024             through 8.
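
       As a sketch of both filter styles, the block below runs the predefined
       "filled" filter over the nine sample lines shown above and a hashref
       filter over a second, made-up data set. Variable names are
       illustrative only.

```perl
use strict;
use warnings;
use Text::CSV_PP qw( csv );

# The nine sample lines from above, without the line-number prefixes.
my $sample = join "\n",
    '1,2,3', '', ',', '""', ',,', ', ,', '"",', '" "', '4,5,6', '';

# Predefined filter: keep only rows with at least one non-blank field.
my $filled = csv (in => \$sample, filter => "filled");

# Hashref filter: keep rows whose 2nd field is numerically larger than 4.
my $big = csv (
    in     => \"1,2,3\n4,5,6\n7,8,9\n",
    filter => { 2 => sub { $_ > 4 } },
    );

printf "filled kept %d rows, hashref filter kept %d rows\n",
    scalar @$filled, scalar @$big;
```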
2025
2026       after_in
2027         This callback is invoked for each record after all records have been
2028         parsed but before returning the reference to the caller.  The hook is
2029         invoked with two arguments:  the current  "CSV"  parser object  and a
2030         reference to the record.   The reference can be a reference to a
2031         HASH  or a reference to an ARRAY as determined by the arguments.
2032
2033         This callback can also be passed as  an attribute without the
2034         "callbacks" wrapper.
2035
2036       before_out
2037         This callback is invoked for each record before the record is
2038         printed.  The hook is invoked with two arguments:  the current "CSV"
2039         parser object and a reference to the record.   The reference can be a
2040         reference to a  HASH or a reference to an ARRAY as determined by the
2041         arguments.
2042
2043         This callback can also be passed as an attribute  without the
2044         "callbacks" wrapper.
2045
2046         This callback makes the row available in %_ if the row is a hashref.
2047         In this case %_ is writable and will change the original row.
2048
2049       on_in
2050         This callback acts exactly as the "after_in" or the "before_out"
2051         hooks.
2052
2053         This callback can also be passed as an attribute  without the
2054         "callbacks" wrapper.
2055
2056         This callback makes the row available in %_ if the row is a hashref.
2057         In this case %_ is writable and will change the original row. So e.g.
2058         with
2059
2060           my $aoh = csv (
2061               in      => \"foo\n1\n2\n",
2062               headers => "auto",
2063               on_in   => sub { $_{bar} = 2; },
2064               );
2065
2066         $aoh will be:
2067
2068           [ { foo => 1,
2069               bar => 2,
2070               },
2071             { foo => 2,
2072               bar => 2,
2073               }
2074             ]
2075
2076       csv
2077         The function  "csv" can also be called as a method or with an
2078         existing Text::CSV_PP object. This could help if the function is to
2079         be invoked a lot of times and the overhead of creating the object
2080         internally over  and  over again would be prevented by passing an
2081         existing instance.
2082
2083          my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });
2084
2085          my $aoa = $csv->csv (in => $fh);
2086          my $aoa = csv (in => $fh, csv => $csv);
2087
2088         both act the same. Running this 20000 times on a 20-line CSV file
2089         showed a 53% speedup.
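
       A minimal sketch of reusing one parser object across several calls.
       The in-memory inputs are hypothetical stand-ins for many small files.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ({ binary => 1, auto_diag => 1 });

my @inputs = ("1,2\n3,4\n", "5,6\n");   # hypothetical in-memory inputs
my @rows;
foreach my $in (@inputs) {
    # Method form: the existing object is used instead of a new one.
    my $aoa = $csv->csv (in => \$in);
    push @rows, @$aoa;
    }

print scalar @rows, " rows parsed with a single parser object\n";
```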
2090

DIAGNOSTICS

2092       This section is also taken from Text::CSV_XS.
2093
2094       Still under construction ...
2095
2096       If an error occurs,  "$csv->error_diag" can be used to get information
2097       on the cause of the failure. Note that for speed reasons the internal
2098       value is never cleared on success,  so using the value returned by
2099       "error_diag" in normal cases - when no error occurred - may cause
2100       unexpected results.
2101
2102       If the constructor failed, the cause can be found using "error_diag" as
2103       a class method, like "Text::CSV_PP->error_diag".
2104
2105       The "$csv->error_diag" method is automatically invoked upon error when
2106       the constructor was called with  "auto_diag"  set to  1 or 2, or when
2107       autodie is in effect.  When set to 1, this will cause a "warn" with the
2108       error message,  when set to 2, it will "die". "2012 - EOF" is excluded
2109       from "auto_diag" reports.
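
       Without "auto_diag", the diagnostics have to be fetched by hand. This
       sketch provokes error 2021 (described below) by parsing a quoted field
       that contains a newline while "binary" is off, and then reads
       "error_diag" in list context. The input string is made up for the
       example.

```perl
use strict;
use warnings;
use Text::CSV_PP;

my $csv = Text::CSV_PP->new ();     # binary is off by default

# A newline inside a quoted field is rejected unless binary is set.
my $ok = $csv->parse (qq{1,"foo\nbar",22,1});

# In list context error_diag returns the code, the message, and more.
my ($code, $msg) = $csv->error_diag;
$ok or print "parse failed: $code - $msg\n";
```

       Checking the return value of "parse" first matters, because the
       diagnostics are not cleared on success.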
2110
2111       Errors can be (individually) caught using the "error" callback.
2112
2113       The errors as described below are available. I have tried to make the
2114       error itself explanatory enough, but more descriptions will be added.
2115       For most of these errors, the first three capitals describe the error
2116       category:
2117
2118       · INI
2119
2120         Initialization error or option conflict.
2121
2122       · ECR
2123
2124         Carriage-Return related parse error.
2125
2126       · EOF
2127
2128         End-Of-File related parse error.
2129
2130       · EIQ
2131
2132         Parse error inside quotation.
2133
2134       · EIF
2135
2136         Parse error inside field.
2137
2138       · ECB
2139
2140         Combine error.
2141
2142       · EHR
2143
2144         HashRef parse related error.
2145
2146       And below should be the complete list of error codes that can be
2147       returned:
2148
2149       · 1001 "INI - sep_char is equal to quote_char or escape_char"
2150
2151         The  separation character  cannot be equal to  the quotation
2152         character or to the escape character,  as this would invalidate all
2153         parsing rules.
2154
2155       · 1002 "INI - allow_whitespace with escape_char or quote_char SP or
2156         TAB"
2157
2158         Using the  "allow_whitespace"  attribute  when either "quote_char" or
2159         "escape_char"  is equal to "SPACE" or "TAB" is too ambiguous to
2160         allow.
2161
2162       · 1003 "INI - \r or \n in main attr not allowed"
2163
2164         Using default "eol" characters in either "sep_char", "quote_char",
2165         or  "escape_char"  is  not allowed.
2166
2167       · 1004 "INI - callbacks should be undef or a hashref"
2168
2169         The "callbacks"  attribute only allows one to be "undef" or a hash
2170         reference.
2171
2172       · 1005 "INI - EOL too long"
2173
2174         The value passed for EOL is exceeding its maximum length (16).
2175
2176       · 1006 "INI - SEP too long"
2177
2178         The value passed for SEP is exceeding its maximum length (16).
2179
2180       · 1007 "INI - QUOTE too long"
2181
2182         The value passed for QUOTE is exceeding its maximum length (16).
2183
2184       · 1008 "INI - SEP undefined"
2185
2186         The value passed for SEP should be defined and not empty.
2187
2188       · 1010 "INI - the header is empty"
2189
2190         The header line parsed in the "header" is empty.
2191
2192       · 1011 "INI - the header contains more than one valid separator"
2193
2194         The header line parsed in the  "header"  contains more than one
2195         (unique) separator character out of the allowed set of separators.
2196
2197       · 1012 "INI - the header contains an empty field"
2198
2199         The header line parsed in the "header" contains an empty field.
2200
2201       · 1013 "INI - the header contains nun-unique fields"
2202
2203         The header line parsed in the  "header"  contains at least  two
2204         identical fields.
2205
2206       · 1014 "INI - header called on undefined stream"
2207
2208         The header line cannot be parsed from an undefined source.
2209
2210       · 1500 "PRM - Invalid/unsupported argument(s)"
2211
2212         Function or method called with invalid argument(s) or parameter(s).
2213
2214       · 1501 "PRM - The key attribute is passed as an unsupported type"
2215
2216         The "key" attribute is of an unsupported type.
2217
2218       · 2010 "ECR - QUO char inside quotes followed by CR not part of EOL"
2219
2220         When  "eol"  has  been  set  to  anything  but the  default,  like
2221         "\r\t\n",  and  the  "\r"  is  following  the   second   (closing)
2222         "quote_char", where the characters following the "\r" do not make up
2223         the "eol" sequence, this is an error.
2224
2225       · 2011 "ECR - Characters after end of quoted field"
2226
2227         Sequences like "1,foo,"bar"baz,22,1" are not allowed. "bar" is a
2228         quoted field and after the closing double-quote, there should be
2229         either a new-line sequence or a separation character.
2230
2231       · 2012 "EOF - End of data in parsing input stream"
2232
2233         Self-explaining. End-of-file was reached while parsing a stream. Can
2234         happen only when reading from streams with "getline",  as using
2235         "parse" is done on strings that are not required to have a trailing
2236         "eol".
2237
2238       · 2013 "INI - Specification error for fragments RFC7111"
2239
2240         Invalid specification for URI "fragment" specification.
2241
2242       · 2014 "ENF - Inconsistent number of fields"
2243
2244         Inconsistent number of fields under strict parsing.
2245
2246       · 2021 "EIQ - NL char inside quotes, binary off"
2247
2248         Sequences like "1,"foo\nbar",22,1" are allowed only when the binary
2249         option has been selected with the constructor.
2250
2251       · 2022 "EIQ - CR char inside quotes, binary off"
2252
2253         Sequences like "1,"foo\rbar",22,1" are allowed only when the binary
2254         option has been selected with the constructor.
2255
2256       · 2023 "EIQ - QUO character not allowed"
2257
2258         Sequences like ""foo "bar" baz",qu" and "2023,",2008-04-05,"Foo,
2259         Bar",\n" will cause this error.
2260
2261       · 2024 "EIQ - EOF cannot be escaped, not even inside quotes"
2262
2263         The escape character is not allowed as last character in an input
2264         stream.
2265
2266       · 2025 "EIQ - Loose unescaped escape"
2267
2268         An escape character should escape only characters that need escaping.
2269
2270         Allowing  the escape  for other characters  is possible  with the
2271         attribute "allow_loose_escape".
2272
2273       · 2026 "EIQ - Binary character inside quoted field, binary off"
2274
2275         Binary characters are not allowed by default.    Exceptions are
2276         fields that contain valid UTF-8,  that will automatically be upgraded
2277         if the content is valid UTF-8. Set "binary" to 1 to accept binary
2278         data.
2279
2280       · 2027 "EIQ - Quoted field not terminated"
2281
2282         When parsing a field that started with a quotation character,  the
2283         field is expected to be closed with a quotation character.   When the
2284         parsed line is exhausted before the quote is found, that field is not
2285         terminated.
2286
2287       · 2030 "EIF - NL char inside unquoted verbatim, binary off"
2288
2289       · 2031 "EIF - CR char is first char of field, not part of EOL"
2290
2291       · 2032 "EIF - CR char inside unquoted, not part of EOL"
2292
2293       · 2034 "EIF - Loose unescaped quote"
2294
2295       · 2035 "EIF - Escaped EOF in unquoted field"
2296
2297       · 2036 "EIF - ESC error"
2298
2299       · 2037 "EIF - Binary character in unquoted field, binary off"
2300
2301       · 2110 "ECB - Binary character in Combine, binary off"
2302
2303       · 2200 "EIO - print to IO failed. See errno"
2304
2305       · 3001 "EHR - Unsupported syntax for column_names ()"
2306
2307       · 3002 "EHR - getline_hr () called before column_names ()"
2308
2309       · 3003 "EHR - bind_columns () and column_names () fields count
2310         mismatch"
2311
2312       · 3004 "EHR - bind_columns () only accepts refs to scalars"
2313
2314       · 3006 "EHR - bind_columns () did not pass enough refs for parsed
2315         fields"
2316
2317       · 3007 "EHR - bind_columns needs refs to writable scalars"
2318
2319       · 3008 "EHR - unexpected error in bound fields"
2320
2321       · 3009 "EHR - print_hr () called before column_names ()"
2322
2323       · 3010 "EHR - print_hr () called with invalid arguments"
2324

SEE ALSO

2326       Text::CSV_XS, Text::CSV
2327
2328       Older versions took many regexp from
2329       <http://www.din.or.jp/~ohzaki/perl.htm>
2330

AUTHOR

2332       Kenichi Ishigaki, <ishigaki[at]cpan.org>
2333       Makamaka Hannyaharamitu, <makamaka[at]cpan.org>
2334
2335       Text::CSV_XS was written by <joe[at]ispsoft.de> and maintained by
2336       <h.m.brand[at]xs4all.nl>.
2337
2338       Text::CSV was written by <alan[at]mfgrtl.com>.
2339
2341       Copyright 2017- by Kenichi Ishigaki, <ishigaki[at]cpan.org>
2342       Copyright 2005-2015 by Makamaka Hannyaharamitu, <makamaka[at]cpan.org>
2343
2344       Most of the code and doc is directly taken from the pure perl part of
2345       Text::CSV_XS.
2346
2347       Copyright (C) 2007-2016 H.Merijn Brand.  All rights reserved.
2348       Copyright (C) 1998-2001 Jochen Wiedmann. All rights reserved.
2349       Copyright (C) 1997      Alan Citterman.  All rights reserved.
2350
2351       This library is free software; you can redistribute it and/or modify it
2352       under the same terms as Perl itself.
2353
2354
2355
2356perl v5.28.0                      2018-08-17                   Text::CSV_PP(3)