CSV_XS(3)             User Contributed Perl Documentation            CSV_XS(3)

NAME

6       Text::CSV_XS - comma-separated values manipulation routines
7

SYNOPSIS

9        # Functional interface
10        use Text::CSV_XS qw( csv );
11
12        # Read whole file in memory
13        my $aoa = csv (in => "data.csv");    # as array of array
14        my $aoh = csv (in => "data.csv",
15                       headers => "auto");   # as array of hash
16
17        # Write array of arrays as csv file
18        csv (in => $aoa, out => "file.csv", sep_char=> ";");
19
20        # Only show lines where "code" is odd
21        csv (in => "data.csv", filter => { code => sub { $_ % 2 }});
22
23
24        # Object interface
25        use Text::CSV_XS;
26
27        my @rows;
28        # Read/parse CSV
29        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
30        open my $fh, "<:encoding(utf8)", "test.csv" or die "test.csv: $!";
31        while (my $row = $csv->getline ($fh)) {
32            $row->[2] =~ m/pattern/ or next; # 3rd field should match
33            push @rows, $row;
34            }
35        close $fh;
36
37        # and write as CSV
38        open $fh, ">:encoding(utf8)", "new.csv" or die "new.csv: $!";
39        $csv->say ($fh, $_) for @rows;
40        close $fh or die "new.csv: $!";
41

DESCRIPTION

43       Text::CSV_XS  provides facilities for the composition  and
44       decomposition of comma-separated values.  An instance of the
45       Text::CSV_XS class will combine fields into a "CSV" string and parse a
46       "CSV" string into fields.
47
       The module accepts either strings or files as input and supports the
       use of user-specified characters for delimiters, separators, and
       escapes.
51
52   Embedded newlines
53       Important Note:  The default behavior is to accept only ASCII
54       characters in the range from 0x20 (space) to 0x7E (tilde).   This means
55       that the fields can not contain newlines. If your data contains
56       newlines embedded in fields, or characters above 0x7E (tilde), or
57       binary data, you must set "binary => 1" in the call to "new". To cover
58       the widest range of parsing options, you will always want to set
59       binary.
60
       Even with "binary" set, you still have to pass a complete record to
       the "parse" method, which the usual line-by-line reading loop does not
       guarantee:
64
65        my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
66        while (<>) {           #  WRONG!
67            $csv->parse ($_);
68            my @fields = $csv->fields ();
69            }
70
       This will break, as the "while" might read partial records: it does
       not care about the quoting. If you need to support embedded newlines,
       the way to go is to not pass "eol" to the parser (it accepts "\n",
       "\r", and "\r\n" by default) and to read records with "getline":
75
76        my $csv = Text::CSV_XS->new ({ binary => 1 });
77        open my $fh, "<", $file or die "$file: $!";
78        while (my $row = $csv->getline ($fh)) {
79            my @fields = @$row;
80            }
81
82       The old(er) way of using global file handles is still supported
83
84        while (my $row = $csv->getline (*ARGV)) { ... }
85
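       As a minimal sketch of the "getline" approach, the following reads two
       records from an in-memory file handle, one of which has a line break
       inside a quoted field. The sample data and the variable names are
       made up for illustration.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical sample: the 2nd field of the 1st record spans two lines
my $data = qq{1,"line one\nline two",3\n4,five,6\n};

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
open my $fh, "<", \$data or die "in-memory file: $!";
my @rows;
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;
    }
close $fh;

printf "%d records, %d fields in the first\n",
    scalar @rows, scalar @{$rows[0]};
```

       A plain "while (<$fh>)" loop over the same data would see three lines
       instead of two records.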
86   Unicode
87       Unicode is only tested to work with perl-5.8.2 and up.
88
89       See also "BOM".
90
91       The simplest way to ensure the correct encoding is used for  in- and
92       output is by either setting layers on the filehandles, or setting the
93       "encoding" argument for "csv".
94
95        open my $fh, "<:encoding(UTF-8)", "in.csv"  or die "in.csv: $!";
96       or
97        my $aoa = csv (in => "in.csv",     encoding => "UTF-8");
98
99        open my $fh, ">:encoding(UTF-8)", "out.csv" or die "out.csv: $!";
100       or
101        csv (in => $aoa, out => "out.csv", encoding => "UTF-8");
102
103       On parsing (both for  "getline" and  "parse"),  if the source is marked
104       being UTF8, then all fields that are marked binary will also be marked
105       UTF8.
106
       On combining ("print" and "combine"): if any of the combining fields
       was marked UTF8, the resulting string will be marked as UTF8. Note
       however that fields before the first field marked UTF8 that contain
       8-bit characters not upgraded to UTF8 will remain "bytes" in the
       resulting string, possibly causing unexpected errors. If you pass
       data of different encodings, or you don't know whether the encodings
       differ, force all fields to be upgraded before you pass them on:
115
116        $csv->print ($fh, [ map { utf8::upgrade (my $x = $_); $x } @data ]);
117
118       For complete control over encoding, please use Text::CSV::Encoded:
119
120        use Text::CSV::Encoded;
121        my $csv = Text::CSV::Encoded->new ({
122            encoding_in  => "iso-8859-1", # the encoding comes into   Perl
123            encoding_out => "cp1252",     # the encoding comes out of Perl
124            });
125
126        $csv = Text::CSV::Encoded->new ({ encoding  => "utf8" });
127        # combine () and print () accept *literally* utf8 encoded data
128        # parse () and getline () return *literally* utf8 encoded data
129
130        $csv = Text::CSV::Encoded->new ({ encoding  => undef }); # default
131        # combine () and print () accept UTF8 marked data
132        # parse () and getline () return UTF8 marked data
133
134   BOM
135       BOM  (or Byte Order Mark)  handling is available only inside the
136       "header" method.   This method supports the following encodings:
137       "utf-8", "utf-1", "utf-32be", "utf-32le", "utf-16be", "utf-16le",
138       "utf-ebcdic", "scsu", "bocu-1", and "gb-18030". See Wikipedia
139       <https://en.wikipedia.org/wiki/Byte_order_mark>.
140
141       If a file has a BOM, the easiest way to deal with that is
142
143        my $aoh = csv (in => $file, detect_bom => 1);
144
145       All records will be encoded based on the detected BOM.
146
147       This implies a call to the  "header"  method,  which defaults to also
148       set the "column_names". So this is not the same as
149
150        my $aoh = csv (in => $file, headers => "auto");
151
       which only reads the first record to set "column_names" but ignores
       the meaning of a BOM that may be present.
154

SPECIFICATION

156       While no formal specification for CSV exists, RFC 4180
157       <http://tools.ietf.org/html/rfc4180> (1) describes the common format
158       and establishes  "text/csv" as the MIME type registered with the IANA.
159       RFC 7111 <http://tools.ietf.org/html/rfc7111> (2) adds fragments to
160       CSV.
161
162       Many informal documents exist that describe the "CSV" format.   "How
163       To: The Comma Separated Value (CSV) File Format"
164       <http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm> (3)  provides an
165       overview of the  "CSV"  format in the most widely used applications and
166       explains how it can best be used and supported.
167
168        1) http://tools.ietf.org/html/rfc4180
169        2) http://tools.ietf.org/html/rfc7111
170        3) http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm
171
172       The basic rules are as follows:
173
174       CSV  is a delimited data format that has fields/columns separated by
175       the comma character and records/rows separated by newlines. Fields that
176       contain a special character (comma, newline, or double quote),  must be
177       enclosed in double quotes. However, if a line contains a single entry
178       that is the empty string, it may be enclosed in double quotes.  If a
179       field's value contains a double quote character it is escaped by
180       placing another double quote character next to it. The "CSV" file
181       format does not require a specific character encoding, byte order, or
182       line terminator format.
183
184       · Each record is a single line ended by a line feed  (ASCII/"LF"=0x0A)
185         or a carriage return and line feed pair (ASCII/"CRLF"="0x0D 0x0A"),
186         however, line-breaks may be embedded.
187
188       · Fields are separated by commas.
189
190       · Allowable characters within a "CSV" field include 0x09 ("TAB") and
191         the inclusive range of 0x20 (space) through 0x7E (tilde).  In binary
192         mode all characters are accepted, at least in quoted fields.
193
194       · A field within  "CSV"  must be surrounded by  double-quotes to
195         contain  a separator character (comma).
196
197       Though this is the most clear and restrictive definition,  Text::CSV_XS
198       is way more liberal than this, and allows extension:
199
200       · Line termination by a single carriage return is accepted by default
201
       · The separation-, quote-, and escape- characters can be any ASCII
203         character in the range from  0x20 (space) to  0x7E (tilde).
204         Characters outside this range may or may not work as expected.
205         Multibyte characters, like UTF "U+060C" (ARABIC COMMA),   "U+FF0C"
206         (FULLWIDTH COMMA),  "U+241B" (SYMBOL FOR ESCAPE), "U+2424" (SYMBOL
207         FOR NEWLINE), "U+FF02" (FULLWIDTH QUOTATION MARK), and "U+201C" (LEFT
208         DOUBLE QUOTATION MARK) (to give some examples of what might look
209         promising) work for newer versions of perl for "sep_char", and
210         "quote_char" but not for "escape_char".
211
212         If you use perl-5.8.2 or higher these three attributes are
213         utf8-decoded, to increase the likelihood of success. This way
214         "U+00FE" will be allowed as a quote character.
215
216       · A field in  "CSV"  must be surrounded by double-quotes to make an
217         embedded double-quote, represented by a pair of consecutive double-
218         quotes, valid. In binary mode you may additionally use the sequence
219         ""0" for representation of a NULL byte. Using 0x00 in binary mode is
220         just as valid.
221
222       · Several violations of the above specification may be lifted by
223         passing some options as attributes to the object constructor.
224
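       The quoting rules above can be illustrated with a short round trip.
       This is only a sketch with made-up field values: it combines three
       fields into a line and parses that line back.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1 });

# A plain field, a field with a comma, and a field with double quotes
my @in = ("plain", "with,comma", 'say "hi"');
$csv->combine (@in) or die "combine failed";
my $line = $csv->string;
print "$line\n";        # plain,"with,comma","say ""hi"""

# Parsing the generated line gives back the original fields
$csv->parse ($line) or die "parse failed";
my @out = $csv->fields;
```

       Only the fields that contain a special character end up enclosed in
       double quotes, and the embedded double quotes are doubled.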

METHODS

226   version
227       (Class method) Returns the current module version.
228
229   new
230       (Class method) Returns a new instance of class Text::CSV_XS. The
231       attributes are described by the (optional) hash ref "\%attr".
232
233        my $csv = Text::CSV_XS->new ({ attributes ... });
234
235       The following attributes are available:
236
237       eol
238
239        my $csv = Text::CSV_XS->new ({ eol => $/ });
240                  $csv->eol (undef);
241        my $eol = $csv->eol;
242
243       The end-of-line string to add to rows for "print" or the record
244       separator for "getline".
245
       When not passed in a parser instance, the default behavior is to
       accept "\n", "\r", and "\r\n", so it is probably safer to not specify
       "eol" at all. Passing "undef" or the empty string has the same effect.
249
250       When not passed in a generating instance,  records are not terminated
251       at all, so it is probably wise to pass something you expect. A safe
252       choice for "eol" on output is either $/ or "\r\n".
253
254       Common values for "eol" are "\012" ("\n" or Line Feed),  "\015\012"
255       ("\r\n" or Carriage Return, Line Feed),  and "\015"  ("\r" or Carriage
256       Return). The "eol" attribute cannot exceed 7 (ASCII) characters.
257
       If both $/ and "eol" equal "\015", lines that end on only a Carriage
       Return without Line Feed will be parsed correctly.
260
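       To make the effect of "eol" on generated records concrete, here is a
       small sketch that combines the same (made-up) row with and without
       the attribute:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my @row = (1, "two", 3);

# Without eol, the generated record has no terminator at all
my $bare = Text::CSV_XS->new ({ binary => 1 });
$bare->combine (@row) or die "combine failed";
my $unterminated = $bare->string;       # "1,two,3"

# With eol, every generated record ends in that string
my $crlf = Text::CSV_XS->new ({ binary => 1, eol => "\r\n" });
$crlf->combine (@row) or die "combine failed";
my $terminated = $crlf->string;         # "1,two,3\r\n"
```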
261       sep_char
262
263        my $csv = Text::CSV_XS->new ({ sep_char => ";" });
264                $csv->sep_char (";");
265        my $c = $csv->sep_char;
266
       The char used to separate fields, by default a comma (","). Limited
       to a single-byte character, usually in the range from 0x20 (space) to
       0x7E (tilde). When longer sequences are required, use "sep".
270
271       The separation character can not be equal to the quote character  or to
272       the escape character.
273
274       See also "CAVEATS"
275
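       A brief sketch of a semicolon-separated setup, using invented sample
       data. With "sep_char" set to ";", a comma is ordinary field content
       and a semicolon inside a field triggers quotation:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, sep_char => ";" });

# The comma is plain data here, so this line has three fields
$csv->parse ("a;b,c;d") or die "parse failed";
my @fields = $csv->fields;

# A field that contains the separator is quoted on output
$csv->combine ("x;y", "z") or die "combine failed";
my $line = $csv->string;        # "x;y";z
```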
276       sep
277
278        my $csv = Text::CSV_XS->new ({ sep => "\N{FULLWIDTH COMMA}" });
279                  $csv->sep (";");
280        my $sep = $csv->sep;
281
282       The chars used to separate fields, by default undefined. Limited to 8
283       bytes.
284
285       When set, overrules "sep_char".  If its length is one byte it acts as
286       an alias to "sep_char".
287
288       See also "CAVEATS"
289
290       quote_char
291
292        my $csv = Text::CSV_XS->new ({ quote_char => "'" });
293                $csv->quote_char (undef);
294        my $c = $csv->quote_char;
295
296       The character to quote fields containing blanks or binary data,  by
297       default the double quote character (""").  A value of undef suppresses
298       quote chars (for simple cases only). Limited to a single-byte
299       character, usually in the range from  0x20 (space) to  0x7E (tilde).
300       When longer sequences are required, use "quote".
301
302       "quote_char" can not be equal to "sep_char".
303
304       quote
305
306        my $csv = Text::CSV_XS->new ({ quote => "\N{FULLWIDTH QUOTATION MARK}" });
307                    $csv->quote ("'");
308        my $quote = $csv->quote;
309
310       The chars used to quote fields, by default undefined. Limited to 8
311       bytes.
312
313       When set, overrules "quote_char". If its length is one byte it acts as
314       an alias to "quote_char".
315
316       This method does not support "undef".  Use "quote_char" to disable
317       quotation.
318
319       See also "CAVEATS"
320
321       escape_char
322
323        my $csv = Text::CSV_XS->new ({ escape_char => "\\" });
324                $csv->escape_char (":");
325        my $c = $csv->escape_char;
326
327       The character to  escape  certain characters inside quoted fields.
328       This is limited to a  single-byte  character,  usually  in the  range
329       from  0x20 (space) to 0x7E (tilde).
330
331       The "escape_char" defaults to being the double-quote mark ("""). In
332       other words the same as the default "quote_char". This means that
333       doubling the quote mark in a field escapes it:
334
335        "foo","bar","Escape ""quote mark"" with two ""quote marks""","baz"
336
337       If  you  change  the   "quote_char"  without  changing  the
338       "escape_char",  the  "escape_char" will still be the double-quote
339       (""").  If instead you want to escape the  "quote_char" by doubling it
340       you will need to also change the  "escape_char"  to be the same as what
341       you have changed the "quote_char" to.
342
343       Setting "escape_char" to <undef> or "" will disable escaping completely
344       and is greatly discouraged. This will also disable "escape_null".
345
346       The escape character can not be equal to the separation character.
347
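       The interplay between "quote_char" and "escape_char" can be sketched
       as follows. The example assumes single quotes are wanted for both, so
       that a quote inside a field is escaped by doubling it; the data is
       illustrative only.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({
    binary      => 1,
    quote_char  => "'",
    escape_char => "'",     # changed together with quote_char
    });

$csv->combine ("it's", 2) or die "combine failed";
my $line = $csv->string;    # 'it''s',2

$csv->parse ($line) or die "parse failed";
my @fields = $csv->fields;
```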
348       binary
349
350        my $csv = Text::CSV_XS->new ({ binary => 1 });
351                $csv->binary (0);
352        my $f = $csv->binary;
353
354       If this attribute is 1,  you may use binary characters in quoted
355       fields, including line feeds, carriage returns and "NULL" bytes. (The
356       latter could be escaped as ""0".) By default this feature is off.
357
358       If a string is marked UTF8,  "binary" will be turned on automatically
359       when binary characters other than "CR" and "NL" are encountered.   Note
360       that a simple string like "\x{00a0}" might still be binary, but not
361       marked UTF8, so setting "{ binary => 1 }" is still a wise option.
362
363       strict
364
365        my $csv = Text::CSV_XS->new ({ strict => 1 });
366                $csv->strict (0);
367        my $f = $csv->strict;
368
369       If this attribute is set to 1, any row that parses to a different
370       number of fields than the previous row will cause the parser to throw
371       error 2014.
372
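       A small sketch of "strict" in action, assuming an in-memory file with
       made-up content whose last record is one field short. Reading stops
       at the offending record and the error code can be inspected through
       "error_diag":

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical input: the third record has only two fields
my $data = "a,b,c\n1,2,3\n4,5\n";

my $csv = Text::CSV_XS->new ({ binary => 1, strict => 1 });
open my $fh, "<", \$data or die "in-memory file: $!";
my @rows;
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;
    }
close $fh;

# In numeric context error_diag yields the error code
my $code = 0 + $csv->error_diag;
print "read ", scalar @rows, " records, stopped with error $code\n";
```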
373       formula_handling
374
375       formula
376
377        my $csv = Text::CSV_XS->new ({ formula => "none" });
378                $csv->formula ("none");
379        my $f = $csv->formula;
380
381       This defines the behavior of fields containing formulas. As formulas
382       are considered dangerous in spreadsheets, this attribute can define an
383       optional action to be taken if a field starts with an equal sign ("=").
384
385       For purpose of code-readability, this can also be written as
386
387        my $csv = Text::CSV_XS->new ({ formula_handling => "none" });
388                $csv->formula_handling ("none");
389        my $f = $csv->formula_handling;
390
391       Possible values for this attribute are
392
393       none
394         Take no specific action. This is the default.
395
396          $csv->formula ("none");
397
398       die
399         Cause the process to "die" whenever a leading "=" is encountered.
400
401          $csv->formula ("die");
402
403       croak
404         Cause the process to "croak" whenever a leading "=" is encountered.
405         (See Carp)
406
407          $csv->formula ("croak");
408
409       diag
410         Report position and content of the field whenever a leading  "=" is
411         found.  The value of the field is unchanged.
412
413          $csv->formula ("diag");
414
415       empty
416         Replace the content of fields that start with a "=" with the empty
417         string.
418
419          $csv->formula ("empty");
420          $csv->formula ("");
421
422       undef
423         Replace the content of fields that start with a "=" with "undef".
424
425          $csv->formula ("undef");
426          $csv->formula (undef);
427
428       a callback
429         Modify the content of fields that start with a  "="  with the return-
430         value of the callback.  The original content of the field is
431         available inside the callback as $_;
432
433          # Replace all formula's with 42
434          $csv->formula (sub { 42; });
435
436          # same as $csv->formula ("empty") but slower
437          $csv->formula (sub { "" });
438
439          # Allow =4+12
440          $csv->formula (sub { s/^=(\d+\+\d+)$/$1/eer });
441
442          # Allow more complex calculations
443          $csv->formula (sub { eval { s{^=([-+*/0-9()]+)$}{$1}ee }; $_ });
444
       All other values will give a warning and then fall back to "diag".
446
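       As a sketch of how these actions look on parsed data, the following
       parses the same made-up line twice: once with "empty" and once with a
       callback that returns a fixed replacement string.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $line = "1,=2+3,x";

# "empty": the formula field is replaced by the empty string
my $csv = Text::CSV_XS->new ({ binary => 1, formula => "empty" });
$csv->parse ($line) or die "parse failed";
my @emptied = $csv->fields;

# callback: the formula field is replaced by the return value
$csv->formula (sub { "blocked" });
$csv->parse ($line) or die "parse failed";
my @replaced = $csv->fields;
```

       Fields that do not start with "=" are left alone in both cases.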
447       decode_utf8
448
449        my $csv = Text::CSV_XS->new ({ decode_utf8 => 1 });
450                $csv->decode_utf8 (0);
451        my $f = $csv->decode_utf8;
452
       This attribute defaults to TRUE.
454
455       While parsing,  fields that are valid UTF-8, are automatically set to
456       be UTF-8, so that
457
458         $csv->parse ("\xC4\xA8\n");
459
460       results in
461
462         PV("\304\250"\0) [UTF8 "\x{128}"]
463
464       Sometimes it might not be a desired action.  To prevent those upgrades,
465       set this attribute to false, and the result will be
466
467         PV("\304\250"\0)
468
469       auto_diag
470
471        my $csv = Text::CSV_XS->new ({ auto_diag => 1 });
472                $csv->auto_diag (2);
473        my $l = $csv->auto_diag;
474
       Setting this attribute to a number between 1 and 9 causes "error_diag"
       to be automatically called in void context upon errors.
477
478       In case of error "2012 - EOF", this call will be void.
479
480       If "auto_diag" is set to a numeric value greater than 1, it will "die"
481       on errors instead of "warn".  If set to anything unrecognized,  it will
482       be silently ignored.
483
       Future extensions to this feature will include more reliable
       auto-detection of "autodie" being active in the scope in which the
       error occurred, which will increment the value of "auto_diag" by 1
       the moment the error is detected.
488
489       diag_verbose
490
491        my $csv = Text::CSV_XS->new ({ diag_verbose => 1 });
492                $csv->diag_verbose (2);
493        my $l = $csv->diag_verbose;
494
495       Set the verbosity of the output triggered by "auto_diag".   Currently
496       only adds the current  input-record-number  (if known)  to the
497       diagnostic output with an indication of the position of the error.
498
499       blank_is_undef
500
501        my $csv = Text::CSV_XS->new ({ blank_is_undef => 1 });
502                $csv->blank_is_undef (0);
503        my $f = $csv->blank_is_undef;
504
505       Under normal circumstances, "CSV" data makes no distinction between
506       quoted- and unquoted empty fields.  These both end up in an empty
507       string field once read, thus
508
509        1,"",," ",2
510
511       is read as
512
513        ("1", "", "", " ", "2")
514
515       When writing  "CSV" files with either  "always_quote" or  "quote_empty"
516       set, the unquoted  empty field is the result of an undefined value.
517       To enable this distinction when  reading "CSV"  data,  the
518       "blank_is_undef"  attribute will cause  unquoted empty fields to be set
519       to "undef", causing the above to be parsed as
520
521        ("1", "", undef, " ", "2")
522
523       Note that this is specifically important when loading  "CSV" fields
524       into a database that allows "NULL" values,  as the perl equivalent for
525       "NULL" is "undef" in DBI land.
526
527       empty_is_undef
528
529        my $csv = Text::CSV_XS->new ({ empty_is_undef => 1 });
530                $csv->empty_is_undef (0);
531        my $f = $csv->empty_is_undef;
532
533       Going one  step  further  than  "blank_is_undef",  this attribute
534       converts all empty fields to "undef", so
535
536        1,"",," ",2
537
538       is read as
539
540        (1, undef, undef, " ", 2)
541
542       Note that this affects only fields that are  originally  empty,  not
543       fields that are empty after stripping allowed whitespace. YMMV.
544
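       The difference between "blank_is_undef" and "empty_is_undef" can be
       checked with a short sketch that parses the sample line from above
       under each attribute. The %fields hash is just a helper for this
       example.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $line = q{1,"",," ",2};
my %fields;
for my $attr (qw( blank_is_undef empty_is_undef )) {
    my $csv = Text::CSV_XS->new ({ binary => 1, $attr => 1 });
    $csv->parse ($line) or die "parse failed";
    $fields{$attr} = [ $csv->fields ];

    # Show undefined fields as the bare word undef
    print "$attr: ", join (", ",
        map { defined $_ ? qq{"$_"} : "undef" } @{$fields{$attr}}), "\n";
    }
```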
545       allow_whitespace
546
547        my $csv = Text::CSV_XS->new ({ allow_whitespace => 1 });
548                $csv->allow_whitespace (0);
549        my $f = $csv->allow_whitespace;
550
551       When this option is set to true,  the whitespace  ("TAB"'s and
552       "SPACE"'s) surrounding  the  separation character  is removed when
553       parsing.  If either "TAB" or "SPACE" is one of the three characters
554       "sep_char", "quote_char", or "escape_char" it will not be considered
555       whitespace.
556
557       Now lines like:
558
559        1 , "foo" , bar , 3 , zapp
560
561       are parsed as valid "CSV", even though it violates the "CSV" specs.
562
563       Note that  all  whitespace is stripped from both  start and  end of
564       each field.  That would make it  more than a feature to enable parsing
565       bad "CSV" lines, as
566
567        1,   2.0,  3,   ape  , monkey
568
569       will now be parsed as
570
571        ("1", "2.0", "3", "ape", "monkey")
572
573       even if the original line was perfectly acceptable "CSV".
574
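       A minimal sketch comparing the default parser with one that has
       "allow_whitespace" set, using the first sample line from this
       section:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $line = q{1 , "foo" , bar , 3 , zapp};

# The default parser rejects the quotes that follow a space
my $default    = Text::CSV_XS->new ({ binary => 1 });
my $default_ok = $default->parse ($line) ? 1 : 0;

# With allow_whitespace the surrounding blanks are stripped
my $loose = Text::CSV_XS->new ({ binary => 1, allow_whitespace => 1 });
$loose->parse ($line) or die "parse failed";
my @fields = $loose->fields;

print "default parser accepted the line: $default_ok\n";
print join ("|", @fields), "\n";
```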
575       allow_loose_quotes
576
577        my $csv = Text::CSV_XS->new ({ allow_loose_quotes => 1 });
578                $csv->allow_loose_quotes (0);
579        my $f = $csv->allow_loose_quotes;
580
581       By default, parsing unquoted fields containing "quote_char" characters
582       like
583
584        1,foo "bar" baz,42
585
586       would result in parse error 2034.  Though it is still bad practice to
587       allow this format,  we  cannot  help  the  fact  that  some  vendors
588       make  their applications spit out lines styled this way.
589
590       If there is really bad "CSV" data, like
591
592        1,"foo "bar" baz",42
593
594       or
595
596        1,""foo bar baz"",42
597
598       there is a way to get this data-line parsed and leave the quotes inside
599       the quoted field as-is.  This can be achieved by setting
600       "allow_loose_quotes" AND making sure that the "escape_char" is  not
601       equal to "quote_char".
602
603       allow_loose_escapes
604
605        my $csv = Text::CSV_XS->new ({ allow_loose_escapes => 1 });
606                $csv->allow_loose_escapes (0);
607        my $f = $csv->allow_loose_escapes;
608
609       Parsing fields  that  have  "escape_char"  characters that escape
610       characters that do not need to be escaped, like:
611
        my $csv = Text::CSV_XS->new ({ escape_char => "\\" });
        $csv->parse (q{1,"my bar\'s",baz,42});
614
615       would result in parse error 2025.   Though it is bad practice to allow
616       this format,  this attribute enables you to treat all escape character
617       sequences equal.
618
619       allow_unquoted_escape
620
621        my $csv = Text::CSV_XS->new ({ allow_unquoted_escape => 1 });
622                $csv->allow_unquoted_escape (0);
623        my $f = $csv->allow_unquoted_escape;
624
       A backward compatibility issue where "escape_char" differs from
       "quote_char" prevents "escape_char" from being in the first position
       of a field. If "quote_char" is equal to the default """ and
       "escape_char" is set to "\", this would be illegal:
629
630        1,\0,2
631
632       Setting this attribute to 1  might help to overcome issues with
633       backward compatibility and allow this style.
634
635       always_quote
636
637        my $csv = Text::CSV_XS->new ({ always_quote => 1 });
638                $csv->always_quote (0);
639        my $f = $csv->always_quote;
640
641       By default the generated fields are quoted only if they need to be.
642       For example, if they contain the separator character. If you set this
643       attribute to 1 then all defined fields will be quoted. ("undef" fields
       are not quoted, see "blank_is_undef"). This often makes it easier to
       handle exported data in external applications. (Poor creatures that
       would be better off using Text::CSV_XS. :)
647
648       quote_space
649
650        my $csv = Text::CSV_XS->new ({ quote_space => 1 });
651                $csv->quote_space (0);
652        my $f = $csv->quote_space;
653
       By default, a space in a field triggers quotation. As no rule in
       "CSV" requires this, nor any rule the opposite, the default is true
       for safety. You can exclude the space from this trigger by setting
       this attribute to 0.
658
659       quote_empty
660
661        my $csv = Text::CSV_XS->new ({ quote_empty => 1 });
662                $csv->quote_empty (0);
663        my $f = $csv->quote_empty;
664
665       By default the generated fields are quoted only if they need to be.
666       An empty (defined) field does not need quotation. If you set this
667       attribute to 1 then empty defined fields will be quoted.  ("undef"
668       fields are not quoted, see "blank_is_undef"). See also "always_quote".
669
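       The effect of "always_quote", "quote_space", and "quote_empty" on
       generated output can be compared side by side. This is a sketch only;
       the helper "generate" and the sample row are made up for this
       example.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# A plain field, a field with a space, an empty field, an undefined field
my @row = ("a", "b c", "", undef);

# Hypothetical helper: combine @row with the given extra attributes
sub generate {
    my %attr = @_;
    my $csv  = Text::CSV_XS->new ({ binary => 1, %attr });
    $csv->combine (@row) or die "combine failed";
    return $csv->string;
    }

my $default      = generate ();                     # a,"b c",,
my $always_quote = generate (always_quote => 1);    # "a","b c","",
my $quote_empty  = generate (quote_empty  => 1);    # a,"b c","",
my $no_space     = generate (quote_space  => 0);    # a,b c,,
```

       In all four cases the undefined field is written as an unquoted
       empty field.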
670       quote_binary
671
672        my $csv = Text::CSV_XS->new ({ quote_binary => 1 });
673                $csv->quote_binary (0);
674        my $f = $csv->quote_binary;
675
676       By default,  all "unsafe" bytes inside a string cause the combined
677       field to be quoted.  By setting this attribute to 0, you can disable
678       that trigger for bytes >= 0x7F.
679
680       escape_null
681
682        my $csv = Text::CSV_XS->new ({ escape_null => 1 });
683                $csv->escape_null (0);
684        my $f = $csv->escape_null;
685
686       By default, a "NULL" byte in a field would be escaped. This option
687       enables you to treat the  "NULL"  byte as a simple binary character in
688       binary mode (the "{ binary => 1 }" is set).  The default is true.  You
689       can prevent "NULL" escapes by setting this attribute to 0.
690
691       When the "escape_char" attribute is set to undefined,  this attribute
692       will be set to false.
693
694       The default setting will encode "=\x00=" as
695
696        "="0="
697
       With "escape_null" set to 0, this will result in
699
700        "=\x00="
701
702       The default when using the "csv" function is "false".
703
704       For backward compatibility reasons,  the deprecated old name
705       "quote_null" is still recognized.
706
707       keep_meta_info
708
709        my $csv = Text::CSV_XS->new ({ keep_meta_info => 1 });
710                $csv->keep_meta_info (0);
711        my $f = $csv->keep_meta_info;
712
713       By default, the parsing of input records is as simple and fast as
714       possible.  However,  some parsing information - like quotation of the
715       original field - is lost in that process.  Setting this flag to true
716       enables retrieving that information after parsing with  the methods
717       "meta_info",  "is_quoted", and "is_binary" described below.  Default is
718       false for performance.
719
       If you set this attribute to a value greater than 9, then you can
       control output quotation style like it was used in the input of the
       last parsed record (unless quotation was added because of other
       reasons).
724
725        my $csv = Text::CSV_XS->new ({
726           binary         => 1,
727           keep_meta_info => 1,
728           quote_space    => 0,
729           });
730
        $csv->parse (q{1,,"", ," ",f,"g","h""h",help,"help"});
        my @row = $csv->fields;

        $csv->print (*STDOUT, \@row);
734        # 1,,, , ,f,g,"h""h",help,help
735        $csv->keep_meta_info (11);
736        $csv->print (*STDOUT, \@row);
737        # 1,,"", ," ",f,"g","h""h",help,"help"
738
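       A short sketch of retrieving the quotation state after parsing, using
       the "is_quoted" method mentioned above on a made-up line:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, keep_meta_info => 1 });
$csv->parse (q{1,"two",3}) or die "parse failed";

# One flag per field: was it quoted in the input?
my @quoted = map { $csv->is_quoted ($_) ? 1 : 0 } 0 .. 2;
print "@quoted\n";      # 0 1 0
```

       Without "keep_meta_info" this information is not retained.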
739       undef_str
740
741        my $csv = Text::CSV_XS->new ({ undef_str => "\\N" });
742                $csv->undef_str (undef);
743        my $s = $csv->undef_str;
744
745       This attribute optionally defines the output of undefined fields. The
746       value passed is not changed at all, so if it needs quotation, the
747       quotation needs to be included in the value of the attribute.  Use with
748       caution, as passing a value like  ",",,,,"""  will for sure mess up
749       your output. The default for this attribute is "undef", meaning no
750       special treatment.
751
752       This attribute is useful when exporting  CSV data  to be imported in
753       custom loaders, like for MySQL, that recognize special sequences for
754       "NULL" data.
755
756       This attribute has no meaning when parsing CSV data.
757
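       As a brief sketch, this is how an undefined field could be written as
       the sequence "\N", which some loaders treat as "NULL". The row is
       illustrative only.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, undef_str => "\\N" });

# The undefined 2nd field is written as the literal undef_str
$csv->combine (1, undef, "x") or die "combine failed";
my $line = $csv->string;
print "$line\n";        # 1,\N,x
```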
758       verbatim
759
760        my $csv = Text::CSV_XS->new ({ verbatim => 1 });
761                $csv->verbatim (0);
762        my $f = $csv->verbatim;
763
764       This is a quite controversial attribute to set,  but makes some hard
765       things possible.
766
767       The rationale behind this attribute is to tell the parser that the
768       normally special characters newline ("NL") and Carriage Return ("CR")
769       will not be special when this flag is set,  and be dealt with  as being
770       ordinary binary characters. This will ease working with data with
771       embedded newlines.
772
773       When  "verbatim"  is used with  "getline",  "getline"  auto-"chomp"'s
774       every line.
775
776       Imagine a file format like
777
778        M^^Hans^Janssen^Klas 2\n2A^Ja^11-06-2007#\r\n
779
       where the line ending is a very specific "#\r\n", and the sep_char is
       a "^" (caret). None of the fields is quoted, but embedded binary
       data is likely to be present. With the specific line ending, this
       should not be too hard to detect.
784
       By default, Text::CSV_XS' parse function only knows "\n" and "\r" as
       legal line endings, and so has to treat the embedded newline as a
       real "end-of-line": it can only scan on into the next line if binary
       is true and the newline is inside a quoted field. With this option,
       we tell "parse" to parse the line as if "\n" is nothing more than a
       binary character.

       For "parse" this means that the parser no longer has any notion of
       line endings; "getline" "chomp"s line endings on reading.
794
795       types
796
797       A set of column types; the attribute is immediately passed to the
798       "types" method.
799
800       callbacks
801
802       See the "Callbacks" section below.
803
804       accessors
805
806       To sum it up,
807
808        $csv = Text::CSV_XS->new ();
809
810       is equivalent to
811
812        $csv = Text::CSV_XS->new ({
813            eol                   => undef, # \r, \n, or \r\n
814            sep_char              => ',',
815            sep                   => undef,
816            quote_char            => '"',
817            quote                 => undef,
818            escape_char           => '"',
819            binary                => 0,
820            decode_utf8           => 1,
821            auto_diag             => 0,
822            diag_verbose          => 0,
823            blank_is_undef        => 0,
824            empty_is_undef        => 0,
825            allow_whitespace      => 0,
826            allow_loose_quotes    => 0,
827            allow_loose_escapes   => 0,
828            allow_unquoted_escape => 0,
829            always_quote          => 0,
830            quote_empty           => 0,
831            quote_space           => 1,
832            escape_null           => 1,
833            quote_binary          => 1,
834            keep_meta_info        => 0,
835            strict                => 0,
836            formula               => 0,
837            verbatim              => 0,
838            undef_str             => undef,
839            types                 => undef,
840            callbacks             => undef,
841            });
842
       For each of the attributes mentioned above, an accessor method is
       available to query the current value or to change it:
845
846        my $quote = $csv->quote_char;
847        $csv->binary (1);
848
849       It is not wise to change these settings halfway through writing "CSV"
850       data to a stream. If however you want to create a new stream using the
851       available "CSV" object, there is no harm in changing them.
852
       If the "new" constructor call fails, it returns "undef" and makes
       the reason for the failure available through the "error_diag" method.
855
856        $csv = Text::CSV_XS->new ({ ecs_char => 1 }) or
857            die "".Text::CSV_XS->error_diag ();
858
859       "error_diag" will return a string like
860
861        "INI - Unknown attribute 'ecs_char'"
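
       A short sketch of checking a failed constructor call without dying,
       using the same deliberately misspelled attribute.  The variable names
       are only illustrative.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# "ecs_char" is a deliberate misspelling of "escape_char".
my $csv = Text::CSV_XS->new ({ ecs_char => 1 });

# The class-method call reports why the last "new" failed.
my $diag = Text::CSV_XS->error_diag ();
my $code = 0  + $diag;   # numeric context: the error code
my $msg  = "" . $diag;   # string context: the error message

defined $csv or print "new failed ($code): $msg\n";
```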
862
863   known_attributes
864        @attr = Text::CSV_XS->known_attributes;
865        @attr = Text::CSV_XS::known_attributes;
866        @attr = $csv->known_attributes;
867
868       This method will return an ordered list of all the supported
869       attributes as described above.   This can be useful for knowing what
870       attributes are valid in classes that use or extend Text::CSV_XS.
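
       One possible use, sketched here with a made-up options hash, is to
       reject unsupported attribute names before they reach "new":

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Build a lookup table of every attribute name the module supports.
my %known = map { $_ => 1 } Text::CSV_XS->known_attributes;

# Hypothetical user-supplied options; "ecs_char" is a typo on purpose.
my %opt     = (sep_char => ";", ecs_char => "\\");
my @unknown = sort grep { !$known{$_} } keys %opt;

print "Unsupported attribute(s): @unknown\n" if @unknown;
```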
871
872   print
873        $status = $csv->print ($fh, $colref);
874
875       Similar to  "combine" + "string" + "print",  but much more efficient.
876       It expects an array ref as input  (not an array!)  and the resulting
877       string is not really  created,  but  immediately  written  to the  $fh
878       object, typically an IO handle or any other object that offers a
879       "print" method.
880
881       For performance reasons  "print"  does not create a result string,  so
882       all "string", "status", "fields", and "error_input" methods will return
883       undefined information after executing this method.
884
885       If $colref is "undef"  (explicit,  not through a variable argument) and
886       "bind_columns"  was used to specify fields to be printed,  it is
887       possible to make performance improvements, as otherwise data would have
888       to be copied as arguments to the method call:
889
890        $csv->bind_columns (\($foo, $bar));
891        $status = $csv->print ($fh, undef);
892
893       A short benchmark
894
895        my @data = ("aa" .. "zz");
896        $csv->bind_columns (\(@data));
897
898        $csv->print ($fh, [ @data ]);   # 11800 recs/sec
899        $csv->print ($fh,  \@data  );   # 57600 recs/sec
900        $csv->print ($fh,   undef  );   # 48500 recs/sec
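
       The two calling styles can be sketched as follows.  An in-memory
       handle is used only to keep the example self-contained; $foo and $bar
       are placeholder variables.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\n" });

# An in-memory handle stands in for a real file.
my $out = "";
open my $fh, ">", \$out or die "in-memory handle: $!";

# Pass an array reference, not a list.
$csv->print ($fh, [ "code", "full name", 'say "hi"' ]) or die "print failed";

# With bound columns, a literal undef prints the bound variables.
my ($foo, $bar) = (1, "x y");
$csv->bind_columns (\($foo, $bar));
$csv->print ($fh, undef) or die "print failed";

close $fh or die "close: $!";
print $out;
```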
901
902   say
903        $status = $csv->say ($fh, $colref);
904
905       Like "print", but "eol" defaults to "$\".
906
907   print_hr
908        $csv->print_hr ($fh, $ref);
909
910       Provides an easy way  to print a  $ref  (as fetched with "getline_hr")
911       provided the column names are set with "column_names".
912
913       It is just a wrapper method with basic parameter checks over
914
915        $csv->print ($fh, [ map { $ref->{$_} } $csv->column_names ]);
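
       A minimal sketch, with made-up column names and values, showing that
       the output order follows "column_names" rather than the hash:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\n" });
$csv->column_names (qw( code name price ));

# An in-memory handle stands in for a real file.
my $out = "";
open my $fh, ">", \$out or die "in-memory handle: $!";

# Hash keys may come in any order; column_names fixes the output order.
$csv->print_hr ($fh, { name => "Widget", price => "9.95", code => 1 });

close $fh or die "close: $!";
print $out;
```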
916
917   combine
918        $status = $csv->combine (@fields);
919
920       This method constructs a "CSV" record from  @fields,  returning success
921       or failure.   Failure can result from lack of arguments or an argument
922       that contains an invalid character.   Upon success,  "string" can be
923       called to retrieve the resultant "CSV" string.  Upon failure,  the
924       value returned by "string" is undefined and "error_input" could be
925       called to retrieve the invalid argument.
926
927   string
928        $line = $csv->string ();
929
930       This method returns the input to  "parse"  or the resultant "CSV"
931       string of "combine", whichever was called more recently.
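
       The interplay of "combine", "string", "parse", and "fields" can be
       sketched as a round trip.  The field values are arbitrary.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1 });

# Fields that contain the separator or the quote character get quoted.
my @in = ("a", "b,c", 'd"e');
$csv->combine (@in) or die "combine failed";
my $line = $csv->string;

# Parsing the combined string gives back the original fields.
$csv->parse ($line) or die "parse failed";
my @out = $csv->fields;

print "$line\n";
```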
932
933   getline
934        $colref = $csv->getline ($fh);
935
936       This is the counterpart to  "print",  as "parse"  is the counterpart to
937       "combine":  it parses a row from the $fh  handle using the "getline"
938       method associated with $fh  and parses this row into an array ref.
939       This array ref is returned by the function or "undef" for failure.
940       When $fh does not support "getline", you are likely to hit errors.
941
942       When fields are bound with "bind_columns" the return value is a
943       reference to an empty list.
944
       As with "print", the "string", "fields", and "status" methods are
       meaningless after a call to "getline".
946
947   getline_all
948        $arrayref = $csv->getline_all ($fh);
949        $arrayref = $csv->getline_all ($fh, $offset);
950        $arrayref = $csv->getline_all ($fh, $offset, $length);
951
952       This will return a reference to a list of getline ($fh) results.  In
953       this call, "keep_meta_info" is disabled.  If $offset is negative, as
954       with "splice", only the last  "abs ($offset)" records of $fh are taken
955       into consideration.
956
957       Given a CSV file with 10 lines:
958
959        lines call
960        ----- ---------------------------------------------------------
961        0..9  $csv->getline_all ($fh)         # all
962        0..9  $csv->getline_all ($fh,  0)     # all
963        8..9  $csv->getline_all ($fh,  8)     # start at 8
964        -     $csv->getline_all ($fh,  0,  0) # start at 0 first 0 rows
965        0..4  $csv->getline_all ($fh,  0,  5) # start at 0 first 5 rows
966        4..5  $csv->getline_all ($fh,  4,  2) # start at 4 first 2 rows
967        8..9  $csv->getline_all ($fh, -2)     # last 2 rows
968        6..7  $csv->getline_all ($fh, -4,  2) # first 2 of last  4 rows
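
       Some rows of the table above, sketched against ten generated
       records.  The helper "slice_rows" is hypothetical; it reopens the
       data for every call because each call consumes the handle.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv  = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
my $data = join "", map { "row$_,$_\n" } 0 .. 9;   # records 0 .. 9

# Hypothetical helper: return the second field of every selected record.
sub slice_rows {
    my @args = @_;
    open my $fh, "<", \$data or die "in-memory handle: $!";
    my $rows = $csv->getline_all ($fh, @args);
    return join " ", map { $_->[1] } @$rows;
    }

my $mid  = slice_rows ( 4, 2);   # start at 4, first 2 rows
my $tail = slice_rows (-2);      # last 2 rows
my $some = slice_rows (-4, 2);   # first 2 of last 4 rows
print "$mid | $tail | $some\n";
```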
969
970   getline_hr
971       The "getline_hr" and "column_names" methods work together  to allow you
972       to have rows returned as hashrefs.  You must call "column_names" first
973       to declare your column names.
974
975        $csv->column_names (qw( code name price description ));
976        $hr = $csv->getline_hr ($fh);
977        print "Price for $hr->{name} is $hr->{price} EUR\n";
978
979       "getline_hr" will croak if called before "column_names".
980
       Note that "getline_hr" creates a hashref for every row and is much
       slower than the combined use of "bind_columns" and "getline", which
       still offers the same easy-to-use hashref inside the loop.  The
       following loop uses "getline_hr":
984
985        my @cols = @{$csv->getline ($fh)};
986        $csv->column_names (@cols);
987        while (my $row = $csv->getline_hr ($fh)) {
988            print $row->{price};
989            }
990
       This could easily be rewritten to the much faster:
992
993        my @cols = @{$csv->getline ($fh)};
994        my $row = {};
995        $csv->bind_columns (\@{$row}{@cols});
996        while ($csv->getline ($fh)) {
997            print $row->{price};
998            }
999
1000       Your mileage may vary for the size of the data and the number of rows.
1001       With perl-5.14.2 the comparison for a 100_000 line file with 14
1002       columns:
1003
1004                   Rate hashrefs getlines
1005        hashrefs 1.00/s       --     -76%
1006        getlines 4.15/s     313%       --
1007
1008   getline_hr_all
1009        $arrayref = $csv->getline_hr_all ($fh);
1010        $arrayref = $csv->getline_hr_all ($fh, $offset);
1011        $arrayref = $csv->getline_hr_all ($fh, $offset, $length);
1012
1013       This will return a reference to a list of   getline_hr ($fh) results.
1014       In this call, "keep_meta_info" is disabled.
1015
1016   parse
1017        $status = $csv->parse ($line);
1018
       This method decomposes a "CSV" string into fields, returning success
       or failure.  Failure can result from a lack of argument or from the
       given "CSV" string being improperly formatted.  Upon success, "fields"
       can be called to retrieve the decomposed fields.  Upon failure,
       calling "fields" will return undefined data and "error_input" can be
       called to retrieve the invalid argument.
1025
       You may use the "types" method for setting column types.  See the
       description of "types" below.
1028
1029       The $line argument is supposed to be a simple scalar. Everything else
1030       is supposed to croak and set error 1500.
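
       A sketch of handling both outcomes of "parse".  The two input lines
       are invented; the second one has a quoted field that is never closed.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv   = Text::CSV_XS->new ();
my @lines = ('1,"two, with comma",3', '1,"unterminated,3');
my @report;

for my $line (@lines) {
    if ($csv->parse ($line)) {
        # Success: the decomposed fields are available.
        my @fields = $csv->fields;
        push @report, "ok: " . scalar (@fields) . " fields";
        }
    else {
        # Failure: ask for the argument that could not be parsed.
        my $bad = $csv->error_input // "(not available)";
        push @report, "failed on input: $bad";
        }
    }

print "$_\n" for @report;
```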
1031
1032   fragment
1033       This function tries to implement RFC7111  (URI Fragment Identifiers for
1034       the text/csv Media Type) - http://tools.ietf.org/html/rfc7111
1035
1036        my $AoA = $csv->fragment ($fh, $spec);
1037
1038       In specifications,  "*" is used to specify the last item, a dash ("-")
1039       to indicate a range.   All indices are 1-based:  the first row or
1040       column has index 1. Selections can be combined with the semi-colon
1041       (";").
1042
1043       When using this method in combination with  "column_names",  the
1044       returned reference  will point to a  list of hashes  instead of a  list
1045       of lists.  A disjointed  cell-based combined selection  might return
1046       rows with different number of columns making the use of hashes
1047       unpredictable.
1048
1049        $csv->column_names ("Name", "Age");
1050        my $AoH = $csv->fragment ($fh, "col=3;8");
1051
1052       If the "after_parse" callback is active,  it is also called on every
1053       line parsed and skipped before the fragment.
1054
1055       row
1056          row=4
1057          row=5-7
1058          row=6-*
1059          row=1-2;4;6-*
1060
1061       col
1062          col=2
1063          col=1-3
1064          col=4-*
1065          col=1-2;4;7-*
1066
1067       cell
1068         In cell-based selection, the comma (",") is used to pair row and
1069         column
1070
1071          cell=4,1
1072
1073         The range operator ("-") using "cell"s can be used to define top-left
1074         and bottom-right "cell" location
1075
1076          cell=3,1-4,6
1077
1078         The "*" is only allowed in the second part of a pair
1079
1080          cell=3,2-*,2    # row 3 till end, only column 2
1081          cell=3,2-3,*    # column 2 till end, only row 3
1082          cell=3,2-*,*    # strip row 1 and 2, and column 1
1083
1084         Cells and cell ranges may be combined with ";", possibly resulting in
1085         rows with different numbers of columns
1086
1087          cell=1,1-2,2;3,3-4,4;1,4;4,1
1088
1089         Disjointed selections will only return selected cells.   The cells
1090         that are not  specified  will  not  be  included  in the  returned
1091         set,  not even as "undef".  As an example given a "CSV" like
1092
1093          11,12,13,...19
1094          21,22,...28,29
1095          :            :
1096          91,...97,98,99
1097
1098         with "cell=1,1-2,2;3,3-4,4;1,4;4,1" will return:
1099
1100          11,12,14
1101          21,22
1102          33,34
1103          41,43,44
1104
         Overlapping cell-specs will return those cells only once, so
         "cell=1,1-3,3;2,2-4,4;2,3;4,2" will return:
1107
1108          11,12,13
1109          21,22,23,24
1110          31,32,33,34
1111          42,43,44
1112
1113       RFC7111 <http://tools.ietf.org/html/rfc7111> does  not  allow different
1114       types of specs to be combined   (either "row" or "col" or "cell").
1115       Passing an invalid fragment specification will croak and set error
1116       2013.
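
       The three kinds of specification can be tried on a small generated
       grid, sketched here with 4 rows of 4 columns where every cell holds
       its own row and column number.  The helper "fragment_of" is
       hypothetical.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });

# Rows "11,12,13,14" .. "41,42,43,44"; indices are 1-based.
my $data = join "", map {
    my $r = $_;
    join (",", map { $r . $_ } 1 .. 4) . "\n";
    } 1 .. 4;

# Hypothetical helper: apply one spec and flatten the result for display.
sub fragment_of {
    my $spec = shift;
    open my $fh, "<", \$data or die "in-memory handle: $!";
    my $aoa = $csv->fragment ($fh, $spec);
    return join "/", map { join ",", @$_ } @$aoa;
    }

my $rows  = fragment_of ("row=2-3");
my $cols  = fragment_of ("col=2;4");
my $cells = fragment_of ("cell=1,1-2,2");
print "$rows\n$cols\n$cells\n";
```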
1117
1118   column_names
1119       Set the "keys" that will be used in the  "getline_hr"  calls.  If no
1120       keys (column names) are passed, it will return the current setting as a
1121       list.
1122
1123       "column_names" accepts a list of scalars  (the column names)  or a
1124       single array_ref, so you can pass the return value from "getline" too:
1125
1126        $csv->column_names ($csv->getline ($fh));
1127
1128       "column_names" does no checking on duplicates at all, which might lead
1129       to unexpected results.   Undefined entries will be replaced with the
1130       string "\cAUNDEF\cA", so
1131
1132        $csv->column_names (undef, "", "name", "name");
1133        $hr = $csv->getline_hr ($fh);
1134
1135       will set "$hr->{"\cAUNDEF\cA"}" to the 1st field,  "$hr->{""}" to the
1136       2nd field, and "$hr->{name}" to the 4th field,  discarding the 3rd
1137       field.
1138
1139       "column_names" croaks on invalid arguments.
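
       The effect of undefined and duplicate names can be sketched with a
       single made-up record:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv  = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
my $data = "a,b,c,d\n";
open my $fh, "<", \$data or die "in-memory handle: $!";

# One undefined name, one empty name, and a duplicate.
$csv->column_names (undef, "", "name", "name");
my $hr = $csv->getline_hr ($fh);

# Four fields end up under three keys; the 3rd field is overwritten.
my $count = keys %$hr;
print "$count keys, name => $hr->{name}\n";
```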
1140
1141   header
1142       This method does NOT work in perl-5.6.x
1143
1144       Parse the CSV header and set "sep", column_names and encoding.
1145
1146        my @hdr = $csv->header ($fh);
1147        $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1148        $csv->header ($fh, { detect_bom => 1, munge_column_names => "lc" });
1149
1150       The first argument should be a file handle.
1151
1152       This method resets some object properties,  as it is supposed to be
1153       invoked only once per file or stream.  It will leave attributes
1154       "column_names" and "bound_columns" alone if setting column names is
       disabled.  Reading headers on previously processed objects might fail
       on perl-5.8.0 and older.
1157
1158       Assuming that the file opened for parsing has a header, and the header
1159       does not contain problematic characters like embedded newlines,   read
1160       the first line from the open handle then auto-detect whether the header
1161       separates the column names with a character from the allowed separator
1162       list.
1163
1164       If any of the allowed separators matches,  and none of the other
1165       allowed separators match,  set  "sep"  to that  separator  for the
1166       current CSV_XS instance and use it to parse the first line, map those
1167       to lowercase, and use that to set the instance "column_names":
1168
1169        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
1170        open my $fh, "<", "file.csv";
1171        binmode $fh; # for Windows
1172        $csv->header ($fh);
1173        while (my $row = $csv->getline_hr ($fh)) {
1174            ...
1175            }
1176
1177       If the header is empty,  contains more than one unique separator out of
1178       the allowed set,  contains empty fields,   or contains identical fields
1179       (after folding), it will croak with error 1010, 1011, 1012, or 1013
1180       respectively.
1181
1182       If the header contains embedded newlines or is not valid  CSV  in any
1183       other way, this method will croak and leave the parse error untouched.
1184
1185       A successful call to "header"  will always set the  "sep"  of the $csv
1186       object. This behavior can not be disabled.
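
       A sketch of the detection on semicolon-separated data.  The data is
       made up, and an in-memory handle is used instead of a real file.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv  = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
my $data = "Code;Name;Price\n1;Widget;9.95\n";
open my $fh, "<", \$data or die "in-memory handle: $!";

# List context returns the column names, folded to lower case by default.
my @hdr = $csv->header ($fh);

# The detected separator is now set on the object.
my $sep = $csv->sep;

# The header also set column_names, so rows come back as hashrefs.
my $row = $csv->getline_hr ($fh);
print "sep '$sep', columns @hdr, name $row->{name}\n";
```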
1187
1188       return value
1189
1190       On error this method will croak.
1191
1192       In list context,  the headers will be returned whether they are used to
1193       set "column_names" or not.
1194
1195       In scalar context, the instance itself is returned.  Note: the values
1196       as found in the header will effectively be  lost if  "set_column_names"
1197       is false.
1198
1199       Options
1200
1201       sep_set
1202          $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1203
1204         The list of legal separators defaults to "[ ";", "," ]" and can be
1205         changed by this option.  As this is probably the most often used
1206         option,  it can be passed on its own as an unnamed argument:
1207
1208          $csv->header ($fh, [ ";", ",", "|", "\t", "::", "\x{2063}" ]);
1209
1210         Multi-byte  sequences are allowed,  both multi-character and
1211         Unicode.  See "sep".
1212
1213       detect_bom
1214          $csv->header ($fh, { detect_bom => 1 });
1215
1216         The default behavior is to detect if the header line starts with a
1217         BOM.  If the header has a BOM, use that to set the encoding of $fh.
1218         This default behavior can be disabled by passing a false value to
1219         "detect_bom".
1220
1221         Supported encodings from BOM are: UTF-8, UTF-16BE, UTF-16LE,
1222         UTF-32BE,  and UTF-32LE. BOM also supports UTF-1, UTF-EBCDIC, SCSU,
1223         BOCU-1,  and GB-18030 but Encode does not (yet). UTF-7 is not
1224         supported.
1225
1226         If a supported BOM was detected as start of the stream, it is stored
1227         in the object attribute "ENCODING".
1228
1229          my $enc = $csv->{ENCODING};
1230
1231         The encoding is used with "binmode" on $fh.
1232
1233         If the handle was opened in a (correct) encoding,  this method will
1234         not alter the encoding, as it checks the leading bytes of the first
1235         line. In case the stream starts with a decoded BOM ("U+FEFF"),
1236         "{ENCODING}" will be "" (empty) instead of the default "undef".
1237
1238       munge_column_names
1239         This option offers the means to modify the column names into
1240         something that is most useful to the application.   The default is to
1241         map all column names to lower case.
1242
1243          $csv->header ($fh, { munge_column_names => "lc" });
1244
1245         The following values are available:
1246
1247           lc     - lower case
1248           uc     - upper case
1249           db     - valid DB field names
1250           none   - do not change
1251           \%hash - supply a mapping
1252           \&cb   - supply a callback
1253
1254         Lower case
1255            $csv->header ($fh, { munge_column_names => "lc" });
1256
1257           The header is changed to all lower-case
1258
1259            $_ = lc;
1260
1261         Upper case
1262            $csv->header ($fh, { munge_column_names => "uc" });
1263
1264           The header is changed to all upper-case
1265
1266            $_ = uc;
1267
1268         Literal
1269            $csv->header ($fh, { munge_column_names => "none" });
1270
1271         Hash
            $csv->header ($fh, { munge_column_names => { foo => "sombrero" }});

           If a column name has no entry in the hash, the original name is
           used unchanged.
1275
1276         Database
1277            $csv->header ($fh, { munge_column_names => "db" });
1278
1279           - lower-case
1280
1281           - all sequences of non-word characters are replaced with an
1282             underscore
1283
1284           - all leading underscores are removed
1285
1286            $_ = lc (s/\W+/_/gr =~ s/^_+//r);
1287
1288         Callback
1289            $csv->header ($fh, { munge_column_names => sub { fc } });
1290            $csv->header ($fh, { munge_column_names => sub { "column_".$col++ } });
1291            $csv->header ($fh, { munge_column_names => sub { lc (s/\W+/_/gr) } });
1292
1293           As this callback is called in a "map", you can use $_ directly.
1294
1295       set_column_names
1296          $csv->header ($fh, { set_column_names => 1 });
1297
         The default is to set the instance's column names using
         "column_names" if the method is successful, so subsequent calls to
         "getline_hr" can return a hash.  Setting the column names can be
         disabled by passing a false value for this option.
1302
1303         As described in "return value" above, content is lost in scalar
1304         context.
1305
1306       Validation
1307
1308       When receiving CSV files from external sources,  this method can be
1309       used to protect against changes in the layout by restricting to known
1310       headers  (and typos in the header fields).
1311
1312        my %known = (
1313            "record key" => "c_rec",
1314            "rec id"     => "c_rec",
1315            "id_rec"     => "c_rec",
1316            "kode"       => "code",
1317            "code"       => "code",
1318            "vaule"      => "value",
1319            "value"      => "value",
1320            );
1321        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
1322        open my $fh, "<", $source or die "$source: $!";
1323        $csv->header ($fh, { munge_column_names => sub {
1324            s/\s+$//;
1325            s/^\s+//;
1326            $known{lc $_} or die "Unknown column '$_' in $source";
1327            }});
1328        while (my $row = $csv->getline_hr ($fh)) {
            print join ("\t", $row->{c_rec}, $row->{code}, $row->{value}), "\n";
1330            }
1331
1332   bind_columns
       Takes a list of scalar references to be used for output with "print"
       or to store the fields fetched by "getline".  When you do not pass
       enough references to store the fetched fields in, "getline" will fail
       with error 3006.  If you pass more than there are fields to return,
       the content of the remaining references is left untouched.
1338
1339        $csv->bind_columns (\$code, \$name, \$price, \$description);
1340        while ($csv->getline ($fh)) {
1341            print "The price of a $name is \x{20ac} $price\n";
1342            }
1343
1344       To reset or clear all column binding, call "bind_columns" with the
1345       single argument "undef". This will also clear column names.
1346
1347        $csv->bind_columns (undef);
1348
1349       If no arguments are passed at all, "bind_columns" will return the list
1350       of current bindings or "undef" if no binds are active.
1351
1352       Note that in parsing with  "bind_columns",  the fields are set on the
1353       fly.  That implies that if the third field of a row causes an error
1354       (or this row has just two fields where the previous row had more),  the
1355       first two fields already have been assigned the values of the current
1356       row, while the rest of the fields will still hold the values of the
1357       previous row.  If you want the parser to fail in these cases, use the
1358       "strict" attribute.
1359
1360   eof
1361        $eof = $csv->eof ();
1362
1363       If "parse" or  "getline"  was used with an IO stream,  this method will
1364       return true (1) if the last call hit end of file,  otherwise it will
1365       return false ('').  This is useful to see the difference between a
1366       failure and end of file.
1367
1368       Note that if the parsing of the last line caused an error,  "eof" is
1369       still true.  That means that if you are not using "auto_diag", an idiom
1370       like
1371
1372        while (my $row = $csv->getline ($fh)) {
1373            # ...
1374            }
1375        $csv->eof or $csv->error_diag;
1376
1377       will not report the error. You would have to change that to
1378
1379        while (my $row = $csv->getline ($fh)) {
1380            # ...
1381            }
1382        +$csv->error_diag and $csv->error_diag;
1383
1384   types
1385        $csv->types (\@tref);
1386
1387       This method is used to force that  (all)  columns are of a given type.
1388       For example, if you have an integer column,  two  columns  with
1389       doubles  and a string column, then you might do a
1390
1391        $csv->types ([Text::CSV_XS::IV (),
1392                      Text::CSV_XS::NV (),
1393                      Text::CSV_XS::NV (),
1394                      Text::CSV_XS::PV ()]);
1395
1396       Column types are used only for decoding columns while parsing,  in
1397       other words by the "parse" and "getline" methods.
1398
1399       You can unset column types by doing a
1400
1401        $csv->types (undef);
1402
1403       or fetch the current type settings with
1404
1405        $types = $csv->types ();
1406
1407       IV  Set field type to integer.
1408
1409       NV  Set field type to numeric/float.
1410
1411       PV  Set field type to string.
1412
1413   fields
1414        @columns = $csv->fields ();
1415
1416       This method returns the input to   "combine"  or the resultant
1417       decomposed fields of a successful "parse", whichever was called more
1418       recently.
1419
1420       Note that the return value is undefined after using "getline", which
1421       does not fill the data structures returned by "parse".
1422
1423   meta_info
1424        @flags = $csv->meta_info ();
1425
1426       This method returns the "flags" of the input to "combine" or the flags
1427       of the resultant  decomposed fields of  "parse",   whichever was called
1428       more recently.
1429
1430       For each field,  a meta_info field will hold  flags that  inform
1431       something about  the  field  returned  by  the  "fields"  method or
1432       passed to  the "combine" method. The flags are bit-wise-"or"'d like:
1433
       0x0001
         The field was quoted.

       0x0002
         The field was binary.
1439
1440       See the "is_***" methods below.
1441
1442   is_quoted
1443        my $quoted = $csv->is_quoted ($column_idx);
1444
1445       where  $column_idx is the  (zero-based)  index of the column in the
1446       last result of "parse".
1447
1448       This returns a true value  if the data in the indicated column was
1449       enclosed in "quote_char" quotes.  This might be important for fields
1450       where content ",20070108," is to be treated as a numeric value,  and
1451       where ","20070108"," is explicitly marked as character string data.
1452
1453       This method is only valid when "keep_meta_info" is set to a true value.
1454
1455   is_binary
1456        my $binary = $csv->is_binary ($column_idx);
1457
1458       where  $column_idx is the  (zero-based)  index of the column in the
1459       last result of "parse".
1460
1461       This returns a true value if the data in the indicated column contained
1462       any byte in the range "[\x00-\x08,\x10-\x1F,\x7F-\xFF]".
1463
1464       This method is only valid when "keep_meta_info" is set to a true value.
1465
1466   is_missing
1467        my $missing = $csv->is_missing ($column_idx);
1468
1469       where  $column_idx is the  (zero-based)  index of the column in the
1470       last result of "getline_hr".
1471
1472        $csv->keep_meta_info (1);
1473        while (my $hr = $csv->getline_hr ($fh)) {
1474            $csv->is_missing (0) and next; # This was an empty line
1475            }
1476
       When using "getline_hr", it is impossible to tell if the parsed
       fields are "undef" because they were not filled in the "CSV" stream
       or because they were not read at all, as all the fields defined by
       "column_names" are set in the hash-ref.  If you still need to know if
       all fields in each row are provided, you should enable "keep_meta_info"
       so you can check the flags.
1483
1484       If  "keep_meta_info"  is "false",  "is_missing"  will always return
1485       "undef", regardless of $column_idx being valid or not. If this
1486       attribute is "true" it will return either 0 (the field is present) or 1
1487       (the field is missing).
1488
1489       A special case is the empty line.  If the line is completely empty -
1490       after dealing with the flags - this is still a valid CSV line:  it is a
1491       record of just one single empty field. However, if "keep_meta_info" is
1492       set, invoking "is_missing" with index 0 will now return true.
1493
1494   status
1495        $status = $csv->status ();
1496
1497       This method returns the status of the last invoked "combine" or "parse"
1498       call. Status is success (true: 1) or failure (false: "undef" or 0).
1499
1500   error_input
1501        $bad_argument = $csv->error_input ();
1502
1503       This method returns the erroneous argument (if it exists) of "combine"
1504       or "parse",  whichever was called more recently.  If the last
1505       invocation was successful, "error_input" will return "undef".
1506
1507   error_diag
1508        Text::CSV_XS->error_diag ();
1509        $csv->error_diag ();
1510        $error_code               = 0  + $csv->error_diag ();
1511        $error_str                = "" . $csv->error_diag ();
1512        ($cde, $str, $pos, $rec, $fld) = $csv->error_diag ();
1513
1514       If (and only if) an error occurred,  this function returns  the
1515       diagnostics of that error.
1516
1517       If called in void context,  this will print the internal error code and
1518       the associated error message to STDERR.
1519
1520       If called in list context,  this will return  the error code  and the
1521       error message in that order.  If the last error was from parsing, the
1522       rest of the values returned are a best guess at the location  within
1523       the line  that was being parsed. Their values are 1-based.  The
1524       position currently is index of the byte at which the parsing failed in
1525       the current record. It might change to be the index of the current
       character in a later release.  The record number is the index of the
       record parsed by the csv instance.  The field number is the index of
       the field the parser thinks it is currently trying to parse.  See
       examples/csv-check for how this can be used.
1530
1531       If called in  scalar context,  it will return  the diagnostics  in a
1532       single scalar, a-la $!.  It will contain the error code in numeric
1533       context, and the diagnostics message in string context.
1534
1535       When called as a class method or a  direct function call,  the
1536       diagnostics are that of the last "new" call.
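
       The list and scalar contexts can be compared after a parse failure.
       This is only a sketch; the input line is invented and the exact code
       and message depend on the error.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ();

# A quoted field that is never closed makes parse fail.
$csv->parse ('1,"unterminated') and die "expected a parse failure";

# List context: code, message, and a best guess at the location.
my ($code, $str, $pos, $rec, $fld) = $csv->error_diag ();

# Scalar context: one value that is numeric and string at the same time.
my $diag = $csv->error_diag ();

printf "error %d: %s (pos %s, record %s, field %s)\n",
    $code, $str, $pos // "?", $rec // "?", $fld // "?";
```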
1537
1538   record_number
1539        $recno = $csv->record_number ();
1540
       Returns the number of records parsed by this csv instance.  This
       value should be more accurate than $. when embedded newlines come
       into play.  Records written by this instance are not counted.
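
       A sketch of the difference between records and physical lines, using
       made-up data in which the first record spans two lines:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv  = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
my $data = qq{1,"first line\nsecond line"\n2,plain\n};
open my $fh, "<", \$data or die "in-memory handle: $!";

# Count the newlines to get the number of physical lines.
my $physical = ($data =~ tr/\n//);

# Remember the record number after every successfully parsed record.
my $records = 0;
while (my $row = $csv->getline ($fh)) {
    $records = $csv->record_number;
    }

print "$physical physical lines, $records records\n";
```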
1544
1545   SetDiag
1546        $csv->SetDiag (0);
1547
1548       Use to reset the diagnostics if you are dealing with errors.
1549

FUNCTIONS

1551   csv
1552       This function is not exported by default and should be explicitly
1553       requested:
1554
1555        use Text::CSV_XS qw( csv );
1556
       This is a high-level function that aims at simple (user) interfaces.
       It can be used to read/parse a "CSV" file or stream (the default
       behavior) or to produce a file or write to a stream (define the "out"
       attribute). It returns an array- or hash-reference on parsing (or
       "undef" on failure) or the numeric value of "error_diag" on writing.
       When this function fails, you can get to the error using the class
       call to "error_diag":
1564
1565        my $aoa = csv (in => "test.csv") or
1566            die Text::CSV_XS->error_diag;
1567
       This function takes the arguments as key-value pairs. These can be
       passed as a list or as an anonymous hash:
1570
1571        my $aoa = csv (  in => "test.csv", sep_char => ";");
1572        my $aoh = csv ({ in => $fh, headers => "auto" });
1573
1574       The arguments passed consist of two parts:  the arguments to "csv"
1575       itself and the optional attributes to the  "CSV"  object used inside
1576       the function as enumerated and explained in "new".
1577
       If not overridden, the default options used for CSV are
1579
1580        auto_diag   => 1
1581        escape_null => 0
1582
1583       The option that is always set and cannot be altered is
1584
1585        binary      => 1
1586
1587       As this function will likely be used in one-liners,  it allows  "quote"
1588       to be abbreviated as "quo",  and  "escape_char" to be abbreviated as
1589       "esc" or "escape".
1590
1591       Alternative invocations:
1592
1593        my $aoa = Text::CSV_XS::csv (in => "file.csv");
1594
1595        my $csv = Text::CSV_XS->new ();
1596        my $aoa = $csv->csv (in => "file.csv");
1597
1598       In the latter case, the object attributes are used from the existing
1599       object and the attribute arguments in the function call are ignored:
1600
1601        my $csv = Text::CSV_XS->new ({ sep_char => ";" });
1602        my $aoh = $csv->csv (in => "file.csv", bom => 1);
1603
1604       will parse using ";" as "sep_char", not ",".
1605
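       The two argument styles can be tried without any file on disk. The
       sketch below is illustrative only: the sample data is made up and is
       passed as a reference to a scalar.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# Made-up sample data, using ";" as separator
my $data = "code;name\n1;pc\n2;mouse\n";

# Arguments as a plain list: array of arrays, header line included
my $aoa = csv (in => \$data, sep_char => ";");

# The same arguments as an anonymous hash, now asking for hashes
my $aoh = csv ({ in => \$data, sep_char => ";", headers => "auto" });

print scalar @$aoa, " rows, ", scalar @$aoh, " records\n";
```

       Without "headers" the header line is returned as the first row; with
       "headers" set to "auto" it is consumed and used for the hash keys.
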
1606       in
1607
1608       Used to specify the source.  "in" can be a file name (e.g. "file.csv"),
1609       which will be  opened for reading  and closed when finished,  a file
1610       handle (e.g.  $fh or "FH"),  a reference to a glob (e.g. "\*ARGV"),
1611       the glob itself (e.g. *STDIN), or a reference to a scalar (e.g.
1612       "\q{1,2,"csv"}").
1613
1614       When used with "out", "in" should be a reference to a CSV structure
1615       (AoA or AoH)  or a CODE-ref that returns an array-reference or a hash-
1616       reference.  The code-ref will be invoked with no arguments.
1617
1618        my $aoa = csv (in => "file.csv");
1619
1620        open my $fh, "<", "file.csv";
1621        my $aoa = csv (in => $fh);
1622
1623        my $csv = [ [qw( Foo Bar )], [ 1, 2 ], [ 2, 3 ]];
1624        my $err = csv (in => $csv, out => "file.csv");
1625
1626       If called in void context without the "out" attribute, the resulting
1627       ref will be used as input to a subsequent call to csv:
1628
1629        csv (in => "file.csv", filter => { 2 => sub { length > 2 }})
1630
1631       will be a shortcut to
1632
1633        csv (in => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}))
1634
1635       where, in the absence of the "out" attribute, this is a shortcut to
1636
1637        csv (in  => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}),
1638             out => *STDOUT)
1639
1640       out
1641
1642        csv (in => $aoa, out => "file.csv");
1643        csv (in => $aoa, out => $fh);
1644        csv (in => $aoa, out =>   STDOUT);
1645        csv (in => $aoa, out =>  *STDOUT);
1646        csv (in => $aoa, out => \*STDOUT);
1647        csv (in => $aoa, out => \my $data);
1648        csv (in => $aoa, out =>  undef);
1649        csv (in => $aoa, out => \"skip");
1650
       In output mode, the default CSV option when producing CSV is
1652
1653        eol       => "\r\n"
1654
1655       The "fragment" attribute is ignored in output mode.
1656
1657       "out" can be a file name  (e.g.  "file.csv"),  which will be opened for
1658       writing and closed when finished,  a file handle (e.g. $fh or "FH"),  a
1659       reference to a glob (e.g. "\*STDOUT"),  the glob itself (e.g. *STDOUT),
1660       or a reference to a scalar (e.g. "\my $data").
1661
1662        csv (in => sub { $sth->fetch },            out => "dump.csv");
1663        csv (in => sub { $sth->fetchrow_hashref }, out => "dump.csv",
1664             headers => $sth->{NAME_lc});
1665
1666       When a code-ref is used for "in", the output is generated  per
1667       invocation, so no buffering is involved. This implies that there is no
1668       size restriction on the number of records. The "csv" function ends when
1669       the coderef returns a false value.
1670
1671       If "out" is set to a reference of the literal string "skip", the output
1672       will be suppressed completely,  which might be useful in combination
1673       with a filter for side effects only.
1674
1675        my %cache;
1676        csv (in    => "dump.csv",
1677             out   => \"skip",
1678             on_in => sub { $cache{$_[1][1]}++ });
1679
1680       Currently,  setting "out" to any false value  ("undef", "", 0) will be
1681       equivalent to "\"skip"".
1682
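       A short sketch of output mode, writing into a scalar instead of a
       file so the result can be inspected. The data and the variable names
       are made up for the illustration.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# An array of arrays is written as-is; records end in the default "\r\n"
my $aoa = [[qw( code name )], [ 1, "pc" ], [ 2, "mouse" ]];
csv (in => $aoa, out => \my $from_aoa);

# A code-ref is called until it returns a false value
my @queue = ([ 1, "pc" ], [ 2, "mouse" ]);
csv (in => sub { shift @queue }, out => \my $from_code);

print $from_aoa, $from_code;
```

       The code-ref form writes each record as soon as it is returned, which
       is why it suits database cursors and other sources of unknown size.
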
1683       encoding
1684
1685       If passed,  it should be an encoding accepted by the  ":encoding()"
1686       option to "open". There is no default value. This attribute does not
1687       work in perl 5.6.x.  "encoding" can be abbreviated to "enc" for ease of
1688       use in command line invocations.
1689
1690       If "encoding" is set to the literal value "auto", the method "header"
1691       will be invoked on the opened stream to check if there is a BOM and set
1692       the encoding accordingly.   This is equal to passing a true value in
1693       the option "detect_bom".
1694
1695       Encodings can be stacked, as supported by "binmode":
1696
1697        # Using PerlIO::via::gzip
1698        csv (in       => \@csv,
1699             out      => "test.csv:via.gz",
1700             encoding => ":via(gzip):encoding(utf-8)",
1701             );
1702        $aoa = csv (in => "test.csv:via.gz",  encoding => ":via(gzip)");
1703
        # Using PerlIO::gzip
        csv (in       => \@csv,
             out      => "test.csv:gzip.gz",
             encoding => ":gzip:encoding(utf-8)",
             );
        $aoa = csv (in => "test.csv:gzip.gz", encoding => ":gzip");
1710
1711       detect_bom
1712
1713       If  "detect_bom"  is given, the method  "header"  will be invoked on
1714       the opened stream to check if there is a BOM and set the encoding
1715       accordingly.
1716
1717       "detect_bom" can be abbreviated to "bom".
1718
1719       This is the same as setting "encoding" to "auto".
1720
1721       Note that as the method  "header" is invoked,  its default is to also
1722       set the headers.
1723
1724       headers
1725
1726       If this attribute is not given, the default behavior is to produce an
1727       array of arrays.
1728
1729       If "headers" is supplied,  it should be an anonymous list of column
1730       names, an anonymous hashref, a coderef, or a literal flag:  "auto",
1731       "lc", "uc", or "skip".
1732
1733       skip
1734         When "skip" is used, the header will not be included in the output.
1735
1736          my $aoa = csv (in => $fh, headers => "skip");
1737
1738       auto
1739         If "auto" is used, the first line of the "CSV" source will be read as
1740         the list of field headers and used to produce an array of hashes.
1741
1742          my $aoh = csv (in => $fh, headers => "auto");
1743
1744       lc
1745         If "lc" is used,  the first line of the  "CSV" source will be read as
1746         the list of field headers mapped to  lower case and used to produce
1747         an array of hashes. This is a variation of "auto".
1748
1749          my $aoh = csv (in => $fh, headers => "lc");
1750
1751       uc
1752         If "uc" is used,  the first line of the  "CSV" source will be read as
1753         the list of field headers mapped to  upper case and used to produce
1754         an array of hashes. This is a variation of "auto".
1755
1756          my $aoh = csv (in => $fh, headers => "uc");
1757
1758       CODE
1759         If a coderef is used,  the first line of the  "CSV" source will be
1760         read as the list of mangled field headers in which each field is
1761         passed as the only argument to the coderef. This list is used to
1762         produce an array of hashes.
1763
1764          my $aoh = csv (in      => $fh,
1765                         headers => sub { lc ($_[0]) =~ s/kode/code/gr });
1766
1767         this example is a variation of using "lc" where all occurrences of
1768         "kode" are replaced with "code".
1769
1770       ARRAY
1771         If  "headers"  is an anonymous list,  the entries in the list will be
1772         used as field names. The first line is considered data instead of
1773         headers.
1774
1775          my $aoh = csv (in => $fh, headers => [qw( Foo Bar )]);
1776          csv (in => $aoa, out => $fh, headers => [qw( code description price )]);
1777
1778       HASH
1779         If "headers" is a hash reference, this implies "auto", but header
1780         fields that exist as key in the hashref will be replaced by the value
1781         for that key. Given a CSV file like
1782
1783          post-kode,city,name,id number,fubble
1784          1234AA,Duckstad,Donald,13,"X313DF"
1785
1786         using
1787
1788          csv (headers => { "post-kode" => "pc", "id number" => "ID" }, ...
1789
1790         will return an entry like
1791
1792          { pc     => "1234AA",
1793            city   => "Duckstad",
1794            name   => "Donald",
1795            ID     => "13",
1796            fubble => "X313DF",
1797            }
1798
1799       See also "munge_column_names" and "set_column_names".
1800
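       The variants can be compared on one small data set. The sketch below
       uses made-up in-memory data and shows the "lc", ARRAY, and HASH forms.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

my $data = "Post-Kode,City\n1234AA,Duckstad\n";

# "lc": the first line supplies the keys, mapped to lower case
my $lc = csv (in => \$data, headers => "lc");

# ARRAY: the names are given, so the first line counts as data
my $named = csv (in => \$data, headers => [qw( zip town )]);

# HASH: like "auto", but listed header fields are renamed
my $map = csv (in => \$data, headers => { "Post-Kode" => "pc" });

print join (",", sort keys %{$lc->[0]}),  "\n";
print join (",", sort keys %{$map->[0]}), "\n";
print scalar @$named, " records with given names\n";
```
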
1801       munge_column_names
1802
1803       If "munge_column_names" is set,  the method  "header"  is invoked on
1804       the opened stream with all matching arguments to detect and set the
1805       headers.
1806
1807       "munge_column_names" can be abbreviated to "munge".
1808
1809       key
1810
1811       If passed,  will default  "headers"  to "auto" and return a hashref
1812       instead of an array of hashes. Allowed values are simple scalars or
1813       array-references where the first element is the joiner and the rest are
1814       the fields to join to combine the key.
1815
1816        my $ref = csv (in => "test.csv", key => "code");
1817        my $ref = csv (in => "test.csv", key => [ ":" => "code", "color" ]);
1818
1819       with test.csv like
1820
1821        code,product,price,color
1822        1,pc,850,gray
1823        2,keyboard,12,white
1824        3,mouse,5,black
1825
1826       the first example will return
1827
1828         { 1   => {
1829               code    => 1,
1830               color   => 'gray',
1831               price   => 850,
1832               product => 'pc'
1833               },
1834           2   => {
1835               code    => 2,
1836               color   => 'white',
1837               price   => 12,
1838               product => 'keyboard'
1839               },
1840           3   => {
1841               code    => 3,
1842               color   => 'black',
1843               price   => 5,
1844               product => 'mouse'
1845               }
1846           }
1847
1848       the second example will return
1849
1850         { "1:gray"    => {
1851               code    => 1,
1852               color   => 'gray',
1853               price   => 850,
1854               product => 'pc'
1855               },
1856           "2:white"   => {
1857               code    => 2,
1858               color   => 'white',
1859               price   => 12,
1860               product => 'keyboard'
1861               },
1862           "3:black"   => {
1863               code    => 3,
1864               color   => 'black',
1865               price   => 5,
1866               product => 'mouse'
1867               }
1868           }
1869
       The "key" attribute can be combined with "headers" for "CSV" data that
       has no header line, like
1872
1873        my $ref = csv (
1874            in      => "foo.csv",
1875            headers => [qw( c_foo foo bar description stock )],
1876            key     =>     "c_foo",
1877            );
1878
1879       value
1880
1881       Used to create key-value hashes.
1882
1883       Only allowed when "key" is valid. A "value" can be either a single
1884       column label or an anonymous list of column labels.  In the first case,
1885       the value will be a simple scalar value, in the latter case, it will be
1886       a hashref.
1887
1888        my $ref = csv (in => "test.csv", key   => "code",
1889                                         value => "price");
1890        my $ref = csv (in => "test.csv", key   => "code",
1891                                         value => [ "product", "price" ]);
1892        my $ref = csv (in => "test.csv", key   => [ ":" => "code", "color" ],
1893                                         value => "price");
1894        my $ref = csv (in => "test.csv", key   => [ ":" => "code", "color" ],
1895                                         value => [ "product", "price" ]);
1896
1897       with test.csv like
1898
1899        code,product,price,color
1900        1,pc,850,gray
1901        2,keyboard,12,white
1902        3,mouse,5,black
1903
1904       the first example will return
1905
1906         { 1 => 850,
1907           2 =>  12,
1908           3 =>   5,
1909           }
1910
1911       the second example will return
1912
1913         { 1   => {
1914               price   => 850,
1915               product => 'pc'
1916               },
1917           2   => {
1918               price   => 12,
1919               product => 'keyboard'
1920               },
1921           3   => {
1922               price   => 5,
1923               product => 'mouse'
1924               }
1925           }
1926
1927       the third example will return
1928
1929         { "1:gray"    => 850,
1930           "2:white"   =>  12,
1931           "3:black"   =>   5,
1932           }
1933
1934       the fourth example will return
1935
1936         { "1:gray"    => {
1937               price   => 850,
1938               product => 'pc'
1939               },
1940           "2:white"   => {
1941               price   => 12,
1942               product => 'keyboard'
1943               },
1944           "3:black"   => {
1945               price   => 5,
1946               product => 'mouse'
1947               }
1948           }
1949
1950       keep_headers
1951
       When using hashes, store the column names in the arrayref passed, so
       all headers are available after the call in the original order.
1954
1955        my $aoh = csv (in => "file.csv", keep_headers => \my @hdr);
1956
1957       This attribute can be abbreviated to "kh" or passed as
1958       "keep_column_names".
1959
1960       This attribute implies a default of "auto" for the "headers" attribute.
1961
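       Hash keys have no order, so the original column order is lost in the
       returned records. The sketch below, on made-up data, shows how the
       kept headers can be used to walk a record in file order.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

my $data = "code,product,price\n1,pc,850\n";

# @hdr receives the column names in their original order
my $aoh = csv (in => \$data, keep_headers => \my @hdr);

# Print the first record field by field, in file order
print "$_=$aoh->[0]{$_}\n" for @hdr;
```
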
1962       fragment
1963
1964       Only output the fragment as defined in the "fragment" method. This
1965       option is ignored when generating "CSV". See "out".
1966
1967       Combining all of them could give something like
1968
1969        use Text::CSV_XS qw( csv );
1970        my $aoh = csv (
1971            in       => "test.txt",
1972            encoding => "utf-8",
1973            headers  => "auto",
1974            sep_char => "|",
1975            fragment => "row=3;6-9;15-*",
1976            );
1977        say $aoh->[15]{Foo};
1978
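       Row and column fragments can be tried on a small grid. This is a
       sketch on made-up data without headers; the specifications follow the
       "fragment" method.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# Four rows of three fields, named after their position
my $data = "a1,b1,c1\na2,b2,c2\na3,b3,c3\na4,b4,c4\n";

# Only rows 2 and 3 (1-based)
my $rows = csv (in => \$data, fragment => "row=2-3");

# All rows, but only the first and third column
my $cols = csv (in => \$data, fragment => "col=1;3");

print join (",", @$_), "\n" for @$rows;
print join (",", @$_), "\n" for @$cols;
```
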
1979       sep_set
1980
1981       If "sep_set" is set, the method "header" is invoked on the opened
1982       stream to detect and set "sep_char" with the given set.
1983
1984       "sep_set" can be abbreviated to "seps".
1985
1986       Note that as the  "header" method is invoked,  its default is to also
1987       set the headers.
1988
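       A sketch of separator detection on made-up data. It assumes the
       defaults of the "header" method, so the first line is consumed, the
       column names are lower-cased, and hashes are returned.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# The separator is not known in advance; here it happens to be ";"
my $data = "Code;Name\n1;pc\n2;mouse\n";

# Let "header" pick the separator from the given candidates
my $aoh = csv (in => \$data, sep_set => [ ";", "\t", "|" ]);

print "$_->{code}: $_->{name}\n" for @$aoh;
```
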
1989       set_column_names
1990
1991       If  "set_column_names" is passed,  the method "header" is invoked on
1992       the opened stream with all arguments meant for "header".
1993
1994       If "set_column_names" is passed as a false value, the content of the
1995       first row is only preserved if the output is AoA:
1996
1997       With an input-file like
1998
1999        bAr,foo
2000        1,2
2001        3,4,5
2002
2003       This call
2004
2005        my $aoa = csv (in => $file, set_column_names => 0);
2006
2007       will result in
2008
2009        [[ "bar", "foo"     ],
2010         [ "1",   "2"       ],
2011         [ "3",   "4",  "5" ]]
2012
2013       and
2014
2015        my $aoa = csv (in => $file, set_column_names => 0, munge => "none");
2016
2017       will result in
2018
2019        [[ "bAr", "foo"     ],
2020         [ "1",   "2"       ],
2021         [ "3",   "4",  "5" ]]
2022
2023   Callbacks
2024       Callbacks enable actions triggered from the inside of Text::CSV_XS.
2025
       While most of what this enables can easily be done in an unrolled
       loop as described in the "SYNOPSIS", callbacks can be used to meet
       special demands or enhance the "csv" function.
2029
2030       error
2031          $csv->callbacks (error => sub { $csv->SetDiag (0) });
2032
         The "error" callback is invoked when an error occurs, but only
         when "auto_diag" is set to a true value. The callback is invoked with
         the values returned by "error_diag":
2036
2037          my ($c, $s);
2038
2039          sub ignore3006 {
2040              my ($err, $msg, $pos, $recno, $fldno) = @_;
2041              if ($err == 3006) {
2042                  # ignore this error
2043                  ($c, $s) = (undef, undef);
2044                  Text::CSV_XS->SetDiag (0);
2045                  }
2046              # Any other error
2047              return;
2048              } # ignore3006
2049
2050          $csv->callbacks (error => \&ignore3006);
2051          $csv->bind_columns (\$c, \$s);
2052          while ($csv->getline ($fh)) {
2053              # Error 3006 will not stop the loop
2054              }
2055
2056       after_parse
2057          $csv->callbacks (after_parse => sub { push @{$_[1]}, "NEW" });
2058          while (my $row = $csv->getline ($fh)) {
2059              $row->[-1] eq "NEW";
2060              }
2061
2062         This callback is invoked after parsing with  "getline"  only if no
2063         error occurred.  The callback is invoked with two arguments:   the
2064         current "CSV" parser object and an array reference to the fields
2065         parsed.
2066
2067         The return code of the callback is ignored  unless it is a reference
2068         to the string "skip", in which case the record will be skipped in
2069         "getline_all".
2070
2071          sub add_from_db {
2072              my ($csv, $row) = @_;
2073              $sth->execute ($row->[4]);
2074              push @$row, $sth->fetchrow_array;
2075              } # add_from_db
2076
2077          my $aoa = csv (in => "file.csv", callbacks => {
2078              after_parse => \&add_from_db });
2079
2080         This hook can be used for validation:
2081
2082         FAIL
2083           Die if any of the records does not validate a rule:
2084
2085            after_parse => sub {
2086                $_[1][4] =~ m/^[0-9]{4}\s?[A-Z]{2}$/ or
2087                    die "5th field does not have a valid Dutch zipcode";
2088                }
2089
2090         DEFAULT
2091           Replace invalid fields with a default value:
2092
2093            after_parse => sub { $_[1][2] =~ m/^\d+$/ or $_[1][2] = 0 }
2094
2095         SKIP
2096           Skip records that have invalid fields (only applies to
2097           "getline_all"):
2098
2099            after_parse => sub { $_[1][0] =~ m/^\d+$/ or return \"skip"; }
2100
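         The DEFAULT and SKIP rules can be combined in one callback. The
         sketch below uses the object interface with "getline_all" on
         made-up in-memory data; the validation rules themselves are only
         examples.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
$csv->callbacks (after_parse => sub {
    my ($self, $row) = @_;
    # SKIP: records without a numeric first field are dropped
    $row->[0] =~ m/^\d+$/ or return \"skip";
    # DEFAULT: a non-numeric third field is replaced by 0
    $row->[2] =~ m/^\d+$/ or $row->[2] = 0;
    return;
    });

my $data = "1,pc,850\nx,bad,1\n2,mouse,n/a\n";
open my $fh, "<", \$data or die "in-memory open: $!";
my $rows = $csv->getline_all ($fh);
close $fh;

print join (",", @$_), "\n" for @$rows;
```
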
2101       before_print
2102          my $idx = 1;
2103          $csv->callbacks (before_print => sub { $_[1][0] = $idx++ });
2104          $csv->print (*STDOUT, [ 0, $_ ]) for @members;
2105
2106         This callback is invoked  before printing with  "print"  only if no
2107         error occurred.  The callback is invoked with two arguments:  the
2108         current  "CSV" parser object and an array reference to the fields
2109         passed.
2110
2111         The return code of the callback is ignored.
2112
2113          sub max_4_fields {
2114              my ($csv, $row) = @_;
2115              @$row > 4 and splice @$row, 4;
2116              } # max_4_fields
2117
2118          csv (in => csv (in => "file.csv"), out => *STDOUT,
2119              callbacks => { before_print => \&max_4_fields });
2120
2121         This callback is not active for "combine".
2122
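         The effect of "before_print" is easy to verify by printing into
         an in-memory handle. This is a sketch only; the two-field limit and
         the "\n" line ending are arbitrary choices for the illustration.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\n" });

# Truncate every record to at most two fields just before it is printed
$csv->callbacks (before_print => sub {
    my ($self, $row) = @_;
    @$row > 2 and splice @$row, 2;
    });

open my $fh, ">", \my $out or die "in-memory open: $!";
$csv->print ($fh, [ 1, 2, 3, 4 ]) or $csv->error_diag;
close $fh;

print $out;
```
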
2123       Callbacks for csv ()
2124
2125       The "csv" allows for some callbacks that do not integrate in XS
2126       internals but only feature the "csv" function.
2127
2128         csv (in        => "file.csv",
2129              callbacks => {
2130                  filter       => { 6 => sub { $_ > 15 } },    # first
2131                  after_parse  => sub { say "AFTER PARSE";  }, # first
2132                  after_in     => sub { say "AFTER IN";     }, # second
2133                  on_in        => sub { say "ON IN";        }, # third
2134                  },
2135              );
2136
2137         csv (in        => $aoh,
2138              out       => "file.csv",
2139              callbacks => {
2140                  on_in        => sub { say "ON IN";        }, # first
2141                  before_out   => sub { say "BEFORE OUT";   }, # second
2142                  before_print => sub { say "BEFORE PRINT"; }, # third
2143                  },
2144              );
2145
2146       filter
2147         This callback can be used to filter records.  It is called just after
2148         a new record has been scanned.  The callback accepts a:
2149
2150         hashref
2151           The keys are the index to the row (the field name or field number,
2152           1-based) and the values are subs to return a true or false value.
2153
2154            csv (in => "file.csv", filter => {
2155                       3 => sub { m/a/ },       # third field should contain an "a"
2156                       5 => sub { length > 4 }, # length of the 5th field minimal 5
2157                       });
2158
2159            csv (in => "file.csv", filter => { foo => sub { $_ > 4 }});
2160
2161           If the keys to the filter hash contain any character that is not a
2162           digit it will also implicitly set "headers" to "auto"  unless
2163           "headers"  was already passed as argument.  When headers are
2164           active, returning an array of hashes, the filter is not applicable
2165           to the header itself.
2166
2167           All sub results should match, as in AND.
2168
2169           The context of the callback sets  $_ localized to the field
2170           indicated by the filter. The two arguments are as with all other
2171           callbacks, so the other fields in the current row can be seen:
2172
2173            filter => { 3 => sub { $_ > 100 ? $_[1][1] =~ m/A/ : $_[1][6] =~ m/B/ }}
2174
2175           If the context is set to return a list of hashes  ("headers" is
2176           defined), the current record will also be available in the
2177           localized %_:
2178
2179            filter => { 3 => sub { $_ > 100 && $_{foo} =~ m/A/ && $_{bar} < 1000  }}
2180
2181           If the filter is used to alter the content by changing $_,  make
2182           sure that the sub returns true in order not to have that record
2183           skipped:
2184
2185            filter => { 2 => sub { $_ = uc }}
2186
2187           will upper-case the second field, and then skip it if the resulting
2188           content evaluates to false. To always accept, end with truth:
2189
2190            filter => { 2 => sub { $_ = uc; 1 }}
2191
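           The AND behavior of multiple keys can be seen in a small sketch
           on made-up data. Because the keys are field names, "headers" is
           implicitly set to "auto" and hashes are returned.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

my $data = "code,name,price\n1,pc,850\n2,mouse,5\n3,kbd,12\n4,screen,200\n";

# Keep records with an odd code AND a price above 100
my $aoh = csv (in => \$data, filter => {
    code  => sub { $_ % 2 },
    price => sub { $_ > 100 },
    });

print "$_->{code}: $_->{name}\n" for @$aoh;
```

           Record 3 has an odd code but fails the price test, and record 4
           passes the price test but has an even code, so only record 1
           remains.
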
2192         coderef
2193            csv (in => "file.csv", filter => sub { $n++; 0; });
2194
2195           If the argument to "filter" is a coderef,  it is an alias or
2196           shortcut to a filter on column 0:
2197
2198            csv (filter => sub { $n++; 0 });
2199
2200           is equal to
2201
            csv (filter => { 0 => sub { $n++; 0 } });
2203
2204         filter-name
2205            csv (in => "file.csv", filter => "not_blank");
2206            csv (in => "file.csv", filter => "not_empty");
2207            csv (in => "file.csv", filter => "filled");
2208
           These are predefined filters.
2210
2211           Given a file like (line numbers prefixed for doc purpose only):
2212
2213            1:1,2,3
2214            2:
2215            3:,
2216            4:""
2217            5:,,
2218            6:, ,
2219            7:"",
2220            8:" "
2221            9:4,5,6
2222
2223           not_blank
2224             Filter out the blank lines
2225
2226             This filter is a shortcut for
2227
2228              filter => { 0 => sub { @{$_[1]} > 1 or
2229                          defined $_[1][0] && $_[1][0] ne "" } }
2230
             Due to the implementation, it is currently impossible to also
             filter lines that consist only of a quoted empty field. These
             lines are also considered blank lines.
2234
2235             With the given example, lines 2 and 4 will be skipped.
2236
2237           not_empty
2238             Filter out lines where all the fields are empty.
2239
2240             This filter is a shortcut for
2241
2242              filter => { 0 => sub { grep { defined && $_ ne "" } @{$_[1]} } }
2243
             A space is not regarded as being empty, so given the example
             data, lines 2, 3, 4, 5, and 7 are skipped.
2246
2247           filled
2248             Filter out lines that have no visible data
2249
2250             This filter is a shortcut for
2251
2252              filter => { 0 => sub { grep { defined && m/\S/ } @{$_[1]} } }
2253
             This filter rejects all lines that do not have at least one
             field containing a non-whitespace character.
2256
2257             With the given example data, this filter would skip lines 2
2258             through 8.
2259
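           The three predefined filters can be compared by running them
           over the example data shown above. The sketch below rebuilds that
           data in memory (without the line-number prefixes) and counts the
           records each filter keeps.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# The nine example lines from above, one per "\n"
my $data = qq{1,2,3\n\n,\n""\n,,\n, ,\n"",\n" "\n4,5,6\n};

my %kept;
for my $name (qw( not_blank not_empty filled )) {
    my $aoa = csv (in => \$data, filter => $name);
    $kept{$name} = scalar @$aoa;
    }

print "$_ keeps $kept{$_} records\n" for qw( not_blank not_empty filled );
```

           Going by the descriptions above, the counts should shrink from
           "not_blank" to "not_empty" to "filled", as each filter is
           stricter than the previous one.
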
2260         One could also use modules like Types::Standard:
2261
2262          use Types::Standard -types;
2263
2264          my $type   = Tuple[Str, Str, Int, Bool, Optional[Num]];
2265          my $check  = $type->compiled_check;
2266
2267          # filter with compiled check and warnings
2268          my $aoa = csv (
2269             in     => \$data,
2270             filter => {
2271                 0 => sub {
2272                     my $ok = $check->($_[1]) or
2273                         warn $type->get_message ($_[1]), "\n";
2274                     return $ok;
2275                     },
2276                 },
2277             );
2278
2279       after_in
2280         This callback is invoked for each record after all records have been
2281         parsed but before returning the reference to the caller.  The hook is
2282         invoked with two arguments:  the current  "CSV"  parser object  and a
2283         reference to the record.   The reference can be a reference to a
2284         HASH  or a reference to an ARRAY as determined by the arguments.
2285
2286         This callback can also be passed as  an attribute without the
2287         "callbacks" wrapper.
2288
2289       before_out
2290         This callback is invoked for each record before the record is
2291         printed.  The hook is invoked with two arguments:  the current "CSV"
2292         parser object and a reference to the record.   The reference can be a
2293         reference to a  HASH or a reference to an ARRAY as determined by the
2294         arguments.
2295
2296         This callback can also be passed as an attribute  without the
2297         "callbacks" wrapper.
2298
2299         This callback makes the row available in %_ if the row is a hashref.
2300         In this case %_ is writable and will change the original row.
2301
2302       on_in
2303         This callback acts exactly as the "after_in" or the "before_out"
2304         hooks.
2305
2306         This callback can also be passed as an attribute  without the
2307         "callbacks" wrapper.
2308
2309         This callback makes the row available in %_ if the row is a hashref.
2310         In this case %_ is writable and will change the original row. So e.g.
2311         with
2312
2313           my $aoh = csv (
2314               in      => \"foo\n1\n2\n",
2315               headers => "auto",
2316               on_in   => sub { $_{bar} = 2; },
2317               );
2318
2319         $aoh will be:
2320
           [ { foo => 1,
               bar => 2,
               },
             { foo => 2,
               bar => 2,
               }
             ]
2328
2329       csv
         The function "csv" can also be called as a method or with an
         existing Text::CSV_XS object. This can help if the function is to
         be invoked many times: passing an existing instance avoids the
         overhead of creating the object internally over and over again.
2335
2336          my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2337
2338          my $aoa = $csv->csv (in => $fh);
2339          my $aoa = csv (in => $fh, csv => $csv);
2340
         both act the same. Running this 20000 times on a 20-line CSV file
         showed a 53% speedup.
2343
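         A sketch of reusing one instance for several inputs. The inputs
         are made-up in-memory strings; the attributes of the existing
         object (here "sep_char") apply to every call.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# One parser object, created once
my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1, sep_char => ";" });

# Several small inputs parsed with the same object
my @inputs = ("a;b\n1;2\n", "c;d\n3;4\n");
my @parsed = map { $csv->csv (in => \$_) } @inputs;

print join (",", @{$_->[1]}), "\n" for @parsed;
```
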

INTERNALS

2345       Combine (...)
2346       Parse (...)
2347
       The arguments to these internal functions are deliberately not
       described or documented in order to enable the module authors to make
       changes when they feel the need for it. Using them is highly
       discouraged as the API may change in future releases.
2352

EXAMPLES

2354   Reading a CSV file line by line:
2355        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2356        open my $fh, "<", "file.csv" or die "file.csv: $!";
2357        while (my $row = $csv->getline ($fh)) {
2358            # do something with @$row
2359            }
2360        close $fh or die "file.csv: $!";
2361
2362       or
2363
2364        my $aoa = csv (in => "file.csv", on_in => sub {
2365            # do something with %_
2366            });
2367
2368       Reading only a single column
2369
2370        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2371        open my $fh, "<", "file.csv" or die "file.csv: $!";
2372        # get only the 4th column
2373        my @column = map { $_->[3] } @{$csv->getline_all ($fh)};
2374        close $fh or die "file.csv: $!";
2375
2376       with "csv", you could do
2377
2378        my @column = map { $_->[0] }
2379            @{csv (in => "file.csv", fragment => "col=4")};
2380
2381   Parsing CSV strings:
2382        my $csv = Text::CSV_XS->new ({ keep_meta_info => 1, binary => 1 });
2383
2384        my $sample_input_string =
2385            qq{"I said, ""Hi!""",Yes,"",2.34,,"1.09","\x{20ac}",};
2386        if ($csv->parse ($sample_input_string)) {
2387            my @field = $csv->fields;
2388            foreach my $col (0 .. $#field) {
2389                my $quo = $csv->is_quoted ($col) ? $csv->{quote_char} : "";
2390                printf "%2d: %s%s%s\n", $col, $quo, $field[$col], $quo;
2391                }
2392            }
2393        else {
2394            print STDERR "parse () failed on argument: ",
2395                $csv->error_input, "\n";
2396            $csv->error_diag ();
2397            }
2398
2399       Parsing CSV from memory
2400
2401       Given a complete CSV data-set in scalar $data,  generate a list of
2402       lists to represent the rows and fields
2403
        # The data
        my $data = join "\r\n" => map { join "," => 0 .. 5 } 0 .. 5;

        # in a loop
        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
        open my $fh, "<", \$data;
        my @foo;
        while (my $row = $csv->getline ($fh)) {
            push @foo, $row;
            }
        close $fh;

        # a single call
        my $foo = csv (in => \$data);
2418
2419   Printing CSV data
2420       The fast way: using "print"
2421
2422       An example for creating "CSV" files using the "print" method:
2423
        my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
        open my $fh, ">", "foo.csv" or die "foo.csv: $!";
        for (1 .. 10) {
            $csv->print ($fh, [ $_, "$_" ]) or $csv->error_diag;
            }
        close $fh or die "foo.csv: $!";
2430
2431       The slow way: using "combine" and "string"
2432
2433       or using the slower "combine" and "string" methods:
2434
        my $csv = Text::CSV_XS->new;

        open my $csv_fh, ">", "hello.csv" or die "hello.csv: $!";

        my @sample_input_fields = (
            'You said, "Hello!"',   5.67,
            '"Surely"',   '',   '3.14159');
        if ($csv->combine (@sample_input_fields)) {
            print $csv_fh $csv->string, "\n";
            }
        else {
            print "combine () failed on argument: ",
                $csv->error_input, "\n";
            }
        close $csv_fh or die "hello.csv: $!";
2450
2451       Generating CSV into memory
2452
2453       Format a data-set (@foo) into a scalar value in memory ($data):
2454
        # The data
        my @foo = map { [ 0 .. 5 ] } 0 .. 3;

        # in a loop
        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1, eol => "\r\n" });
        open my $fh, ">", \my $data;
        $csv->print ($fh, $_) for @foo;
        close $fh;

        # a single call
        csv (in => \@foo, out => \my $data);
2466
2467   Rewriting CSV
2468       Rewrite "CSV" files with ";" as separator character to well-formed
2469       "CSV":
2470
        use Text::CSV_XS qw( csv );
        csv (in => csv (in => "bad.csv", sep_char => ";"), out => *STDOUT);
2473
2474       As "STDOUT" is now default in "csv", a one-liner converting a UTF-16
2475       CSV file with BOM and TAB-separation to valid UTF-8 CSV could be:
2476
        $ perl -C3 -MText::CSV_XS=csv -we\
           'csv(in=>"utf16tab.csv",encoding=>"utf16",sep=>"\t")' >utf8.csv
2479
2480   Dumping database tables to CSV
       Dumping a database table can be as simple as this (TIMTOWTDI):
2482
        my $dbh = DBI->connect (...);
        my $sql = "select * from foo";

        # using your own loop
        open my $fh, ">", "foo.csv" or die "foo.csv: $!\n";
        my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\r\n" });
        my $sth = $dbh->prepare ($sql); $sth->execute;
        $csv->print ($fh, $sth->{NAME_lc});
        while (my $row = $sth->fetch) {
            $csv->print ($fh, $row);
            }

        # using the csv function, all in memory
        csv (out => "foo.csv", in => $dbh->selectall_arrayref ($sql));

        # using the csv function, streaming with callbacks
        my $sth = $dbh->prepare ($sql); $sth->execute;
        csv (out => "foo.csv", in => sub { $sth->fetch            });
        csv (out => "foo.csv", in => sub { $sth->fetchrow_hashref });
2502
2503       Note that this does not discriminate between "empty" values and NULL-
2504       values from the database,  as both will be the same empty field in CSV.
2505       To enable distinction between the two, use "quote_empty".
2506
        csv (out => "foo.csv", in => sub { $sth->fetch }, quote_empty => 1);
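
       The effect of "quote_empty" can be seen without a database. The
       following is a minimal sketch, assuming a row that holds one "undef"
       (standing in for a NULL) and one empty string; the variable names
       are illustrative only:

```perl
use strict;
use warnings;
use Text::CSV_XS;

# One row holding a NULL (undef) and an empty string.
my @row = (1, undef, "", 2);

my $plain  = Text::CSV_XS->new ({ binary => 1 });
my $quoted = Text::CSV_XS->new ({ binary => 1, quote_empty => 1 });

$plain->combine  (@row) or die "combine () failed";
$quoted->combine (@row) or die "combine () failed";

my $plain_line  = $plain->string;     # 1,,,2     (both look the same)
my $quoted_line = $quoted->string;    # 1,,"",2   (empty string is quoted)

print "default:     $plain_line\n";
print "quote_empty: $quoted_line\n";
```

       With "quote_empty" set, only the defined-but-empty field is quoted,
       so a reader can tell it apart from the "undef" field.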
2508
2509       If the database import utility supports special sequences to insert
2510       "NULL" values into the database,  like MySQL/MariaDB supports "\N",
2511       use a filter or a map
2512
        csv (out => "foo.csv", in => sub { $sth->fetch },
                            on_in => sub { $_ //= "\\N" for @{$_[1]} });

        while (my $row = $sth->fetch) {
            $csv->print ($fh, [ map { $_ // "\\N" } @$row ]);
            }
2519
2520       Note that this will not work as expected when choosing the backslash
2521       ("\") as "escape_char", as that will cause the "\" to need to be
2522       escaped by yet another "\",  which will cause the field to need
2523       quotation and thus ending up as "\\N" instead of "\N". See also
2524       "undef_str".
2525
        csv (out => "foo.csv", in => sub { $sth->fetch }, undef_str => "\\N");
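
       As a small self-contained sketch of "undef_str" (assuming a version
       of Text::CSV_XS that supports this attribute and the default
       "escape_char"), the same kind of row can be combined in memory:

```perl
use strict;
use warnings;
use Text::CSV_XS;

# undef fields are written as the literal two characters \N
my $csv = Text::CSV_XS->new ({ binary => 1, undef_str => "\\N" });

$csv->combine (1, undef, "", 2) or die "combine () failed";
my $line = $csv->string;

print "$line\n";    # 1,\N,,2
```

       The empty string stays an empty field, while the "undef" field is
       written as "\N" without quotation.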
2527
2528       These special sequences are not recognized by  Text::CSV_XS  on parsing
2529       the CSV generated like this, but map and filter are your friends again
2530
        while (my $row = $csv->getline ($fh)) {
            $sth->execute (map { $_ eq "\\N" ? undef : $_ } @$row);
            }

        csv (in => "foo.csv", filter => { 1 => sub {
            $sth->execute (map { $_ eq "\\N" ? undef : $_ } @{$_[1]}); 0; }});
2537
2538   The examples folder
2539       For more extended examples, see the examples/ 1. sub-directory in the
2540       original distribution or the git repository 2.
2541
2542        1. https://github.com/Tux/Text-CSV_XS/tree/master/examples
2543        2. https://github.com/Tux/Text-CSV_XS
2544
2545       The following files can be found there:
2546
2547       parser-xs.pl
         This can be used as boilerplate to parse invalid "CSV" and to parse
         beyond (expected) errors, as an alternative to using the "error"
         callback.
2550
          $ perl examples/parser-xs.pl bad.csv >good.csv
2552
2553       csv-check
2554         This is a command-line tool that uses parser-xs.pl  techniques to
2555         check the "CSV" file and report on its content.
2556
          $ csv-check files/utf8.csv
          Checked files/utf8.csv  with csv-check 1.9
          using Text::CSV_XS 1.32 with perl 5.26.0 and Unicode 9.0.0
          OK: rows: 1, columns: 2
              sep = <,>, quo = <">, bin = <1>, eol = <"\n">
2562
2563       csv2xls
2564         A script to convert "CSV" to Microsoft Excel ("XLS"). This requires
2565         extra modules Date::Calc and Spreadsheet::WriteExcel. The converter
2566         accepts various options and can produce UTF-8 compliant Excel files.
2567
2568       csv2xlsx
2569         A script to convert "CSV" to Microsoft Excel ("XLSX").  This requires
2570         the modules Date::Calc and Spreadsheet::Writer::XLSX.  The converter
2571         does accept various options including merging several "CSV" files
2572         into a single Excel file.
2573
2574       csvdiff
2575         A script that provides colorized diff on sorted CSV files,  assuming
2576         first line is header and first field is the key. Output options
2577         include colorized ANSI escape codes or HTML.
2578
          $ csvdiff --html --output=diff.html file1.csv file2.csv
2580
2581       rewrite.pl
2582         A script to rewrite (in)valid CSV into valid CSV files.  Script has
2583         options to generate confusing CSV files or CSV files that conform to
2584         Dutch MS-Excel exports (using ";" as separation).
2585
2586         Script - by default - honors BOM  and auto-detects separation
2587         converting it to default standard CSV with "," as separator.
2588

CAVEATS

2590       Text::CSV_XS  is not designed to detect the characters used to quote
2591       and separate fields.  The parsing is done using predefined  (default)
2592       settings.  In the examples  sub-directory,  you can find scripts  that
2593       demonstrate how you could try to detect these characters yourself.
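
       A very naive way to guess the separator yourself is to count how
       often each candidate occurs in the first line. The sketch below is
       not part of Text::CSV_XS: "guess_sep" is a hypothetical helper, the
       candidate list is an assumption, and it ignores quoting entirely, so
       a quoted field containing many commas can mislead it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::CSV_XS): return the candidate
# separator that occurs most often in the given line, or undef if none
# of the candidates occurs at all.
sub guess_sep {
    my ($line, @candidates) = @_;
    @candidates or @candidates = (",", ";", "\t", "|");
    my ($best, $max) = (undef, 0);
    foreach my $sep (@candidates) {
        my $count = () = $line =~ m/\Q$sep\E/g;    # number of occurrences
        $count > $max and ($best, $max) = ($sep, $count);
        }
    return $best;
    }

my $sep = guess_sep ("id;name;price, incl. VAT;stock");
print "guessed separator: ", (defined $sep ? $sep : "none"), "\n";
```

       The guessed value could then be passed as "sep_char" to the
       constructor.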
2594
2595   Microsoft Excel
2596       The import/export from Microsoft Excel is a risky task, according to
2597       the documentation in "Text::CSV::Separator".  Microsoft uses the
2598       system's list separator defined in the regional settings, which happens
2599       to be a semicolon for Dutch, German and Spanish (and probably some
2600       others as well).   For the English locale,  the default is a comma.
2601       In Windows however,  the user is free to choose a  predefined locale,
2602       and then change  every  individual setting in it, so checking the
2603       locale is no solution.
2604
2605       As of version 1.17, a lone first line with just
2606
         sep=;
2608
2609       will be recognized and honored when parsing with "getline".
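
       A minimal sketch of this behavior, assuming version 1.17 or later
       and reading from an in-memory file handle (the sample data is made
       up for illustration):

```perl
use strict;
use warnings;
use Text::CSV_XS;

# First line announces the separator, the second line is the data.
my $data = qq{sep=;\n"a b";3\n};

open my $fh, "<", \$data or die "in-memory open: $!";
my $csv  = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
my $rows = $csv->getline_all ($fh);
close $fh;

printf "%d row(s), first row has %d field(s)\n",
    scalar @$rows, scalar @{$rows->[0] || []};
```

       Note that no "sep_char" is passed to the constructor; the separator
       is taken from the "sep=;" line, which itself is not returned as data.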
2610

TODO

2612       More Errors & Warnings
2613         New extensions ought to be  clear and concise  in reporting what
2614         error has occurred where and why, and maybe also offer a remedy to
2615         the problem.
2616
2617         "error_diag" is a (very) good start, but there is more work to be
2618         done in this area.
2619
2620         Basic calls  should croak or warn on  illegal parameters.  Errors
2621         should be documented.
2622
2623       setting meta info
2624         Future extensions might include extending the "meta_info",
2625         "is_quoted", and  "is_binary"  to accept setting these  flags for
2626         fields,  so you can specify which fields are quoted in the
2627         "combine"/"string" combination.
2628
          $csv->meta_info (0, 1, 1, 3, 0, 0);
          $csv->is_quoted (3, 1);
2631
2632         Metadata Vocabulary for Tabular Data
2633         <http://w3c.github.io/csvw/metadata/> (a W3C editor's draft) could be
2634         an example for supporting more metadata.
2635
2636       Parse the whole file at once
2637         Implement new methods or functions  that enable parsing of a
2638         complete file at once, returning a list of hashes. Possible extension
2639         to this could be to enable a column selection on the call:
2640
          my @AoH = $csv->parse_file ($filename, { cols => [ 1, 4..8, 12 ]});
2642
2643         returning something like
2644
          [ { fields => [ 1, 2, "foo", 4.5, undef, "", 8 ],
              flags  => [ ... ],
              },
            { fields => [ ... ],
              .
              },
            ]
2652
2653         Note that the "csv" function already supports most of this,  but does
2654         not return flags. "getline_all" returns all rows for an open stream,
2655         but this will not return flags either.  "fragment"  can reduce the
2656         required  rows or columns, but cannot combine them.
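
         As a sketch of what the "csv" function offers today (without
         flags), a complete data-set can be read into a list of hashes in
         one call. The sample data and variable names are illustrative
         only:

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# A complete data-set in memory; the first line is the header.
my $data = "id,name,price\n1,apple,0.50\n2,pear,0.75\n";

# headers => "auto" uses the first line as hash keys.
my $aoh = csv (in => \$data, headers => "auto");

printf "%s costs %s\n", $_->{name}, $_->{price} for @$aoh;
```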
2657
2658       Cookbook
2659         Write a document that has recipes for  most known  non-standard  (and
2660         maybe some standard)  "CSV" formats,  including formats that use
2661         "TAB",  ";", "|", or other non-comma separators.
2662
2663         Examples could be taken from W3C's CSV on the Web: Use Cases and
2664         Requirements <http://w3c.github.io/csvw/use-cases-and-
2665         requirements/index.html>
2666
2667       Steal
2668         Steal good new ideas and features from PapaParse
2669         <http://papaparse.com> or csvkit <http://csvkit.readthedocs.org>.
2670
2671       Perl6 support
         I'm already working on perl6 support here
         <https://github.com/Tux/CSV>. No promises yet on when it is finished
         (or fast). I am trying to keep the API as similar as possible.
2675
2676   NOT TODO
2677       combined methods
2678         Requests for adding means (methods) that combine "combine" and
2679         "string" in a single call will not be honored (use "print" instead).
2680         Likewise for "parse" and "fields"  (use "getline" instead), given the
2681         problems with embedded newlines.
2682
2683   Release plan
2684       No guarantees, but this is what I had in mind some time ago:
2685
2686       · DIAGNOSTICS section in pod to *describe* the errors (see below)
2687

EBCDIC

       Everything should now work on native EBCDIC systems.  As the tests do
       not cover all possible codepoints and Encode does not support
       "utf-ebcdic", there is no guarantee that all handling of Unicode is
       done correctly.
2693
2694       Opening "EBCDIC" encoded files on  "ASCII"+  systems is likely to
2695       succeed using Encode's "cp37", "cp1047", or "posix-bc":
2696
        open my $fh, "<:encoding(cp1047)", "ebcdic_file.csv" or die "...";
2698

DIAGNOSTICS

2700       Still under construction ...
2701
2702       If an error occurs,  "$csv->error_diag" can be used to get information
2703       on the cause of the failure. Note that for speed reasons the internal
2704       value is never cleared on success,  so using the value returned by
2705       "error_diag" in normal cases - when no error occurred - may cause
2706       unexpected results.
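
       Because the internal value is not cleared on success, a safe pattern
       is to test the return value of the call first and consult
       "error_diag" only on failure. A minimal sketch, using a deliberately
       broken input string chosen for illustration:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1 });

# The closing quote is missing, so parse () is expected to return false.
my $ok = $csv->parse (qq{1,"unterminated});

my ($code, $msg, $pos) = (0, "", 0);
unless ($ok) {
    # Only now is the diagnostic meaningful.
    ($code, $msg, $pos) = $csv->error_diag;
    print STDERR "parse failed: $code - $msg (position $pos)\n";
    }
```

       In list context "error_diag" returns the numeric code first, followed
       by the message and the position; only the first three values are
       used here.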
2707
2708       If the constructor failed, the cause can be found using "error_diag" as
2709       a class method, like "Text::CSV_XS->error_diag".
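
       A short sketch of the class-method form, assuming a constructor call
       that is rejected because "sep_char" clashes with the default
       "quote_char":

```perl
use strict;
use warnings;
use Text::CSV_XS;

# sep_char equal to the default quote_char (") is not a valid combination,
# so new () is expected to return undef.
my $csv = Text::CSV_XS->new ({ sep_char => '"' });

my ($code, $msg) = (0, "");
unless ($csv) {
    # No object exists, so ask the class for the diagnostic.
    ($code, $msg) = Text::CSV_XS->error_diag;
    print STDERR "new () failed: $code - $msg\n";
    }
```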
2710
       The "$csv->error_diag" method is automatically invoked upon error when
       the constructor was called with "auto_diag" set to 1 or 2, or when
       autodie is in effect.  When set to 1, this will cause a "warn" with the
       error message; when set to 2, it will "die".  "2012 - EOF" is excluded
       from "auto_diag" reports.
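
       With "auto_diag" set to 2 a failing call dies, so it can be wrapped
       in "eval" when the program should continue. This is a minimal sketch
       with an intentionally broken input string:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 2 });

# The unterminated quote makes parse () fail; with auto_diag => 2 the
# automatic error_diag call dies, which eval catches here.
my $ok  = eval { $csv->parse (qq{1,"unterminated}); 1 };
my $err = $@;

$ok or print STDERR "caught: $err";
```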
2716
2717       Errors can be (individually) caught using the "error" callback.
2718
2719       The errors as described below are available. I have tried to make the
2720       error itself explanatory enough, but more descriptions will be added.
2721       For most of these errors, the first three capitals describe the error
2722       category:
2723
2724       · INI
2725
2726         Initialization error or option conflict.
2727
2728       · ECR
2729
2730         Carriage-Return related parse error.
2731
2732       · EOF
2733
2734         End-Of-File related parse error.
2735
2736       · EIQ
2737
2738         Parse error inside quotation.
2739
2740       · EIF
2741
2742         Parse error inside field.
2743
2744       · ECB
2745
2746         Combine error.
2747
2748       · EHR
2749
2750         HashRef parse related error.
2751
2752       And below should be the complete list of error codes that can be
2753       returned:
2754
2755       · 1001 "INI - sep_char is equal to quote_char or escape_char"
2756
2757         The  separation character  cannot be equal to  the quotation
2758         character or to the escape character,  as this would invalidate all
2759         parsing rules.
2760
2761       · 1002 "INI - allow_whitespace with escape_char or quote_char SP or
2762         TAB"
2763
2764         Using the  "allow_whitespace"  attribute  when either "quote_char" or
2765         "escape_char"  is equal to "SPACE" or "TAB" is too ambiguous to
2766         allow.
2767
2768       · 1003 "INI - \r or \n in main attr not allowed"
2769
2770         Using default "eol" characters in either "sep_char", "quote_char",
2771         or  "escape_char"  is  not allowed.
2772
2773       · 1004 "INI - callbacks should be undef or a hashref"
2774
         The "callbacks" attribute only accepts "undef" or a hash
         reference.
2777
2778       · 1005 "INI - EOL too long"
2779
2780         The value passed for EOL is exceeding its maximum length (16).
2781
2782       · 1006 "INI - SEP too long"
2783
2784         The value passed for SEP is exceeding its maximum length (16).
2785
2786       · 1007 "INI - QUOTE too long"
2787
2788         The value passed for QUOTE is exceeding its maximum length (16).
2789
2790       · 1008 "INI - SEP undefined"
2791
2792         The value passed for SEP should be defined and not empty.
2793
2794       · 1010 "INI - the header is empty"
2795
2796         The header line parsed in the "header" is empty.
2797
2798       · 1011 "INI - the header contains more than one valid separator"
2799
2800         The header line parsed in the  "header"  contains more than one
2801         (unique) separator character out of the allowed set of separators.
2802
2803       · 1012 "INI - the header contains an empty field"
2804
2805         The header line parsed in the "header" contains an empty field.
2806
2807       · 1013 "INI - the header contains nun-unique fields"
2808
2809         The header line parsed in the  "header"  contains at least  two
2810         identical fields.
2811
2812       · 1014 "INI - header called on undefined stream"
2813
2814         The header line cannot be parsed from an undefined source.
2815
2816       · 1500 "PRM - Invalid/unsupported argument(s)"
2817
2818         Function or method called with invalid argument(s) or parameter(s).
2819
2820       · 1501 "PRM - The key attribute is passed as an unsupported type"
2821
2822         The "key" attribute is of an unsupported type.
2823
2824       · 1502 "PRM - The value attribute is passed without the key attribute"
2825
2826         The "value" attribute is only allowed when a valid key is given.
2827
2828       · 1503 "PRM - The value attribute is passed as an unsupported type"
2829
2830         The "value" attribute is of an unsupported type.
2831
2832       · 2010 "ECR - QUO char inside quotes followed by CR not part of EOL"
2833
2834         When  "eol"  has  been  set  to  anything  but the  default,  like
2835         "\r\t\n",  and  the  "\r"  is  following  the   second   (closing)
2836         "quote_char", where the characters following the "\r" do not make up
2837         the "eol" sequence, this is an error.
2838
2839       · 2011 "ECR - Characters after end of quoted field"
2840
2841         Sequences like "1,foo,"bar"baz,22,1" are not allowed. "bar" is a
2842         quoted field and after the closing double-quote, there should be
2843         either a new-line sequence or a separation character.
2844
2845       · 2012 "EOF - End of data in parsing input stream"
2846
         Self-explanatory: end-of-file was reached while parsing a stream.
         This can happen only when reading from streams with "getline", as
         "parse" operates on strings that are not required to have a
         trailing "eol".
2851
2852       · 2013 "INI - Specification error for fragments RFC7111"
2853
2854         Invalid specification for URI "fragment" specification.
2855
2856       · 2014 "ENF - Inconsistent number of fields"
2857
2858         Inconsistent number of fields under strict parsing.
2859
2860       · 2021 "EIQ - NL char inside quotes, binary off"
2861
2862         Sequences like "1,"foo\nbar",22,1" are allowed only when the binary
2863         option has been selected with the constructor.
2864
2865       · 2022 "EIQ - CR char inside quotes, binary off"
2866
2867         Sequences like "1,"foo\rbar",22,1" are allowed only when the binary
2868         option has been selected with the constructor.
2869
2870       · 2023 "EIQ - QUO character not allowed"
2871
2872         Sequences like ""foo "bar" baz",qu" and "2023,",2008-04-05,"Foo,
2873         Bar",\n" will cause this error.
2874
2875       · 2024 "EIQ - EOF cannot be escaped, not even inside quotes"
2876
2877         The escape character is not allowed as last character in an input
2878         stream.
2879
2880       · 2025 "EIQ - Loose unescaped escape"
2881
2882         An escape character should escape only characters that need escaping.
2883
2884         Allowing  the escape  for other characters  is possible  with the
2885         attribute "allow_loose_escapes".
2886
2887       · 2026 "EIQ - Binary character inside quoted field, binary off"
2888
         Binary characters are not allowed by default.  The exception is a
         field that contains valid UTF-8, which will automatically be
         upgraded.  Set "binary" to 1 to accept binary data.
2893
2894       · 2027 "EIQ - Quoted field not terminated"
2895
2896         When parsing a field that started with a quotation character,  the
2897         field is expected to be closed with a quotation character.   When the
2898         parsed line is exhausted before the quote is found, that field is not
2899         terminated.
2900
2901       · 2030 "EIF - NL char inside unquoted verbatim, binary off"
2902
2903       · 2031 "EIF - CR char is first char of field, not part of EOL"
2904
2905       · 2032 "EIF - CR char inside unquoted, not part of EOL"
2906
2907       · 2034 "EIF - Loose unescaped quote"
2908
2909       · 2035 "EIF - Escaped EOF in unquoted field"
2910
2911       · 2036 "EIF - ESC error"
2912
2913       · 2037 "EIF - Binary character in unquoted field, binary off"
2914
2915       · 2110 "ECB - Binary character in Combine, binary off"
2916
2917       · 2200 "EIO - print to IO failed. See errno"
2918
2919       · 3001 "EHR - Unsupported syntax for column_names ()"
2920
2921       · 3002 "EHR - getline_hr () called before column_names ()"
2922
2923       · 3003 "EHR - bind_columns () and column_names () fields count
2924         mismatch"
2925
2926       · 3004 "EHR - bind_columns () only accepts refs to scalars"
2927
2928       · 3006 "EHR - bind_columns () did not pass enough refs for parsed
2929         fields"
2930
2931       · 3007 "EHR - bind_columns needs refs to writable scalars"
2932
2933       · 3008 "EHR - unexpected error in bound fields"
2934
2935       · 3009 "EHR - print_hr () called before column_names ()"
2936
2937       · 3010 "EHR - print_hr () called with invalid arguments"
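
       As an illustration of how one of the codes above shows up in
       practice, the sketch below parses the same string with and without
       "binary". The embedded newline is expected to be rejected with 2021
       when "binary" is off; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# A quoted field with an embedded newline.
my $line = qq{1,"foo\nbar",22,1};

my $strict = Text::CSV_XS->new ();
my $binary = Text::CSV_XS->new ({ binary => 1 });

# Without binary, the newline inside quotes is a parse error.
my $strict_ok   = $strict->parse ($line);
my $strict_code = $strict_ok ? 0 : ($strict->error_diag)[0];

# With binary, the same input is accepted.
my $binary_ok = $binary->parse ($line);
my @fields    = $binary_ok ? $binary->fields : ();

print "binary off: ", ($strict_ok ? "ok" : "error $strict_code"), "\n";
print "binary on:  ", ($binary_ok ? scalar (@fields) . " fields" : "error"), "\n";
```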
2938

SEE ALSO

2940       IO::File,  IO::Handle,  IO::Wrap,  Text::CSV,  Text::CSV_PP,
2941       Text::CSV::Encoded,     Text::CSV::Separator,    Text::CSV::Slurp,
2942       Spreadsheet::CSV and Spreadsheet::Read, and of course perl.
2943
2944       If you are using perl6,  you can have a look at  "Text::CSV"  in the
2945       perl6 ecosystem, offering the same features.
2946
2947       non-perl
2948
2949       A CSV parser in JavaScript,  also used by W3C <http://www.w3.org>,  is
2950       the multi-threaded in-browser PapaParse <http://papaparse.com/>.
2951
2952       csvkit <http://csvkit.readthedocs.org> is a python CSV parsing toolkit.
2953

AUTHOR

2955       Alan Citterman <alan@mfgrtl.com> wrote the original Perl module.
2956       Please don't send mail concerning Text::CSV_XS to Alan, who is not
2957       involved in the C/XS part that is now the main part of the module.
2958
2959       Jochen Wiedmann <joe@ispsoft.de> rewrote the en- and decoding in C by
2960       implementing a simple finite-state machine.   He added variable quote,
2961       escape and separator characters, the binary mode and the print and
2962       getline methods. See ChangeLog releases 0.10 through 0.23.
2963
2964       H.Merijn Brand <h.m.brand@xs4all.nl> cleaned up the code,  added the
2965       field flags methods,  wrote the major part of the test suite, completed
2966       the documentation,   fixed most RT bugs,  added all the allow flags and
2967       the "csv" function. See ChangeLog releases 0.25 and on.


COPYRIGHT AND LICENSE

2970        Copyright (C) 2007-2020 H.Merijn Brand.  All rights reserved.
2971        Copyright (C) 1998-2001 Jochen Wiedmann. All rights reserved.
2972        Copyright (C) 1997      Alan Citterman.  All rights reserved.
2973
2974       This library is free software;  you can redistribute and/or modify it
2975       under the same terms as Perl itself.
2976
2977
2978
2979perl v5.32.0                      2020-07-28                         CSV_XS(3)