1CSV_XS(3)             User Contributed Perl Documentation            CSV_XS(3)
2
3
4

NAME

6       Text::CSV_XS - comma-separated values manipulation routines
7

SYNOPSIS

9        # Functional interface
10        use Text::CSV_XS qw( csv );
11
12        # Read whole file in memory
13        my $aoa = csv (in => "data.csv");    # as array of array
14        my $aoh = csv (in => "data.csv",
15                       headers => "auto");   # as array of hash
16
17        # Write array of arrays as csv file
18        csv (in => $aoa, out => "file.csv", sep_char=> ";");
19
20        # Only show lines where "code" is odd
21        csv (in => "data.csv", filter => { code => sub { $_ % 2 }});
22
23
24        # Object interface
25        use Text::CSV_XS;
26
27        my @rows;
28        # Read/parse CSV
29        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
30        open my $fh, "<:encoding(utf8)", "test.csv" or die "test.csv: $!";
31        while (my $row = $csv->getline ($fh)) {
32            $row->[2] =~ m/pattern/ or next; # 3rd field should match
33            push @rows, $row;
34            }
35        close $fh;
36
37        # and write as CSV
38        open $fh, ">:encoding(utf8)", "new.csv" or die "new.csv: $!";
39        $csv->say ($fh, $_) for @rows;
40        close $fh or die "new.csv: $!";
41

DESCRIPTION

43       Text::CSV_XS  provides facilities for the composition  and
44       decomposition of comma-separated values.  An instance of the
45       Text::CSV_XS class will combine fields into a "CSV" string and parse a
46       "CSV" string into fields.
47
48       The module accepts either strings or files as input  and support the
49       use of user-specified characters for delimiters, separators, and
50       escapes.
51
52   Embedded newlines
53       Important Note:  The default behavior is to accept only ASCII
54       characters in the range from 0x20 (space) to 0x7E (tilde).   This means
55       that the fields can not contain newlines. If your data contains
56       newlines embedded in fields, or characters above 0x7E (tilde), or
57       binary data, you must set "binary => 1" in the call to "new". To cover
58       the widest range of parsing options, you will always want to set
59       binary.
60
61       But you still have the problem  that you have to pass a correct line to
62       the "parse" method, which is more complicated from the usual point of
63       usage:
64
65        my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
66        while (<>) {           #  WRONG!
67            $csv->parse ($_);
68            my @fields = $csv->fields ();
69            }
70
71       this will break, as the "while" might read broken lines:  it does not
72       care about the quoting. If you need to support embedded newlines,  the
73       way to go is to  not  pass "eol" in the parser  (it accepts "\n", "\r",
74       and "\r\n" by default) and then
75
76        my $csv = Text::CSV_XS->new ({ binary => 1 });
77        open my $fh, "<", $file or die "$file: $!";
78        while (my $row = $csv->getline ($fh)) {
79            my @fields = @$row;
80            }
81
82       The old(er) way of using global file handles is still supported
83
84        while (my $row = $csv->getline (*ARGV)) { ... }
85
86   Unicode
87       Unicode is only tested to work with perl-5.8.2 and up.
88
89       See also "BOM".
90
91       The simplest way to ensure the correct encoding is used for  in- and
92       output is by either setting layers on the filehandles, or setting the
93       "encoding" argument for "csv".
94
95        open my $fh, "<:encoding(UTF-8)", "in.csv"  or die "in.csv: $!";
96       or
97        my $aoa = csv (in => "in.csv",     encoding => "UTF-8");
98
99        open my $fh, ">:encoding(UTF-8)", "out.csv" or die "out.csv: $!";
100       or
101        csv (in => $aoa, out => "out.csv", encoding => "UTF-8");
102
103       On parsing (both for  "getline" and  "parse"),  if the source is marked
104       being UTF8, then all fields that are marked binary will also be marked
105       UTF8.
106
107       On combining ("print"  and  "combine"):  if any of the combining fields
108       was marked UTF8, the resulting string will be marked as UTF8.  Note
109       however that all fields  before  the first field marked UTF8 and
110       contained 8-bit characters that were not upgraded to UTF8,  these will
111       be  "bytes"  in the resulting string too, possibly causing unexpected
112       errors.  If you pass data of different encoding,  or you don't know if
113       there is  different  encoding, force it to be upgraded before you pass
114       them on:
115
116        $csv->print ($fh, [ map { utf8::upgrade (my $x = $_); $x } @data ]);
117
118       For complete control over encoding, please use Text::CSV::Encoded:
119
120        use Text::CSV::Encoded;
121        my $csv = Text::CSV::Encoded->new ({
122            encoding_in  => "iso-8859-1", # the encoding comes into   Perl
123            encoding_out => "cp1252",     # the encoding comes out of Perl
124            });
125
126        $csv = Text::CSV::Encoded->new ({ encoding  => "utf8" });
127        # combine () and print () accept *literally* utf8 encoded data
128        # parse () and getline () return *literally* utf8 encoded data
129
130        $csv = Text::CSV::Encoded->new ({ encoding  => undef }); # default
131        # combine () and print () accept UTF8 marked data
132        # parse () and getline () return UTF8 marked data
133
134   BOM
135       BOM  (or Byte Order Mark)  handling is available only inside the
136       "header" method.   This method supports the following encodings:
137       "utf-8", "utf-1", "utf-32be", "utf-32le", "utf-16be", "utf-16le",
138       "utf-ebcdic", "scsu", "bocu-1", and "gb-18030". See Wikipedia
139       <https://en.wikipedia.org/wiki/Byte_order_mark>.
140
141       If a file has a BOM, the easiest way to deal with that is
142
143        my $aoh = csv (in => $file, detect_bom => 1);
144
145       All records will be encoded based on the detected BOM.
146
147       This implies a call to the  "header"  method,  which defaults to also
148       set the "column_names". So this is not the same as
149
150        my $aoh = csv (in => $file, headers => "auto");
151
152       which only reads the first record to set  "column_names"  but ignores
153       any meaning of possible present BOM.
154

SPECIFICATION

156       While no formal specification for CSV exists, RFC 4180
157       <http://tools.ietf.org/html/rfc4180> (1) describes the common format
158       and establishes  "text/csv" as the MIME type registered with the IANA.
159       RFC 7111 <http://tools.ietf.org/html/rfc7111> (2) adds fragments to
160       CSV.
161
162       Many informal documents exist that describe the "CSV" format.   "How
163       To: The Comma Separated Value (CSV) File Format"
164       <http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm> (3)  provides an
165       overview of the  "CSV"  format in the most widely used applications and
166       explains how it can best be used and supported.
167
168        1) http://tools.ietf.org/html/rfc4180
169        2) http://tools.ietf.org/html/rfc7111
170        3) http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm
171
172       The basic rules are as follows:
173
174       CSV  is a delimited data format that has fields/columns separated by
175       the comma character and records/rows separated by newlines. Fields that
176       contain a special character (comma, newline, or double quote),  must be
177       enclosed in double quotes. However, if a line contains a single entry
178       that is the empty string, it may be enclosed in double quotes.  If a
179       field's value contains a double quote character it is escaped by
180       placing another double quote character next to it. The "CSV" file
181       format does not require a specific character encoding, byte order, or
182       line terminator format.
183
184       · Each record is a single line ended by a line feed  (ASCII/"LF"=0x0A)
185         or a carriage return and line feed pair (ASCII/"CRLF"="0x0D 0x0A"),
186         however, line-breaks may be embedded.
187
188       · Fields are separated by commas.
189
190       · Allowable characters within a "CSV" field include 0x09 ("TAB") and
191         the inclusive range of 0x20 (space) through 0x7E (tilde).  In binary
192         mode all characters are accepted, at least in quoted fields.
193
194       · A field within  "CSV"  must be surrounded by  double-quotes to
195         contain  a separator character (comma).
196
197       Though this is the most clear and restrictive definition,  Text::CSV_XS
198       is way more liberal than this, and allows extension:
199
200       · Line termination by a single carriage return is accepted by default
201
202       · The separation-, escape-, and escape- characters can be any ASCII
203         character in the range from  0x20 (space) to  0x7E (tilde).
204         Characters outside this range may or may not work as expected.
205         Multibyte characters, like UTF "U+060C" (ARABIC COMMA),   "U+FF0C"
206         (FULLWIDTH COMMA),  "U+241B" (SYMBOL FOR ESCAPE), "U+2424" (SYMBOL
207         FOR NEWLINE), "U+FF02" (FULLWIDTH QUOTATION MARK), and "U+201C" (LEFT
208         DOUBLE QUOTATION MARK) (to give some examples of what might look
209         promising) work for newer versions of perl for "sep_char", and
210         "quote_char" but not for "escape_char".
211
212         If you use perl-5.8.2 or higher these three attributes are
213         utf8-decoded, to increase the likelihood of success. This way
214         "U+00FE" will be allowed as a quote character.
215
216       · A field in  "CSV"  must be surrounded by double-quotes to make an
217         embedded double-quote, represented by a pair of consecutive double-
218         quotes, valid. In binary mode you may additionally use the sequence
219         ""0" for representation of a NULL byte. Using 0x00 in binary mode is
220         just as valid.
221
222       · Several violations of the above specification may be lifted by
223         passing some options as attributes to the object constructor.
224

METHODS

226   version
227       (Class method) Returns the current module version.
228
229   new
230       (Class method) Returns a new instance of class Text::CSV_XS. The
231       attributes are described by the (optional) hash ref "\%attr".
232
233        my $csv = Text::CSV_XS->new ({ attributes ... });
234
235       The following attributes are available:
236
237       eol
238
239        my $csv = Text::CSV_XS->new ({ eol => $/ });
240                  $csv->eol (undef);
241        my $eol = $csv->eol;
242
243       The end-of-line string to add to rows for "print" or the record
244       separator for "getline".
245
246       When not passed in a parser instance,  the default behavior is to
247       accept "\n", "\r", and "\r\n", so it is probably safer to not specify
248       "eol" at all. Passing "undef" or the empty string behave the same.
249
250       When not passed in a generating instance,  records are not terminated
251       at all, so it is probably wise to pass something you expect. A safe
252       choice for "eol" on output is either $/ or "\r\n".
253
254       Common values for "eol" are "\012" ("\n" or Line Feed),  "\015\012"
255       ("\r\n" or Carriage Return, Line Feed),  and "\015"  ("\r" or Carriage
256       Return). The "eol" attribute cannot exceed 7 (ASCII) characters.
257
258       If both $/ and "eol" equal "\015", parsing lines that end on only a
259       Carriage Return without Line Feed, will be "parse"d correct.
260
261       sep_char
262
263        my $csv = Text::CSV_XS->new ({ sep_char => ";" });
264                $csv->sep_char (";");
265        my $c = $csv->sep_char;
266
267       The char used to separate fields, by default a comma. (",").  Limited
268       to a single-byte character, usually in the range from 0x20 (space) to
269       0x7E (tilde). When longer sequences are required, use "sep".
270
271       The separation character can not be equal to the quote character  or to
272       the escape character.
273
274       See also "CAVEATS"
275
276       sep
277
278        my $csv = Text::CSV_XS->new ({ sep => "\N{FULLWIDTH COMMA}" });
279                  $csv->sep (";");
280        my $sep = $csv->sep;
281
282       The chars used to separate fields, by default undefined. Limited to 8
283       bytes.
284
285       When set, overrules "sep_char".  If its length is one byte it acts as
286       an alias to "sep_char".
287
288       See also "CAVEATS"
289
290       quote_char
291
292        my $csv = Text::CSV_XS->new ({ quote_char => "'" });
293                $csv->quote_char (undef);
294        my $c = $csv->quote_char;
295
296       The character to quote fields containing blanks or binary data,  by
297       default the double quote character (""").  A value of undef suppresses
298       quote chars (for simple cases only). Limited to a single-byte
299       character, usually in the range from  0x20 (space) to  0x7E (tilde).
300       When longer sequences are required, use "quote".
301
302       "quote_char" can not be equal to "sep_char".
303
304       quote
305
306        my $csv = Text::CSV_XS->new ({ quote => "\N{FULLWIDTH QUOTATION MARK}" });
307                    $csv->quote ("'");
308        my $quote = $csv->quote;
309
310       The chars used to quote fields, by default undefined. Limited to 8
311       bytes.
312
313       When set, overrules "quote_char". If its length is one byte it acts as
314       an alias to "quote_char".
315
316       See also "CAVEATS"
317
318       escape_char
319
320        my $csv = Text::CSV_XS->new ({ escape_char => "\\" });
321                $csv->escape_char (":");
322        my $c = $csv->escape_char;
323
324       The character to  escape  certain characters inside quoted fields.
325       This is limited to a  single-byte  character,  usually  in the  range
326       from  0x20 (space) to 0x7E (tilde).
327
328       The "escape_char" defaults to being the double-quote mark ("""). In
329       other words the same as the default "quote_char". This means that
330       doubling the quote mark in a field escapes it:
331
332        "foo","bar","Escape ""quote mark"" with two ""quote marks""","baz"
333
334       If  you  change  the   "quote_char"  without  changing  the
335       "escape_char",  the  "escape_char" will still be the double-quote
336       (""").  If instead you want to escape the  "quote_char" by doubling it
337       you will need to also change the  "escape_char"  to be the same as what
338       you have changed the "quote_char" to.
339
340       Setting "escape_char" to <undef> or "" will disable escaping completely
341       and is greatly discouraged. This will also disable "escape_null".
342
343       The escape character can not be equal to the separation character.
344
345       binary
346
347        my $csv = Text::CSV_XS->new ({ binary => 1 });
348                $csv->binary (0);
349        my $f = $csv->binary;
350
351       If this attribute is 1,  you may use binary characters in quoted
352       fields, including line feeds, carriage returns and "NULL" bytes. (The
353       latter could be escaped as ""0".) By default this feature is off.
354
355       If a string is marked UTF8,  "binary" will be turned on automatically
356       when binary characters other than "CR" and "NL" are encountered.   Note
357       that a simple string like "\x{00a0}" might still be binary, but not
358       marked UTF8, so setting "{ binary => 1 }" is still a wise option.
359
360       strict
361
362        my $csv = Text::CSV_XS->new ({ strict => 1 });
363                $csv->strict (0);
364        my $f = $csv->strict;
365
366       If this attribute is set to 1, any row that parses to a different
367       number of fields than the previous row will cause the parser to throw
368       error 2014.
369
370       formula_handling
371
372       formula
373
374        my $csv = Text::CSV_XS->new ({ formula => "none" });
375                $csv->formula ("none");
376        my $f = $csv->formula;
377
378       This defines the behavior of fields containing formulas. As formulas
379       are considered dangerous in spreadsheets, this attribute can define an
380       optional action to be taken if a field starts with an equal sign ("=").
381
382       For purpose of code-readability, this can also be written as
383
384        my $csv = Text::CSV_XS->new ({ formula_handling => "none" });
385                $csv->formula_handling ("none");
386        my $f = $csv->formula_handling;
387
388       Possible values for this attribute are
389
390       none
391         Take no specific action. This is the default.
392
393          $csv->formula ("none");
394
395       die
396         Cause the process to "die" whenever a leading "=" is encountered.
397
398          $csv->formula ("die");
399
400       croak
401         Cause the process to "croak" whenever a leading "=" is encountered.
402         (See Carp)
403
404          $csv->formula ("croak");
405
406       diag
407         Report position and content of the field whenever a leading  "=" is
408         found.  The value of the field is unchanged.
409
410          $csv->formula ("diag");
411
412       empty
413         Replace the content of fields that start with a "=" with the empty
414         string.
415
416          $csv->formula ("empty");
417          $csv->formula ("");
418
419       undef
420         Replace the content of fields that start with a "=" with "undef".
421
422          $csv->formula ("undef");
423          $csv->formula (undef);
424
425       a callback
426         Modify the content of fields that start with a  "="  with the return-
427         value of the callback.  The original content of the field is
428         available inside the callback as $_;
429
430          # Replace all formula's with 42
431          $csv->formula (sub { 42; });
432
433          # same as $csv->formula ("empty") but slower
434          $csv->formula (sub { "" });
435
436          # Allow =4+12
437          $csv->formula (sub { s/^=(\d+\+\d+)$/$1/eer });
438
439          # Allow more complex calculations
440          $csv->formula (sub { eval { s{^=([-+*/0-9()]+)$}{$1}ee }; $_ });
441
442       All other values will give a warning and then fallback to "diag".
443
444       decode_utf8
445
446        my $csv = Text::CSV_XS->new ({ decode_utf8 => 1 });
447                $csv->decode_utf8 (0);
448        my $f = $csv->decode_utf8;
449
450       This attributes defaults to TRUE.
451
452       While parsing,  fields that are valid UTF-8, are automatically set to
453       be UTF-8, so that
454
455         $csv->parse ("\xC4\xA8\n");
456
457       results in
458
459         PV("\304\250"\0) [UTF8 "\x{128}"]
460
461       Sometimes it might not be a desired action.  To prevent those upgrades,
462       set this attribute to false, and the result will be
463
464         PV("\304\250"\0)
465
466       auto_diag
467
468        my $csv = Text::CSV_XS->new ({ auto_diag => 1 });
469                $csv->auto_diag (2);
470        my $l = $csv->auto_diag;
471
472       Set this attribute to a number between 1 and 9 causes  "error_diag" to
473       be automatically called in void context upon errors.
474
475       In case of error "2012 - EOF", this call will be void.
476
477       If "auto_diag" is set to a numeric value greater than 1, it will "die"
478       on errors instead of "warn".  If set to anything unrecognized,  it will
479       be silently ignored.
480
481       Future extensions to this feature will include more reliable auto-
482       detection of  "autodie"  being active in the scope of which the error
483       occurred which will increment the value of "auto_diag" with  1 the
484       moment the error is detected.
485
486       diag_verbose
487
488        my $csv = Text::CSV_XS->new ({ diag_verbose => 1 });
489                $csv->diag_verbose (2);
490        my $l = $csv->diag_verbose;
491
492       Set the verbosity of the output triggered by "auto_diag".   Currently
493       only adds the current  input-record-number  (if known)  to the
494       diagnostic output with an indication of the position of the error.
495
496       blank_is_undef
497
498        my $csv = Text::CSV_XS->new ({ blank_is_undef => 1 });
499                $csv->blank_is_undef (0);
500        my $f = $csv->blank_is_undef;
501
502       Under normal circumstances, "CSV" data makes no distinction between
503       quoted- and unquoted empty fields.  These both end up in an empty
504       string field once read, thus
505
506        1,"",," ",2
507
508       is read as
509
510        ("1", "", "", " ", "2")
511
512       When writing  "CSV" files with either  "always_quote" or  "quote_empty"
513       set, the unquoted  empty field is the result of an undefined value.
514       To enable this distinction when  reading "CSV"  data,  the
515       "blank_is_undef"  attribute will cause  unquoted empty fields to be set
516       to "undef", causing the above to be parsed as
517
518        ("1", "", undef, " ", "2")
519
520       note that this is specifically important when loading  "CSV" fields
521       into a database that allows "NULL" values,  as the perl equivalent for
522       "NULL" is "undef" in DBI land.
523
524       empty_is_undef
525
526        my $csv = Text::CSV_XS->new ({ empty_is_undef => 1 });
527                $csv->empty_is_undef (0);
528        my $f = $csv->empty_is_undef;
529
530       Going one  step  further  than  "blank_is_undef",  this attribute
531       converts all empty fields to "undef", so
532
533        1,"",," ",2
534
535       is read as
536
537        (1, undef, undef, " ", 2)
538
539       Note that this effects only fields that are  originally  empty,  not
540       fields that are empty after stripping allowed whitespace. YMMV.
541
542       allow_whitespace
543
544        my $csv = Text::CSV_XS->new ({ allow_whitespace => 1 });
545                $csv->allow_whitespace (0);
546        my $f = $csv->allow_whitespace;
547
548       When this option is set to true,  the whitespace  ("TAB"'s and
549       "SPACE"'s) surrounding  the  separation character  is removed when
550       parsing.  If either "TAB" or "SPACE" is one of the three characters
551       "sep_char", "quote_char", or "escape_char" it will not be considered
552       whitespace.
553
554       Now lines like:
555
556        1 , "foo" , bar , 3 , zapp
557
558       are parsed as valid "CSV", even though it violates the "CSV" specs.
559
560       Note that  all  whitespace is stripped from both  start and  end of
561       each field.  That would make it  more than a feature to enable parsing
562       bad "CSV" lines, as
563
564        1,   2.0,  3,   ape  , monkey
565
566       will now be parsed as
567
568        ("1", "2.0", "3", "ape", "monkey")
569
570       even if the original line was perfectly acceptable "CSV".
571
572       allow_loose_quotes
573
574        my $csv = Text::CSV_XS->new ({ allow_loose_quotes => 1 });
575                $csv->allow_loose_quotes (0);
576        my $f = $csv->allow_loose_quotes;
577
578       By default, parsing unquoted fields containing "quote_char" characters
579       like
580
581        1,foo "bar" baz,42
582
583       would result in parse error 2034.  Though it is still bad practice to
584       allow this format,  we  cannot  help  the  fact  that  some  vendors
585       make  their applications spit out lines styled this way.
586
587       If there is really bad "CSV" data, like
588
589        1,"foo "bar" baz",42
590
591       or
592
593        1,""foo bar baz"",42
594
595       there is a way to get this data-line parsed and leave the quotes inside
596       the quoted field as-is.  This can be achieved by setting
597       "allow_loose_quotes" AND making sure that the "escape_char" is  not
598       equal to "quote_char".
599
600       allow_loose_escapes
601
602        my $csv = Text::CSV_XS->new ({ allow_loose_escapes => 1 });
603                $csv->allow_loose_escapes (0);
604        my $f = $csv->allow_loose_escapes;
605
606       Parsing fields  that  have  "escape_char"  characters that escape
607       characters that do not need to be escaped, like:
608
609        my $csv = Text::CSV_XS->new ({ escape_char => "\\" });
610        $csv->parse (qq{1,"my bar\'s",baz,42});
611
612       would result in parse error 2025.   Though it is bad practice to allow
613       this format,  this attribute enables you to treat all escape character
614       sequences equal.
615
616       allow_unquoted_escape
617
618        my $csv = Text::CSV_XS->new ({ allow_unquoted_escape => 1 });
619                $csv->allow_unquoted_escape (0);
620        my $f = $csv->allow_unquoted_escape;
621
622       A backward compatibility issue where "escape_char" differs from
623       "quote_char"  prevents  "escape_char" to be in the first position of a
624       field.  If "quote_char" is equal to the default """ and "escape_char"
625       is set to "\", this would be illegal:
626
627        1,\0,2
628
629       Setting this attribute to 1  might help to overcome issues with
630       backward compatibility and allow this style.
631
632       always_quote
633
634        my $csv = Text::CSV_XS->new ({ always_quote => 1 });
635                $csv->always_quote (0);
636        my $f = $csv->always_quote;
637
638       By default the generated fields are quoted only if they need to be.
639       For example, if they contain the separator character. If you set this
640       attribute to 1 then all defined fields will be quoted. ("undef" fields
641       are not quoted, see "blank_is_undef"). This makes it quite often easier
642       to handle exported data in external applications.   (Poor creatures who
643       are better to use Text::CSV_XS. :)
644
645       quote_space
646
647        my $csv = Text::CSV_XS->new ({ quote_space => 1 });
648                $csv->quote_space (0);
649        my $f = $csv->quote_space;
650
651       By default,  a space in a field would trigger quotation.  As no rule
652       exists this to be forced in "CSV",  nor any for the opposite, the
653       default is true for safety.   You can exclude the space  from this
654       trigger  by setting this attribute to 0.
655
656       quote_empty
657
658        my $csv = Text::CSV_XS->new ({ quote_empty => 1 });
659                $csv->quote_empty (0);
660        my $f = $csv->quote_empty;
661
662       By default the generated fields are quoted only if they need to be.
663       An empty (defined) field does not need quotation. If you set this
664       attribute to 1 then empty defined fields will be quoted.  ("undef"
665       fields are not quoted, see "blank_is_undef"). See also "always_quote".
666
667       quote_binary
668
669        my $csv = Text::CSV_XS->new ({ quote_binary => 1 });
670                $csv->quote_binary (0);
671        my $f = $csv->quote_binary;
672
673       By default,  all "unsafe" bytes inside a string cause the combined
674       field to be quoted.  By setting this attribute to 0, you can disable
675       that trigger for bytes >= 0x7F.
676
677       escape_null
678
679        my $csv = Text::CSV_XS->new ({ escape_null => 1 });
680                $csv->escape_null (0);
681        my $f = $csv->escape_null;
682
683       By default, a "NULL" byte in a field would be escaped. This option
684       enables you to treat the  "NULL"  byte as a simple binary character in
685       binary mode (the "{ binary => 1 }" is set).  The default is true.  You
686       can prevent "NULL" escapes by setting this attribute to 0.
687
688       When the "escape_char" attribute is set to undefined,  this attribute
689       will be set to false.
690
691       The default setting will encode "=\x00=" as
692
693        "="0="
694
695       With "escape_null" set, this will result in
696
697        "=\x00="
698
699       The default when using the "csv" function is "false".
700
701       For backward compatibility reasons,  the deprecated old name
702       "quote_null" is still recognized.
703
704       keep_meta_info
705
706        my $csv = Text::CSV_XS->new ({ keep_meta_info => 1 });
707                $csv->keep_meta_info (0);
708        my $f = $csv->keep_meta_info;
709
710       By default, the parsing of input records is as simple and fast as
711       possible.  However,  some parsing information - like quotation of the
712       original field - is lost in that process.  Setting this flag to true
713       enables retrieving that information after parsing with  the methods
714       "meta_info",  "is_quoted", and "is_binary" described below.  Default is
715       false for performance.
716
717       If you set this attribute to a value greater than 9,   than you can
718       control output quotation style like it was used in the input of the the
719       last parsed record (unless quotation was added because of other
720       reasons).
721
722        my $csv = Text::CSV_XS->new ({
723           binary         => 1,
724           keep_meta_info => 1,
725           quote_space    => 0,
726           });
727
728        my $row = $csv->parse (q{1,,"", ," ",f,"g","h""h",help,"help"});
729
730        $csv->print (*STDOUT, \@row);
731        # 1,,, , ,f,g,"h""h",help,help
732        $csv->keep_meta_info (11);
733        $csv->print (*STDOUT, \@row);
734        # 1,,"", ," ",f,"g","h""h",help,"help"
735
736       undef_str
737
738        my $csv = Text::CSV_XS->new ({ undef_str => "\\N" });
739                $csv->undef_str (undef);
740        my $s = $csv->undef_str;
741
742       This attribute optionally defines the output of undefined fields. The
743       value passed is not changed at all, so if it needs quotation, the
744       quotation needs to be included in the value of the attribute.  Use with
745       caution, as passing a value like  ",",,,,"""  will for sure mess up
746       your output. The default for this attribute is "undef", meaning no
747       special treatment.
748
749       This attribute is useful when exporting  CSV data  to be imported in
750       custom loaders, like for MySQL, that recognize special sequences for
751       "NULL" data.
752
753       This attribute has no meaning when parsing CSV data.
754
755       verbatim
756
757        my $csv = Text::CSV_XS->new ({ verbatim => 1 });
758                $csv->verbatim (0);
759        my $f = $csv->verbatim;
760
761       This is a quite controversial attribute to set,  but makes some hard
762       things possible.
763
764       The rationale behind this attribute is to tell the parser that the
765       normally special characters newline ("NL") and Carriage Return ("CR")
766       will not be special when this flag is set,  and be dealt with  as being
767       ordinary binary characters. This will ease working with data with
768       embedded newlines.
769
770       When  "verbatim"  is used with  "getline",  "getline"  auto-"chomp"'s
771       every line.
772
773       Imagine a file format like
774
775        M^^Hans^Janssen^Klas 2\n2A^Ja^11-06-2007#\r\n
776
777       where, the line ending is a very specific "#\r\n", and the sep_char is
778       a "^" (caret).   None of the fields is quoted,   but embedded binary
779       data is likely to be present. With the specific line ending, this
780       should not be too hard to detect.
781
782       By default,  Text::CSV_XS'  parse function is instructed to only know
783       about "\n" and "\r"  to be legal line endings,  and so has to deal with
784       the embedded newline as a real "end-of-line",  so it can scan the next
785       line if binary is true, and the newline is inside a quoted field. With
786       this option, we tell "parse" to parse the line as if "\n" is just
787       nothing more than a binary character.
788
789       For "parse" this means that the parser has no more idea about line
790       ending and "getline" "chomp"s line endings on reading.
791
792       types
793
794       A set of column types; the attribute is immediately passed to the
795       "types" method.
796
797       callbacks
798
799       See the "Callbacks" section below.
800
801       accessors
802
803       To sum it up,
804
805        $csv = Text::CSV_XS->new ();
806
807       is equivalent to
808
809        $csv = Text::CSV_XS->new ({
810            eol                   => undef, # \r, \n, or \r\n
811            sep_char              => ',',
812            sep                   => undef,
813            quote_char            => '"',
814            quote                 => undef,
815            escape_char           => '"',
816            binary                => 0,
817            decode_utf8           => 1,
818            auto_diag             => 0,
819            diag_verbose          => 0,
820            blank_is_undef        => 0,
821            empty_is_undef        => 0,
822            allow_whitespace      => 0,
823            allow_loose_quotes    => 0,
824            allow_loose_escapes   => 0,
825            allow_unquoted_escape => 0,
826            always_quote          => 0,
827            quote_empty           => 0,
828            quote_space           => 1,
829            escape_null           => 1,
830            quote_binary          => 1,
831            keep_meta_info        => 0,
832            strict                => 0,
833            formula               => 0,
834            verbatim              => 0,
835            undef_str             => undef,
836            types                 => undef,
837            callbacks             => undef,
838            });
839
840       For all of the above mentioned flags, an accessor method is available
841       where you can inquire the current value, or change the value
842
843        my $quote = $csv->quote_char;
844        $csv->binary (1);
845
846       It is not wise to change these settings halfway through writing "CSV"
847       data to a stream. If however you want to create a new stream using the
848       available "CSV" object, there is no harm in changing them.
849
850       If the "new" constructor call fails,  it returns "undef",  and makes
851       the fail reason available through the "error_diag" method.
852
853        $csv = Text::CSV_XS->new ({ ecs_char => 1 }) or
854            die "".Text::CSV_XS->error_diag ();
855
856       "error_diag" will return a string like
857
858        "INI - Unknown attribute 'ecs_char'"
859
860   known_attributes
861        @attr = Text::CSV_XS->known_attributes;
862        @attr = Text::CSV_XS::known_attributes;
863        @attr = $csv->known_attributes;
864
865       This method will return an ordered list of all the supported
866       attributes as described above.   This can be useful for knowing what
867       attributes are valid in classes that use or extend Text::CSV_XS.
868
869   print
870        $status = $csv->print ($fh, $colref);
871
872       Similar to  "combine" + "string" + "print",  but much more efficient.
873       It expects an array ref as input  (not an array!)  and the resulting
874       string is not really  created,  but  immediately  written  to the  $fh
875       object, typically an IO handle or any other object that offers a
876       "print" method.
877
878       For performance reasons  "print"  does not create a result string,  so
879       all "string", "status", "fields", and "error_input" methods will return
880       undefined information after executing this method.
881
882       If $colref is "undef"  (explicit,  not through a variable argument) and
883       "bind_columns"  was used to specify fields to be printed,  it is
884       possible to make performance improvements, as otherwise data would have
885       to be copied as arguments to the method call:
886
887        $csv->bind_columns (\($foo, $bar));
888        $status = $csv->print ($fh, undef);
889
890       A short benchmark
891
892        my @data = ("aa" .. "zz");
893        $csv->bind_columns (\(@data));
894
895        $csv->print ($fh, [ @data ]);   # 11800 recs/sec
896        $csv->print ($fh,  \@data  );   # 57600 recs/sec
897        $csv->print ($fh,   undef  );   # 48500 recs/sec
898
899   say
900        $status = $csv->say ($fh, $colref);
901
902       Like "print", but "eol" defaults to "$\".
903
904   print_hr
905        $csv->print_hr ($fh, $ref);
906
907       Provides an easy way  to print a  $ref  (as fetched with "getline_hr")
908       provided the column names are set with "column_names".
909
910       It is just a wrapper method with basic parameter checks over
911
912        $csv->print ($fh, [ map { $ref->{$_} } $csv->column_names ]);
913
914   combine
915        $status = $csv->combine (@fields);
916
917       This method constructs a "CSV" record from  @fields,  returning success
918       or failure.   Failure can result from lack of arguments or an argument
919       that contains an invalid character.   Upon success,  "string" can be
920       called to retrieve the resultant "CSV" string.  Upon failure,  the
921       value returned by "string" is undefined and "error_input" could be
922       called to retrieve the invalid argument.
923
924   string
925        $line = $csv->string ();
926
927       This method returns the input to  "parse"  or the resultant "CSV"
928       string of "combine", whichever was called more recently.
929
930   getline
931        $colref = $csv->getline ($fh);
932
933       This is the counterpart to  "print",  as "parse"  is the counterpart to
934       "combine":  it parses a row from the $fh  handle using the "getline"
935       method associated with $fh  and parses this row into an array ref.
936       This array ref is returned by the function or "undef" for failure.
937       When $fh does not support "getline", you are likely to hit errors.
938
939       When fields are bound with "bind_columns" the return value is a
940       reference to an empty list.
941
942       The "string", "fields", and "status" methods are meaningless again.
943
944   getline_all
945        $arrayref = $csv->getline_all ($fh);
946        $arrayref = $csv->getline_all ($fh, $offset);
947        $arrayref = $csv->getline_all ($fh, $offset, $length);
948
949       This will return a reference to a list of getline ($fh) results.  In
950       this call, "keep_meta_info" is disabled.  If $offset is negative, as
951       with "splice", only the last  "abs ($offset)" records of $fh are taken
952       into consideration.
953
954       Given a CSV file with 10 lines:
955
956        lines call
957        ----- ---------------------------------------------------------
958        0..9  $csv->getline_all ($fh)         # all
959        0..9  $csv->getline_all ($fh,  0)     # all
960        8..9  $csv->getline_all ($fh,  8)     # start at 8
961        -     $csv->getline_all ($fh,  0,  0) # start at 0 first 0 rows
962        0..4  $csv->getline_all ($fh,  0,  5) # start at 0 first 5 rows
963        4..5  $csv->getline_all ($fh,  4,  2) # start at 4 first 2 rows
964        8..9  $csv->getline_all ($fh, -2)     # last 2 rows
965        6..7  $csv->getline_all ($fh, -4,  2) # first 2 of last  4 rows
966
967   getline_hr
968       The "getline_hr" and "column_names" methods work together  to allow you
969       to have rows returned as hashrefs.  You must call "column_names" first
970       to declare your column names.
971
972        $csv->column_names (qw( code name price description ));
973        $hr = $csv->getline_hr ($fh);
974        print "Price for $hr->{name} is $hr->{price} EUR\n";
975
976       "getline_hr" will croak if called before "column_names".
977
978       Note that  "getline_hr"  creates a hashref for every row and will be
979       much slower than the combined use of "bind_columns"  and "getline" but
980       still offering the same ease of use hashref inside the loop:
981
982        my @cols = @{$csv->getline ($fh)};
983        $csv->column_names (@cols);
984        while (my $row = $csv->getline_hr ($fh)) {
985            print $row->{price};
986            }
987
988       Could easily be rewritten to the much faster:
989
990        my @cols = @{$csv->getline ($fh)};
991        my $row = {};
992        $csv->bind_columns (\@{$row}{@cols});
993        while ($csv->getline ($fh)) {
994            print $row->{price};
995            }
996
997       Your mileage may vary for the size of the data and the number of rows.
998       With perl-5.14.2 the comparison for a 100_000 line file with 14 rows:
999
1000                   Rate hashrefs getlines
1001        hashrefs 1.00/s       --     -76%
1002        getlines 4.15/s     313%       --
1003
1004   getline_hr_all
1005        $arrayref = $csv->getline_hr_all ($fh);
1006        $arrayref = $csv->getline_hr_all ($fh, $offset);
1007        $arrayref = $csv->getline_hr_all ($fh, $offset, $length);
1008
1009       This will return a reference to a list of   getline_hr ($fh) results.
1010       In this call, "keep_meta_info" is disabled.
1011
1012   parse
1013        $status = $csv->parse ($line);
1014
1015       This method decomposes a  "CSV"  string into fields,  returning success
1016       or failure.   Failure can result from a lack of argument  or the given
1017       "CSV" string is improperly formatted.   Upon success, "fields" can be
1018       called to retrieve the decomposed fields. Upon failure calling "fields"
1019       will return undefined data and  "error_input"  can be called to
1020       retrieve  the invalid argument.
1021
1022       You may use the "types"  method for setting column types.  See "types"'
1023       description below.
1024
1025       The $line argument is supposed to be a simple scalar. Everything else
1026       is supposed to croak and set error 1500.
1027
1028   fragment
1029       This function tries to implement RFC7111  (URI Fragment Identifiers for
1030       the text/csv Media Type) - http://tools.ietf.org/html/rfc7111
1031
1032        my $AoA = $csv->fragment ($fh, $spec);
1033
1034       In specifications,  "*" is used to specify the last item, a dash ("-")
1035       to indicate a range.   All indices are 1-based:  the first row or
1036       column has index 1. Selections can be combined with the semi-colon
1037       (";").
1038
1039       When using this method in combination with  "column_names",  the
1040       returned reference  will point to a  list of hashes  instead of a  list
1041       of lists.  A disjointed  cell-based combined selection  might return
1042       rows with different number of columns making the use of hashes
1043       unpredictable.
1044
1045        $csv->column_names ("Name", "Age");
1046        my $AoH = $csv->fragment ($fh, "col=3;8");
1047
1048       If the "after_parse" callback is active,  it is also called on every
1049       line parsed and skipped before the fragment.
1050
1051       row
1052          row=4
1053          row=5-7
1054          row=6-*
1055          row=1-2;4;6-*
1056
1057       col
1058          col=2
1059          col=1-3
1060          col=4-*
1061          col=1-2;4;7-*
1062
1063       cell
1064         In cell-based selection, the comma (",") is used to pair row and
1065         column
1066
1067          cell=4,1
1068
1069         The range operator ("-") using "cell"s can be used to define top-left
1070         and bottom-right "cell" location
1071
1072          cell=3,1-4,6
1073
1074         The "*" is only allowed in the second part of a pair
1075
1076          cell=3,2-*,2    # row 3 till end, only column 2
1077          cell=3,2-3,*    # column 2 till end, only row 3
1078          cell=3,2-*,*    # strip row 1 and 2, and column 1
1079
1080         Cells and cell ranges may be combined with ";", possibly resulting in
1081         rows with different number of columns
1082
1083          cell=1,1-2,2;3,3-4,4;1,4;4,1
1084
1085         Disjointed selections will only return selected cells.   The cells
1086         that are not  specified  will  not  be  included  in the  returned
1087         set,  not even as "undef".  As an example given a "CSV" like
1088
1089          11,12,13,...19
1090          21,22,...28,29
1091          :            :
1092          91,...97,98,99
1093
1094         with "cell=1,1-2,2;3,3-4,4;1,4;4,1" will return:
1095
1096          11,12,14
1097          21,22
1098          33,34
1099          41,43,44
1100
1101         Overlapping cell-specs will return those cells only once, So
1102         "cell=1,1-3,3;2,2-4,4;2,3;4,2" will return:
1103
1104          11,12,13
1105          21,22,23,24
1106          31,32,33,34
1107          42,43,44
1108
1109       RFC7111 <http://tools.ietf.org/html/rfc7111> does  not  allow different
1110       types of specs to be combined   (either "row" or "col" or "cell").
1111       Passing an invalid fragment specification will croak and set error
1112       2013.
1113
1114   column_names
1115       Set the "keys" that will be used in the  "getline_hr"  calls.  If no
1116       keys (column names) are passed, it will return the current setting as a
1117       list.
1118
1119       "column_names" accepts a list of scalars  (the column names)  or a
1120       single array_ref, so you can pass the return value from "getline" too:
1121
1122        $csv->column_names ($csv->getline ($fh));
1123
1124       "column_names" does no checking on duplicates at all, which might lead
1125       to unexpected results.   Undefined entries will be replaced with the
1126       string "\cAUNDEF\cA", so
1127
1128        $csv->column_names (undef, "", "name", "name");
1129        $hr = $csv->getline_hr ($fh);
1130
1131       Will set "$hr->{"\cAUNDEF\cA"}" to the 1st field,  "$hr->{""}" to the
1132       2nd field, and "$hr->{name}" to the 4th field,  discarding the 3rd
1133       field.
1134
1135       "column_names" croaks on invalid arguments.
1136
1137   header
1138       This method does NOT work in perl-5.6.x
1139
1140       Parse the CSV header and set "sep", column_names and encoding.
1141
1142        my @hdr = $csv->header ($fh);
1143        $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1144        $csv->header ($fh, { detect_bom => 1, munge_column_names => "lc" });
1145
1146       The first argument should be a file handle.
1147
1148       This method resets some object properties,  as it is supposed to be
1149       invoked only once per file or stream.  It will leave attributes
1150       "column_names" and "bound_columns" alone of setting column names is
1151       disabled. Reading headers on previously process objects might fail on
1152       perl-5.8.0 and older.
1153
1154       Assuming that the file opened for parsing has a header, and the header
1155       does not contain problematic characters like embedded newlines,   read
1156       the first line from the open handle then auto-detect whether the header
1157       separates the column names with a character from the allowed separator
1158       list.
1159
1160       If any of the allowed separators matches,  and none of the other
1161       allowed separators match,  set  "sep"  to that  separator  for the
1162       current CSV_XS instance and use it to parse the first line, map those
1163       to lowercase, and use that to set the instance "column_names":
1164
1165        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
1166        open my $fh, "<", "file.csv";
1167        binmode $fh; # for Windows
1168        $csv->header ($fh);
1169        while (my $row = $csv->getline_hr ($fh)) {
1170            ...
1171            }
1172
1173       If the header is empty,  contains more than one unique separator out of
1174       the allowed set,  contains empty fields,   or contains identical fields
1175       (after folding), it will croak with error 1010, 1011, 1012, or 1013
1176       respectively.
1177
1178       If the header contains embedded newlines or is not valid  CSV  in any
1179       other way, this method will croak and leave the parse error untouched.
1180
1181       A successful call to "header"  will always set the  "sep"  of the $csv
1182       object. This behavior can not be disabled.
1183
1184       return value
1185
1186       On error this method will croak.
1187
1188       In list context,  the headers will be returned whether they are used to
1189       set "column_names" or not.
1190
1191       In scalar context, the instance itself is returned.  Note: the values
1192       as found in the header will effectively be  lost if  "set_column_names"
1193       is false.
1194
1195       Options
1196
1197       sep_set
1198          $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1199
1200         The list of legal separators defaults to "[ ";", "," ]" and can be
1201         changed by this option.  As this is probably the most often used
1202         option,  it can be passed on its own as an unnamed argument:
1203
1204          $csv->header ($fh, [ ";", ",", "|", "\t", "::", "\x{2063}" ]);
1205
1206         Multi-byte  sequences are allowed,  both multi-character and
1207         Unicode.  See "sep".
1208
1209       detect_bom
1210          $csv->header ($fh, { detect_bom => 1 });
1211
1212         The default behavior is to detect if the header line starts with a
1213         BOM.  If the header has a BOM, use that to set the encoding of $fh.
1214         This default behavior can be disabled by passing a false value to
1215         "detect_bom".
1216
1217         Supported encodings from BOM are: UTF-8, UTF-16BE, UTF-16LE,
1218         UTF-32BE,  and UTF-32LE. BOM's also support UTF-1, UTF-EBCDIC, SCSU,
1219         BOCU-1,  and GB-18030 but Encode does not (yet). UTF-7 is not
1220         supported.
1221
1222         If a supported BOM was detected as start of the stream, it is stored
1223         in the abject attribute "ENCODING".
1224
1225          my $enc = $csv->{ENCODING};
1226
1227         The encoding is used with "binmode" on $fh.
1228
1229         If the handle was opened in a (correct) encoding,  this method will
1230         not alter the encoding, as it checks the leading bytes of the first
1231         line. In case the stream starts with a decode BOM ("U+FEFF"),
1232         "{ENCODING}" will be "" (empty) instead of the default "undef".
1233
1234       munge_column_names
1235         This option offers the means to modify the column names into
1236         something that is most useful to the application.   The default is to
1237         map all column names to lower case.
1238
1239          $csv->header ($fh, { munge_column_names => "lc" });
1240
1241         The following values are available:
1242
1243           lc     - lower case
1244           uc     - upper case
1245           db     - valid DB field names
1246           none   - do not change
1247           \%hash - supply a mapping
1248           \&cb   - supply a callback
1249
1250         Lower case
1251            $csv->header ($fh, { munge_column_names => "lc" });
1252
1253           The header is changed to all lower-case
1254
1255            $_ = lc;
1256
1257         Upper case
1258            $csv->header ($fh, { munge_column_names => "uc" });
1259
1260           The header is changed to all upper-case
1261
1262            $_ = uc;
1263
1264         Literal
1265            $csv->header ($fh, { munge_column_names => "none" });
1266
1267         Hash
1268            $csv->header ($fh, { munge_column_names => { foo => "sombrero" });
1269
1270           if a value does not exist, the original value is used unchanged
1271
1272         Database
1273            $csv->header ($fh, { munge_column_names => "db" });
1274
1275           - lower-case
1276
1277           - all sequences of non-word characters are replaced with an
1278             underscore
1279
1280           - all leading underscores are removed
1281
1282            $_ = lc (s/\W+/_/gr =~ s/^_+//r);
1283
1284         Callback
1285            $csv->header ($fh, { munge_column_names => sub { fc } });
1286            $csv->header ($fh, { munge_column_names => sub { "column_".$col++ } });
1287            $csv->header ($fh, { munge_column_names => sub { lc (s/\W+/_/gr) } });
1288
1289           As this callback is called in a "map", you can use $_ directly.
1290
1291       set_column_names
1292          $csv->header ($fh, { set_column_names => 1 });
1293
1294         The default is to set the instances column names using
1295         "column_names" if the method is successful,  so subsequent calls to
1296         "getline_hr" can return a hash. Disable setting the header can be
1297         forced by using a false value for this option.
1298
1299         As described in "return value" above, content is lost in scalar
1300         context.
1301
1302       Validation
1303
1304       When receiving CSV files from external sources,  this method can be
1305       used to protect against changes in the layout by restricting to known
1306       headers  (and typos in the header fields).
1307
1308        my %known = (
1309            "record key" => "c_rec",
1310            "rec id"     => "c_rec",
1311            "id_rec"     => "c_rec",
1312            "kode"       => "code",
1313            "code"       => "code",
1314            "vaule"      => "value",
1315            "value"      => "value",
1316            );
1317        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
1318        open my $fh, "<", $source or die "$source: $!";
1319        $csv->header ($fh, { munge_column_names => sub {
1320            s/\s+$//;
1321            s/^\s+//;
1322            $known{lc $_} or die "Unknown column '$_' in $source";
1323            }});
1324        while (my $row = $csv->getline_hr ($fh)) {
1325            say join "\t", $row->{c_rec}, $row->{code}, $row->{value};
1326            }
1327
1328   bind_columns
1329       Takes a list of scalar references to be used for output with  "print"
1330       or to store in the fields fetched by "getline".  When you do not pass
1331       enough references to store the fetched fields in, "getline" will fail
1332       with error 3006.  If you pass more than there are fields to return,
1333       the content of the remaining references is left untouched.
1334
1335        $csv->bind_columns (\$code, \$name, \$price, \$description);
1336        while ($csv->getline ($fh)) {
1337            print "The price of a $name is \x{20ac} $price\n";
1338            }
1339
1340       To reset or clear all column binding, call "bind_columns" with the
1341       single argument "undef". This will also clear column names.
1342
1343        $csv->bind_columns (undef);
1344
1345       If no arguments are passed at all, "bind_columns" will return the list
1346       of current bindings or "undef" if no binds are active.
1347
1348       Note that in parsing with  "bind_columns",  the fields are set on the
1349       fly.  That implies that if the third field of a row causes an error
1350       (or this row has just two fields where the previous row had more),  the
1351       first two fields already have been assigned the values of the current
1352       row, while the rest of the fields will still hold the values of the
1353       previous row.  If you want the parser to fail in these cases, use the
1354       "strict" attribute.
1355
1356   eof
1357        $eof = $csv->eof ();
1358
1359       If "parse" or  "getline"  was used with an IO stream,  this method will
1360       return true (1) if the last call hit end of file,  otherwise it will
1361       return false ('').  This is useful to see the difference between a
1362       failure and end of file.
1363
1364       Note that if the parsing of the last line caused an error,  "eof" is
1365       still true.  That means that if you are not using "auto_diag", an idiom
1366       like
1367
1368        while (my $row = $csv->getline ($fh)) {
1369            # ...
1370            }
1371        $csv->eof or $csv->error_diag;
1372
1373       will not report the error. You would have to change that to
1374
1375        while (my $row = $csv->getline ($fh)) {
1376            # ...
1377            }
1378        +$csv->error_diag and $csv->error_diag;
1379
1380   types
1381        $csv->types (\@tref);
1382
1383       This method is used to force that  (all)  columns are of a given type.
1384       For example, if you have an integer column,  two  columns  with
1385       doubles  and a string column, then you might do a
1386
1387        $csv->types ([Text::CSV_XS::IV (),
1388                      Text::CSV_XS::NV (),
1389                      Text::CSV_XS::NV (),
1390                      Text::CSV_XS::PV ()]);
1391
1392       Column types are used only for decoding columns while parsing,  in
1393       other words by the "parse" and "getline" methods.
1394
1395       You can unset column types by doing a
1396
1397        $csv->types (undef);
1398
1399       or fetch the current type settings with
1400
1401        $types = $csv->types ();
1402
1403       IV  Set field type to integer.
1404
1405       NV  Set field type to numeric/float.
1406
1407       PV  Set field type to string.
1408
1409   fields
1410        @columns = $csv->fields ();
1411
1412       This method returns the input to   "combine"  or the resultant
1413       decomposed fields of a successful "parse", whichever was called more
1414       recently.
1415
1416       Note that the return value is undefined after using "getline", which
1417       does not fill the data structures returned by "parse".
1418
1419   meta_info
1420        @flags = $csv->meta_info ();
1421
1422       This method returns the "flags" of the input to "combine" or the flags
1423       of the resultant  decomposed fields of  "parse",   whichever was called
1424       more recently.
1425
1426       For each field,  a meta_info field will hold  flags that  inform
1427       something about  the  field  returned  by  the  "fields"  method or
1428       passed to  the "combine" method. The flags are bit-wise-"or"'d like:
1429
1430       " "0x0001
1431         The field was quoted.
1432
1433       " "0x0002
1434         The field was binary.
1435
1436       See the "is_***" methods below.
1437
1438   is_quoted
1439        my $quoted = $csv->is_quoted ($column_idx);
1440
1441       Where  $column_idx is the  (zero-based)  index of the column in the
1442       last result of "parse".
1443
1444       This returns a true value  if the data in the indicated column was
1445       enclosed in "quote_char" quotes.  This might be important for fields
1446       where content ",20070108," is to be treated as a numeric value,  and
1447       where ","20070108"," is explicitly marked as character string data.
1448
1449       This method is only valid when "keep_meta_info" is set to a true value.
1450
1451   is_binary
1452        my $binary = $csv->is_binary ($column_idx);
1453
1454       Where  $column_idx is the  (zero-based)  index of the column in the
1455       last result of "parse".
1456
1457       This returns a true value if the data in the indicated column contained
1458       any byte in the range "[\x00-\x08,\x10-\x1F,\x7F-\xFF]".
1459
1460       This method is only valid when "keep_meta_info" is set to a true value.
1461
1462   is_missing
1463        my $missing = $csv->is_missing ($column_idx);
1464
1465       Where  $column_idx is the  (zero-based)  index of the column in the
1466       last result of "getline_hr".
1467
1468        $csv->keep_meta_info (1);
1469        while (my $hr = $csv->getline_hr ($fh)) {
1470            $csv->is_missing (0) and next; # This was an empty line
1471            }
1472
1473       When using  "getline_hr",  it is impossible to tell if the  parsed
1474       fields are "undef" because they where not filled in the "CSV" stream
1475       or because they were not read at all, as all the fields defined by
1476       "column_names" are set in the hash-ref.    If you still need to know if
1477       all fields in each row are provided, you should enable "keep_meta_info"
1478       so you can check the flags.
1479
1480       If  "keep_meta_info"  is "false",  "is_missing"  will always return
1481       "undef", regardless of $column_idx being valid or not. If this
1482       attribute is "true" it will return either 0 (the field is present) or 1
1483       (the field is missing).
1484
1485       A special case is the empty line.  If the line is completely empty -
1486       after dealing with the flags - this is still a valid CSV line:  it is a
1487       record of just one single empty field. However, if "keep_meta_info" is
1488       set, invoking "is_missing" with index 0 will now return true.
1489
1490   status
1491        $status = $csv->status ();
1492
1493       This method returns the status of the last invoked "combine" or "parse"
1494       call. Status is success (true: 1) or failure (false: "undef" or 0).
1495
1496   error_input
1497        $bad_argument = $csv->error_input ();
1498
1499       This method returns the erroneous argument (if it exists) of "combine"
1500       or "parse",  whichever was called more recently.  If the last
1501       invocation was successful, "error_input" will return "undef".
1502
1503   error_diag
1504        Text::CSV_XS->error_diag ();
1505        $csv->error_diag ();
1506        $error_code               = 0  + $csv->error_diag ();
1507        $error_str                = "" . $csv->error_diag ();
1508        ($cde, $str, $pos, $rec, $fld) = $csv->error_diag ();
1509
1510       If (and only if) an error occurred,  this function returns  the
1511       diagnostics of that error.
1512
1513       If called in void context,  this will print the internal error code and
1514       the associated error message to STDERR.
1515
1516       If called in list context,  this will return  the error code  and the
1517       error message in that order.  If the last error was from parsing, the
1518       rest of the values returned are a best guess at the location  within
1519       the line  that was being parsed. Their values are 1-based.  The
1520       position currently is index of the byte at which the parsing failed in
1521       the current record. It might change to be the index of the current
1522       character in a later release. The records is the index of the record
1523       parsed by the csv instance. The field number is the index of the field
1524       the parser thinks it is currently  trying to  parse. See
1525       examples/csv-check for how this can be used.
1526
1527       If called in  scalar context,  it will return  the diagnostics  in a
1528       single scalar, a-la $!.  It will contain the error code in numeric
1529       context, and the diagnostics message in string context.
1530
1531       When called as a class method or a  direct function call,  the
1532       diagnostics are that of the last "new" call.
1533
1534   record_number
1535        $recno = $csv->record_number ();
1536
1537       Returns the records parsed by this csv instance.  This value should be
1538       more accurate than $. when embedded newlines come in play. Records
1539       written by this instance are not counted.
1540
1541   SetDiag
1542        $csv->SetDiag (0);
1543
1544       Use to reset the diagnostics if you are dealing with errors.
1545

FUNCTIONS

1547   csv
1548       This function is not exported by default and should be explicitly
1549       requested:
1550
1551        use Text::CSV_XS qw( csv );
1552
1553       This is an high-level function that aims at simple (user) interfaces.
1554       This can be used to read/parse a "CSV" file or stream (the default
1555       behavior) or to produce a file or write to a stream (define the  "out"
1556       attribute).  It returns an array- or hash-reference on parsing (or
1557       "undef" on fail) or the numeric value of  "error_diag"  on writing.
1558       When this function fails you can get to the error using the class call
1559       to "error_diag"
1560
1561        my $aoa = csv (in => "test.csv") or
1562            die Text::CSV_XS->error_diag;
1563
1564       This function takes the arguments as key-value pairs. This can be
1565       passed as a list or as an anonymous hash:
1566
1567        my $aoa = csv (  in => "test.csv", sep_char => ";");
1568        my $aoh = csv ({ in => $fh, headers => "auto" });
1569
1570       The arguments passed consist of two parts:  the arguments to "csv"
1571       itself and the optional attributes to the  "CSV"  object used inside
1572       the function as enumerated and explained in "new".
1573
1574       If not overridden, the default option used for CSV is
1575
1576        auto_diag   => 1
1577        escape_null => 0
1578
1579       The option that is always set and cannot be altered is
1580
1581        binary      => 1
1582
1583       As this function will likely be used in one-liners,  it allows  "quote"
1584       to be abbreviated as "quo",  and  "escape_char" to be abbreviated as
1585       "esc" or "escape".
1586
1587       Alternative invocations:
1588
1589        my $aoa = Text::CSV_XS::csv (in => "file.csv");
1590
1591        my $csv = Text::CSV_XS->new ();
1592        my $aoa = $csv->csv (in => "file.csv");
1593
1594       In the latter case, the object attributes are used from the existing
1595       object and the attribute arguments in the function call are ignored:
1596
1597        my $csv = Text::CSV_XS->new ({ sep_char => ";" });
1598        my $aoh = $csv->csv (in => "file.csv", bom => 1);
1599
1600       will parse using ";" as "sep_char", not ",".
1601
1602       in
1603
1604       Used to specify the source.  "in" can be a file name (e.g. "file.csv"),
1605       which will be  opened for reading  and closed when finished,  a file
1606       handle (e.g.  $fh or "FH"),  a reference to a glob (e.g. "\*ARGV"),
1607       the glob itself (e.g. *STDIN), or a reference to a scalar (e.g.
1608       "\q{1,2,"csv"}").
1609
1610       When used with "out", "in" should be a reference to a CSV structure
1611       (AoA or AoH)  or a CODE-ref that returns an array-reference or a hash-
1612       reference.  The code-ref will be invoked with no arguments.
1613
1614        my $aoa = csv (in => "file.csv");
1615
1616        open my $fh, "<", "file.csv";
1617        my $aoa = csv (in => $fh);
1618
1619        my $csv = [ [qw( Foo Bar )], [ 1, 2 ], [ 2, 3 ]];
1620        my $err = csv (in => $csv, out => "file.csv");
1621
1622       If called in void context without the "out" attribute, the resulting
1623       ref will be used as input to a subsequent call to csv:
1624
1625        csv (in => "file.csv", filter => { 2 => sub { length > 2 }})
1626
1627       will be a shortcut to
1628
1629        csv (in => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}))
1630
1631       where, in the absence of the "out" attribute, this is a shortcut to
1632
1633        csv (in  => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}),
1634             out => *STDOUT)
1635
1636       out
1637
1638        csv (in => $aoa, out => "file.csv");
1639        csv (in => $aoa, out => $fh);
1640        csv (in => $aoa, out =>   STDOUT);
1641        csv (in => $aoa, out =>  *STDOUT);
1642        csv (in => $aoa, out => \*STDOUT);
1643        csv (in => $aoa, out => \my $data);
1644        csv (in => $aoa, out =>  undef);
1645        csv (in => $aoa, out => \"skip");
1646
1647       In output mode, the default CSV options when producing CSV are
1648
1649        eol       => "\r\n"
1650
1651       The "fragment" attribute is ignored in output mode.
1652
1653       "out" can be a file name  (e.g.  "file.csv"),  which will be opened for
1654       writing and closed when finished,  a file handle (e.g. $fh or "FH"),  a
1655       reference to a glob (e.g. "\*STDOUT"),  the glob itself (e.g. *STDOUT),
1656       or a reference to a scalar (e.g. "\my $data").
1657
1658        csv (in => sub { $sth->fetch },            out => "dump.csv");
1659        csv (in => sub { $sth->fetchrow_hashref }, out => "dump.csv",
1660             headers => $sth->{NAME_lc});
1661
1662       When a code-ref is used for "in", the output is generated  per
1663       invocation, so no buffering is involved. This implies that there is no
1664       size restriction on the number of records. The "csv" function ends when
1665       the coderef returns a false value.
1666
1667       If "out" is set to a reference of the literal string "skip", the output
1668       will be suppressed completely,  which might be useful in combination
1669       with a filter for side effects only.
1670
1671        my %cache;
1672        csv (in    => "dump.csv",
1673             out   => \"skip",
1674             on_in => sub { $cache{$_[1][1]}++ });
1675
1676       Currently,  setting "out" to any false value  ("undef", "", 0) will be
1677       equivalent to "\"skip"".
1678
1679       encoding
1680
1681       If passed,  it should be an encoding accepted by the  ":encoding()"
1682       option to "open". There is no default value. This attribute does not
1683       work in perl 5.6.x.  "encoding" can be abbreviated to "enc" for ease of
1684       use in command line invocations.
1685
1686       If "encoding" is set to the literal value "auto", the method "header"
1687       will be invoked on the opened stream to check if there is a BOM and set
1688       the encoding accordingly.   This is equal to passing a true value in
1689       the option "detect_bom".
1690
1691       Encodings can be stacked, as supported by "binmode":
1692
1693        # Using PerlIO::via::gzip
1694        csv (in       => \@csv,
1695             out      => "test.csv:via.gz",
1696             encoding => ":via(gzip):encoding(utf-8)",
1697             );
1698        $aoa = csv (in => "test.csv:via.gz",  encoding => ":via(gzip)");
1699
1700        # Using PerlIO::gzip
1701        csv (in       => \@csv,
1702             out      => "test.csv:via.gz",
1703             encoding => ":gzip:encoding(utf-8)",
1704             );
1705        $aoa = csv (in => "test.csv:gzip.gz", encoding => ":gzip");
1706
1707       detect_bom
1708
1709       If  "detect_bom"  is given, the method  "header"  will be invoked on
1710       the opened stream to check if there is a BOM and set the encoding
1711       accordingly.
1712
1713       "detect_bom" can be abbreviated to "bom".
1714
1715       This is the same as setting "encoding" to "auto".
1716
1717       Note that as the method  "header" is invoked,  its default is to also
1718       set the headers.
1719
1720       headers
1721
1722       If this attribute is not given, the default behavior is to produce an
1723       array of arrays.
1724
1725       If "headers" is supplied,  it should be an anonymous list of column
1726       names, an anonymous hashref, a coderef, or a literal flag:  "auto",
1727       "lc", "uc", or "skip".
1728
1729       skip
1730         When "skip" is used, the header will not be included in the output.
1731
1732          my $aoa = csv (in => $fh, headers => "skip");
1733
1734       auto
1735         If "auto" is used, the first line of the "CSV" source will be read as
1736         the list of field headers and used to produce an array of hashes.
1737
1738          my $aoh = csv (in => $fh, headers => "auto");
1739
1740       lc
1741         If "lc" is used,  the first line of the  "CSV" source will be read as
1742         the list of field headers mapped to  lower case and used to produce
1743         an array of hashes. This is a variation of "auto".
1744
1745          my $aoh = csv (in => $fh, headers => "lc");
1746
1747       uc
1748         If "uc" is used,  the first line of the  "CSV" source will be read as
1749         the list of field headers mapped to  upper case and used to produce
1750         an array of hashes. This is a variation of "auto".
1751
1752          my $aoh = csv (in => $fh, headers => "uc");
1753
1754       CODE
1755         If a coderef is used,  the first line of the  "CSV" source will be
1756         read as the list of mangled field headers in which each field is
1757         passed as the only argument to the coderef. This list is used to
1758         produce an array of hashes.
1759
1760          my $aoh = csv (in      => $fh,
1761                         headers => sub { lc ($_[0]) =~ s/kode/code/gr });
1762
1763         this example is a variation of using "lc" where all occurrences of
1764         "kode" are replaced with "code".
1765
1766       ARRAY
1767         If  "headers"  is an anonymous list,  the entries in the list will be
1768         used as field names. The first line is considered data instead of
1769         headers.
1770
1771          my $aoh = csv (in => $fh, headers => [qw( Foo Bar )]);
1772          csv (in => $aoa, out => $fh, headers => [qw( code description price )]);
1773
1774       HASH
1775         If "headers" is an hash reference, this implies "auto", but header
1776         fields for that exist as key in the hashref will be replaced by the
1777         value for that key. Given a CSV file like
1778
1779          post-kode,city,name,id number,fubble
1780          1234AA,Duckstad,Donald,13,"X313DF"
1781
1782         using
1783
1784          csv (headers => { "post-kode" => "pc", "id number" => "ID" }, ...
1785
1786         will return an entry like
1787
1788          { pc     => "1234AA",
1789            city   => "Duckstad",
1790            name   => "Donald",
1791            ID     => "13",
1792            fubble => "X313DF",
1793            }
1794
1795       See also "munge_column_names" and "set_column_names".
1796
1797       munge_column_names
1798
1799       If "munge_column_names" is set,  the method  "header"  is invoked on
1800       the opened stream with all matching arguments to detect and set the
1801       headers.
1802
1803       "munge_column_names" can be abbreviated to "munge".
1804
1805       key
1806
1807       If passed,  will default  "headers"  to "auto" and return a hashref
1808       instead of an array of hashes. Allowed values are simple scalars or
1809       array-references where the first element is the joiner and the rest are
1810       the fields to join to combine the key.
1811
1812        my $ref = csv (in => "test.csv", key => "code");
1813        my $ref = csv (in => "test.csv", key => [ ":" => "code", "color" ]);
1814
1815       with test.csv like
1816
1817        code,product,price,color
1818        1,pc,850,gray
1819        2,keyboard,12,white
1820        3,mouse,5,black
1821
1822       the first example will return
1823
1824         { 1   => {
1825               code    => 1,
1826               color   => 'gray',
1827               price   => 850,
1828               product => 'pc'
1829               },
1830           2   => {
1831               code    => 2,
1832               color   => 'white',
1833               price   => 12,
1834               product => 'keyboard'
1835               },
1836           3   => {
1837               code    => 3,
1838               color   => 'black',
1839               price   => 5,
1840               product => 'mouse'
1841               }
1842           }
1843
1844       the second example will return
1845
1846         { "1:gray"    => {
1847               code    => 1,
1848               color   => 'gray',
1849               price   => 850,
1850               product => 'pc'
1851               },
1852           "2:white"   => {
1853               code    => 2,
1854               color   => 'white',
1855               price   => 12,
1856               product => 'keyboard'
1857               },
1858           "3:black"   => {
1859               code    => 3,
1860               color   => 'black',
1861               price   => 5,
1862               product => 'mouse'
1863               }
1864           }
1865
1866       The "key" attribute can be combined with "headers" for "CSV" date that
1867       has no header line, like
1868
1869        my $ref = csv (
1870            in      => "foo.csv",
1871            headers => [qw( c_foo foo bar description stock )],
1872            key     =>     "c_foo",
1873            );
1874
1875       value
1876
1877       Used to create key-value hashes.
1878
1879       Only allowed when "key" is valid. A "value" can be either a single
1880       column label or an anonymous list of column labels.  In the first case,
1881       the value will be a simple scalar value, in the latter case, it will be
1882       a hashref.
1883
1884        my $ref = csv (in => "test.csv", key   => "code",
1885                                         value => "price");
1886        my $ref = csv (in => "test.csv", key   => "code",
1887                                         value => [ "product", "price" ]);
1888        my $ref = csv (in => "test.csv", key   => [ ":" => "code", "color" ],
1889                                         value => "price");
1890        my $ref = csv (in => "test.csv", key   => [ ":" => "code", "color" ],
1891                                         value => [ "product", "price" ]);
1892
1893       with test.csv like
1894
1895        code,product,price,color
1896        1,pc,850,gray
1897        2,keyboard,12,white
1898        3,mouse,5,black
1899
1900       the first example will return
1901
1902         { 1 => 850,
1903           2 =>  12,
1904           3 =>   5,
1905           }
1906
1907       the second example will return
1908
1909         { 1   => {
1910               price   => 850,
1911               product => 'pc'
1912               },
1913           2   => {
1914               price   => 12,
1915               product => 'keyboard'
1916               },
1917           3   => {
1918               price   => 5,
1919               product => 'mouse'
1920               }
1921           }
1922
1923       the third example will return
1924
1925         { "1:gray"    => 850,
1926           "2:white"   =>  12,
1927           "3:black"   =>   5,
1928           }
1929
1930       the fourth example will return
1931
1932         { "1:gray"    => {
1933               price   => 850,
1934               product => 'pc'
1935               },
1936           "2:white"   => {
1937               price   => 12,
1938               product => 'keyboard'
1939               },
1940           "3:black"   => {
1941               price   => 5,
1942               product => 'mouse'
1943               }
1944           }
1945
1946       keep_headers
1947
1948       When using hashes,  keep the column names into the arrayref passed,  so
1949       all headers are available after the call in the original order.
1950
1951        my $aoh = csv (in => "file.csv", keep_headers => \my @hdr);
1952
1953       This attribute can be abbreviated to "kh" or passed as
1954       "keep_column_names".
1955
1956       This attribute implies a default of "auto" for the "headers" attribute.
1957
1958       fragment
1959
1960       Only output the fragment as defined in the "fragment" method. This
1961       option is ignored when generating "CSV". See "out".
1962
1963       Combining all of them could give something like
1964
1965        use Text::CSV_XS qw( csv );
1966        my $aoh = csv (
1967            in       => "test.txt",
1968            encoding => "utf-8",
1969            headers  => "auto",
1970            sep_char => "|",
1971            fragment => "row=3;6-9;15-*",
1972            );
1973        say $aoh->[15]{Foo};
1974
1975       sep_set
1976
1977       If "sep_set" is set, the method "header" is invoked on the opened
1978       stream to detect and set "sep_char" with the given set.
1979
1980       "sep_set" can be abbreviated to "seps".
1981
1982       Note that as the  "header" method is invoked,  its default is to also
1983       set the headers.
1984
1985       set_column_names
1986
1987       If  "set_column_names" is passed,  the method "header" is invoked on
1988       the opened stream with all arguments meant for "header".
1989
1990       If "set_column_names" is passed as a false value, the content of the
1991       first row is only preserved if the output is AoA:
1992
1993       With an input-file like
1994
1995        bAr,foo
1996        1,2
1997        3,4,5
1998
1999       This call
2000
2001        my $aoa = csv (in => $file, set_column_names => 0);
2002
2003       will result in
2004
2005        [[ "bar", "foo"     ],
2006         [ "1",   "2"       ],
2007         [ "3",   "4",  "5" ]]
2008
2009       and
2010
2011        my $aoa = csv (in => $file, set_column_names => 0, munge => "none");
2012
2013       will result in
2014
2015        [[ "bAr", "foo"     ],
2016         [ "1",   "2"       ],
2017         [ "3",   "4",  "5" ]]
2018
2019   Callbacks
2020       Callbacks enable actions triggered from the inside of Text::CSV_XS.
2021
2022       While most of what this enables  can easily be done in an  unrolled
2023       loop as described in the "SYNOPSIS" callbacks can be used to meet
2024       special demands or enhance the "csv" function.
2025
2026       error
2027          $csv->callbacks (error => sub { $csv->SetDiag (0) });
2028
2029         the "error"  callback is invoked when an error occurs,  but  only
2030         when "auto_diag" is set to a true value. A callback is invoked with
2031         the values returned by "error_diag":
2032
2033          my ($c, $s);
2034
2035          sub ignore3006 {
2036              my ($err, $msg, $pos, $recno, $fldno) = @_;
2037              if ($err == 3006) {
2038                  # ignore this error
2039                  ($c, $s) = (undef, undef);
2040                  Text::CSV_XS->SetDiag (0);
2041                  }
2042              # Any other error
2043              return;
2044              } # ignore3006
2045
2046          $csv->callbacks (error => \&ignore3006);
2047          $csv->bind_columns (\$c, \$s);
2048          while ($csv->getline ($fh)) {
2049              # Error 3006 will not stop the loop
2050              }
2051
2052       after_parse
2053          $csv->callbacks (after_parse => sub { push @{$_[1]}, "NEW" });
2054          while (my $row = $csv->getline ($fh)) {
2055              $row->[-1] eq "NEW";
2056              }
2057
2058         This callback is invoked after parsing with  "getline"  only if no
2059         error occurred.  The callback is invoked with two arguments:   the
2060         current "CSV" parser object and an array reference to the fields
2061         parsed.
2062
2063         The return code of the callback is ignored  unless it is a reference
2064         to the string "skip", in which case the record will be skipped in
2065         "getline_all".
2066
2067          sub add_from_db {
2068              my ($csv, $row) = @_;
2069              $sth->execute ($row->[4]);
2070              push @$row, $sth->fetchrow_array;
2071              } # add_from_db
2072
2073          my $aoa = csv (in => "file.csv", callbacks => {
2074              after_parse => \&add_from_db });
2075
2076         This hook can be used for validation:
2077
2078         FAIL
2079           Die if any of the records does not validate a rule:
2080
2081            after_parse => sub {
2082                $_[1][4] =~ m/^[0-9]{4}\s?[A-Z]{2}$/ or
2083                    die "5th field does not have a valid Dutch zipcode";
2084                }
2085
2086         DEFAULT
2087           Replace invalid fields with a default value:
2088
2089            after_parse => sub { $_[1][2] =~ m/^\d+$/ or $_[1][2] = 0 }
2090
2091         SKIP
2092           Skip records that have invalid fields (only applies to
2093           "getline_all"):
2094
2095            after_parse => sub { $_[1][0] =~ m/^\d+$/ or return \"skip"; }
2096
2097       before_print
2098          my $idx = 1;
2099          $csv->callbacks (before_print => sub { $_[1][0] = $idx++ });
2100          $csv->print (*STDOUT, [ 0, $_ ]) for @members;
2101
2102         This callback is invoked  before printing with  "print"  only if no
2103         error occurred.  The callback is invoked with two arguments:  the
2104         current  "CSV" parser object and an array reference to the fields
2105         passed.
2106
2107         The return code of the callback is ignored.
2108
2109          sub max_4_fields {
2110              my ($csv, $row) = @_;
2111              @$row > 4 and splice @$row, 4;
2112              } # max_4_fields
2113
2114          csv (in => csv (in => "file.csv"), out => *STDOUT,
2115              callbacks => { before print => \&max_4_fields });
2116
2117         This callback is not active for "combine".
2118
2119       Callbacks for csv ()
2120
2121       The "csv" allows for some callbacks that do not integrate in XS
2122       internals but only feature the "csv" function.
2123
2124         csv (in        => "file.csv",
2125              callbacks => {
2126                  filter       => { 6 => sub { $_ > 15 } },    # first
2127                  after_parse  => sub { say "AFTER PARSE";  }, # first
2128                  after_in     => sub { say "AFTER IN";     }, # second
2129                  on_in        => sub { say "ON IN";        }, # third
2130                  },
2131              );
2132
2133         csv (in        => $aoh,
2134              out       => "file.csv",
2135              callbacks => {
2136                  on_in        => sub { say "ON IN";        }, # first
2137                  before_out   => sub { say "BEFORE OUT";   }, # second
2138                  before_print => sub { say "BEFORE PRINT"; }, # third
2139                  },
2140              );
2141
2142       filter
2143         This callback can be used to filter records.  It is called just after
2144         a new record has been scanned.  The callback accepts a:
2145
2146         hashref
2147           The keys are the index to the row (the field name or field number,
2148           1-based) and the values are subs to return a true or false value.
2149
2150            csv (in => "file.csv", filter => {
2151                       3 => sub { m/a/ },       # third field should contain an "a"
2152                       5 => sub { length > 4 }, # length of the 5th field minimal 5
2153                       });
2154
2155            csv (in => "file.csv", filter => { foo => sub { $_ > 4 }});
2156
2157           If the keys to the filter hash contain any character that is not a
2158           digit it will also implicitly set "headers" to "auto"  unless
2159           "headers"  was already passed as argument.  When headers are
2160           active, returning an array of hashes, the filter is not applicable
2161           to the header itself.
2162
2163           All sub results should match, as in AND.
2164
2165           The context of the callback sets  $_ localized to the field
2166           indicated by the filter. The two arguments are as with all other
2167           callbacks, so the other fields in the current row can be seen:
2168
2169            filter => { 3 => sub { $_ > 100 ? $_[1][1] =~ m/A/ : $_[1][6] =~ m/B/ }}
2170
2171           If the context is set to return a list of hashes  ("headers" is
2172           defined), the current record will also be available in the
2173           localized %_:
2174
2175            filter => { 3 => sub { $_ > 100 && $_{foo} =~ m/A/ && $_{bar} < 1000  }}
2176
2177           If the filter is used to alter the content by changing $_,  make
2178           sure that the sub returns true in order not to have that record
2179           skipped:
2180
2181            filter => { 2 => sub { $_ = uc }}
2182
2183           will upper-case the second field, and then skip it if the resulting
2184           content evaluates to false. To always accept, end with truth:
2185
2186            filter => { 2 => sub { $_ = uc; 1 }}
2187
2188         coderef
2189            csv (in => "file.csv", filter => sub { $n++; 0; });
2190
2191           If the argument to "filter" is a coderef,  it is an alias or
2192           shortcut to a filter on column 0:
2193
2194            csv (filter => sub { $n++; 0 });
2195
2196           is equal to
2197
2198            csv (filter => { 0 => sub { $n++; 0 });
2199
2200         filter-name
2201            csv (in => "file.csv", filter => "not_blank");
2202            csv (in => "file.csv", filter => "not_empty");
2203            csv (in => "file.csv", filter => "filled");
2204
2205           These are predefined filters
2206
2207           Given a file like (line numbers prefixed for doc purpose only):
2208
2209            1:1,2,3
2210            2:
2211            3:,
2212            4:""
2213            5:,,
2214            6:, ,
2215            7:"",
2216            8:" "
2217            9:4,5,6
2218
2219           not_blank
2220             Filter out the blank lines
2221
2222             This filter is a shortcut for
2223
2224              filter => { 0 => sub { @{$_[1]} > 1 or
2225                          defined $_[1][0] && $_[1][0] ne "" } }
2226
2227             Due to the implementation,  it is currently impossible to also
2228             filter lines that consists only of a quoted empty field. These
2229             lines are also considered blank lines.
2230
2231             With the given example, lines 2 and 4 will be skipped.
2232
2233           not_empty
2234             Filter out lines where all the fields are empty.
2235
2236             This filter is a shortcut for
2237
2238              filter => { 0 => sub { grep { defined && $_ ne "" } @{$_[1]} } }
2239
2240             A space is not regarded being empty, so given the example data,
2241             lines 2, 3, 4, 5, and 7 are skipped.
2242
2243           filled
2244             Filter out lines that have no visible data
2245
2246             This filter is a shortcut for
2247
2248              filter => { 0 => sub { grep { defined && m/\S/ } @{$_[1]} } }
2249
2250             This filter rejects all lines that not have at least one field
2251             that does not evaluate to the empty string.
2252
2253             With the given example data, this filter would skip lines 2
2254             through 8.
2255
2256         One could also use modules like Types::Standard:
2257
2258          use Types::Standard -types;
2259
2260          my $type   = Tuple[Str, Str, Int, Bool, Optional[Num]];
2261          my $check  = $type->compiled_check;
2262
2263          # filter with compiled check and warnings
2264          my $aoa = csv (
2265             in     => \$data,
2266             filter => {
2267                 0 => sub {
2268                     my $ok = $check->($_[1]) or
2269                         warn $type->get_message ($_[1]), "\n";
2270                     return $ok;
2271                     },
2272                 },
2273             );
2274
2275       after_in
2276         This callback is invoked for each record after all records have been
2277         parsed but before returning the reference to the caller.  The hook is
2278         invoked with two arguments:  the current  "CSV"  parser object  and a
2279         reference to the record.   The reference can be a reference to a
2280         HASH  or a reference to an ARRAY as determined by the arguments.
2281
2282         This callback can also be passed as  an attribute without the
2283         "callbacks" wrapper.
2284
2285       before_out
2286         This callback is invoked for each record before the record is
2287         printed.  The hook is invoked with two arguments:  the current "CSV"
2288         parser object and a reference to the record.   The reference can be a
2289         reference to a  HASH or a reference to an ARRAY as determined by the
2290         arguments.
2291
2292         This callback can also be passed as an attribute  without the
2293         "callbacks" wrapper.
2294
2295         This callback makes the row available in %_ if the row is a hashref.
2296         In this case %_ is writable and will change the original row.
2297
2298       on_in
2299         This callback acts exactly as the "after_in" or the "before_out"
2300         hooks.
2301
2302         This callback can also be passed as an attribute  without the
2303         "callbacks" wrapper.
2304
2305         This callback makes the row available in %_ if the row is a hashref.
2306         In this case %_ is writable and will change the original row. So e.g.
2307         with
2308
2309           my $aoh = csv (
2310               in      => \"foo\n1\n2\n",
2311               headers => "auto",
2312               on_in   => sub { $_{bar} = 2; },
2313               );
2314
2315         $aoh will be:
2316
2317           [ { foo => 1,
2318               bar => 2,
2319               }
2320             { foo => 2,
2321               bar => 2,
2322               }
2323             ]
2324
2325       csv
2326         The function  "csv" can also be called as a method or with an
2327         existing Text::CSV_XS object. This could help if the function is to
2328         be invoked a lot of times and the overhead of creating the object
2329         internally over  and  over again would be prevented by passing an
2330         existing instance.
2331
2332          my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2333
2334          my $aoa = $csv->csv (in => $fh);
2335          my $aoa = csv (in => $fh, csv => $csv);
2336
2337         both act the same. Running this 20000 times on a 20 lines CSV file,
2338         showed a 53% speedup.
2339

INTERNALS

2341       Combine (...)
2342       Parse (...)
2343
2344       The arguments to these internal functions are deliberately not
2345       described or documented in order to enable the  module authors make
2346       changes it when they feel the need for it.  Using them is  highly
2347       discouraged  as  the  API may change in future releases.
2348

EXAMPLES

2350   Reading a CSV file line by line:
2351        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2352        open my $fh, "<", "file.csv" or die "file.csv: $!";
2353        while (my $row = $csv->getline ($fh)) {
2354            # do something with @$row
2355            }
2356        close $fh or die "file.csv: $!";
2357
2358       or
2359
2360        my $aoa = csv (in => "file.csv", on_in => sub {
2361            # do something with %_
2362            });
2363
2364       Reading only a single column
2365
2366        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2367        open my $fh, "<", "file.csv" or die "file.csv: $!";
2368        # get only the 4th column
2369        my @column = map { $_->[3] } @{$csv->getline_all ($fh)};
2370        close $fh or die "file.csv: $!";
2371
2372       with "csv", you could do
2373
2374        my @column = map { $_->[0] }
2375            @{csv (in => "file.csv", fragment => "col=4")};
2376
2377   Parsing CSV strings:
2378        my $csv = Text::CSV_XS->new ({ keep_meta_info => 1, binary => 1 });
2379
2380        my $sample_input_string =
2381            qq{"I said, ""Hi!""",Yes,"",2.34,,"1.09","\x{20ac}",};
2382        if ($csv->parse ($sample_input_string)) {
2383            my @field = $csv->fields;
2384            foreach my $col (0 .. $#field) {
2385                my $quo = $csv->is_quoted ($col) ? $csv->{quote_char} : "";
2386                printf "%2d: %s%s%s\n", $col, $quo, $field[$col], $quo;
2387                }
2388            }
2389        else {
2390            print STDERR "parse () failed on argument: ",
2391                $csv->error_input, "\n";
2392            $csv->error_diag ();
2393            }
2394
2395       Parsing CSV from memory
2396
2397       Given a complete CSV data-set in scalar $data,  generate a list of
2398       lists to represent the rows and fields
2399
2400        # The data
2401        my $data = join "\r\n" => map { join "," => 0 .. 5 } 0 .. 5;
2402
2403        # in a loop
2404        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2405        open my $fh, "<", \$data;
2406        my @foo;
2407        while (my $row = $csv->getline ($fh)) {
2408            push @foo, $row;
2409            }
2410        close $fh;
2411
2412        # a single call
2413        my $foo = csv (in => \$data);
2414
2415   Printing CSV data
2416       The fast way: using "print"
2417
2418       An example for creating "CSV" files using the "print" method:
2419
2420        my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
2421        open my $fh, ">", "foo.csv" or die "foo.csv: $!";
2422        for (1 .. 10) {
2423            $csv->print ($fh, [ $_, "$_" ]) or $csv->error_diag;
2424            }
2425        close $fh or die "$tbl.csv: $!";
2426
2427       The slow way: using "combine" and "string"
2428
2429       or using the slower "combine" and "string" methods:
2430
2431        my $csv = Text::CSV_XS->new;
2432
2433        open my $csv_fh, ">", "hello.csv" or die "hello.csv: $!";
2434
2435        my @sample_input_fields = (
2436            'You said, "Hello!"',   5.67,
2437            '"Surely"',   '',   '3.14159');
2438        if ($csv->combine (@sample_input_fields)) {
2439            print $csv_fh $csv->string, "\n";
2440            }
2441        else {
2442            print "combine () failed on argument: ",
2443                $csv->error_input, "\n";
2444            }
2445        close $csv_fh or die "hello.csv: $!";
2446
2447       Generating CSV into memory
2448
2449       Format a data-set (@foo) into a scalar value in memory ($data):
2450
2451        # The data
2452        my @foo = map { [ 0 .. 5 ] } 0 .. 3;
2453
2454        # in a loop
2455        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1, eol => "\r\n" });
2456        open my $fh, ">", \my $data;
2457        $csv->print ($fh, $_) for @foo;
2458        close $fh;
2459
2460        # a single call
2461        csv (in => \@foo, out => \my $data);
2462
2463   Rewriting CSV
2464       Rewrite "CSV" files with ";" as separator character to well-formed
2465       "CSV":
2466
2467        use Text::CSV_XS qw( csv );
2468        csv (in => csv (in => "bad.csv", sep_char => ";"), out => *STDOUT);
2469
2470       As "STDOUT" is now default in "csv", a one-liner converting a UTF-16
2471       CSV file with BOM and TAB-separation to valid UTF-8 CSV could be:
2472
2473        $ perl -C3 -MText::CSV_XS=csv -we\
2474           'csv(in=>"utf16tab.csv",encoding=>"utf16",sep=>"\t")' >utf8.csv
2475
2476   Dumping database tables to CSV
2477       Dumping a database table can be simple as this (TIMTOWTDI):
2478
2479        my $dbh = DBI->connect (...);
2480        my $sql = "select * from foo";
2481
2482        # using your own loop
2483        open my $fh, ">", "foo.csv" or die "foo.csv: $!\n";
2484        my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\r\n" });
2485        my $sth = $dbh->prepare ($sql); $sth->execute;
2486        $csv->print ($fh, $sth->{NAME_lc});
2487        while (my $row = $sth->fetch) {
2488            $csv->print ($fh, $row);
2489            }
2490
2491        # using the csv function, all in memory
2492        csv (out => "foo.csv", in => $dbh->selectall_arrayref ($sql));
2493
2494        # using the csv function, streaming with callbacks
2495        my $sth = $dbh->prepare ($sql); $sth->execute;
2496        csv (out => "foo.csv", in => sub { $sth->fetch            });
2497        csv (out => "foo.csv", in => sub { $sth->fetchrow_hashref });
2498
2499       Note that this does not discriminate between "empty" values and NULL-
2500       values from the database,  as both will be the same empty field in CSV.
2501       To enable distinction between the two, use "quote_empty".
2502
2503        csv (out => "foo.csv", in => sub { $sth->fetch }, quote_empty => 1);
2504
2505       If the database import utility supports special sequences to insert
2506       "NULL" values into the database,  like MySQL/MariaDB supports "\N",
2507       use a filter or a map
2508
2509        csv (out => "foo.csv", in => sub { $sth->fetch },
2510                            on_in => sub { $_ //= "\\N" for @{$_[1]} });
2511
2512        while (my $row = $sth->fetch) {
2513            $csv->print ($fh, [ map { $_ // "\\N" } @$row ]);
2514            }
2515
2516       note that this will not work as expected when choosing the backslash
2517       ("\") as "escape_char", as that will cause the "\" to need to be
2518       escaped by yet another "\",  which will cause the field to need
2519       quotation and thus ending up as "\\N" instead of "\N". See also
2520       "undef_str".
2521
2522        csv (out => "foo.csv", in => sub { $sth->fetch }, undef_str => "\\N");
2523
2524       these special sequences are not recognized by  Text::CSV_XS  on parsing
2525       the CSV generated like this, but map and filter are your friends again
2526
2527        while (my $row = $csv->getline ($fh)) {
2528            $sth->execute (map { $_ eq "\\N" ? undef : $_ } @$row);
2529            }
2530
2531        csv (in => "foo.csv", filter => { 1 => sub {
2532            $sth->execute (map { $_ eq "\\N" ? undef : $_ } @{$_[1]}); 0; }});
2533
2534   The examples folder
2535       For more extended examples, see the examples/ 1. sub-directory in the
2536       original distribution or the git repository 2.
2537
2538        1. https://github.com/Tux/Text-CSV_XS/tree/master/examples
2539        2. https://github.com/Tux/Text-CSV_XS
2540
2541       The following files can be found there:
2542
2543       parser-xs.pl
2544         This can be used as a boilerplate to parse invalid "CSV"  and parse
2545         beyond (expected) errors alternative to using the "error" callback.
2546
2547          $ perl examples/parser-xs.pl bad.csv >good.csv
2548
2549       csv-check
2550         This is a command-line tool that uses parser-xs.pl  techniques to
2551         check the "CSV" file and report on its content.
2552
2553          $ csv-check files/utf8.csv
2554          Checked files/utf8.csv  with csv-check 1.9
2555          using Text::CSV_XS 1.32 with perl 5.26.0 and Unicode 9.0.0
2556          OK: rows: 1, columns: 2
2557              sep = <,>, quo = <">, bin = <1>, eol = <"\n">
2558
2559       csv2xls
2560         A script to convert "CSV" to Microsoft Excel ("XLS"). This requires
2561         extra modules Date::Calc and Spreadsheet::WriteExcel. The converter
2562         accepts various options and can produce UTF-8 compliant Excel files.
2563
2564       csv2xlsx
2565         A script to convert "CSV" to Microsoft Excel ("XLSX").  This requires
2566         the modules Date::Calc and Spreadsheet::Writer::XLSX.  The converter
2567         does accept various options including merging several "CSV" files
2568         into a single Excel file.
2569
2570       csvdiff
2571         A script that provides colorized diff on sorted CSV files,  assuming
2572         first line is header and first field is the key. Output options
2573         include colorized ANSI escape codes or HTML.
2574
2575          $ csvdiff --html --output=diff.html file1.csv file2.csv
2576
2577       rewrite.pl
2578         A script to rewrite (in)valid CSV into valid CSV files.  Script has
2579         options to generate confusing CSV files or CSV files that conform to
2580         Dutch MS-Excel exports (using ";" as separation).
2581
2582         Script - by default - honors BOM  and auto-detects separation
2583         converting it to default standard CSV with "," as separator.
2584

CAVEATS

2586       Text::CSV_XS  is not designed to detect the characters used to quote
2587       and separate fields.  The parsing is done using predefined  (default)
2588       settings.  In the examples  sub-directory,  you can find scripts  that
2589       demonstrate how you could try to detect these characters yourself.
2590
2591   Microsoft Excel
2592       The import/export from Microsoft Excel is a risky task, according to
2593       the documentation in "Text::CSV::Separator".  Microsoft uses the
2594       system's list separator defined in the regional settings, which happens
2595       to be a semicolon for Dutch, German and Spanish (and probably some
2596       others as well).   For the English locale,  the default is a comma.
2597       In Windows however,  the user is free to choose a  predefined locale,
2598       and then change  every  individual setting in it, so checking the
2599       locale is no solution.
2600
2601       As of version 1.17, a lone first line with just
2602
2603         sep=;
2604
2605       will be recognized and honored when parsing with "getline".
2606

TODO

2608       More Errors & Warnings
2609         New extensions ought to be  clear and concise  in reporting what
2610         error has occurred where and why, and maybe also offer a remedy to
2611         the problem.
2612
2613         "error_diag" is a (very) good start, but there is more work to be
2614         done in this area.
2615
2616         Basic calls  should croak or warn on  illegal parameters.  Errors
2617         should be documented.
2618
2619       setting meta info
2620         Future extensions might include extending the "meta_info",
2621         "is_quoted", and  "is_binary"  to accept setting these  flags for
2622         fields,  so you can specify which fields are quoted in the
2623         "combine"/"string" combination.
2624
2625          $csv->meta_info (0, 1, 1, 3, 0, 0);
2626          $csv->is_quoted (3, 1);
2627
2628         Metadata Vocabulary for Tabular Data
2629         <http://w3c.github.io/csvw/metadata/> (a W3C editor's draft) could be
2630         an example for supporting more metadata.
2631
2632       Parse the whole file at once
2633         Implement new methods or functions  that enable parsing of a
2634         complete file at once, returning a list of hashes. Possible extension
2635         to this could be to enable a column selection on the call:
2636
2637          my @AoH = $csv->parse_file ($filename, { cols => [ 1, 4..8, 12 ]});
2638
2639         Returning something like
2640
2641          [ { fields => [ 1, 2, "foo", 4.5, undef, "", 8 ],
2642              flags  => [ ... ],
2643              },
2644            { fields => [ ... ],
2645              .
2646              },
2647            ]
2648
2649         Note that the "csv" function already supports most of this,  but does
2650         not return flags. "getline_all" returns all rows for an open stream,
2651         but this will not return flags either.  "fragment"  can reduce the
2652         required  rows or columns, but cannot combine them.
2653
2654       Cookbook
2655         Write a document that has recipes for  most known  non-standard  (and
2656         maybe some standard)  "CSV" formats,  including formats that use
2657         "TAB",  ";", "|", or other non-comma separators.
2658
2659         Examples could be taken from W3C's CSV on the Web: Use Cases and
2660         Requirements <http://w3c.github.io/csvw/use-cases-and-
2661         requirements/index.html>
2662
2663       Steal
2664         Steal good new ideas and features from PapaParse
2665         <http://papaparse.com> or csvkit <http://csvkit.readthedocs.org>.
2666
2667       Perl6 support
2668         I'm already working on perl6 support here
2669         <https://github.com/Tux/CSV>. No promises yet on when it is finished
2670         (or fast). Trying to keep the API alike as much as possible.
2671
2672   NOT TODO
2673       combined methods
2674         Requests for adding means (methods) that combine "combine" and
2675         "string" in a single call will not be honored (use "print" instead).
2676         Likewise for "parse" and "fields"  (use "getline" instead), given the
2677         problems with embedded newlines.
2678
2679   Release plan
2680       No guarantees, but this is what I had in mind some time ago:
2681
2682       · DIAGNOSTICS section in pod to *describe* the errors (see below)
2683

EBCDIC

2685       The current hard-coding of characters and character ranges  makes this
2686       code unusable on "EBCDIC" systems. Recent work in perl-5.20 might
2687       change that.
2688
2689       Opening "EBCDIC" encoded files on  "ASCII"+  systems is likely to
2690       succeed using Encode's "cp37", "cp1047", or "posix-bc":
2691
2692        open my $fh, "<:encoding(cp1047)", "ebcdic_file.csv" or die "...";
2693

DIAGNOSTICS

2695       Still under construction ...
2696
2697       If an error occurs,  "$csv->error_diag" can be used to get information
2698       on the cause of the failure. Note that for speed reasons the internal
2699       value is never cleared on success,  so using the value returned by
2700       "error_diag" in normal cases - when no error occurred - may cause
2701       unexpected results.
2702
2703       If the constructor failed, the cause can be found using "error_diag" as
2704       a class method, like "Text::CSV_XS->error_diag".
2705
2706       The "$csv->error_diag" method is automatically invoked upon error when
2707       the contractor was called with  "auto_diag"  set to  1 or 2, or when
2708       autodie is in effect.  When set to 1, this will cause a "warn" with the
2709       error message,  when set to 2, it will "die". "2012 - EOF" is excluded
2710       from "auto_diag" reports.
2711
2712       Errors can be (individually) caught using the "error" callback.
2713
2714       The errors as described below are available. I have tried to make the
2715       error itself explanatory enough, but more descriptions will be added.
2716       For most of these errors, the first three capitals describe the error
2717       category:
2718
2719       · INI
2720
2721         Initialization error or option conflict.
2722
2723       · ECR
2724
2725         Carriage-Return related parse error.
2726
2727       · EOF
2728
2729         End-Of-File related parse error.
2730
2731       · EIQ
2732
2733         Parse error inside quotation.
2734
2735       · EIF
2736
2737         Parse error inside field.
2738
2739       · ECB
2740
2741         Combine error.
2742
2743       · EHR
2744
2745         HashRef parse related error.
2746
2747       And below should be the complete list of error codes that can be
2748       returned:
2749
2750       · 1001 "INI - sep_char is equal to quote_char or escape_char"
2751
2752         The  separation character  cannot be equal to  the quotation
2753         character or to the escape character,  as this would invalidate all
2754         parsing rules.
2755
2756       · 1002 "INI - allow_whitespace with escape_char or quote_char SP or
2757         TAB"
2758
2759         Using the  "allow_whitespace"  attribute  when either "quote_char" or
2760         "escape_char"  is equal to "SPACE" or "TAB" is too ambiguous to
2761         allow.
2762
2763       · 1003 "INI - \r or \n in main attr not allowed"
2764
2765         Using default "eol" characters in either "sep_char", "quote_char",
2766         or  "escape_char"  is  not allowed.
2767
2768       · 1004 "INI - callbacks should be undef or a hashref"
2769
2770         The "callbacks"  attribute only allows one to be "undef" or a hash
2771         reference.
2772
2773       · 1005 "INI - EOL too long"
2774
2775         The value passed for EOL is exceeding its maximum length (16).
2776
2777       · 1006 "INI - SEP too long"
2778
2779         The value passed for SEP is exceeding its maximum length (16).
2780
2781       · 1007 "INI - QUOTE too long"
2782
2783         The value passed for QUOTE is exceeding its maximum length (16).
2784
2785       · 1008 "INI - SEP undefined"
2786
2787         The value passed for SEP should be defined and not empty.
2788
2789       · 1010 "INI - the header is empty"
2790
2791         The header line parsed in the "header" is empty.
2792
2793       · 1011 "INI - the header contains more than one valid separator"
2794
2795         The header line parsed in the  "header"  contains more than one
2796         (unique) separator character out of the allowed set of separators.
2797
2798       · 1012 "INI - the header contains an empty field"
2799
2800         The header line parsed in the "header" is contains an empty field.
2801
2802       · 1013 "INI - the header contains nun-unique fields"
2803
2804         The header line parsed in the  "header"  contains at least  two
2805         identical fields.
2806
2807       · 1014 "INI - header called on undefined stream"
2808
2809         The header line cannot be parsed from an undefined sources.
2810
2811       · 1500 "PRM - Invalid/unsupported argument(s)"
2812
2813         Function or method called with invalid argument(s) or parameter(s).
2814
2815       · 1501 "PRM - The key attribute is passed as an unsupported type"
2816
2817         The "key" attribute is of an unsupported type.
2818
2819       · 1502 "PRM - The value attribute is passed without the key attribute"
2820
2821         The "value" attribute is only allowed when a valid key is given.
2822
2823       · 1503 "PRM - The value attribute is passed as an unsupported type"
2824
2825         The "value" attribute is of an unsupported type.
2826
2827       · 2010 "ECR - QUO char inside quotes followed by CR not part of EOL"
2828
2829         When  "eol"  has  been  set  to  anything  but the  default,  like
2830         "\r\t\n",  and  the  "\r"  is  following  the   second   (closing)
2831         "quote_char", where the characters following the "\r" do not make up
2832         the "eol" sequence, this is an error.
2833
2834       · 2011 "ECR - Characters after end of quoted field"
2835
2836         Sequences like "1,foo,"bar"baz,22,1" are not allowed. "bar" is a
2837         quoted field and after the closing double-quote, there should be
2838         either a new-line sequence or a separation character.
2839
2840       · 2012 "EOF - End of data in parsing input stream"
2841
2842         Self-explaining. End-of-file while inside parsing a stream. Can
2843         happen only when reading from streams with "getline",  as using
2844         "parse" is done on strings that are not required to have a trailing
2845         "eol".
2846
2847       · 2013 "INI - Specification error for fragments RFC7111"
2848
2849         Invalid specification for URI "fragment" specification.
2850
2851       · 2014 "ENF - Inconsistent number of fields"
2852
2853         Inconsistent number of fields under strict parsing.
2854
2855       · 2021 "EIQ - NL char inside quotes, binary off"
2856
2857         Sequences like "1,"foo\nbar",22,1" are allowed only when the binary
2858         option has been selected with the constructor.
2859
2860       · 2022 "EIQ - CR char inside quotes, binary off"
2861
2862         Sequences like "1,"foo\rbar",22,1" are allowed only when the binary
2863         option has been selected with the constructor.
2864
2865       · 2023 "EIQ - QUO character not allowed"
2866
2867         Sequences like ""foo "bar" baz",qu" and "2023,",2008-04-05,"Foo,
2868         Bar",\n" will cause this error.
2869
2870       · 2024 "EIQ - EOF cannot be escaped, not even inside quotes"
2871
2872         The escape character is not allowed as last character in an input
2873         stream.
2874
2875       · 2025 "EIQ - Loose unescaped escape"
2876
2877         An escape character should escape only characters that need escaping.
2878
2879         Allowing  the escape  for other characters  is possible  with the
2880         attribute "allow_loose_escapes".
2881
2882       · 2026 "EIQ - Binary character inside quoted field, binary off"
2883
2884         Binary characters are not allowed by default.    Exceptions are
2885         fields that contain valid UTF-8,  that will automatically be upgraded
2886         if the content is valid UTF-8. Set "binary" to 1 to accept binary
2887         data.
2888
2889       · 2027 "EIQ - Quoted field not terminated"
2890
2891         When parsing a field that started with a quotation character,  the
2892         field is expected to be closed with a quotation character.   When the
2893         parsed line is exhausted before the quote is found, that field is not
2894         terminated.
2895
2896       · 2030 "EIF - NL char inside unquoted verbatim, binary off"
2897
2898       · 2031 "EIF - CR char is first char of field, not part of EOL"
2899
2900       · 2032 "EIF - CR char inside unquoted, not part of EOL"
2901
2902       · 2034 "EIF - Loose unescaped quote"
2903
2904       · 2035 "EIF - Escaped EOF in unquoted field"
2905
2906       · 2036 "EIF - ESC error"
2907
2908       · 2037 "EIF - Binary character in unquoted field, binary off"
2909
2910       · 2110 "ECB - Binary character in Combine, binary off"
2911
2912       · 2200 "EIO - print to IO failed. See errno"
2913
2914       · 3001 "EHR - Unsupported syntax for column_names ()"
2915
2916       · 3002 "EHR - getline_hr () called before column_names ()"
2917
2918       · 3003 "EHR - bind_columns () and column_names () fields count
2919         mismatch"
2920
2921       · 3004 "EHR - bind_columns () only accepts refs to scalars"
2922
2923       · 3006 "EHR - bind_columns () did not pass enough refs for parsed
2924         fields"
2925
2926       · 3007 "EHR - bind_columns needs refs to writable scalars"
2927
2928       · 3008 "EHR - unexpected error in bound fields"
2929
2930       · 3009 "EHR - print_hr () called before column_names ()"
2931
2932       · 3010 "EHR - print_hr () called with invalid arguments"
2933

SEE ALSO

2935       IO::File,  IO::Handle,  IO::Wrap,  Text::CSV,  Text::CSV_PP,
2936       Text::CSV::Encoded,     Text::CSV::Separator,    Text::CSV::Slurp,
2937       Spreadsheet::CSV and Spreadsheet::Read, and of course perl.
2938
2939       If you are using perl6,  you can have a look at  "Text::CSV"  in the
2940       perl6 ecosystem, offering the same features.
2941
2942       non-perl
2943
2944       A CSV parser in JavaScript,  also used by W3C <http://www.w3.org>,  is
2945       the multi-threaded in-browser PapaParse <http://papaparse.com/>.
2946
2947       csvkit <http://csvkit.readthedocs.org> is a python CSV parsing toolkit.
2948

AUTHOR

2950       Alan Citterman <alan@mfgrtl.com> wrote the original Perl module.
2951       Please don't send mail concerning Text::CSV_XS to Alan, who is not
2952       involved in the C/XS part that is now the main part of the module.
2953
2954       Jochen Wiedmann <joe@ispsoft.de> rewrote the en- and decoding in C by
2955       implementing a simple finite-state machine.   He added variable quote,
2956       escape and separator characters, the binary mode and the print and
2957       getline methods. See ChangeLog releases 0.10 through 0.23.
2958
2959       H.Merijn Brand <h.m.brand@xs4all.nl> cleaned up the code,  added the
2960       field flags methods,  wrote the major part of the test suite, completed
2961       the documentation,   fixed most RT bugs,  added all the allow flags and
2962       the "csv" function. See ChangeLog releases 0.25 and on.
2963
2965        Copyright (C) 2007-2020 H.Merijn Brand.  All rights reserved.
2966        Copyright (C) 1998-2001 Jochen Wiedmann. All rights reserved.
2967        Copyright (C) 1997      Alan Citterman.  All rights reserved.
2968
2969       This library is free software;  you can redistribute and/or modify it
2970       under the same terms as Perl itself.
2971
2972
2973
2974perl v5.30.1                      2020-02-16                         CSV_XS(3)
Impressum