CSV_XS(3)             User Contributed Perl Documentation            CSV_XS(3)

NAME

       Text::CSV_XS - comma-separated values manipulation routines

SYNOPSIS

        # Functional interface
        use Text::CSV_XS qw( csv );

        # Read whole file in memory
        my $aoa = csv (in => "data.csv");    # as array of array
        my $aoh = csv (in => "data.csv",
                       headers => "auto");   # as array of hash

        # Write array of arrays as csv file
        csv (in => $aoa, out => "file.csv", sep_char => ";");

        # Only show lines where "code" is odd
        csv (in => "data.csv", filter => { code => sub { $_ % 2 }});


        # Object interface
        use Text::CSV_XS;

        my @rows;
        # Read/parse CSV
        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
        open my $fh, "<:encoding(utf8)", "test.csv" or die "test.csv: $!";
        while (my $row = $csv->getline ($fh)) {
            $row->[2] =~ m/pattern/ or next; # 3rd field should match
            push @rows, $row;
            }
        close $fh;

        # and write as CSV
        open $fh, ">:encoding(utf8)", "new.csv" or die "new.csv: $!";
        $csv->say ($fh, $_) for @rows;
        close $fh or die "new.csv: $!";

DESCRIPTION

       Text::CSV_XS provides facilities for the composition and
       decomposition of comma-separated values.  An instance of the
       Text::CSV_XS class will combine fields into a "CSV" string and parse a
       "CSV" string into fields.

       The module accepts either strings or files as input and supports the
       use of user-specified characters for delimiters, separators, and
       escapes.

   Embedded newlines
       Important Note:  The default behavior is to accept only ASCII
       characters in the range from 0x20 (space) to 0x7E (tilde).  This means
       that the fields can not contain newlines. If your data contains
       newlines embedded in fields, or characters above 0x7E (tilde), or
       binary data, you must set "binary => 1" in the call to "new". To cover
       the widest range of parsing options, you will always want to set
       binary.

       Even then, you still have to pass a complete record to the "parse"
       method, which is harder than it looks in typical usage:

        my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
        while (<>) {           #  WRONG!
            $csv->parse ($_);
            my @fields = $csv->fields ();
            }

       This will break, as the "while" might read broken lines: it does not
       care about the quoting. If you need to support embedded newlines, the
       way to go is to not pass "eol" in the parser (it accepts "\n", "\r",
       and "\r\n" by default) and then

        my $csv = Text::CSV_XS->new ({ binary => 1 });
        open my $fh, "<", $file or die "$file: $!";
        while (my $row = $csv->getline ($fh)) {
            my @fields = @$row;
            }

       The old(er) way of using global file handles is still supported

        while (my $row = $csv->getline (*ARGV)) { ... }

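       As a minimal sketch of the "getline" approach described above, the
       following reads records from an in-memory string instead of a file.
       The sample data and the variable names are assumptions made for
       illustration only.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Sample data (an assumption): the second field of the first record
# contains a newline inside a quoted field.
my $data = qq{1,"two\nlines",3\n4,five,6\n};

open my $fh, "<", \$data or die "in-memory handle: $!";
my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });

my @rows;
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;   # getline follows the quoting across the newline
    }
close $fh;

print scalar @rows, " records read\n";
```

       Because "getline" tracks the quoting itself, the embedded newline
       stays inside the field rather than splitting the record.
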
   Unicode
       Unicode is only tested to work with perl-5.8.2 and up.

       See also "BOM".

       The simplest way to ensure the correct encoding is used for in- and
       output is by either setting layers on the filehandles, or setting the
       "encoding" argument for "csv".

        open my $fh, "<:encoding(UTF-8)", "in.csv"  or die "in.csv: $!";
       or
        my $aoa = csv (in => "in.csv",     encoding => "UTF-8");

        open my $fh, ">:encoding(UTF-8)", "out.csv" or die "out.csv: $!";
       or
        csv (in => $aoa, out => "out.csv", encoding => "UTF-8");

       On parsing (both for "getline" and "parse"), if the source is marked
       as being UTF8, then all fields that are marked binary will also be
       marked UTF8.

       On combining ("print" and "combine"): if any of the combining fields
       was marked UTF8, the resulting string will be marked as UTF8.  Note
       however that any fields before the first field marked UTF8 that
       contain 8-bit characters and were not upgraded to UTF8 will remain
       "bytes" in the resulting string, possibly causing unexpected errors.
       If you pass data of different encodings, or you don't know whether
       the encodings differ, force the data to be upgraded before you pass
       it on:

        $csv->print ($fh, [ map { utf8::upgrade (my $x = $_); $x } @data ]);

       For complete control over encoding, please use Text::CSV::Encoded:

        use Text::CSV::Encoded;
        my $csv = Text::CSV::Encoded->new ({
            encoding_in  => "iso-8859-1", # the encoding comes into   Perl
            encoding_out => "cp1252",     # the encoding comes out of Perl
            });

        $csv = Text::CSV::Encoded->new ({ encoding  => "utf8" });
        # combine () and print () accept *literally* utf8 encoded data
        # parse () and getline () return *literally* utf8 encoded data

        $csv = Text::CSV::Encoded->new ({ encoding  => undef }); # default
        # combine () and print () accept UTF8 marked data
        # parse () and getline () return UTF8 marked data

   BOM
       BOM (or Byte Order Mark) handling is available only inside the
       "header" method.  This method supports the following encodings:
       "utf-8", "utf-1", "utf-32be", "utf-32le", "utf-16be", "utf-16le",
       "utf-ebcdic", "scsu", "bocu-1", and "gb-18030". See Wikipedia
       <https://en.wikipedia.org/wiki/Byte_order_mark>.

       If a file has a BOM, the easiest way to deal with that is

        my $aoh = csv (in => $file, detect_bom => 1);

       All records will be encoded based on the detected BOM.

       This implies a call to the "header" method, which defaults to also
       set the "column_names". So this is not the same as

        my $aoh = csv (in => $file, headers => "auto");

       which only reads the first record to set "column_names" but ignores
       any BOM that may be present.

SPECIFICATION

       While no formal specification for CSV exists, RFC 4180
       <http://tools.ietf.org/html/rfc4180> (1) describes the common format
       and establishes "text/csv" as the MIME type registered with the IANA.
       RFC 7111 <http://tools.ietf.org/html/rfc7111> (2) adds fragments to
       CSV.

       Many informal documents exist that describe the "CSV" format.  "How
       To: The Comma Separated Value (CSV) File Format"
       <http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm> (3) provides an
       overview of the "CSV" format in the most widely used applications and
       explains how it can best be used and supported.

        1) http://tools.ietf.org/html/rfc4180
        2) http://tools.ietf.org/html/rfc7111
        3) http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm

       The basic rules are as follows:

       CSV is a delimited data format that has fields/columns separated by
       the comma character and records/rows separated by newlines. Fields that
       contain a special character (comma, newline, or double quote) must be
       enclosed in double quotes. However, if a line contains a single entry
       that is the empty string, it may be enclosed in double quotes.  If a
       field's value contains a double quote character it is escaped by
       placing another double quote character next to it. The "CSV" file
       format does not require a specific character encoding, byte order, or
       line terminator format.

       · Each record is a single line ended by a line feed (ASCII/"LF"=0x0A)
         or a carriage return and line feed pair (ASCII/"CRLF"="0x0D 0x0A"),
         however, line-breaks may be embedded.

       · Fields are separated by commas.

       · Allowable characters within a "CSV" field include 0x09 ("TAB") and
         the inclusive range of 0x20 (space) through 0x7E (tilde).  In binary
         mode all characters are accepted, at least in quoted fields.

       · A field within "CSV" must be surrounded by double-quotes to
         contain a separator character (comma).

       Though this is the most clear and restrictive definition, Text::CSV_XS
       is way more liberal than this, and allows extension:

       · Line termination by a single carriage return is accepted by default

       · The separation-, quote-, and escape- characters can be any ASCII
         character in the range from 0x20 (space) to 0x7E (tilde).
         Characters outside this range may or may not work as expected.
         Multibyte characters, like UTF "U+060C" (ARABIC COMMA), "U+FF0C"
         (FULLWIDTH COMMA), "U+241B" (SYMBOL FOR ESCAPE), "U+2424" (SYMBOL
         FOR NEWLINE), "U+FF02" (FULLWIDTH QUOTATION MARK), and "U+201C" (LEFT
         DOUBLE QUOTATION MARK) (to give some examples of what might look
         promising) work for newer versions of perl for "sep_char", and
         "quote_char" but not for "escape_char".

         If you use perl-5.8.2 or higher these three attributes are
         utf8-decoded, to increase the likelihood of success. This way
         "U+00FE" will be allowed as a quote character.

       · A field in "CSV" must be surrounded by double-quotes to make an
         embedded double-quote, represented by a pair of consecutive double-
         quotes, valid. In binary mode you may additionally use the sequence
         ""0" for representation of a NULL byte. Using 0x00 in binary mode is
         just as valid.

       · Several violations of the above specification may be lifted by
         passing some options as attributes to the object constructor.

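       The quoting and escaping rules above can be illustrated with a short
       round trip through "combine" and "parse".  This is only a sketch; the
       field values are made up for the example.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ auto_diag => 1 });

# Hypothetical fields: one with a comma, one with double quotes, one plain
$csv->combine ("a,b", 'say "hi"', "plain") or die "combine failed";
my $line = $csv->string;
print "$line\n";

# Parsing the generated line gives back the original fields
$csv->parse ($line) or die "parse failed";
my @fields = $csv->fields;
print scalar @fields, " fields\n";
```

       The field with a comma and the field with double quotes are enclosed
       in double quotes, and each embedded double quote is doubled; the
       plain field is left unquoted.
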

METHODS

   version
       (Class method) Returns the current module version.

   new
       (Class method) Returns a new instance of class Text::CSV_XS. The
       attributes are described by the (optional) hash ref "\%attr".

        my $csv = Text::CSV_XS->new ({ attributes ... });

       The following attributes are available:

       eol

        my $csv = Text::CSV_XS->new ({ eol => $/ });
                  $csv->eol (undef);
        my $eol = $csv->eol;

       The end-of-line string to add to rows for "print" or the record
       separator for "getline".

       When not passed in a parser instance, the default behavior is to
       accept "\n", "\r", and "\r\n", so it is probably safer to not specify
       "eol" at all. Passing "undef" or the empty string behaves the same.

       When not passed in a generating instance, records are not terminated
       at all, so it is probably wise to pass something you expect. A safe
       choice for "eol" on output is either $/ or "\r\n".

       Common values for "eol" are "\012" ("\n" or Line Feed), "\015\012"
       ("\r\n" or Carriage Return, Line Feed), and "\015" ("\r" or Carriage
       Return). The "eol" attribute cannot exceed 7 (ASCII) characters.

       If both $/ and "eol" equal "\015", lines that end in only a Carriage
       Return without Line Feed will be parsed correctly.

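       A small sketch of "eol" on a generating instance, writing to an
       in-memory handle so the terminators can be inspected.  The handle and
       the row values are assumptions for illustration.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $out = "";
open my $fh, ">", \$out or die "in-memory handle: $!";

# Terminate every generated record with CR LF
my $csv = Text::CSV_XS->new ({ eol => "\r\n" });
$csv->print ($fh, [ 1, "two",  3 ]) or die "print failed";
$csv->print ($fh, [ 4, "five", 6 ]) or die "print failed";
close $fh;

# Make the line endings visible when printing
(my $visible = $out) =~ s/\r\n/<CRLF>/g;
print "$visible\n";
```
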
       sep_char

        my $csv = Text::CSV_XS->new ({ sep_char => ";" });
                $csv->sep_char (";");
        my $c = $csv->sep_char;

       The char used to separate fields, by default a comma (",").  Limited
       to a single-byte character, usually in the range from 0x20 (space) to
       0x7E (tilde). When longer sequences are required, use "sep".

       The separation character can not be equal to the quote character or to
       the escape character.

       See also "CAVEATS"

       sep

        my $csv = Text::CSV_XS->new ({ sep => "\N{FULLWIDTH COMMA}" });
                  $csv->sep (";");
        my $sep = $csv->sep;

       The chars used to separate fields, by default undefined. Limited to 8
       bytes.

       When set, overrules "sep_char".  If its length is one byte it acts as
       an alias to "sep_char".

       See also "CAVEATS"

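       As an illustration of "sep_char", the sketch below parses and
       regenerates a semicolon-separated line.  The sample line is an
       assumption; note that the comma is an ordinary character once it is
       no longer the separator.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ sep_char => ";" });

# The quoted field contains the separator, the last field a plain comma
$csv->parse (q{1;"a;b";3,4}) or die "parse failed";
my @fields = $csv->fields;

$csv->combine (@fields) or die "combine failed";
my $line = $csv->string;
print "$line\n";
```
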
       quote_char

        my $csv = Text::CSV_XS->new ({ quote_char => "'" });
                $csv->quote_char (undef);
        my $c = $csv->quote_char;

       The character to quote fields containing blanks or binary data, by
       default the double quote character (""").  A value of undef suppresses
       quote chars (for simple cases only). Limited to a single-byte
       character, usually in the range from 0x20 (space) to 0x7E (tilde).
       When longer sequences are required, use "quote".

       "quote_char" can not be equal to "sep_char".

       quote

        my $csv = Text::CSV_XS->new ({ quote => "\N{FULLWIDTH QUOTATION MARK}" });
                    $csv->quote ("'");
        my $quote = $csv->quote;

       The chars used to quote fields, by default undefined. Limited to 8
       bytes.

       When set, overrules "quote_char". If its length is one byte it acts as
       an alias to "quote_char".

       See also "CAVEATS"

       escape_char

        my $csv = Text::CSV_XS->new ({ escape_char => "\\" });
                $csv->escape_char (":");
        my $c = $csv->escape_char;

       The character to escape certain characters inside quoted fields.
       This is limited to a single-byte character, usually in the range
       from 0x20 (space) to 0x7E (tilde).

       The "escape_char" defaults to being the double-quote mark ("""). In
       other words the same as the default "quote_char". This means that
       doubling the quote mark in a field escapes it:

        "foo","bar","Escape ""quote mark"" with two ""quote marks""","baz"

       If you change the "quote_char" without changing the "escape_char",
       the "escape_char" will still be the double-quote (""").  If instead
       you want to escape the "quote_char" by doubling it you will need to
       also change the "escape_char" to be the same as what you have changed
       the "quote_char" to.

       Setting "escape_char" to <undef> or "" will disable escaping completely
       and is greatly discouraged. This will also disable "escape_null".

       The escape character can not be equal to the separation character.

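       The interaction between "quote_char" and "escape_char" can be
       sketched as follows: both are set to a single quote, so an embedded
       single quote is escaped by doubling it.  The field values are
       hypothetical.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Change both attributes so that doubling the quote escapes it
my $csv = Text::CSV_XS->new ({ quote_char => "'", escape_char => "'" });

$csv->combine ("it's", "plain") or die "combine failed";
my $line = $csv->string;
print "$line\n";

$csv->parse ($line) or die "parse failed";
my @fields = $csv->fields;
```
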
       binary

        my $csv = Text::CSV_XS->new ({ binary => 1 });
                $csv->binary (0);
        my $f = $csv->binary;

       If this attribute is 1, you may use binary characters in quoted
       fields, including line feeds, carriage returns and "NULL" bytes. (The
       latter could be escaped as ""0".) By default this feature is off.

       If a string is marked UTF8, "binary" will be turned on automatically
       when binary characters other than "CR" and "NL" are encountered.  Note
       that a simple string like "\x{00a0}" might still be binary, but not
       marked UTF8, so setting "{ binary => 1 }" is still a wise option.

       strict

        my $csv = Text::CSV_XS->new ({ strict => 1 });
                $csv->strict (0);
        my $f = $csv->strict;

       If this attribute is set to 1, any row that parses to a different
       number of fields than the previous row will cause the parser to throw
       error 2014.

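       A sketch of how "strict" might be observed in practice, assuming the
       records are read with "getline" and the error is inspected with
       "error_diag" in list context.  The three-line sample input is an
       assumption; its last record has fewer fields than the others.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $data = "a,b,c\n1,2,3\n4,5\n";   # last record has only two fields
open my $fh, "<", \$data or die "in-memory handle: $!";

my $csv = Text::CSV_XS->new ({ strict => 1 });
my @rows;
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;
    }
# The loop stops at the short record; fetch the error that caused it
my ($code, $message) = $csv->error_diag;
close $fh;

print "read ", scalar @rows, " rows, then error $code\n";
```
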
       formula_handling

       formula

        my $csv = Text::CSV_XS->new ({ formula => "none" });
                $csv->formula ("none");
        my $f = $csv->formula;

       This defines the behavior of fields containing formulas. As formulas
       are considered dangerous in spreadsheets, this attribute can define an
       optional action to be taken if a field starts with an equal sign ("=").

       For purpose of code-readability, this can also be written as

        my $csv = Text::CSV_XS->new ({ formula_handling => "none" });
                $csv->formula_handling ("none");
        my $f = $csv->formula_handling;

       Possible values for this attribute are

       none
         Take no specific action. This is the default.

          $csv->formula ("none");

       die
         Cause the process to "die" whenever a leading "=" is encountered.

          $csv->formula ("die");

       croak
         Cause the process to "croak" whenever a leading "=" is encountered.
         (See Carp)

          $csv->formula ("croak");

       diag
         Report position and content of the field whenever a leading "=" is
         found.  The value of the field is unchanged.

          $csv->formula ("diag");

       empty
         Replace the content of fields that start with a "=" with the empty
         string.

          $csv->formula ("empty");
          $csv->formula ("");

       undef
         Replace the content of fields that start with a "=" with "undef".

          $csv->formula ("undef");
          $csv->formula (undef);

       a callback
         Modify the content of fields that start with a "=" with the return-
         value of the callback.  The original content of the field is
         available inside the callback as $_;

          # Replace all formulas with 42
          $csv->formula (sub { 42; });

          # same as $csv->formula ("empty") but slower
          $csv->formula (sub { "" });

          # Allow =4+12
          $csv->formula (sub { s/^=(\d+\+\d+)$/$1/eer });

          # Allow more complex calculations
          $csv->formula (sub { eval { s{^=([-+*/0-9()]+)$}{$1}ee }; $_ });

       All other values will give a warning and then fall back to "diag".

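       To make the difference between two of the values above concrete, the
       sketch below parses the same line with "none" and with "empty".  The
       input line and the helper name "parse_with" are assumptions for this
       example only.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $line = "1,=2+3,ok";   # hypothetical input with a formula-like field

# Hypothetical helper: parse $line with the given formula setting
sub parse_with {
    my $mode = shift;
    my $csv  = Text::CSV_XS->new ({ formula => $mode });
    $csv->parse ($line) or die "parse failed";
    return [ $csv->fields ];
    }

my $kept    = parse_with ("none");    # field is left untouched
my $emptied = parse_with ("empty");   # field is replaced by ""

print "none:  ", join ("|", @$kept),    "\n";
print "empty: ", join ("|", @$emptied), "\n";
```
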
       decode_utf8

        my $csv = Text::CSV_XS->new ({ decode_utf8 => 1 });
                $csv->decode_utf8 (0);
        my $f = $csv->decode_utf8;

       This attribute defaults to TRUE.

       While parsing, fields that are valid UTF-8 are automatically set to
       be UTF-8, so that

         $csv->parse ("\xC4\xA8\n");

       results in

         PV("\304\250"\0) [UTF8 "\x{128}"]

       Sometimes it might not be a desired action.  To prevent those upgrades,
       set this attribute to false, and the result will be

         PV("\304\250"\0)

       auto_diag

        my $csv = Text::CSV_XS->new ({ auto_diag => 1 });
                $csv->auto_diag (2);
        my $l = $csv->auto_diag;

       Setting this attribute to a number between 1 and 9 causes "error_diag"
       to be automatically called in void context upon errors.

       In case of error "2012 - EOF", this call will be void.

       If "auto_diag" is set to a numeric value greater than 1, it will "die"
       on errors instead of "warn".  If set to anything unrecognized, it will
       be silently ignored.

       Future extensions to this feature will include more reliable auto-
       detection of "autodie" being active in the scope in which the error
       occurred, which will increment the value of "auto_diag" with 1 the
       moment the error is detected.

       diag_verbose

        my $csv = Text::CSV_XS->new ({ diag_verbose => 1 });
                $csv->diag_verbose (2);
        my $l = $csv->diag_verbose;

       Set the verbosity of the output triggered by "auto_diag".  Currently
       only adds the current input-record-number (if known) to the
       diagnostic output with an indication of the position of the error.

       blank_is_undef

        my $csv = Text::CSV_XS->new ({ blank_is_undef => 1 });
                $csv->blank_is_undef (0);
        my $f = $csv->blank_is_undef;

       Under normal circumstances, "CSV" data makes no distinction between
       quoted- and unquoted empty fields.  These both end up in an empty
       string field once read, thus

        1,"",," ",2

       is read as

        ("1", "", "", " ", "2")

       When writing "CSV" files with either "always_quote" or "quote_empty"
       set, the unquoted empty field is the result of an undefined value.
       To enable this distinction when reading "CSV" data, the
       "blank_is_undef" attribute will cause unquoted empty fields to be set
       to "undef", causing the above to be parsed as

        ("1", "", undef, " ", "2")

       Note that this is specifically important when loading "CSV" fields
       into a database that allows "NULL" values, as the perl equivalent for
       "NULL" is "undef" in DBI land.

       empty_is_undef

        my $csv = Text::CSV_XS->new ({ empty_is_undef => 1 });
                $csv->empty_is_undef (0);
        my $f = $csv->empty_is_undef;

       Going one step further than "blank_is_undef", this attribute
       converts all empty fields to "undef", so

        1,"",," ",2

       is read as

        (1, undef, undef, " ", 2)

       Note that this affects only fields that are originally empty, not
       fields that are empty after stripping allowed whitespace. YMMV.

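       The two attributes above can be compared side by side with a short
       sketch that parses the same sample line under each setting.  The
       loop and the %got hash are assumptions made for the illustration.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $line = q{1,"",," ",2};
my %got;

for my $attr (qw( blank_is_undef empty_is_undef )) {
    my $csv = Text::CSV_XS->new ({ $attr => 1 });
    $csv->parse ($line) or die "parse failed";
    # Show undefined fields as the bare word undef, others quoted
    $got{$attr} = join ", ",
        map { defined $_ ? qq{"$_"} : "undef" } $csv->fields;
    print "$attr: ($got{$attr})\n";
    }
```
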
       allow_whitespace

        my $csv = Text::CSV_XS->new ({ allow_whitespace => 1 });
                $csv->allow_whitespace (0);
        my $f = $csv->allow_whitespace;

       When this option is set to true, the whitespace ("TAB"'s and
       "SPACE"'s) surrounding the separation character is removed when
       parsing.  If either "TAB" or "SPACE" is one of the three characters
       "sep_char", "quote_char", or "escape_char" it will not be considered
       whitespace.

       Now lines like:

        1 , "foo" , bar , 3 , zapp

       are parsed as valid "CSV", even though it violates the "CSV" specs.

       Note that all whitespace is stripped from both start and end of
       each field.  That would make it more than a feature to enable parsing
       bad "CSV" lines, as

        1,   2.0,  3,   ape  , monkey

       will now be parsed as

        ("1", "2.0", "3", "ape", "monkey")

       even if the original line was perfectly acceptable "CSV".

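       A minimal sketch of the stripping described above, using the first
       sample line from this section.  The variable names are assumptions.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ allow_whitespace => 1 });

# Whitespace around the separators and around the quotes is removed
$csv->parse (q{1 , "foo" , bar , 3 , zapp}) or die "parse failed";
my @fields = $csv->fields;

print join ("|", @fields), "\n";
```
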
       allow_loose_quotes

        my $csv = Text::CSV_XS->new ({ allow_loose_quotes => 1 });
                $csv->allow_loose_quotes (0);
        my $f = $csv->allow_loose_quotes;

       By default, parsing unquoted fields containing "quote_char" characters
       like

        1,foo "bar" baz,42

       would result in parse error 2034.  Though it is still bad practice to
       allow this format, we cannot help the fact that some vendors make
       their applications spit out lines styled this way.

       If there is really bad "CSV" data, like

        1,"foo "bar" baz",42

       or

        1,""foo bar baz"",42

       there is a way to get this data-line parsed and leave the quotes inside
       the quoted field as-is.  This can be achieved by setting
       "allow_loose_quotes" AND making sure that the "escape_char" is not
       equal to "quote_char".

       allow_loose_escapes

        my $csv = Text::CSV_XS->new ({ allow_loose_escapes => 1 });
                $csv->allow_loose_escapes (0);
        my $f = $csv->allow_loose_escapes;

       Parsing fields that have "escape_char" characters that escape
       characters that do not need to be escaped, like:

        my $csv = Text::CSV_XS->new ({ escape_char => "\\" });
        $csv->parse (qq{1,"my bar\'s",baz,42});

       would result in parse error 2025.  Though it is bad practice to allow
       this format, this attribute enables you to treat all escape character
       sequences equal.

       allow_unquoted_escape

        my $csv = Text::CSV_XS->new ({ allow_unquoted_escape => 1 });
                $csv->allow_unquoted_escape (0);
        my $f = $csv->allow_unquoted_escape;

       A backward compatibility issue where "escape_char" differs from
       "quote_char" prevents "escape_char" from being in the first position
       of a field.  If "quote_char" is equal to the default """ and
       "escape_char" is set to "\", this would be illegal:

        1,\0,2

       Setting this attribute to 1 might help to overcome issues with
       backward compatibility and allow this style.

       always_quote

        my $csv = Text::CSV_XS->new ({ always_quote => 1 });
                $csv->always_quote (0);
        my $f = $csv->always_quote;

       By default the generated fields are quoted only if they need to be.
       For example, if they contain the separator character. If you set this
       attribute to 1 then all defined fields will be quoted. ("undef" fields
       are not quoted, see "blank_is_undef"). This makes it quite often easier
       to handle exported data in external applications.  (Poor creatures who
       are better to use Text::CSV_XS. :)

       quote_space

        my $csv = Text::CSV_XS->new ({ quote_space => 1 });
                $csv->quote_space (0);
        my $f = $csv->quote_space;

       By default, a space in a field would trigger quotation.  As no rule
       in "CSV" requires this, nor forbids it, the default is true for
       safety.  You can exclude the space from this trigger by setting this
       attribute to 0.

       quote_empty

        my $csv = Text::CSV_XS->new ({ quote_empty => 1 });
                $csv->quote_empty (0);
        my $f = $csv->quote_empty;

       By default the generated fields are quoted only if they need to be.
       An empty (defined) field does not need quotation. If you set this
       attribute to 1 then empty defined fields will be quoted.  ("undef"
       fields are not quoted, see "blank_is_undef"). See also "always_quote".

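       The three quoting attributes above can be compared on a single row.
       The following is only a sketch; the row contents and the helper name
       "line_for" are assumptions made for the comparison.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical row: a number, an empty string, undef, and a field with a space
my @row = (1, "", undef, "a b");

# Hypothetical helper: generate one line with the given attributes
sub line_for {
    my %attr = @_;
    my $csv  = Text::CSV_XS->new ({ %attr });
    $csv->combine (@row) or die "combine failed";
    return $csv->string;
    }

my $default = line_for ();
my $always  = line_for (always_quote => 1);
my $empty   = line_for (quote_empty  => 1);
my $nospace = line_for (quote_space  => 0);

printf "%-12s %s\n", $_->[0], $_->[1] for
    [ default => $default ], [ always_quote => $always ],
    [ quote_empty => $empty ], [ "quote_space=0" => $nospace ];
```

       In every variant the undefined field stays unquoted, which is what
       lets a reader with "blank_is_undef" set tell it apart from an empty
       string.
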
       quote_binary

        my $csv = Text::CSV_XS->new ({ quote_binary => 1 });
                $csv->quote_binary (0);
        my $f = $csv->quote_binary;

       By default, all "unsafe" bytes inside a string cause the combined
       field to be quoted.  By setting this attribute to 0, you can disable
       that trigger for bytes >= 0x7F.

       escape_null

        my $csv = Text::CSV_XS->new ({ escape_null => 1 });
                $csv->escape_null (0);
        my $f = $csv->escape_null;

       By default, a "NULL" byte in a field would be escaped. This option
       enables you to treat the "NULL" byte as a simple binary character in
       binary mode (the "{ binary => 1 }" is set).  The default is true.  You
       can prevent "NULL" escapes by setting this attribute to 0.

       When the "escape_char" attribute is set to undefined, this attribute
       will be set to false.

       The default setting will encode "=\x00=" as

        "="0="

       With "escape_null" set to 0, this will result in

        "=\x00="

       The default when using the "csv" function is "false".

       For backward compatibility reasons, the deprecated old name
       "quote_null" is still recognized.

       keep_meta_info

        my $csv = Text::CSV_XS->new ({ keep_meta_info => 1 });
                $csv->keep_meta_info (0);
        my $f = $csv->keep_meta_info;

       By default, the parsing of input records is as simple and fast as
       possible.  However, some parsing information - like quotation of the
       original field - is lost in that process.  Setting this flag to true
       enables retrieving that information after parsing with the methods
       "meta_info", "is_quoted", and "is_binary" described below.  Default is
       false for performance.

       If you set this attribute to a value greater than 9, then you can
       control output quotation style like it was used in the input of the
       last parsed record (unless quotation was added because of other
       reasons).

        my $csv = Text::CSV_XS->new ({
           binary         => 1,
           keep_meta_info => 1,
           quote_space    => 0,
           });

        # parse returns a status; the parsed fields come from fields
        $csv->parse (q{1,,"", ," ",f,"g","h""h",help,"help"})
            or die "parse failed";
        my @row = $csv->fields;

        $csv->print (*STDOUT, \@row);
        # 1,,, , ,f,g,"h""h",help,help
        $csv->keep_meta_info (11);
        $csv->print (*STDOUT, \@row);
        # 1,,"", ," ",f,"g","h""h",help,"help"

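       As a further sketch of what "keep_meta_info" retains, the example
       below asks "is_quoted" about each field of a parsed line.  The sample
       line is an assumption.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ keep_meta_info => 1 });

# Only the second field is quoted in the input
$csv->parse (q{1,"2",three}) or die "parse failed";

# is_quoted takes a zero-based column index
my @quoted = map { $csv->is_quoted ($_) ? 1 : 0 } 0 .. 2;
print "quoted flags: @quoted\n";
```
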
       undef_str

        my $csv = Text::CSV_XS->new ({ undef_str => "\\N" });
                $csv->undef_str (undef);
        my $s = $csv->undef_str;

       This attribute optionally defines the output of undefined fields. The
       value passed is not changed at all, so if it needs quotation, the
       quotation needs to be included in the value of the attribute.  Use with
       caution, as passing a value like ",",,,,""" will for sure mess up
       your output. The default for this attribute is "undef", meaning no
       special treatment.

       This attribute is useful when exporting CSV data to be imported in
       custom loaders, like for MySQL, that recognize special sequences for
       "NULL" data.

       This attribute has no meaning when parsing CSV data.

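       A short sketch of "undef_str" on output, using the "\N" marker from
       the synopsis above.  The row values are hypothetical.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Emit \N for undefined fields, as some bulk loaders expect for NULL
my $csv = Text::CSV_XS->new ({ undef_str => "\\N" });

$csv->combine (1, undef, "x") or die "combine failed";
my $line = $csv->string;
print "$line\n";
```
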
755       verbatim
756
757        my $csv = Text::CSV_XS->new ({ verbatim => 1 });
758                $csv->verbatim (0);
759        my $f = $csv->verbatim;
760
761       This is a quite controversial attribute to set,  but makes some hard
762       things possible.
763
764       The rationale behind this attribute is to tell the parser that the
765       normally special characters newline ("NL") and Carriage Return ("CR")
766       will not be special when this flag is set,  and be dealt with  as being
767       ordinary binary characters. This will ease working with data with
768       embedded newlines.
769
770       When  "verbatim"  is used with  "getline",  "getline"  auto-"chomp"'s
771       every line.
772
773       Imagine a file format like
774
775        M^^Hans^Janssen^Klas 2\n2A^Ja^11-06-2007#\r\n
776
777       where, the line ending is a very specific "#\r\n", and the sep_char is
778       a "^" (caret).   None of the fields is quoted,   but embedded binary
779       data is likely to be present. With the specific line ending, this
780       should not be too hard to detect.
781
       By default,  the parser of Text::CSV_XS only knows "\n" and "\r" as
       legal line endings,  so it has to treat the embedded newline as a real
       "end-of-line"  (it only scans on into the next line if binary is true
       and the newline is inside a quoted field).  With this option, we tell
       "parse" to parse the line as if "\n" is nothing more than a binary
       character.
788
       For "parse" this means that the parser no longer has any notion of
       line endings; "getline" "chomp"s the line ending on reading.
791
792       types
793
794       A set of column types; the attribute is immediately passed to the
795       "types" method.
796
797       callbacks
798
799       See the "Callbacks" section below.
800
801       accessors
802
803       To sum it up,
804
805        $csv = Text::CSV_XS->new ();
806
807       is equivalent to
808
809        $csv = Text::CSV_XS->new ({
810            eol                   => undef, # \r, \n, or \r\n
811            sep_char              => ',',
812            sep                   => undef,
813            quote_char            => '"',
814            quote                 => undef,
815            escape_char           => '"',
816            binary                => 0,
817            decode_utf8           => 1,
818            auto_diag             => 0,
819            diag_verbose          => 0,
820            blank_is_undef        => 0,
821            empty_is_undef        => 0,
822            allow_whitespace      => 0,
823            allow_loose_quotes    => 0,
824            allow_loose_escapes   => 0,
825            allow_unquoted_escape => 0,
826            always_quote          => 0,
827            quote_empty           => 0,
828            quote_space           => 1,
829            escape_null           => 1,
830            quote_binary          => 1,
831            keep_meta_info        => 0,
832            strict                => 0,
833            formula               => 0,
834            verbatim              => 0,
835            undef_str             => undef,
836            types                 => undef,
837            callbacks             => undef,
838            });
839
       For each of the flags mentioned above, an accessor method is available
       with which you can query the current value or change it:
842
843        my $quote = $csv->quote_char;
844        $csv->binary (1);
845
846       It is not wise to change these settings halfway through writing "CSV"
847       data to a stream. If however you want to create a new stream using the
848       available "CSV" object, there is no harm in changing them.
849
850       If the "new" constructor call fails,  it returns "undef",  and makes
851       the fail reason available through the "error_diag" method.
852
853        $csv = Text::CSV_XS->new ({ ecs_char => 1 }) or
854            die "".Text::CSV_XS->error_diag ();
855
856       "error_diag" will return a string like
857
858        "INI - Unknown attribute 'ecs_char'"
859
860   known_attributes
861        @attr = Text::CSV_XS->known_attributes;
862        @attr = Text::CSV_XS::known_attributes;
863        @attr = $csv->known_attributes;
864
865       This method will return an ordered list of all the supported
866       attributes as described above.   This can be useful for knowing what
867       attributes are valid in classes that use or extend Text::CSV_XS.
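
       A minimal sketch of how a wrapper might use "known_attributes" to
       screen options before handing them to "new".  The %opts hash and its
       misspelled "ecs_char" key are made up for illustration.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Build a lookup of every attribute the installed module supports.
my %known = map { $_ => 1 } Text::CSV_XS->known_attributes;

# Hypothetical options, e.g. read from a config file;
# "ecs_char" is a deliberate typo for "escape_char".
my %opts = (sep_char => ";", ecs_char => "\\");

# Collect the keys the constructor would reject.
my @unknown = sort grep { !$known{$_} } keys %opts;
print "Ignoring unknown attributes: @unknown\n" if @unknown;

# Pass on only what the constructor will accept.
delete @opts{@unknown};
my $csv = Text::CSV_XS->new (\%opts)
    or die "" . Text::CSV_XS->error_diag;
```

       Filtering up front avoids the "undef" return from "new" described
       above; whether to warn or die on unknown keys is up to the caller.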
868
869   print
870        $status = $csv->print ($fh, $colref);
871
872       Similar to  "combine" + "string" + "print",  but much more efficient.
873       It expects an array ref as input  (not an array!)  and the resulting
874       string is not really  created,  but  immediately  written  to the  $fh
875       object, typically an IO handle or any other object that offers a
876       "print" method.
877
878       For performance reasons  "print"  does not create a result string,  so
879       all "string", "status", "fields", and "error_input" methods will return
880       undefined information after executing this method.
881
882       If $colref is "undef"  (explicit,  not through a variable argument) and
883       "bind_columns"  was used to specify fields to be printed,  it is
884       possible to make performance improvements, as otherwise data would have
885       to be copied as arguments to the method call:
886
887        $csv->bind_columns (\($foo, $bar));
888        $status = $csv->print ($fh, undef);
889
890       A short benchmark
891
892        my @data = ("aa" .. "zz");
893        $csv->bind_columns (\(@data));
894
895        $csv->print ($fh, [ @data ]);   # 11800 recs/sec
896        $csv->print ($fh,  \@data  );   # 57600 recs/sec
897        $csv->print ($fh,   undef  );   # 48500 recs/sec
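
       As a sketch of the two calling styles discussed above,  the following
       writes to an in-memory handle  (an assumption made so the example
       needs no files);  the variable names are illustrative only.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\n" });

# Write to an in-memory handle so the sketch needs no files.
my $out = "";
open my $fh, ">", \$out or die "in-memory handle: $!";

# Plain form: pass an array ref.
$csv->print ($fh, [ "code", "name" ]) or die "print failed";

# Bound form: bind once, then update the variables and pass a literal undef.
my ($code, $name);
$csv->bind_columns (\($code, $name));
for my $rec ([ 1, "pen" ], [ 2, "ink, blue" ]) {
    ($code, $name) = @$rec;
    $csv->print ($fh, undef) or die "print failed";
    }
close $fh or die "close: $!";

print $out;
```

       The bound form avoids building a fresh list for every record,  which
       is where the speed-up over "[ @data ]" in the benchmark comes from.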
898
899   say
900        $status = $csv->say ($fh, $colref);
901
902       Like "print", but "eol" defaults to "$\".
903
904   print_hr
905        $csv->print_hr ($fh, $ref);
906
907       Provides an easy way  to print a  $ref  (as fetched with "getline_hr")
908       provided the column names are set with "column_names".
909
910       It is just a wrapper method with basic parameter checks over
911
912        $csv->print ($fh, [ map { $ref->{$_} } $csv->column_names ]);
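
       A short sketch of "print_hr" in use,  again writing to an in-memory
       handle for the sake of a self-contained example;  the column names and
       values are made up.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\n" });

my $out = "";
open my $fh, ">", \$out or die "in-memory handle: $!";

# The column order is fixed by column_names, not by the hash.
my @cols = qw( code name price );
$csv->column_names (@cols);

$csv->print    ($fh, \@cols);                                      # header line
$csv->print_hr ($fh, { price => "1.50", code => 1, name => "pen" });
close $fh or die "close: $!";

print $out;
```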
913
914   combine
915        $status = $csv->combine (@fields);
916
917       This method constructs a "CSV" record from  @fields,  returning success
918       or failure.   Failure can result from lack of arguments or an argument
919       that contains an invalid character.   Upon success,  "string" can be
920       called to retrieve the resultant "CSV" string.  Upon failure,  the
921       value returned by "string" is undefined and "error_input" could be
922       called to retrieve the invalid argument.
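
       A minimal sketch of "combine" followed by "string",  using the default
       attributes listed under "new";  the field values are arbitrary.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ();

# A plain field, one with a space, one with quotes, one with the separator.
my @fields = ("a", "b c", 'say "hi"', "1,2");

$csv->combine (@fields)
    or die "combine failed on: " . $csv->error_input;

# No eol is set, so the record carries no line ending.
my $line = $csv->string;
print "$line\n";
```

       With the defaults,  fields containing a space,  the quote character,
       or the separator are quoted,  and embedded quotes are doubled.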
923
924   string
925        $line = $csv->string ();
926
927       This method returns the input to  "parse"  or the resultant "CSV"
928       string of "combine", whichever was called more recently.
929
930   getline
931        $colref = $csv->getline ($fh);
932
       This is the counterpart to  "print",  as "parse"  is the counterpart to
       "combine":  it reads a row from the $fh handle using the "getline"
       method associated with $fh  and parses this row into an array ref.
       This array ref is returned by the function, or "undef" on failure.
       When $fh does not support "getline", you are likely to hit errors.
938
939       When fields are bound with "bind_columns" the return value is a
940       reference to an empty list.
941
       As with "print", the "string", "fields", and "status" methods return
       meaningless values after "getline".
943
944   getline_all
945        $arrayref = $csv->getline_all ($fh);
946        $arrayref = $csv->getline_all ($fh, $offset);
947        $arrayref = $csv->getline_all ($fh, $offset, $length);
948
949       This will return a reference to a list of getline ($fh) results.  In
950       this call, "keep_meta_info" is disabled.  If $offset is negative, as
951       with "splice", only the last  "abs ($offset)" records of $fh are taken
952       into consideration.
953
954       Given a CSV file with 10 lines:
955
956        lines call
957        ----- ---------------------------------------------------------
958        0..9  $csv->getline_all ($fh)         # all
959        0..9  $csv->getline_all ($fh,  0)     # all
960        8..9  $csv->getline_all ($fh,  8)     # start at 8
961        -     $csv->getline_all ($fh,  0,  0) # start at 0 first 0 rows
962        0..4  $csv->getline_all ($fh,  0,  5) # start at 0 first 5 rows
963        4..5  $csv->getline_all ($fh,  4,  2) # start at 4 first 2 rows
964        8..9  $csv->getline_all ($fh, -2)     # last 2 rows
965        6..7  $csv->getline_all ($fh, -4,  2) # first 2 of last  4 rows
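
       The table can be tried out with a sketch like the one below.  It
       assumes ten one-field records held in memory;  "pick" is a
       hypothetical helper that reopens the data for every call.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1 });

# Ten one-field records, "0" through "9".
my $data = join "", map { $_ . "\n" } 0 .. 9;

# Hypothetical helper: reopen the data, select rows by offset and length,
# and return the first field of each selected row.
sub pick {
    my @range = @_;
    open my $fh, "<", \$data or die "in-memory handle: $!";
    my $rows = $csv->getline_all ($fh, @range);
    return join ",", map { $_->[0] } @$rows;
    }

my $middle = pick ( 4, 2);   # start at 4, first 2 rows
my $tail   = pick (-2);      # last 2 rows
my $window = pick (-4, 2);   # first 2 of the last 4 rows

print "$middle / $tail / $window\n";
```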
966
967   getline_hr
968       The "getline_hr" and "column_names" methods work together  to allow you
969       to have rows returned as hashrefs.  You must call "column_names" first
970       to declare your column names.
971
972        $csv->column_names (qw( code name price description ));
973        $hr = $csv->getline_hr ($fh);
974        print "Price for $hr->{name} is $hr->{price} EUR\n";
975
976       "getline_hr" will croak if called before "column_names".
977
       Note that  "getline_hr"  creates a hashref for every row and will be
       much slower than the combined use of "bind_columns"  and "getline",
       which still offers the ease of use of a hashref inside the loop:
981
982        my @cols = @{$csv->getline ($fh)};
983        $csv->column_names (@cols);
984        while (my $row = $csv->getline_hr ($fh)) {
985            print $row->{price};
986            }
987
988       Could easily be rewritten to the much faster:
989
990        my @cols = @{$csv->getline ($fh)};
991        my $row = {};
992        $csv->bind_columns (\@{$row}{@cols});
993        while ($csv->getline ($fh)) {
994            print $row->{price};
995            }
996
       Your mileage may vary with the size of the data and the number of rows.
       With perl-5.14.2 the comparison for a 100_000 line file with 14 columns:
999
1000                   Rate hashrefs getlines
1001        hashrefs 1.00/s       --     -76%
1002        getlines 4.15/s     313%       --
1003
1004   getline_hr_all
1005        $arrayref = $csv->getline_hr_all ($fh);
1006        $arrayref = $csv->getline_hr_all ($fh, $offset);
1007        $arrayref = $csv->getline_hr_all ($fh, $offset, $length);
1008
1009       This will return a reference to a list of   getline_hr ($fh) results.
1010       In this call, "keep_meta_info" is disabled.
1011
1012   parse
1013        $status = $csv->parse ($line);
1014
       This method decomposes a  "CSV"  string into fields,  returning success
       or failure.   Failure can result from a missing argument  or from an
       improperly formatted "CSV" string.   Upon success, "fields" can be
       called to retrieve the decomposed fields.  Upon failure, "fields" will
       return undefined data and  "error_input"  can be called to retrieve the
       invalid argument.

       You may use the "types"  method for setting column types.  See the
       description of "types" below.
1024
1025       The $line argument is supposed to be a simple scalar. Everything else
1026       is supposed to croak and set error 1500.
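
       A minimal sketch of the success and failure paths of "parse".  The
       second input is deliberately malformed (a quoted field that is never
       closed);  the sample strings are made up.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1 });

# Success: retrieve the decomposed fields.
my @fields;
if ($csv->parse ('1,"foo, bar",baz')) {
    @fields = $csv->fields;
    print scalar @fields, " fields, second is '$fields[1]'\n";
    }

# Failure: the quoted field is never terminated.
my $ok = $csv->parse ('1,"unterminated');
unless ($ok) {
    my $bad = $csv->error_input;
    print "parse failed for input: ",
        (defined $bad ? $bad : "(unknown)"), "\n";
    }
```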
1027
1028   fragment
1029       This function tries to implement RFC7111  (URI Fragment Identifiers for
1030       the text/csv Media Type) - http://tools.ietf.org/html/rfc7111
1031
1032        my $AoA = $csv->fragment ($fh, $spec);
1033
1034       In specifications,  "*" is used to specify the last item, a dash ("-")
1035       to indicate a range.   All indices are 1-based:  the first row or
1036       column has index 1. Selections can be combined with the semi-colon
1037       (";").
1038
1039       When using this method in combination with  "column_names",  the
1040       returned reference  will point to a  list of hashes  instead of a  list
1041       of lists.  A disjointed  cell-based combined selection  might return
1042       rows with different number of columns making the use of hashes
1043       unpredictable.
1044
1045        $csv->column_names ("Name", "Age");
1046        my $AoH = $csv->fragment ($fh, "col=3;8");
1047
1048       If the "after_parse" callback is active,  it is also called on every
1049       line parsed and skipped before the fragment.
1050
1051       row
1052          row=4
1053          row=5-7
1054          row=6-*
1055          row=1-2;4;6-*
1056
1057       col
1058          col=2
1059          col=1-3
1060          col=4-*
1061          col=1-2;4;7-*
1062
1063       cell
1064         In cell-based selection, the comma (",") is used to pair row and
1065         column
1066
1067          cell=4,1
1068
1069         The range operator ("-") using "cell"s can be used to define top-left
1070         and bottom-right "cell" location
1071
1072          cell=3,1-4,6
1073
1074         The "*" is only allowed in the second part of a pair
1075
1076          cell=3,2-*,2    # row 3 till end, only column 2
1077          cell=3,2-3,*    # column 2 till end, only row 3
1078          cell=3,2-*,*    # strip row 1 and 2, and column 1
1079
1080         Cells and cell ranges may be combined with ";", possibly resulting in
1081         rows with different number of columns
1082
1083          cell=1,1-2,2;3,3-4,4;1,4;4,1
1084
1085         Disjointed selections will only return selected cells.   The cells
1086         that are not  specified  will  not  be  included  in the  returned
1087         set,  not even as "undef".  As an example given a "CSV" like
1088
1089          11,12,13,...19
1090          21,22,...28,29
1091          :            :
1092          91,...97,98,99
1093
1094         with "cell=1,1-2,2;3,3-4,4;1,4;4,1" will return:
1095
1096          11,12,14
1097          21,22
1098          33,34
1099          41,43,44
1100
         Overlapping cell-specs will return those cells only once, so
         "cell=1,1-3,3;2,2-4,4;2,3;4,2" will return:
1103
1104          11,12,13
1105          21,22,23,24
1106          31,32,33,34
1107          42,43,44
1108
1109       RFC7111 <http://tools.ietf.org/html/rfc7111> does  not  allow different
1110       types of specs to be combined   (either "row" or "col" or "cell").
1111       Passing an invalid fragment specification will croak and set error
1112       2013.
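
       The grid used in the examples above can be reproduced with a sketch
       like this.  It assumes a 9 x 9 grid held in memory;
       "select_fragment" is a hypothetical helper,  not part of the module.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1 });

# Build the grid: row r holds the values r1 .. r9.
my $data = "";
for my $r (1 .. 9) {
    $data .= join (",", map { $r . $_ } 1 .. 9) . "\n";
    }

# Hypothetical helper: apply a fragment spec to a fresh handle and
# flatten the result to "a,b|c,d" for easy inspection.
sub select_fragment {
    my $spec = shift;
    open my $fh, "<", \$data or die "in-memory handle: $!";
    my $aoa = $csv->fragment ($fh, $spec);
    return join "|", map { join ",", @$_ } @$aoa;
    }

my $row   = select_fragment ("row=4");
my $cells = select_fragment ("cell=1,1-2,2;3,3-4,4;1,4;4,1");

print "$row\n$cells\n";
```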
1113
1114   column_names
1115       Set the "keys" that will be used in the  "getline_hr"  calls.  If no
1116       keys (column names) are passed, it will return the current setting as a
1117       list.
1118
1119       "column_names" accepts a list of scalars  (the column names)  or a
1120       single array_ref, so you can pass the return value from "getline" too:
1121
1122        $csv->column_names ($csv->getline ($fh));
1123
1124       "column_names" does no checking on duplicates at all, which might lead
1125       to unexpected results.   Undefined entries will be replaced with the
1126       string "\cAUNDEF\cA", so
1127
1128        $csv->column_names (undef, "", "name", "name");
1129        $hr = $csv->getline_hr ($fh);
1130
1131       Will set "$hr->{"\cAUNDEF\cA"}" to the 1st field,  "$hr->{""}" to the
1132       2nd field, and "$hr->{name}" to the 4th field,  discarding the 3rd
1133       field.
1134
1135       "column_names" croaks on invalid arguments.
1136
1137   header
1138       This method does NOT work in perl-5.6.x
1139
1140       Parse the CSV header and set "sep", column_names and encoding.
1141
1142        my @hdr = $csv->header ($fh);
1143        $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1144        $csv->header ($fh, { detect_bom => 1, munge_column_names => "lc" });
1145
1146       The first argument should be a file handle.
1147
       This method resets some object properties,  as it is supposed to be
       invoked only once per file or stream.  It will leave attributes
       "column_names" and "bound_columns" alone if setting column names is
       disabled. Reading headers on previously processed objects might fail
       on perl-5.8.0 and older.
1153
1154       Assuming that the file opened for parsing has a header, and the header
1155       does not contain problematic characters like embedded newlines,   read
1156       the first line from the open handle then auto-detect whether the header
1157       separates the column names with a character from the allowed separator
1158       list.
1159
1160       If any of the allowed separators matches,  and none of the other
1161       allowed separators match,  set  "sep"  to that  separator  for the
1162       current CSV_XS instance and use it to parse the first line, map those
1163       to lowercase, and use that to set the instance "column_names":
1164
1165        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
1166        open my $fh, "<", "file.csv";
1167        binmode $fh; # for Windows
1168        $csv->header ($fh);
1169        while (my $row = $csv->getline_hr ($fh)) {
1170            ...
1171            }
1172
1173       If the header is empty,  contains more than one unique separator out of
1174       the allowed set,  contains empty fields,   or contains identical fields
1175       (after folding), it will croak with error 1010, 1011, 1012, or 1013
1176       respectively.
1177
1178       If the header contains embedded newlines or is not valid  CSV  in any
1179       other way, this method will croak and leave the parse error untouched.
1180
1181       A successful call to "header"  will always set the  "sep"  of the $csv
1182       object. This behavior can not be disabled.
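
       A sketch of the separator detection described above,  assuming a
       small semicolon-separated sample held in memory;  the data and
       variable names are illustrative.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });

# Semicolon-separated sample with a mixed-case header line.
my $data = "Code;Name;Price\n1;pen;1.50\n";
open my $fh, "<", \$data or die "in-memory handle: $!";

# In list context the (munged) header fields are returned.
my @hdr = $csv->header ($fh);

# The detected single-character separator is now active on the object.
my $sep = $csv->sep_char;

# Column names are set, so rows come back as hashrefs.
my $row = $csv->getline_hr ($fh);

print "header: @hdr, separator: $sep, name: $row->{name}\n";
```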
1183
1184       return value
1185
1186       On error this method will croak.
1187
1188       In list context,  the headers will be returned whether they are used to
1189       set "column_names" or not.
1190
1191       In scalar context, the instance itself is returned.  Note: the values
1192       as found in the header will effectively be  lost if  "set_column_names"
1193       is false.
1194
1195       Options
1196
1197       sep_set
1198          $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1199
1200         The list of legal separators defaults to "[ ";", "," ]" and can be
1201         changed by this option.  As this is probably the most often used
1202         option,  it can be passed on its own as an unnamed argument:
1203
1204          $csv->header ($fh, [ ";", ",", "|", "\t", "::", "\x{2063}" ]);
1205
1206         Multi-byte  sequences are allowed,  both multi-character and
1207         Unicode.  See "sep".
1208
1209       detect_bom
1210          $csv->header ($fh, { detect_bom => 1 });
1211
1212         The default behavior is to detect if the header line starts with a
1213         BOM.  If the header has a BOM, use that to set the encoding of $fh.
1214         This default behavior can be disabled by passing a false value to
1215         "detect_bom".
1216
         Supported encodings from BOM are: UTF-8, UTF-16BE, UTF-16LE,
         UTF-32BE,  and UTF-32LE.  BOMs also exist for UTF-1, UTF-EBCDIC,
         SCSU, BOCU-1,  and GB-18030,  but Encode does not (yet) support
         those.  UTF-7 is not supported.
1221
         If a supported BOM was detected at the start of the stream, it is
         stored in the object attribute "ENCODING".
1224
1225          my $enc = $csv->{ENCODING};
1226
1227         The encoding is used with "binmode" on $fh.
1228
         If the handle was opened in a (correct) encoding,  this method will
         not alter the encoding, as it checks the leading bytes of the first
         line. In case the stream starts with a decoded BOM ("U+FEFF"),
         "{ENCODING}" will be "" (empty) instead of the default "undef".
1233
1234       munge_column_names
1235         This option offers the means to modify the column names into
1236         something that is most useful to the application.   The default is to
1237         map all column names to lower case.
1238
1239          $csv->header ($fh, { munge_column_names => "lc" });
1240
1241         The following values are available:
1242
1243           lc     - lower case
1244           uc     - upper case
1245           db     - valid DB field names
1246           none   - do not change
1247           \%hash - supply a mapping
1248           \&cb   - supply a callback
1249
1250         Lower case
1251            $csv->header ($fh, { munge_column_names => "lc" });
1252
1253           The header is changed to all lower-case
1254
1255            $_ = lc;
1256
1257         Upper case
1258            $csv->header ($fh, { munge_column_names => "uc" });
1259
1260           The header is changed to all upper-case
1261
1262            $_ = uc;
1263
1264         Literal
1265            $csv->header ($fh, { munge_column_names => "none" });
1266
1267         Hash
            $csv->header ($fh, { munge_column_names => { foo => "sombrero" }});
1269
1270           if a value does not exist, the original value is used unchanged
1271
1272         Database
1273            $csv->header ($fh, { munge_column_names => "db" });
1274
1275           - lower-case
1276
1277           - all sequences of non-word characters are replaced with an
1278             underscore
1279
1280           - all leading underscores are removed
1281
1282            $_ = lc (s/\W+/_/gr =~ s/^_+//r);
1283
1284         Callback
1285            $csv->header ($fh, { munge_column_names => sub { fc } });
1286            $csv->header ($fh, { munge_column_names => sub { "column_".$col++ } });
1287            $csv->header ($fh, { munge_column_names => sub { lc (s/\W+/_/gr) } });
1288
1289           As this callback is called in a "map", you can use $_ directly.
1290
1291       set_column_names
1292          $csv->header ($fh, { set_column_names => 1 });
1293
         The default is to set the instance's column names using
         "column_names" if the method is successful,  so subsequent calls to
         "getline_hr" can return a hash. Setting the column names can be
         disabled by passing a false value for this option.
1298
1299         As described in "return value" above, content is lost in scalar
1300         context.
1301
1302       Validation
1303
1304       When receiving CSV files from external sources,  this method can be
1305       used to protect against changes in the layout by restricting to known
1306       headers  (and typos in the header fields).
1307
1308        my %known = (
1309            "record key" => "c_rec",
1310            "rec id"     => "c_rec",
1311            "id_rec"     => "c_rec",
1312            "kode"       => "code",
1313            "code"       => "code",
1314            "vaule"      => "value",
1315            "value"      => "value",
1316            );
1317        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
1318        open my $fh, "<", $source or die "$source: $!";
1319        $csv->header ($fh, { munge_column_names => sub {
1320            s/\s+$//;
1321            s/^\s+//;
1322            $known{lc $_} or die "Unknown column '$_' in $source";
1323            }});
1324        while (my $row = $csv->getline_hr ($fh)) {
1325            say join "\t", $row->{c_rec}, $row->{code}, $row->{value};
1326            }
1327
1328   bind_columns
       Takes a list of scalar references to be used for output with  "print"
       or to store the fields fetched by "getline" in.  When you do not pass
       enough references to store the fetched fields in, "getline" will fail
       with error 3006.  If you pass more than there are fields to return,
       the content of the remaining references is left untouched.
1334
1335        $csv->bind_columns (\$code, \$name, \$price, \$description);
1336        while ($csv->getline ($fh)) {
1337            print "The price of a $name is \x{20ac} $price\n";
1338            }
1339
1340       To reset or clear all column binding, call "bind_columns" with the
1341       single argument "undef". This will also clear column names.
1342
1343        $csv->bind_columns (undef);
1344
1345       If no arguments are passed at all, "bind_columns" will return the list
1346       of current bindings or "undef" if no binds are active.
1347
1348       Note that in parsing with  "bind_columns",  the fields are set on the
1349       fly.  That implies that if the third field of a row causes an error
1350       (or this row has just two fields where the previous row had more),  the
1351       first two fields already have been assigned the values of the current
1352       row, while the rest of the fields will still hold the values of the
1353       previous row.  If you want the parser to fail in these cases, use the
1354       "strict" attribute.
1355
1356   eof
1357        $eof = $csv->eof ();
1358
1359       If "parse" or  "getline"  was used with an IO stream,  this method will
1360       return true (1) if the last call hit end of file,  otherwise it will
1361       return false ('').  This is useful to see the difference between a
1362       failure and end of file.
1363
1364       Note that if the parsing of the last line caused an error,  "eof" is
1365       still true.  That means that if you are not using "auto_diag", an idiom
1366       like
1367
1368        while (my $row = $csv->getline ($fh)) {
1369            # ...
1370            }
1371        $csv->eof or $csv->error_diag;
1372
1373       will not report the error. You would have to change that to
1374
1375        while (my $row = $csv->getline ($fh)) {
1376            # ...
1377            }
1378        +$csv->error_diag and $csv->error_diag;
1379
1380   types
1381        $csv->types (\@tref);
1382
1383       This method is used to force that  (all)  columns are of a given type.
1384       For example, if you have an integer column,  two  columns  with
1385       doubles  and a string column, then you might do a
1386
1387        $csv->types ([Text::CSV_XS::IV (),
1388                      Text::CSV_XS::NV (),
1389                      Text::CSV_XS::NV (),
1390                      Text::CSV_XS::PV ()]);
1391
1392       Column types are used only for decoding columns while parsing,  in
1393       other words by the "parse" and "getline" methods.
1394
1395       You can unset column types by doing a
1396
1397        $csv->types (undef);
1398
1399       or fetch the current type settings with
1400
1401        $types = $csv->types ();
1402
1403       IV  Set field type to integer.
1404
1405       NV  Set field type to numeric/float.
1406
1407       PV  Set field type to string.
1408
1409   fields
1410        @columns = $csv->fields ();
1411
1412       This method returns the input to   "combine"  or the resultant
1413       decomposed fields of a successful "parse", whichever was called more
1414       recently.
1415
1416       Note that the return value is undefined after using "getline", which
1417       does not fill the data structures returned by "parse".
1418
1419   meta_info
1420        @flags = $csv->meta_info ();
1421
1422       This method returns the "flags" of the input to "combine" or the flags
1423       of the resultant  decomposed fields of  "parse",   whichever was called
1424       more recently.
1425
1426       For each field,  a meta_info field will hold  flags that  inform
1427       something about  the  field  returned  by  the  "fields"  method or
1428       passed to  the "combine" method. The flags are bit-wise-"or"'d like:
1429
       0x0001
         The field was quoted.

       0x0002
         The field was binary.
1435
1436       See the "is_***" methods below.
1437
1438   is_quoted
1439        my $quoted = $csv->is_quoted ($column_idx);
1440
1441       Where  $column_idx is the  (zero-based)  index of the column in the
1442       last result of "parse".
1443
1444       This returns a true value  if the data in the indicated column was
1445       enclosed in "quote_char" quotes.  This might be important for fields
1446       where content ",20070108," is to be treated as a numeric value,  and
1447       where ","20070108"," is explicitly marked as character string data.
1448
1449       This method is only valid when "keep_meta_info" is set to a true value.
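
       A minimal sketch of the distinction described above:  with
       "keep_meta_info" enabled,  "is_quoted" tells a bare value from a
       quoted one.  The "number"/"string" labels are an illustrative
       convention,  not something the module defines.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Meta information is only kept when asked for.
my $csv = Text::CSV_XS->new ({ keep_meta_info => 1 });

$csv->parse ('20070108,"20070108"') or die "parse failed";
my @fields = $csv->fields;

# Label every column by whether it was quoted in the input.
my @kind = map { $csv->is_quoted ($_) ? "string" : "number" } 0 .. $#fields;

print "$fields[$_] => $kind[$_]\n" for 0 .. $#fields;
```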
1450
1451   is_binary
1452        my $binary = $csv->is_binary ($column_idx);
1453
1454       Where  $column_idx is the  (zero-based)  index of the column in the
1455       last result of "parse".
1456
1457       This returns a true value if the data in the indicated column contained
1458       any byte in the range "[\x00-\x08,\x10-\x1F,\x7F-\xFF]".
1459
1460       This method is only valid when "keep_meta_info" is set to a true value.
1461
1462   is_missing
1463        my $missing = $csv->is_missing ($column_idx);
1464
1465       Where  $column_idx is the  (zero-based)  index of the column in the
1466       last result of "getline_hr".
1467
1468        $csv->keep_meta_info (1);
1469        while (my $hr = $csv->getline_hr ($fh)) {
1470            $csv->is_missing (0) and next; # This was an empty line
1471            }
1472
       When using  "getline_hr",  it is impossible to tell if the  parsed
       fields are "undef" because they were not filled in the "CSV" stream
       or because they were not read at all, as all the fields defined by
       "column_names" are set in the hash-ref.    If you still need to know if
       all fields in each row are provided, you should enable "keep_meta_info"
       so you can check the flags.
1479
1480       If  "keep_meta_info"  is "false",  "is_missing"  will always return
1481       "undef", regardless of $column_idx being valid or not. If this
1482       attribute is "true" it will return either 0 (the field is present) or 1
1483       (the field is missing).
1484
1485       A special case is the empty line.  If the line is completely empty -
1486       after dealing with the flags - this is still a valid CSV line:  it is a
1487       record of just one single empty field. However, if "keep_meta_info" is
1488       set, invoking "is_missing" with index 0 will now return true.
1489
1490   status
1491        $status = $csv->status ();
1492
1493       This method returns the status of the last invoked "combine" or "parse"
1494       call. Status is success (true: 1) or failure (false: "undef" or 0).
1495
1496   error_input
1497        $bad_argument = $csv->error_input ();
1498
1499       This method returns the erroneous argument (if it exists) of "combine"
1500       or "parse",  whichever was called more recently.  If the last
1501       invocation was successful, "error_input" will return "undef".
1502
1503   error_diag
1504        Text::CSV_XS->error_diag ();
1505        $csv->error_diag ();
1506        $error_code               = 0  + $csv->error_diag ();
1507        $error_str                = "" . $csv->error_diag ();
1508        ($cde, $str, $pos, $rec, $fld) = $csv->error_diag ();
1509
1510       If (and only if) an error occurred,  this function returns  the
1511       diagnostics of that error.
1512
1513       If called in void context,  this will print the internal error code and
1514       the associated error message to STDERR.
1515
       If called in list context,  this will return  the error code  and the
       error message in that order.  If the last error was from parsing, the
       rest of the values returned are a best guess at the location  within
       the line  that was being parsed. Their values are 1-based.  The
       position currently is the index of the byte at which the parsing
       failed in the current record. It might change to be the index of the
       current character in a later release. The record is the index of the
       record parsed by the csv instance. The field number is the index of
       the field the parser thinks it is currently  trying to  parse. See
       examples/csv-check for how this can be used.
1526
1527       If called in  scalar context,  it will return  the diagnostics  in a
1528       single scalar, a-la $!.  It will contain the error code in numeric
1529       context, and the diagnostics message in string context.
1530
1531       When called as a class method or a  direct function call,  the
1532       diagnostics are that of the last "new" call.
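
       The calling contexts described above can be compared with a sketch
       like the following,  which provokes a parse error with a deliberately
       malformed line.  No specific error code is assumed.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1 });

# Provoke a parse error: the quoted field is never terminated.
$csv->parse ('1,"unterminated') and die "expected a parse failure";

# List context: code, message, and a best guess at the location.
my ($code, $str, $pos, $rec, $fld) = $csv->error_diag;

# Scalar context: one value that is numeric or string as needed.
my $num = 0  + $csv->error_diag;
my $msg = "" . $csv->error_diag;

print "error $code: $str\n";
```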
1533
1534   record_number
1535        $recno = $csv->record_number ();
1536
       Returns the number of records parsed by this csv instance.  This value
       should be more accurate than $. when embedded newlines come into play.
       Records written by this instance are not counted.
1540
1541   SetDiag
1542        $csv->SetDiag (0);
1543
1544       Use to reset the diagnostics if you are dealing with errors.
1545

FUNCTIONS

1547   csv
1548       This function is not exported by default and should be explicitly
1549       requested:
1550
1551        use Text::CSV_XS qw( csv );
1552
1553       This is an high-level function that aims at simple (user) interfaces.
1554       This can be used to read/parse a "CSV" file or stream (the default
1555       behavior) or to produce a file or write to a stream (define the  "out"
1556       attribute).  It returns an array- or hash-reference on parsing (or
1557       "undef" on fail) or the numeric value of  "error_diag"  on writing.
1558       When this function fails you can get to the error using the class call
1559       to "error_diag":
1560
1561        my $aoa = csv (in => "test.csv") or
1562            die Text::CSV_XS->error_diag;
1563
1564       This function takes the arguments as key-value pairs. This can be
1565       passed as a list or as an anonymous hash:
1566
1567        my $aoa = csv (  in => "test.csv", sep_char => ";");
1568        my $aoh = csv ({ in => $fh, headers => "auto" });
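
       A self-contained sketch of both calling styles. The data is a
       hypothetical semicolon-separated string passed as a scalar reference,
       so no file is needed:

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# Illustrative semicolon-separated data held in a scalar
my $text = "code;name\n1;foo\n2;bar\n";

# Arguments as a plain list: array of arrays, header line included
my $aoa = csv (  in => \$text, sep_char => ";" )
    or die Text::CSV_XS->error_diag;

# The same arguments as an anonymous hash, now asking for hashes
my $aoh = csv ({ in => \$text, sep_char => ";", headers => "auto" })
    or die Text::CSV_XS->error_diag;

print scalar @$aoa, " rows, first name: $aoh->[0]{name}\n";
```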
1569
1570       The arguments passed consist of two parts:  the arguments to "csv"
1571       itself and the optional attributes to the  "CSV"  object used inside
1572       the function as enumerated and explained in "new".
1573
1574       If not overridden, the default options used for CSV are
1575
1576        auto_diag   => 1
1577        escape_null => 0
1578
1579       The option that is always set and cannot be altered is
1580
1581        binary      => 1
1582
1583       As this function will likely be used in one-liners,  it allows  "quote"
1584       to be abbreviated as "quo",  and  "escape_char" to be abbreviated as
1585       "esc" or "escape".
1586
1587       Alternative invocations:
1588
1589        my $aoa = Text::CSV_XS::csv (in => "file.csv");
1590
1591        my $csv = Text::CSV_XS->new ();
1592        my $aoa = $csv->csv (in => "file.csv");
1593
1594       In the latter case, the object attributes are used from the existing
1595       object and the attribute arguments in the function call are ignored:
1596
1597        my $csv = Text::CSV_XS->new ({ sep_char => ";" });
1598        my $aoh = $csv->csv (in => "file.csv", bom => 1);
1599
1600       will parse using ";" as "sep_char", not ",".
1601
1602       in
1603
1604       Used to specify the source.  "in" can be a file name (e.g. "file.csv"),
1605       which will be  opened for reading  and closed when finished,  a file
1606       handle (e.g.  $fh or "FH"),  a reference to a glob (e.g. "\*ARGV"),
1607       the glob itself (e.g. *STDIN), or a reference to a scalar (e.g.
1608       "\q{1,2,"csv"}").
1609
1610       When used with "out", "in" should be a reference to a CSV structure
1611       (AoA or AoH)  or a CODE-ref that returns an array-reference or a hash-
1612       reference.  The code-ref will be invoked with no arguments.
1613
1614        my $aoa = csv (in => "file.csv");
1615
1616        open my $fh, "<", "file.csv";
1617        my $aoa = csv (in => $fh);
1618
1619        my $csv = [ [qw( Foo Bar )], [ 1, 2 ], [ 2, 3 ]];
1620        my $err = csv (in => $csv, out => "file.csv");
1621
1622       If called in void context without the "out" attribute, the resulting
1623       ref will be used as input to a subsequent call to csv:
1624
1625        csv (in => "file.csv", filter => { 2 => sub { length > 2 }})
1626
1627       will be a shortcut to
1628
1629        csv (in => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}))
1630
1631       where, in the absence of the "out" attribute, this is a shortcut to
1632
1633        csv (in  => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}),
1634             out => *STDOUT)
1635
1636       out
1637
1638        csv (in => $aoa, out => "file.csv");
1639        csv (in => $aoa, out => $fh);
1640        csv (in => $aoa, out =>   STDOUT);
1641        csv (in => $aoa, out =>  *STDOUT);
1642        csv (in => $aoa, out => \*STDOUT);
1643        csv (in => $aoa, out => \my $data);
1644        csv (in => $aoa, out =>  undef);
1645        csv (in => $aoa, out => \"skip");
1646
1647       In output mode, the default CSV options when producing CSV are
1648
1649        eol       => "\r\n"
1650
1651       The "fragment" attribute is ignored in output mode.
1652
1653       "out" can be a file name  (e.g.  "file.csv"),  which will be opened for
1654       writing and closed when finished,  a file handle (e.g. $fh or "FH"),  a
1655       reference to a glob (e.g. "\*STDOUT"),  the glob itself (e.g. *STDOUT),
1656       or a reference to a scalar (e.g. "\my $data").
1657
1658        csv (in => sub { $sth->fetch },            out => "dump.csv");
1659        csv (in => sub { $sth->fetchrow_hashref }, out => "dump.csv",
1660             headers => $sth->{NAME_lc});
1661
1662       When a code-ref is used for "in", the output is generated  per
1663       invocation, so no buffering is involved. This implies that there is no
1664       size restriction on the number of records. The "csv" function ends when
1665       the coderef returns a false value.
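
       A minimal sketch of a code-ref source without a database. The @queue
       array is a hypothetical stand-in for the statement handle, and the
       output goes to a scalar so it can be inspected:

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# Hypothetical record source; a DBI statement handle would play this role
my @queue = ([ 1, "apple" ], [ 2, "pear" ], [ 3, "plum" ]);

# Each call hands over one record; the empty queue yields undef, ending csv ()
csv (in => sub { shift @queue }, out => \my $out);

print $out;
```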
1666
1667       If "out" is set to a reference of the literal string "skip", the output
1668       will be suppressed completely,  which might be useful in combination
1669       with a filter for side effects only.
1670
1671        my %cache;
1672        csv (in    => "dump.csv",
1673             out   => \"skip",
1674             on_in => sub { $cache{$_[1][1]}++ });
1675
1676       Currently,  setting "out" to any false value  ("undef", "", 0) will be
1677       equivalent to "\"skip"".
1678
1679       encoding
1680
1681       If passed,  it should be an encoding accepted by the  ":encoding()"
1682       option to "open". There is no default value. This attribute does not
1683       work in perl 5.6.x.  "encoding" can be abbreviated to "enc" for ease of
1684       use in command line invocations.
1685
1686       If "encoding" is set to the literal value "auto", the method "header"
1687       will be invoked on the opened stream to check if there is a BOM and set
1688       the encoding accordingly.   This is equal to passing a true value in
1689       the option "detect_bom".
1690
1691       Encodings can be stacked, as supported by "binmode":
1692
1693        # Using PerlIO::via::gzip
1694        csv (in       => \@csv,
1695             out      => "test.csv:via.gz",
1696             encoding => ":via(gzip):encoding(utf-8)",
1697             );
1698        $aoa = csv (in => "test.csv:via.gz",  encoding => ":via(gzip)");
1699
1700        # Using PerlIO::gzip
1701        csv (in       => \@csv,
1702             out      => "test.csv:gzip.gz",
1703             encoding => ":gzip:encoding(utf-8)",
1704             );
1705        $aoa = csv (in => "test.csv:gzip.gz", encoding => ":gzip");
1706
1707       detect_bom
1708
1709       If  "detect_bom"  is given, the method  "header"  will be invoked on
1710       the opened stream to check if there is a BOM and set the encoding
1711       accordingly.
1712
1713       "detect_bom" can be abbreviated to "bom".
1714
1715       This is the same as setting "encoding" to "auto".
1716
1717       Note that as the method  "header" is invoked,  its default is to also
1718       set the headers.
1719
1720       headers
1721
1722       If this attribute is not given, the default behavior is to produce an
1723       array of arrays.
1724
1725       If "headers" is supplied,  it should be an anonymous list of column
1726       names, an anonymous hashref, a coderef, or a literal flag:  "auto",
1727       "lc", "uc", or "skip".
1728
1729       skip
1730         When "skip" is used, the header will not be included in the output.
1731
1732          my $aoa = csv (in => $fh, headers => "skip");
1733
1734       auto
1735         If "auto" is used, the first line of the "CSV" source will be read as
1736         the list of field headers and used to produce an array of hashes.
1737
1738          my $aoh = csv (in => $fh, headers => "auto");
1739
1740       lc
1741         If "lc" is used,  the first line of the  "CSV" source will be read as
1742         the list of field headers mapped to  lower case and used to produce
1743         an array of hashes. This is a variation of "auto".
1744
1745          my $aoh = csv (in => $fh, headers => "lc");
1746
1747       uc
1748         If "uc" is used,  the first line of the  "CSV" source will be read as
1749         the list of field headers mapped to  upper case and used to produce
1750         an array of hashes. This is a variation of "auto".
1751
1752          my $aoh = csv (in => $fh, headers => "uc");
1753
1754       CODE
1755         If a coderef is used,  the first line of the  "CSV" source will be
1756         read as the list of mangled field headers in which each field is
1757         passed as the only argument to the coderef. This list is used to
1758         produce an array of hashes.
1759
1760          my $aoh = csv (in      => $fh,
1761                         headers => sub { lc ($_[0]) =~ s/kode/code/gr });
1762
1763         this example is a variation of using "lc" where all occurrences of
1764         "kode" are replaced with "code".
1765
1766       ARRAY
1767         If  "headers"  is an anonymous list,  the entries in the list will be
1768         used as field names. The first line is considered data instead of
1769         headers.
1770
1771          my $aoh = csv (in => $fh, headers => [qw( Foo Bar )]);
1772          csv (in => $aoa, out => $fh, headers => [qw( code description price )]);
1773
1774       HASH
1775         If "headers" is a hash reference, this implies "auto", but header
1776         fields that exist as key in the hashref will be replaced by the
1777         value for that key. Given a CSV file like
1778
1779          post-kode,city,name,id number,fubble
1780          1234AA,Duckstad,Donald,13,"X313DF"
1781
1782         using
1783
1784          csv (headers => { "post-kode" => "pc", "id number" => "ID" }, ...
1785
1786         will return an entry like
1787
1788          { pc     => "1234AA",
1789            city   => "Duckstad",
1790            name   => "Donald",
1791            ID     => "13",
1792            fubble => "X313DF",
1793            }
1794
1795       See also "munge_column_names" and "set_column_names".
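
       The variants above can be compared side by side on one small data set.
       This is a sketch only; the input string and the replacement names are
       made up for illustration:

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

my $data = "Code,Name\n1,Foo\n";   # illustrative input

# "lc": keys come from the header line, lower-cased
my $lc   = csv (in => \$data, headers => "lc");

# ARRAY: own field names; the first line is treated as data
my $own  = csv (in => \$data, headers => [qw( c n )]);

# HASH: implies "auto", renaming only the listed header fields
my $map  = csv (in => \$data, headers => { Code => "id" });

# "skip": array of arrays without the header line
my $skip = csv (in => \$data, headers => "skip");

printf "%s %s %s %s\n",
    $lc->[0]{name}, $own->[0]{c}, $map->[0]{id}, $skip->[0][1];
```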
1796
1797       munge_column_names
1798
1799       If "munge_column_names" is set,  the method  "header"  is invoked on
1800       the opened stream with all matching arguments to detect and set the
1801       headers.
1802
1803       "munge_column_names" can be abbreviated to "munge".
1804
1805       key
1806
1807       If passed,  will default  "headers"  to "auto" and return a hashref
1808       instead of an array of hashes. Allowed values are simple scalars or
1809       array-references where the first element is the joiner and the rest are
1810       the fields to join to combine the key.
1811
1812        my $ref = csv (in => "test.csv", key => "code");
1813        my $ref = csv (in => "test.csv", key => [ ":" => "code", "color" ]);
1814
1815       with test.csv like
1816
1817        code,product,price,color
1818        1,pc,850,gray
1819        2,keyboard,12,white
1820        3,mouse,5,black
1821
1822       the first example will return
1823
1824         { 1   => {
1825               code    => 1,
1826               color   => 'gray',
1827               price   => 850,
1828               product => 'pc'
1829               },
1830           2   => {
1831               code    => 2,
1832               color   => 'white',
1833               price   => 12,
1834               product => 'keyboard'
1835               },
1836           3   => {
1837               code    => 3,
1838               color   => 'black',
1839               price   => 5,
1840               product => 'mouse'
1841               }
1842           }
1843
1844       the second example will return
1845
1846         { "1:gray"    => {
1847               code    => 1,
1848               color   => 'gray',
1849               price   => 850,
1850               product => 'pc'
1851               },
1852           "2:white"   => {
1853               code    => 2,
1854               color   => 'white',
1855               price   => 12,
1856               product => 'keyboard'
1857               },
1858           "3:black"   => {
1859               code    => 3,
1860               color   => 'black',
1861               price   => 5,
1862               product => 'mouse'
1863               }
1864           }
1865
1866       The "key" attribute can be combined with "headers" for "CSV" data that
1867       has no header line, like
1868
1869        my $ref = csv (
1870            in      => "foo.csv",
1871            headers => [qw( c_foo foo bar description stock )],
1872            key     =>     "c_foo",
1873            );
1874
1875       value
1876
1877       Used to create key-value hashes.
1878
1879       Only allowed when "key" is valid. A "value" can be either a single
1880       column label or an anonymous list of column labels.  In the first case,
1881       the value will be a simple scalar value, in the latter case, it will be
1882       a hashref.
1883
1884        my $ref = csv (in => "test.csv", key   => "code",
1885                                         value => "price");
1886        my $ref = csv (in => "test.csv", key   => "code",
1887                                         value => [ "product", "price" ]);
1888        my $ref = csv (in => "test.csv", key   => [ ":" => "code", "color" ],
1889                                         value => "price");
1890        my $ref = csv (in => "test.csv", key   => [ ":" => "code", "color" ],
1891                                         value => [ "product", "price" ]);
1892
1893       with test.csv like
1894
1895        code,product,price,color
1896        1,pc,850,gray
1897        2,keyboard,12,white
1898        3,mouse,5,black
1899
1900       the first example will return
1901
1902         { 1 => 850,
1903           2 =>  12,
1904           3 =>   5,
1905           }
1906
1907       the second example will return
1908
1909         { 1   => {
1910               price   => 850,
1911               product => 'pc'
1912               },
1913           2   => {
1914               price   => 12,
1915               product => 'keyboard'
1916               },
1917           3   => {
1918               price   => 5,
1919               product => 'mouse'
1920               }
1921           }
1922
1923       the third example will return
1924
1925         { "1:gray"    => 850,
1926           "2:white"   =>  12,
1927           "3:black"   =>   5,
1928           }
1929
1930       the fourth example will return
1931
1932         { "1:gray"    => {
1933               price   => 850,
1934               product => 'pc'
1935               },
1936           "2:white"   => {
1937               price   => 12,
1938               product => 'keyboard'
1939               },
1940           "3:black"   => {
1941               price   => 5,
1942               product => 'mouse'
1943               }
1944           }
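
       A runnable sketch of "key" and "value" used as a lookup table, with a
       shortened copy of the sample data held in memory instead of test.csv:

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# Shortened, in-memory copy of the sample data
my $data = join "\n",
    "code,product,price,color",
    "1,pc,850,gray",
    "2,keyboard,12,white",
    "";

# Single key, single value column: a plain code => price lookup
my $price  = csv (in => \$data, key => "code", value => "price");

# Joined key, list of value columns: each value is a hashref
my $detail = csv (in    => \$data,
                  key   => [ ":" => "code", "color" ],
                  value => [ "product", "price" ]);

print "keyboard costs $price->{2}\n";
print "1:gray is a $detail->{'1:gray'}{product}\n";
```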
1945
1946       keep_headers
1947
1948       When using hashes,  store the column names in the arrayref passed,  so
1949       all headers are available after the call in the original order.
1950
1951        my $aoh = csv (in => "file.csv", keep_headers => \my @hdr);
1952
1953       This attribute can be abbreviated to "kh" or passed as
1954       "keep_column_names".
1955
1956       This attribute implies a default of "auto" for the "headers" attribute.
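
       A short sketch of why the preserved order is useful: hash keys have no
       order of their own, so @hdr is what allows the fields to be printed in
       file order. The input string is illustrative only.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

my $data = "zip,city,name\n1234AA,Duckstad,Donald\n";   # illustrative input

# keep_headers implies headers => "auto"; @hdr receives the column names
my $aoh = csv (in => \$data, keep_headers => \my @hdr);

# Use a hash slice to print every record in the original column order
for my $rec (@$aoh) {
    print join (",", @{$rec}{@hdr}), "\n";
    }
```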
1957
1958       fragment
1959
1960       Only output the fragment as defined in the "fragment" method. This
1961       option is ignored when generating "CSV". See "out".
1962
1963       Combining all of them could give something like
1964
1965        use Text::CSV_XS qw( csv );
1966        my $aoh = csv (
1967            in       => "test.txt",
1968            encoding => "utf-8",
1969            headers  => "auto",
1970            sep_char => "|",
1971            fragment => "row=3;6-9;15-*",
1972            );
1973        say $aoh->[15]{Foo};
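
       A smaller sketch of "fragment" on its own, assuming a four-line
       in-memory data set and no "headers" attribute, so the result is an
       array of arrays:

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

my $data = "a,b\n1,2\n3,4\n5,6\n";   # illustrative input

# Rows are counted from 1, so this selects the second and third line
my $rows = csv (in => \$data, fragment => "row=2-3");

# A column fragment keeps all rows, each reduced to the selected column
my $col  = csv (in => \$data, fragment => "col=2");

print scalar @$rows, " rows selected\n";
print join (",", map { $_->[0] } @$col), "\n";
```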
1974
1975       sep_set
1976
1977       If "sep_set" is set, the method "header" is invoked on the opened
1978       stream to detect and set "sep_char" with the given set.
1979
1980       "sep_set" can be abbreviated to "seps".
1981
1982       Note that as the  "header" method is invoked,  its default is to also
1983       set the headers.
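
       A minimal sketch of separator detection. The candidate set and the
       data are made up for illustration; because "header" is invoked, the
       result is an array of hashes.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

my $data = "a;b\n1;2\n";   # illustrative input, separator not known upfront

# Let header () pick the separator from the given candidates
my $aoh = csv (in => \$data, sep_set => [ ";", "\t", "|" ]);

print "b = $aoh->[0]{b}\n";
```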
1984
1985       set_column_names
1986
1987       If  "set_column_names" is passed,  the method "header" is invoked on
1988       the opened stream with all arguments meant for "header".
1989
1990       If "set_column_names" is passed as a false value, the content of the
1991       first row is only preserved if the output is AoA:
1992
1993       With an input-file like
1994
1995        bAr,foo
1996        1,2
1997        3,4,5
1998
1999       This call
2000
2001        my $aoa = csv (in => $file, set_column_names => 0);
2002
2003       will result in
2004
2005        [[ "bar", "foo"     ],
2006         [ "1",   "2"       ],
2007         [ "3",   "4",  "5" ]]
2008
2009       and
2010
2011        my $aoa = csv (in => $file, set_column_names => 0, munge => "none");
2012
2013       will result in
2014
2015        [[ "bAr", "foo"     ],
2016         [ "1",   "2"       ],
2017         [ "3",   "4",  "5" ]]
2018
2019   Callbacks
2020       Callbacks enable actions triggered from the inside of Text::CSV_XS.
2021
2022       While most of what this enables  can easily be done in an  unrolled
2023       loop as described in the "SYNOPSIS", callbacks can be used to meet
2024       special demands or enhance the "csv" function.
2025
2026       error
2027          $csv->callbacks (error => sub { $csv->SetDiag (0) });
2028
2029         The "error"  callback is invoked when an error occurs,  but  only
2030         when "auto_diag" is set to a true value. A callback is invoked with
2031         the values returned by "error_diag":
2032
2033          my ($c, $s);
2034
2035          sub ignore3006
2036          {
2037              my ($err, $msg, $pos, $recno, $fldno) = @_;
2038              if ($err == 3006) {
2039                  # ignore this error
2040                  ($c, $s) = (undef, undef);
2041                  Text::CSV_XS->SetDiag (0);
2042                  }
2043              # Any other error
2044              return;
2045              } # ignore3006
2046
2047          $csv->callbacks (error => \&ignore3006);
2048          $csv->bind_columns (\$c, \$s);
2049          while ($csv->getline ($fh)) {
2050              # Error 3006 will not stop the loop
2051              }
2052
2053       after_parse
2054          $csv->callbacks (after_parse => sub { push @{$_[1]}, "NEW" });
2055          while (my $row = $csv->getline ($fh)) {
2056              $row->[-1] eq "NEW";
2057              }
2058
2059         This callback is invoked after parsing with  "getline"  only if no
2060         error occurred.  The callback is invoked with two arguments:   the
2061         current "CSV" parser object and an array reference to the fields
2062         parsed.
2063
2064         The return code of the callback is ignored  unless it is a reference
2065         to the string "skip", in which case the record will be skipped in
2066         "getline_all".
2067
2068          sub add_from_db
2069          {
2070              my ($csv, $row) = @_;
2071              $sth->execute ($row->[4]);
2072              push @$row, $sth->fetchrow_array;
2073              } # add_from_db
2074
2075          my $aoa = csv (in => "file.csv", callbacks => {
2076              after_parse => \&add_from_db });
2077
2078         This hook can be used for validation:
2079
2080         FAIL
2081           Die if any of the records does not validate a rule:
2082
2083            after_parse => sub {
2084                $_[1][4] =~ m/^[0-9]{4}\s?[A-Z]{2}$/ or
2085                    die "5th field does not have a valid Dutch zipcode";
2086                }
2087
2088         DEFAULT
2089           Replace invalid fields with a default value:
2090
2091            after_parse => sub { $_[1][2] =~ m/^\d+$/ or $_[1][2] = 0 }
2092
2093         SKIP
2094           Skip records that have invalid fields (only applies to
2095           "getline_all"):
2096
2097            after_parse => sub { $_[1][0] =~ m/^\d+$/ or return \"skip"; }
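
         The SKIP variant can be sketched end to end with the object
         interface and "getline_all". The in-memory data is hypothetical:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $data = "1,ok\nx,bad\n2,ok\n";   # illustrative input
open my $fh, "<", \$data or die "in-memory open: $!";

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });

# Records whose first field is not numeric are dropped by getline_all ()
$csv->callbacks (after_parse => sub {
    $_[1][0] =~ m/^\d+$/ or return \"skip";
    });

my $rows = $csv->getline_all ($fh);
close $fh;

print join (",", @$_), "\n" for @$rows;
```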
2098
2099       before_print
2100          my $idx = 1;
2101          $csv->callbacks (before_print => sub { $_[1][0] = $idx++ });
2102          $csv->print (*STDOUT, [ 0, $_ ]) for @members;
2103
2104         This callback is invoked  before printing with  "print"  only if no
2105         error occurred.  The callback is invoked with two arguments:  the
2106         current  "CSV" parser object and an array reference to the fields
2107         passed.
2108
2109         The return code of the callback is ignored.
2110
2111          sub max_4_fields
2112          {
2113              my ($csv, $row) = @_;
2114              @$row > 4 and splice @$row, 4;
2115              } # max_4_fields
2116
2117          csv (in => csv (in => "file.csv"), out => *STDOUT,
2118              callbacks => { before_print => \&max_4_fields });
2119
2120         This callback is not active for "combine".
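
         A self-contained sketch of "before_print" with the "print" method,
         writing to an in-memory handle. The two-field limit and the
         variable names are illustrative only:

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1, eol => "\n" });

# Trim every record to at most two fields just before it is printed
$csv->callbacks (before_print => sub {
    my ($c, $row) = @_;
    @$row > 2 and splice @$row, 2;
    });

open my $fh, ">", \my $out or die "in-memory open: $!";
$csv->print ($fh, [ 1, 2, 3, 4 ]);
$csv->print ($fh, [ "a", "b" ]);
close $fh;

print $out;
```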
2121
2122       Callbacks for csv ()
2123
2124       The "csv" function allows for some callbacks that do not integrate in
2125       XS internals but are only available in the "csv" function.
2126
2127         csv (in        => "file.csv",
2128              callbacks => {
2129                  filter       => { 6 => sub { $_ > 15 } },    # first
2130                  after_parse  => sub { say "AFTER PARSE";  }, # first
2131                  after_in     => sub { say "AFTER IN";     }, # second
2132                  on_in        => sub { say "ON IN";        }, # third
2133                  },
2134              );
2135
2136         csv (in        => $aoh,
2137              out       => "file.csv",
2138              callbacks => {
2139                  on_in        => sub { say "ON IN";        }, # first
2140                  before_out   => sub { say "BEFORE OUT";   }, # second
2141                  before_print => sub { say "BEFORE PRINT"; }, # third
2142                  },
2143              );
2144
2145       filter
2146         This callback can be used to filter records.  It is called just after
2147         a new record has been scanned.  The callback accepts a:
2148
2149         hashref
2150           The keys are the index to the row (the field name or field number,
2151           1-based) and the values are subs to return a true or false value.
2152
2153            csv (in => "file.csv", filter => {
2154                       3 => sub { m/a/ },       # third field should contain an "a"
2155                       5 => sub { length > 4 }, # length of the 5th field minimal 5
2156                       });
2157
2158            csv (in => "file.csv", filter => { foo => sub { $_ > 4 }});
2159
2160           If the keys to the filter hash contain any character that is not a
2161           digit it will also implicitly set "headers" to "auto"  unless
2162           "headers"  was already passed as argument.  When headers are
2163           active, returning an array of hashes, the filter is not applicable
2164           to the header itself.
2165
2166           All sub results should match, as in AND.
2167
2168           The context of the callback sets  $_ localized to the field
2169           indicated by the filter. The two arguments are as with all other
2170           callbacks, so the other fields in the current row can be seen:
2171
2172            filter => { 3 => sub { $_ > 100 ? $_[1][1] =~ m/A/ : $_[1][6] =~ m/B/ }}
2173
2174           If the context is set to return a list of hashes  ("headers" is
2175           defined), the current record will also be available in the
2176           localized %_:
2177
2178            filter => { 3 => sub { $_ > 100 && $_{foo} =~ m/A/ && $_{bar} < 1000  }}
2179
2180           If the filter is used to alter the content by changing $_,  make
2181           sure that the sub returns true in order not to have that record
2182           skipped:
2183
2184            filter => { 2 => sub { $_ = uc }}
2185
2186           will upper-case the second field, and then skip it if the resulting
2187           content evaluates to false. To always accept, end with truth:
2188
2189            filter => { 2 => sub { $_ = uc; 1 }}
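
           A runnable sketch of a hashref filter keyed on field names, using
           hypothetical in-memory data. Both subs must return true for a
           record to be kept:

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

my $data = "id,score\n1,10\n2,30\n3,20\n";   # illustrative input

# Field names as keys imply headers => "auto"; the subs are AND-ed
my $aoh = csv (in => \$data, filter => {
    score => sub { $_ > 15 },   # keeps ids 2 and 3
    id    => sub { $_ != 3 },   # drops id 3
    });

print "kept id $_->{id} with score $_->{score}\n" for @$aoh;
```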
2190
2191         coderef
2192            csv (in => "file.csv", filter => sub { $n++; 0; });
2193
2194           If the argument to "filter" is a coderef,  it is an alias or
2195           shortcut to a filter on column 0:
2196
2197            csv (filter => sub { $n++; 0 });
2198
2199           is equal to
2200
2201            csv (filter => { 0 => sub { $n++; 0 } });
2202
2203         filter-name
2204            csv (in => "file.csv", filter => "not_blank");
2205            csv (in => "file.csv", filter => "not_empty");
2206            csv (in => "file.csv", filter => "filled");
2207
2208           These are predefined filters.
2209
2210           Given a file like (line numbers prefixed for doc purpose only):
2211
2212            1:1,2,3
2213            2:
2214            3:,
2215            4:""
2216            5:,,
2217            6:, ,
2218            7:"",
2219            8:" "
2220            9:4,5,6
2221
2222           not_blank
2223             Filter out the blank lines
2224
2225             This filter is a shortcut for
2226
2227              filter => { 0 => sub { @{$_[1]} > 1 or
2228                          defined $_[1][0] && $_[1][0] ne "" } }
2229
2230             Due to the implementation,  it is currently impossible to also
2231             filter lines that consist only of a quoted empty field. These
2232             lines are also considered blank lines.
2233
2234             With the given example, lines 2 and 4 will be skipped.
2235
2236           not_empty
2237             Filter out lines where all the fields are empty.
2238
2239             This filter is a shortcut for
2240
2241              filter => { 0 => sub { grep { defined && $_ ne "" } @{$_[1]} } }
2242
2243             A space is not regarded as being empty, so given the example data,
2244             lines 2, 3, 4, 5, and 7 are skipped.
2245
2246           filled
2247             Filter out lines that have no visible data
2248
2249             This filter is a shortcut for
2250
2251              filter => { 0 => sub { grep { defined && m/\S/ } @{$_[1]} } }
2252
2253             This filter rejects all lines that do not have at least one field
2254             that contains a non-whitespace character.
2255
2256             With the given example data, this filter would skip lines 2
2257             through 8.
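
           The three predefined filters can be compared on a reduced version
           of the example data. This is a sketch; the five-line input below
           is an assumption chosen to hit each case once:

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# data line, blank line, only empty fields, a quoted space, data line
my $data = qq{1,2,3\n\n,,\n" "\n4,5,6\n};

my $not_blank = csv (in => \$data, filter => "not_blank");
my $not_empty = csv (in => \$data, filter => "not_empty");
my $filled    = csv (in => \$data, filter => "filled");

printf "not_blank: %d, not_empty: %d, filled: %d\n",
    scalar @$not_blank, scalar @$not_empty, scalar @$filled;
```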
2258
2259       after_in
2260         This callback is invoked for each record after all records have been
2261         parsed but before returning the reference to the caller.  The hook is
2262         invoked with two arguments:  the current  "CSV"  parser object  and a
2263         reference to the record.   The reference can be a reference to a
2264         HASH  or a reference to an ARRAY as determined by the arguments.
2265
2266         This callback can also be passed as  an attribute without the
2267         "callbacks" wrapper.
2268
2269       before_out
2270         This callback is invoked for each record before the record is
2271         printed.  The hook is invoked with two arguments:  the current "CSV"
2272         parser object and a reference to the record.   The reference can be a
2273         reference to a  HASH or a reference to an ARRAY as determined by the
2274         arguments.
2275
2276         This callback can also be passed as an attribute  without the
2277         "callbacks" wrapper.
2278
2279         This callback makes the row available in %_ if the row is a hashref.
2280         In this case %_ is writable and will change the original row.
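
         A minimal sketch of "before_out" when writing an array of hashes.
         The records and the upper-casing rule are illustrative only, and
         the record is changed through the reference passed as second
         argument:

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# Hypothetical records to be written
my $aoh = [
    { id => 1, name => "foo" },
    { id => 2, name => "bar" },
    ];

csv (in         => $aoh,
     out        => \my $out,
     headers    => [qw( id name )],
     # Called once per record, just before that record is printed
     before_out => sub {
         my ($csv, $rec) = @_;
         $rec->{name} = uc $rec->{name};
         });

print $out;
```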
2281
2282       on_in
2283         This callback acts exactly as the "after_in" or the "before_out"
2284         hooks.
2285
2286         This callback can also be passed as an attribute  without the
2287         "callbacks" wrapper.
2288
2289         This callback makes the row available in %_ if the row is a hashref.
2290         In this case %_ is writable and will change the original row. So e.g.
2291         with
2292
2293           my $aoh = csv (
2294               in      => \"foo\n1\n2\n",
2295               headers => "auto",
2296               on_in   => sub { $_{bar} = 2; },
2297               );
2298
2299         $aoh will be:
2300
2301           [ { foo => 1,
2302               bar => 2,
2303               },
2304             { foo => 2,
2305               bar => 2,
2306               }
2307             ]
2308
2309       csv
2310         The function  "csv" can also be called as a method or with an
2311         existing Text::CSV_XS object. This could help if the function is to
2312         be invoked a lot of times and the overhead of creating the object
2313         internally over  and  over again would be prevented by passing an
2314         existing instance.
2315
2316          my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2317
2318          my $aoa = $csv->csv (in => $fh);
2319          my $aoa = csv (in => $fh, csv => $csv);
2320
2321         both act the same. Running this 20000 times on a 20-line CSV file
2322         showed a 53% speedup.
2323

INTERNALS

2325       Combine (...)
2326       Parse (...)
2327
2328       The arguments to these internal functions are deliberately not
2329       described or documented in order to enable the  module authors to make
2330       changes when they feel the need for it.  Using them is  highly
2331       discouraged  as  the  API may change in future releases.
2332

EXAMPLES

2334   Reading a CSV file line by line:
2335        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2336        open my $fh, "<", "file.csv" or die "file.csv: $!";
2337        while (my $row = $csv->getline ($fh)) {
2338            # do something with @$row
2339            }
2340        close $fh or die "file.csv: $!";
2341
2342       or
2343
2344        my $aoa = csv (in => "file.csv", on_in => sub {
2345            # do something with %_
2346            });
2347
2348       Reading only a single column
2349
2350        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2351        open my $fh, "<", "file.csv" or die "file.csv: $!";
2352        # get only the 4th column
2353        my @column = map { $_->[3] } @{$csv->getline_all ($fh)};
2354        close $fh or die "file.csv: $!";
2355
2356       with "csv", you could do
2357
2358        my @column = map { $_->[0] }
2359            @{csv (in => "file.csv", fragment => "col=4")};
2360
2361   Parsing CSV strings:
2362        my $csv = Text::CSV_XS->new ({ keep_meta_info => 1, binary => 1 });
2363
2364        my $sample_input_string =
2365            qq{"I said, ""Hi!""",Yes,"",2.34,,"1.09","\x{20ac}",};
2366        if ($csv->parse ($sample_input_string)) {
2367            my @field = $csv->fields;
2368            foreach my $col (0 .. $#field) {
2369                my $quo = $csv->is_quoted ($col) ? $csv->{quote_char} : "";
2370                printf "%2d: %s%s%s\n", $col, $quo, $field[$col], $quo;
2371                }
2372            }
2373        else {
2374            print STDERR "parse () failed on argument: ",
2375                $csv->error_input, "\n";
2376            $csv->error_diag ();
2377            }
2378
2379       Parsing CSV from memory
2380
2381       Given a complete CSV data-set in scalar $data,  generate a list of
2382       lists to represent the rows and fields
2383
2384        # The data
2385        my $data = join "\r\n" => map { join "," => 0 .. 5 } 0 .. 5;
2386
2387        # in a loop
2388        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2389        open my $fh, "<", \$data;
2390        my @foo;
2391        while (my $row = $csv->getline ($fh)) {
2392            push @foo, $row;
2393            }
2394        close $fh;
2395
2396        # a single call
2397        my $foo = csv (in => \$data);
2398
2399   Printing CSV data
2400       The fast way: using "print"
2401
2402       An example for creating "CSV" files using the "print" method:
2403
2404        my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
2405        open my $fh, ">", "foo.csv" or die "foo.csv: $!";
2406        for (1 .. 10) {
2407            $csv->print ($fh, [ $_, "$_" ]) or $csv->error_diag;
2408            }
2409        close $fh or die "foo.csv: $!";
2410
2411       The slow way: using "combine" and "string"
2412
2413       or using the slower "combine" and "string" methods:
2414
2415        my $csv = Text::CSV_XS->new;
2416
2417        open my $csv_fh, ">", "hello.csv" or die "hello.csv: $!";
2418
2419        my @sample_input_fields = (
2420            'You said, "Hello!"',   5.67,
2421            '"Surely"',   '',   '3.14159');
2422        if ($csv->combine (@sample_input_fields)) {
2423            print $csv_fh $csv->string, "\n";
2424            }
2425        else {
2426            print "combine () failed on argument: ",
2427                $csv->error_input, "\n";
2428            }
2429        close $csv_fh or die "hello.csv: $!";
2430
2431       Generating CSV into memory
2432
2433       Format a data-set (@foo) into a scalar value in memory ($data):
2434
2435        # The data
2436        my @foo = map { [ 0 .. 5 ] } 0 .. 3;
2437
2438        # in a loop
2439        my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1, eol => "\r\n" });
2440        open my $fh, ">", \my $data;
2441        $csv->print ($fh, $_) for @foo;
2442        close $fh;
2443
2444        # a single call
2445        csv (in => \@foo, out => \my $data);
2446
2447   Rewriting CSV
2448       Rewrite "CSV" files with ";" as separator character to well-formed
2449       "CSV":
2450
2451        use Text::CSV_XS qw( csv );
2452        csv (in => csv (in => "bad.csv", sep_char => ";"), out => *STDOUT);
2453
2454       As "STDOUT" is now default in "csv", a one-liner converting a UTF-16
2455       CSV file with BOM and TAB-separation to valid UTF-8 CSV could be:
2456
2457        $ perl -C3 -MText::CSV_XS=csv -we\
2458           'csv(in=>"utf16tab.csv",encoding=>"utf16",sep=>"\t")' >utf8.csv
2459
2460   Dumping database tables to CSV
2461       Dumping a database table can be as simple as this (TIMTOWTDI):
2462
2463        my $dbh = DBI->connect (...);
2464        my $sql = "select * from foo";
2465
2466        # using your own loop
2467        open my $fh, ">", "foo.csv" or die "foo.csv: $!\n";
2468        my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\r\n" });
2469        my $sth = $dbh->prepare ($sql); $sth->execute;
2470        $csv->print ($fh, $sth->{NAME_lc});
2471        while (my $row = $sth->fetch) {
2472            $csv->print ($fh, $row);
2473            }
2474
2475        # using the csv function, all in memory
2476        csv (out => "foo.csv", in => $dbh->selectall_arrayref ($sql));
2477
2478        # using the csv function, streaming with callbacks
2479        my $sth = $dbh->prepare ($sql); $sth->execute;
2480        csv (out => "foo.csv", in => sub { $sth->fetch            });
2481        csv (out => "foo.csv", in => sub { $sth->fetchrow_hashref });
2482
2483       Note that this does not discriminate between "empty" values and NULL-
2484       values from the database,  as both will be the same empty field in CSV.
2485       To enable distinction between the two, use "quote_empty".
2486
2487        csv (out => "foo.csv", in => sub { $sth->fetch }, quote_empty => 1);
2488
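       As a minimal sketch of the difference "quote_empty" makes, the
       following combines one row with and without the attribute, using
       "undef" to stand in for a database NULL. The row content and variable
       names are made up for illustration. The last step assumes the reader
       wants the NULL back as "undef" and uses "blank_is_undef" for that.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# undef stands in for a database NULL, "" for an empty string
my @row = (1, undef, "", "x");

my $plain  = Text::CSV_XS->new ({ binary => 1 });
my $quoted = Text::CSV_XS->new ({ binary => 1, quote_empty => 1 });

$plain->combine  (@row) or die "combine () failed";
$quoted->combine (@row) or die "combine () failed";

my $plain_line  = $plain->string;    # 1,,,x     NULL and "" look the same
my $quoted_line = $quoted->string;   # 1,,"",x   only the empty string is quoted

print "default:     $plain_line\n";
print "quote_empty: $quoted_line\n";

# Reading it back: blank_is_undef turns the unquoted empty field into undef
my $reader = Text::CSV_XS->new ({ binary => 1, blank_is_undef => 1 });
$reader->parse ($quoted_line) or die "parse () failed";
my @back = $reader->fields;
```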
2489       If the database import utility supports special sequences to insert
2490       "NULL" values into the database,  like MySQL/MariaDB supports "\N",
2491       use a filter or a map
2492
2493        csv (out => "foo.csv", in => sub { $sth->fetch },
2494                            on_in => sub { $_ //= "\\N" for @{$_[1]} });
2495
2496        while (my $row = $sth->fetch) {
2497            $csv->print ($fh, [ map { $_ // "\\N" } @$row ]);
2498            }
2499
2500       Note that this will not work as expected when choosing the backslash
2501       ("\") as "escape_char", as that will cause the "\" to need to be
2502       escaped by yet another "\",  which will cause the field to need
2503       quotation and thus end up as "\\N" instead of "\N". See also
2504       "undef_str".
2505
2506        csv (out => "foo.csv", in => sub { $sth->fetch }, undef_str => "\\N");
2507
2508       These special sequences are not recognized by  Text::CSV_XS  on parsing
2509       the CSV generated like this, but map and filter are your friends again:
2510
2511        while (my $row = $csv->getline ($fh)) {
2512            $sth->execute (map { $_ eq "\\N" ? undef : $_ } @$row);
2513            }
2514
2515        csv (in => "foo.csv", filter => { 1 => sub {
2516            $sth->execute (map { $_ eq "\\N" ? undef : $_ } @{$_[1]}); 0; }});
2517
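       The round trip can be tried without a database. The sketch below
       assumes two hypothetical rows in place of "$sth->fetch", writes them
       into a scalar with "undef_str", and reads them back while mapping the
       "\N" marker to "undef" again. The handle and variable names are only
       illustrative.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical rows; undef plays the role of a database NULL
my @rows = ([ 1, "foo", undef ], [ 2, "", "bar" ]);

# Write: undef fields are emitted as \N, empty strings stay empty
my $out = Text::CSV_XS->new ({ binary => 1, eol => "\n", undef_str => "\\N" });
open my $wfh, ">", \my $data or die "in-memory write: $!";
$out->print ($wfh, $_) for @rows;
close $wfh;
print $data;

# Read: the parser returns \N as plain text, so map it back to undef
my $in = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
open my $rfh, "<", \$data or die "in-memory read: $!";
my @back;
while (my $row = $in->getline ($rfh)) {
    push @back, [ map { $_ eq "\\N" ? undef : $_ } @$row ];
    }
close $rfh;
```

       This relies on the default "escape_char"; with a backslash as escape
       character the marker would be escaped as described above.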
2518   The examples folder
2519       For more extended examples, see the examples/ 1. sub-directory in the
2520       original distribution or the git repository 2.
2521
2522        1. https://github.com/Tux/Text-CSV_XS/tree/master/examples
2523        2. https://github.com/Tux/Text-CSV_XS
2524
2525       The following files can be found there:
2526
2527       parser-xs.pl
2528         This can be used as a boilerplate to parse invalid "CSV"  and parse
2529         beyond (expected) errors, as an alternative to using the "error" callback.
2530
2531          $ perl examples/parser-xs.pl bad.csv >good.csv
2532
2533       csv-check
2534         This is a command-line tool that uses parser-xs.pl  techniques to
2535         check the "CSV" file and report on its content.
2536
2537          $ csv-check files/utf8.csv
2538          Checked files/utf8.csv  with csv-check 1.9
2539          using Text::CSV_XS 1.32 with perl 5.26.0 and Unicode 9.0.0
2540          OK: rows: 1, columns: 2
2541              sep = <,>, quo = <">, bin = <1>, eol = <"\n">
2542
2543       csv2xls
2544         A script to convert "CSV" to Microsoft Excel ("XLS"). This requires
2545         extra modules Date::Calc and Spreadsheet::WriteExcel. The converter
2546         accepts various options and can produce UTF-8 compliant Excel files.
2547
2548       csv2xlsx
2549         A script to convert "CSV" to Microsoft Excel ("XLSX").  This requires
2550         the modules Date::Calc and Spreadsheet::Writer::XLSX.  The converter
2551         does accept various options including merging several "CSV" files
2552         into a single Excel file.
2553
2554       csvdiff
2555         A script that provides colorized diff on sorted CSV files,  assuming
2556         first line is header and first field is the key. Output options
2557         include colorized ANSI escape codes or HTML.
2558
2559          $ csvdiff --html --output=diff.html file1.csv file2.csv
2560
2561       rewrite.pl
2562         A script to rewrite (in)valid CSV into valid CSV files.  Script has
2563         options to generate confusing CSV files or CSV files that conform to
2564         Dutch MS-Excel exports (using ";" as separation).
2565
2566         Script - by default - honors BOM  and auto-detects separation
2567         converting it to default standard CSV with "," as separator.
2568

CAVEATS

2570       Text::CSV_XS  is not designed to detect the characters used to quote
2571       and separate fields.  The parsing is done using predefined  (default)
2572       settings.  In the examples  sub-directory,  you can find scripts  that
2573       demonstrate how you could try to detect these characters yourself.
2574
2575   Microsoft Excel
2576       The import/export from Microsoft Excel is a risky task, according to
2577       the documentation in "Text::CSV::Separator".  Microsoft uses the
2578       system's list separator defined in the regional settings, which happens
2579       to be a semicolon for Dutch, German and Spanish (and probably some
2580       others as well).   For the English locale,  the default is a comma.
2581       In Windows however,  the user is free to choose a  predefined locale,
2582       and then change  every  individual setting in it, so checking the
2583       locale is no solution.
2584
2585       As of version 1.17, a lone first line with just
2586
2587         sep=;
2588
2589       will be recognized and honored when parsing with "getline".
2590
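       A small sketch of that behavior, assuming a Text::CSV_XS of version
       1.17 or later and made-up sample data held in a scalar: the parser is
       created with the default "," separator, and the leading "sep=;" line
       is expected to switch it to ";" for the rows that follow.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Made-up Excel-style export: the first line only announces the separator
my $data = qq{sep=;\n"a b";3\n1;2\n};

# No sep_char given, so the parser starts out with the default ","
my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });

open my $fh, "<", \$data or die "in-memory read: $!";
my @rows;
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;
    }
close $fh;

print join ("|", @$_), "\n" for @rows;
```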

TODO

2592       More Errors & Warnings
2593         New extensions ought to be  clear and concise  in reporting what
2594         error has occurred where and why, and maybe also offer a remedy to
2595         the problem.
2596
2597         "error_diag" is a (very) good start, but there is more work to be
2598         done in this area.
2599
2600         Basic calls  should croak or warn on  illegal parameters.  Errors
2601         should be documented.
2602
2603       setting meta info
2604         Future extensions might include extending the "meta_info",
2605         "is_quoted", and  "is_binary"  to accept setting these  flags for
2606         fields,  so you can specify which fields are quoted in the
2607         "combine"/"string" combination.
2608
2609          $csv->meta_info (0, 1, 1, 3, 0, 0);
2610          $csv->is_quoted (3, 1);
2611
2612         Metadata Vocabulary for Tabular Data
2613         <http://w3c.github.io/csvw/metadata/> (a W3C editor's draft) could be
2614         an example for supporting more metadata.
2615
2616       Parse the whole file at once
2617         Implement new methods or functions  that enable parsing of a
2618         complete file at once, returning a list of hashes. Possible extension
2619         to this could be to enable a column selection on the call:
2620
2621          my @AoH = $csv->parse_file ($filename, { cols => [ 1, 4..8, 12 ]});
2622
2623         Returning something like
2624
2625          [ { fields => [ 1, 2, "foo", 4.5, undef, "", 8 ],
2626              flags  => [ ... ],
2627              },
2628            { fields => [ ... ],
2629              .
2630              },
2631            ]
2632
2633         Note that the "csv" function already supports most of this,  but does
2634         not return flags. "getline_all" returns all rows for an open stream,
2635         but this will not return flags either.  "fragment"  can reduce the
2636         required  rows or columns, but cannot combine them.
2637
2638       Cookbook
2639         Write a document that has recipes for  most known  non-standard  (and
2640         maybe some standard)  "CSV" formats,  including formats that use
2641         "TAB",  ";", "|", or other non-comma separators.
2642
2643         Examples could be taken from W3C's CSV on the Web: Use Cases and
2644         Requirements <http://w3c.github.io/csvw/use-cases-and-
2645         requirements/index.html>
2646
2647       Steal
2648         Steal good new ideas and features from PapaParse
2649         <http://papaparse.com> or csvkit <http://csvkit.readthedocs.org>.
2650
2651       Perl6 support
2652         I'm already working on perl6 support here
2653         <https://github.com/Tux/CSV>. No promises yet on when it is finished
2654         (or fast). Trying to keep the API alike as much as possible.
2655
2656   NOT TODO
2657       combined methods
2658         Requests for adding means (methods) that combine "combine" and
2659         "string" in a single call will not be honored (use "print" instead).
2660         Likewise for "parse" and "fields"  (use "getline" instead), given the
2661         problems with embedded newlines.
2662
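       For readers weighing the two styles, the following sketch puts the
       pairs side by side on a single record held in memory. The field values
       are invented, and no "eol" is set so that "print" and "string" can be
       compared directly.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv    = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
my @fields = ("a", "b c", qq{say "hi"});

# Writing, two-step: combine () then string ()
$csv->combine (@fields) or die "combine () failed";
my $line = $csv->string;

# Writing, one step: print () straight to a handle
open my $out, ">", \my $buf or die "in-memory write: $!";
$csv->print ($out, \@fields) or die "print () failed";
close $out;

# Reading, two-step: parse () then fields ()
$csv->parse ($line) or die "parse () failed";
my @parsed = $csv->fields;

# Reading, one step: getline () from a handle
open my $in, "<", \$line or die "in-memory read: $!";
my $row = $csv->getline ($in);
close $in;

print "same output\n" if $buf eq $line;
print "same fields\n" if "@parsed" eq "@$row";
```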
2663   Release plan
2664       No guarantees, but this is what I had in mind some time ago:
2665
2666       · DIAGNOSTICS section in pod to *describe* the errors (see below)
2667

EBCDIC

2669       The current hard-coding of characters and character ranges  makes this
2670       code unusable on "EBCDIC" systems. Recent work in perl-5.20 might
2671       change that.
2672
2673       Opening "EBCDIC" encoded files on  "ASCII"+  systems is likely to
2674       succeed using Encode's "cp37", "cp1047", or "posix-bc":
2675
2676        open my $fh, "<:encoding(cp1047)", "ebcdic_file.csv" or die "...";
2677

DIAGNOSTICS

2679       Still under construction ...
2680
2681       If an error occurs,  "$csv->error_diag" can be used to get information
2682       on the cause of the failure. Note that for speed reasons the internal
2683       value is never cleared on success,  so using the value returned by
2684       "error_diag" in normal cases - when no error occurred - may cause
2685       unexpected results.
2686
2687       If the constructor failed, the cause can be found using "error_diag" as
2688       a class method, like "Text::CSV_XS->error_diag".
2689
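       A minimal sketch of both uses, assuming deliberately broken input: a
       constructor call that sets the separator equal to the quotation
       character, and a quoted field with an embedded newline while "binary"
       is off. In list context "error_diag" returns the error code first,
       then the message and the position; only those three are used here.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# 1. Constructor failure: new () returns undef, ask the class for the reason
my $bad = Text::CSV_XS->new ({ sep_char => ",", quote_char => "," });
my ($new_code, $new_msg) = Text::CSV_XS->error_diag;
print "new () failed: $new_code - $new_msg\n" unless $bad;

# 2. Parse failure: ask the object, and only after a call has failed
my $csv   = Text::CSV_XS->new or die "cannot create parser";
my $input = qq{1,"foo\nbar",22,1};
my ($code, $msg, $pos);
unless ($csv->parse ($input)) {
    ($code, $msg, $pos) = $csv->error_diag;
    print "parse () failed: $code - $msg at position $pos\n";
    }
```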
2690       The "$csv->error_diag" method is automatically invoked upon error when
2691       the constructor was called with  "auto_diag"  set to  1 or 2, or when
2692       autodie is in effect.  When set to 1, this will cause a "warn" with the
2693       error message,  when set to 2, it will "die". "2012 - EOF" is excluded
2694       from "auto_diag" reports.
2695
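       As an illustration of level 2, the sketch below wraps a failing
       "parse" in "eval" so the exception can be inspected instead of ending
       the program. The input is invented; the exact wording of the message
       may differ between releases, so only the error code is looked for.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# auto_diag => 2: any error other than EOF raises an exception
my $csv = Text::CSV_XS->new ({ auto_diag => 2 });

# A newline inside quotes is an error while binary is off
my $ok  = eval { $csv->parse (qq{1,"foo\nbar",22,1}); 1 };
my $err = $@;

print "parse died: $err" unless $ok;
print "that was error 2021\n" if $err =~ m/\b2021\b/;
```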
2696       Errors can be (individually) caught using the "error" callback.
2697
2698       The errors as described below are available. I have tried to make the
2699       error itself explanatory enough, but more descriptions will be added.
2700       For most of these errors, the first three capitals describe the error
2701       category:
2702
2703       · INI
2704
2705         Initialization error or option conflict.
2706
2707       · ECR
2708
2709         Carriage-Return related parse error.
2710
2711       · EOF
2712
2713         End-Of-File related parse error.
2714
2715       · EIQ
2716
2717         Parse error inside quotation.
2718
2719       · EIF
2720
2721         Parse error inside field.
2722
2723       · ECB
2724
2725         Combine error.
2726
2727       · EHR
2728
2729         HashRef parse related error.
2730
2731       And below should be the complete list of error codes that can be
2732       returned:
2733
2734       · 1001 "INI - sep_char is equal to quote_char or escape_char"
2735
2736         The  separation character  cannot be equal to  the quotation
2737         character or to the escape character,  as this would invalidate all
2738         parsing rules.
2739
2740       · 1002 "INI - allow_whitespace with escape_char or quote_char SP or
2741         TAB"
2742
2743         Using the  "allow_whitespace"  attribute  when either "quote_char" or
2744         "escape_char"  is equal to "SPACE" or "TAB" is too ambiguous to
2745         allow.
2746
2747       · 1003 "INI - \r or \n in main attr not allowed"
2748
2749         Using default "eol" characters in either "sep_char", "quote_char",
2750         or  "escape_char"  is  not allowed.
2751
2752       · 1004 "INI - callbacks should be undef or a hashref"
2753
2754         The "callbacks"  attribute only accepts "undef" or a hash
2755         reference.
2756
2757       · 1005 "INI - EOL too long"
2758
2759         The value passed for EOL is exceeding its maximum length (16).
2760
2761       · 1006 "INI - SEP too long"
2762
2763         The value passed for SEP is exceeding its maximum length (16).
2764
2765       · 1007 "INI - QUOTE too long"
2766
2767         The value passed for QUOTE is exceeding its maximum length (16).
2768
2769       · 1008 "INI - SEP undefined"
2770
2771         The value passed for SEP should be defined and not empty.
2772
2773       · 1010 "INI - the header is empty"
2774
2775         The header line parsed in the "header" is empty.
2776
2777       · 1011 "INI - the header contains more than one valid separator"
2778
2779         The header line parsed in the  "header"  contains more than one
2780         (unique) separator character out of the allowed set of separators.
2781
2782       · 1012 "INI - the header contains an empty field"
2783
2784         The header line parsed in the "header" contains an empty field.
2785
2786       · 1013 "INI - the header contains nun-unique fields"
2787
2788         The header line parsed in the  "header"  contains at least  two
2789         identical fields.
2790
2791       · 1014 "INI - header called on undefined stream"
2792
2793         The header line cannot be parsed from an undefined source.
2794
2795       · 1500 "PRM - Invalid/unsupported argument(s)"
2796
2797         Function or method called with invalid argument(s) or parameter(s).
2798
2799       · 1501 "PRM - The key attribute is passed as an unsupported type"
2800
2801         The "key" attribute is of an unsupported type.
2802
2803       · 1502 "PRM - The value attribute is passed without the key attribute"
2804
2805         The "value" attribute is only allowed when a valid key is given.
2806
2807       · 1503 "PRM - The value attribute is passed as an unsupported type"
2808
2809         The "value" attribute is of an unsupported type.
2810
2811       · 2010 "ECR - QUO char inside quotes followed by CR not part of EOL"
2812
2813         When  "eol"  has  been  set  to  anything  but the  default,  like
2814         "\r\t\n",  and  the  "\r"  is  following  the   second   (closing)
2815         "quote_char", where the characters following the "\r" do not make up
2816         the "eol" sequence, this is an error.
2817
2818       · 2011 "ECR - Characters after end of quoted field"
2819
2820         Sequences like "1,foo,"bar"baz,22,1" are not allowed. "bar" is a
2821         quoted field and after the closing double-quote, there should be
2822         either a new-line sequence or a separation character.
2823
2824       · 2012 "EOF - End of data in parsing input stream"
2825
2826         Self-explanatory. End-of-file was reached while parsing a stream.
2827         This can happen only when reading from streams with "getline",  as
2828         "parse" works on strings that are not required to have a trailing
2829         "eol".
2830
2831       · 2013 "INI - Specification error for fragments RFC7111"
2832
2833         Invalid specification for URI "fragment" specification.
2834
2835       · 2014 "ENF - Inconsistent number of fields"
2836
2837         Inconsistent number of fields under strict parsing.
2838
2839       · 2021 "EIQ - NL char inside quotes, binary off"
2840
2841         Sequences like "1,"foo\nbar",22,1" are allowed only when the binary
2842         option has been selected with the constructor.
2843
2844       · 2022 "EIQ - CR char inside quotes, binary off"
2845
2846         Sequences like "1,"foo\rbar",22,1" are allowed only when the binary
2847         option has been selected with the constructor.
2848
2849       · 2023 "EIQ - QUO character not allowed"
2850
2851         Sequences like ""foo "bar" baz",qu" and "2023,",2008-04-05,"Foo,
2852         Bar",\n" will cause this error.
2853
2854       · 2024 "EIQ - EOF cannot be escaped, not even inside quotes"
2855
2856         The escape character is not allowed as last character in an input
2857         stream.
2858
2859       · 2025 "EIQ - Loose unescaped escape"
2860
2861         An escape character should escape only characters that need escaping.
2862
2863         Allowing  the escape  for other characters  is possible  with the
2864         attribute "allow_loose_escapes".
2865
2866       · 2026 "EIQ - Binary character inside quoted field, binary off"
2867
2868         Binary characters are not allowed by default.  The exception is a
2869         field that contains valid UTF-8,  which will automatically be
2870         upgraded. Set "binary" to 1 to accept binary
2871         data.
2872
2873       · 2027 "EIQ - Quoted field not terminated"
2874
2875         When parsing a field that started with a quotation character,  the
2876         field is expected to be closed with a quotation character.   When the
2877         parsed line is exhausted before the quote is found, that field is not
2878         terminated.
2879
2880       · 2030 "EIF - NL char inside unquoted verbatim, binary off"
2881
2882       · 2031 "EIF - CR char is first char of field, not part of EOL"
2883
2884       · 2032 "EIF - CR char inside unquoted, not part of EOL"
2885
2886       · 2034 "EIF - Loose unescaped quote"
2887
2888       · 2035 "EIF - Escaped EOF in unquoted field"
2889
2890       · 2036 "EIF - ESC error"
2891
2892       · 2037 "EIF - Binary character in unquoted field, binary off"
2893
2894       · 2110 "ECB - Binary character in Combine, binary off"
2895
2896       · 2200 "EIO - print to IO failed. See errno"
2897
2898       · 3001 "EHR - Unsupported syntax for column_names ()"
2899
2900       · 3002 "EHR - getline_hr () called before column_names ()"
2901
2902       · 3003 "EHR - bind_columns () and column_names () fields count
2903         mismatch"
2904
2905       · 3004 "EHR - bind_columns () only accepts refs to scalars"
2906
2907       · 3006 "EHR - bind_columns () did not pass enough refs for parsed
2908         fields"
2909
2910       · 3007 "EHR - bind_columns needs refs to writable scalars"
2911
2912       · 3008 "EHR - unexpected error in bound fields"
2913
2914       · 3009 "EHR - print_hr () called before column_names ()"
2915
2916       · 3010 "EHR - print_hr () called with invalid arguments"
2917
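       To see one of the codes listed above in practice, this sketch triggers
       2014 with the "strict" attribute on made-up data whose second line has
       too few fields. It assumes a release that supports "strict". Using
       "error_diag" in numeric context yields just the code, which also makes
       it easy to tell a real error from the normal end of data (2012).

```perl
use strict;
use warnings;
use Text::CSV_XS;

# The first line sets the expected field count; the second line is short
my $data = "a,b\n1\n2,3\n";

my $csv = Text::CSV_XS->new ({ binary => 1, strict => 1 });
open my $fh, "<", \$data or die "in-memory read: $!";
my @rows;
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;
    }
close $fh;

# The loop stops on EOF as well as on errors, so check which one it was
my $code = 0 + $csv->error_diag;
if ($code != 2012) {
    printf "stopped after %d row(s) with error %d\n", scalar @rows, $code;
    }
```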

SEE ALSO

2919       IO::File,  IO::Handle,  IO::Wrap,  Text::CSV,  Text::CSV_PP,
2920       Text::CSV::Encoded,     Text::CSV::Separator,    Text::CSV::Slurp,
2921       Spreadsheet::CSV and Spreadsheet::Read, and of course perl.
2922
2923       If you are using perl6,  you can have a look at  "Text::CSV"  in the
2924       perl6 ecosystem, offering the same features.
2925
2926       non-perl
2927
2928       A CSV parser in JavaScript,  also used by W3C <http://www.w3.org>,  is
2929       the multi-threaded in-browser PapaParse <http://papaparse.com/>.
2930
2931       csvkit <http://csvkit.readthedocs.org> is a python CSV parsing toolkit.
2932

AUTHOR

2934       Alan Citterman <alan@mfgrtl.com> wrote the original Perl module.
2935       Please don't send mail concerning Text::CSV_XS to Alan, who is not
2936       involved in the C/XS part that is now the main part of the module.
2937
2938       Jochen Wiedmann <joe@ispsoft.de> rewrote the en- and decoding in C by
2939       implementing a simple finite-state machine.   He added variable quote,
2940       escape and separator characters, the binary mode and the print and
2941       getline methods. See ChangeLog releases 0.10 through 0.23.
2942
2943       H.Merijn Brand <h.m.brand@xs4all.nl> cleaned up the code,  added the
2944       field flags methods,  wrote the major part of the test suite, completed
2945       the documentation,   fixed most RT bugs,  added all the allow flags and
2946       the "csv" function. See ChangeLog releases 0.25 and on.
2947

COPYRIGHT AND LICENSE

2949        Copyright (C) 2007-2019 H.Merijn Brand.  All rights reserved.
2950        Copyright (C) 1998-2001 Jochen Wiedmann. All rights reserved.
2951        Copyright (C) 1997      Alan Citterman.  All rights reserved.
2952
2953       This library is free software;  you can redistribute it and/or modify it
2954       under the same terms as Perl itself.
2955
2956
2957
2958perl v5.30.0                      2019-09-15                         CSV_XS(3)