CSV_XS(3)             User Contributed Perl Documentation            CSV_XS(3)
2
3
4
NAME

Text::CSV_XS - comma-separated values manipulation routines
7
SYNOPSIS

# Functional interface
10 use Text::CSV_XS qw( csv );
11
12 # Read whole file in memory
13 my $aoa = csv (in => "data.csv"); # as array of array
14 my $aoh = csv (in => "data.csv",
15 headers => "auto"); # as array of hash
16
17 # Write array of arrays as csv file
18 csv (in => $aoa, out => "file.csv", sep_char=> ";");
19
20 # Only show lines where "code" is odd
21 csv (in => "data.csv", filter => { code => sub { $_ % 2 }});
22
23
24 # Object interface
25 use Text::CSV_XS;
26
27 my @rows;
28 # Read/parse CSV
29 my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
30 open my $fh, "<:encoding(utf8)", "test.csv" or die "test.csv: $!";
31 while (my $row = $csv->getline ($fh)) {
32 $row->[2] =~ m/pattern/ or next; # 3rd field should match
33 push @rows, $row;
34 }
35 close $fh;
36
37 # and write as CSV
38 open $fh, ">:encoding(utf8)", "new.csv" or die "new.csv: $!";
39 $csv->say ($fh, $_) for @rows;
40 close $fh or die "new.csv: $!";
41
DESCRIPTION

Text::CSV_XS provides facilities for the composition and
decomposition of comma-separated values. An instance of the
Text::CSV_XS class will combine fields into a "CSV" string and parse a
"CSV" string into fields.
47
The module accepts either strings or files as input and supports the
use of user-specified characters for delimiters, separators, and
escapes.
51
52 Embedded newlines
53 Important Note: The default behavior is to accept only ASCII
54 characters in the range from 0x20 (space) to 0x7E (tilde). This means
55 that the fields can not contain newlines. If your data contains
56 newlines embedded in fields, or characters above 0x7E (tilde), or
57 binary data, you must set "binary => 1" in the call to "new". To cover
58 the widest range of parsing options, you will always want to set
59 binary.
60
But you still have the problem that you have to pass a correct line to
the "parse" method, which is more complicated than the usual way of
reading input:
64
65 my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
66 while (<>) { # WRONG!
67 $csv->parse ($_);
68 my @fields = $csv->fields ();
69 }
70
This will break, as the "while" might read broken lines: it does not
care about the quoting. If you need to support embedded newlines, the
way to go is to not pass "eol" in the parser (it accepts "\n", "\r",
and "\r\n" by default) and use "getline" instead:
75
76 my $csv = Text::CSV_XS->new ({ binary => 1 });
77 open my $fh, "<", $file or die "$file: $!";
78 while (my $row = $csv->getline ($fh)) {
79 my @fields = @$row;
80 }
81
82 The old(er) way of using global file handles is still supported
83
84 while (my $row = $csv->getline (*ARGV)) { ... }
85
86 Unicode
87 Unicode is only tested to work with perl-5.8.2 and up.
88
89 See also "BOM".
90
91 The simplest way to ensure the correct encoding is used for in- and
92 output is by either setting layers on the filehandles, or setting the
93 "encoding" argument for "csv".
94
95 open my $fh, "<:encoding(UTF-8)", "in.csv" or die "in.csv: $!";
96 or
97 my $aoa = csv (in => "in.csv", encoding => "UTF-8");
98
99 open my $fh, ">:encoding(UTF-8)", "out.csv" or die "out.csv: $!";
100 or
101 csv (in => $aoa, out => "out.csv", encoding => "UTF-8");
102
On parsing (both for "getline" and "parse"), if the source is marked
as being UTF8, then all fields that are marked binary will also be
marked UTF8.
106
On combining ("print" and "combine"): if any of the combining fields
was marked UTF8, the resulting string will be marked as UTF8. Note,
however, that fields that come before the first UTF8-marked field and
contain 8-bit characters that were not upgraded to UTF8 will remain
"bytes" in the resulting string, possibly causing unexpected errors.
If you pass data of different encodings, or you don't know whether the
encodings differ, force an upgrade before you pass the data on:
115
116 $csv->print ($fh, [ map { utf8::upgrade (my $x = $_); $x } @data ]);
117
118 For complete control over encoding, please use Text::CSV::Encoded:
119
120 use Text::CSV::Encoded;
121 my $csv = Text::CSV::Encoded->new ({
122 encoding_in => "iso-8859-1", # the encoding comes into Perl
123 encoding_out => "cp1252", # the encoding comes out of Perl
124 });
125
126 $csv = Text::CSV::Encoded->new ({ encoding => "utf8" });
127 # combine () and print () accept *literally* utf8 encoded data
128 # parse () and getline () return *literally* utf8 encoded data
129
130 $csv = Text::CSV::Encoded->new ({ encoding => undef }); # default
131 # combine () and print () accept UTF8 marked data
132 # parse () and getline () return UTF8 marked data
133
134 BOM
135 BOM (or Byte Order Mark) handling is available only inside the
136 "header" method. This method supports the following encodings:
137 "utf-8", "utf-1", "utf-32be", "utf-32le", "utf-16be", "utf-16le",
138 "utf-ebcdic", "scsu", "bocu-1", and "gb-18030". See Wikipedia
139 <https://en.wikipedia.org/wiki/Byte_order_mark>.
140
141 If a file has a BOM, the easiest way to deal with that is
142
143 my $aoh = csv (in => $file, detect_bom => 1);
144
All records will be decoded based on the detected BOM.
146
This implies a call to the "header" method, which by default also
sets the "column_names". So this is not the same as

my $aoh = csv (in => $file, headers => "auto");

which only reads the first record to set "column_names" but ignores
the meaning of any BOM that might be present.
154
SPECIFICATION

While no formal specification for CSV exists, RFC 4180
157 <http://tools.ietf.org/html/rfc4180> (1) describes the common format
158 and establishes "text/csv" as the MIME type registered with the IANA.
159 RFC 7111 <http://tools.ietf.org/html/rfc7111> (2) adds fragments to
160 CSV.
161
162 Many informal documents exist that describe the "CSV" format. "How
163 To: The Comma Separated Value (CSV) File Format"
164 <http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm> (3) provides an
165 overview of the "CSV" format in the most widely used applications and
166 explains how it can best be used and supported.
167
168 1) http://tools.ietf.org/html/rfc4180
169 2) http://tools.ietf.org/html/rfc7111
170 3) http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm
171
172 The basic rules are as follows:
173
174 CSV is a delimited data format that has fields/columns separated by
175 the comma character and records/rows separated by newlines. Fields that
176 contain a special character (comma, newline, or double quote), must be
177 enclosed in double quotes. However, if a line contains a single entry
178 that is the empty string, it may be enclosed in double quotes. If a
179 field's value contains a double quote character it is escaped by
180 placing another double quote character next to it. The "CSV" file
181 format does not require a specific character encoding, byte order, or
182 line terminator format.
183
184 · Each record is a single line ended by a line feed (ASCII/"LF"=0x0A)
185 or a carriage return and line feed pair (ASCII/"CRLF"="0x0D 0x0A"),
186 however, line-breaks may be embedded.
187
188 · Fields are separated by commas.
189
190 · Allowable characters within a "CSV" field include 0x09 ("TAB") and
191 the inclusive range of 0x20 (space) through 0x7E (tilde). In binary
192 mode all characters are accepted, at least in quoted fields.
193
194 · A field within "CSV" must be surrounded by double-quotes to
195 contain a separator character (comma).
196
Though this is the clearest and most restrictive definition,
Text::CSV_XS is way more liberal than this, and allows extensions:
199
200 · Line termination by a single carriage return is accepted by default
201
· The separation-, quotation-, and escape- characters can be any
ASCII character in the range from 0x20 (space) to 0x7E (tilde).
204 Characters outside this range may or may not work as expected.
205 Multibyte characters, like UTF "U+060C" (ARABIC COMMA), "U+FF0C"
206 (FULLWIDTH COMMA), "U+241B" (SYMBOL FOR ESCAPE), "U+2424" (SYMBOL
207 FOR NEWLINE), "U+FF02" (FULLWIDTH QUOTATION MARK), and "U+201C" (LEFT
208 DOUBLE QUOTATION MARK) (to give some examples of what might look
209 promising) work for newer versions of perl for "sep_char", and
210 "quote_char" but not for "escape_char".
211
212 If you use perl-5.8.2 or higher these three attributes are
213 utf8-decoded, to increase the likelihood of success. This way
214 "U+00FE" will be allowed as a quote character.
215
216 · A field in "CSV" must be surrounded by double-quotes to make an
217 embedded double-quote, represented by a pair of consecutive double-
218 quotes, valid. In binary mode you may additionally use the sequence
219 ""0" for representation of a NULL byte. Using 0x00 in binary mode is
220 just as valid.
221
222 · Several violations of the above specification may be lifted by
223 passing some options as attributes to the object constructor.
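As a short hand-written illustration of these rules (the data values
here are made up), a record with an embedded separator and a doubled
quote parses like this:

    my $csv = Text::CSV_XS->new ({ binary => 1 });
    $csv->parse (qq{1,"Hello, ""World""",3}) or die "" . $csv->error_diag;
    my @fields = $csv->fields;   # ("1", 'Hello, "World"', "3")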
224
METHODS

version
(Class method) Returns the current module version.
228
229 new
230 (Class method) Returns a new instance of class Text::CSV_XS. The
231 attributes are described by the (optional) hash ref "\%attr".
232
233 my $csv = Text::CSV_XS->new ({ attributes ... });
234
235 The following attributes are available:
236
237 eol
238
239 my $csv = Text::CSV_XS->new ({ eol => $/ });
240 $csv->eol (undef);
241 my $eol = $csv->eol;
242
243 The end-of-line string to add to rows for "print" or the record
244 separator for "getline".
245
When not passed in a parser instance, the default behavior is to
accept "\n", "\r", and "\r\n", so it is probably safer to not specify
"eol" at all. Passing "undef" or the empty string behaves the same.
249
250 When not passed in a generating instance, records are not terminated
251 at all, so it is probably wise to pass something you expect. A safe
252 choice for "eol" on output is either $/ or "\r\n".
253
254 Common values for "eol" are "\012" ("\n" or Line Feed), "\015\012"
255 ("\r\n" or Carriage Return, Line Feed), and "\015" ("\r" or Carriage
256 Return). The "eol" attribute cannot exceed 7 (ASCII) characters.
257
If both $/ and "eol" equal "\015", parsing lines that end on only a
Carriage Return without Line Feed will be "parse"d correctly.
260
261 sep_char
262
263 my $csv = Text::CSV_XS->new ({ sep_char => ";" });
264 $csv->sep_char (";");
265 my $c = $csv->sep_char;
266
The char used to separate fields, by default a comma (","). Limited
to a single-byte character, usually in the range from 0x20 (space) to
0x7E (tilde). When longer sequences are required, use "sep".
270
271 The separation character can not be equal to the quote character or to
272 the escape character.
273
274 See also "CAVEATS"
275
276 sep
277
278 my $csv = Text::CSV_XS->new ({ sep => "\N{FULLWIDTH COMMA}" });
279 $csv->sep (";");
280 my $sep = $csv->sep;
281
282 The chars used to separate fields, by default undefined. Limited to 8
283 bytes.
284
285 When set, overrules "sep_char". If its length is one byte it acts as
286 an alias to "sep_char".
287
288 See also "CAVEATS"
289
290 quote_char
291
292 my $csv = Text::CSV_XS->new ({ quote_char => "'" });
293 $csv->quote_char (undef);
294 my $c = $csv->quote_char;
295
296 The character to quote fields containing blanks or binary data, by
297 default the double quote character ("""). A value of undef suppresses
298 quote chars (for simple cases only). Limited to a single-byte
299 character, usually in the range from 0x20 (space) to 0x7E (tilde).
300 When longer sequences are required, use "quote".
301
302 "quote_char" can not be equal to "sep_char".
303
304 quote
305
306 my $csv = Text::CSV_XS->new ({ quote => "\N{FULLWIDTH QUOTATION MARK}" });
307 $csv->quote ("'");
308 my $quote = $csv->quote;
309
310 The chars used to quote fields, by default undefined. Limited to 8
311 bytes.
312
313 When set, overrules "quote_char". If its length is one byte it acts as
314 an alias to "quote_char".
315
316 See also "CAVEATS"
317
318 escape_char
319
320 my $csv = Text::CSV_XS->new ({ escape_char => "\\" });
321 $csv->escape_char (":");
322 my $c = $csv->escape_char;
323
324 The character to escape certain characters inside quoted fields.
325 This is limited to a single-byte character, usually in the range
326 from 0x20 (space) to 0x7E (tilde).
327
328 The "escape_char" defaults to being the double-quote mark ("""). In
329 other words the same as the default "quote_char". This means that
330 doubling the quote mark in a field escapes it:
331
332 "foo","bar","Escape ""quote mark"" with two ""quote marks""","baz"
333
334 If you change the "quote_char" without changing the
335 "escape_char", the "escape_char" will still be the double-quote
336 ("""). If instead you want to escape the "quote_char" by doubling it
337 you will need to also change the "escape_char" to be the same as what
338 you have changed the "quote_char" to.
339
Setting "escape_char" to "undef" or "" will disable escaping completely
and is greatly discouraged. This will also disable "escape_null".
342
343 The escape character can not be equal to the separation character.
344
345 binary
346
347 my $csv = Text::CSV_XS->new ({ binary => 1 });
348 $csv->binary (0);
349 my $f = $csv->binary;
350
351 If this attribute is 1, you may use binary characters in quoted
352 fields, including line feeds, carriage returns and "NULL" bytes. (The
353 latter could be escaped as ""0".) By default this feature is off.
354
355 If a string is marked UTF8, "binary" will be turned on automatically
356 when binary characters other than "CR" and "NL" are encountered. Note
357 that a simple string like "\x{00a0}" might still be binary, but not
358 marked UTF8, so setting "{ binary => 1 }" is still a wise option.
359
360 strict
361
362 my $csv = Text::CSV_XS->new ({ strict => 1 });
363 $csv->strict (0);
364 my $f = $csv->strict;
365
366 If this attribute is set to 1, any row that parses to a different
367 number of fields than the previous row will cause the parser to throw
368 error 2014.
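A minimal sketch of how this surfaces in a read loop (the file name
is hypothetical); with "auto_diag" set to 2 the mismatch is fatal:

    my $csv = Text::CSV_XS->new ({ binary => 1, strict => 1, auto_diag => 2 });
    open my $fh, "<", "ragged.csv" or die "ragged.csv: $!";
    while (my $row = $csv->getline ($fh)) {
        # dies with error 2014 on the first row whose field count differs
        }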
369
370 formula_handling
371
372 formula
373
374 my $csv = Text::CSV_XS->new ({ formula => "none" });
375 $csv->formula ("none");
376 my $f = $csv->formula;
377
378 This defines the behavior of fields containing formulas. As formulas
379 are considered dangerous in spreadsheets, this attribute can define an
380 optional action to be taken if a field starts with an equal sign ("=").
381
For the purpose of code readability, this can also be written as
383
384 my $csv = Text::CSV_XS->new ({ formula_handling => "none" });
385 $csv->formula_handling ("none");
386 my $f = $csv->formula_handling;
387
388 Possible values for this attribute are
389
390 none
391 Take no specific action. This is the default.
392
393 $csv->formula ("none");
394
395 die
396 Cause the process to "die" whenever a leading "=" is encountered.
397
398 $csv->formula ("die");
399
400 croak
401 Cause the process to "croak" whenever a leading "=" is encountered.
402 (See Carp)
403
404 $csv->formula ("croak");
405
406 diag
407 Report position and content of the field whenever a leading "=" is
408 found. The value of the field is unchanged.
409
410 $csv->formula ("diag");
411
412 empty
413 Replace the content of fields that start with a "=" with the empty
414 string.
415
416 $csv->formula ("empty");
417 $csv->formula ("");
418
419 undef
420 Replace the content of fields that start with a "=" with "undef".
421
422 $csv->formula ("undef");
423 $csv->formula (undef);
424
All other values will give a warning and then fall back to "diag".
426
427 decode_utf8
428
429 my $csv = Text::CSV_XS->new ({ decode_utf8 => 1 });
430 $csv->decode_utf8 (0);
431 my $f = $csv->decode_utf8;
432
This attribute defaults to TRUE.

While parsing, fields that are valid UTF-8 are automatically set to
be UTF-8, so that
437
438 $csv->parse ("\xC4\xA8\n");
439
440 results in
441
442 PV("\304\250"\0) [UTF8 "\x{128}"]
443
Sometimes this might not be the desired behavior. To prevent these
upgrades, set this attribute to false, and the result will be
446
447 PV("\304\250"\0)
448
449 auto_diag
450
451 my $csv = Text::CSV_XS->new ({ auto_diag => 1 });
452 $csv->auto_diag (2);
453 my $l = $csv->auto_diag;
454
Setting this attribute to a number between 1 and 9 causes "error_diag"
to be automatically called in void context upon errors.
457
458 In case of error "2012 - EOF", this call will be void.
459
460 If "auto_diag" is set to a numeric value greater than 1, it will "die"
461 on errors instead of "warn". If set to anything unrecognized, it will
462 be silently ignored.
463
Future extensions to this feature will include more reliable auto-
detection of "autodie" being active in the scope in which the error
occurred, which will increment the value of "auto_diag" by 1 the
moment the error is detected.
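A small sketch (not from the original text) of the difference between
the two levels:

    Text::CSV_XS->new ({ auto_diag => 1 })->parse (q{1,"not terminated});
    # warn ()s the diagnostic and returns false

    Text::CSV_XS->new ({ auto_diag => 2 })->parse (q{1,"not terminated});
    # die ()s with the same diagnostic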
468
469 diag_verbose
470
471 my $csv = Text::CSV_XS->new ({ diag_verbose => 1 });
472 $csv->diag_verbose (2);
473 my $l = $csv->diag_verbose;
474
475 Set the verbosity of the output triggered by "auto_diag". Currently
476 only adds the current input-record-number (if known) to the
477 diagnostic output with an indication of the position of the error.
478
479 blank_is_undef
480
481 my $csv = Text::CSV_XS->new ({ blank_is_undef => 1 });
482 $csv->blank_is_undef (0);
483 my $f = $csv->blank_is_undef;
484
485 Under normal circumstances, "CSV" data makes no distinction between
486 quoted- and unquoted empty fields. These both end up in an empty
487 string field once read, thus
488
489 1,"",," ",2
490
491 is read as
492
493 ("1", "", "", " ", "2")
494
495 When writing "CSV" files with either "always_quote" or "quote_empty"
496 set, the unquoted empty field is the result of an undefined value.
497 To enable this distinction when reading "CSV" data, the
498 "blank_is_undef" attribute will cause unquoted empty fields to be set
499 to "undef", causing the above to be parsed as
500
501 ("1", "", undef, " ", "2")
502
Note that this is specifically important when loading "CSV" fields
into a database that allows "NULL" values, as the perl equivalent for
"NULL" is "undef" in DBI land.
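A sketch of that DBI use case, assuming an already connected $dbh and
an open $fh; the table and column names are hypothetical:

    my $csv = Text::CSV_XS->new ({ binary => 1, blank_is_undef => 1 });
    my $sth = $dbh->prepare ("insert into items (id, code, name) values (?, ?, ?)");
    while (my $row = $csv->getline ($fh)) {
        $sth->execute (@$row);   # unquoted empty fields arrive as undef => NULL
        }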
506
507 empty_is_undef
508
509 my $csv = Text::CSV_XS->new ({ empty_is_undef => 1 });
510 $csv->empty_is_undef (0);
511 my $f = $csv->empty_is_undef;
512
513 Going one step further than "blank_is_undef", this attribute
514 converts all empty fields to "undef", so
515
516 1,"",," ",2
517
518 is read as
519
520 (1, undef, undef, " ", 2)
521
Note that this affects only fields that are originally empty, not
fields that are empty after stripping allowed whitespace. YMMV.
524
525 allow_whitespace
526
527 my $csv = Text::CSV_XS->new ({ allow_whitespace => 1 });
528 $csv->allow_whitespace (0);
529 my $f = $csv->allow_whitespace;
530
531 When this option is set to true, the whitespace ("TAB"'s and
532 "SPACE"'s) surrounding the separation character is removed when
533 parsing. If either "TAB" or "SPACE" is one of the three characters
534 "sep_char", "quote_char", or "escape_char" it will not be considered
535 whitespace.
536
537 Now lines like:
538
539 1 , "foo" , bar , 3 , zapp
540
541 are parsed as valid "CSV", even though it violates the "CSV" specs.
542
Note that all whitespace is stripped from both the start and the end
of each field. That makes this attribute more than just a feature to
enable parsing bad "CSV" lines, as
546
547 1, 2.0, 3, ape , monkey
548
549 will now be parsed as
550
551 ("1", "2.0", "3", "ape", "monkey")
552
553 even if the original line was perfectly acceptable "CSV".
554
555 allow_loose_quotes
556
557 my $csv = Text::CSV_XS->new ({ allow_loose_quotes => 1 });
558 $csv->allow_loose_quotes (0);
559 my $f = $csv->allow_loose_quotes;
560
561 By default, parsing unquoted fields containing "quote_char" characters
562 like
563
564 1,foo "bar" baz,42
565
566 would result in parse error 2034. Though it is still bad practice to
567 allow this format, we cannot help the fact that some vendors
568 make their applications spit out lines styled this way.
569
570 If there is really bad "CSV" data, like
571
572 1,"foo "bar" baz",42
573
574 or
575
576 1,""foo bar baz"",42
577
578 there is a way to get this data-line parsed and leave the quotes inside
579 the quoted field as-is. This can be achieved by setting
580 "allow_loose_quotes" AND making sure that the "escape_char" is not
581 equal to "quote_char".
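A sketch of that combination, using the bad data-line from above:

    my $csv = Text::CSV_XS->new ({
        binary             => 1,
        allow_loose_quotes => 1,
        escape_char        => "\\",   # anything different from quote_char
        });
    $csv->parse (q{1,"foo "bar" baz",42});
    # the second field is now: foo "bar" baz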
582
583 allow_loose_escapes
584
585 my $csv = Text::CSV_XS->new ({ allow_loose_escapes => 1 });
586 $csv->allow_loose_escapes (0);
587 my $f = $csv->allow_loose_escapes;
588
589 Parsing fields that have "escape_char" characters that escape
590 characters that do not need to be escaped, like:
591
592 my $csv = Text::CSV_XS->new ({ escape_char => "\\" });
593 $csv->parse (qq{1,"my bar\'s",baz,42});
594
595 would result in parse error 2025. Though it is bad practice to allow
596 this format, this attribute enables you to treat all escape character
597 sequences equal.
598
599 allow_unquoted_escape
600
601 my $csv = Text::CSV_XS->new ({ allow_unquoted_escape => 1 });
602 $csv->allow_unquoted_escape (0);
603 my $f = $csv->allow_unquoted_escape;
604
A backward compatibility issue where "escape_char" differs from
"quote_char" prevents "escape_char" from appearing in the first
position of a field. If "quote_char" is equal to the default """ and
"escape_char" is set to "\", this would be illegal:
609
610 1,\0,2
611
612 Setting this attribute to 1 might help to overcome issues with
613 backward compatibility and allow this style.
614
615 always_quote
616
617 my $csv = Text::CSV_XS->new ({ always_quote => 1 });
618 $csv->always_quote (0);
619 my $f = $csv->always_quote;
620
By default the generated fields are quoted only if they need to be,
for example if they contain the separator character. If you set this
attribute to 1 then all defined fields will be quoted. ("undef" fields
are not quoted, see "blank_is_undef"). This quite often makes it
easier to handle exported data in external applications. (Poor
creatures who are better off using Text::CSV_XS. :)
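A minimal sketch of the effect on output:

    my $csv = Text::CSV_XS->new ({ always_quote => 1 });
    $csv->combine (1, "foo", "with, comma") or die "combine () failed";
    print $csv->string, "\n";   # "1","foo","with, comma"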
627
628 quote_space
629
630 my $csv = Text::CSV_XS->new ({ quote_space => 1 });
631 $csv->quote_space (0);
632 my $f = $csv->quote_space;
633
By default, a space in a field would trigger quotation. As no rule in
"CSV" requires this to be forced, nor any for the opposite, the
default is true for safety. You can exclude the space from this
trigger by setting this attribute to 0.
638
639 quote_empty
640
641 my $csv = Text::CSV_XS->new ({ quote_empty => 1 });
642 $csv->quote_empty (0);
643 my $f = $csv->quote_empty;
644
645 By default the generated fields are quoted only if they need to be.
646 An empty (defined) field does not need quotation. If you set this
647 attribute to 1 then empty defined fields will be quoted. ("undef"
648 fields are not quoted, see "blank_is_undef"). See also "always_quote".
649
650 quote_binary
651
652 my $csv = Text::CSV_XS->new ({ quote_binary => 1 });
653 $csv->quote_binary (0);
654 my $f = $csv->quote_binary;
655
656 By default, all "unsafe" bytes inside a string cause the combined
657 field to be quoted. By setting this attribute to 0, you can disable
658 that trigger for bytes >= 0x7F.
659
660 escape_null
661
662 my $csv = Text::CSV_XS->new ({ escape_null => 1 });
663 $csv->escape_null (0);
664 my $f = $csv->escape_null;
665
By default, a "NULL" byte in a field would be escaped. This option
enables you to treat the "NULL" byte as a simple binary character in
binary mode (when "{ binary => 1 }" is set). The default is true. You
can prevent "NULL" escapes by setting this attribute to 0.
670
671 When the "escape_char" attribute is set to undefined, this attribute
672 will be set to false.
673
674 The default setting will encode "=\x00=" as
675
676 "="0="
677
678 With "escape_null" set, this will result in
679
680 "=\x00="
681
682 The default when using the "csv" function is "false".
683
684 For backward compatibility reasons, the deprecated old name
685 "quote_null" is still recognized.
686
687 keep_meta_info
688
689 my $csv = Text::CSV_XS->new ({ keep_meta_info => 1 });
690 $csv->keep_meta_info (0);
691 my $f = $csv->keep_meta_info;
692
693 By default, the parsing of input records is as simple and fast as
694 possible. However, some parsing information - like quotation of the
695 original field - is lost in that process. Setting this flag to true
696 enables retrieving that information after parsing with the methods
697 "meta_info", "is_quoted", and "is_binary" described below. Default is
698 false for performance.
699
If you set this attribute to a value greater than 9, then you can
control the output quotation style to match what was used in the input
of the last parsed record (unless quotation was added because of other
reasons).
704
705 my $csv = Text::CSV_XS->new ({
706 binary => 1,
707 keep_meta_info => 1,
708 quote_space => 0,
709 });
710
$csv->parse (q{1,,"", ," ",f,"g","h""h",help,"help"});
my @row = $csv->fields;

$csv->print (*STDOUT, \@row);
714 # 1,,, , ,f,g,"h""h",help,help
715 $csv->keep_meta_info (11);
716 $csv->print (*STDOUT, \@row);
717 # 1,,"", ," ",f,"g","h""h",help,"help"
718
719 undef_str
720
721 my $csv = Text::CSV_XS->new ({ undef_str => "\\N" });
722 $csv->undef_str (undef);
723 my $s = $csv->undef_str;
724
725 This attribute optionally defines the output of undefined fields. The
726 value passed is not changed at all, so if it needs quotation, the
727 quotation needs to be included in the value of the attribute. Use with
728 caution, as passing a value like ",",,,,""" will for sure mess up
729 your output. The default for this attribute is "undef", meaning no
730 special treatment.
731
732 This attribute is useful when exporting CSV data to be imported in
733 custom loaders, like for MySQL, that recognize special sequences for
734 "NULL" data.
735
736 This attribute has no meaning when parsing CSV data.
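A sketch of exporting "undef" fields as the MySQL-style "\N" sequence;
the output file name is hypothetical:

    use Text::CSV_XS qw( csv );

    my $aoa = [ [ 1, "foo", undef ], [ 2, undef, "bar" ] ];
    csv (in => $aoa, out => "load.csv", undef_str => "\\N");
    # 1,foo,\N
    # 2,\N,bar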
737
738 verbatim
739
740 my $csv = Text::CSV_XS->new ({ verbatim => 1 });
741 $csv->verbatim (0);
742 my $f = $csv->verbatim;
743
744 This is a quite controversial attribute to set, but makes some hard
745 things possible.
746
747 The rationale behind this attribute is to tell the parser that the
748 normally special characters newline ("NL") and Carriage Return ("CR")
749 will not be special when this flag is set, and be dealt with as being
750 ordinary binary characters. This will ease working with data with
751 embedded newlines.
752
753 When "verbatim" is used with "getline", "getline" auto-"chomp"'s
754 every line.
755
756 Imagine a file format like
757
758 M^^Hans^Janssen^Klas 2\n2A^Ja^11-06-2007#\r\n
759
where the line ending is a very specific "#\r\n", and the sep_char is
a "^" (caret). None of the fields is quoted, but embedded binary
data is likely to be present. With the specific line ending, this
should not be too hard to detect.
764
By default, Text::CSV_XS' parse function is instructed to only accept
"\n" and "\r" as legal line endings, and so has to deal with
the embedded newline as a real "end-of-line", so it can scan the next
line if binary is true, and the newline is inside a quoted field. With
this option, we tell "parse" to treat "\n" as nothing more than a
binary character.
771
772 For "parse" this means that the parser has no more idea about line
773 ending and "getline" "chomp"s line endings on reading.
774
775 types
776
777 A set of column types; the attribute is immediately passed to the
778 "types" method.
779
780 callbacks
781
782 See the "Callbacks" section below.
783
784 accessors
785
786 To sum it up,
787
788 $csv = Text::CSV_XS->new ();
789
790 is equivalent to
791
792 $csv = Text::CSV_XS->new ({
793 eol => undef, # \r, \n, or \r\n
794 sep_char => ',',
795 sep => undef,
796 quote_char => '"',
797 quote => undef,
798 escape_char => '"',
799 binary => 0,
800 decode_utf8 => 1,
801 auto_diag => 0,
802 diag_verbose => 0,
803 blank_is_undef => 0,
804 empty_is_undef => 0,
805 allow_whitespace => 0,
806 allow_loose_quotes => 0,
807 allow_loose_escapes => 0,
808 allow_unquoted_escape => 0,
809 always_quote => 0,
810 quote_empty => 0,
811 quote_space => 1,
812 escape_null => 1,
813 quote_binary => 1,
814 keep_meta_info => 0,
815 strict => 0,
816 formula => 0,
817 verbatim => 0,
818 undef_str => undef,
819 types => undef,
820 callbacks => undef,
821 });
822
823 For all of the above mentioned flags, an accessor method is available
824 where you can inquire the current value, or change the value
825
826 my $quote = $csv->quote_char;
827 $csv->binary (1);
828
829 It is not wise to change these settings halfway through writing "CSV"
830 data to a stream. If however you want to create a new stream using the
831 available "CSV" object, there is no harm in changing them.
832
833 If the "new" constructor call fails, it returns "undef", and makes
834 the fail reason available through the "error_diag" method.
835
836 $csv = Text::CSV_XS->new ({ ecs_char => 1 }) or
837 die "".Text::CSV_XS->error_diag ();
838
839 "error_diag" will return a string like
840
841 "INI - Unknown attribute 'ecs_char'"
842
843 known_attributes
844 @attr = Text::CSV_XS->known_attributes;
845 @attr = Text::CSV_XS::known_attributes;
846 @attr = $csv->known_attributes;
847
848 This method will return an ordered list of all the supported
849 attributes as described above. This can be useful for knowing what
850 attributes are valid in classes that use or extend Text::CSV_XS.
851
852 print
853 $status = $csv->print ($fh, $colref);
854
855 Similar to "combine" + "string" + "print", but much more efficient.
856 It expects an array ref as input (not an array!) and the resulting
857 string is not really created, but immediately written to the $fh
858 object, typically an IO handle or any other object that offers a
859 "print" method.
860
861 For performance reasons "print" does not create a result string, so
862 all "string", "status", "fields", and "error_input" methods will return
863 undefined information after executing this method.
864
865 If $colref is "undef" (explicit, not through a variable argument) and
866 "bind_columns" was used to specify fields to be printed, it is
867 possible to make performance improvements, as otherwise data would have
868 to be copied as arguments to the method call:
869
870 $csv->bind_columns (\($foo, $bar));
871 $status = $csv->print ($fh, undef);
872
873 A short benchmark
874
875 my @data = ("aa" .. "zz");
876 $csv->bind_columns (\(@data));
877
878 $csv->print ($fh, [ @data ]); # 11800 recs/sec
879 $csv->print ($fh, \@data ); # 57600 recs/sec
880 $csv->print ($fh, undef ); # 48500 recs/sec
881
882 say
883 $status = $csv->say ($fh, $colref);
884
885 Like "print", but "eol" defaults to "$\".
886
887 print_hr
888 $csv->print_hr ($fh, $ref);
889
890 Provides an easy way to print a $ref (as fetched with "getline_hr")
891 provided the column names are set with "column_names".
892
893 It is just a wrapper method with basic parameter checks over
894
895 $csv->print ($fh, [ map { $ref->{$_} } $csv->column_names ]);
896
897 combine
898 $status = $csv->combine (@fields);
899
900 This method constructs a "CSV" record from @fields, returning success
901 or failure. Failure can result from lack of arguments or an argument
902 that contains an invalid character. Upon success, "string" can be
903 called to retrieve the resultant "CSV" string. Upon failure, the
904 value returned by "string" is undefined and "error_input" could be
905 called to retrieve the invalid argument.
906
907 string
908 $line = $csv->string ();
909
910 This method returns the input to "parse" or the resultant "CSV"
911 string of "combine", whichever was called more recently.
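A minimal round-trip sketch of "combine" and "string":

    my $csv = Text::CSV_XS->new ({ binary => 1 });
    $csv->combine ("abc", "def,ghi", 42) or die "combine () failed";
    my $line = $csv->string;   # abc,"def,ghi",42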
912
913 getline
914 $colref = $csv->getline ($fh);
915
916 This is the counterpart to "print", as "parse" is the counterpart to
917 "combine": it parses a row from the $fh handle using the "getline"
918 method associated with $fh and parses this row into an array ref.
919 This array ref is returned by the function or "undef" for failure.
920 When $fh does not support "getline", you are likely to hit errors.
921
922 When fields are bound with "bind_columns" the return value is a
923 reference to an empty list.
924
925 The "string", "fields", and "status" methods are meaningless again.
926
927 getline_all
928 $arrayref = $csv->getline_all ($fh);
929 $arrayref = $csv->getline_all ($fh, $offset);
930 $arrayref = $csv->getline_all ($fh, $offset, $length);
931
932 This will return a reference to a list of getline ($fh) results. In
933 this call, "keep_meta_info" is disabled. If $offset is negative, as
934 with "splice", only the last "abs ($offset)" records of $fh are taken
935 into consideration.
936
937 Given a CSV file with 10 lines:
938
939 lines call
940 ----- ---------------------------------------------------------
941 0..9 $csv->getline_all ($fh) # all
942 0..9 $csv->getline_all ($fh, 0) # all
943 8..9 $csv->getline_all ($fh, 8) # start at 8
944 - $csv->getline_all ($fh, 0, 0) # start at 0 first 0 rows
945 0..4 $csv->getline_all ($fh, 0, 5) # start at 0 first 5 rows
946 4..5 $csv->getline_all ($fh, 4, 2) # start at 4 first 2 rows
947 8..9 $csv->getline_all ($fh, -2) # last 2 rows
948 6..7 $csv->getline_all ($fh, -4, 2) # first 2 of last 4 rows
949
950 getline_hr
951 The "getline_hr" and "column_names" methods work together to allow you
952 to have rows returned as hashrefs. You must call "column_names" first
953 to declare your column names.
954
955 $csv->column_names (qw( code name price description ));
956 $hr = $csv->getline_hr ($fh);
957 print "Price for $hr->{name} is $hr->{price} EUR\n";
958
959 "getline_hr" will croak if called before "column_names".
960
Note that "getline_hr" creates a hashref for every row and will be
much slower than the combined use of "bind_columns" and "getline",
while still offering the same easy-to-use hashref inside the loop:
964
965 my @cols = @{$csv->getline ($fh)};
966 $csv->column_names (@cols);
967 while (my $row = $csv->getline_hr ($fh)) {
968 print $row->{price};
969 }
970
971 Could easily be rewritten to the much faster:
972
973 my @cols = @{$csv->getline ($fh)};
974 my $row = {};
975 $csv->bind_columns (\@{$row}{@cols});
976 while ($csv->getline ($fh)) {
977 print $row->{price};
978 }
979
Your mileage may vary for the size of the data and the number of rows.
With perl-5.14.2 the comparison for a 100_000 line file with 14
columns:
982
983 Rate hashrefs getlines
984 hashrefs 1.00/s -- -76%
985 getlines 4.15/s 313% --
986
987 getline_hr_all
988 $arrayref = $csv->getline_hr_all ($fh);
989 $arrayref = $csv->getline_hr_all ($fh, $offset);
990 $arrayref = $csv->getline_hr_all ($fh, $offset, $length);
991
992 This will return a reference to a list of getline_hr ($fh) results.
993 In this call, "keep_meta_info" is disabled.
994
995 parse
996 $status = $csv->parse ($line);
997
998 This method decomposes a "CSV" string into fields, returning success
or failure. Failure can result from a lack of argument or from the
given "CSV" string being improperly formatted. Upon success, "fields"
can be
1001 called to retrieve the decomposed fields. Upon failure calling "fields"
1002 will return undefined data and "error_input" can be called to
1003 retrieve the invalid argument.
1004
1005 You may use the "types" method for setting column types. See "types"'
1006 description below.
1007
1008 The $line argument is supposed to be a simple scalar. Everything else
1009 is supposed to croak and set error 1500.
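A minimal sketch of parsing a single line and fetching the fields:

    my $csv = Text::CSV_XS->new ({ binary => 1 });
    if ($csv->parse (q{1,"foo bar",baz})) {
        my @fields = $csv->fields;   # ("1", "foo bar", "baz")
        }
    else {
        warn "parse () failed: " . $csv->error_diag . "\n";
        }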
1010
1011 fragment
1012 This function tries to implement RFC7111 (URI Fragment Identifiers for
1013 the text/csv Media Type) - http://tools.ietf.org/html/rfc7111
1014
1015 my $AoA = $csv->fragment ($fh, $spec);
1016
1017 In specifications, "*" is used to specify the last item, a dash ("-")
1018 to indicate a range. All indices are 1-based: the first row or
1019 column has index 1. Selections can be combined with the semi-colon
1020 (";").
1021
1022 When using this method in combination with "column_names", the
1023 returned reference will point to a list of hashes instead of a list
1024 of lists. A disjointed cell-based combined selection might return
rows with different numbers of columns, making the use of hashes
unpredictable.
1027
1028 $csv->column_names ("Name", "Age");
1029 my $AoH = $csv->fragment ($fh, "col=3;8");
1030
1031 If the "after_parse" callback is active, it is also called on every
1032 line parsed and skipped before the fragment.
1033
1034 row
1035 row=4
1036 row=5-7
1037 row=6-*
1038 row=1-2;4;6-*
1039
1040 col
1041 col=2
1042 col=1-3
1043 col=4-*
1044 col=1-2;4;7-*
1045
1046 cell
1047 In cell-based selection, the comma (",") is used to pair row and
1048 column
1049
1050 cell=4,1
1051
1052 The range operator ("-") using "cell"s can be used to define top-left
1053 and bottom-right "cell" location
1054
1055 cell=3,1-4,6
1056
1057 The "*" is only allowed in the second part of a pair
1058
1059 cell=3,2-*,2 # row 3 till end, only column 2
1060 cell=3,2-3,* # column 2 till end, only row 3
1061 cell=3,2-*,* # strip row 1 and 2, and column 1
1062
Cells and cell ranges may be combined with ";", possibly resulting in
rows with different numbers of columns
1065
1066 cell=1,1-2,2;3,3-4,4;1,4;4,1
1067
1068 Disjointed selections will only return selected cells. The cells
1069 that are not specified will not be included in the returned
set, not even as "undef". As an example, given a "CSV" like
1071
1072 11,12,13,...19
1073 21,22,...28,29
1074 : :
1075 91,...97,98,99
1076
1077 with "cell=1,1-2,2;3,3-4,4;1,4;4,1" will return:
1078
1079 11,12,14
1080 21,22
1081 33,34
1082 41,43,44
1083
Overlapping cell-specs will return those cells only once. So
"cell=1,1-3,3;2,2-4,4;2,3;4,2" will return:
1086
1087 11,12,13
1088 21,22,23,24
1089 31,32,33,34
1090 42,43,44
1091
1092 RFC7111 <http://tools.ietf.org/html/rfc7111> does not allow different
1093 types of specs to be combined (either "row" or "col" or "cell").
1094 Passing an invalid fragment specification will croak and set error
1095 2013.
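A sketch that selects everything but the first record (the file name
is hypothetical):

    my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
    open my $fh, "<", "file.csv" or die "file.csv: $!";
    my $rows = $csv->fragment ($fh, "row=2-*");   # skip the header line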
1096
1097 column_names
1098 Set the "keys" that will be used in the "getline_hr" calls. If no
1099 keys (column names) are passed, it will return the current setting as a
1100 list.
1101
1102 "column_names" accepts a list of scalars (the column names) or a
1103 single array_ref, so you can pass the return value from "getline" too:
1104
1105 $csv->column_names ($csv->getline ($fh));
1106
1107 "column_names" does no checking on duplicates at all, which might lead
1108 to unexpected results. Undefined entries will be replaced with the
1109 string "\cAUNDEF\cA", so
1110
1111 $csv->column_names (undef, "", "name", "name");
1112 $hr = $csv->getline_hr ($fh);
1113
1114 Will set "$hr->{"\cAUNDEF\cA"}" to the 1st field, "$hr->{""}" to the
1115 2nd field, and "$hr->{name}" to the 4th field, discarding the 3rd
1116 field.
1117
1118 "column_names" croaks on invalid arguments.
1119
1120 header
1121 This method does NOT work in perl-5.6.x
1122
1123 Parse the CSV header and set "sep", column_names and encoding.
1124
1125 my @hdr = $csv->header ($fh);
1126 $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1127 $csv->header ($fh, { detect_bom => 1, munge_column_names => "lc" });
1128
1129 The first argument should be a file handle.
1130
This method resets some object properties, as it is supposed to be
invoked only once per file or stream. It will leave the attributes
"column_names" and "bound_columns" alone if setting column names is
disabled. Reading headers on previously processed objects might fail
on perl-5.8.0 and older.
1136
1137 Assuming that the file opened for parsing has a header, and the header
1138 does not contain problematic characters like embedded newlines, read
1139 the first line from the open handle then auto-detect whether the header
1140 separates the column names with a character from the allowed separator
1141 list.
1142
1143 If any of the allowed separators matches, and none of the other
1144 allowed separators match, set "sep" to that separator for the
1145 current CSV_XS instance and use it to parse the first line, map those
1146 to lowercase, and use that to set the instance "column_names":
1147
1148 my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
1149 open my $fh, "<", "file.csv";
1150 binmode $fh; # for Windows
1151 $csv->header ($fh);
1152 while (my $row = $csv->getline_hr ($fh)) {
1153 ...
1154 }
1155
1156 If the header is empty, contains more than one unique separator out of
1157 the allowed set, contains empty fields, or contains identical fields
1158 (after folding), it will croak with error 1010, 1011, 1012, or 1013
1159 respectively.
1160
1161 If the header contains embedded newlines or is not valid CSV in any
1162 other way, this method will croak and leave the parse error untouched.
1163
1164 A successful call to "header" will always set the "sep" of the $csv
1165 object. This behavior can not be disabled.
1166
1167 return value
1168
1169 On error this method will croak.
1170
1171 In list context, the headers will be returned whether they are used to
1172 set "column_names" or not.
1173
1174 In scalar context, the instance itself is returned. Note: the values
1175 as found in the header will effectively be lost if "set_column_names"
1176 is false.
1177
1178 Options
1179
1180 sep_set
1181 $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1182
1183 The list of legal separators defaults to "[ ";", "," ]" and can be
1184 changed by this option. As this is probably the most often used
1185 option, it can be passed on its own as an unnamed argument:
1186
1187 $csv->header ($fh, [ ";", ",", "|", "\t", "::", "\x{2063}" ]);
1188
1189 Multi-byte sequences are allowed, both multi-character and
1190 Unicode. See "sep".
1191
1192 detect_bom
1193 $csv->header ($fh, { detect_bom => 1 });
1194
1195 The default behavior is to detect if the header line starts with a
1196 BOM. If the header has a BOM, use that to set the encoding of $fh.
1197 This default behavior can be disabled by passing a false value to
1198 "detect_bom".
1199
1200 Supported encodings from BOM are: UTF-8, UTF-16BE, UTF-16LE,
1201 UTF-32BE, and UTF-32LE. BOM's also support UTF-1, UTF-EBCDIC, SCSU,
1202 BOCU-1, and GB-18030 but Encode does not (yet). UTF-7 is not
1203 supported.
1204
If a supported BOM was detected as start of the stream, it is stored
in the object attribute "ENCODING".
1207
1208 my $enc = $csv->{ENCODING};
1209
1210 The encoding is used with "binmode" on $fh.
1211
1212 If the handle was opened in a (correct) encoding, this method will
1213 not alter the encoding, as it checks the leading bytes of the first
line. In case the stream starts with a decoded BOM ("U+FEFF"),
"{ENCODING}" will be "" (empty) instead of the default "undef".
1216
1217 munge_column_names
1218 This option offers the means to modify the column names into
1219 something that is most useful to the application. The default is to
1220 map all column names to lower case.
1221
1222 $csv->header ($fh, { munge_column_names => "lc" });
1223
1224 The following values are available:
1225
1226 lc - lower case
1227 uc - upper case
1228 none - do not change
1229 \%hash - supply a mapping
1230 \&cb - supply a callback
1231
1232 Literal:
1233
1234 $csv->header ($fh, { munge_column_names => "none" });
1235
1236 Hash:
1237
$csv->header ($fh, { munge_column_names => { foo => "sombrero" }});
1239
If a value does not exist, the original value is used unchanged.
1241
1242 Callback:
1243
1244 $csv->header ($fh, { munge_column_names => sub { fc } });
1245 $csv->header ($fh, { munge_column_names => sub { "column_".$col++ } });
1246 $csv->header ($fh, { munge_column_names => sub { lc (s/\W+/_/gr) } });
1247
1248 As this callback is called in a "map", you can use $_ directly.
1249
1250 set_column_names
1251 $csv->header ($fh, { set_column_names => 1 });
1252
The default is to set the instance's column names using
"column_names" if the method is successful, so subsequent calls to
"getline_hr" can return a hash. Setting the column names can be
disabled by passing a false value for this option.
1257
1258 As described in "return value" above, content is lost in scalar
1259 context.
1260
1261 Validation
1262
1263 When receiving CSV files from external sources, this method can be
1264 used to protect against changes in the layout by restricting to known
1265 headers (and typos in the header fields).
1266
1267 my %known = (
1268 "record key" => "c_rec",
1269 "rec id" => "c_rec",
1270 "id_rec" => "c_rec",
1271 "kode" => "code",
1272 "code" => "code",
1273 "vaule" => "value",
1274 "value" => "value",
1275 );
1276 my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
1277 open my $fh, "<", $source or die "$source: $!";
1278 $csv->header ($fh, { munge_column_names => sub {
1279 s/\s+$//;
1280 s/^\s+//;
1281 $known{lc $_} or die "Unknown column '$_' in $source";
1282 }});
1283 while (my $row = $csv->getline_hr ($fh)) {
1284 say join "\t", $row->{c_rec}, $row->{code}, $row->{value};
1285 }
1286
1287 bind_columns
1288 Takes a list of scalar references to be used for output with "print"
1289 or to store in the fields fetched by "getline". When you do not pass
1290 enough references to store the fetched fields in, "getline" will fail
1291 with error 3006. If you pass more than there are fields to return,
1292 the content of the remaining references is left untouched.
1293
1294 $csv->bind_columns (\$code, \$name, \$price, \$description);
1295 while ($csv->getline ($fh)) {
1296 print "The price of a $name is \x{20ac} $price\n";
1297 }
1298
1299 To reset or clear all column binding, call "bind_columns" with the
1300 single argument "undef". This will also clear column names.
1301
1302 $csv->bind_columns (undef);
1303
1304 If no arguments are passed at all, "bind_columns" will return the list
1305 of current bindings or "undef" if no binds are active.
1306
1307 Note that in parsing with "bind_columns", the fields are set on the
1308 fly. That implies that if the third field of a row causes an error
1309 (or this row has just two fields where the previous row had more), the
1310 first two fields already have been assigned the values of the current
1311 row, while the rest of the fields will still hold the values of the
1312 previous row. If you want the parser to fail in these cases, use the
1313 "strict" attribute.
1314
1315 eof
1316 $eof = $csv->eof ();
1317
1318 If "parse" or "getline" was used with an IO stream, this method will
1319 return true (1) if the last call hit end of file, otherwise it will
1320 return false (''). This is useful to see the difference between a
1321 failure and end of file.
1322
1323 Note that if the parsing of the last line caused an error, "eof" is
1324 still true. That means that if you are not using "auto_diag", an idiom
1325 like
1326
1327 while (my $row = $csv->getline ($fh)) {
1328 # ...
1329 }
1330 $csv->eof or $csv->error_diag;
1331
1332 will not report the error. You would have to change that to
1333
1334 while (my $row = $csv->getline ($fh)) {
1335 # ...
1336 }
1337 +$csv->error_diag and $csv->error_diag;
1338
1339 types
1340 $csv->types (\@tref);
1341
1342 This method is used to force that (all) columns are of a given type.
1343 For example, if you have an integer column, two columns with
1344 doubles and a string column, then you might do a
1345
1346 $csv->types ([Text::CSV_XS::IV (),
1347 Text::CSV_XS::NV (),
1348 Text::CSV_XS::NV (),
1349 Text::CSV_XS::PV ()]);
1350
1351 Column types are used only for decoding columns while parsing, in
1352 other words by the "parse" and "getline" methods.
1353
1354 You can unset column types by doing a
1355
1356 $csv->types (undef);
1357
1358 or fetch the current type settings with
1359
1360 $types = $csv->types ();
1361
1362 IV Set field type to integer.
1363
1364 NV Set field type to numeric/float.
1365
1366 PV Set field type to string.
1367
1368 fields
1369 @columns = $csv->fields ();
1370
1371 This method returns the input to "combine" or the resultant
1372 decomposed fields of a successful "parse", whichever was called more
1373 recently.
1374
1375 Note that the return value is undefined after using "getline", which
1376 does not fill the data structures returned by "parse".
1377
1378 meta_info
1379 @flags = $csv->meta_info ();
1380
1381 This method returns the "flags" of the input to "combine" or the flags
1382 of the resultant decomposed fields of "parse", whichever was called
1383 more recently.
1384
1385 For each field, a meta_info field will hold flags that inform
1386 something about the field returned by the "fields" method or
1387 passed to the "combine" method. The flags are bit-wise-"or"'d like:
1388
0x0001
The field was quoted.

0x0002
The field was binary.
1394
1395 See the "is_***" methods below.
1396
1397 is_quoted
1398 my $quoted = $csv->is_quoted ($column_idx);
1399
1400 Where $column_idx is the (zero-based) index of the column in the
1401 last result of "parse".
1402
1403 This returns a true value if the data in the indicated column was
1404 enclosed in "quote_char" quotes. This might be important for fields
1405 where content ",20070108," is to be treated as a numeric value, and
1406 where ","20070108"," is explicitly marked as character string data.
1407
1408 This method is only valid when "keep_meta_info" is set to a true value.
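A minimal sketch of that distinction:

    my $csv = Text::CSV_XS->new ({ keep_meta_info => 1 });
    $csv->parse (q{20070108,"20070108"});
    $csv->is_quoted (0);   # false: left as a number-like value
    $csv->is_quoted (1);   # true:  explicitly marked as string data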
1409
1410 is_binary
1411 my $binary = $csv->is_binary ($column_idx);
1412
1413 Where $column_idx is the (zero-based) index of the column in the
1414 last result of "parse".
1415
1416 This returns a true value if the data in the indicated column contained
1417 any byte in the range "[\x00-\x08,\x10-\x1F,\x7F-\xFF]".
1418
1419 This method is only valid when "keep_meta_info" is set to a true value.
1420
1421 is_missing
1422 my $missing = $csv->is_missing ($column_idx);
1423
1424 Where $column_idx is the (zero-based) index of the column in the
1425 last result of "getline_hr".
1426
1427 $csv->keep_meta_info (1);
1428 while (my $hr = $csv->getline_hr ($fh)) {
1429 $csv->is_missing (0) and next; # This was an empty line
1430 }
1431
When using "getline_hr", it is impossible to tell if the parsed
fields are "undef" because they were not filled in the "CSV" stream
or because they were not read at all, as all the fields defined by
1435 "column_names" are set in the hash-ref. If you still need to know if
1436 all fields in each row are provided, you should enable "keep_meta_info"
1437 so you can check the flags.
1438
1439 If "keep_meta_info" is "false", "is_missing" will always return
1440 "undef", regardless of $column_idx being valid or not. If this
1441 attribute is "true" it will return either 0 (the field is present) or 1
1442 (the field is missing).
1443
1444 A special case is the empty line. If the line is completely empty -
1445 after dealing with the flags - this is still a valid CSV line: it is a
1446 record of just one single empty field. However, if "keep_meta_info" is
1447 set, invoking "is_missing" with index 0 will now return true.
1448
1449 status
1450 $status = $csv->status ();
1451
1452 This method returns the status of the last invoked "combine" or "parse"
1453 call. Status is success (true: 1) or failure (false: "undef" or 0).
1454
1455 error_input
1456 $bad_argument = $csv->error_input ();
1457
1458 This method returns the erroneous argument (if it exists) of "combine"
1459 or "parse", whichever was called more recently. If the last
1460 invocation was successful, "error_input" will return "undef".
1461
1462 error_diag
1463 Text::CSV_XS->error_diag ();
1464 $csv->error_diag ();
1465 $error_code = 0 + $csv->error_diag ();
1466 $error_str = "" . $csv->error_diag ();
1467 ($cde, $str, $pos, $rec, $fld) = $csv->error_diag ();
1468
1469 If (and only if) an error occurred, this function returns the
1470 diagnostics of that error.
1471
1472 If called in void context, this will print the internal error code and
1473 the associated error message to STDERR.
1474
1475 If called in list context, this will return the error code and the
1476 error message in that order. If the last error was from parsing, the
1477 rest of the values returned are a best guess at the location within
the line that was being parsed. Their values are 1-based. The
position currently is the index of the byte at which the parsing
failed in the current record. It might change to be the index of the
current character in a later release. The record is the index of the
record parsed by the csv instance. The field number is the index of
the field
1483 the parser thinks it is currently trying to parse. See
1484 examples/csv-check for how this can be used.
1485
1486 If called in scalar context, it will return the diagnostics in a
1487 single scalar, a-la $!. It will contain the error code in numeric
1488 context, and the diagnostics message in string context.
1489
1490 When called as a class method or a direct function call, the
1491 diagnostics are that of the last "new" call.
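A sketch of inspecting a parse failure in list context:

    my $csv = Text::CSV_XS->new ();
    unless ($csv->parse (q{1,"no closing quote})) {
        my ($code, $msg, $pos, $rec, $fld) = $csv->error_diag;
        warn "error $code ($msg) at byte $pos in field $fld\n";
        }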
1492
1493 record_number
1494 $recno = $csv->record_number ();
1495
1496 Returns the records parsed by this csv instance. This value should be
1497 more accurate than $. when embedded newlines come in play. Records
1498 written by this instance are not counted.
1499
1500 SetDiag
1501 $csv->SetDiag (0);
1502
1503 Use to reset the diagnostics if you are dealing with errors.
1504
FUNCTIONS

csv
This function is not exported by default and should be explicitly
requested:
1509
1510 use Text::CSV_XS qw( csv );
1511
This is a high-level function that aims at simple (user) interfaces.
1513 This can be used to read/parse a "CSV" file or stream (the default
1514 behavior) or to produce a file or write to a stream (define the "out"
1515 attribute). It returns an array- or hash-reference on parsing (or
1516 "undef" on fail) or the numeric value of "error_diag" on writing.
1517 When this function fails you can get to the error using the class call
1518 to "error_diag"
1519
1520 my $aoa = csv (in => "test.csv") or
1521 die Text::CSV_XS->error_diag;
1522
1523 This function takes the arguments as key-value pairs. This can be
1524 passed as a list or as an anonymous hash:
1525
1526 my $aoa = csv ( in => "test.csv", sep_char => ";");
1527 my $aoh = csv ({ in => $fh, headers => "auto" });
1528
1529 The arguments passed consist of two parts: the arguments to "csv"
1530 itself and the optional attributes to the "CSV" object used inside
1531 the function as enumerated and explained in "new".
1532
If not overridden, the default options used for CSV are
1534
1535 auto_diag => 1
1536 escape_null => 0
1537
1538 The option that is always set and cannot be altered is
1539
1540 binary => 1
1541
1542 As this function will likely be used in one-liners, it allows "quote"
1543 to be abbreviated as "quo", and "escape_char" to be abbreviated as
1544 "esc" or "escape".
1545
1546 Alternative invocations:
1547
1548 my $aoa = Text::CSV_XS::csv (in => "file.csv");
1549
1550 my $csv = Text::CSV_XS->new ();
1551 my $aoa = $csv->csv (in => "file.csv");
1552
1553 In the latter case, the object attributes are used from the existing
1554 object and the attribute arguments in the function call are ignored:
1555
1556 my $csv = Text::CSV_XS->new ({ sep_char => ";" });
1557 my $aoh = $csv->csv (in => "file.csv", bom => 1);
1558
1559 will parse using ";" as "sep_char", not ",".
1560
1561 in
1562
1563 Used to specify the source. "in" can be a file name (e.g. "file.csv"),
1564 which will be opened for reading and closed when finished, a file
1565 handle (e.g. $fh or "FH"), a reference to a glob (e.g. "\*ARGV"),
1566 the glob itself (e.g. *STDIN), or a reference to a scalar (e.g.
1567 "\q{1,2,"csv"}").
1568
1569 When used with "out", "in" should be a reference to a CSV structure
1570 (AoA or AoH) or a CODE-ref that returns an array-reference or a hash-
1571 reference. The code-ref will be invoked with no arguments.
1572
1573 my $aoa = csv (in => "file.csv");
1574
1575 open my $fh, "<", "file.csv";
1576 my $aoa = csv (in => $fh);
1577
1578 my $csv = [ [qw( Foo Bar )], [ 1, 2 ], [ 2, 3 ]];
1579 my $err = csv (in => $csv, out => "file.csv");
1580
1581 If called in void context without the "out" attribute, the resulting
1582 ref will be used as input to a subsequent call to csv:
1583
1584 csv (in => "file.csv", filter => { 2 => sub { length > 2 }})
1585
1586 will be a shortcut to
1587
1588 csv (in => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}))
1589
1590 where, in the absence of the "out" attribute, this is a shortcut to
1591
1592 csv (in => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}),
1593 out => *STDOUT)
1594
1595 out
1596
1597 csv (in => $aoa, out => "file.csv");
1598 csv (in => $aoa, out => $fh);
1599 csv (in => $aoa, out => STDOUT);
1600 csv (in => $aoa, out => *STDOUT);
1601 csv (in => $aoa, out => \*STDOUT);
1602 csv (in => $aoa, out => \my $data);
1603 csv (in => $aoa, out => undef);
1604 csv (in => $aoa, out => \"skip");
1605
1606 In output mode, the default "CSV" options are
1607
1608 eol => "\r\n"
1609
1610 The "fragment" attribute is ignored in output mode.
1611
1612 "out" can be a file name (e.g. "file.csv"), which will be opened for
1613 writing and closed when finished, a file handle (e.g. $fh or "FH"), a
1614 reference to a glob (e.g. "\*STDOUT"), the glob itself (e.g. *STDOUT),
1615 or a reference to a scalar (e.g. "\my $data").
1616
1617 csv (in => sub { $sth->fetch }, out => "dump.csv");
1618 csv (in => sub { $sth->fetchrow_hashref }, out => "dump.csv",
1619 headers => $sth->{NAME_lc});
1620
1621 When a code-ref is used for "in", the output is generated per
1622 invocation, so no buffering is involved. This implies that there is no
1623 size restriction on the number of records. The "csv" function ends when
1624 the coderef returns a false value.
1625
1626 If "out" is set to a reference of the literal string "skip", the output
1627 will be suppressed completely, which might be useful in combination
1628 with a filter for side effects only.
1629
1630 my %cache;
1631 csv (in => "dump.csv",
1632 out => \"skip",
1633 on_in => sub { $cache{$_[1][1]}++ });
1634
1635 Currently, setting "out" to any false value ("undef", "", 0) will be
1636 equivalent to "\"skip"".
1637
1638 encoding
1639
1640 If passed, it should be an encoding accepted by the ":encoding()"
1641 option to "open". There is no default value. This attribute does not
1642 work in perl 5.6.x. "encoding" can be abbreviated to "enc" for ease of
1643 use in command line invocations.
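
     A minimal sketch (the file name is a placeholder):

         my $aoa = csv (in => "file.csv", enc => "UTF-8");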
1644
1645 If "encoding" is set to the literal value "auto", the method "header"
1646 will be invoked on the opened stream to check if there is a BOM and set
1647 the encoding accordingly. This is equal to passing a true value in
1648 the option "detect_bom".
1649
1650 detect_bom
1651
1652 If "detect_bom" is given, the method "header" will be invoked on
1653 the opened stream to check if there is a BOM and set the encoding
1654 accordingly.
1655
1656 "detect_bom" can be abbreviated to "bom".
1657
1658 This is the same as setting "encoding" to "auto".
1659
1660 Note that as the method "header" is invoked, its default is to also
1661 set the headers.
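
     A minimal sketch (the file name is a placeholder):

         my $aoh = csv (in => "file.csv", detect_bom => 1);
         my $aoh = csv (in => "file.csv", bom        => 1);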
1662
1663 headers
1664
1665 If this attribute is not given, the default behavior is to produce an
1666 array of arrays.
1667
1668 If "headers" is supplied, it should be an anonymous list of column
1669 names, an anonymous hashref, a coderef, or a literal flag: "auto",
1670 "lc", "uc", or "skip".
1671
1672 skip
1673 When "skip" is used, the header will not be included in the output.
1674
1675 my $aoa = csv (in => $fh, headers => "skip");
1676
1677 auto
1678 If "auto" is used, the first line of the "CSV" source will be read as
1679 the list of field headers and used to produce an array of hashes.
1680
1681 my $aoh = csv (in => $fh, headers => "auto");
1682
1683 lc
1684 If "lc" is used, the first line of the "CSV" source will be read as
1685 the list of field headers mapped to lower case and used to produce
1686 an array of hashes. This is a variation of "auto".
1687
1688 my $aoh = csv (in => $fh, headers => "lc");
1689
1690 uc
1691 If "uc" is used, the first line of the "CSV" source will be read as
1692 the list of field headers mapped to upper case and used to produce
1693 an array of hashes. This is a variation of "auto".
1694
1695 my $aoh = csv (in => $fh, headers => "uc");
1696
1697 CODE
1698 If a coderef is used, the first line of the "CSV" source will be
1699 read as the list of mangled field headers in which each field is
1700 passed as the only argument to the coderef. This list is used to
1701 produce an array of hashes.
1702
1703 my $aoh = csv (in => $fh,
1704 headers => sub { lc ($_[0]) =~ s/kode/code/gr });
1705
1706 this example is a variation of using "lc" where all occurrences of
1707 "kode" are replaced with "code".
1708
1709 ARRAY
1710 If "headers" is an anonymous list, the entries in the list will be
1711 used as field names. The first line is considered data instead of
1712 headers.
1713
1714 my $aoh = csv (in => $fh, headers => [qw( Foo Bar )]);
1715 csv (in => $aoa, out => $fh, headers => [qw( code description price )]);
1716
1717 HASH
1718 If "headers" is an hash reference, this implies "auto", but header
1719 fields for that exist as key in the hashref will be replaced by the
1720 value for that key. Given a CSV file like
1721
1722 post-kode,city,name,id number,fubble
1723 1234AA,Duckstad,Donald,13,"X313DF"
1724
1725 using
1726
1727 csv (headers => { "post-kode" => "pc", "id number" => "ID" }, ...
1728
1729 will return an entry like
1730
1731 { pc => "1234AA",
1732 city => "Duckstad",
1733 name => "Donald",
1734 ID => "13",
1735 fubble => "X313DF",
1736 }
1737
1738 See also "munge_column_names" and "set_column_names".
1739
1740 munge_column_names
1741
1742 If "munge_column_names" is set, the method "header" is invoked on
1743 the opened stream with all matching arguments to detect and set the
1744 headers.
1745
1746 "munge_column_names" can be abbreviated to "munge".
1747
1748 key
1749
1750 If passed, will default "headers" to "auto" and return a hashref
1751 instead of an array of hashes. Allowed values are simple scalars or
1752 array-references where the first element is the joiner and the rest are
1753 the fields to join to combine the key.
1754
1755 my $ref = csv (in => "test.csv", key => "code");
1756 my $ref = csv (in => "test.csv", key => [ ":" => "code", "color" ]);
1757
1758 with test.csv like
1759
1760 code,product,price,color
1761 1,pc,850,gray
1762 2,keyboard,12,white
1763 3,mouse,5,black
1764
1765 the first example will return
1766
1767 { 1 => {
1768 code => 1,
1769 color => 'gray',
1770 price => 850,
1771 product => 'pc'
1772 },
1773 2 => {
1774 code => 2,
1775 color => 'white',
1776 price => 12,
1777 product => 'keyboard'
1778 },
1779 3 => {
1780 code => 3,
1781 color => 'black',
1782 price => 5,
1783 product => 'mouse'
1784 }
1785 }
1786
1787 the second example will return
1788
1789 { "1:gray" => {
1790 code => 1,
1791 color => 'gray',
1792 price => 850,
1793 product => 'pc'
1794 },
1795 "2:white" => {
1796 code => 2,
1797 color => 'white',
1798 price => 12,
1799 product => 'keyboard'
1800 },
1801 "3:black" => {
1802 code => 3,
1803 color => 'black',
1804 price => 5,
1805 product => 'mouse'
1806 }
1807 }
1808
1809 The "key" attribute can be combined with "headers" for "CSV" date that
1810 has no header line, like
1811
1812 my $ref = csv (
1813 in => "foo.csv",
1814 headers => [qw( c_foo foo bar description stock )],
1815 key => "c_foo",
1816 );
1817
1818 value
1819
1820 Used to create key-value hashes.
1821
1822 Only allowed when "key" is valid. A "value" can be either a single
1823 column label or an anonymous list of column labels. In the first case,
1824 the value will be a simple scalar value, in the latter case, it will be
1825 a hashref.
1826
1827 my $ref = csv (in => "test.csv", key => "code",
1828 value => "price");
1829 my $ref = csv (in => "test.csv", key => "code",
1830 value => [ "product", "price" ]);
1831 my $ref = csv (in => "test.csv", key => [ ":" => "code", "color" ],
1832 value => "price");
1833 my $ref = csv (in => "test.csv", key => [ ":" => "code", "color" ],
1834 value => [ "product", "price" ]);
1835
1836 with test.csv like
1837
1838 code,product,price,color
1839 1,pc,850,gray
1840 2,keyboard,12,white
1841 3,mouse,5,black
1842
1843 the first example will return
1844
1845 { 1 => 850,
1846 2 => 12,
1847 3 => 5,
1848 }
1849
1850 the second example will return
1851
1852 { 1 => {
1853 price => 850,
1854 product => 'pc'
1855 },
1856 2 => {
1857 price => 12,
1858 product => 'keyboard'
1859 },
1860 3 => {
1861 price => 5,
1862 product => 'mouse'
1863 }
1864 }
1865
1866 the third example will return
1867
1868 { "1:gray" => 850,
1869 "2:white" => 12,
1870 "3:black" => 5,
1871 }
1872
1873 the fourth example will return
1874
1875 { "1:gray" => {
1876 price => 850,
1877 product => 'pc'
1878 },
1879 "2:white" => {
1880 price => 12,
1881 product => 'keyboard'
1882 },
1883 "3:black" => {
1884 price => 5,
1885 product => 'mouse'
1886 }
1887 }
1888
1889 keep_headers
1890
1891 When using hashes, keep the column names in the arrayref passed, so
1892 all headers are available after the call in the original order.
1893
1894 my $aoh = csv (in => "file.csv", keep_headers => \my @hdr);
1895
1896 This attribute can be abbreviated to "kh" or passed as
1897 "keep_column_names".
1898
1899 This attribute implies a default of "auto" for the "headers" attribute.
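
     A small sketch that uses the preserved order afterwards (the file
     name is a placeholder):

         my $aoh = csv (in => "file.csv", kh => \my @hdr);
         say join ", ", @hdr; # column names in their original order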
1900
1901 fragment
1902
1903 Only output the fragment as defined in the "fragment" method. This
1904 option is ignored when generating "CSV". See "out".
1905
1906 Combining all of them could give something like
1907
1908 use Text::CSV_XS qw( csv );
1909 my $aoh = csv (
1910 in => "test.txt",
1911 encoding => "utf-8",
1912 headers => "auto",
1913 sep_char => "|",
1914 fragment => "row=3;6-9;15-*",
1915 );
1916 say $aoh->[15]{Foo};
1917
1918 sep_set
1919
1920 If "sep_set" is set, the method "header" is invoked on the opened
1921 stream to detect and set "sep_char" with the given set.
1922
1923 "sep_set" can be abbreviated to "seps".
1924
1925 Note that as the "header" method is invoked, its default is to also
1926 set the headers.
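
     A minimal sketch (the file name and the candidate separators are
     placeholders):

         my $aoh = csv (in => "file.csv",
             sep_set => [ ";", ",", "|", "\t" ]);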
1927
1928 set_column_names
1929
1930 If "set_column_names" is passed, the method "header" is invoked on
1931 the opened stream with all arguments meant for "header".
1932
1933 If "set_column_names" is passed as a false value, the content of the
1934 first row is only preserved if the output is AoA:
1935
1936 With an input-file like
1937
1938 bAr,foo
1939 1,2
1940 3,4,5
1941
1942 This call
1943
1944 my $aoa = csv (in => $file, set_column_names => 0);
1945
1946 will result in
1947
1948 [[ "bar", "foo" ],
1949 [ "1", "2" ],
1950 [ "3", "4", "5" ]]
1951
1952 and
1953
1954 my $aoa = csv (in => $file, set_column_names => 0, munge => "none");
1955
1956 will result in
1957
1958 [[ "bAr", "foo" ],
1959 [ "1", "2" ],
1960 [ "3", "4", "5" ]]
1961
1962 Callbacks
1963 Callbacks enable actions triggered from the inside of Text::CSV_XS.
1964
1965 While most of what this enables can easily be done in an unrolled
1966 loop as described in the "SYNOPSIS", callbacks can be used to meet
1967 special demands or enhance the "csv" function.
1968
1969 error
1970 $csv->callbacks (error => sub { $csv->SetDiag (0) });
1971
1972 the "error" callback is invoked when an error occurs, but only
1973 when "auto_diag" is set to a true value. A callback is invoked with
1974 the values returned by "error_diag":
1975
1976 my ($c, $s);
1977
1978 sub ignore3006
1979 {
1980 my ($err, $msg, $pos, $recno, $fldno) = @_;
1981 if ($err == 3006) {
1982 # ignore this error
1983 ($c, $s) = (undef, undef);
1984 Text::CSV_XS->SetDiag (0);
1985 }
1986 # Any other error
1987 return;
1988 } # ignore3006
1989
1990 $csv->callbacks (error => \&ignore3006);
1991 $csv->bind_columns (\$c, \$s);
1992 while ($csv->getline ($fh)) {
1993 # Error 3006 will not stop the loop
1994 }
1995
1996 after_parse
1997 $csv->callbacks (after_parse => sub { push @{$_[1]}, "NEW" });
1998 while (my $row = $csv->getline ($fh)) {
1999 $row->[-1] eq "NEW";
2000 }
2001
2002 This callback is invoked after parsing with "getline" only if no
2003 error occurred. The callback is invoked with two arguments: the
2004 current "CSV" parser object and an array reference to the fields
2005 parsed.
2006
2007 The return code of the callback is ignored unless it is a reference
2008 to the string "skip", in which case the record will be skipped in
2009 "getline_all".
2010
2011 sub add_from_db
2012 {
2013 my ($csv, $row) = @_;
2014 $sth->execute ($row->[4]);
2015 push @$row, $sth->fetchrow_array;
2016 } # add_from_db
2017
2018 my $aoa = csv (in => "file.csv", callbacks => {
2019 after_parse => \&add_from_db });
2020
2021 This hook can be used for validation:
2022
2023 FAIL
2024 Die if any of the records does not pass a validation rule:
2025
2026 after_parse => sub {
2027 $_[1][4] =~ m/^[0-9]{4}\s?[A-Z]{2}$/ or
2028 die "5th field does not have a valid Dutch zipcode";
2029 }
2030
2031 DEFAULT
2032 Replace invalid fields with a default value:
2033
2034 after_parse => sub { $_[1][2] =~ m/^\d+$/ or $_[1][2] = 0 }
2035
2036 SKIP
2037 Skip records that have invalid fields (only applies to
2038 "getline_all"):
2039
2040 after_parse => sub { $_[1][0] =~ m/^\d+$/ or return \"skip"; }
2041
2042 before_print
2043 my $idx = 1;
2044 $csv->callbacks (before_print => sub { $_[1][0] = $idx++ });
2045 $csv->print (*STDOUT, [ 0, $_ ]) for @members;
2046
2047 This callback is invoked before printing with "print" only if no
2048 error occurred. The callback is invoked with two arguments: the
2049 current "CSV" parser object and an array reference to the fields
2050 passed.
2051
2052 The return code of the callback is ignored.
2053
2054 sub max_4_fields
2055 {
2056 my ($csv, $row) = @_;
2057 @$row > 4 and splice @$row, 4;
2058 } # max_4_fields
2059
2060 csv (in => csv (in => "file.csv"), out => *STDOUT,
2061 callbacks => { before_print => \&max_4_fields });
2062
2063 This callback is not active for "combine".
2064
2065 Callbacks for csv ()
2066
2067 The "csv" allows for some callbacks that do not integrate in XS
2068 internals but only feature the "csv" function.
2069
2070 csv (in => "file.csv",
2071 callbacks => {
2072 filter => { 6 => sub { $_ > 15 } }, # first
2073 after_parse => sub { say "AFTER PARSE"; }, # first
2074 after_in => sub { say "AFTER IN"; }, # second
2075 on_in => sub { say "ON IN"; }, # third
2076 },
2077 );
2078
2079 csv (in => $aoh,
2080 out => "file.csv",
2081 callbacks => {
2082 on_in => sub { say "ON IN"; }, # first
2083 before_out => sub { say "BEFORE OUT"; }, # second
2084 before_print => sub { say "BEFORE PRINT"; }, # third
2085 },
2086 );
2087
2088 filter
2089 This callback can be used to filter records. It is called just after
2090 a new record has been scanned. The callback accepts a:
2091
2092 hashref
2093 The keys are the index to the row (the field name or field number,
2094 1-based) and the values are subs to return a true or false value.
2095
2096 csv (in => "file.csv", filter => {
2097 3 => sub { m/a/ }, # third field should contain an "a"
2098 5 => sub { length > 4 }, # 5th field should be at least 5 chars long
2099 });
2100
2101 csv (in => "file.csv", filter => { foo => sub { $_ > 4 }});
2102
2103 If the keys to the filter hash contain any character that is not a
2104 digit, it will also implicitly set "headers" to "auto" unless
2105 "headers" was already passed as argument. When headers are
2106 active, returning an array of hashes, the filter is not applicable
2107 to the header itself.
2108
2109 All sub results should match, as in AND.
2110
2111 The context of the callback sets $_ localized to the field
2112 indicated by the filter. The two arguments are as with all other
2113 callbacks, so the other fields in the current row can be seen:
2114
2115 filter => { 3 => sub { $_ > 100 ? $_[1][1] =~ m/A/ : $_[1][6] =~ m/B/ }}
2116
2117 If the context is set to return a list of hashes ("headers" is
2118 defined), the current record will also be available in the
2119 localized %_:
2120
2121 filter => { 3 => sub { $_ > 100 && $_{foo} =~ m/A/ && $_{bar} < 1000 }}
2122
2123 If the filter is used to alter the content by changing $_, make
2124 sure that the sub returns true in order not to have that record
2125 skipped:
2126
2127 filter => { 2 => sub { $_ = uc }}
2128
2129 will upper-case the second field, and then skip the record if the
2130 resulting content evaluates to false. To always accept, end with truth:
2131
2132 filter => { 2 => sub { $_ = uc; 1 }}
2133
2134 coderef
2135 csv (in => "file.csv", filter => sub { $n++; 0; });
2136
2137 If the argument to "filter" is a coderef, it is an alias or
2138 shortcut to a filter on column 0:
2139
2140 csv (filter => sub { $n++; 0 });
2141
2142 is equal to
2143
2144 csv (filter => { 0 => sub { $n++; 0 }});
2145
2146 filter-name
2147 csv (in => "file.csv", filter => "not_blank");
2148 csv (in => "file.csv", filter => "not_empty");
2149 csv (in => "file.csv", filter => "filled");
2150
2151 These are predefined filters.
2152
2153 Given a file like (line numbers prefixed for documentation only):
2154
2155 1:1,2,3
2156 2:
2157 3:,
2158 4:""
2159 5:,,
2160 6:, ,
2161 7:"",
2162 8:" "
2163 9:4,5,6
2164
2165 not_blank
2166 Filter out the blank lines
2167
2168 This filter is a shortcut for
2169
2170 filter => { 0 => sub { @{$_[1]} > 1 or
2171 defined $_[1][0] && $_[1][0] ne "" } }
2172
2173 Due to the implementation, it is currently impossible to also
2174 filter lines that consist only of a quoted empty field. These
2175 lines are also considered blank lines.
2176
2177 With the given example, lines 2 and 4 will be skipped.
2178
2179 not_empty
2180 Filter out lines where all the fields are empty.
2181
2182 This filter is a shortcut for
2183
2184 filter => { 0 => sub { grep { defined && $_ ne "" } @{$_[1]} } }
2185
2186 A space is not regarded as being empty, so given the example data,
2187 lines 2, 3, 4, 5, and 7 are skipped.
2188
2189 filled
2190 Filter out lines that have no visible data
2191
2192 This filter is a shortcut for
2193
2194 filter => { 0 => sub { grep { defined && m/\S/ } @{$_[1]} } }
2195
2196 This filter rejects all lines that do not have at least one field
2197 containing visible (non-whitespace) content.
2198
2199 With the given example data, this filter would skip lines 2
2200 through 8.
2201
2202 after_in
2203 This callback is invoked for each record after all records have been
2204 parsed but before returning the reference to the caller. The hook is
2205 invoked with two arguments: the current "CSV" parser object and a
2206 reference to the record. The reference can be a reference to a
2207 HASH or a reference to an ARRAY as determined by the arguments.
2208
2209 This callback can also be passed as an attribute without the
2210 "callbacks" wrapper.
2211
2212 before_out
2213 This callback is invoked for each record before the record is
2214 printed. The hook is invoked with two arguments: the current "CSV"
2215 parser object and a reference to the record. The reference can be a
2216 reference to a HASH or a reference to an ARRAY as determined by the
2217 arguments.
2218
2219 This callback can also be passed as an attribute without the
2220 "callbacks" wrapper.
2221
2222 This callback makes the row available in %_ if the row is a hashref.
2223 In this case %_ is writable and will change the original row.
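
      A small sketch, assuming an array of hashes in $aoh with a "price"
      column (both the variable and the column names are hypothetical):

          csv (in         => $aoh,
               out        => "file.csv",
               headers    => [ "code", "price" ],
               before_out => sub { $_{price} //= 0 });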
2224
2225 on_in
2226 This callback acts exactly as the "after_in" or the "before_out"
2227 hooks.
2228
2229 This callback can also be passed as an attribute without the
2230 "callbacks" wrapper.
2231
2232 This callback makes the row available in %_ if the row is a hashref.
2233 In this case %_ is writable and will change the original row. So e.g.
2234 with
2235
2236 my $aoh = csv (
2237 in => \"foo\n1\n2\n",
2238 headers => "auto",
2239 on_in => sub { $_{bar} = 2; },
2240 );
2241
2242 $aoh will be:
2243
2244 [ { foo => 1,
2245 bar => 2,
2246 }
2247 { foo => 2,
2248 bar => 2,
2249 }
2250 ]
2251
2252 csv
2253 The function "csv" can also be called as a method or with an
2254 existing Text::CSV_XS object. This can help when the function is to
2255 be invoked many times: passing an existing instance avoids the
2256 overhead of creating a new object internally over and over again on
2257 every call.
2258
2259 my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2260
2261 my $aoa = $csv->csv (in => $fh);
2262 my $aoa = csv (in => $fh, csv => $csv);
2263
2264 both act the same. Running this 20000 times on a 20-line CSV file
2265 showed a 53% speedup.
2266
2268 Combine (...)
2269 Parse (...)
2270
2271 The arguments to these internal functions are deliberately not
2272 described or documented in order to allow the module authors to
2273 change them when they feel the need. Using them is highly
2274 discouraged, as the API may change in future releases.
2275
2277 Reading a CSV file line by line:
2278 my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2279 open my $fh, "<", "file.csv" or die "file.csv: $!";
2280 while (my $row = $csv->getline ($fh)) {
2281 # do something with @$row
2282 }
2283 close $fh or die "file.csv: $!";
2284
2285 or
2286
2287 my $aoa = csv (in => "file.csv", on_in => sub {
2288 # do something with %_
2289 });
2290
2291 Reading only a single column
2292
2293 my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2294 open my $fh, "<", "file.csv" or die "file.csv: $!";
2295 # get only the 4th column
2296 my @column = map { $_->[3] } @{$csv->getline_all ($fh)};
2297 close $fh or die "file.csv: $!";
2298
2299 with "csv", you could do
2300
2301 my @column = map { $_->[0] }
2302 @{csv (in => "file.csv", fragment => "col=4")};
2303
2304 Parsing CSV strings:
2305 my $csv = Text::CSV_XS->new ({ keep_meta_info => 1, binary => 1 });
2306
2307 my $sample_input_string =
2308 qq{"I said, ""Hi!""",Yes,"",2.34,,"1.09","\x{20ac}",};
2309 if ($csv->parse ($sample_input_string)) {
2310 my @field = $csv->fields;
2311 foreach my $col (0 .. $#field) {
2312 my $quo = $csv->is_quoted ($col) ? $csv->{quote_char} : "";
2313 printf "%2d: %s%s%s\n", $col, $quo, $field[$col], $quo;
2314 }
2315 }
2316 else {
2317 print STDERR "parse () failed on argument: ",
2318 $csv->error_input, "\n";
2319 $csv->error_diag ();
2320 }
2321
2322 Parsing CSV from memory
2323
2324 Given a complete CSV data-set in scalar $data, generate a list of
2325 lists to represent the rows and fields
2326
2327 # The data
2328 my $data = join "\r\n" => map { join "," => 0 .. 5 } 0 .. 5;
2329
2330 # in a loop
2331 my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
2332 open my $fh, "<", \$data;
2333 my @foo;
2334 while (my $row = $csv->getline ($fh)) {
2335 push @foo, $row;
2336 }
2337 close $fh;
2338
2339 # a single call
2340 my $foo = csv (in => \$data);
2341
2342 Printing CSV data
2343 The fast way: using "print"
2344
2345 An example of creating "CSV" files using the "print" method:
2346
2347 my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
2348 open my $fh, ">", "foo.csv" or die "foo.csv: $!";
2349 for (1 .. 10) {
2350 $csv->print ($fh, [ $_, "$_" ]) or $csv->error_diag;
2351 }
2352 close $fh or die "foo.csv: $!";
2353
2354 The slow way: using "combine" and "string"
2355
2356 or using the slower "combine" and "string" methods:
2357
2358 my $csv = Text::CSV_XS->new;
2359
2360 open my $csv_fh, ">", "hello.csv" or die "hello.csv: $!";
2361
2362 my @sample_input_fields = (
2363 'You said, "Hello!"', 5.67,
2364 '"Surely"', '', '3.14159');
2365 if ($csv->combine (@sample_input_fields)) {
2366 print $csv_fh $csv->string, "\n";
2367 }
2368 else {
2369 print "combine () failed on argument: ",
2370 $csv->error_input, "\n";
2371 }
2372 close $csv_fh or die "hello.csv: $!";
2373
2374 Generating CSV into memory
2375
2376 Format a data-set (@foo) into a scalar value in memory ($data):
2377
2378 # The data
2379 my @foo = map { [ 0 .. 5 ] } 0 .. 3;
2380
2381 # in a loop
2382 my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1, eol => "\r\n" });
2383 open my $fh, ">", \my $data;
2384 $csv->print ($fh, $_) for @foo;
2385 close $fh;
2386
2387 # a single call
2388 csv (in => \@foo, out => \my $data);
2389
2390 Rewriting CSV
2391 Rewrite "CSV" files with ";" as separator character to well-formed
2392 "CSV":
2393
2394 use Text::CSV_XS qw( csv );
2395 csv (in => csv (in => "bad.csv", sep_char => ";"), out => *STDOUT);
2396
2397 As "STDOUT" is now default in "csv", a one-liner converting a UTF-16
2398 CSV file with BOM and TAB-separation to valid UTF-8 CSV could be:
2399
2400 $ perl -C3 -MText::CSV_XS=csv -we\
2401 'csv(in=>"utf16tab.csv",encoding=>"utf16",sep=>"\t")' >utf8.csv
2402
2403 Dumping database tables to CSV
2404 Dumping a database table can be as simple as this (TIMTOWTDI):
2405
2406 my $dbh = DBI->connect (...);
2407 my $sql = "select * from foo";
2408
2409 # using your own loop
2410 open my $fh, ">", "foo.csv" or die "foo.csv: $!\n";
2411 my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\r\n" });
2412 my $sth = $dbh->prepare ($sql); $sth->execute;
2413 $csv->print ($fh, $sth->{NAME_lc});
2414 while (my $row = $sth->fetch) {
2415 $csv->print ($fh, $row);
2416 }
2417
2418 # using the csv function, all in memory
2419 csv (out => "foo.csv", in => $dbh->selectall_arrayref ($sql));
2420
2421 # using the csv function, streaming with callbacks
2422 my $sth = $dbh->prepare ($sql); $sth->execute;
2423 csv (out => "foo.csv", in => sub { $sth->fetch });
2424 csv (out => "foo.csv", in => sub { $sth->fetchrow_hashref });
2425
2426 Note that this does not discriminate between "empty" values and NULL-
2427 values from the database, as both will be the same empty field in CSV.
2428 To enable distinction between the two, use "quote_empty".
2429
2430 csv (out => "foo.csv", in => sub { $sth->fetch }, quote_empty => 1);
2431
2432 If the database import utility supports special sequences to insert
2433 "NULL" values into the database, like MySQL/MariaDB supports "\N",
2434 use a filter or a map
2435
2436 csv (out => "foo.csv", in => sub { $sth->fetch },
2437 on_in => sub { $_ //= "\\N" for @{$_[1]} });
2438
2439 while (my $row = $sth->fetch) {
2440 $csv->print ($fh, [ map { $_ // "\\N" } @$row ]);
2441 }
2442
2443 note that this will not work as expected when choosing the backslash
2444 ("\") as "escape_char", as that will cause the "\" to need to be
2445 escaped by yet another "\", which will cause the field to need
2446 quotation and thus end up as "\\N" instead of "\N". See also
2447 "undef_str".
2448
2449 csv (out => "foo.csv", in => sub { $sth->fetch }, undef_str => "\\N");
2450
2451 these special sequences are not recognized by Text::CSV_XS on parsing
2452 the CSV generated like this, but map and filter are your friends again
2453
2454 while (my $row = $csv->getline ($fh)) {
2455 $sth->execute (map { $_ eq "\\N" ? undef : $_ } @$row);
2456 }
2457
2458 csv (in => "foo.csv", filter => { 1 => sub {
2459 $sth->execute (map { $_ eq "\\N" ? undef : $_ } @{$_[1]}); 0; }});
2460
2461 The examples folder
2462 For more extended examples, see the examples/ sub-directory [1] in the
2463 original distribution or the git repository [2].
2464
2465 [1] https://github.com/Tux/Text-CSV_XS/tree/master/examples
2466 [2] https://github.com/Tux/Text-CSV_XS
2467
2468 The following files can be found there:
2469
2470 parser-xs.pl
2471 This can be used as a boilerplate to parse invalid "CSV" and parse
2472 beyond (expected) errors, as an alternative to using the "error" callback.
2473
2474 $ perl examples/parser-xs.pl bad.csv >good.csv
2475
2476 csv-check
2477 This is a command-line tool that uses parser-xs.pl techniques to
2478 check the "CSV" file and report on its content.
2479
2480 $ csv-check files/utf8.csv
2481 Checked files/utf8.csv with csv-check 1.9
2482 using Text::CSV_XS 1.32 with perl 5.26.0 and Unicode 9.0.0
2483 OK: rows: 1, columns: 2
2484 sep = <,>, quo = <">, bin = <1>, eol = <"\n">
2485
2486 csv2xls
2487 A script to convert "CSV" to Microsoft Excel ("XLS"). This requires
2488 extra modules Date::Calc and Spreadsheet::WriteExcel. The converter
2489 accepts various options and can produce UTF-8 compliant Excel files.
2490
2491 csv2xlsx
2492 A script to convert "CSV" to Microsoft Excel ("XLSX"). This requires
2493 the modules Date::Calc and Excel::Writer::XLSX. The converter
2494 does accept various options including merging several "CSV" files
2495 into a single Excel file.
2496
2497 csvdiff
2498 A script that provides colorized diff on sorted CSV files, assuming
2499 first line is header and first field is the key. Output options
2500 include colorized ANSI escape codes or HTML.
2501
2502 $ csvdiff --html --output=diff.html file1.csv file2.csv
2503
2504 rewrite.pl
2505 A script to rewrite (in)valid CSV into valid CSV files. The script
2506 has options to generate confusing CSV files or CSV files that
2507 conform to Dutch MS-Excel exports (using ";" as separator).
2508
2509 By default, the script honors a BOM and auto-detects the separator,
2510 converting the input to standard CSV with "," as separator.
2511
2513 Text::CSV_XS is not designed to detect the characters used to quote
2514 and separate fields. The parsing is done using predefined (default)
2515 settings. In the examples sub-directory, you can find scripts that
2516 demonstrate how you could try to detect these characters yourself.
2517
2518 Microsoft Excel
2519 The import/export from Microsoft Excel is a risky task, according to
2520 the documentation in "Text::CSV::Separator". Microsoft uses the
2521 system's list separator defined in the regional settings, which happens
2522 to be a semicolon for Dutch, German and Spanish (and probably some
2523 others as well). For the English locale, the default is a comma.
2524 In Windows however, the user is free to choose a predefined locale,
2525 and then change every individual setting in it, so checking the
2526 locale is no solution.
2527
2528 As of version 1.17, a lone first line with just
2529
2530 sep=;
2531
2532 will be recognized and honored when parsing with "getline".
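
     A minimal sketch (the file name is a placeholder):

         my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
         open my $fh, "<", "excel.csv" or die "excel.csv: $!";
         while (my $row = $csv->getline ($fh)) {
             # a leading "sep=;" line switches sep_char to ";"
             # before the data rows are parsed
             print scalar @$row, " fields\n";
         }
         close $fh;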
2533
2535 More Errors & Warnings
2536 New extensions ought to be clear and concise in reporting what
2537 error has occurred where and why, and maybe also offer a remedy to
2538 the problem.
2539
2540 "error_diag" is a (very) good start, but there is more work to be
2541 done in this area.
2542
2543 Basic calls should croak or warn on illegal parameters. Errors
2544 should be documented.
2545
2546 setting meta info
2547 Future extensions might include extending the "meta_info",
2548 "is_quoted", and "is_binary" to accept setting these flags for
2549 fields, so you can specify which fields are quoted in the
2550 "combine"/"string" combination.
2551
2552 $csv->meta_info (0, 1, 1, 3, 0, 0);
2553 $csv->is_quoted (3, 1);
2554
2555 Metadata Vocabulary for Tabular Data
2556 <http://w3c.github.io/csvw/metadata/> (a W3C editor's draft) could be
2557 an example for supporting more metadata.
2558
2559 Parse the whole file at once
2560 Implement new methods or functions that enable parsing of a
2561 complete file at once, returning a list of hashes. A possible
2562 extension could be to enable column selection on the call:
2563
2564 my @AoH = $csv->parse_file ($filename, { cols => [ 1, 4..8, 12 ]});
2565
2566 Returning something like
2567
2568 [ { fields => [ 1, 2, "foo", 4.5, undef, "", 8 ],
2569 flags => [ ... ],
2570 },
2571 { fields => [ ... ],
2572 .
2573 },
2574 ]
2575
2576 Note that the "csv" function already supports most of this, but does
2577 not return flags. "getline_all" returns all rows for an open stream,
2578 but this will not return flags either. "fragment" can reduce the
2579 required rows or columns, but cannot combine them.
2580
2581 Cookbook
2582 Write a document that has recipes for most known non-standard (and
2583 maybe some standard) "CSV" formats, including formats that use
2584 "TAB", ";", "|", or other non-comma separators.
2585
2586 Examples could be taken from W3C's CSV on the Web: Use Cases and
2587 Requirements <http://w3c.github.io/csvw/use-cases-and-
2588 requirements/index.html>
2589
2590 Steal
2591 Steal good new ideas and features from PapaParse
2592 <http://papaparse.com> or csvkit <http://csvkit.readthedocs.org>.
2593
2594 Perl6 support
2595 I'm already working on perl6 support here
2596 <https://github.com/Tux/CSV>. No promises yet on when it is finished
2597 (or fast). Trying to keep the API alike as much as possible.
2598
2599 NOT TODO
2600 combined methods
2601 Requests for adding means (methods) that combine "combine" and
2602 "string" in a single call will not be honored (use "print" instead).
2603 Likewise for "parse" and "fields" (use "getline" instead), given the
2604 problems with embedded newlines.
2605
2606 Release plan
2607 No guarantees, but this is what I had in mind some time ago:
2608
2609 · DIAGNOSTICS section in pod to *describe* the errors (see below)
2610
2612 The current hard-coding of characters and character ranges makes this
2613 code unusable on "EBCDIC" systems. Recent work in perl-5.20 might
2614 change that.
2615
2616 Opening "EBCDIC" encoded files on "ASCII"+ systems is likely to
2617 succeed using Encode's "cp37", "cp1047", or "posix-bc":
2618
2619 open my $fh, "<:encoding(cp1047)", "ebcdic_file.csv" or die "...";
2620
2622 Still under construction ...
2623
2624 If an error occurs, "$csv->error_diag" can be used to get information
2625 on the cause of the failure. Note that for speed reasons the internal
2626 value is never cleared on success, so using the value returned by
2627 "error_diag" in normal cases - when no error occurred - may cause
2628 unexpected results.
2629
2630 If the constructor failed, the cause can be found using "error_diag" as
2631 a class method, like "Text::CSV_XS->error_diag".
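
     For example, a deliberately invalid combination of attributes (this
     one triggers error 1001, listed below):

         my $csv = Text::CSV_XS->new ({ sep_char => ";", quote_char => ";" })
             or die "Cannot use CSV: " . Text::CSV_XS->error_diag;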
2632
2633 The "$csv->error_diag" method is automatically invoked upon error when
2634 the constructor was called with "auto_diag" set to 1 or 2, or when
2635 autodie is in effect. When set to 1, this will cause a "warn" with the
2636 error message, when set to 2, it will "die". "2012 - EOF" is excluded
2637 from "auto_diag" reports.
2638
2639 Errors can be (individually) caught using the "error" callback.
2640
2641 The errors as described below are available. I have tried to make the
2642 error itself explanatory enough, but more descriptions will be added.
2643 For most of these errors, the first three capitals describe the error
2644 category:
2645
2646 · INI
2647
2648 Initialization error or option conflict.
2649
2650 · ECR
2651
2652 Carriage-Return related parse error.
2653
2654 · EOF
2655
2656 End-Of-File related parse error.
2657
2658 · EIQ
2659
2660 Parse error inside quotation.
2661
2662 · EIF
2663
2664 Parse error inside field.
2665
2666 · ECB
2667
2668 Combine error.
2669
2670 · EHR
2671
2672 HashRef parse related error.
2673
2674 And below should be the complete list of error codes that can be
2675 returned:
2676
2677 · 1001 "INI - sep_char is equal to quote_char or escape_char"
2678
2679 The separation character cannot be equal to the quotation
2680 character or to the escape character, as this would invalidate all
2681 parsing rules.
2682
2683 · 1002 "INI - allow_whitespace with escape_char or quote_char SP or
2684 TAB"
2685
2686 Using the "allow_whitespace" attribute when either "quote_char" or
2687 "escape_char" is equal to "SPACE" or "TAB" is too ambiguous to
2688 allow.
2689
2690 · 1003 "INI - \r or \n in main attr not allowed"
2691
2692 Using default "eol" characters in either "sep_char", "quote_char",
2693 or "escape_char" is not allowed.
2694
2695 · 1004 "INI - callbacks should be undef or a hashref"
2696
2697 The "callbacks" attribute only allows one to be "undef" or a hash
2698 reference.
2699
2700 · 1005 "INI - EOL too long"
2701
2702 The value passed for EOL exceeds its maximum length (16).
2703
2704 · 1006 "INI - SEP too long"
2705
2706 The value passed for SEP exceeds its maximum length (16).
2707
2708 · 1007 "INI - QUOTE too long"
2709
2710 The value passed for QUOTE exceeds its maximum length (16).
2711
2712 · 1008 "INI - SEP undefined"
2713
2714 The value passed for SEP should be defined and not empty.
2715
2716 · 1010 "INI - the header is empty"
2717
2718 The header line parsed in the "header" is empty.
2719
2720 · 1011 "INI - the header contains more than one valid separator"
2721
2722 The header line parsed in the "header" contains more than one
2723 (unique) separator character out of the allowed set of separators.
2724
2725 · 1012 "INI - the header contains an empty field"
2726
2727 The header line parsed in the "header" contains an empty field.
2728
2729 · 1013 "INI - the header contains non-unique fields"
2730
2731 The header line parsed in the "header" contains at least two
2732 identical fields.
2733
2734 · 1014 "INI - header called on undefined stream"
2735
2736 The header line cannot be parsed from an undefined source.
2737
2738 · 1500 "PRM - Invalid/unsupported argument(s)"
2739
2740 Function or method called with invalid argument(s) or parameter(s).
2741
2742 · 1501 "PRM - The key attribute is passed as an unsupported type"
2743
2744 The "key" attribute is of an unsupported type.
2745
2746 · 1502 "PRM - The value attribute is passed without the key attribute"
2747
2748 The "value" attribute is only allowed when a valid key is given.
2749
2750 · 1503 "PRM - The value attribute is passed as an unsupported type"
2751
2752 The "value" attribute is of an unsupported type.
2753
2754 · 2010 "ECR - QUO char inside quotes followed by CR not part of EOL"
2755
2756 When "eol" has been set to anything but the default, like
2757 "\r\t\n", and the "\r" is following the second (closing)
2758 "quote_char", where the characters following the "\r" do not make up
2759 the "eol" sequence, this is an error.
2760
2761 · 2011 "ECR - Characters after end of quoted field"
2762
2763 Sequences like "1,foo,"bar"baz,22,1" are not allowed. "bar" is a
2764 quoted field and after the closing double-quote, there should be
2765 either a new-line sequence or a separation character.
2766
2767 · 2012 "EOF - End of data in parsing input stream"
2768
2769 Self-explanatory. End-of-file while parsing a stream. Can
2770 happen only when reading from streams with "getline", as using
2771 "parse" is done on strings that are not required to have a trailing
2772 "eol".
2773
2774 · 2013 "INI - Specification error for fragments RFC7111"
2775
2776 Invalid URI "fragment" specification (see RFC 7111).
2777
2778 · 2014 "ENF - Inconsistent number of fields"
2779
2780 Inconsistent number of fields under strict parsing.
2781
2782 · 2021 "EIQ - NL char inside quotes, binary off"
2783
2784 Sequences like "1,"foo\nbar",22,1" are allowed only when the binary
2785 option has been selected with the constructor.
2786
2787 · 2022 "EIQ - CR char inside quotes, binary off"
2788
2789 Sequences like "1,"foo\rbar",22,1" are allowed only when the binary
2790 option has been selected with the constructor.
2791
2792 · 2023 "EIQ - QUO character not allowed"
2793
2794 Sequences like ""foo "bar" baz",qu" and "2023,",2008-04-05,"Foo,
2795 Bar",\n" will cause this error.
2796
2797 · 2024 "EIQ - EOF cannot be escaped, not even inside quotes"
2798
2799 The escape character is not allowed as last character in an input
2800 stream.
2801
2802 · 2025 "EIQ - Loose unescaped escape"
2803
2804 An escape character should escape only characters that need escaping.
2805
2806 Allowing the escape for other characters is possible with the
2807 attribute "allow_loose_escapes".
2808
2809 · 2026 "EIQ - Binary character inside quoted field, binary off"
2810
2811 Binary characters are not allowed by default. Exceptions are
2812 fields that contain valid UTF-8, which will automatically be
2813 upgraded. Set "binary" to 1 to accept binary
2814 data.
2815
2816 · 2027 "EIQ - Quoted field not terminated"
2817
2818 When parsing a field that started with a quotation character, the
2819 field is expected to be closed with a quotation character. When the
2820 parsed line is exhausted before the quote is found, that field is not
2821 terminated.
2822
2823 · 2030 "EIF - NL char inside unquoted verbatim, binary off"
2824
2825 · 2031 "EIF - CR char is first char of field, not part of EOL"
2826
2827 · 2032 "EIF - CR char inside unquoted, not part of EOL"
2828
2829 · 2034 "EIF - Loose unescaped quote"
2830
2831 · 2035 "EIF - Escaped EOF in unquoted field"
2832
2833 · 2036 "EIF - ESC error"
2834
2835 · 2037 "EIF - Binary character in unquoted field, binary off"
2836
2837 · 2110 "ECB - Binary character in Combine, binary off"
2838
2839 · 2200 "EIO - print to IO failed. See errno"
2840
2841 · 3001 "EHR - Unsupported syntax for column_names ()"
2842
2843 · 3002 "EHR - getline_hr () called before column_names ()"
2844
2845 · 3003 "EHR - bind_columns () and column_names () fields count
2846 mismatch"
2847
2848 · 3004 "EHR - bind_columns () only accepts refs to scalars"
2849
2850 · 3006 "EHR - bind_columns () did not pass enough refs for parsed
2851 fields"
2852
2853 · 3007 "EHR - bind_columns needs refs to writable scalars"
2854
2855 · 3008 "EHR - unexpected error in bound fields"
2856
2857 · 3009 "EHR - print_hr () called before column_names ()"
2858
2859 · 3010 "EHR - print_hr () called with invalid arguments"
2860
2862 IO::File, IO::Handle, IO::Wrap, Text::CSV, Text::CSV_PP,
2863 Text::CSV::Encoded, Text::CSV::Separator, Text::CSV::Slurp,
2864 Spreadsheet::CSV and Spreadsheet::Read, and of course perl.
2865
2866 If you are using perl6, you can have a look at "Text::CSV" in the
2867 perl6 ecosystem, offering the same features.
2868
2869 non-perl
2870
2871 A CSV parser in JavaScript, also used by W3C <http://www.w3.org>, is
2872 the multi-threaded in-browser PapaParse <http://papaparse.com/>.
2873
2874 csvkit <http://csvkit.readthedocs.org> is a python CSV parsing toolkit.
2875
2877 Alan Citterman <alan@mfgrtl.com> wrote the original Perl module.
2878 Please don't send mail concerning Text::CSV_XS to Alan, who is not
2879 involved in the C/XS part that is now the main part of the module.
2880
2881 Jochen Wiedmann <joe@ispsoft.de> rewrote the en- and decoding in C by
2882 implementing a simple finite-state machine. He added variable quote,
2883 escape and separator characters, the binary mode and the print and
2884 getline methods. See ChangeLog releases 0.10 through 0.23.
2885
2886 H.Merijn Brand <h.m.brand@xs4all.nl> cleaned up the code, added the
2887 field flags methods, wrote the major part of the test suite, completed
2888 the documentation, fixed most RT bugs, added all the allow flags and
2889 the "csv" function. See ChangeLog releases 0.25 and on.
2890
2892 Copyright (C) 2007-2019 H.Merijn Brand. All rights reserved.
2893 Copyright (C) 1998-2001 Jochen Wiedmann. All rights reserved.
2894 Copyright (C) 1997 Alan Citterman. All rights reserved.
2895
2896 This library is free software; you can redistribute and/or modify it
2897 under the same terms as Perl itself.
2898
2899
2900
2901perl v5.28.1 2019-02-27 CSV_XS(3)