Text::CSV(3pm)

Text::CSV(3)          User Contributed Perl Documentation         Text::CSV(3)

NAME

       Text::CSV - comma-separated values manipulator (using XS or PurePerl)

SYNOPSIS

        use Text::CSV;

        my @rows;
        my $csv = Text::CSV->new ( { binary => 1 } )  # should set binary attribute.
                        or die "Cannot use CSV: ".Text::CSV->error_diag ();

        open my $fh, "<:encoding(utf8)", "test.csv" or die "test.csv: $!";
        while ( my $row = $csv->getline( $fh ) ) {
            $row->[2] =~ m/pattern/ or next; # 3rd field should match
            push @rows, $row;
        }
        $csv->eof or $csv->error_diag();
        close $fh;

        $csv->eol ("\r\n");

        open $fh, ">:encoding(utf8)", "new.csv" or die "new.csv: $!";
        $csv->print ($fh, $_) for @rows;
        close $fh or die "new.csv: $!";

        #
        # parse and combine style
        #

        $status = $csv->combine(@columns);    # combine columns into a string
        $line   = $csv->string();             # get the combined string

        $status  = $csv->parse($line);        # parse a CSV string into fields
        @columns = $csv->fields();            # get the parsed fields

        $status       = $csv->status ();      # get the most recent status
        $bad_argument = $csv->error_input (); # get the most recent bad argument
        $diag         = $csv->error_diag ();  # if an error occurred, explains WHY

        $status = $csv->print ($io, $colref); # Write an array of fields
                                              # immediately to a file $io
        $colref = $csv->getline ($io);        # Read a line from file $io,
                                              # parse it and return an array
                                              # ref of fields
        $csv->column_names (@names);          # Set column names for getline_hr ()
        $ref = $csv->getline_hr ($io);        # getline (), but returns a hashref
        $eof = $csv->eof ();                  # Indicate if last parse or
                                              # getline () hit End Of File

        $csv->types(\@t_array);               # Set column types

DESCRIPTION

       Text::CSV is now a thin wrapper around Text::CSV_XS-compatible backend
       modules. Every backend provides facilities for composing and
       decomposing comma-separated values. Text::CSV uses Text::CSV_XS by
       default and, when Text::CSV_XS is not available, falls back on
       Text::CSV_PP, which is bundled in the same distribution as this module.

CHOOSING BACKEND

       This module respects an environment variable called "PERL_TEXT_CSV"
       when it decides which backend module to use. If this environment
       variable is not set, it tries to load Text::CSV_XS, and if Text::CSV_XS
       is not available, falls back on Text::CSV_PP.

       If you never want it to fall back on Text::CSV_PP, set the variable
       like this ("export" may be "setenv", "set", or the like, depending on
       your environment):

         > export PERL_TEXT_CSV=Text::CSV_XS

       If you prefer Text::CSV_XS but accept Text::CSV_PP as a fallback (the
       default), then:

         > export PERL_TEXT_CSV=Text::CSV_XS,Text::CSV_PP

       You may also want to set this variable at the top of your test files
       so that incompatibilities between backends do not get in the way. Wrap
       the assignment in "BEGIN" and place it before the "use Text::CSV"
       statement, because the module chooses its backend as soon as it is
       loaded:

         BEGIN { $ENV{PERL_TEXT_CSV}='Text::CSV_PP'; }
         use Text::CSV;
85
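       As a quick check of which backend was actually picked, the following
       sketch forces the pure-Perl backend and then asks Text::CSV what it
       loaded. The "backend" and "is_pp" class methods belong to Text::CSV
       but are not described in this section, so treat their use here as an
       assumption to verify against your installed version.

```perl
use strict;
use warnings;

# Must run before Text::CSV is loaded: the backend is chosen at load time.
BEGIN { $ENV{PERL_TEXT_CSV} = 'Text::CSV_PP'; }
use Text::CSV;

# Name of the module that does the actual work.
my $backend = Text::CSV->backend;
my $is_pp   = Text::CSV->is_pp ? 1 : 0;

print "backend: $backend\n";
print "pure perl: ", ($is_pp ? "yes" : "no"), "\n";
```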

NOTES

87       This section is taken from Text::CSV_XS.
88
   Embedded newlines
       Important Note: The default behavior is to accept only ASCII
       characters in the range from 0x20 (space) to 0x7E (tilde). This means
       that fields cannot contain newlines. If your data contains newlines
       embedded in fields, characters above 0x7E (tilde), or binary data, you
       must set "binary => 1" in the call to "new". To cover the widest range
       of parsing options, you will always want to set binary.

       Even then, you still have to pass a complete record to the "parse"
       method, which the usual line-by-line reading loop does not guarantee:

        my $csv = Text::CSV->new ({ binary => 1, eol => $/ });
        while (<>) {           #  WRONG!
            $csv->parse ($_);
            my @fields = $csv->fields ();
            }

       This will break, as the "while" might read broken lines: it does not
       care about the quoting. If you need to support embedded newlines, the
       way to go is to not pass "eol" to the parser (it accepts "\n", "\r",
       and "\r\n" by default) and then let "getline" read the records:

        my $csv = Text::CSV->new ({ binary => 1 });
        open my $fh, "<", $file or die "$file: $!";
        while (my $row = $csv->getline ($fh)) {
            my @fields = @$row;
            }

       The old(er) way of using global file handles is still supported:

        while (my $row = $csv->getline (*ARGV)) { ... }
122
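       A minimal, self-contained sketch of the "getline" approach described
       above. The sample data and the in-memory filehandle are made up for
       illustration; with a real file only the "open" call would differ.

```perl
use strict;
use warnings;
use Text::CSV;

# Two records; the second field of the first record spans two lines.
my $data = qq{1,"line one\nline two",3\n4,five,6\n};

my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
open my $fh, "<", \$data or die "in-memory file: $!";

my @rows;
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;
    }
close $fh;

printf "%d records, first record has %d fields\n",
    scalar @rows, scalar @{$rows[0]};
```

       Reading the same data with "while (<$fh>)" and "parse" would hand the
       parser the first physical line only, which ends inside a quoted field.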
   Unicode
       Unicode is only tested to work with perl-5.8.2 and up.

       See also "BOM".

       The simplest way to ensure the correct encoding is used for input and
       output is either to set layers on the filehandles or to set the
       "encoding" argument for "csv".

        open my $fh, "<:encoding(UTF-8)", "in.csv"  or die "in.csv: $!";
       or
        my $aoa = csv (in => "in.csv",     encoding => "UTF-8");

        open my $fh, ">:encoding(UTF-8)", "out.csv" or die "out.csv: $!";
       or
        csv (in => $aoa, out => "out.csv", encoding => "UTF-8");

       On parsing (both for "getline" and "parse"), if the source is marked
       as UTF8, then all fields that are marked binary will also be marked
       UTF8.

       On combining ("print" and "combine"): if any of the combining fields
       was marked UTF8, the resulting string will be marked as UTF8. Note
       however that fields before the first field marked UTF8 that contain
       8-bit characters not upgraded to UTF8 will be "bytes" in the resulting
       string too, possibly causing unexpected errors. If you pass data of
       different encodings, or you do not know whether the encodings differ,
       force the data to be upgraded before you pass it on:

        $csv->print ($fh, [ map { utf8::upgrade (my $x = $_); $x } @data ]);
154
155       For complete control over encoding, please use Text::CSV::Encoded:
156
157        use Text::CSV::Encoded;
158        my $csv = Text::CSV::Encoded->new ({
159            encoding_in  => "iso-8859-1", # the encoding comes into   Perl
160            encoding_out => "cp1252",     # the encoding comes out of Perl
161            });
162
163        $csv = Text::CSV::Encoded->new ({ encoding  => "utf8" });
164        # combine () and print () accept *literally* utf8 encoded data
165        # parse () and getline () return *literally* utf8 encoded data
166
167        $csv = Text::CSV::Encoded->new ({ encoding  => undef }); # default
168        # combine () and print () accept UTF8 marked data
169        # parse () and getline () return UTF8 marked data
170
171   BOM
172       BOM  (or Byte Order Mark)  handling is available only inside the
173       "header" method.   This method supports the following encodings:
174       "utf-8", "utf-1", "utf-32be", "utf-32le", "utf-16be", "utf-16le",
175       "utf-ebcdic", "scsu", "bocu-1", and "gb-18030". See Wikipedia
176       <https://en.wikipedia.org/wiki/Byte_order_mark>.
177
178       If a file has a BOM, the easiest way to deal with that is
179
180        my $aoh = csv (in => $file, detect_bom => 1);
181
       All records will be encoded based on the detected BOM.

       This implies a call to the "header" method, which by default also sets
       the "column_names". So this is not the same as

        my $aoh = csv (in => $file, headers => "auto");

       which only reads the first record to set "column_names" but ignores
       the meaning of any BOM that may be present.

METHODS

193       This section is also taken from Text::CSV_XS.
194
195   version
196       (Class method) Returns the current module version.
197
198   new
199       (Class method) Returns a new instance of class Text::CSV. The
200       attributes are described by the (optional) hash ref "\%attr".
201
202        my $csv = Text::CSV->new ({ attributes ... });
203
204       The following attributes are available:
205
206       eol
207
208        my $csv = Text::CSV->new ({ eol => $/ });
209                  $csv->eol (undef);
210        my $eol = $csv->eol;
211
212       The end-of-line string to add to rows for "print" or the record
213       separator for "getline".
214
       When not passed in a parser instance, the default behavior is to
       accept "\n", "\r", and "\r\n", so it is probably safer to not specify
       "eol" at all. Passing "undef" or the empty string has the same effect.

       When not passed in a generating instance, records are not terminated
       at all, so it is probably wise to pass something you expect. A safe
       choice for "eol" on output is either $/ or "\r\n".

       Common values for "eol" are "\012" ("\n" or Line Feed), "\015\012"
       ("\r\n" or Carriage Return, Line Feed), and "\015" ("\r" or Carriage
       Return). The "eol" attribute cannot exceed 7 (ASCII) characters.

       If both $/ and "eol" equal "\015", lines that end with only a Carriage
       Return without Line Feed will be parsed correctly.
229
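       The effect of "eol" on generated output can be seen with "combine" and
       "string". This is a small sketch with made-up field values: one object
       has no "eol" and one uses "\r\n".

```perl
use strict;
use warnings;
use Text::CSV;

my $bare = Text::CSV->new ({ binary => 1 });
my $crlf = Text::CSV->new ({ binary => 1, eol => "\r\n" });

$bare->combine ("a", "b") or die "combine failed";
$crlf->combine ("a", "b") or die "combine failed";

my $without = $bare->string;   # record without any terminator
my $with    = $crlf->string;   # record terminated by CR LF

printf "without eol: %d bytes, with eol: %d bytes\n",
    length $without, length $with;
```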
230       sep_char
231
232        my $csv = Text::CSV->new ({ sep_char => ";" });
233                $csv->sep_char (";");
234        my $c = $csv->sep_char;
235
       The character used to separate fields, by default a comma (",").
       Limited to a single-byte character, usually in the range from 0x20
       (space) to 0x7E (tilde). When longer sequences are required, use "sep".

       The separation character cannot be equal to the quote character or to
       the escape character.
242
243       sep
244
245        my $csv = Text::CSV->new ({ sep => "\N{FULLWIDTH COMMA}" });
246                  $csv->sep (";");
247        my $sep = $csv->sep;
248
249       The chars used to separate fields, by default undefined. Limited to 8
250       bytes.
251
252       When set, overrules "sep_char".  If its length is one byte it acts as
253       an alias to "sep_char".
254
255       quote_char
256
257        my $csv = Text::CSV->new ({ quote_char => "'" });
258                $csv->quote_char (undef);
259        my $c = $csv->quote_char;
260
261       The character to quote fields containing blanks or binary data,  by
262       default the double quote character (""").  A value of undef suppresses
263       quote chars (for simple cases only). Limited to a single-byte
264       character, usually in the range from  0x20 (space) to  0x7E (tilde).
265       When longer sequences are required, use "quote".
266
267       "quote_char" can not be equal to "sep_char".
268
269       quote
270
271        my $csv = Text::CSV->new ({ quote => "\N{FULLWIDTH QUOTATION MARK}" });
272                    $csv->quote ("'");
273        my $quote = $csv->quote;
274
275       The chars used to quote fields, by default undefined. Limited to 8
276       bytes.
277
278       When set, overrules "quote_char". If its length is one byte it acts as
279       an alias to "quote_char".
280
281       escape_char
282
283        my $csv = Text::CSV->new ({ escape_char => "\\" });
284                $csv->escape_char (":");
285        my $c = $csv->escape_char;
286
287       The character to  escape  certain characters inside quoted fields.
288       This is limited to a  single-byte  character,  usually  in the  range
289       from  0x20 (space) to 0x7E (tilde).
290
291       The "escape_char" defaults to being the double-quote mark ("""). In
292       other words the same as the default "quote_char". This means that
293       doubling the quote mark in a field escapes it:
294
295        "foo","bar","Escape ""quote mark"" with two ""quote marks""","baz"
296
       If you change the "quote_char" without changing the "escape_char", the
       "escape_char" will still be the double-quote ("""). If instead you want
       to escape the "quote_char" by doubling it, you will also need to change
       the "escape_char" to the same character as the new "quote_char".

       Setting "escape_char" to "undef" or "" will disable escaping completely
       and is greatly discouraged. This will also disable "escape_null".

       The escape character cannot be equal to the separation character.
307
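       To illustrate the point about keeping "quote_char" and "escape_char"
       in step, this sketch sets both to a single quote, so that an embedded
       quote is escaped by doubling it. The field values are invented for the
       example.

```perl
use strict;
use warnings;
use Text::CSV;

# Same character for quoting and escaping: embedded quotes get doubled.
my $csv = Text::CSV->new ({ quote_char => "'", escape_char => "'" })
    or die "Cannot use CSV: " . Text::CSV->error_diag ();

$csv->combine ("it's", "plain") or die "combine failed";
my $line = $csv->string;

# Round trip: parsing the generated line restores the original fields.
$csv->parse ($line) or die "parse failed";
my @fields = $csv->fields;

print "$line\n";
```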
308       binary
309
310        my $csv = Text::CSV->new ({ binary => 1 });
311                $csv->binary (0);
312        my $f = $csv->binary;
313
314       If this attribute is 1,  you may use binary characters in quoted
315       fields, including line feeds, carriage returns and "NULL" bytes. (The
316       latter could be escaped as ""0".) By default this feature is off.
317
318       If a string is marked UTF8,  "binary" will be turned on automatically
319       when binary characters other than "CR" and "NL" are encountered.   Note
320       that a simple string like "\x{00a0}" might still be binary, but not
321       marked UTF8, so setting "{ binary => 1 }" is still a wise option.
322
323       strict
324
325        my $csv = Text::CSV->new ({ strict => 1 });
326                $csv->strict (0);
327        my $f = $csv->strict;
328
329       If this attribute is set to 1, any row that parses to a different
330       number of fields than the previous row will cause the parser to throw
331       error 2014.
332
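       A short sketch of "strict" in action, assuming the field count is
       checked for "parse" in the same way as for "getline". The first record
       establishes the expected number of fields, so the second one is
       rejected.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ binary => 1, strict => 1 });

my $first  = $csv->parse ("a,b,c");   # three fields: sets the expectation
my $second = $csv->parse ("1,2");     # two fields: should be rejected

# In numeric context error_diag yields the error code.
my $code = 0 + $csv->error_diag;

print "first ok: ",  ($first  ? "yes" : "no"), "\n";
print "second ok: ", ($second ? "yes" : "no"), " (error $code)\n";
```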
333       formula_handling
334
335       formula
336
337        my $csv = Text::CSV->new ({ formula => "none" });
338                $csv->formula ("none");
339        my $f = $csv->formula;
340
341       This defines the behavior of fields containing formulas. As formulas
342       are considered dangerous in spreadsheets, this attribute can define an
343       optional action to be taken if a field starts with an equal sign ("=").
344
345       For purpose of code-readability, this can also be written as
346
347        my $csv = Text::CSV->new ({ formula_handling => "none" });
348                $csv->formula_handling ("none");
349        my $f = $csv->formula_handling;
350
351       Possible values for this attribute are
352
353       none
354         Take no specific action. This is the default.
355
356          $csv->formula ("none");
357
358       die
359         Cause the process to "die" whenever a leading "=" is encountered.
360
361          $csv->formula ("die");
362
363       croak
364         Cause the process to "croak" whenever a leading "=" is encountered.
365         (See Carp)
366
367          $csv->formula ("croak");
368
369       diag
370         Report position and content of the field whenever a leading  "=" is
371         found.  The value of the field is unchanged.
372
373          $csv->formula ("diag");
374
375       empty
376         Replace the content of fields that start with a "=" with the empty
377         string.
378
379          $csv->formula ("empty");
380          $csv->formula ("");
381
382       undef
383         Replace the content of fields that start with a "=" with "undef".
384
385          $csv->formula ("undef");
386          $csv->formula (undef);
387
       All other values will give a warning and then fall back to "diag".
389
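       As an illustration of one of the values listed above, this sketch
       uses "empty" while parsing, assuming the installed backend applies
       formula handling on input. The sample line is made up.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ binary => 1, formula => "empty" });

# The second field starts with "=" and is therefore treated as a formula.
$csv->parse ("1,=2+3,4") or die "parse failed";
my @fields = $csv->fields;

print join ("|", map { defined $_ ? $_ : "undef" } @fields), "\n";
```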
390       decode_utf8
391
392        my $csv = Text::CSV->new ({ decode_utf8 => 1 });
393                $csv->decode_utf8 (0);
394        my $f = $csv->decode_utf8;
395
       This attribute defaults to TRUE.

       While parsing, fields that are valid UTF-8 are automatically marked as
       UTF-8, so that

         $csv->parse ("\xC4\xA8\n");

       results in

         PV("\304\250"\0) [UTF8 "\x{128}"]

       Sometimes this is not the desired action. To prevent those upgrades,
       set this attribute to false, and the result will be

         PV("\304\250"\0)
411
412       auto_diag
413
414        my $csv = Text::CSV->new ({ auto_diag => 1 });
415                $csv->auto_diag (2);
416        my $l = $csv->auto_diag;
417
       Setting this attribute to a number between 1 and 9 causes "error_diag"
       to be called automatically in void context upon errors.

       In case of error "2012 - EOF", this call will be void.

       If "auto_diag" is set to a numeric value greater than 1, it will "die"
       on errors instead of "warn". If set to anything unrecognized, it will
       be silently ignored.

       Future extensions to this feature will include more reliable
       auto-detection of "autodie" being active in the scope in which the
       error occurred, which will increment the value of "auto_diag" by 1 the
       moment the error is detected.
431
432       diag_verbose
433
434        my $csv = Text::CSV->new ({ diag_verbose => 1 });
435                $csv->diag_verbose (2);
436        my $l = $csv->diag_verbose;
437
438       Set the verbosity of the output triggered by "auto_diag".   Currently
439       only adds the current  input-record-number  (if known)  to the
440       diagnostic output with an indication of the position of the error.
441
442       blank_is_undef
443
444        my $csv = Text::CSV->new ({ blank_is_undef => 1 });
445                $csv->blank_is_undef (0);
446        my $f = $csv->blank_is_undef;
447
       Under normal circumstances, "CSV" data makes no distinction between
       quoted and unquoted empty fields. Both end up as an empty string field
       once read, thus

        1,"",," ",2

       is read as

        ("1", "", "", " ", "2")

       When writing "CSV" files with either "always_quote" or "quote_empty"
       set, the unquoted empty field is the result of an undefined value. To
       enable this distinction when reading "CSV" data, the "blank_is_undef"
       attribute will cause unquoted empty fields to be set to "undef",
       causing the above to be parsed as

        ("1", "", undef, " ", "2")

       Note that this is specifically important when loading "CSV" fields
       into a database that allows "NULL" values, as the perl equivalent for
       "NULL" is "undef" in DBI land.
469
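       The distinction can be checked directly with "parse" and "fields".
       This sketch feeds the sample line from above to a parser with
       "blank_is_undef" set and renders undefined fields as the word "undef"
       for display only.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ blank_is_undef => 1 });

# Field 2 is a quoted empty string, field 3 is an unquoted empty field.
$csv->parse (q{1,"",," ",2}) or die "parse failed";
my @fields = $csv->fields;

print join ("|", map { defined $_ ? "[$_]" : "undef" } @fields), "\n";
```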
470       empty_is_undef
471
472        my $csv = Text::CSV->new ({ empty_is_undef => 1 });
473                $csv->empty_is_undef (0);
474        my $f = $csv->empty_is_undef;
475
476       Going one  step  further  than  "blank_is_undef",  this attribute
477       converts all empty fields to "undef", so
478
479        1,"",," ",2
480
481       is read as
482
483        (1, undef, undef, " ", 2)
484
       Note that this affects only fields that are originally empty, not
       fields that are empty after stripping allowed whitespace. YMMV.
487
488       allow_whitespace
489
490        my $csv = Text::CSV->new ({ allow_whitespace => 1 });
491                $csv->allow_whitespace (0);
492        my $f = $csv->allow_whitespace;
493
       When this option is set to true, the whitespace ("TAB"'s and
       "SPACE"'s) surrounding the separation character is removed when
       parsing. If either "TAB" or "SPACE" is one of the three characters
       "sep_char", "quote_char", or "escape_char", it will not be considered
       whitespace.

       Now lines like:

        1 , "foo" , bar , 3 , zapp

       are parsed as valid "CSV", even though they violate the "CSV" specs.

       Note that all whitespace is stripped from both the start and the end
       of each field. That makes this attribute more than just a way to parse
       bad "CSV" lines, as

        1,   2.0,  3,   ape  , monkey

       will now be parsed as

        ("1", "2.0", "3", "ape", "monkey")

       even if the original line was perfectly acceptable "CSV".
517
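       A small sketch that parses the first sample line above with
       "allow_whitespace" enabled; nothing beyond the documented attribute is
       assumed.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ allow_whitespace => 1 });

# Blanks around the separators and around the quoted field are dropped.
$csv->parse (q{1 , "foo" , bar , 3 , zapp}) or die "parse failed";
my @fields = $csv->fields;

print join ("|", @fields), "\n";
```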
518       allow_loose_quotes
519
520        my $csv = Text::CSV->new ({ allow_loose_quotes => 1 });
521                $csv->allow_loose_quotes (0);
522        my $f = $csv->allow_loose_quotes;
523
       By default, parsing unquoted fields containing "quote_char" characters
       like

        1,foo "bar" baz,42

       would result in parse error 2034. Though it is still bad practice to
       allow this format, we cannot help the fact that some vendors make
       their applications spit out lines styled this way.

       If there is really bad "CSV" data, like

        1,"foo "bar" baz",42

       or

        1,""foo bar baz"",42

       there is a way to get this data-line parsed and leave the quotes inside
       the quoted field as-is. This can be achieved by setting
       "allow_loose_quotes" AND making sure that the "escape_char" is not
       equal to "quote_char".
545
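       The first case above (quotes inside an unquoted field) can be
       compared side by side. This sketch parses the same line with a default
       parser and with one that has "allow_loose_quotes" set; the variable
       names are illustrative.

```perl
use strict;
use warnings;
use Text::CSV;

my $line = q{1,foo "bar" baz,42};

# Default parser: a quote inside an unquoted field is an error.
my $default    = Text::CSV->new ({ binary => 1 });
my $default_ok = $default->parse ($line);

# Loose parser: the quotes are kept as part of the field.
my $loose = Text::CSV->new ({ binary => 1, allow_loose_quotes => 1 });
$loose->parse ($line) or die "parse failed";
my @fields = $loose->fields;

print "default parser accepted the line: ", ($default_ok ? "yes" : "no"), "\n";
print "loose parser, field 2: $fields[1]\n";
```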
546       allow_loose_escapes
547
548        my $csv = Text::CSV->new ({ allow_loose_escapes => 1 });
549                $csv->allow_loose_escapes (0);
550        my $f = $csv->allow_loose_escapes;
551
       Parsing fields that have "escape_char" characters escaping characters
       that do not need to be escaped, like:

        my $csv = Text::CSV->new ({ escape_char => "\\" });
        $csv->parse (q{1,"my bar\'s",baz,42}); # q{} keeps the backslash

       would result in parse error 2025. Though it is bad practice to allow
       this format, this attribute enables you to treat all escape character
       sequences equally.
561
562       allow_unquoted_escape
563
564        my $csv = Text::CSV->new ({ allow_unquoted_escape => 1 });
565                $csv->allow_unquoted_escape (0);
566        my $f = $csv->allow_unquoted_escape;
567
       A backward compatibility issue where "escape_char" differs from
       "quote_char" prevents "escape_char" from being in the first position of
       a field. If "quote_char" is equal to the default """ and "escape_char"
       is set to "\", this would be illegal:

        1,\0,2

       Setting this attribute to 1 might help to overcome issues with
       backward compatibility and allow this style.
577
578       always_quote
579
580        my $csv = Text::CSV->new ({ always_quote => 1 });
581                $csv->always_quote (0);
582        my $f = $csv->always_quote;
583
584       By default the generated fields are quoted only if they need to be.
585       For example, if they contain the separator character. If you set this
586       attribute to 1 then all defined fields will be quoted. ("undef" fields
587       are not quoted, see "blank_is_undef"). This makes it quite often easier
588       to handle exported data in external applications.
589
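       A brief sketch of the difference on output, using invented values
       that include an undefined field and an empty string:

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ binary => 1, always_quote => 1 });

# Defined fields (including the empty string) are quoted; undef is not.
$csv->combine (1, "two", undef, "") or die "combine failed";
my $line = $csv->string;

print "$line\n";
```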
590       quote_space
591
592        my $csv = Text::CSV->new ({ quote_space => 1 });
593                $csv->quote_space (0);
594        my $f = $csv->quote_space;
595
       By default, a space in a field triggers quotation. As no rule exists
       that forces this in "CSV", nor any for the opposite, the default is
       true for safety. You can exclude the space from this trigger by
       setting this attribute to 0.
600
601       quote_empty
602
603        my $csv = Text::CSV->new ({ quote_empty => 1 });
604                $csv->quote_empty (0);
605        my $f = $csv->quote_empty;
606
607       By default the generated fields are quoted only if they need to be.
608       An empty (defined) field does not need quotation. If you set this
609       attribute to 1 then empty defined fields will be quoted.  ("undef"
610       fields are not quoted, see "blank_is_undef"). See also "always_quote".
611
612       quote_binary
613
614        my $csv = Text::CSV->new ({ quote_binary => 1 });
615                $csv->quote_binary (0);
616        my $f = $csv->quote_binary;
617
618       By default,  all "unsafe" bytes inside a string cause the combined
619       field to be quoted.  By setting this attribute to 0, you can disable
620       that trigger for bytes >= 0x7F.
621
622       escape_null
623
624        my $csv = Text::CSV->new ({ escape_null => 1 });
625                $csv->escape_null (0);
626        my $f = $csv->escape_null;
627
       By default, a "NULL" byte in a field is escaped. This option enables
       you to treat the "NULL" byte as a simple binary character in binary
       mode (when "{ binary => 1 }" is set). The default is true. You can
       prevent "NULL" escapes by setting this attribute to 0.

       When the "escape_char" attribute is set to undefined, this attribute
       will be set to false.

       The default setting will encode "=\x00=" as

        "="0="

       With "escape_null" set to 0, this will result in

        "=\x00="
643
644       The default when using the "csv" function is "false".
645
646       For backward compatibility reasons,  the deprecated old name
647       "quote_null" is still recognized.
648
649       keep_meta_info
650
651        my $csv = Text::CSV->new ({ keep_meta_info => 1 });
652                $csv->keep_meta_info (0);
653        my $f = $csv->keep_meta_info;
654
655       By default, the parsing of input records is as simple and fast as
656       possible.  However,  some parsing information - like quotation of the
657       original field - is lost in that process.  Setting this flag to true
658       enables retrieving that information after parsing with  the methods
659       "meta_info",  "is_quoted", and "is_binary" described below.  Default is
660       false for performance.
661
       If you set this attribute to a value greater than 9, then you can
       control output quotation style like it was used in the input of the
       last parsed record (unless quotation was added because of other
       reasons).

        my $csv = Text::CSV->new ({
           binary         => 1,
           keep_meta_info => 1,
           quote_space    => 0,
           });

        $csv->parse (q{1,,"", ," ",f,"g","h""h",help,"help"});
        my @row = $csv->fields;   # parse () returns a status, not the fields

        $csv->print (*STDOUT, \@row);
        # 1,,, , ,f,g,"h""h",help,help
        $csv->keep_meta_info (11);
        $csv->print (*STDOUT, \@row);
        # 1,,"", ," ",f,"g","h""h",help,"help"
680
681       undef_str
682
683        my $csv = Text::CSV->new ({ undef_str => "\\N" });
684                $csv->undef_str (undef);
685        my $s = $csv->undef_str;
686
687       This attribute optionally defines the output of undefined fields. The
688       value passed is not changed at all, so if it needs quotation, the
689       quotation needs to be included in the value of the attribute.  Use with
690       caution, as passing a value like  ",",,,,"""  will for sure mess up
691       your output. The default for this attribute is "undef", meaning no
692       special treatment.
693
694       This attribute is useful when exporting  CSV data  to be imported in
695       custom loaders, like for MySQL, that recognize special sequences for
696       "NULL" data.
697
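       A minimal sketch of the loader use case: undefined fields are written
       as "\N", the marker MySQL's loader treats as "NULL". The field values
       are invented for the example.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ binary => 1, undef_str => "\\N" });

# The undefined field is emitted as the two characters \ and N, unquoted.
$csv->combine (1, undef, 3) or die "combine failed";
my $line = $csv->string;

print "$line\n";
```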
698       verbatim
699
700        my $csv = Text::CSV->new ({ verbatim => 1 });
701                $csv->verbatim (0);
702        my $f = $csv->verbatim;
703
704       This is a quite controversial attribute to set,  but makes some hard
705       things possible.
706
707       The rationale behind this attribute is to tell the parser that the
708       normally special characters newline ("NL") and Carriage Return ("CR")
709       will not be special when this flag is set,  and be dealt with  as being
710       ordinary binary characters. This will ease working with data with
711       embedded newlines.
712
713       When  "verbatim"  is used with  "getline",  "getline"  auto-"chomp"'s
714       every line.
715
716       Imagine a file format like
717
718        M^^Hans^Janssen^Klas 2\n2A^Ja^11-06-2007#\r\n
719
       where the line ending is a very specific "#\r\n" and the sep_char is
       a "^" (caret). None of the fields is quoted, but embedded binary data
       is likely to be present. With the specific line ending, this should
       not be too hard to detect.

       By default, the "parse" function of Text::CSV only knows "\n" and
       "\r" as legal line endings, and so has to treat the embedded newline
       as a real end-of-line, so that it can scan the next line if binary is
       true and the newline is inside a quoted field. With this option, we
       tell "parse" to treat "\n" as nothing more than a binary character.

       For "parse" this means that the parser has no notion of line endings
       anymore, and "getline" "chomp"s line endings on reading.
734
735       types
736
737       A set of column types; the attribute is immediately passed to the
738       "types" method.
739
740       callbacks
741
742       See the "Callbacks" section below.
743
744       accessors
745
746       To sum it up,
747
748        $csv = Text::CSV->new ();
749
750       is equivalent to
751
752        $csv = Text::CSV->new ({
753            eol                   => undef, # \r, \n, or \r\n
754            sep_char              => ',',
755            sep                   => undef,
756            quote_char            => '"',
757            quote                 => undef,
758            escape_char           => '"',
759            binary                => 0,
760            decode_utf8           => 1,
761            auto_diag             => 0,
762            diag_verbose          => 0,
763            blank_is_undef        => 0,
764            empty_is_undef        => 0,
765            allow_whitespace      => 0,
766            allow_loose_quotes    => 0,
767            allow_loose_escapes   => 0,
768            allow_unquoted_escape => 0,
769            always_quote          => 0,
770            quote_empty           => 0,
771            quote_space           => 1,
772            escape_null           => 1,
773            quote_binary          => 1,
774            keep_meta_info        => 0,
775            verbatim              => 0,
776            undef_str             => undef,
777            types                 => undef,
778            callbacks             => undef,
779            });
780
       For all of the above mentioned flags, an accessor method is available
       with which you can query the current value or change it:
783
784        my $quote = $csv->quote_char;
785        $csv->binary (1);
786
787       It is not wise to change these settings halfway through writing "CSV"
788       data to a stream. If however you want to create a new stream using the
789       available "CSV" object, there is no harm in changing them.
790
791       If the "new" constructor call fails,  it returns "undef",  and makes
792       the fail reason available through the "error_diag" method.
793
794        $csv = Text::CSV->new ({ ecs_char => 1 }) or
795            die "".Text::CSV->error_diag ();
796
797       "error_diag" will return a string like
798
799        "INI - Unknown attribute 'ecs_char'"
800
801   known_attributes
802        @attr = Text::CSV->known_attributes;
803        @attr = Text::CSV::known_attributes;
804        @attr = $csv->known_attributes;
805
806       This method will return an ordered list of all the supported
807       attributes as described above.   This can be useful for knowing what
808       attributes are valid in classes that use or extend Text::CSV.
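
       As a minimal sketch of that use case, the list can be turned into a
       lookup to reject misspelled options before they reach "new".  The
       option hash and the misspelled key "seperator" are made up for
       illustration.

```perl
use strict;
use warnings;
use Text::CSV;

# Build a lookup of the attribute names this Text::CSV version accepts.
my %known = map { $_ => 1 } Text::CSV->known_attributes;

# Hypothetical user-supplied options; "seperator" is a deliberate typo.
my %opts    = (sep_char => ";", seperator => ";");
my @unknown = sort grep { !$known{$_} } keys %opts;

print "Unknown attribute(s): @unknown\n" if @unknown;
```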
809
810   print
811        $status = $csv->print ($fh, $colref);
812
813       Similar to  "combine" + "string" + "print",  but much more efficient.
814       It expects an array ref as input  (not an array!)  and the resulting
815       string is not really  created,  but  immediately  written  to the  $fh
816       object, typically an IO handle or any other object that offers a
817       "print" method.
818
819       For performance reasons  "print"  does not create a result string,  so
820       all "string", "status", "fields", and "error_input" methods will return
821       undefined information after executing this method.
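
       A small sketch of "print" in use, writing to an in-memory handle so
       the result can be inspected.  The field values and the choice of
       "\n" for "eol" are assumptions made for the example.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ binary => 1, eol => "\n" })
    or die "" . Text::CSV->error_diag;

# Write to a scalar instead of a real file so the output can be checked.
open my $fh, ">", \my $out or die "in-memory handle: $!";

# Note the array ref: print () takes a reference, not a list.
$csv->print ($fh, [ "code", "plain text", 'say "hi"' ])
    or die "" . $csv->error_diag;
close $fh or die "close: $!";

print $out;
```

       With the default "quote_space" setting the second field is quoted
       because it contains a space, and the embedded quotes in the third
       field are doubled.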
822
823       If $colref is "undef"  (explicit,  not through a variable argument) and
824       "bind_columns"  was used to specify fields to be printed,  it is
825       possible to make performance improvements, as otherwise data would have
826       to be copied as arguments to the method call:
827
828        $csv->bind_columns (\($foo, $bar));
829        $status = $csv->print ($fh, undef);
830
831       A short benchmark
832
833        my @data = ("aa" .. "zz");
834        $csv->bind_columns (\(@data));
835
836        $csv->print ($fh, [ @data ]);   # 11800 recs/sec
837        $csv->print ($fh,  \@data  );   # 57600 recs/sec
838        $csv->print ($fh,   undef  );   # 48500 recs/sec
839
840   say
841        $status = $csv->say ($fh, $colref);
842
843       Like "print", but "eol" defaults to "$\".
844
845   print_hr
846        $csv->print_hr ($fh, $ref);
847
848       Provides an easy way  to print a  $ref  (as fetched with "getline_hr")
849       provided the column names are set with "column_names".
850
851       It is just a wrapper method with basic parameter checks over
852
853        $csv->print ($fh, [ map { $ref->{$_} } $csv->column_names ]);
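
       A short sketch of "print_hr" with made-up column names and data.  The
       order of the output fields follows "column_names", not the order of
       the keys in the hash.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ binary => 1, eol => "\n" })
    or die "" . Text::CSV->error_diag;

# The column names define which hash keys are printed, and in what order.
$csv->column_names (qw( code name price ));

open my $fh, ">", \my $out or die "in-memory handle: $!";
$csv->print_hr ($fh, { name => "Widget", price => "9.95", code => "W1" });
close $fh or die "close: $!";

print $out;
```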
854
855   combine
856        $status = $csv->combine (@fields);
857
858       This method constructs a "CSV" record from  @fields,  returning success
859       or failure.   Failure can result from lack of arguments or an argument
860       that contains an invalid character.   Upon success,  "string" can be
861       called to retrieve the resultant "CSV" string.  Upon failure,  the
862       value returned by "string" is undefined and "error_input" could be
863       called to retrieve the invalid argument.
864
865   string
866        $line = $csv->string ();
867
868       This method returns the input to  "parse"  or the resultant "CSV"
869       string of "combine", whichever was called more recently.
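
       The round trip through "combine", "string", "parse", and "fields" can
       be sketched as follows.  The sample values are illustrative only.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ binary => 1 })
    or die "" . Text::CSV->error_diag;

# Build one CSV record from a list of fields.
$csv->combine ("W1", "Widget, large", "9.95")
    or die "combine failed on: " . $csv->error_input;
my $line = $csv->string;

# Parse the record back into its fields.
$csv->parse ($line)
    or die "parse failed on: " . $csv->error_input;
my @fields = $csv->fields;

# After parse (), string () returns the input that was just parsed.
my $last = $csv->string;

print "$line\n";
```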
870
871   getline
872        $colref = $csv->getline ($fh);
873
874       This is the counterpart to  "print",  as "parse"  is the counterpart to
875       "combine":  it parses a row from the $fh  handle using the "getline"
876       method associated with $fh  and parses this row into an array ref.
877       This array ref is returned by the function or "undef" for failure.
878       When $fh does not support "getline", you are likely to hit errors.
879
880       When fields are bound with "bind_columns" the return value is a
881       reference to an empty list.
882
883       As with "print", the "string", "fields", and "status" methods return
883       undefined information after this call.
884
885   getline_all
886        $arrayref = $csv->getline_all ($fh);
887        $arrayref = $csv->getline_all ($fh, $offset);
888        $arrayref = $csv->getline_all ($fh, $offset, $length);
889
890       This will return a reference to a list of getline ($fh) results.  In
891       this call, "keep_meta_info" is disabled.  If $offset is negative, as
892       with "splice", only the last  "abs ($offset)" records of $fh are taken
893       into consideration.
894
895       Given a CSV file with 10 lines:
896
897        lines call
898        ----- ---------------------------------------------------------
899        0..9  $csv->getline_all ($fh)         # all
900        0..9  $csv->getline_all ($fh,  0)     # all
901        8..9  $csv->getline_all ($fh,  8)     # start at 8
902        -     $csv->getline_all ($fh,  0,  0) # start at 0 first 0 rows
903        0..4  $csv->getline_all ($fh,  0,  5) # start at 0 first 5 rows
904        4..5  $csv->getline_all ($fh,  4,  2) # start at 4 first 2 rows
905        8..9  $csv->getline_all ($fh, -2)     # last 2 rows
906        6..7  $csv->getline_all ($fh, -4,  2) # first 2 of last  4 rows
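
       Two rows of the table above can be reproduced with a sketch like the
       following.  The ten-line data set is generated in memory and the
       helper name "slice_of" is hypothetical.

```perl
use strict;
use warnings;
use Text::CSV;

# Ten illustrative records: "row0,0" through "row9,9".
my $data = join "", map { "row$_,$_\n" } 0 .. 9;

# Hypothetical helper: open a fresh handle and fetch a slice of the rows.
sub slice_of {
    my @args = @_;
    my $csv  = Text::CSV->new ({ binary => 1 })
        or die "" . Text::CSV->error_diag;
    open my $fh, "<", \$data or die "in-memory handle: $!";
    my $rows = $csv->getline_all ($fh, @args);
    close $fh;
    return [ map { $_->[0] } @$rows ];   # keep only the first field
    }

my $middle = slice_of (4, 2);   # start at 4, first 2 rows
my $tail   = slice_of (-2);     # last 2 rows

print "@$middle\n@$tail\n";
```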
907
908   getline_hr
909       The "getline_hr" and "column_names" methods work together  to allow you
910       to have rows returned as hashrefs.  You must call "column_names" first
911       to declare your column names.
912
913        $csv->column_names (qw( code name price description ));
914        $hr = $csv->getline_hr ($fh);
915        print "Price for $hr->{name} is $hr->{price} EUR\n";
916
917       "getline_hr" will croak if called before "column_names".
918
919       Note that  "getline_hr"  creates a hashref for every row and will be
920       much slower than the combined use of "bind_columns"  and "getline",
921       which still offers the same easy-to-use hashref inside the loop:
922
923        my @cols = @{$csv->getline ($fh)};
924        $csv->column_names (@cols);
925        while (my $row = $csv->getline_hr ($fh)) {
926            print $row->{price};
927            }
928
929       Could easily be rewritten to the much faster:
930
931        my @cols = @{$csv->getline ($fh)};
932        my $row = {};
933        $csv->bind_columns (\@{$row}{@cols});
934        while ($csv->getline ($fh)) {
935            print $row->{price};
936            }
937
938       Your mileage may vary for the size of the data and the number of rows.
939       With perl-5.14.2 the comparison for a 100_000 line file with 14 columns:
940
941                   Rate hashrefs getlines
942        hashrefs 1.00/s       --     -76%
943        getlines 4.15/s     313%       --
944
945   getline_hr_all
946        $arrayref = $csv->getline_hr_all ($fh);
947        $arrayref = $csv->getline_hr_all ($fh, $offset);
948        $arrayref = $csv->getline_hr_all ($fh, $offset, $length);
949
950       This will return a reference to a list of   getline_hr ($fh) results.
951       In this call, "keep_meta_info" is disabled.
952
953   parse
954        $status = $csv->parse ($line);
955
956       This method decomposes a  "CSV"  string into fields,  returning success
957       or failure.   Failure can result from a lack of argument  or from an
958       improperly formatted "CSV" string.   Upon success, "fields" can be
959       called to retrieve the decomposed fields. Upon failure, calling "fields"
960       will return undefined data and  "error_input"  can be called to
961       retrieve  the invalid argument.
962
963       You may use the "types"  method for setting column types.  See the
964       description of "types" below.
965
966       The $line argument is supposed to be a simple scalar. Everything else
967       is supposed to croak and set error 1500.
968
969   fragment
970       This function tries to implement RFC7111  (URI Fragment Identifiers for
971       the text/csv Media Type) - http://tools.ietf.org/html/rfc7111
972
973        my $AoA = $csv->fragment ($fh, $spec);
974
975       In specifications,  "*" is used to specify the last item, a dash ("-")
976       to indicate a range.   All indices are 1-based:  the first row or
977       column has index 1. Selections can be combined with the semi-colon
978       (";").
979
980       When using this method in combination with  "column_names",  the
981       returned reference  will point to a  list of hashes  instead of a  list
982       of lists.  A disjointed  cell-based combined selection  might return
983       rows with different number of columns making the use of hashes
984       unpredictable.
985
986        $csv->column_names ("Name", "Age");
987        my $AoH = $csv->fragment ($fh, "col=3;8");
988
989       If the "after_parse" callback is active,  it is also called on every
990       line parsed and skipped before the fragment.
991
992       row
993          row=4
994          row=5-7
995          row=6-*
996          row=1-2;4;6-*
997
998       col
999          col=2
1000          col=1-3
1001          col=4-*
1002          col=1-2;4;7-*
1003
1004       cell
1005         In cell-based selection, the comma (",") is used to pair row and
1006         column
1007
1008          cell=4,1
1009
1010         The range operator ("-") using "cell"s can be used to define top-left
1011         and bottom-right "cell" location
1012
1013          cell=3,1-4,6
1014
1015         The "*" is only allowed in the second part of a pair
1016
1017          cell=3,2-*,2    # row 3 till end, only column 2
1018          cell=3,2-3,*    # column 2 till end, only row 3
1019          cell=3,2-*,*    # strip row 1 and 2, and column 1
1020
1021         Cells and cell ranges may be combined with ";", possibly resulting in
1022         rows with different number of columns
1023
1024          cell=1,1-2,2;3,3-4,4;1,4;4,1
1025
1026         Disjointed selections will only return selected cells.   The cells
1027         that are not  specified  will  not  be  included  in the  returned
1028         set,  not even as "undef".  As an example given a "CSV" like
1029
1030          11,12,13,...19
1031          21,22,...28,29
1032          :            :
1033          91,...97,98,99
1034
1035         with "cell=1,1-2,2;3,3-4,4;1,4;4,1" will return:
1036
1037          11,12,14
1038          21,22
1039          33,34
1040          41,43,44
1041
1042         Overlapping cell-specs will return those cells only once, so
1043         "cell=1,1-3,3;2,2-4,4;2,3;4,2" will return:
1044
1045          11,12,13
1046          21,22,23,24
1047          31,32,33,34
1048          42,43,44
1049
1050       RFC7111 <http://tools.ietf.org/html/rfc7111> does  not  allow different
1051       types of specs to be combined   (either "row" or "col" or "cell").
1052       Passing an invalid fragment specification will croak and set error
1053       2013.
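
       The three selection types can be tried out with a sketch like this
       one.  It builds a small 4x4 grid in memory in the style of the example
       above; the helper name "fragment_of" is hypothetical.

```perl
use strict;
use warnings;
use Text::CSV;

# Illustrative 4x4 grid: 11,12,13,14 / 21,... / 31,... / 41,...
my $data = join "", map {
    my $r = $_;
    join (",", map { $r . $_ } 1 .. 4) . "\n";
    } 1 .. 4;

# Hypothetical helper: apply one fragment spec to a fresh handle.
sub fragment_of {
    my ($spec) = @_;
    my $csv = Text::CSV->new ({ binary => 1 })
        or die "" . Text::CSV->error_diag;
    open my $fh, "<", \$data or die "in-memory handle: $!";
    my $aoa = $csv->fragment ($fh, $spec);
    close $fh;
    return $aoa;
    }

my $rows  = fragment_of ("row=2-3");        # rows 2 and 3, all columns
my $cols  = fragment_of ("col=1;3");        # columns 1 and 3 of every row
my $cells = fragment_of ("cell=1,1-2,2");   # top-left 2x2 block

print join (",", @$_), "\n" for @$rows;
```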
1054
1055   column_names
1056       Set the "keys" that will be used in the  "getline_hr"  calls.  If no
1057       keys (column names) are passed, it will return the current setting as a
1058       list.
1059
1060       "column_names" accepts a list of scalars  (the column names)  or a
1061       single array_ref, so you can pass the return value from "getline" too:
1062
1063        $csv->column_names ($csv->getline ($fh));
1064
1065       "column_names" does no checking on duplicates at all, which might lead
1066       to unexpected results.   Undefined entries will be replaced with the
1067       string "\cAUNDEF\cA", so
1068
1069        $csv->column_names (undef, "", "name", "name");
1070        $hr = $csv->getline_hr ($fh);
1071
1072       Will set "$hr->{"\cAUNDEF\cA"}" to the 1st field,  "$hr->{""}" to the
1073       2nd field, and "$hr->{name}" to the 4th field,  discarding the 3rd
1074       field.
1075
1076       "column_names" croaks on invalid arguments.
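
       A minimal sketch of both uses of "column_names", as a setter fed by
       "getline" and as a getter.  The data is made up and read from an
       in-memory handle.

```perl
use strict;
use warnings;
use Text::CSV;

my $data = "code,name,price\nW1,Widget,9.95\n";
open my $fh, "<", \$data or die "in-memory handle: $!";

my $csv = Text::CSV->new ({ binary => 1 })
    or die "" . Text::CSV->error_diag;

# Setter: use the first line of the file as the column names.
$csv->column_names ($csv->getline ($fh));

# Getter: without arguments the current names are returned as a list.
my @names = $csv->column_names;

# The names are now the keys of the hashrefs returned by getline_hr ().
my $hr = $csv->getline_hr ($fh);
close $fh;

print "$hr->{name} costs $hr->{price}\n";
```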
1077
1078   header
1079       This method does NOT work in perl-5.6.x
1080
1081       Parse the CSV header and set "sep", column_names and encoding.
1082
1083        my @hdr = $csv->header ($fh);
1084        $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1085        $csv->header ($fh, { detect_bom => 1, munge_column_names => "lc" });
1086
1087       The first argument should be a file handle.
1088
1089       This method resets some object properties,  as it is supposed to be
1090       invoked only once per file or stream.  It will leave attributes
1091       "column_names" and "bound_columns" alone if setting column names is
1092       disabled. Reading headers on previously processed objects might fail on
1093       perl-5.8.0 and older.
1094
1095       Assuming that the file opened for parsing has a header, and the header
1096       does not contain problematic characters like embedded newlines,   read
1097       the first line from the open handle then auto-detect whether the header
1098       separates the column names with a character from the allowed separator
1099       list.
1100
1101       If any of the allowed separators matches,  and none of the other
1102       allowed separators match,  set  "sep"  to that  separator  for the
1103       current CSV instance and use it to parse the first line, map those to
1104       lowercase, and use that to set the instance "column_names":
1105
1106        my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
1107        open my $fh, "<", "file.csv";
1108        binmode $fh; # for Windows
1109        $csv->header ($fh);
1110        while (my $row = $csv->getline_hr ($fh)) {
1111            ...
1112            }
1113
1114       If the header is empty,  contains more than one unique separator out of
1115       the allowed set,  contains empty fields,   or contains identical fields
1116       (after folding), it will croak with error 1010, 1011, 1012, or 1013
1117       respectively.
1118
1119       If the header contains embedded newlines or is not valid  CSV  in any
1120       other way, this method will croak and leave the parse error untouched.
1121
1122       A successful call to "header"  will always set the  "sep"  of the $csv
1123       object. This behavior can not be disabled.
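
       The separator detection can be observed with a sketch like the
       following, which assumes a semicolon-separated sample held in memory.
       After the call, the detected separator is available through the
       "sep_char" accessor and the column names are lower-cased.

```perl
use strict;
use warnings;
use Text::CSV;

# Illustrative data using ";" rather than the default ",".
my $data = "Code;Name;Price\nW1;Widget;9.95\n";
open my $fh, "<", \$data or die "in-memory handle: $!";

my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 })
    or die "" . Text::CSV->error_diag;

# List context: the (munged) header fields are returned.
my @hdr = $csv->header ($fh);
my $sep = $csv->sep_char;

# The header also set column_names, so getline_hr () works right away.
my $row = $csv->getline_hr ($fh);
close $fh;

print "separator '$sep', columns: @hdr\n";
```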
1124
1125       return value
1126
1127       On error this method will croak.
1128
1129       In list context,  the headers will be returned whether they are used to
1130       set "column_names" or not.
1131
1132       In scalar context, the instance itself is returned.  Note: the values
1133       as found in the header will effectively be  lost if  "set_column_names"
1134       is false.
1135
1136       Options
1137
1138       sep_set
1139          $csv->header ($fh, { sep_set => [ ";", ",", "|", "\t" ] });
1140
1141         The list of legal separators defaults to "[ ";", "," ]" and can be
1142         changed by this option.  As this is probably the most often used
1143         option,  it can be passed on its own as an unnamed argument:
1144
1145          $csv->header ($fh, [ ";", ",", "|", "\t", "::", "\x{2063}" ]);
1146
1147         Multi-byte  sequences are allowed,  both multi-character and
1148         Unicode.  See "sep".
1149
1150       detect_bom
1151          $csv->header ($fh, { detect_bom => 1 });
1152
1153         The default behavior is to detect if the header line starts with a
1154         BOM.  If the header has a BOM, use that to set the encoding of $fh.
1155         This default behavior can be disabled by passing a false value to
1156         "detect_bom".
1157
1158         Supported encodings from BOM are: UTF-8, UTF-16BE, UTF-16LE,
1159         UTF-32BE,  and UTF-32LE. BOMs also exist for UTF-1, UTF-EBCDIC, SCSU,
1160         BOCU-1,  and GB-18030, but Encode does not (yet) support them. UTF-7
1161         is not supported.
1162
1163         If a supported BOM was detected at the start of the stream, the
1164         encoding is stored in the object attribute "ENCODING".
1165
1166          my $enc = $csv->{ENCODING};
1167
1168         The encoding is used with "binmode" on $fh.
1169
1170         If the handle was opened in a (correct) encoding,  this method will
1171         not alter the encoding, as it checks the leading bytes of the first
1172         line. In case the stream starts with a decoded BOM ("U+FEFF"),
1173         "{ENCODING}" will be "" (empty) instead of the default "undef".
1174
1175       munge_column_names
1176         This option offers the means to modify the column names into
1177         something that is most useful to the application.   The default is to
1178         map all column names to lower case.
1179
1180          $csv->header ($fh, { munge_column_names => "lc" });
1181
1182         The following values are available:
1183
1184           lc     - lower case
1185           uc     - upper case
1186           none   - do not change
1187           \%hash - supply a mapping
1188           \&cb   - supply a callback
1189
1190         Literal:
1191
1192          $csv->header ($fh, { munge_column_names => "none" });
1193
1194         Hash:
1195
1196          $csv->header ($fh, { munge_column_names => { foo => "sombrero" }});
1197
1198         If a value does not exist, the original value is used unchanged.
1199
1200         Callback:
1201
1202          $csv->header ($fh, { munge_column_names => sub { fc } });
1203          $csv->header ($fh, { munge_column_names => sub { "column_".$col++ } });
1204          $csv->header ($fh, { munge_column_names => sub { lc (s/\W+/_/gr) } });
1205
1206         As this callback is called in a "map", you can use $_ directly.
1207
1208       set_column_names
1209          $csv->header ($fh, { set_column_names => 1 });
1210
1211         The default is to set the instance's column names using
1212         "column_names" if the method is successful,  so subsequent calls to
1213         "getline_hr" can return a hash. Setting the column names can be
1214         disabled by passing a false value for this option.
1215
1216         As described in "return value" above, content is lost in scalar
1217         context.
1218
1219       Validation
1220
1221       When receiving CSV files from external sources,  this method can be
1222       used to protect against changes in the layout by restricting to known
1223       headers  (and typos in the header fields).
1224
1225        my %known = (
1226            "record key" => "c_rec",
1227            "rec id"     => "c_rec",
1228            "id_rec"     => "c_rec",
1229            "kode"       => "code",
1230            "code"       => "code",
1231            "vaule"      => "value",
1232            "value"      => "value",
1233            );
1234        my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
1235        open my $fh, "<", $source or die "$source: $!";
1236        $csv->header ($fh, { munge_column_names => sub {
1237            s/\s+$//;
1238            s/^\s+//;
1239            $known{lc $_} or die "Unknown column '$_' in $source";
1240            }});
1241        while (my $row = $csv->getline_hr ($fh)) {
1242            say join "\t", $row->{c_rec}, $row->{code}, $row->{value};
1243            }
1244
1245   bind_columns
1246       Takes a list of scalar references to be used for output with  "print"
1247       or to receive the fields fetched by "getline".  When you do not pass
1248       enough references to store the fetched fields in, "getline" will fail
1249       with error 3006.  If you pass more than there are fields to return,
1250       the content of the remaining references is left untouched.
1251
1252        $csv->bind_columns (\$code, \$name, \$price, \$description);
1253        while ($csv->getline ($fh)) {
1254            print "The price of a $name is \x{20ac} $price\n";
1255            }
1256
1257       To reset or clear all column binding, call "bind_columns" with the
1258       single argument "undef". This will also clear column names.
1259
1260        $csv->bind_columns (undef);
1261
1262       If no arguments are passed at all, "bind_columns" will return the list
1263       of current bindings or "undef" if no binds are active.
1264
1265       Note that in parsing with  "bind_columns",  the fields are set on the
1266       fly.  That implies that if the third field of a row causes an error
1267       (or this row has just two fields where the previous row had more),  the
1268       first two fields already have been assigned the values of the current
1269       row, while the rest of the fields will still hold the values of the
1270       previous row.  If you want the parser to fail in these cases, use the
1271       "strict" attribute.
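
       A self-contained sketch of parsing with bound columns.  The data and
       variable names are illustrative.  "getline" returns a reference to an
       empty list on success, so the loop condition stays true until the end
       of the data.

```perl
use strict;
use warnings;
use Text::CSV;

my $data = "W1,Widget,9.95\nG2,Gadget,19.50\n";
open my $fh, "<", \$data or die "in-memory handle: $!";

my $csv = Text::CSV->new ({ binary => 1 })
    or die "" . Text::CSV->error_diag;

# Each parsed field is stored directly in the bound scalar.
my ($code, $name, $price);
$csv->bind_columns (\$code, \$name, \$price);

my @seen;
while ($csv->getline ($fh)) {
    push @seen, "$code:$name:$price";
    }
close $fh;

# Clear the bindings once they are no longer needed.
$csv->bind_columns (undef);

print "$_\n" for @seen;
```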
1272
1273   eof
1274        $eof = $csv->eof ();
1275
1276       If "parse" or  "getline"  was used with an IO stream,  this method will
1277       return true (1) if the last call hit end of file,  otherwise it will
1278       return false ('').  This is useful to see the difference between a
1279       failure and end of file.
1280
1281       Note that if the parsing of the last line caused an error,  "eof" is
1282       still true.  That means that if you are not using "auto_diag", an idiom
1283       like
1284
1285        while (my $row = $csv->getline ($fh)) {
1286            # ...
1287            }
1288        $csv->eof or $csv->error_diag;
1289
1290       will not report the error. You would have to change that to
1291
1292        while (my $row = $csv->getline ($fh)) {
1293            # ...
1294            }
1295        +$csv->error_diag and $csv->error_diag;
1296
1297   types
1298        $csv->types (\@tref);
1299
1300       This method is used to force that  (all)  columns are of a given type.
1301       For example, if you have an integer column,  two  columns  with
1302       doubles  and a string column, then you might do a
1303
1304        $csv->types ([Text::CSV::IV (),
1305                      Text::CSV::NV (),
1306                      Text::CSV::NV (),
1307                      Text::CSV::PV ()]);
1308
1309       Column types are used only for decoding columns while parsing,  in
1310       other words by the "parse" and "getline" methods.
1311
1312       You can unset column types by doing a
1313
1314        $csv->types (undef);
1315
1316       or fetch the current type settings with
1317
1318        $types = $csv->types ();
1319
1320       IV  Set field type to integer.
1321
1322       NV  Set field type to numeric/float.
1323
1324       PV  Set field type to string.
1325
1326   fields
1327        @columns = $csv->fields ();
1328
1329       This method returns the input to   "combine"  or the resultant
1330       decomposed fields of a successful "parse", whichever was called more
1331       recently.
1332
1333       Note that the return value is undefined after using "getline", which
1334       does not fill the data structures returned by "parse".
1335
1336   meta_info
1337        @flags = $csv->meta_info ();
1338
1339       This method returns the "flags" of the input to "combine" or the flags
1340       of the resultant  decomposed fields of  "parse",   whichever was called
1341       more recently.
1342
1343       For each field,  a meta_info field will hold  flags that  inform
1344       something about  the  field  returned  by  the  "fields"  method or
1345       passed to  the "combine" method. The flags are bit-wise-"or"'d like:
1346
1347       0x0001
1348         The field was quoted.
1349
1350       0x0002
1351         The field was binary.
1352
1353       See the "is_***" methods below.
1354
1355   is_quoted
1356        my $quoted = $csv->is_quoted ($column_idx);
1357
1358       Where  $column_idx is the  (zero-based)  index of the column in the
1359       last result of "parse".
1360
1361       This returns a true value  if the data in the indicated column was
1362       enclosed in "quote_char" quotes.  This might be important for fields
1363       where content ",20070108," is to be treated as a numeric value,  and
1364       where ","20070108"," is explicitly marked as character string data.
1365
1366       This method is only valid when "keep_meta_info" is set to a true value.
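
       The date example above can be checked with a short sketch.  It assumes
       "keep_meta_info" is enabled in the constructor.

```perl
use strict;
use warnings;
use Text::CSV;

# keep_meta_info must be true, otherwise the flags are not recorded.
my $csv = Text::CSV->new ({ binary => 1, keep_meta_info => 1 })
    or die "" . Text::CSV->error_diag;

# Same content twice: once unquoted, once quoted.
$csv->parse ('20070108,"20070108"')
    or die "parse failed on: " . $csv->error_input;

my @fields = $csv->fields;
my @quoted = map { $csv->is_quoted ($_) ? 1 : 0 } 0 .. $#fields;

print "field $_ quoted: $quoted[$_]\n" for 0 .. $#fields;
```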
1367
1368   is_binary
1369        my $binary = $csv->is_binary ($column_idx);
1370
1371       Where  $column_idx is the  (zero-based)  index of the column in the
1372       last result of "parse".
1373
1374       This returns a true value if the data in the indicated column contained
1375       any byte in the range "[\x00-\x08,\x10-\x1F,\x7F-\xFF]".
1376
1377       This method is only valid when "keep_meta_info" is set to a true value.
1378
1379   is_missing
1380        my $missing = $csv->is_missing ($column_idx);
1381
1382       Where  $column_idx is the  (zero-based)  index of the column in the
1383       last result of "getline_hr".
1384
1385        $csv->keep_meta_info (1);
1386        while (my $hr = $csv->getline_hr ($fh)) {
1387            $csv->is_missing (0) and next; # This was an empty line
1388            }
1389
1390       When using  "getline_hr",  it is impossible to tell if the  parsed
1391       fields are "undef" because they were not filled in the "CSV" stream
1392       or because they were not read at all, as all the fields defined by
1393       "column_names" are set in the hash-ref.    If you still need to know if
1394       all fields in each row are provided, you should enable "keep_meta_info"
1395       so you can check the flags.
1396
1397       If  "keep_meta_info"  is "false",  "is_missing"  will always return
1398       "undef", regardless of $column_idx being valid or not. If this
1399       attribute is "true" it will return either 0 (the field is present) or 1
1400       (the field is missing).
1401
1402       A special case is the empty line.  If the line is completely empty -
1403       after dealing with the flags - this is still a valid CSV line:  it is a
1404       record of just one single empty field. However, if "keep_meta_info" is
1405       set, invoking "is_missing" with index 0 will now return true.
1406
1407   status
1408        $status = $csv->status ();
1409
1410       This method returns the status of the last invoked "combine" or "parse"
1411       call. Status is success (true: 1) or failure (false: "undef" or 0).
1412
1413   error_input
1414        $bad_argument = $csv->error_input ();
1415
1416       This method returns the erroneous argument (if it exists) of "combine"
1417       or "parse",  whichever was called more recently.  If the last
1418       invocation was successful, "error_input" will return "undef".
1419
1420   error_diag
1421        Text::CSV->error_diag ();
1422        $csv->error_diag ();
1423        $error_code               = 0  + $csv->error_diag ();
1424        $error_str                = "" . $csv->error_diag ();
1425        ($cde, $str, $pos, $rec, $fld) = $csv->error_diag ();
1426
1427       If (and only if) an error occurred,  this function returns  the
1428       diagnostics of that error.
1429
1430       If called in void context,  this will print the internal error code and
1431       the associated error message to STDERR.
1432
1433       If called in list context,  this will return  the error code  and the
1434       error message in that order.  If the last error was from parsing, the
1435       rest of the values returned are a best guess at the location  within
1436       the line  that was being parsed. Their values are 1-based.  The
1437       position currently is the index of the byte at which the parsing failed
1438       in the current record. It might change to be the index of the current
1439       character in a later release. The record number is the index of the
1440       record parsed by the csv instance. The field number is the index of
1441       the field the parser thinks it is currently trying to parse. See
1442       examples/csv-check for how this can be used.
1443
1444       If called in  scalar context,  it will return  the diagnostics  in a
1445       single scalar, a-la $!.  It will contain the error code in numeric
1446       context, and the diagnostics message in string context.
1447
1448       When called as a class method or a  direct function call,  the
1449       diagnostics are that of the last "new" call.
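
       The calling contexts can be compared with a sketch that provokes a
       parse error on purpose.  The input, with its unterminated quoted
       field, is made up for the example.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ binary => 1 })
    or die "" . Text::CSV->error_diag;

# The quoted field is never closed, so parse () fails.
my $input = 'W1,"unterminated';
my $ok    = $csv->parse ($input);

# List context: code, message, and a best guess at the location.
my ($code, $msg, $pos, $rec, $fld) = $csv->error_diag;

# Scalar context: one value that is numeric and string at the same time.
my $diag = $csv->error_diag;

# The argument that caused the failure.
my $bad_input = $csv->error_input;

print "parse failed with error $code: $msg\n" unless $ok;
```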
1450
1451   record_number
1452        $recno = $csv->record_number ();
1453
1454       Returns the number of records parsed by this csv instance.  This value
1455       should be more accurate than $. when embedded newlines come into play.
1456       Records written by this instance are not counted.
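
       A sketch of the difference between records and physical lines, using
       illustrative data in which the first record contains an embedded
       newline.

```perl
use strict;
use warnings;
use Text::CSV;

# Two records spread over three physical lines.
my $data = qq{W1,"two\nlines"\nG2,plain\n};
open my $fh, "<", \$data or die "in-memory handle: $!";

# binary => 1 is needed to accept the embedded newline.
my $csv = Text::CSV->new ({ binary => 1 })
    or die "" . Text::CSV->error_diag;

my $first  = $csv->getline ($fh);
my $second = $csv->getline ($fh);
my $recno  = $csv->record_number;
close $fh;

# Count the newlines in the raw data for comparison.
my $physical_lines = () = $data =~ /\n/g;

print "$recno records in $physical_lines physical lines\n";
```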
1457
1458   SetDiag
1459        $csv->SetDiag (0);
1460
1461       Use to reset the diagnostics if you are dealing with errors.
1462

ADDITIONAL METHODS

1464       backend
1465           Returns the backend module name called by Text::CSV.  "module" is
1466           an alias.
1467
1468       is_xs
1469           Returns true value if Text::CSV uses an XS backend.
1470
1471       is_pp
1472           Returns true value if Text::CSV uses a pure-Perl backend.
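
       A small sketch that reports which backend is in use.  All three
       methods are called here as class methods.

```perl
use strict;
use warnings;
use Text::CSV;

# Name of the module that does the actual work.
my $backend = Text::CSV->backend;

# Normalise the two predicates to 0 or 1 for easy comparison.
my $is_xs = Text::CSV->is_xs ? 1 : 0;
my $is_pp = Text::CSV->is_pp ? 1 : 0;

printf "Text::CSV is using %s (%s)\n",
    $backend, $is_xs ? "XS" : "pure Perl";
```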
1473

FUNCTIONS

1475       This section is also taken from Text::CSV_XS.
1476
1477   csv
1478       This function is not exported by default and should be explicitly
1479       requested:
1480
1481        use Text::CSV qw( csv );
1482
1483       This is a high-level function that aims at simple (user) interfaces.
1484       This can be used to read/parse a "CSV" file or stream (the default
1485       behavior) or to produce a file or write to a stream (define the  "out"
1486       attribute).  It returns an array- or hash-reference on parsing (or
1487       "undef" on fail) or the numeric value of  "error_diag"  on writing.
1488       When this function fails you can get to the error using the class call
1489       to "error_diag"
1490
1491        my $aoa = csv (in => "test.csv") or
1492            die Text::CSV->error_diag;
1493
1494       This function takes the arguments as key-value pairs. This can be
1495       passed as a list or as an anonymous hash:
1496
1497        my $aoa = csv (  in => "test.csv", sep_char => ";");
1498        my $aoh = csv ({ in => $fh, headers => "auto" });
1499
1500       The arguments passed consist of two parts:  the arguments to "csv"
1501       itself and the optional attributes to the  "CSV"  object used inside
1502       the function as enumerated and explained in "new".
1503
1504       If not overridden, the default options used for CSV are
1505
1506        auto_diag   => 1
1507        escape_null => 0
1508
1509       The option that is always set and cannot be altered is
1510
1511        binary      => 1
1512
1513       As this function will likely be used in one-liners,  it allows  "quote"
1514       to be abbreviated as "quo",  and  "escape_char" to be abbreviated as
1515       "esc" or "escape".
1516
1517       Alternative invocations:
1518
1519        my $aoa = Text::CSV::csv (in => "file.csv");
1520
1521        my $csv = Text::CSV->new ();
1522        my $aoa = $csv->csv (in => "file.csv");
1523
1524       In the latter case, the object attributes are used from the existing
1525       object and the attribute arguments in the function call are ignored:
1526
1527        my $csv = Text::CSV->new ({ sep_char => ";" });
1528        my $aoh = $csv->csv (in => "file.csv", bom => 1);
1529
1530       will parse using ";" as "sep_char", not ",".
1531
1532       in
1533
1534       Used to specify the source.  "in" can be a file name (e.g. "file.csv"),
1535       which will be  opened for reading  and closed when finished,  a file
1536       handle (e.g.  $fh or "FH"),  a reference to a glob (e.g. "\*ARGV"),
1537       the glob itself (e.g. *STDIN), or a reference to a scalar (e.g.
1538       "\q{1,2,"csv"}").
1539
1540       When used with "out", "in" should be a reference to a CSV structure
1541       (AoA or AoH)  or a CODE-ref that returns an array-reference or a hash-
1542       reference.  The code-ref will be invoked with no arguments.
1543
1544        my $aoa = csv (in => "file.csv");
1545
1546        open my $fh, "<", "file.csv";
1547        my $aoa = csv (in => $fh);
1548
1549        my $csv = [ [qw( Foo Bar )], [ 1, 2 ], [ 2, 3 ]];
1550        my $err = csv (in => $csv, out => "file.csv");
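
       Reading from a reference to a scalar is convenient for experiments, as
       no file is needed.  The sketch below uses made-up data and shows the
       result with and without the "headers" attribute.

```perl
use strict;
use warnings;
use Text::CSV qw( csv );

my $data = "code,name\nW1,Widget\nG2,Gadget\n";

# Without headers: an array of arrays, header line included.
my $aoa = csv (in => \$data)
    or die "" . Text::CSV->error_diag;

# With headers => "auto": an array of hashes keyed by the first line.
my $aoh = csv (in => \$data, headers => "auto")
    or die "" . Text::CSV->error_diag;

print scalar @$aoa, " rows as AoA, ", scalar @$aoh, " rows as AoH\n";
```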
1551
1552       If called in void context without the "out" attribute, the resulting
1553       ref will be used as input to a subsequent call to csv:
1554
1555        csv (in => "file.csv", filter => { 2 => sub { length > 2 }})
1556
1557       will be a shortcut to
1558
1559        csv (in => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}))
1560
1561       where, in the absence of the "out" attribute, this is a shortcut to
1562
1563        csv (in  => csv (in => "file.csv", filter => { 2 => sub { length > 2 }}),
1564             out => *STDOUT)
1565
1566       out
1567
1568        csv (in => $aoa, out => "file.csv");
1569        csv (in => $aoa, out => $fh);
1570        csv (in => $aoa, out =>   STDOUT);
1571        csv (in => $aoa, out =>  *STDOUT);
1572        csv (in => $aoa, out => \*STDOUT);
1573        csv (in => $aoa, out => \my $data);
1574        csv (in => $aoa, out =>  undef);
1575        csv (in => $aoa, out => \"skip");
1576
1577       In output mode, the default CSV options when producing CSV are
1578
1579        eol       => "\r\n"
1580
1581       The "fragment" attribute is ignored in output mode.
1582
1583       "out" can be a file name  (e.g.  "file.csv"),  which will be opened for
1584       writing and closed when finished,  a file handle (e.g. $fh or "FH"),  a
1585       reference to a glob (e.g. "\*STDOUT"),  the glob itself (e.g. *STDOUT),
1586       or a reference to a scalar (e.g. "\my $data").
1587
1588        csv (in => sub { $sth->fetch },            out => "dump.csv");
1589        csv (in => sub { $sth->fetchrow_hashref }, out => "dump.csv",
1590             headers => $sth->{NAME_lc});
1591
1592       When a code-ref is used for "in", the output is generated  per
1593       invocation, so no buffering is involved. This implies that there is no
1594       size restriction on the number of records. The "csv" function ends when
1595       the coderef returns a false value.
1596
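       The record-per-invocation behavior can be sketched without a
       database.  In this example the array @queue is a hypothetical
       stand-in for a statement handle, and the output goes to an in-memory
       scalar so the result can be inspected.

```perl
use strict;
use warnings;
use Text::CSV qw( csv );

# Hypothetical record source standing in for e.g. a database cursor.
my @queue = ([ "id", "name" ], [ 1, "foo" ], [ 2, "bar" ]);

# Each call hands out one record; the undef returned by shift on the
# empty queue is the false value that ends the csv () call.
csv (in => sub { shift @queue }, out => \my $buf);

print $buf;
```
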
1597       If "out" is set to a reference of the literal string "skip", the output
1598       will be suppressed completely,  which might be useful in combination
1599       with a filter for side effects only.
1600
1601        my %cache;
1602        csv (in    => "dump.csv",
1603             out   => \"skip",
1604             on_in => sub { $cache{$_[1][1]}++ });
1605
1606       Currently,  setting "out" to any false value  ("undef", "", 0) will be
1607       equivalent to "\"skip"".
1608
1609       encoding
1610
1611       If passed,  it should be an encoding accepted by the  ":encoding()"
1612       option to "open". There is no default value. This attribute does not
1613       work in perl 5.6.x.  "encoding" can be abbreviated to "enc" for ease of
1614       use in command line invocations.
1615
1616       If "encoding" is set to the literal value "auto", the method "header"
1617       will be invoked on the opened stream to check if there is a BOM and set
1618       the encoding accordingly.   This is equal to passing a true value in
1619       the option "detect_bom".
1620
1621       detect_bom
1622
1623       If  "detect_bom"  is given, the method  "header"  will be invoked on
1624       the opened stream to check if there is a BOM and set the encoding
1625       accordingly.
1626
1627       "detect_bom" can be abbreviated to "bom".
1628
1629       This is the same as setting "encoding" to "auto".
1630
1631       Note that as the method  "header" is invoked,  its default is to also
1632       set the headers.
1633
1634       headers
1635
1636       If this attribute is not given, the default behavior is to produce an
1637       array of arrays.
1638
1639       If "headers" is supplied,  it should be an anonymous list of column
1640       names, an anonymous hashref, a coderef, or a literal flag:  "auto",
1641       "lc", "uc", or "skip".
1642
1643       skip
1644         When "skip" is used, the header will not be included in the output.
1645
1646          my $aoa = csv (in => $fh, headers => "skip");
1647
1648       auto
1649         If "auto" is used, the first line of the "CSV" source will be read as
1650         the list of field headers and used to produce an array of hashes.
1651
1652          my $aoh = csv (in => $fh, headers => "auto");
1653
1654       lc
1655         If "lc" is used,  the first line of the  "CSV" source will be read as
1656         the list of field headers mapped to  lower case and used to produce
1657         an array of hashes. This is a variation of "auto".
1658
1659          my $aoh = csv (in => $fh, headers => "lc");
1660
1661       uc
1662         If "uc" is used,  the first line of the  "CSV" source will be read as
1663         the list of field headers mapped to  upper case and used to produce
1664         an array of hashes. This is a variation of "auto".
1665
1666          my $aoh = csv (in => $fh, headers => "uc");
1667
1668       CODE
1669         If a coderef is used,  the first line of the  "CSV" source will be
1670         read as the list of mangled field headers in which each field is
1671         passed as the only argument to the coderef. This list is used to
1672         produce an array of hashes.
1673
1674          my $aoh = csv (in      => $fh,
1675                         headers => sub { lc ($_[0]) =~ s/kode/code/gr });
1676
1677         this example is a variation of using "lc" where all occurrences of
1678         "kode" are replaced with "code".
1679
1680       ARRAY
1681         If  "headers"  is an anonymous list,  the entries in the list will be
1682         used as field names. The first line is considered data instead of
1683         headers.
1684
1685          my $aoh = csv (in => $fh, headers => [qw( Foo Bar )]);
1686          csv (in => $aoa, out => $fh, headers => [qw( code description price )]);
1687
1688       HASH
1689         If "headers" is a hash reference, this implies "auto", but header
1690         fields that exist as a key in the hashref will be replaced by the
1691         value for that key. Given a CSV file like
1692
1693          post-kode,city,name,id number,fubble
1694          1234AA,Duckstad,Donald,13,"X313DF"
1695
1696         using
1697
1698          csv (headers => { "post-kode" => "pc", "id number" => "ID" }, ...
1699
1700         will return an entry like
1701
1702          { pc     => "1234AA",
1703            city   => "Duckstad",
1704            name   => "Donald",
1705            ID     => "13",
1706            fubble => "X313DF",
1707            }
1708
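       A runnable variant of the hash-reference example might look like
       this.  It is a sketch only: the sample data is held in memory
       (variable $data) purely to keep the example self-contained.

```perl
use strict;
use warnings;
use Text::CSV qw( csv );

# The sample data from above, held in memory for the sake of the example.
my $data = join "\n",
    q{post-kode,city,name,id number,fubble},
    q{1234AA,Duckstad,Donald,13,"X313DF"},
    "";

my $aoh = csv (
    in      => \$data,
    headers => { "post-kode" => "pc", "id number" => "ID" },
    );

# Only the two mapped headers are renamed; the others are kept as-is.
print join (", ", sort keys %{$aoh->[0]}), "\n";
```
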
1709       See also "munge_column_names" and "set_column_names".
1710
1711       munge_column_names
1712
1713       If "munge_column_names" is set,  the method  "header"  is invoked on
1714       the opened stream with all matching arguments to detect and set the
1715       headers.
1716
1717       "munge_column_names" can be abbreviated to "munge".
1718
1719       key
1720
1721       If passed,  will default  "headers"  to "auto" and return a hashref
1722       instead of an array of hashes.
1723
1724        my $ref = csv (in => "test.csv", key => "code");
1725
1726       with test.csv like
1727
1728        code,product,price,color
1729        1,pc,850,gray
1730        2,keyboard,12,white
1731        3,mouse,5,black
1732
1733       will return
1734
1735         { 1   => {
1736               code    => 1,
1737               color   => 'gray',
1738               price   => 850,
1739               product => 'pc'
1740               },
1741           2   => {
1742               code    => 2,
1743               color   => 'white',
1744               price   => 12,
1745               product => 'keyboard'
1746               },
1747           3   => {
1748               code    => 3,
1749               color   => 'black',
1750               price   => 5,
1751               product => 'mouse'
1752               }
1753           }
1754
1755       The "key" attribute can be combined with "headers" for "CSV" data that
1756       has no header line, like
1757
1758        my $ref = csv (
1759            in      => "foo.csv",
1760            headers => [qw( c_foo foo bar description stock )],
1761            key     =>     "c_foo",
1762            );
1763
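       To make the shape of the returned hashref concrete, here is a small
       sketch that reads the test.csv content shown above from an in-memory
       string (an assumption made to keep it self-contained) and looks up
       one record by its key.

```perl
use strict;
use warnings;
use Text::CSV qw( csv );

# Same content as the test.csv example, held in a scalar.
my $data = join "\n",
    "code,product,price,color",
    "1,pc,850,gray",
    "2,keyboard,12,white",
    "3,mouse,5,black",
    "";

# The value of the "code" column becomes the key of the outer hash.
my $ref = csv (in => \$data, key => "code");

print "$ref->{2}{product} costs $ref->{2}{price}\n";
```
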
1764       keep_headers
1765
1766       When using hashes,  store the column names in the arrayref passed,  so
1767       all headers are available after the call in the original order.
1768
1769        my $aoh = csv (in => "file.csv", keep_headers => \my @hdr);
1770
1771       This attribute can be abbreviated to "kh" or passed as
1772       "keep_column_names".
1773
1774       This attribute implies a default of "auto" for the "headers" attribute.
1775
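       The kept headers are mostly useful for walking each record in file
       order, since hash keys have no order of their own.  A sketch, with
       made-up sample data in $data:

```perl
use strict;
use warnings;
use Text::CSV qw( csv );

# Hypothetical sample data with a header line.
my $data = "zip,city,name\n1234AA,Duckstad,Donald\n";

# @hdr receives the column names in the order they appear in the header.
my $aoh = csv (in => \$data, keep_headers => \my @hdr);

print join (",", map { "$_=$aoh->[0]{$_}" } @hdr), "\n";
```
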
1776       fragment
1777
1778       Only output the fragment as defined in the "fragment" method. This
1779       option is ignored when generating "CSV". See "out".
1780
1781       Combining all of them could give something like
1782
1783        use Text::CSV qw( csv );
1784        my $aoh = csv (
1785            in       => "test.txt",
1786            encoding => "utf-8",
1787            headers  => "auto",
1788            sep_char => "|",
1789            fragment => "row=3;6-9;15-*",
1790            );
1791        say $aoh->[15]{Foo};
1792
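       A smaller sketch that isolates the "fragment" attribute: it selects
       rows 2, 3, and 5 from five single-field rows.  The data is made up,
       and row numbers are assumed to be 1-based as in RFC 7111.

```perl
use strict;
use warnings;
use Text::CSV qw( csv );

# Five one-field records, one per line.
my $data = "a\nb\nc\nd\ne\n";

# Keep only rows 2 through 3 and row 5.
my $aoa = csv (in => \$data, fragment => "row=2-3;5");

print join (" ", map { $_->[0] } @$aoa), "\n";
```
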
1793       sep_set
1794
1795       If "sep_set" is set, the method "header" is invoked on the opened
1796       stream to detect and set "sep_char" with the given set.
1797
1798       "sep_set" can be abbreviated to "seps".
1799
1800       Note that as the  "header" method is invoked,  its default is to also
1801       set the headers.
1802
1803       set_column_names
1804
1805       If  "set_column_names" is passed,  the method "header" is invoked on
1806       the opened stream with all arguments meant for "header".
1807
1808       If "set_column_names" is passed as a false value, the content of the
1809       first row is only preserved if the output is AoA:
1810
1811       With an input-file like
1812
1813        bAr,foo
1814        1,2
1815        3,4,5
1816
1817       This call
1818
1819        my $aoa = csv (in => $file, set_column_names => 0);
1820
1821       will result in
1822
1823        [[ "bar", "foo"     ],
1824         [ "1",   "2"       ],
1825         [ "3",   "4",  "5" ]]
1826
1827       and
1828
1829        my $aoa = csv (in => $file, set_column_names => 0, munge => "none");
1830
1831       will result in
1832
1833        [[ "bAr", "foo"     ],
1834         [ "1",   "2"       ],
1835         [ "3",   "4",  "5" ]]
1836
1837   Callbacks
1838       Callbacks enable actions triggered from the inside of Text::CSV.
1839
1840       While most of what this enables  can easily be done in an  unrolled
1841       loop as described in the "SYNOPSIS", callbacks can be used to meet
1842       special demands or enhance the "csv" function.
1843
1844       error
1845          $csv->callbacks (error => sub { $csv->SetDiag (0) });
1846
1847         The "error"  callback is invoked when an error occurs,  but  only
1848         when "auto_diag" is set to a true value. A callback is invoked with
1849         the values returned by "error_diag":
1850
1851          my ($c, $s);
1852
1853          sub ignore3006
1854          {
1855              my ($err, $msg, $pos, $recno, $fldno) = @_;
1856              if ($err == 3006) {
1857                  # ignore this error
1858                  ($c, $s) = (undef, undef);
1859                  Text::CSV->SetDiag (0);
1860                  }
1861              # Any other error
1862              return;
1863              } # ignore3006
1864
1865          $csv->callbacks (error => \&ignore3006);
1866          $csv->bind_columns (\$c, \$s);
1867          while ($csv->getline ($fh)) {
1868              # Error 3006 will not stop the loop
1869              }
1870
1871       after_parse
1872          $csv->callbacks (after_parse => sub { push @{$_[1]}, "NEW" });
1873          while (my $row = $csv->getline ($fh)) {
1874              $row->[-1] eq "NEW";
1875              }
1876
1877         This callback is invoked after parsing with  "getline"  only if no
1878         error occurred.  The callback is invoked with two arguments:   the
1879         current "CSV" parser object and an array reference to the fields
1880         parsed.
1881
1882         The return code of the callback is ignored  unless it is a reference
1883         to the string "skip", in which case the record will be skipped in
1884         "getline_all".
1885
1886          sub add_from_db
1887          {
1888              my ($csv, $row) = @_;
1889              $sth->execute ($row->[4]);
1890              push @$row, $sth->fetchrow_array;
1891              } # add_from_db
1892
1893          my $aoa = csv (in => "file.csv", callbacks => {
1894              after_parse => \&add_from_db });
1895
1896         This hook can be used for validation:
1897
1898         FAIL
1899           Die if any of the records does not validate a rule:
1900
1901            after_parse => sub {
1902                $_[1][4] =~ m/^[0-9]{4}\s?[A-Z]{2}$/ or
1903                    die "5th field does not have a valid Dutch zipcode";
1904                }
1905
1906         DEFAULT
1907           Replace invalid fields with a default value:
1908
1909            after_parse => sub { $_[1][2] =~ m/^\d+$/ or $_[1][2] = 0 }
1910
1911         SKIP
1912           Skip records that have invalid fields (only applies to
1913           "getline_all"):
1914
1915            after_parse => sub { $_[1][0] =~ m/^\d+$/ or return \"skip"; }
1916
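       The DEFAULT rule above can be tried end to end as follows.  This is
       a sketch with made-up data in $data; the third field of every record
       is forced to 0 when it is not a plain number.

```perl
use strict;
use warnings;
use Text::CSV qw( csv );

# Hypothetical data: the third field of the first record is not numeric.
my $data = "1,apple,abc\n2,pear,7\n";

my $aoa = csv (
    in        => \$data,
    callbacks => {
        # $_[1] is the array reference holding the parsed fields.
        after_parse => sub { $_[1][2] =~ m/^\d+$/ or $_[1][2] = 0 },
        },
    );

print join (",", @$_), "\n" for @$aoa;
```
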
1917       before_print
1918          my $idx = 1;
1919          $csv->callbacks (before_print => sub { $_[1][0] = $idx++ });
1920          $csv->print (*STDOUT, [ 0, $_ ]) for @members;
1921
1922         This callback is invoked  before printing with  "print"  only if no
1923         error occurred.  The callback is invoked with two arguments:  the
1924         current  "CSV" parser object and an array reference to the fields
1925         passed.
1926
1927         The return code of the callback is ignored.
1928
1929          sub max_4_fields
1930          {
1931              my ($csv, $row) = @_;
1932              @$row > 4 and splice @$row, 4;
1933              } # max_4_fields
1934
1935          csv (in => csv (in => "file.csv"), out => *STDOUT,
1936              callbacks => { before_print => \&max_4_fields });
1937
1938         This callback is not active for "combine".
1939
1940       Callbacks for csv ()
1941
1942       The "csv" function allows for some callbacks that do not integrate in
1943       XS internals but are only available in the "csv" function.
1944
1945         csv (in        => "file.csv",
1946              callbacks => {
1947                  filter       => { 6 => sub { $_ > 15 } },    # first
1948                  after_parse  => sub { say "AFTER PARSE";  }, # first
1949                  after_in     => sub { say "AFTER IN";     }, # second
1950                  on_in        => sub { say "ON IN";        }, # third
1951                  },
1952              );
1953
1954         csv (in        => $aoh,
1955              out       => "file.csv",
1956              callbacks => {
1957                  on_in        => sub { say "ON IN";        }, # first
1958                  before_out   => sub { say "BEFORE OUT";   }, # second
1959                  before_print => sub { say "BEFORE PRINT"; }, # third
1960                  },
1961              );
1962
1963       filter
1964         This callback can be used to filter records.  It is called just after
1965         a new record has been scanned.  The callback accepts a:
1966
1967         hashref
1968           The keys are the index to the row (the field name or field number,
1969           1-based) and the values are subs to return a true or false value.
1970
1971            csv (in => "file.csv", filter => {
1972                       3 => sub { m/a/ },       # third field should contain an "a"
1973                       5 => sub { length > 4 }, # length of the 5th field minimal 5
1974                       });
1975
1976            csv (in => "file.csv", filter => { foo => sub { $_ > 4 }});
1977
1978           If the keys to the filter hash contain any character that is not a
1979           digit it will also implicitly set "headers" to "auto"  unless
1980           "headers"  was already passed as argument.  When headers are
1981           active, returning an array of hashes, the filter is not applicable
1982           to the header itself.
1983
1984           All sub results should match, as in AND.
1985
1986           The context of the callback sets  $_ localized to the field
1987           indicated by the filter. The two arguments are as with all other
1988           callbacks, so the other fields in the current row can be seen:
1989
1990            filter => { 3 => sub { $_ > 100 ? $_[1][1] =~ m/A/ : $_[1][6] =~ m/B/ }}
1991
1992           If the context is set to return a list of hashes  ("headers" is
1993           defined), the current record will also be available in the
1994           localized %_:
1995
1996            filter => { 3 => sub { $_ > 100 && $_{foo} =~ m/A/ && $_{bar} < 1000  }}
1997
1998           If the filter is used to alter the content by changing $_,  make
1999           sure that the sub returns true in order not to have that record
2000           skipped:
2001
2002            filter => { 2 => sub { $_ = uc }}
2003
2004           will upper-case the second field, and then skip it if the resulting
2005           content evaluates to false. To always accept, end with truth:
2006
2007            filter => { 2 => sub { $_ = uc; 1 }}
2008
2009         coderef
2010            csv (in => "file.csv", filter => sub { $n++; 0; });
2011
2012           If the argument to "filter" is a coderef,  it is an alias or
2013           shortcut to a filter on column 0:
2014
2015            csv (filter => sub { $n++; 0 });
2016
2017           is equal to
2018
2019            csv (filter => { 0 => sub { $n++; 0 } });
2020
2021         filter-name
2022            csv (in => "file.csv", filter => "not_blank");
2023            csv (in => "file.csv", filter => "not_empty");
2024            csv (in => "file.csv", filter => "filled");
2025
2026           These are predefined filters
2027
2028           Given a file like (line numbers prefixed for doc purpose only):
2029
2030            1:1,2,3
2031            2:
2032            3:,
2033            4:""
2034            5:,,
2035            6:, ,
2036            7:"",
2037            8:" "
2038            9:4,5,6
2039
2040           not_blank
2041             Filter out the blank lines
2042
2043             This filter is a shortcut for
2044
2045              filter => { 0 => sub { @{$_[1]} > 1 or
2046                          defined $_[1][0] && $_[1][0] ne "" } }
2047
2048             Due to the implementation,  it is currently impossible to also
2049             filter lines that consist only of a quoted empty field. These
2050             lines are also considered blank lines.
2051
2052             With the given example, lines 2 and 4 will be skipped.
2053
2054           not_empty
2055             Filter out lines where all the fields are empty.
2056
2057             This filter is a shortcut for
2058
2059              filter => { 0 => sub { grep { defined && $_ ne "" } @{$_[1]} } }
2060
2061             A space is not regarded as being empty, so given the example data,
2062             lines 2, 3, 4, 5, and 7 are skipped.
2063
2064           filled
2065             Filter out lines that have no visible data
2066
2067             This filter is a shortcut for
2068
2069              filter => { 0 => sub { grep { defined && m/\S/ } @{$_[1]} } }
2070
2071             This filter rejects all lines that do not have at least one field
2072             that contains a non-whitespace character.
2073
2074             With the given example data, this filter would skip lines 2
2075             through 8.
2076
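       The filter forms described above can be compared side by side.  The
       sketch below applies the predefined "filled" filter to the nine-line
       sample and a hashref filter to a second, made-up data set ($fruit).

```perl
use strict;
use warnings;
use Text::CSV qw( csv );

# The nine sample lines from the predefined-filter description.
my $data = qq{1,2,3\n\n,\n""\n,,\n, ,\n"",\n" "\n4,5,6\n};

# Predefined filter: keep only lines that have visible data.
my $filled = csv (in => \$data, filter => "filled");
print scalar (@$filled), " lines kept by 'filled'\n";

# Hashref filter: keep records whose 2nd field is longer than 3 characters.
my $fruit = "1,apple\n2,fig\n3,banana\n";
my $long  = csv (in => \$fruit, filter => { 2 => sub { length > 3 } });
print join (" ", map { $_->[1] } @$long), "\n";
```
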
2077       after_in
2078         This callback is invoked for each record after all records have been
2079         parsed but before returning the reference to the caller.  The hook is
2080         invoked with two arguments:  the current  "CSV"  parser object  and a
2081         reference to the record.   The reference can be a reference to a
2082         HASH  or a reference to an ARRAY as determined by the arguments.
2083
2084         This callback can also be passed as  an attribute without the
2085         "callbacks" wrapper.
2086
2087       before_out
2088         This callback is invoked for each record before the record is
2089         printed.  The hook is invoked with two arguments:  the current "CSV"
2090         parser object and a reference to the record.   The reference can be a
2091         reference to a  HASH or a reference to an ARRAY as determined by the
2092         arguments.
2093
2094         This callback can also be passed as an attribute  without the
2095         "callbacks" wrapper.
2096
2097         This callback makes the row available in %_ if the row is a hashref.
2098         In this case %_ is writable and will change the original row.
2099
2100       on_in
2101         This callback acts exactly as the "after_in" or the "before_out"
2102         hooks.
2103
2104         This callback can also be passed as an attribute  without the
2105         "callbacks" wrapper.
2106
2107         This callback makes the row available in %_ if the row is a hashref.
2108         In this case %_ is writable and will change the original row. So e.g.
2109         with
2110
2111           my $aoh = csv (
2112               in      => \"foo\n1\n2\n",
2113               headers => "auto",
2114               on_in   => sub { $_{bar} = 2; },
2115               );
2116
2117         $aoh will be:
2118
2119           [ { foo => 1,
2120               bar => 2,
2121               },
2122             { foo => 2,
2123               bar => 2,
2124               }
2125             ]
2126
2127       csv
2128         The function  "csv" can also be called as a method or with an
2129         existing Text::CSV object. This could help if the function is to be
2130         invoked a lot of times and the overhead of creating the object
2131         internally over  and  over again would be prevented by passing an
2132         existing instance.
2133
2134          my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
2135
2136          my $aoa = $csv->csv (in => $fh);
2137          my $aoa = csv (in => $fh, csv => $csv);
2138
2139         both act the same. Running this 20000 times on a 20-line CSV file
2140         showed a 53% speedup.
2141

DIAGNOSTICS

2143       This section is also taken from Text::CSV_XS.
2144
2145       Still under construction ...
2146
2147       If an error occurs,  "$csv->error_diag" can be used to get information
2148       on the cause of the failure. Note that for speed reasons the internal
2149       value is never cleared on success,  so using the value returned by
2150       "error_diag" in normal cases - when no error occurred - may cause
2151       unexpected results.
2152
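       Because the internal value is not cleared on success, a safe pattern
       is to consult "error_diag" only after a call has reported failure.
       A minimal sketch, using a deliberately malformed, made-up input line:

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new ({ binary => 1 });

# Hypothetical malformed input: text follows the closing quote.
my $line = qq{1,foo,"bar"baz,22};

if ($csv->parse ($line)) {
    print "parsed: ", join ("|", $csv->fields), "\n";
    }
else {
    # Only consult error_diag after a failure has been reported.
    my ($code, $msg, $pos) = $csv->error_diag;
    print "parse failed: $code ($msg) at position $pos\n";
    }
```
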
2153       If the constructor failed, the cause can be found using "error_diag" as
2154       a class method, like "Text::CSV->error_diag".
2155
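       A sketch of the class-method form: the constructor is given a
       "sep_char" that equals the default "quote_char", which is expected to
       be rejected as an option conflict (error 1001 in the list below).

```perl
use strict;
use warnings;
use Text::CSV;

# sep_char equal to quote_char is an option conflict.
my $csv = Text::CSV->new ({ sep_char => '"' });

unless (defined $csv) {
    # No object exists, so ask the class for the diagnostics.
    my ($code, $msg) = Text::CSV->error_diag;
    print "constructor failed: $code $msg\n";
    }
```
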
2156       The "$csv->error_diag" method is automatically invoked upon error when
2157       the constructor was called with  "auto_diag"  set to  1 or 2, or when
2158       autodie is in effect.  When set to 1, this will cause a "warn" with the
2159       error message,  when set to 2, it will "die". "2012 - EOF" is excluded
2160       from "auto_diag" reports.
2161
2162       Errors can be (individually) caught using the "error" callback.
2163
2164       The errors as described below are available. I have tried to make the
2165       error itself explanatory enough, but more descriptions will be added.
2166       For most of these errors, the first three capitals describe the error
2167       category:
2168
2169       · INI
2170
2171         Initialization error or option conflict.
2172
2173       · ECR
2174
2175         Carriage-Return related parse error.
2176
2177       · EOF
2178
2179         End-Of-File related parse error.
2180
2181       · EIQ
2182
2183         Parse error inside quotation.
2184
2185       · EIF
2186
2187         Parse error inside field.
2188
2189       · ECB
2190
2191         Combine error.
2192
2193       · EHR
2194
2195         HashRef parse related error.
2196
2197       And below should be the complete list of error codes that can be
2198       returned:
2199
2200       · 1001 "INI - sep_char is equal to quote_char or escape_char"
2201
2202         The  separation character  cannot be equal to  the quotation
2203         character or to the escape character,  as this would invalidate all
2204         parsing rules.
2205
2206       · 1002 "INI - allow_whitespace with escape_char or quote_char SP or
2207         TAB"
2208
2209         Using the  "allow_whitespace"  attribute  when either "quote_char" or
2210         "escape_char"  is equal to "SPACE" or "TAB" is too ambiguous to
2211         allow.
2212
2213       · 1003 "INI - \r or \n in main attr not allowed"
2214
2215         Using default "eol" characters in either "sep_char", "quote_char",
2216         or  "escape_char"  is  not allowed.
2217
2218       · 1004 "INI - callbacks should be undef or a hashref"
2219
2220         The "callbacks"  attribute only accepts "undef" or a hash
2221         reference.
2222
2223       · 1005 "INI - EOL too long"
2224
2225         The value passed for EOL is exceeding its maximum length (16).
2226
2227       · 1006 "INI - SEP too long"
2228
2229         The value passed for SEP is exceeding its maximum length (16).
2230
2231       · 1007 "INI - QUOTE too long"
2232
2233         The value passed for QUOTE is exceeding its maximum length (16).
2234
2235       · 1008 "INI - SEP undefined"
2236
2237         The value passed for SEP should be defined and not empty.
2238
2239       · 1010 "INI - the header is empty"
2240
2241         The header line parsed in the "header" method is empty.
2242
2243       · 1011 "INI - the header contains more than one valid separator"
2244
2245         The header line parsed in the  "header"  contains more than one
2246         (unique) separator character out of the allowed set of separators.
2247
2248       · 1012 "INI - the header contains an empty field"
2249
2250         The header line parsed in the "header" method contains an empty field.
2251
2252       · 1013 "INI - the header contains nun-unique fields"
2253
2254         The header line parsed in the  "header"  contains at least  two
2255         identical fields.
2256
2257       · 1014 "INI - header called on undefined stream"
2258
2259         The header line cannot be parsed from an undefined source.
2260
2261       · 1500 "PRM - Invalid/unsupported argument(s)"
2262
2263         Function or method called with invalid argument(s) or parameter(s).
2264
2265       · 1501 "PRM - The key attribute is passed as an unsupported type"
2266
2267         The "key" attribute is of an unsupported type.
2268
2269       · 2010 "ECR - QUO char inside quotes followed by CR not part of EOL"
2270
2271         When  "eol"  has  been  set  to  anything  but the  default,  like
2272         "\r\t\n",  and  the  "\r"  is  following  the   second   (closing)
2273         "quote_char", where the characters following the "\r" do not make up
2274         the "eol" sequence, this is an error.
2275
2276       · 2011 "ECR - Characters after end of quoted field"
2277
2278         Sequences like "1,foo,"bar"baz,22,1" are not allowed. "bar" is a
2279         quoted field and after the closing double-quote, there should be
2280         either a new-line sequence or a separation character.
2281
2282       · 2012 "EOF - End of data in parsing input stream"
2283
2284         Self-explanatory.  End-of-file was reached while parsing a stream. Can
2285         happen only when reading from streams with "getline",  as using
2286         "parse" is done on strings that are not required to have a trailing
2287         "eol".
2288
2289       · 2013 "INI - Specification error for fragments RFC7111"
2290
2291         Invalid URI "fragment" specification.
2292
2293       · 2014 "ENF - Inconsistent number of fields"
2294
2295         Inconsistent number of fields under strict parsing.
2296
2297       · 2021 "EIQ - NL char inside quotes, binary off"
2298
2299         Sequences like "1,"foo\nbar",22,1" are allowed only when the binary
2300         option has been selected with the constructor.
2301
2302       · 2022 "EIQ - CR char inside quotes, binary off"
2303
2304         Sequences like "1,"foo\rbar",22,1" are allowed only when the binary
2305         option has been selected with the constructor.
2306
2307       · 2023 "EIQ - QUO character not allowed"
2308
2309         Sequences like ""foo "bar" baz",qu" and "2023,",2008-04-05,"Foo,
2310         Bar",\n" will cause this error.
2311
2312       · 2024 "EIQ - EOF cannot be escaped, not even inside quotes"
2313
2314         The escape character is not allowed as last character in an input
2315         stream.
2316
2317       · 2025 "EIQ - Loose unescaped escape"
2318
2319         An escape character should escape only characters that need escaping.
2320
2321         Allowing  the escape  for other characters  is possible  with the
2322         attribute "allow_loose_escape".
2323
2324       · 2026 "EIQ - Binary character inside quoted field, binary off"
2325
2326         Binary characters are not allowed by default.    Exceptions are
2327         fields that contain valid UTF-8,  which will automatically be
2328         upgraded.  Set "binary" to 1 to accept binary
2329         data.
2330
2331       · 2027 "EIQ - Quoted field not terminated"
2332
2333         When parsing a field that started with a quotation character,  the
2334         field is expected to be closed with a quotation character.   When the
2335         parsed line is exhausted before the quote is found, that field is not
2336         terminated.
2337
2338       · 2030 "EIF - NL char inside unquoted verbatim, binary off"
2339
2340       · 2031 "EIF - CR char is first char of field, not part of EOL"
2341
2342       · 2032 "EIF - CR char inside unquoted, not part of EOL"
2343
2344       · 2034 "EIF - Loose unescaped quote"
2345
2346       · 2035 "EIF - Escaped EOF in unquoted field"
2347
2348       · 2036 "EIF - ESC error"
2349
2350       · 2037 "EIF - Binary character in unquoted field, binary off"
2351
2352       · 2110 "ECB - Binary character in Combine, binary off"
2353
2354       · 2200 "EIO - print to IO failed. See errno"
2355
2356       · 3001 "EHR - Unsupported syntax for column_names ()"
2357
2358       · 3002 "EHR - getline_hr () called before column_names ()"
2359
2360       · 3003 "EHR - bind_columns () and column_names () fields count
2361         mismatch"
2362
2363       · 3004 "EHR - bind_columns () only accepts refs to scalars"
2364
2365       · 3006 "EHR - bind_columns () did not pass enough refs for parsed
2366         fields"
2367
2368       · 3007 "EHR - bind_columns needs refs to writable scalars"
2369
2370       · 3008 "EHR - unexpected error in bound fields"
2371
2372       · 3009 "EHR - print_hr () called before column_names ()"
2373
2374       · 3010 "EHR - print_hr () called with invalid arguments"
2375
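       The effect of the "binary" attribute on errors such as 2021 can be
       seen by parsing the same line with two parser objects.  This is a
       sketch; the names $strict and $binary are chosen for illustration.

```perl
use strict;
use warnings;
use Text::CSV;

# A quoted field that contains an embedded newline.
my $line = qq{1,"foo\nbar",22,1};

my $strict = Text::CSV->new ({ binary => 0 });
my $binary = Text::CSV->new ({ binary => 1 });

printf "binary off: %s\n", $strict->parse ($line) ? "ok" : "failed";
printf "binary on : %s\n", $binary->parse ($line) ? "ok" : "failed";
```
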

AUTHORS and MAINTAINERS

2380       Alan Citterman <alan[at]mfgrtl.com> wrote the original Perl module.
2381       Please don't send mail concerning Text::CSV to Alan, as he's not a
2382       present maintainer.
2383
2384       Jochen Wiedmann <joe[at]ispsoft.de> rewrote the encoding and decoding
2385       in C by implementing a simple finite-state machine and added the
2386       variable quote, escape and separator characters, the binary mode and
2387       the print and getline methods. See ChangeLog releases 0.10 through
2388       0.23.
2389
2390       H.Merijn Brand <h.m.brand[at]xs4all.nl> cleaned up the code, added the
2391       field flags methods, wrote the major part of the test suite, completed
2392       the documentation, fixed some RT bugs. See ChangeLog releases 0.25 and
2393       on.
2394
2395       Makamaka Hannyaharamitu, <makamaka[at]cpan.org> wrote Text::CSV_PP
2396       which is the pure-Perl version of Text::CSV_XS.
2397
2398       New Text::CSV (since 0.99) is maintained by Makamaka, and Kenichi
2399       Ishigaki since 1.91.
2400

COPYRIGHT AND LICENSE

2402       Text::CSV
2403
2404       Copyright (C) 1997 Alan Citterman. All rights reserved.  Copyright (C)
2405       2007-2015 Makamaka Hannyaharamitu.  Copyright (C) 2017- Kenichi
2406       Ishigaki.  A large portion of the doc is taken from Text::CSV_XS. See
2407       below.
2408
2409       Text::CSV_PP:
2410
2411       Copyright (C) 2005-2015 Makamaka Hannyaharamitu.  Copyright (C) 2017-
2412       Kenichi Ishigaki.  A large portion of the code/doc is also taken from
2413       Text::CSV_XS. See below.
2414
2415       Text::CSV_XS:
2416
2417       Copyright (C) 2007-2016 H.Merijn Brand for PROCURA B.V.  Copyright (C)
2418       1998-2001 Jochen Wiedmann. All rights reserved.  Portions Copyright (C)
2419       1997 Alan Citterman. All rights reserved.
2420
2421       This library is free software; you can redistribute it and/or modify it
2422       under the same terms as Perl itself.
2423
2424
2425
2426perl v5.28.0                      2018-08-17                      Text::CSV(3)