DBD::CSV(3pm)

1DBD::CSV(3)           User Contributed Perl Documentation          DBD::CSV(3)
2
3
4

NAME

6       DBD::CSV - DBI driver for CSV files
7

SYNOPSIS

9           use DBI;
10           $dbh = DBI->connect("DBI:CSV:f_dir=/home/joe/csvdb")
11               or die "Cannot connect: " . $DBI::errstr;
12           $sth = $dbh->prepare("CREATE TABLE a (id INTEGER, name CHAR(10))")
13               or die "Cannot prepare: " . $dbh->errstr();
14           $sth->execute() or die "Cannot execute: " . $sth->errstr();
15           $sth->finish();
16           $dbh->disconnect();
17
18           # Read a CSV file with ";" as the separator, as exported by
19           # MS Excel. Note we need to escape the ";", otherwise it
20           # would be treated as an attribute separator.
21           $dbh = DBI->connect(qq{DBI:CSV:csv_sep_char=\\;});
22           $sth = $dbh->prepare("SELECT * FROM info");
23
24           # Same example, this time reading "info.csv" as a table:
25           $dbh = DBI->connect(qq{DBI:CSV:csv_sep_char=\\;});
26           $dbh->{'csv_tables'}->{'info'} = { 'file' => 'info.csv'};
27           $sth = $dbh->prepare("SELECT * FROM info");
28

DESCRIPTION

30       The DBD::CSV module is yet another driver for the DBI (Database inde‐
31       pendent interface for Perl). This one is based on the SQL "engine"
32       SQL::Statement and the abstract DBI driver DBD::File and implements
33       access to so-called CSV files (Comma separated values). Such files are
34       mostly used for exporting MS Access and MS Excel data.
35
36       See DBI(3) for details on DBI, SQL::Statement(3) for details on
37       SQL::Statement and DBD::File(3) for details on the base class
38       DBD::File.
39
40       Prerequisites
41
42       The only system dependent feature that DBD::File uses, is the "flock()"
43       function. Thus the module should run (in theory) on any system with a
44       working "flock()", in particular on all Unix machines and on Windows
45       NT. Under Windows 95 and MacOS the use of "flock()" is disabled, thus
46       the module should still be usable,
47
48       Unlike other DBI drivers, you don't need an external SQL engine or a
49       running server. All you need are the following Perl modules, available
50       from any CPAN mirror, for example
51
52         ftp://ftp.funet.fi/pub/languages/perl/CPAN/modules/by-module
53
54       DBI the DBI (Database independent interface for Perl), version 1.00 or
55           a later release
56
57       SQL::Statement
58           a simple SQL engine
59
60       Text::CSV_XS
61           this module is used for writing rows to or reading rows from CSV
62           files.
63
64       Installation
65
66       Installing this module (and the prerequisites from above) is quite sim‐
67       ple.  You just fetch the archive, extract it with
68
69           gzip -cd DBD-CSV-0.1000.tar.gz ⎪ tar xf -
70
71       (this is for Unix users, Windows users would prefer WinZip or something
72       similar) and then enter the following:
73
74           cd DBD-CSV-0.1000
75           perl Makefile.PL
76           make
77           make test
78
79       If any tests fail, let me know. Otherwise go on with
80
81           make install
82
83       Note that you almost definitely need root or administrator permissions.
84       If you don't have them, read the ExtUtils::MakeMaker man page for
85       details on installing in your own directories. ExtUtils::MakeMaker.
86
87
88
89         The level of SQL support available depends on the version of
90         SQL::Statement installed.  Any version will support *basic*
91         CREATE, INSERT, DELETE, UPDATE, and SELECT statements.  Only
92         versions of SQL::Statement 1.0 and above support additional
93         features such as table joins, string functions, etc.  See the
94         documentation of the latest version of SQL::Statement for details.
95
96       Creating a database handle
97
98       Creating a database handle usually implies connecting to a database
99       server.  Thus this command reads
100
101           use DBI;
102           my $dbh = DBI->connect("DBI:CSV:f_dir=$dir");
103
104       The directory tells the driver where it should create or open tables
105       (a.k.a. files). It defaults to the current directory, thus the follow‐
106       ing are equivalent:
107
108           $dbh = DBI->connect("DBI:CSV:");
109           $dbh = DBI->connect("DBI:CSV:f_dir=.");
110
111       (I was told, that VMS requires
112
113           $dbh = DBI->connect("DBI:CSV:f_dir=");
114
115       for whatever reasons.)
116
117       You may set other attributes in the DSN string, separated by semi‐
118       colons.
119
120       Creating and dropping tables
121
122       You can create and drop tables with commands like the following:
123
124           $dbh->do("CREATE TABLE $table (id INTEGER, name CHAR(64))");
125           $dbh->do("DROP TABLE $table");
126
127       Note that currently only the column names will be stored and no other
128       data.  Thus all other information including column type (INTEGER or
129       CHAR(x), for example), column attributes (NOT NULL, PRIMARY KEY, ...)
130       will silently be discarded. This may change in a later release.
131
132       A drop just removes the file without any warning.
133
134       See DBI(3) for more details.
135
136       Table names cannot be arbitrary, due to restrictions of the SQL syntax.
137       I recommend that table names are valid SQL identifiers: The first char‐
138       acter is alphabetic, followed by an arbitrary number of alphanumeric
139       characters. If you want to use other files, the file names must start
140       with '/', './' or '../' and they must not contain white space.
141
142       Inserting, fetching and modifying data
143
144       The following examples insert some data in a table and fetch it back:
145       First all data in the string:
146
147           $dbh->do("INSERT INTO $table VALUES (1, "
148                    . $dbh->quote("foobar") . ")");
149
150       Note the use of the quote method for escaping the word 'foobar'. Any
151       string must be escaped, even if it doesn't contain binary data.
152
153       Next an example using parameters:
154
155           $dbh->do("INSERT INTO $table VALUES (?, ?)", undef,
156                    2, "It's a string!");
157
158       Note that you don't need to use the quote method here, this is done
159       automatically for you. This version is particularly well designed for
160       loops. Whenever performance is an issue, I recommend using this method.
161
162       You might wonder about the "undef". Don't wonder, just take it as it
163       is. :-) It's an attribute argument that I have never ever used and will
164       be parsed to the prepare method as a second argument.
165
166       To retrieve data, you can use the following:
167
168           my($query) = "SELECT * FROM $table WHERE id > 1 ORDER BY id";
169           my($sth) = $dbh->prepare($query);
170           $sth->execute();
171           while (my $row = $sth->fetchrow_hashref) {
172               print("Found result row: id = ", $row->{'id'},
173                     ", name = ", $row->{'name'});
174           }
175           $sth->finish();
176
177       Again, column binding works: The same example again.
178
179           my($query) = "SELECT * FROM $table WHERE id > 1 ORDER BY id";
180           my($sth) = $dbh->prepare($query);
181           $sth->execute();
182           my($id, $name);
183           $sth->bind_columns(undef, \$id, \$name);
184           while ($sth->fetch) {
185               print("Found result row: id = $id, name = $name\n");
186           }
187           $sth->finish();
188
189       Of course you can even use input parameters. Here's the same example
190       for the third time:
191
192           my($query) = "SELECT * FROM $table WHERE id = ?";
193           my($sth) = $dbh->prepare($query);
194           $sth->bind_columns(undef, \$id, \$name);
195           for (my($i) = 1;  $i <= 2;   $i++) {
196               $sth->execute($id);
197               if ($sth->fetch) {
198                   print("Found result row: id = $id, name = $name\n");
199               }
200               $sth->finish();
201           }
202
203       See DBI(3) for details on these methods. See SQL::Statement(3) for
204       details on the WHERE clause.
205
206       Data rows are modified with the UPDATE statement:
207
208           $dbh->do("UPDATE $table SET id = 3 WHERE id = 1");
209
210       Likewise you use the DELETE statement for removing rows:
211
212           $dbh->do("DELETE FROM $table WHERE id > 1");
213
214       Error handling
215
216       In the above examples we have never cared about return codes. Of
217       course, this cannot be recommended. Instead we should have written (for
218       example):
219
220           my($query) = "SELECT * FROM $table WHERE id = ?";
221           my($sth) = $dbh->prepare($query)
222               or die "prepare: " . $dbh->errstr();
223           $sth->bind_columns(undef, \$id, \$name)
224               or die "bind_columns: " . $dbh->errstr();
225           for (my($i) = 1;  $i <= 2;   $i++) {
226               $sth->execute($id)
227                   or die "execute: " . $dbh->errstr();
228               if ($sth->fetch) {
229                   print("Found result row: id = $id, name = $name\n");
230               }
231           }
232           $sth->finish($id)
233               or die "finish: " . $dbh->errstr();
234
235       Obviously this is tedious. Fortunately we have DBI's RaiseError
236       attribute:
237
238           $dbh->{'RaiseError'} = 1;
239           $@ = '';
240           eval {
241               my($query) = "SELECT * FROM $table WHERE id = ?";
242               my($sth) = $dbh->prepare($query);
243               $sth->bind_columns(undef, \$id, \$name);
244               for (my($i) = 1;  $i <= 2;   $i++) {
245                   $sth->execute($id);
246                   if ($sth->fetch) {
247                       print("Found result row: id = $id, name = $name\n");
248                   }
249               }
250               $sth->finish($id);
251           };
252           if ($@) { die "SQL database error: $@"; }
253
254       This is not only shorter, it even works when using DBI methods within
255       subroutines.
256
257       Metadata
258
259       The following attributes are handled by DBI itself and not by
260       DBD::File, thus they all work as expected:
261
262           Active
263           ActiveKids
264           CachedKids
265           CompatMode             (Not used)
266           InactiveDestroy
267           Kids
268           PrintError
269           RaiseError
270           Warn                   (Not used)
271
272       The following DBI attributes are handled by DBD::File:
273
274       AutoCommit
275           Always on
276
277       ChopBlanks
278           Works
279
280       NUM_OF_FIELDS
281           Valid after "$sth->execute"
282
283       NUM_OF_PARAMS
284           Valid after "$sth->prepare"
285
286       NAME
287           Valid after "$sth->execute"; undef for Non-Select statements.
288
289       NULLABLE
290           Not really working. Always returns an array ref of one's, as
291           DBD::CSV doesn't verify input data. Valid after "$sth->execute";
292           undef for non-Select statements.
293
294       These attributes and methods are not supported:
295
296           bind_param_inout
297           CursorName
298           LongReadLen
299           LongTruncOk
300
301       In addition to the DBI attributes, you can use the following dbh
302       attributes:
303
304       f_dir   This attribute is used for setting the directory where CSV
305               files are opened. Usually you set it in the dbh, it defaults to
306               the current directory ("."). However, it is overwritable in the
307               statement handles.
308
309       csv_eol
310       csv_sep_char
311       csv_quote_char
312       csv_escape_char
313       csv_class
314       csv_csv The attributes csv_eol, csv_sep_char, csv_quote_char and
315               csv_escape_char are corresponding to the respective attributes
316               of the Text::CSV_XS object. You want to set these attributes if
317               you have unusual CSV files like /etc/passwd or MS Excel gener‐
318               ated CSV files with a semicolon as separator. Defaults are
319               "\015\012", ';', '"' and '"', respectively.
320
321               The attributes are used to create an instance of the class
322               csv_class, by default Text::CSV_XS. Alternatively you may pass
323               an instance as csv_csv, the latter takes precedence. Note that
324               the binary attribute must be set to a true value in that case.
325
326               Additionally you may overwrite these attributes on a per-table
327               base in the csv_tables attribute.
328
329       csv_tables
330               This hash ref is used for storing table dependent metadata. For
331               any table it contains an element with the table name as key and
332               another hash ref with the following attributes:
333
334               file        The tables file name; defaults to
335
336                               "$dbh->{f_dir}/$table"
337
338               eol
339               sep_char
340               quote_char
341               escape_char
342               class
343               csv         These correspond to the attributes csv_eol,
344                           csv_sep_char, csv_quote_char, csv_escape_char,
345                           csv_class and csv_csv.  The difference is that they
346                           work on a per-table base.
347
348               col_names
349               skip_first_row
350                           By default DBD::CSV assumes that column names are
351                           stored in the first row of the CSV file. If this is
352                           not the case, you can supply an array ref of table
353                           names with the col_names attribute. In that case
354                           the attribute skip_first_row will be set to FALSE.
355
356                           If you supply an empty array ref, the driver will
357                           read the first row for you, count the number of
358                           columns and create column names like "col0",
359                           "col1", ...
360
361       Example: Suggest you want to use /etc/passwd as a CSV file. :-) There
362       simplest way is:
363
364           require DBI;
365           my $dbh = DBI->connect("DBI:CSV:f_dir=/etc;csv_eol=\n;"
366                                  . "csv_sep_char=:;csv_quote_char=;"
367                                  . "csv_escape_char=");
368           $dbh->{'csv_tables'}->{'passwd'} = {
369               'col_names' => ["login", "password", "uid", "gid", "realname",
370                               "directory", "shell"]
371           };
372           $sth = $dbh->prepare("SELECT * FROM passwd");
373
374       Another possibility where you leave all the defaults as they are and
375       overwrite them on a per table base:
376
377           require DBI;
378           my $dbh = DBI->connect("DBI:CSV:");
379           $dbh->{'csv_tables'}->{'passwd'} = {
380               'eol' => "\n",
381               'sep_char' => ":",
382               'quote_char' => undef,
383               'escape_char' => undef,
384               'file' => '/etc/passwd',
385               'col_names' => ["login", "password", "uid", "gid", "realname",
386                               "directory", "shell"]
387           };
388           $sth = $dbh->prepare("SELECT * FROM passwd");
389
390       Driver private methods
391
392       These methods are inherited from DBD::File:
393
394       data_sources
395           The "data_sources" method returns a list of subdirectories of the
396           current directory in the form "DBI:CSV:directory=$dirname".
397
398           If you want to read the subdirectories of another directory, use
399
400               my($drh) = DBI->install_driver("CSV");
401               my(@list) = $drh->data_sources('f_dir' => '/usr/local/csv_data' );
402
403       list_tables
404           This method returns a list of file names inside $dbh->{'direc‐
405           tory'}.  Example:
406
407               my($dbh) = DBI->connect("DBI:CSV:directory=/usr/local/csv_data");
408               my(@list) = $dbh->func('list_tables');
409
410           Note that the list includes all files contained in the directory,
411           even those that have non-valid table names, from the view of SQL.
412           See "Creating and dropping tables" above.
413
414       Data restrictions
415
416       When inserting and fetching data, you will sometimes be surprised:
417       DBD::CSV doesn't correctly handle data types, in particular NULLs. If
418       you insert integers, it might happen, that fetch returns a string. Of
419       course, a string containing the integer, so that's perhaps not a real
420       problem. But the following will never work:
421
422           $dbh->do("INSERT INTO $table (id, name) VALUES (?, ?)",
423                    undef, "foo bar");
424           $sth = $dbh->prepare("SELECT * FROM $table WHERE id IS NULL");
425           $sth->execute();
426           my($id, $name);
427           $sth->bind_columns(undef, \$id, \$name);
428           while ($sth->fetch) {
429               printf("Found result row: id = %s, name = %s\n",
430                     defined($id) ? $id : "NULL",
431                     defined($name) ? $name : "NULL");
432           }
433           $sth->finish();
434
435       The row we have just inserted, will never be returned! The reason is
436       obvious, if you examine the CSV file: The corresponding row looks like
437
438           "","foo bar"
439
440       In other words, not a NULL is stored, but an empty string. CSV files
441       don't have a concept of NULL values. Surprisingly the above example
442       works, if you insert a NULL value for the name! Again, you find the
443       explanation by examining the CSV file:
444
445           ""
446
447       In other words, DBD::CSV has "emulated" a NULL value by writing a row
448       with less columns. Of course this works only if the rightmost column is
449       NULL, the two rightmost columns are NULL, ..., but the leftmost column
450       will never be NULL!
451
452       See "Creating and dropping tables" above for table name restrictions.
453

TODO

455       Extensions of DBD::CSV:
456
457       CSV file scanner
458           Write a simple CSV file scanner that reads a CSV file and attempts
459           to guess sep_char, quote_char, escape_char and eol automatically.
460
461       These are merely restrictions of the DBD::File or SQL::Statement mod‐
462       ules:
463
464       Table name mapping
465           Currently it is not possible to use files with names like
466           "names.csv".  Instead you have to use soft links or rename files.
467           As an alternative one might use, for example a dbh attribute 'ta‐
468           ble_map'. It might be a hash ref, the keys being the table names
469           and the values being the file names.
470
471       Column name mapping
472           Currently the module assumes that column names are stored in the
473           first row. While this is fine in most cases, there should be a pos‐
474           sibility of setting column names and column number from the pro‐
475           grammer: For example MS Access doesn't export column names by
476           default.
477

KNOWN BUGS

479       ·       The module is using flock() internally. However, this function
480               is not available on platforms. Using flock() is disabled on
481               MacOS and Windows 95: There's no locking at all (perhaps not so
482               important on these operating systems, as they are for single
483               users anyways).
484

AUTHOR AND COPYRIGHT

486       This module is currently maintained by
487
488             Jeff Zucker
489             <jeff@vpservices.com>
490
491       The original author is Jochen Wiedmann.
492
493       Copyright (C) 1998 by Jochen Wiedmann
494
495       All rights reserved.
496
497       You may distribute this module under the terms of either the GNU Gen‐
498       eral Public License or the Artistic License, as specified in the Perl
499       README file.
500

NAME

SYNOPSIS

DESCRIPTION

TODO

KNOWN BUGS

AUTHOR AND COPYRIGHT

SEE ALSO