csvsql(1)

1CSVSQL(1)                           csvkit                           CSVSQL(1)
2
3
4

NAME

6       csvsql - csvsql Documentation
7

DESCRIPTION

9       Generate  SQL statements for a CSV file or execute those statements di‐
10       rectly on a database. In the latter case supports both creating  tables
11       and inserting data:
12
13          usage: csvsql [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
14                        [-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-L LOCALE]
15                        [-S] [--blanks] [--date-format DATE_FORMAT]
16                        [--datetime-format DATETIME_FORMAT] [-H] [-K SKIP_LINES] [-v]
17                        [-l] [--zero] [-V]
18                        [-i {firebird,mssql,mysql,oracle,postgresql,sqlite,sybase,crate}]
19                        [--db CONNECTION_STRING] [--query QUERY] [--insert]
20                        [--prefix PREFIX] [--tables TABLE_NAMES] [--no-constraints]
21                        [--unique-constraint UNIQUE_CONSTRAINT] [--no-create]
22                        [--create-if-not-exists] [--overwrite] [--db-schema DB_SCHEMA]
23                        [-y SNIFF_LIMIT] [-I] [--chunk-size NUM]
24                        [FILE [FILE ...]]
25
26          Generate SQL statements for one or more CSV files, or execute those statements
27          directly on a database, and execute one or more SQL queries.
28
29          positional arguments:
30            FILE                  The CSV file(s) to operate on. If omitted, will accept
31                                  input as piped data via STDIN.
32
33          optional arguments:
34            -h, --help            show this help message and exit
35            -i {firebird,mssql,mysql,oracle,postgresql,sqlite,sybase,crate}, --dialect {firebird,mssql,mysql,oracle,postgresql,sqlite,sybase,crate}
36                                  Dialect of SQL to generate. Cannot be used with --db.
37            --db CONNECTION_STRING
38                                  If present, a SQLAlchemy connection string to use to
39                                  directly execute generated SQL on a database.
40            --query QUERY         Execute one or more SQL queries delimited by ";" and
41                                  output the result of the last query as CSV. QUERY may
42                                  be a filename.
43            --insert              Insert the data into the table. Requires --db.
44            --prefix PREFIX       Add an expression following the INSERT keyword, like
45                                  OR IGNORE or OR REPLACE.
46            --before-insert BEFORE_INSERT
47                                  Execute SQL before the INSERT command. Requires
48                                  --insert.
49            --after-insert AFTER_INSERT
50                                  Execute SQL after the INSERT command. Requires
51                                  --insert.
52            --tables TABLE_NAMES  A comma-separated list of names of tables to be
53                                  created. By default, the tables will be named after
54                                  the filenames without extensions or "stdin".
55            --no-constraints      Generate a schema without length limits or null
56                                  checks. Useful when sampling big tables.
57            --unique-constraint UNIQUE_CONSTRAINT
58                                  A column-separated list of names of columns to include
59                                  in a UNIQUE constraint.
60            --no-create           Skip creating the table. Requires --insert.
61            --create-if-not-exists
62                                  Create the table if it does not exist, otherwise keep
63                                  going. Requires --insert.
64            --overwrite           Drop the table if it already exists. Requires
65                                  --insert. Cannot be used with --no-create.
66            --db-schema DB_SCHEMA
67                                  Optional name of database schema to create table(s)
68                                  in.
69            -y SNIFF_LIMIT, --snifflimit SNIFF_LIMIT
70                                  Limit CSV dialect sniffing to the specified number of
71                                  bytes. Specify "0" to disable sniffing.
72            -I, --no-inference    Disable type inference when parsing the input.
73            --chunk-size CHUNK_SIZE
74                                  Chunk size for batch insert into the table. Requires
75                                  --insert.
76
77       See also: Arguments common to all tools.
78
79       For  information  on connection strings and supported dialects refer to
80       the SQLAlchemy documentation.
81
82       If you prefer not to enter your  password  in  the  connection  string,
83       store  the password securely in a PostgreSQL Password File, a MySQL Op‐
84       tions File or similar files for other systems.
85
86       NOTE:
87          Using the --query option may cause rounding (in Python 2) or  intro‐
88          duce Python floating point issues (in Python 3).
89
90       NOTE:
91          If  the CSV file was created from a JSON file using in2csv, remember
92          to quote SQL columns properly. For example:
93
94              echo '{"a":{"b":"c"},"d":"e"}' | in2csv -f ndjson | csvsql --query 'SELECT "a/b" FROM stdin'
95
96       NOTE:
97          Alternatives to csvsql are q and textql.
98

EXAMPLES

100       Generate a statement in the PostgreSQL dialect:
101
102          csvsql -i postgresql examples/realdata/FY09_EDU_Recipients_by_State.csv
103
104       Create a table and import data from the CSV directly into PostgreSQL:
105
106          createdb test
107          csvsql --db postgresql:///test --tables fy09 --insert examples/realdata/FY09_EDU_Recipients_by_State.csv
108
109       For large tables it may not be practical to process the  entire  table.
110       One  solution to this is to analyze a sample of the table. In this case
111       it can be useful to turn off length limits and  null  checks  with  the
112       --no-constraints option:
113
114          head -n 20 examples/realdata/FY09_EDU_Recipients_by_State.csv | csvsql --no-constraints --tables fy09
115
116       Create  tables  for an entire folder of CSVs and import data from those
117       files directly into PostgreSQL:
118
119          createdb test
120          csvsql --db postgresql:///test --insert examples/*_converted.csv
121
122       If those CSVs have identical headers, you can import them into the same
123       table by using csvstack first:
124
125          createdb test
126          csvstack examples/dummy?.csv | csvsql --db postgresql:///test --insert
127
128       Group rows by one column:
129
130          csvsql --query "select * from 'dummy3' group by a" examples/dummy3.csv
131
132       You  can  also  use  CSVSQL  to “directly” query one or more CSV files.
133       Please note that this will create an  in-memory  SQL  database,  so  it
134       won’t be very fast:
135
136          csvsql --query  "select m.usda_id, avg(i.sepal_length) as mean_sepal_length from iris as i join irismeta as m on (i.species = m.species) group by m.species" examples/iris.csv examples/irismeta.csv
137
138       Concatenate two columns:
139
140          csvsql --query "select a || b from 'dummy3'" examples/dummy3.csv
141
142       If a column contains null values, you must COALESCE the column:
143
144          csvsql --query "select a || COALESCE(b, '') from 'sort_ints_nulls'" --no-inference examples/sort_ints_nulls.csv
145

AUTHOR

147       Christopher Groskopf
148

COPYRIGHT

150       2023, Christopher Groskopf
151
152
153
154
1551.1.1                            Jul 21, 2023                        CSVSQL(1)