1CSVSQL(1) csvkit CSVSQL(1)
2
3
4
6 csvsql - csvsql Documentation
7
9 Generate SQL statements for a CSV file or execute those statements di‐
10 rectly on a database. In the latter case supports both creating tables
11 and inserting data:
12
13 usage: csvsql [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
14 [-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-L LOCALE]
15 [-S] [--blanks] [--date-format DATE_FORMAT]
16 [--datetime-format DATETIME_FORMAT] [-H] [-K SKIP_LINES] [-v]
17 [-l] [--zero] [-V]
18 [-i {firebird,mssql,mysql,oracle,postgresql,sqlite,sybase,crate}]
19 [--db CONNECTION_STRING] [--query QUERY] [--insert]
20 [--prefix PREFIX] [--tables TABLE_NAMES] [--no-constraints]
21 [--unique-constraint UNIQUE_CONSTRAINT] [--no-create]
22 [--create-if-not-exists] [--overwrite] [--db-schema DB_SCHEMA]
23 [-y SNIFF_LIMIT] [-I] [--chunk-size NUM]
24 [FILE [FILE ...]]
25
26 Generate SQL statements for one or more CSV files, or execute those statements
27 directly on a database, and execute one or more SQL queries.
28
29 positional arguments:
30 FILE The CSV file(s) to operate on. If omitted, will accept
31 input as piped data via STDIN.
32
33 optional arguments:
34 -h, --help show this help message and exit
35 -i {firebird,mssql,mysql,oracle,postgresql,sqlite,sybase,crate}, --dialect {firebird,mssql,mysql,oracle,postgresql,sqlite,sybase,crate}
36 Dialect of SQL to generate. Cannot be used with --db.
37 --db CONNECTION_STRING
38 If present, a SQLAlchemy connection string to use to
39 directly execute generated SQL on a database.
40 --query QUERY Execute one or more SQL queries delimited by ";" and
41 output the result of the last query as CSV. QUERY may
42 be a filename.
43 --insert Insert the data into the table. Requires --db.
44 --prefix PREFIX Add an expression following the INSERT keyword, like
45 OR IGNORE or OR REPLACE.
46 --before-insert BEFORE_INSERT
47 Execute SQL before the INSERT command. Requires
48 --insert.
49 --after-insert AFTER_INSERT
50 Execute SQL after the INSERT command. Requires
51 --insert.
52 --tables TABLE_NAMES A comma-separated list of names of tables to be
53 created. By default, the tables will be named after
54 the filenames without extensions or "stdin".
55 --no-constraints Generate a schema without length limits or null
56 checks. Useful when sampling big tables.
57 --unique-constraint UNIQUE_CONSTRAINT
58 A column-separated list of names of columns to include
59 in a UNIQUE constraint.
60 --no-create Skip creating the table. Requires --insert.
61 --create-if-not-exists
62 Create the table if it does not exist, otherwise keep
63 going. Requires --insert.
64 --overwrite Drop the table if it already exists. Requires
65 --insert. Cannot be used with --no-create.
66 --db-schema DB_SCHEMA
67 Optional name of database schema to create table(s)
68 in.
69 -y SNIFF_LIMIT, --snifflimit SNIFF_LIMIT
70 Limit CSV dialect sniffing to the specified number of
71 bytes. Specify "0" to disable sniffing.
72 -I, --no-inference Disable type inference when parsing the input.
73 --chunk-size CHUNK_SIZE
74 Chunk size for batch insert into the table. Requires
75 --insert.
76
77 See also: Arguments common to all tools.
78
79 For information on connection strings and supported dialects refer to
80 the SQLAlchemy documentation.
81
82 If you prefer not to enter your password in the connection string,
83 store the password securely in a PostgreSQL Password File, a MySQL Op‐
84 tions File or similar files for other systems.
85
86 NOTE:
87 Using the --query option may cause rounding (in Python 2) or intro‐
88 duce Python floating point issues (in Python 3).
89
90 NOTE:
91 If the CSV file was created from a JSON file using in2csv, remember
92 to quote SQL columns properly. For example:
93
94 echo '{"a":{"b":"c"},"d":"e"}' | in2csv -f ndjson | csvsql --query 'SELECT "a/b" FROM stdin'
95
96 NOTE:
97 Alternatives to csvsql are q and textql.
98
100 Generate a statement in the PostgreSQL dialect:
101
102 csvsql -i postgresql examples/realdata/FY09_EDU_Recipients_by_State.csv
103
104 Create a table and import data from the CSV directly into PostgreSQL:
105
106 createdb test
107 csvsql --db postgresql:///test --tables fy09 --insert examples/realdata/FY09_EDU_Recipients_by_State.csv
108
109 For large tables it may not be practical to process the entire table.
110 One solution to this is to analyze a sample of the table. In this case
111 it can be useful to turn off length limits and null checks with the
112 --no-constraints option:
113
114 head -n 20 examples/realdata/FY09_EDU_Recipients_by_State.csv | csvsql --no-constraints --tables fy09
115
116 Create tables for an entire folder of CSVs and import data from those
117 files directly into PostgreSQL:
118
119 createdb test
120 csvsql --db postgresql:///test --insert examples/*_converted.csv
121
122 If those CSVs have identical headers, you can import them into the same
123 table by using csvstack first:
124
125 createdb test
126 csvstack examples/dummy?.csv | csvsql --db postgresql:///test --insert
127
128 Group rows by one column:
129
130 csvsql --query "select * from 'dummy3' group by a" examples/dummy3.csv
131
132 You can also use CSVSQL to “directly” query one or more CSV files.
133 Please note that this will create an in-memory SQL database, so it
134 won’t be very fast:
135
136 csvsql --query "select m.usda_id, avg(i.sepal_length) as mean_sepal_length from iris as i join irismeta as m on (i.species = m.species) group by m.species" examples/iris.csv examples/irismeta.csv
137
138 Concatenate two columns:
139
140 csvsql --query "select a || b from 'dummy3'" examples/dummy3.csv
141
142 If a column contains null values, you must COALESCE the column:
143
144 csvsql --query "select a || COALESCE(b, '') from 'sort_ints_nulls'" --no-inference examples/sort_ints_nulls.csv
145
147 Christopher Groskopf
148
150 2022, Christopher Groskopf
151
152
153
154
1551.0.7 Jul 22, 2022 CSVSQL(1)