1CSVSQL(1) csvkit CSVSQL(1)
2
3
4
6 csvsql - csvsql Documentation
7
9 Generate SQL statements for a CSV file or execute those statements
10 directly on a database. In the latter case supports both creating
11 tables and inserting data:
12
13 usage: csvsql [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
14 [-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-L LOCALE]
15 [-S] [--blanks] [--date-format DATE_FORMAT]
16 [--datetime-format DATETIME_FORMAT] [-H] [-K SKIP_LINES] [-v]
17 [-l] [--zero] [-V]
18 [-i {firebird,mssql,mysql,oracle,postgresql,sqlite,sybase,crate}]
19 [--db CONNECTION_STRING] [--query QUERY] [--insert]
20 [--prefix PREFIX] [--tables TABLE_NAMES] [--no-constraints]
21 [--unique-constraint UNIQUE_CONSTRAINT] [--no-create]
22 [--create-if-not-exists] [--overwrite] [--db-schema DB_SCHEMA]
23 [-y SNIFF_LIMIT] [-I] [--chunk-size NUM]
24 [FILE [FILE ...]]
25
26 Generate SQL statements for one or more CSV files, or execute those statements
27 directly on a database, and execute one or more SQL queries.
28
29 positional arguments:
30 FILE The CSV file(s) to operate on. If omitted, will accept
31 input on STDIN.
32
33 optional arguments:
34 -h, --help show this help message and exit
35 -i {firebird,mssql,mysql,oracle,postgresql,sqlite,sybase,crate}, --dialect {firebird,mssql,mysql,oracle,postgresql,sqlite,sybase,crate}
36 Dialect of SQL to generate. Only valid when --db is
37 not specified.
38 --db CONNECTION_STRING
39 If present, a sqlalchemy connection string to use to
40 directly execute generated SQL on a database.
41 --query QUERY Execute one or more SQL queries delimited by ";" and
42 output the result of the last query as CSV. QUERY may
43 be a filename.
44 --insert In addition to creating the table, also insert the
45 data into the table. Only valid when --db is
46 specified.
47 --prefix PREFIX Add an expression following the INSERT keyword, like
48 OR IGNORE or OR REPLACE.
49 --tables TABLE_NAMES A comma-separated list of names of tables to be
50 created. By default, the tables will be named after
51 the filenames without extensions or "stdin".
52 --no-constraints Generate a schema without length limits or null
53 checks. Useful when sampling big tables.
54 --unique-constraint UNIQUE_CONSTRAINT
55 A column-separated list of names of columns to include
56 in a UNIQUE constraint.
57 --no-create Skip creating a table. Only valid when --insert is
58 specified.
59 --create-if-not-exists
60 Create table if it does not exist, otherwise keep
61 going. Only valid when --insert is specified.
62 --overwrite Drop the table before creating. Only valid when
63 --insert is specified and --no-create is not
64 specified.
65 --db-schema DB_SCHEMA
66 Optional name of database schema to create table(s)
67 in.
68 -y SNIFF_LIMIT, --snifflimit SNIFF_LIMIT
69 Limit CSV dialect sniffing to the specified number of
70 bytes. Specify "0" to disable sniffing entirely.
71 -I, --no-inference Disable type inference when parsing the input.
72 --chunk-size NUM
73 Chunk size for batch insert into the table.
74 Only valid when --insert is specified.
75
76 See also: ../common_arguments.
77
78 For information on connection strings and supported dialects refer to
79 the SQLAlchemy documentation.
80
81 If you prefer not to enter your password in the connection string,
82 store the password securely in a PostgreSQL Password File, a MySQL
83 Options File or similar files for other systems.
84
85 NOTE:
86 Using the --query option may cause rounding (in Python 2) or intro‐
87 duce [Python floating point issues](‐
88 https://docs.python.org/3.4/tutorial/floatingpoint.html) (in Python
89 3).
90
92 Generate a statement in the PostgreSQL dialect:
93
94 csvsql -i postgresql examples/realdata/FY09_EDU_Recipients_by_State.csv
95
96 Create a table and import data from the CSV directly into PostgreSQL:
97
98 createdb test
99 csvsql --db postgresql:///test --tables fy09 --insert examples/realdata/FY09_EDU_Recipients_by_State.csv
100
101 For large tables it may not be practical to process the entire table.
102 One solution to this is to analyze a sample of the table. In this case
103 it can be useful to turn off length limits and null checks with the
104 --no-constraints option:
105
106 head -n 20 examples/realdata/FY09_EDU_Recipients_by_State.csv | csvsql --no-constraints --tables fy09
107
108 Create tables for an entire folder of CSVs and import data from those
109 files directly into PostgreSQL:
110
111 createdb test
112 csvsql --db postgresql:///test --insert examples/*_converted.csv
113
114 If those CSVs have identical headers, you can import them into the same
115 table by using csvstack first:
116
117 createdb test
118 csvstack examples/dummy?.csv | csvsql --db postgresql:///test --insert
119
120 Group rows by one column:
121
122 csvsql --query "select * from 'dummy3' group by a" examples/dummy3.csv
123
124 You can also use CSVSQL to “directly” query one or more CSV files.
125 Please note that this will create an in-memory SQL database, so it
126 won’t be very fast:
127
128 csvsql --query "select m.usda_id, avg(i.sepal_length) as mean_sepal_length from iris as i join irismeta as m on (i.species = m.species) group by m.species" examples/iris.csv examples/irismeta.csv
129
130 Concatenate two columns:
131
132 csvsql --query "select a || b from 'dummy3'" examples/dummy3.csv
133
134 If a column contains null values, you must COALESCE the column:
135
136 csvsql --query "select a || COALESCE(b, '') from 'sort_ints_nulls'" --no-inference examples/sort_ints_nulls.csv
137
139 Christopher Groskopf
140
142 2020, Christopher Groskopf
143
144
145
146
1471.0.4 Jan 30, 2020 CSVSQL(1)