CSVSTAT(1)                         csvkit                         CSVSTAT(1)

csvstat - csvstat Documentation

Prints descriptive statistics for all columns in a CSV file. It
intelligently determines the type of each column and then prints
analysis relevant to that type (ranges for dates, mean and median for
integers, etc.):

    usage: csvstat [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
                   [-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-S] [-H]
                   [-K SKIP_LINES] [-v] [-l] [--zero] [-V] [--csv] [-n]
                   [-c COLUMNS] [--type] [--nulls] [--unique] [--min] [--max]
                   [--sum] [--mean] [--median] [--stdev] [--len] [--freq]
                   [--freq-count FREQ_COUNT] [--count] [--decimal-format DECIMAL_FORMAT]
                   [-G] [-y SNIFF_LIMIT]
                   [FILE]

    Print descriptive statistics for each column in a CSV file.

    positional arguments:
      FILE                  The CSV file to operate on. If omitted, will accept
                            input as piped data via STDIN.

    optional arguments:
      -h, --help            show this help message and exit
      --csv                 Output results as a CSV, rather than text.
      -n, --names           Display column names and indices from the input CSV
                            and exit.
      -c COLUMNS, --columns COLUMNS
                            A comma-separated list of column indices, names or
                            ranges to be examined, e.g. "1,id,3-5". Defaults to
                            all columns.
      --type                Only output data type.
      --nulls               Only output whether columns contain nulls.
      --unique              Only output counts of unique values.
      --min                 Only output smallest values.
      --max                 Only output largest values.
      --sum                 Only output sums.
      --mean                Only output means.
      --median              Only output medians.
      --stdev               Only output standard deviations.
      --len                 Only output the length of the longest values.
      --freq                Only output lists of frequent values.
      --freq-count FREQ_COUNT
                            The maximum number of frequent values to display.
      --count               Only output total row count.
      --decimal-format DECIMAL_FORMAT
                            %-format specification for printing decimal numbers.
                            Defaults to locale-specific formatting with "%.3f".
      -G, --no-grouping-separator
                            Do not use grouping separators in decimal numbers.
      -y SNIFF_LIMIT, --snifflimit SNIFF_LIMIT
                            Limit CSV dialect sniffing to the specified number of
                            bytes. Specify "0" to disable sniffing.

See also: Arguments common to all tools.
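
The -c value "1,id,3-5" mixes 1-based indices, column names, and
ranges. As a rough illustration only (this is not csvkit's own parser),
the Python sketch below resolves such a specification against a header
row. The function name resolve_columns and the sample header are
hypothetical, and treating ranges as inclusive on both ends is an
assumption.

```python
def resolve_columns(spec, header):
    """Turn a spec like "1,id,3-5" into a list of 0-based column positions."""
    positions = []
    for part in spec.split(","):
        part = part.strip()
        if part in header:
            # A column name: look up its position in the header row.
            positions.append(header.index(part))
        elif part.isdigit():
            # A 1-based index: convert to a 0-based position.
            positions.append(int(part) - 1)
        elif "-" in part:
            # A range such as "3-5", assumed inclusive on both ends.
            start, end = part.split("-", 1)
            positions.extend(range(int(start) - 1, int(end)))
        else:
            raise ValueError(f"unrecognized column identifier: {part!r}")
    return positions


# Hypothetical header, chosen so that "id" is the second column.
header = ["name", "id", "a", "b", "c"]
print(resolve_columns("1,id,3-5", header))
```

Names are checked before indices here so that a column whose name
happens to look like a number can still be selected by name; that
ordering is a design choice of this sketch.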

Basic use:

    csvstat examples/realdata/FY09_EDU_Recipients_by_State.csv
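
With no statistic flag, csvstat reports analysis suited to each
column's type. The following Python sketch illustrates that idea with
the standard library only. It is a simplified assumption about the
approach, not csvkit's implementation: the names infer_type and
summarize are hypothetical, and only ISO-format dates (YYYY-MM-DD) are
recognized.

```python
from datetime import date
from statistics import mean, median


def infer_type(cells):
    """Guess a column type from its non-empty cells."""
    values = [c for c in cells if c != ""]
    if not values:
        return "Text"
    for type_name, parse in (("Number", float), ("Date", date.fromisoformat)):
        try:
            for value in values:
                parse(value)
        except ValueError:
            continue  # at least one cell did not parse; try the next type
        return type_name
    return "Text"


def summarize(cells):
    """Return statistics relevant to the inferred type of the column."""
    values = [c for c in cells if c != ""]
    kind = infer_type(cells)
    if kind == "Number":
        numbers = [float(v) for v in values]
        return {"type": kind, "mean": mean(numbers), "median": median(numbers)}
    if kind == "Date":
        dates = [date.fromisoformat(v) for v in values]
        return {"type": kind, "min": min(dates), "max": max(dates)}
    return {"type": kind, "longest": max((len(v) for v in values), default=0)}


print(summarize(["1", "2", "6"]))
print(summarize(["2023-07-21", "2023-01-01"]))
print(summarize(["Alabama", "Alaska", ""]))
```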

When a statistic name is passed, only that stat will be printed:

    csvstat --min examples/realdata/FY09_EDU_Recipients_by_State.csv

      1. State Name: None
      2. State Abbreviate: None
      3. Code: 1
      4. Montgomery GI Bill-Active Duty: 435
      5. Montgomery GI Bill- Selective Reserve: 48
      6. Dependents' Educational Assistance: 118
      7. Reserve Educational Assistance Program: 60
      8. Post-Vietnam Era Veteran's Educational Assistance Program: 1
      9. TOTAL: 768
     10. j: None
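
In the output above, text columns report None while numeric columns
report their smallest value. A minimal Python sketch of that behavior
follows. It assumes a column is numeric only if every non-empty cell
parses as a number, which simplifies whatever csvkit actually does; the
function name column_minimums and the sample data are hypothetical.

```python
import csv
import io


def column_minimums(csv_text):
    """Map each column name to its smallest numeric value, or None."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = list(zip(*reader))  # transpose data rows into columns
    minimums = {}
    for name, cells in zip(header, columns):
        try:
            numbers = [float(c) for c in cells if c != ""]
        except ValueError:
            numbers = []  # non-numeric column: no minimum is reported
        minimums[name] = min(numbers) if numbers else None
    return minimums


# Hypothetical sample data, not taken from the FY09 file.
sample = "State Name,Code,TOTAL\nAlabama,1,768\nAlaska,2,1200\n"
for position, (name, value) in enumerate(column_minimums(sample).items(), 1):
    print(f"{position:3d}. {name}: {value}")
```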

If a single stat and a single column are requested, only a value will
be returned:

    csvstat -c 4 --mean examples/realdata/FY09_EDU_Recipients_by_State.csv

    6,263.904
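
The value above is a mean printed with three decimal places and a
grouping separator. The Python sketch below reproduces that style of
output for one column of a small, made-up CSV. It assumes a fixed comma
separator instead of the locale-specific formatting csvstat uses by
default, and the function name column_mean is hypothetical.

```python
import csv
import io
from statistics import mean


def column_mean(csv_text, column_index):
    """Return the mean of a 1-based column, formatted like "6,263.904"."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    values = [
        float(row[column_index - 1])
        for row in reader
        if row[column_index - 1] != ""  # ignore empty cells
    ]
    # Fixed "," grouping and three decimals; not locale-aware.
    return f"{mean(values):,.3f}"


# Hypothetical sample data; the third column has one empty cell.
sample = "state,code,total\nAL,1,1000\nAK,2,2500.5\nAZ,4,\n"
print(column_mean(sample, 3))  # prints 1,750.250
```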

Christopher Groskopf

2023, Christopher Groskopf

1.1.1                           Jul 21, 2023                      CSVSTAT(1)