1ANALYZE(7) PostgreSQL 16.1 Documentation ANALYZE(7)
2
3
4
6 ANALYZE - collect statistics about a database
7
9 ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
10 ANALYZE [ VERBOSE ] [ table_and_columns [, ...] ]
11
12 where option can be one of:
13
14 VERBOSE [ boolean ]
15 SKIP_LOCKED [ boolean ]
16 BUFFER_USAGE_LIMIT size
17
18 and table_and_columns is:
19
20 table_name [ ( column_name [, ...] ) ]
21
23 ANALYZE collects statistics about the contents of tables in the
24 database, and stores the results in the pg_statistic system catalog.
25 Subsequently, the query planner uses these statistics to help determine
26 the most efficient execution plans for queries.
27
28 Without a table_and_columns list, ANALYZE processes every table and
29 materialized view in the current database that the current user has
30 permission to analyze. With a list, ANALYZE processes only those
31 table(s). It is further possible to give a list of column names for a
32 table, in which case only the statistics for those columns are
33 collected.
34
35 When the option list is surrounded by parentheses, the options can be
36 written in any order. The parenthesized syntax was added in PostgreSQL
37 11; the unparenthesized syntax is deprecated.
38
40 VERBOSE
41 Enables display of progress messages.
42
43 SKIP_LOCKED
44 Specifies that ANALYZE should not wait for any conflicting locks to
45 be released when beginning work on a relation: if a relation cannot
46 be locked immediately without waiting, the relation is skipped.
47 Note that even with this option, ANALYZE may still block when
48 opening the relation's indexes or when acquiring sample rows from
49 partitions, table inheritance children, and some types of foreign
50 tables. Also, while ANALYZE ordinarily processes all partitions of
51 specified partitioned tables, this option will cause ANALYZE to
52 skip all partitions if there is a conflicting lock on the
53 partitioned table.
54
55 BUFFER_USAGE_LIMIT
56 Specifies the Buffer Access Strategy ring buffer size for ANALYZE.
57 This size is used to calculate the number of shared buffers which
58 will be reused as part of this strategy. 0 disables use of a
59 Buffer Access Strategy. When this option is not specified, ANALYZE
60 uses the value from vacuum_buffer_usage_limit. Higher settings can
61 allow ANALYZE to run more quickly, but having too large a setting
62 may cause too many other useful pages to be evicted from shared
63 buffers. The minimum value is 128 kB and the maximum value is 16
64 GB.
65
66 boolean
67 Specifies whether the selected option should be turned on or off.
68 You can write TRUE, ON, or 1 to enable the option, and FALSE, OFF,
69 or 0 to disable it. The boolean value can also be omitted, in which
70 case TRUE is assumed.
71
72 size
73 Specifies an amount of memory in kilobytes. Sizes may also be
74 specified as a string containing the numerical size followed by any
75 one of the following memory units: B (bytes), kB (kilobytes), MB
76 (megabytes), GB (gigabytes), or TB (terabytes).
77
78 table_name
79 The name (possibly schema-qualified) of a specific table to
80 analyze. If omitted, all regular tables, partitioned tables, and
81 materialized views in the current database are analyzed (but not
82 foreign tables). If the specified table is a partitioned table,
83 both the inheritance statistics of the partitioned table as a whole
84 and statistics of the individual partitions are updated.
85
86 column_name
87 The name of a specific column to analyze. Defaults to all columns.
88
90 When VERBOSE is specified, ANALYZE emits progress messages to indicate
91 which table is currently being processed. Various statistics about the
92 tables are printed as well.
93
95 To analyze a table, one must ordinarily be the table's owner or a
96 superuser. However, database owners are allowed to analyze all tables
97 in their databases, except shared catalogs. (The restriction for shared
98 catalogs means that a true database-wide ANALYZE can only be performed
99 by a superuser.) ANALYZE will skip over any tables that the calling
100 user does not have permission to analyze.
101
102 Foreign tables are analyzed only when explicitly selected. Not all
103 foreign data wrappers support ANALYZE. If the table's wrapper does not
104 support ANALYZE, the command prints a warning and does nothing.
105
106 In the default PostgreSQL configuration, the autovacuum daemon (see
107 Section 25.1.6) takes care of automatic analyzing of tables when they
108 are first loaded with data, and as they change throughout regular
109 operation. When autovacuum is disabled, it is a good idea to run
110 ANALYZE periodically, or just after making major changes in the
111 contents of a table. Accurate statistics will help the planner to
112 choose the most appropriate query plan, and thereby improve the speed
113 of query processing. A common strategy for read-mostly databases is to
114 run VACUUM and ANALYZE once a day during a low-usage time of day. (This
115 will not be sufficient if there is heavy update activity.)
116
117 ANALYZE requires only a read lock on the target table, so it can run in
118 parallel with other activity on the table.
119
120 The statistics collected by ANALYZE usually include a list of some of
121 the most common values in each column and a histogram showing the
122 approximate data distribution in each column. One or both of these can
123 be omitted if ANALYZE deems them uninteresting (for example, in a
124 unique-key column, there are no common values) or if the column data
125 type does not support the appropriate operators. There is more
126 information about the statistics in Chapter 25.
127
128 For large tables, ANALYZE takes a random sample of the table contents,
129 rather than examining every row. This allows even very large tables to
130 be analyzed in a small amount of time. Note, however, that the
131 statistics are only approximate, and will change slightly each time
132 ANALYZE is run, even if the actual table contents did not change. This
133 might result in small changes in the planner's estimated costs shown by
134 EXPLAIN. In rare situations, this non-determinism will cause the
135 planner's choices of query plans to change after ANALYZE is run. To
136 avoid this, raise the amount of statistics collected by ANALYZE, as
137 described below.
138
139 The extent of analysis can be controlled by adjusting the
140 default_statistics_target configuration variable, or on a
141 column-by-column basis by setting the per-column statistics target with
142 ALTER TABLE ... ALTER COLUMN ... SET STATISTICS. The target value sets
143 the maximum number of entries in the most-common-value list and the
144 maximum number of bins in the histogram. The default target value is
145 100, but this can be adjusted up or down to trade off accuracy of
146 planner estimates against the time taken for ANALYZE and the amount of
147 space occupied in pg_statistic. In particular, setting the statistics
148 target to zero disables collection of statistics for that column. It
149 might be useful to do that for columns that are never used as part of
150 the WHERE, GROUP BY, or ORDER BY clauses of queries, since the planner
151 will have no use for statistics on such columns.
152
153 The largest statistics target among the columns being analyzed
154 determines the number of table rows sampled to prepare the statistics.
155 Increasing the target causes a proportional increase in the time and
156 space needed to do ANALYZE.
157
158 One of the values estimated by ANALYZE is the number of distinct values
159 that appear in each column. Because only a subset of the rows are
160 examined, this estimate can sometimes be quite inaccurate, even with
161 the largest possible statistics target. If this inaccuracy leads to bad
162 query plans, a more accurate value can be determined manually and then
163 installed with ALTER TABLE ... ALTER COLUMN ... SET (n_distinct = ...).
164
165 If the table being analyzed has inheritance children, ANALYZE gathers
166 two sets of statistics: one on the rows of the parent table only, and a
167 second including rows of both the parent table and all of its children.
168 This second set of statistics is needed when planning queries that
169 process the inheritance tree as a whole. The child tables themselves
170 are not individually analyzed in this case. The autovacuum daemon,
171 however, will only consider inserts or updates on the parent table
172 itself when deciding whether to trigger an automatic analyze for that
173 table. If that table is rarely inserted into or updated, the
174 inheritance statistics will not be up to date unless you run ANALYZE
175 manually.
176
177 For partitioned tables, ANALYZE gathers statistics by sampling rows
178 from all partitions; in addition, it will recurse into each partition
179 and update its statistics. Each leaf partition is analyzed only once,
180 even with multi-level partitioning. No statistics are collected for
181 only the parent table (without data from its partitions), because with
182 partitioning it's guaranteed to be empty.
183
184 The autovacuum daemon does not process partitioned tables, nor does it
185 process inheritance parents if only the children are ever modified. It
186 is usually necessary to periodically run a manual ANALYZE to keep the
187 statistics of the table hierarchy up to date.
188
189 If any child tables or partitions are foreign tables whose foreign data
190 wrappers do not support ANALYZE, those tables are ignored while
191 gathering inheritance statistics.
192
193 If the table being analyzed is completely empty, ANALYZE will not
194 record new statistics for that table. Any existing statistics will be
195 retained.
196
197 Each backend running ANALYZE will report its progress in the
198 pg_stat_progress_analyze view. See Section 28.4.1 for details.
199
201 There is no ANALYZE statement in the SQL standard.
202
204 VACUUM(7), vacuumdb(1), Section 20.4.4, Section 25.1.6, Section 28.4.1
205
206
207
208PostgreSQL 16.1 2023 ANALYZE(7)