1MK-INDEX-USAGE(1) User Contributed Perl Documentation MK-INDEX-USAGE(1)
2
3
4
6 mk-index-usage - Read queries from a log and analyze how they use
7 indexes.
8
10 Usage: mk-index-usage [OPTION...] [FILE...]
11
12 mk-index-usage reads queries from logs and analyzes how they use
13 indexes.
14
15 Analyze queries in slow.log and print reports:
16
17 mk-index-usage /path/to/slow.log --host localhost
18
19 Disable reports and save results to mk database for later analysis:
20
21 mk-index-usage slow.log --no-report --save-results-database mk
22
24 The following section is included to inform users about the potential
25 risks, whether known or unknown, of using this tool. The two main
26 categories of risks are those created by the nature of the tool (e.g.
27 read-only tools vs. read-write tools) and those created by bugs.
28
29 This tool is read-only unless you use "--save-results-database". It
30 reads a log of queries and EXPLAIN them. It also gathers information
31 about all tables in all databases. It should be very low-risk.
32
33 At the time of this release, we know of no bugs that could cause
34 serious harm to users.
35
36 The authoritative source for updated information is always the online
37 issue tracking system. Issues that affect this tool will be marked as
38 such. You can see a list of such issues at the following URL:
39 <http://www.maatkit.org/bugs/mk-index-usage>.
40
41 See also "BUGS" for more information on filing bugs and getting help.
42
44 This tool connects to a MySQL database server, reads through a query
45 log, and uses EXPLAIN to ask MySQL how it will use each query. When it
46 is finished, it prints out a report on indexes that the queries didn't
47 use.
48
49 The query log needs to be in MySQL's slow query log format. If you
50 need to input a different format, you can use mk-query-digest to
51 translate the formats. If you don't specify a filename, the tool reads
52 from STDIN.
53
54 The tool runs two stages. In the first stage, the tool takes inventory
55 of all the tables and indexes in your database, so it can compare the
56 existing indexes to those that were actually used by the queries in the
57 log. In the second stage, it runs EXPLAIN on each query in the query
58 log. It uses separate database connections to inventory the tables and
59 run EXPLAIN, so it opens two connections to the database.
60
61 If a query is not a SELECT, it tries to transform it to a roughly
62 equivalent SELECT query so it can be EXPLAINed. This is not a perfect
63 process, but it is good enough to be useful.
64
65 The tool skips the EXPLAIN step for queries that are exact duplicates
66 of those seen before. It assumes that the same query will generate the
67 same EXPLAIN plan as it did previously (usually a safe assumption, and
68 generally good for performance), and simply increments the count of
69 times that the indexes were used. However, queries that have the same
70 fingerprint but different checksums will be re-EXPLAINed. Queries that
71 have different literal constants can have different execution plans,
72 and this is important to measure.
73
74 After EXPLAIN-ing the query, it is necessary to try to map aliases in
75 the query back to the original table names. For example, consider the
76 EXPLAIN plan for the following query:
77
78 SELECT * FROM tbl1 AS foo;
79
80 The EXPLAIN output will show access to table "foo", and that must be
81 translated back to "tbl1". This process involves complex parsing. It
82 is generally very accurate, but there is some chance that it might not
83 work right. If you find cases where it fails, submit a bug report and
84 a reproducible test case.
85
86 Queries that cannot be EXPLAINed will cause all subsequent queries with
87 the same fingerprint to be blacklisted. This is to reduce the work
88 they cause, and prevent them from continuing to print error messages.
89 However, at least in this stage of the tool's development, it is my
90 opinion that it's not a good idea to preemptively silence these, or
91 prevent them from being EXPLAINed at all. I am looking for lots of
92 feedback on how to improve things like the query parsing. So please
93 submit your test cases based on the errors the tool prints!
94
96 After it reads all the events in the log, the tool prints out DROP
97 statements for every index that was not used. It skips indexes for
98 tables that were never accessed by any queries in the log, to avoid
99 false-positive results.
100
101 If you don't specify "--quiet", the tool also outputs warnings about
102 statements that cannot be EXPLAINed and similar. These go to standard
103 error.
104
105 Progress reports are enabled by default (see "--progress"). These also
106 go to standard error.
107
109 This tool accepts additional command-line arguments. Refer to the
110 "SYNOPSIS" and usage information for details.
111
112 --ask-pass
113 Prompt for a password when connecting to MySQL.
114
115 --charset
116 short form: -A; type: string
117
118 Default character set. If the value is utf8, sets Perl's binmode
119 on STDOUT to utf8, passes the mysql_enable_utf8 option to
120 DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any
121 other value sets binmode on STDOUT without the utf8 layer, and runs
122 SET NAMES after connecting to MySQL.
123
124 --config
125 type: Array
126
127 Read this comma-separated list of config files; if specified, this
128 must be the first option on the command line.
129
130 --create-save-results-database
131 Create the "--save-results-database" if it does not exist.
132
133 If the "--save-results-database" already exists and this option is
134 specified, the database is used and the necessary tables are
135 created if they do not already exist.
136
137 --[no]create-views
138 Create views for "--save-results-database" example queries.
139
140 Several example queries are given for querying the tables in the
141 "--save-results-database". These example queries are, by default,
142 created as views. Specifying "--no-create-views" prevents these
143 views from being created.
144
145 --database
146 short form: -D; type: string
147
148 The database to use for the connection.
149
150 --databases
151 short form: -d; type: hash
152
153 Only get tables and indexes from this comma-separated list of
154 databases.
155
156 --databases-regex
157 type: string
158
159 Only get tables and indexes from database whose names match this
160 Perl regex.
161
162 --defaults-file
163 short form: -F; type: string
164
165 Only read mysql options from the given file. You must give an
166 absolute pathname.
167
168 --drop
169 type: Hash; default: non-unique
170
171 Suggest dropping only these types of unused indexes.
172
173 By default mk-index-usage will only suggest to drop unused
174 secondary indexes, not primary or unique indexes. You can specify
175 which types of unused indexes the tool suggests to drop: primary,
176 unique, non-unique, all.
177
178 A separate "ALTER TABLE" statement for each type is printed. So if
179 you specify "--drop all" and there is a primary key and a non-
180 unique index, the "ALTER TABLE ... DROP" for each will be printed
181 on separate lines.
182
183 --empty-save-results-tables
184 Drop and re-create all pre-existing tables in the
185 "--save-results-database". This allows information from previous
186 runs to be removed before the current run.
187
188 --help
189 Show help and exit.
190
191 --host
192 short form: -h; type: string
193
194 Connect to host.
195
196 --ignore-databases
197 type: Hash
198
199 Ignore this comma-separated list of databases.
200
201 --ignore-databases-regex
202 type: string
203
204 Ignore databases whose names match this Perl regex.
205
206 --ignore-tables
207 type: Hash
208
209 Ignore this comma-separated list of table names.
210
211 Table names may be qualified with the database name.
212
213 --ignore-tables-regex
214 type: string
215
216 Ignore tables whose names match the Perl regex.
217
218 --password
219 short form: -p; type: string
220
221 Password to use when connecting.
222
223 --port
224 short form: -P; type: int
225
226 Port number to use for connection.
227
228 --progress
229 type: array; default: time,30
230
231 Print progress reports to STDERR. The value is a comma-separated
232 list with two parts. The first part can be percentage, time, or
233 iterations; the second part specifies how often an update should be
234 printed, in percentage, seconds, or number of iterations.
235
236 --quiet
237 short form: -q
238
239 Do not print any warnings. Also disables "--progress".
240
241 --[no]report
242 default: yes
243
244 Print the reports for "--report-format".
245
246 You may want to disable the reports by specifying "--no-report" if,
247 for example, you also specify "--save-results-database" and you
248 only want to query the results tables later.
249
250 --report-format
251 type: Array; default: drop_unused_indexes
252
253 Right now there is only one report: drop_unused_indexes. This
254 report prints SQL statements for dropping any unused indexes. See
255 also "--drop".
256
257 See also "--[no]report".
258
259 --save-results-database
260 type: DSN
261
262 Save results to tables in this database. Information about
263 indexes, queries, tables and their usage is stored in several
264 tables in the specified database. The tables are auto-created if
265 they do not exist. If the database doesn't exist, it can be auto-
266 created with "--create-save-results-database". In this case the
267 connection is initially created with no default database, then
268 after the database is created, it is USE'ed.
269
270 mk-index-usage executes INSERT statements to save the results.
271 Therefore, you should be careful if you use this feature on a
272 production server. It might increase load, or cause trouble if you
273 don't want the server to be written to, or so on.
274
275 This is a new feature. It may change in future releases.
276
277 After a run, you can query the usage tables to answer various
278 questions about index usage. The tables have the following CREATE
279 TABLE definitions:
280
281 MAGIC_create_indexes:
282
283 CREATE TABLE IF NOT EXISTS indexes (
284 db VARCHAR(64) NOT NULL,
285 tbl VARCHAR(64) NOT NULL,
286 idx VARCHAR(64) NOT NULL,
287 cnt BIGINT UNSIGNED NOT NULL DEFAULT 0,
288 PRIMARY KEY (db, tbl, idx)
289 )
290
291 MAGIC_create_queries:
292
293 CREATE TABLE IF NOT EXISTS queries (
294 query_id BIGINT UNSIGNED NOT NULL,
295 fingerprint TEXT NOT NULL,
296 sample TEXT NOT NULL,
297 PRIMARY KEY (query_id)
298 )
299
300 MAGIC_create_tables:
301
302 CREATE TABLE IF NOT EXISTS tables (
303 db VARCHAR(64) NOT NULL,
304 tbl VARCHAR(64) NOT NULL,
305 cnt BIGINT UNSIGNED NOT NULL DEFAULT 0,
306 PRIMARY KEY (db, tbl)
307 )
308
309 MAGIC_create_index_usage:
310
311 CREATE TABLE IF NOT EXISTS index_usage (
312 query_id BIGINT UNSIGNED NOT NULL,
313 db VARCHAR(64) NOT NULL,
314 tbl VARCHAR(64) NOT NULL,
315 idx VARCHAR(64) NOT NULL,
316 cnt BIGINT UNSIGNED NOT NULL DEFAULT 1,
317 UNIQUE INDEX (query_id, db, tbl, idx)
318 )
319
320 MAGIC_create_index_alternatives:
321
322 CREATE TABLE IF NOT EXISTS index_alternatives (
323 query_id BIGINT UNSIGNED NOT NULL, -- This query used
324 db VARCHAR(64) NOT NULL, -- this index, but...
325 tbl VARCHAR(64) NOT NULL, --
326 idx VARCHAR(64) NOT NULL, --
327 alt_idx VARCHAR(64) NOT NULL, -- was an alternative
328 cnt BIGINT UNSIGNED NOT NULL DEFAULT 1,
329 UNIQUE INDEX (query_id, db, tbl, idx, alt_idx),
330 INDEX (db, tbl, idx),
331 INDEX (db, tbl, alt_idx)
332 )
333
334 The following are some queries you can run against these tables to
335 answer common questions you might have. Each query is also created
336 as a view (with MySQL v5.0 and newer) if "--[no]create-views" is
337 true (it is by default). The view names are the strings after the
338 "MAGIC_view_" prefix.
339
340 Question: which queries sometimes use different indexes, and what
341 fraction of the time is each index chosen?
342 MAGIC_view_query_uses_several_indexes:
343
344 SELECT iu.query_id, CONCAT_WS('.', iu.db, iu.tbl, iu.idx) AS idx,
345 variations, iu.cnt, iu.cnt / total_cnt * 100 AS pct
346 FROM index_usage AS iu
347 INNER JOIN (
348 SELECT query_id, db, tbl, SUM(cnt) AS total_cnt,
349 COUNT(*) AS variations
350 FROM index_usage
351 GROUP BY query_id, db, tbl
352 HAVING COUNT(*) > 1
353 ) AS qv USING(query_id, db, tbl);
354
355 Question: which indexes have lots of alternatives, i.e. are chosen
356 instead of other indexes, and for what queries?
357 MAGIC_view_index_has_alternates:
358
359 SELECT CONCAT_WS('.', db, tbl, idx) AS idx_chosen,
360 GROUP_CONCAT(DISTINCT alt_idx) AS alternatives,
361 GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
362 FROM index_alternatives
363 GROUP BY db, tbl, idx
364 HAVING COUNT(*) > 1;
365
366 Question: which indexes are considered as alternates for other
367 indexes, and for what queries? MAGIC_view_index_alternates:
368
369 SELECT CONCAT_WS('.', db, tbl, alt_idx) AS idx_considered,
370 GROUP_CONCAT(DISTINCT idx) AS alternative_to,
371 GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
372 FROM index_alternatives
373 GROUP BY db, tbl, alt_idx
374 HAVING COUNT(*) > 1;
375
376 Question: which of those are never chosen by any queries, and are
377 therefore superfluous? MAGIC_view_unused_index_alternates:
378
379 SELECT CONCAT_WS('.', i.db, i.tbl, i.idx) AS idx,
380 alt.alternative_to, alt.queries, alt.cnt
381 FROM indexes AS i
382 INNER JOIN (
383 SELECT db, tbl, alt_idx, GROUP_CONCAT(DISTINCT idx) AS alternative_to,
384 GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
385 FROM index_alternatives
386 GROUP BY db, tbl, alt_idx
387 HAVING COUNT(*) > 1
388 ) AS alt ON i.db = alt.db AND i.tbl = alt.tbl
389 AND i.idx = alt.alt_idx
390 WHERE i.cnt = 0;
391
392 Question: given a table, which indexes were used, by how many
393 queries, with how many distinct fingerprints? Were there
394 alternatives? Which indexes were not used? You can edit the
395 following query's SELECT list to also see the query IDs in
396 question. MAGIC_view_index_usage:
397
398 SELECT i.idx, iu.usage_cnt, iu.usage_total,
399 ia.alt_cnt, ia.alt_total
400 FROM indexes AS i
401 LEFT OUTER JOIN (
402 SELECT db, tbl, idx, COUNT(*) AS usage_cnt,
403 SUM(cnt) AS usage_total, GROUP_CONCAT(query_id) AS used_by
404 FROM index_usage
405 GROUP BY db, tbl, idx
406 ) AS iu ON i.db=iu.db AND i.tbl=iu.tbl AND i.idx = iu.idx
407 LEFT OUTER JOIN (
408 SELECT db, tbl, idx, COUNT(*) AS alt_cnt,
409 SUM(cnt) AS alt_total,
410 GROUP_CONCAT(query_id) AS alt_queries
411 FROM index_alternatives
412 GROUP BY db, tbl, idx
413 ) AS ia ON i.db=ia.db AND i.tbl=ia.tbl AND i.idx = ia.idx;
414
415 Question: which indexes on a given table are vital for at least one
416 query (there is no alternative)? MAGIC_view_required_indexes:
417
418 SELECT i.db, i.tbl, i.idx, no_alt.queries
419 FROM indexes AS i
420 INNER JOIN (
421 SELECT iu.db, iu.tbl, iu.idx,
422 GROUP_CONCAT(iu.query_id) AS queries
423 FROM index_usage AS iu
424 LEFT OUTER JOIN index_alternatives AS ia
425 USING(db, tbl, idx)
426 WHERE ia.db IS NULL
427 GROUP BY iu.db, iu.tbl, iu.idx
428 ) AS no_alt ON no_alt.db = i.db AND no_alt.tbl = i.tbl
429 AND no_alt.idx = i.idx
430 ORDER BY i.db, i.tbl, i.idx, no_alt.queries;
431
432 --set-vars
433 type: string; default: wait_timeout=10000
434
435 Set these MySQL variables. Immediately after connecting to MySQL,
436 this string will be appended to SET and executed.
437
438 --socket
439 short form: -S; type: string
440
441 Socket file to use for connection.
442
443 --tables
444 short form: -t; type: hash
445
446 Only get indexes from this comma-separated list of tables.
447
448 --tables-regex
449 type: string
450
451 Only get indexes from tables whose names match this Perl regex.
452
453 --user
454 short form: -u; type: string
455
456 User for login if not current user.
457
458 --version
459 Show version and exit.
460
462 These DSN options are used to create a DSN. Each option is given like
463 "option=value". The options are case-sensitive, so P and p are not the
464 same option. There cannot be whitespace before or after the "=" and if
465 the value contains whitespace it must be quoted. DSN options are
466 comma-separated. See the maatkit manpage for full details.
467
468 · A
469
470 dsn: charset; copy: yes
471
472 Default character set.
473
474 · D
475
476 dsn: database; copy: yes
477
478 Database to connect to.
479
480 · F
481
482 dsn: mysql_read_default_file; copy: yes
483
484 Only read default options from the given file
485
486 · h
487
488 dsn: host; copy: yes
489
490 Connect to host.
491
492 · p
493
494 dsn: password; copy: yes
495
496 Password to use when connecting.
497
498 · P
499
500 dsn: port; copy: yes
501
502 Port number to use for connection.
503
504 · S
505
506 dsn: mysql_socket; copy: yes
507
508 Socket file to use for connection.
509
510 · u
511
512 dsn: user; copy: yes
513
514 User for login if not current user.
515
517 You can download Maatkit from Google Code at
518 <http://code.google.com/p/maatkit/>, or you can get any of the tools
519 easily with a command like the following:
520
521 wget http://www.maatkit.org/get/toolname
522 or
523 wget http://www.maatkit.org/trunk/toolname
524
525 Where "toolname" can be replaced with the name (or fragment of a name)
526 of any of the Maatkit tools. Once downloaded, they're ready to run; no
527 installation is needed. The first URL gets the latest released version
528 of the tool, and the second gets the latest trunk code from Subversion.
529
531 The environment variable "MKDEBUG" enables verbose debugging output in
532 all of the Maatkit tools:
533
534 MKDEBUG=1 mk-....
535
537 You need Perl and some core packages that ought to be installed in any
538 reasonably new version of Perl.
539
541 For a list of known bugs see
542 <http://www.maatkit.org/bugs/mk-index-usage>.
543
544 Please use Google Code Issues and Groups to report bugs or request
545 support: <http://code.google.com/p/maatkit/>. You can also join
546 #maatkit on Freenode to discuss Maatkit.
547
548 Please include the complete command-line used to reproduce the problem
549 you are seeing, the version of all MySQL servers involved, the complete
550 output of the tool when run with "--version", and if possible,
551 debugging output produced by running with the "MKDEBUG=1" environment
552 variable.
553
555 This program is copyright 2010-2011 Baron Schwartz. Feedback and
556 improvements are welcome.
557
558 THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
559 WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
560 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
561
562 This program is free software; you can redistribute it and/or modify it
563 under the terms of the GNU General Public License as published by the
564 Free Software Foundation, version 2; OR the Perl Artistic License. On
565 UNIX and similar systems, you can issue `man perlgpl' or `man
566 perlartistic' to read these licenses.
567
568 You should have received a copy of the GNU General Public License along
569 with this program; if not, write to the Free Software Foundation, Inc.,
570 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
571
573 Baron Schwartz, Daniel Nichter
574
576 This tool is part of Maatkit, a toolkit for power users of MySQL.
577 Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
578 primary code contributors. Both are employed by Percona. Financial
579 support for Maatkit development is primarily provided by Percona and
580 its clients.
581
583 This manual page documents Ver 0.9.4 Distrib 7540 $Revision: 7477 $.
584
585
586
587perl v5.28.0 2011-06-08 MK-INDEX-USAGE(1)