1MK-INDEX-USAGE(1)     User Contributed Perl Documentation    MK-INDEX-USAGE(1)
2
3
4

NAME

6       mk-index-usage - Read queries from a log and analyze how they use
7       indexes.
8

SYNOPSIS

10       Usage: mk-index-usage [OPTION...] [FILE...]
11
12       mk-index-usage reads queries from logs and analyzes how they use
13       indexes.
14
15       Analyze queries in slow.log and print reports:
16
17         mk-index-usage /path/to/slow.log --host localhost
18
19       Disable reports and save results to mk database for later analysis:
20
21         mk-index-usage slow.log --no-report --save-results-database mk
22

RISKS

24       The following section is included to inform users about the potential
25       risks, whether known or unknown, of using this tool.  The two main
26       categories of risks are those created by the nature of the tool (e.g.
27       read-only tools vs. read-write tools) and those created by bugs.
28
29       This tool is read-only unless you use "--save-results-database".  It
30       reads a log of queries and EXPLAIN them.  It also gathers information
31       about all tables in all databases.  It should be very low-risk.
32
33       At the time of this release, we know of no bugs that could cause
34       serious harm to users.
35
36       The authoritative source for updated information is always the online
37       issue tracking system.  Issues that affect this tool will be marked as
38       such.  You can see a list of such issues at the following URL:
39       <http://www.maatkit.org/bugs/mk-index-usage>.
40
41       See also "BUGS" for more information on filing bugs and getting help.
42

DESCRIPTION

44       This tool connects to a MySQL database server, reads through a query
45       log, and uses EXPLAIN to ask MySQL how it will use each query.  When it
46       is finished, it prints out a report on indexes that the queries didn't
47       use.
48
49       The query log needs to be in MySQL's slow query log format.  If you
50       need to input a different format, you can use mk-query-digest to
51       translate the formats.  If you don't specify a filename, the tool reads
52       from STDIN.
53
54       The tool runs two stages.  In the first stage, the tool takes inventory
55       of all the tables and indexes in your database, so it can compare the
56       existing indexes to those that were actually used by the queries in the
57       log.  In the second stage, it runs EXPLAIN on each query in the query
58       log.  It uses separate database connections to inventory the tables and
59       run EXPLAIN, so it opens two connections to the database.
60
61       If a query is not a SELECT, it tries to transform it to a roughly
62       equivalent SELECT query so it can be EXPLAINed.  This is not a perfect
63       process, but it is good enough to be useful.
64
65       The tool skips the EXPLAIN step for queries that are exact duplicates
66       of those seen before.  It assumes that the same query will generate the
67       same EXPLAIN plan as it did previously (usually a safe assumption, and
68       generally good for performance), and simply increments the count of
69       times that the indexes were used.  However, queries that have the same
70       fingerprint but different checksums will be re-EXPLAINed.  Queries that
71       have different literal constants can have different execution plans,
72       and this is important to measure.
73
74       After EXPLAIN-ing the query, it is necessary to try to map aliases in
75       the query back to the original table names.  For example, consider the
76       EXPLAIN plan for the following query:
77
78         SELECT * FROM tbl1 AS foo;
79
80       The EXPLAIN output will show access to table "foo", and that must be
81       translated back to "tbl1".  This process involves complex parsing.  It
82       is generally very accurate, but there is some chance that it might not
83       work right.  If you find cases where it fails, submit a bug report and
84       a reproducible test case.
85
86       Queries that cannot be EXPLAINed will cause all subsequent queries with
87       the same fingerprint to be blacklisted.  This is to reduce the work
88       they cause, and prevent them from continuing to print error messages.
89       However, at least in this stage of the tool's development, it is my
90       opinion that it's not a good idea to preemptively silence these, or
91       prevent them from being EXPLAINed at all.  I am looking for lots of
92       feedback on how to improve things like the query parsing.  So please
93       submit your test cases based on the errors the tool prints!
94

OUTPUT

96       After it reads all the events in the log, the tool prints out DROP
97       statements for every index that was not used.  It skips indexes for
98       tables that were never accessed by any queries in the log, to avoid
99       false-positive results.
100
101       If you don't specify "--quiet", the tool also outputs warnings about
102       statements that cannot be EXPLAINed and similar.  These go to standard
103       error.
104
105       Progress reports are enabled by default (see "--progress").  These also
106       go to standard error.
107

OPTIONS

109       This tool accepts additional command-line arguments.  Refer to the
110       "SYNOPSIS" and usage information for details.
111
112       --ask-pass
113           Prompt for a password when connecting to MySQL.
114
115       --charset
116           short form: -A; type: string
117
118           Default character set.  If the value is utf8, sets Perl's binmode
119           on STDOUT to utf8, passes the mysql_enable_utf8 option to
120           DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL.  Any
121           other value sets binmode on STDOUT without the utf8 layer, and runs
122           SET NAMES after connecting to MySQL.
123
124       --config
125           type: Array
126
127           Read this comma-separated list of config files; if specified, this
128           must be the first option on the command line.
129
130       --create-save-results-database
131           Create the "--save-results-database" if it does not exist.
132
133           If the "--save-results-database" already exists and this option is
134           specified, the database is used and the necessary tables are
135           created if they do not already exist.
136
137       --[no]create-views
138           Create views for "--save-results-database" example queries.
139
140           Several example queries are given for querying the tables in the
141           "--save-results-database".  These example queries are, by default,
142           created as views.  Specifying "--no-create-views" prevents these
143           views from being created.
144
145       --database
146           short form: -D; type: string
147
148           The database to use for the connection.
149
150       --databases
151           short form: -d; type: hash
152
153           Only get tables and indexes from this comma-separated list of
154           databases.
155
156       --databases-regex
157           type: string
158
159           Only get tables and indexes from database whose names match this
160           Perl regex.
161
162       --defaults-file
163           short form: -F; type: string
164
165           Only read mysql options from the given file.  You must give an
166           absolute pathname.
167
168       --drop
169           type: Hash; default: non-unique
170
171           Suggest dropping only these types of unused indexes.
172
173           By default mk-index-usage will only suggest to drop unused
174           secondary indexes, not primary or unique indexes.  You can specify
175           which types of unused indexes the tool suggests to drop: primary,
176           unique, non-unique, all.
177
178           A separate "ALTER TABLE" statement for each type is printed.  So if
179           you specify "--drop all" and there is a primary key and a non-
180           unique index, the "ALTER TABLE ... DROP" for each will be printed
181           on separate lines.
182
183       --empty-save-results-tables
184           Drop and re-create all pre-existing tables in the
185           "--save-results-database".  This allows information from previous
186           runs to be removed before the current run.
187
188       --help
189           Show help and exit.
190
191       --host
192           short form: -h; type: string
193
194           Connect to host.
195
196       --ignore-databases
197           type: Hash
198
199           Ignore this comma-separated list of databases.
200
201       --ignore-databases-regex
202           type: string
203
204           Ignore databases whose names match this Perl regex.
205
206       --ignore-tables
207           type: Hash
208
209           Ignore this comma-separated list of table names.
210
211           Table names may be qualified with the database name.
212
213       --ignore-tables-regex
214           type: string
215
216           Ignore tables whose names match the Perl regex.
217
218       --password
219           short form: -p; type: string
220
221           Password to use when connecting.
222
223       --port
224           short form: -P; type: int
225
226           Port number to use for connection.
227
228       --progress
229           type: array; default: time,30
230
231           Print progress reports to STDERR.  The value is a comma-separated
232           list with two parts.  The first part can be percentage, time, or
233           iterations; the second part specifies how often an update should be
234           printed, in percentage, seconds, or number of iterations.
235
236       --quiet
237           short form: -q
238
239           Do not print any warnings.  Also disables "--progress".
240
241       --[no]report
242           default: yes
243
244           Print the reports for "--report-format".
245
246           You may want to disable the reports by specifying "--no-report" if,
247           for example, you also specify "--save-results-database" and you
248           only want to query the results tables later.
249
250       --report-format
251           type: Array; default: drop_unused_indexes
252
253           Right now there is only one report: drop_unused_indexes.  This
254           report prints SQL statements for dropping any unused indexes.  See
255           also "--drop".
256
257           See also "--[no]report".
258
259       --save-results-database
260           type: DSN
261
262           Save results to tables in this database.  Information about
263           indexes, queries, tables and their usage is stored in several
264           tables in the specified database.  The tables are auto-created if
265           they do not exist.  If the database doesn't exist, it can be auto-
266           created with "--create-save-results-database".  In this case the
267           connection is initially created with no default database, then
268           after the database is created, it is USE'ed.
269
270           mk-index-usage executes INSERT statements to save the results.
271           Therefore, you should be careful if you use this feature on a
272           production server.  It might increase load, or cause trouble if you
273           don't want the server to be written to, or so on.
274
275           This is a new feature.  It may change in future releases.
276
277           After a run, you can query the usage tables to answer various
278           questions about index usage.  The tables have the following CREATE
279           TABLE definitions:
280
281           MAGIC_create_indexes:
282
283             CREATE TABLE IF NOT EXISTS indexes (
284               db           VARCHAR(64) NOT NULL,
285               tbl          VARCHAR(64) NOT NULL,
286               idx          VARCHAR(64) NOT NULL,
287               cnt          BIGINT UNSIGNED NOT NULL DEFAULT 0,
288               PRIMARY KEY  (db, tbl, idx)
289             )
290
291           MAGIC_create_queries:
292
293             CREATE TABLE IF NOT EXISTS queries (
294               query_id     BIGINT UNSIGNED NOT NULL,
295               fingerprint  TEXT NOT NULL,
296               sample       TEXT NOT NULL,
297               PRIMARY KEY  (query_id)
298             )
299
300           MAGIC_create_tables:
301
302             CREATE TABLE IF NOT EXISTS tables (
303               db           VARCHAR(64) NOT NULL,
304               tbl          VARCHAR(64) NOT NULL,
305               cnt          BIGINT UNSIGNED NOT NULL DEFAULT 0,
306               PRIMARY KEY  (db, tbl)
307             )
308
309           MAGIC_create_index_usage:
310
311             CREATE TABLE IF NOT EXISTS index_usage (
312               query_id      BIGINT UNSIGNED NOT NULL,
313               db            VARCHAR(64) NOT NULL,
314               tbl           VARCHAR(64) NOT NULL,
315               idx           VARCHAR(64) NOT NULL,
316               cnt           BIGINT UNSIGNED NOT NULL DEFAULT 1,
317               UNIQUE INDEX  (query_id, db, tbl, idx)
318             )
319
320           MAGIC_create_index_alternatives:
321
322             CREATE TABLE IF NOT EXISTS index_alternatives (
323               query_id      BIGINT UNSIGNED NOT NULL, -- This query used
324               db            VARCHAR(64) NOT NULL,     -- this index, but...
325               tbl           VARCHAR(64) NOT NULL,     --
326               idx           VARCHAR(64) NOT NULL,     --
327               alt_idx       VARCHAR(64) NOT NULL,     -- was an alternative
328               cnt           BIGINT UNSIGNED NOT NULL DEFAULT 1,
329               UNIQUE INDEX  (query_id, db, tbl, idx, alt_idx),
330               INDEX         (db, tbl, idx),
331               INDEX         (db, tbl, alt_idx)
332             )
333
334           The following are some queries you can run against these tables to
335           answer common questions you might have.  Each query is also created
336           as a view (with MySQL v5.0 and newer) if "--[no]create-views" is
337           true (it is by default).  The view names are the strings after the
338           "MAGIC_view_" prefix.
339
340           Question: which queries sometimes use different indexes, and what
341           fraction of the time is each index chosen?
342           MAGIC_view_query_uses_several_indexes:
343
344            SELECT iu.query_id, CONCAT_WS('.', iu.db, iu.tbl, iu.idx) AS idx,
345               variations, iu.cnt, iu.cnt / total_cnt * 100 AS pct
346            FROM index_usage AS iu
347               INNER JOIN (
348                  SELECT query_id, db, tbl, SUM(cnt) AS total_cnt,
349                    COUNT(*) AS variations
350                  FROM index_usage
351                  GROUP BY query_id, db, tbl
352                  HAVING COUNT(*) > 1
353               ) AS qv USING(query_id, db, tbl);
354
355           Question: which indexes have lots of alternatives, i.e. are chosen
356           instead of other indexes, and for what queries?
357           MAGIC_view_index_has_alternates:
358
359            SELECT CONCAT_WS('.', db, tbl, idx) AS idx_chosen,
360               GROUP_CONCAT(DISTINCT alt_idx) AS alternatives,
361               GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
362            FROM index_alternatives
363            GROUP BY db, tbl, idx
364            HAVING COUNT(*) > 1;
365
366           Question: which indexes are considered as alternates for other
367           indexes, and for what queries?  MAGIC_view_index_alternates:
368
369            SELECT CONCAT_WS('.', db, tbl, alt_idx) AS idx_considered,
370               GROUP_CONCAT(DISTINCT idx) AS alternative_to,
371               GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
372            FROM index_alternatives
373            GROUP BY db, tbl, alt_idx
374            HAVING COUNT(*) > 1;
375
376           Question: which of those are never chosen by any queries, and are
377           therefore superfluous?  MAGIC_view_unused_index_alternates:
378
379            SELECT CONCAT_WS('.', i.db, i.tbl, i.idx) AS idx,
380               alt.alternative_to, alt.queries, alt.cnt
381            FROM indexes AS i
382               INNER JOIN (
383                  SELECT db, tbl, alt_idx, GROUP_CONCAT(DISTINCT idx) AS alternative_to,
384                     GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
385                  FROM index_alternatives
386                  GROUP BY db, tbl, alt_idx
387                  HAVING COUNT(*) > 1
388               ) AS alt ON i.db = alt.db AND i.tbl = alt.tbl
389                 AND i.idx = alt.alt_idx
390            WHERE i.cnt = 0;
391
392           Question: given a table, which indexes were used, by how many
393           queries, with how many distinct fingerprints?  Were there
394           alternatives?  Which indexes were not used?  You can edit the
395           following query's SELECT list to also see the query IDs in
396           question.  MAGIC_view_index_usage:
397
398            SELECT i.idx, iu.usage_cnt, iu.usage_total,
399               ia.alt_cnt, ia.alt_total
400            FROM indexes AS i
401               LEFT OUTER JOIN (
402                  SELECT db, tbl, idx, COUNT(*) AS usage_cnt,
403                     SUM(cnt) AS usage_total, GROUP_CONCAT(query_id) AS used_by
404                  FROM index_usage
405                  GROUP BY db, tbl, idx
406               ) AS iu ON i.db=iu.db AND i.tbl=iu.tbl AND i.idx = iu.idx
407               LEFT OUTER JOIN (
408                  SELECT db, tbl, idx, COUNT(*) AS alt_cnt,
409                     SUM(cnt) AS alt_total,
410                     GROUP_CONCAT(query_id) AS alt_queries
411                  FROM index_alternatives
412                  GROUP BY db, tbl, idx
413               ) AS ia ON i.db=ia.db AND i.tbl=ia.tbl AND i.idx = ia.idx;
414
415           Question: which indexes on a given table are vital for at least one
416           query (there is no alternative)?  MAGIC_view_required_indexes:
417
418              SELECT i.db, i.tbl, i.idx, no_alt.queries
419              FROM indexes AS i
420                 INNER JOIN (
421                    SELECT iu.db, iu.tbl, iu.idx,
422                       GROUP_CONCAT(iu.query_id) AS queries
423                    FROM index_usage AS iu
424                       LEFT OUTER JOIN index_alternatives AS ia
425                          USING(db, tbl, idx)
426                    WHERE ia.db IS NULL
427                    GROUP BY iu.db, iu.tbl, iu.idx
428                 ) AS no_alt ON no_alt.db = i.db AND no_alt.tbl = i.tbl
429                    AND no_alt.idx = i.idx
430              ORDER BY i.db, i.tbl, i.idx, no_alt.queries;
431
432       --set-vars
433           type: string; default: wait_timeout=10000
434
435           Set these MySQL variables.  Immediately after connecting to MySQL,
436           this string will be appended to SET and executed.
437
438       --socket
439           short form: -S; type: string
440
441           Socket file to use for connection.
442
443       --tables
444           short form: -t; type: hash
445
446           Only get indexes from this comma-separated list of tables.
447
448       --tables-regex
449           type: string
450
451           Only get indexes from tables whose names match this Perl regex.
452
453       --user
454           short form: -u; type: string
455
456           User for login if not current user.
457
458       --version
459           Show version and exit.
460

DSN OPTIONS

462       These DSN options are used to create a DSN.  Each option is given like
463       "option=value".  The options are case-sensitive, so P and p are not the
464       same option.  There cannot be whitespace before or after the "=" and if
465       the value contains whitespace it must be quoted.  DSN options are
466       comma-separated.  See the maatkit manpage for full details.
467
468       ·   A
469
470           dsn: charset; copy: yes
471
472           Default character set.
473
474       ·   D
475
476           dsn: database; copy: yes
477
478           Database to connect to.
479
480       ·   F
481
482           dsn: mysql_read_default_file; copy: yes
483
484           Only read default options from the given file
485
486       ·   h
487
488           dsn: host; copy: yes
489
490           Connect to host.
491
492       ·   p
493
494           dsn: password; copy: yes
495
496           Password to use when connecting.
497
498       ·   P
499
500           dsn: port; copy: yes
501
502           Port number to use for connection.
503
504       ·   S
505
506           dsn: mysql_socket; copy: yes
507
508           Socket file to use for connection.
509
510       ·   u
511
512           dsn: user; copy: yes
513
514           User for login if not current user.
515

DOWNLOADING

517       You can download Maatkit from Google Code at
518       <http://code.google.com/p/maatkit/>, or you can get any of the tools
519       easily with a command like the following:
520
521          wget http://www.maatkit.org/get/toolname
522          or
523          wget http://www.maatkit.org/trunk/toolname
524
525       Where "toolname" can be replaced with the name (or fragment of a name)
526       of any of the Maatkit tools.  Once downloaded, they're ready to run; no
527       installation is needed.  The first URL gets the latest released version
528       of the tool, and the second gets the latest trunk code from Subversion.
529

ENVIRONMENT

531       The environment variable "MKDEBUG" enables verbose debugging output in
532       all of the Maatkit tools:
533
534          MKDEBUG=1 mk-....
535

SYSTEM REQUIREMENTS

537       You need Perl and some core packages that ought to be installed in any
538       reasonably new version of Perl.
539

BUGS

541       For a list of known bugs see
542       <http://www.maatkit.org/bugs/mk-index-usage>.
543
544       Please use Google Code Issues and Groups to report bugs or request
545       support: <http://code.google.com/p/maatkit/>.  You can also join
546       #maatkit on Freenode to discuss Maatkit.
547
548       Please include the complete command-line used to reproduce the problem
549       you are seeing, the version of all MySQL servers involved, the complete
550       output of the tool when run with "--version", and if possible,
551       debugging output produced by running with the "MKDEBUG=1" environment
552       variable.
553

COPYRIGHT, LICENSE AND WARRANTY

555       This program is copyright 2010-2011 Baron Schwartz.  Feedback and
556       improvements are welcome.
557
558       THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
559       WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
560       MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
561
562       This program is free software; you can redistribute it and/or modify it
563       under the terms of the GNU General Public License as published by the
564       Free Software Foundation, version 2; OR the Perl Artistic License.  On
565       UNIX and similar systems, you can issue `man perlgpl' or `man
566       perlartistic' to read these licenses.
567
568       You should have received a copy of the GNU General Public License along
569       with this program; if not, write to the Free Software Foundation, Inc.,
570       59 Temple Place, Suite 330, Boston, MA  02111-1307  USA.
571

AUTHOR

573       Baron Schwartz, Daniel Nichter
574

ABOUT MAATKIT

576       This tool is part of Maatkit, a toolkit for power users of MySQL.
577       Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
578       primary code contributors.  Both are employed by Percona.  Financial
579       support for Maatkit development is primarily provided by Percona and
580       its clients.
581

VERSION

583       This manual page documents Ver 0.9.4 Distrib 7540 $Revision: 7477 $.
584
585
586
587perl v5.28.0                      2011-06-08                 MK-INDEX-USAGE(1)
Impressum