1MK-VISUAL-EXPLAIN(1) User Contributed Perl Documentation MK-VISUAL-EXPLAIN(1)
2
3
4
6 mk-visual-explain - Format EXPLAIN output as a tree.
7
9 mk-visual-explain <file_containing_explain_output>
10 mk-visual-explain -c <file_containing_query>
11 mysql -e "explain select * from mysql.user" | mk-visual-explain
12
14 The following section is included to inform users about the potential
15 risks, whether known or unknown, of using this tool. The two main
16 categories of risks are those created by the nature of the tool (e.g.
17 read-only tools vs. read-write tools) and those created by bugs.
18
19 mk-visual-explain is read-only and very low-risk.
20
21 At the time of this release, we know of no bugs that could cause
22 serious harm to users.
23
24 The authoritative source for updated information is always the online
25 issue tracking system. Issues that affect this tool will be marked as
26 such. You can see a list of such issues at the following URL:
27 http://www.maatkit.org/bugs/mk-visual-explain
28 <http://www.maatkit.org/bugs/mk-visual-explain>.
29
30 See also "BUGS" for more information on filing bugs and getting help.
31
33 mk-visual-explain reverse-engineers MySQL's EXPLAIN output into a query
34 execution plan, which it then formats as a left-deep tree -- the same
35 way the plan is represented inside MySQL. It is possible to do this by
36 hand, or to read EXPLAIN's output directly, but it requires patience
37 and expertise. Many people find a tree representation more
38 understandable.
39
40 You can pipe input into mk-visual-explain or specify a filename at the
41 command line, including the magical '-' filename, which will read from
42 standard input. It can do two things with the input: parse it for
43 something that looks like EXPLAIN output, or connect to a MySQL
44 instance and run EXPLAIN on the input.
45
46 When parsing its input, mk-visual-explain understands three formats:
47 tabular like that shown in the mysql command-line client, vertical like
48 that created by using the \G line terminator in the mysql command-line
49 client, and tab separated. It ignores any lines it doesn't know how to
50 parse.
51
52 When executing the input, mk-visual-explain replaces everything in the
53 input up to the first SELECT keyword with 'EXPLAIN SELECT,' and then
54 executes the result. You must specify "--connect" to execute the input
55 as a query.
56
57 Either way, it builds a tree from the result set and prints it to
58 standard output. For the following query,
59
60 select * from sakila.film_actor join sakila.film using(film_id);
61
62 mk-visual-explain generates this query plan:
63
64 JOIN
65 +- Bookmark lookup
66 | +- Table
67 | | table film_actor
68 | | possible_keys idx_fk_film_id
69 | +- Index lookup
70 | key film_actor->idx_fk_film_id
71 | possible_keys idx_fk_film_id
72 | key_len 2
73 | ref sakila.film.film_id
74 | rows 2
75 +- Table scan
76 rows 952
77 +- Table
78 table film
79 possible_keys PRIMARY
80
81 The query plan is left-deep, depth-first search, and the tree's root is
82 the output node -- the last step in the execution plan. In other
83 words, read it like this:
84
85 1. Table scan the 'film' table, which accesses an estimated 952 rows.
86
87 2. For each row, find matching rows by doing an index lookup into the
88 film_actor->idx_fk_film_id index with the value from
89 sakila.film.film_id, then a bookmark lookup into the film_actor
90 table.
91
92 For more information on how to read EXPLAIN output, please see
93 <http://dev.mysql.com/doc/en/explain.html>, and this talk titled "Query
94 Optimizer Internals and What's New in the MySQL 5.2 Optimizer," from
95 Timour Katchaounov, one of the MySQL developers:
96 <http://maatkit.org/presentations/katchaounov_timour.pdf>.
97
99 This program is actually a runnable module, not just an ordinary Perl
100 script. In fact, there are two modules embedded in it. This makes
101 unit testing easy, but it also makes it easy for you to use the parsing
102 and tree-building functionality if you want.
103
104 The ExplainParser package accepts a string and parses whatever it
105 thinks looks like EXPLAIN output from it. The synopsis is as follows:
106
107 require "mk-visual-explain";
108 my $p = ExplainParser->new();
109 my $rows = $p->parse("some text");
110 # $rows is an arrayref of hashrefs.
111
112 The ExplainTree package accepts a set of rows and turns it into a tree.
113 For convenience, you can also have it delegate to ExplainParser and
114 parse text for you. Here's the synopsis:
115
116 require "mk-visual-explain";
117 my $e = ExplainTree->new();
118 my $tree = $e->parse("some text", \%options);
119 my $output = $e->pretty_print($tree);
120 print $tree;
121
123 This section explains the algorithm that converts EXPLAIN into a tree.
124 You may be interested in reading this if you want to understand EXPLAIN
125 more fully, or trying to figure out how this works, but otherwise this
126 section will probably not make your life richer.
127
128 The tree can be built by examining the id, select_type, and table
129 columns of each row. Here's what I know about them:
130
131 The id column is the sequential number of the select. This does not
132 indicate nesting; it just comes from counting SELECT from the left of
133 the SQL statement. It's like capturing parentheses in a regular
134 expression. A UNION RESULT row doesn't have an id, because it isn't a
135 SELECT. The source code actually refers to UNIONs as a fake_lex, as I
136 recall.
137
138 If two adjacent rows have the same id value, they are joined with the
139 standard single-sweep multi-join method.
140
141 The select_type column tells a) that a new sub-scope has opened b) what
142 kind of relationship the row has to the previous row c) what kind of
143 operation the row represents.
144
145 · SIMPLE means there are no subqueries or unions in the whole query.
146
147 · PRIMARY means there are, but this is the outermost SELECT.
148
149 · [DEPENDENT] UNION means this result is UNIONed with the previous
150 result (not row; a result might encompass more than one row).
151
152 · UNION RESULT terminates a set of UNIONed results.
153
154 · [DEPENDENT|UNCACHEABLE] SUBQUERY means a new sub-scope is opening.
155 This is the kind of subquery that happens in a WHERE clause, SELECT
156 list or whatnot; it does not return a so-called "derived table."
157
158 · DERIVED is a subquery in the FROM clause.
159
160 Tables that are JOINed all have the same select_type. For example, if
161 you JOIN three tables inside a dependent subquery, they'll all say the
162 same thing: DEPENDENT SUBQUERY.
163
164 The table column usually specifies the table name or alias, but may
165 also say <derivedN> or <unionN,N...N>. If it says <derivedN>, the row
166 represents an access to the temporary table that holds the result of
167 the subquery whose id is N. If it says <unionN,..N> it's the same
168 thing, but it refers to the results it UNIONs together.
169
170 Finally, order matters. If a row's id is less than the one before it,
171 I think that means it is dependent on something other than the one
172 before it. For example,
173
174 explain select
175 (select 1 from sakila.film),
176 (select 2 from sakila.film_actor),
177 (select 3 from sakila.actor);
178
179 | id | select_type | table |
180 +----+-------------+------------+
181 | 1 | PRIMARY | NULL |
182 | 4 | SUBQUERY | actor |
183 | 3 | SUBQUERY | film_actor |
184 | 2 | SUBQUERY | film |
185
186 If the results were in order 2-3-4, I think that would mean 3 is a
187 subquery of 2, 4 is a subquery of 3. As it is, this means 4 is a
188 subquery of the neares previous recent row with a smaller id, which is
189 1. Likewise for 3 and 2.
190
191 This structure is hard to programatically build into a tree for the
192 same reason it's hard to understand by inspection: there are both
193 forward and backward references. <derivedN> is a forward reference to
194 selectN, while <unionM,N> is a backward reference to selectM and
195 selectN. That makes recursion and other tree-building algorithms hard
196 to get right (NOTE: after implementation, I now see how it would be
197 possible to deal with both forward and backward references, but I have
198 no motivation to change something that works). Consider the following:
199
200 select * from (
201 select 1 from sakila.actor as actor_1
202 union
203 select 1 from sakila.actor as actor_2
204 ) as der_1
205 union
206 select * from (
207 select 1 from sakila.actor as actor_3
208 union all
209 select 1 from sakila.actor as actor_4
210 ) as der_2;
211
212 | id | select_type | table |
213 +------+--------------+------------+
214 | 1 | PRIMARY | <derived2> |
215 | 2 | DERIVED | actor_1 |
216 | 3 | UNION | actor_2 |
217 | NULL | UNION RESULT | <union2,3> |
218 | 4 | UNION | <derived5> |
219 | 5 | DERIVED | actor_3 |
220 | 6 | UNION | actor_4 |
221 | NULL | UNION RESULT | <union5,6> |
222 | NULL | UNION RESULT | <union1,4> |
223
224 This would be a lot easier to work with if it looked like this (I've
225 bracketed the id on rows I moved):
226
227 | id | select_type | table |
228 +------+--------------+------------+
229 | [1] | UNION RESULT | <union1,4> |
230 | 1 | PRIMARY | <derived2> |
231 | [2] | UNION RESULT | <union2,3> |
232 | 2 | DERIVED | actor_1 |
233 | 3 | UNION | actor_2 |
234 | 4 | UNION | <derived5> |
235 | [5] | UNION RESULT | <union5,6> |
236 | 5 | DERIVED | actor_3 |
237 | 6 | UNION | actor_4 |
238
239 In fact, why not re-number all the ids, so the PRIMARY row becomes 2,
240 and so on? That would make it even easier to read. Unfortunately that
241 would also have the effect of destroying the meaning of the id column,
242 which I think is important to preserve in the final tree. Also, though
243 it makes it easier to read, it doesn't make it easier to manipulate
244 programmatically; so it's fine to leave them numbered as they are.
245
246 The goal of re-ordering is to make it easier to figure out which rows
247 are children of which rows in the execution plan. Given the reordered
248 list and some row whose table is <union...> or <derived>, it is easy to
249 find the beginning of the slice of rows that should be child nodes in
250 the tree: you just look for the first row whose ID is the same as the
251 first number in the table.
252
253 The next question is how to find the last row that should be a child
254 node of a UNION or DERIVED. I'll start with DERIVED, because the
255 solution makes UNION easy.
256
257 Consider how MySQL numbers the SELECTs sequentially according to their
258 position in the SQL, left-to-right. Since a DERIVED table encloses
259 everything within it in a scope, which becomes a temporary table, there
260 are only two things to think about: its child subqueries and unions (if
261 any), and its next siblings in the scope that encloses it. Its
262 children will all have an id greater than it does, by definition, so
263 any later rows with a smaller id terminate the scope.
264
265 Here's an example. The middle derived table here has a subquery and a
266 UNION to make it a little more complex for the example.
267
268 explain select 1
269 from (
270 select film_id from sakila.film limit 1
271 ) as der_1
272 join (
273 select film_id, actor_id, (select count(*) from sakila.rental) as r
274 from sakila.film_actor limit 1
275 union all
276 select 1, 1, 1 from sakila.film_actor as dummy
277 ) as der_2 using (film_id)
278 join (
279 select actor_id from sakila.actor limit 1
280 ) as der_3 using (actor_id);
281
282 Here's the output of EXPLAIN:
283
284 | id | select_type | table |
285 | 1 | PRIMARY | <derived2> |
286 | 1 | PRIMARY | <derived6> |
287 | 1 | PRIMARY | <derived3> |
288 | 6 | DERIVED | actor |
289 | 3 | DERIVED | film_actor |
290 | 4 | SUBQUERY | rental |
291 | 5 | UNION | dummy |
292 | NULL | UNION RESULT | <union3,5> |
293 | 2 | DERIVED | film |
294
295 The siblings all have id 1, and the middle one I care about is
296 derived3. (Notice MySQL doesn't execute them in the order I defined
297 them, which is fine). Now notice that MySQL prints out the rows in the
298 opposite order I defined the subqueries: 6, 3, 2. It always seems to
299 do this, and there might be other methods of finding the scope
300 boundaries including looking for the lower boundary of the next largest
301 sibling, but this is a good enough heuristic. I am forced to rely on
302 it for non-DERIVED subqueries, so I rely on it here too. Therefore, I
303 decide that everything greater than or equal to 3 belongs to the
304 DERIVED scope.
305
306 The rule for UNION is simple: they consume the entire enclosing scope,
307 and to find the component parts of each one, you find each part's
308 beginning as referred to in the <unionN,...> definition, and its end is
309 either just before the next one, or if it's the last part, the end is
310 the end of the scope.
311
312 This is only simple because UNION consumes the entire scope, which is
313 either the entire statement, or the scope of a DERIVED table. This is
314 because a UNION cannot be a sibling of another UNION or a table,
315 DERIVED or not. (Try writing such a statement if you don't see it
316 intuitively). Therefore, you can just find the enclosing scope's
317 boundaries, and the rest is easy. Notice in the example above, the
318 UNION is over <union3,5>, which includes the row with id 4 -- it
319 includes every row between 3 and 5.
320
321 Finally, there are non-derived subqueries to deal with as well. In
322 this case I can't look at siblings to find the end of the scope as I
323 did for DERIVED. I have to trust that MySQL executes depth-first.
324 Here's an example:
325
326 explain
327 select actor_id,
328 (
329 select count(film_id)
330 + (select count(*) from sakila.film)
331 from sakila.film join sakila.film_actor using(film_id)
332 where exists(
333 select * from sakila.actor
334 where sakila.actor.actor_id = sakila.film_actor.actor_id
335 )
336 )
337 from sakila.actor;
338
339 | id | select_type | table |
340 | 1 | PRIMARY | actor |
341 | 2 | SUBQUERY | film |
342 | 2 | SUBQUERY | film_actor |
343 | 4 | DEPENDENT SUBQUERY | actor |
344 | 3 | SUBQUERY | film |
345
346 In order, the tree should be built like this:
347
348 · See row 1.
349
350 · See row 2. It's a higher id than 1, so it's a subquery, along with
351 every other row whose id is greater than 2.
352
353 · Inside this scope, see 2 and 2 and JOIN them. See 4. It's a
354 higher id than 2, so it's again a subquery; recurse. After that,
355 see 3, which is also higher; recurse.
356
357 But the only reason the nested subquery didn't include select 3 is
358 because select 4 came first. In other words, if EXPLAIN looked like
359 this,
360
361 | id | select_type | table |
362 | 1 | PRIMARY | actor |
363 | 2 | SUBQUERY | film |
364 | 2 | SUBQUERY | film_actor |
365 | 3 | SUBQUERY | film |
366 | 4 | DEPENDENT SUBQUERY | actor |
367
368 I would be forced to assume upon seeing select 3 that select 4 is a
369 subquery of it, rather than just being the next sibling in the
370 enclosing scope. If this is ever wrong, then the algorithm is wrong,
371 and I don't see what could be done about it.
372
373 UNION is a little more complicated than just "the entire scope is a
374 UNION," because the UNION might itself be inside an enclosing scope
375 that's only indicated by the first item inside the UNION. There are
376 only three kinds of enclosing scopes: UNION, DERIVED, and SUBQUERY. A
377 UNION can't enclose a UNION, and a DERIVED has its own "scope markers,"
378 but a SUBQUERY can wholly enclose a UNION, like this strange example on
379 the empty table t1:
380
381 explain select * from t1 where not exists(
382 (select t11.i from t1 t11) union (select t12.i from t1 t12));
383
384 | id | select_type | table | Extra |
385 +------+--------------+------------+--------------------------------+
386 | 1 | PRIMARY | t1 | const row not found |
387 | 2 | SUBQUERY | NULL | No tables used |
388 | 3 | SUBQUERY | NULL | no matching row in const table |
389 | 4 | UNION | t12 | const row not found |
390 | NULL | UNION RESULT | <union2,4> | |
391
392 The UNION's backward references might make it look like the UNION
393 encloses the subquery, but studying the query makes it clear this isn't
394 the case. So when a UNION's first row says SUBQUERY, it is this
395 special case.
396
397 By the way, I don't fully understand this query plan; there are 4
398 numbered SELECT in the plan, but only 3 in the query. The parens
399 around the UNIONs are meaningful. Removing them will make the EXPLAIN
400 different. Please tell me how and why this works if you know.
401
402 Armed with this knowledge, it's possible to use recursion to turn the
403 parent-child relationship between all the rows into a tree representing
404 the execution plan.
405
406 MySQL prints the rows in execution order, even the forward and backward
407 references. At any given scope, the rows are processed as a left-deep
408 tree. MySQL does not do "bushy" execution plans. It begins with a
409 table, finds a matching row in the next table, and continues till the
410 last table, when it emits a row. When it runs out, it backtracks till
411 it can find the next row and repeats. There are subtleties of course,
412 but this is the basic plan. This is why MySQL transforms all RIGHT
413 OUTER JOINs into LEFT OUTER JOINs and cannot do FULL OUTER JOIN.
414
415 This means in any given scope, say
416
417 | id | select_type | table |
418 | 1 | SIMPLE | tbl1 |
419 | 1 | SIMPLE | tbl2 |
420 | 1 | SIMPLE | tbl3 |
421
422 The execution plan looks like a depth-first traversal of this tree:
423
424 JOIN
425 / \
426 JOIN tbl3
427 / \
428 tbl1 tbl2
429
430 The JOIN might not be a JOIN. It might be a subquery, for example.
431 This comes from the type column of EXPLAIN. The documentation says
432 this is a "join type," but I think "access type" is more accurate,
433 because it's "how MySQL accesses rows."
434
435 mk-visual-explain decorates the tree significantly more than just
436 turning rows into nodes. Each node may get a series of transformations
437 that turn it into a subtree of more than one node. For example, an
438 index scan not marked with 'Using index' must do a bookmark lookup into
439 the table rows; that is a three-node subtree. However, after the above
440 node-ordering and scoping stuff, the rest of the process is pretty
441 simple.
442
444 --ask-pass
445 Prompt for a password when connecting to MySQL.
446
447 --charset
448 short form: -A; type: string
449
450 Default character set. If the value is utf8, sets Perl's binmode
451 on STDOUT to utf8, passes the mysql_enable_utf8 option to
452 DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any
453 other value sets binmode on STDOUT without the utf8 layer, and runs
454 SET NAMES after connecting to MySQL.
455
456 --clustered-pk
457 Assume that PRIMARY KEY index accesses don't need to do a bookmark
458 lookup to retrieve rows. This is the case for InnoDB.
459
460 --config
461 type: Array
462
463 Read this comma-separated list of config files; if specified, this
464 must be the first option on the command line.
465
466 --connect
467 Treat input as a query, and obtain EXPLAIN output by connecting to
468 a MySQL instance and running EXPLAIN on the query. When this
469 option is given, mk-visual-explain uses the other connection-
470 specific options such as "--user" to connect to the MySQL instance.
471 If you have a .my.cnf file, it will read it, so you may not need to
472 specify any connection-specific options.
473
474 --database
475 short form: -D; type: string
476
477 Connect to this database.
478
479 --defaults-file
480 short form: -F; type: string
481
482 Only read mysql options from the given file. You must give an
483 absolute pathname.
484
485 --format
486 type: string; default: tree
487
488 Set output format.
489
490 The default is a terse pretty-printed tree. The valid values are:
491
492 value meaning
493 ===== =======
494 tree Pretty-printed terse tree.
495 dump Data::Dumper output (see L<Data::Dumper> for more).
496
497 --help
498 Show help and exit.
499
500 --host
501 short form: -h; type: string
502
503 Connect to host.
504
505 --password
506 short form: -p; type: string
507
508 Password to use when connecting.
509
510 --pid
511 type: string
512
513 Create the given PID file. The file contains the process ID of the
514 script. The PID file is removed when the script exits. Before
515 starting, the script checks if the PID file already exists. If it
516 does not, then the script creates and writes its own PID to it. If
517 it does, then the script checks the following: if the file contains
518 a PID and a process is running with that PID, then the script dies;
519 or, if there is no process running with that PID, then the script
520 overwrites the file with its own PID and starts; else, if the file
521 contains no PID, then the script dies.
522
523 --port
524 short form: -P; type: int
525
526 Port number to use for connection.
527
528 --set-vars
529 type: string; default: wait_timeout=10000
530
531 Set these MySQL variables. Immediately after connecting to MySQL,
532 this string will be appended to SET and executed.
533
534 --socket
535 short form: -S; type: string
536
537 Socket file to use for connection.
538
539 --user
540 short form: -u; type: string
541
542 User for login if not current user.
543
544 --version
545 Show version and exit.
546
548 These DSN options are used to create a DSN. Each option is given like
549 "option=value". The options are case-sensitive, so P and p are not the
550 same option. There cannot be whitespace before or after the "=" and if
551 the value contains whitespace it must be quoted. DSN options are
552 comma-separated. See the maatkit manpage for full details.
553
554 · A
555
556 dsn: charset; copy: yes
557
558 Default character set.
559
560 · D
561
562 dsn: database; copy: yes
563
564 Default database.
565
566 · F
567
568 dsn: mysql_read_default_file; copy: yes
569
570 Only read default options from the given file
571
572 · h
573
574 dsn: host; copy: yes
575
576 Connect to host.
577
578 · p
579
580 dsn: password; copy: yes
581
582 Password to use when connecting.
583
584 · P
585
586 dsn: port; copy: yes
587
588 Port number to use for connection.
589
590 · S
591
592 dsn: mysql_socket; copy: yes
593
594 Socket file to use for connection.
595
596 · u
597
598 dsn: user; copy: yes
599
600 User for login if not current user.
601
603 You can download Maatkit from Google Code at
604 <http://code.google.com/p/maatkit/>, or you can get any of the tools
605 easily with a command like the following:
606
607 wget http://www.maatkit.org/get/toolname
608 or
609 wget http://www.maatkit.org/trunk/toolname
610
611 Where "toolname" can be replaced with the name (or fragment of a name)
612 of any of the Maatkit tools. Once downloaded, they're ready to run; no
613 installation is needed. The first URL gets the latest released version
614 of the tool, and the second gets the latest trunk code from Subversion.
615
617 The environment variable "MKDEBUG" enables verbose debugging output in
618 all of the Maatkit tools:
619
620 MKDEBUG=1 mk-....
621
623 You need Perl, DBI, DBD::mysql, and some core packages that ought to be
624 installed in any reasonably new version of Perl.
625
627 For list of known bugs see
628 http://www.maatkit.org/bugs/mk-visual-explain
629 <http://www.maatkit.org/bugs/mk-visual-explain>.
630
631 Please use Google Code Issues and Groups to report bugs or request
632 support: <http://code.google.com/p/maatkit/>. You can also join
633 #maatkit on Freenode to discuss Maatkit.
634
635 Please include the complete command-line used to reproduce the problem
636 you are seeing, the version of all MySQL servers involved, the complete
637 output of the tool when run with "--version", and if possible,
638 debugging output produced by running with the "MKDEBUG=1" environment
639 variable.
640
642 This program is copyright 2007-2010 Baron Schwartz. Feedback and
643 improvements are welcome.
644
645 THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
646 WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
647 MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
648
649 This program is free software; you can redistribute it and/or modify it
650 under the terms of the GNU General Public License as published by the
651 Free Software Foundation, version 2; OR the Perl Artistic License. On
652 UNIX and similar systems, you can issue `man perlgpl' or `man
653 perlartistic' to read these licenses.
654
655 You should have received a copy of the GNU General Public License along
656 with this program; if not, write to the Free Software Foundation, Inc.,
657 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
658
660 See also mk-query-profiler.
661
663 Baron "Xaprb" Schwartz
664
666 This tool is part of Maatkit, a toolkit for power users of MySQL.
667 Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
668 primary code contributors. Both are employed by Percona. Financial
669 support for Maatkit development is primarily provided by Percona and
670 its clients.
671
673 This manual page documents Ver 1.0.22 Distrib 6839 $Revision: 6831 $.
674
675
676
677perl v5.12.1 2010-08-01 MK-VISUAL-EXPLAIN(1)