1MK-SLAVE-PREFETCH(1) User Contributed Perl Documentation MK-SLAVE-PREFETCH(1)
2
3
4
6 mk-slave-prefetch - Pipeline relay logs on a MySQL slave to pre-warm
7 caches.
8
10 mk-slave-prefetch
11 mk-slave-prefetch --statistics > /path/to/saved/statistics
12 mk-slave-prefetch /path/to/saved/statistics
13
15 The following section is included to inform users about the potential
16 risks, whether known or unknown, of using this tool. The two main
17 categories of risks are those created by the nature of the tool (e.g.
18 read-only tools vs. read-write tools) and those created by bugs.
19
20 mk-slave-prefetch is read-only by default, and is generally low-risk.
21 It does execute SQL statements, but these should be SELECT only.
22 Despite this, it might be a good idea to make it connect to MySQL with
23 a user account that has minimal privileges. Here is an example of how
24 to grant the necessary privileges:
25
26 GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.*
27 TO 'prefetch'@'%' IDENTIFIED BY 'sp33dmeup!';
28
29 At the time of this release, we know of no bugs that could cause
30 serious harm to users.
31
32 The authoritative source for updated information is always the online
33 issue tracking system. Issues that affect this tool will be marked as
34 such. You can see a list of such issues at the following URL:
35 http://www.maatkit.org/bugs/mk-slave-prefetch
36 <http://www.maatkit.org/bugs/mk-slave-prefetch>.
37
38 See also "BUGS" for more information on filing bugs and getting help.
39
41 mk-slave-prefetch reads the slave's relay log slightly ahead of where
42 the slave's SQL thread is reading, converts statements into "SELECT",
43 and executes them. In theory, this should help alleviate the effects
44 of the slave's single-threaded SQL execution. It will help take
45 advantage of multiple CPUs and disks by pre-reading the data from disk,
46 so the data is already in the cache when the slave SQL thread executes
47 the un-modified version of the statement.
48
49 "mk-slave-prefetch" learns how long it takes statements to execute, and
50 doesn't try to execute those that take a very long time. You can ask
51 it to print what it has learned after it executes. You can also
52 specify a filename on the command line. The file should contain the
53 statistics printed by a previous run. These will be used to pre-
54 populate the statistics so it doesn't have to re-learn.
55
56 This program is based on concepts I heard Paul Tuckfield explain at the
57 November 2006 MySQL Camp un-conference. However, the code is my own
58 work. I have not seen any other implementation of Paul's idea.
59
61 Does it work? Does it actually speed up the slave?
62
63 That depends on your workload, hardware, and other factors. It might
64 work when the following are true:
65
66 · The slave's data is much larger than memory, and the workload is
67 mostly randomly scattered small (single-row is ideal) changes.
68
69 · There are lots of high-concurrency "UPDATE" and "DELETE" statements
70 on the master.
71
72 · The slave SQL thread is I/O-bound, but the slave overall has plenty
73 of spare I/O capacity (definitely more than one disk spindle).
74
75 · The slave uses InnoDB or another storage engine with row-level
76 locking.
77
78 It does not speed up replication on my slaves, which mostly have large
79 queries like "INSERT .. SELECT .. GROUP BY". In my benchmarks it
80 seemed to make no difference at all, positive or negative.
81
82 On the wrong workload or slave configuration, this technique might
83 actually make the slaves slower. Your mileage will vary.
84
85 User-contributed benchmarks are welcome.
86
88 Specify at least one of "--print", "--execute" or "--stop".
89
90 --ask-pass
91 Prompt for a password when connecting to MySQL.
92
93 --charset
94 short form: -A; type: string
95
96 Default character set. If the value is utf8, sets Perl's binmode
97 on STDOUT to utf8, passes the mysql_enable_utf8 option to
98 DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any
99 other value sets binmode on STDOUT without the utf8 layer, and runs
100 SET NAMES after connecting to MySQL.
101
102 --check-interval
103 type: Array; default: 16,1,1024
104
105 How often to check the slave: init,min,max. This many relay log
106 events should pass before checking the output of "SHOW SLAVE
107 STATUS". The syntax is a three-number range: initial, minimum, and
108 maximum. You should be able to leave this at the defaults.
109
110 "mk-slave-prefetch" varies the check interval in powers of two,
111 depending on whether it decides the check was necessary.
112
113 --config
114 type: Array
115
116 Read this comma-separated list of config files; if specified, this
117 must be the first option on the command line.
118
119 --[no]continue-on-error
120 default: yes
121
122 Continue parsing even if there is an error.
123
124 --daemonize
125 Fork to the background and detach from the shell. POSIX operating
126 systems only.
127
128 --database
129 short form: -D; type: string
130
131 The database to use for the connection. The initial connection
132 will be to this database, but mk-slave-prefetch will issue "USE"
133 statements as required by the binary log events.
134
135 This database is also used as the default database for
136 "--secondary-indexes" if the database cannot automatically be
137 determined from the query.
138
139 --defaults-file
140 short form: -F; type: string
141
142 Only read mysql options from the given file. You must give an
143 absolute pathname.
144
145 --dry-run
146 Ignore replication checks and just read and rewrite the relay log
147 events.
148
149 This option make mk-slave-prefetch ignore all relay log related
150 checks for position, slave lag, etc. and simply causes the tool to
151 read and rewrite all the events in the relay log. A connection to
152 the slave server is still required.
153
154 --errors
155 cumulative: yes
156
157 Print queries that caused errors. If specified once, at exit; if
158 twice, in realtime.
159
160 If you specify this option once, you will see a report at the end
161 of the script execution, showing the normalized queries and the
162 number of times they were seen. If you specify this option twice,
163 you will see the errors printed out as they occur, but no
164 normalized report at the end of execution.
165
166 --execute
167 Execute the transformed queries to warm the caches.
168
169 --help
170 Show help and exit.
171
172 --host
173 short form: -h; type: string
174
175 Connect to host.
176
177 --[no]inject-columns
178 default: yes
179
180 Inject "(columns)" into INSERT/REPLACE that don't specify them.
181
182 Normally this query cannot be rewritten because mk-slave-prefetch
183 doesn't know which columns the values refer to: "INSERT INTO tbl
184 VALUES (1,2)". This option causes mk-slave-prefetch to "SHOW
185 CREATE TABLE" the table from the query, get its columns and inject
186 these columns into the query, like: "INSERT INTO tbl
187 (`col1`,`col2`) VALUES (1,2)". This allows the query to be written
188 as a SELECT and prefetched.
189
190 Columns for each unique database.table are cached, so this
191 operation may fail if an "ALTER TABLE" statement changes the order
192 or name of any columns.
193
194 --io-lag
195 type: size; default: 1k
196
197 How many bytes to lag the slave I/O thread. This helps avoid
198 "mysqlbinlog" reading right off the end of the relay log file.
199
200 --log
201 type: string
202
203 Print all output to this file when daemonized.
204
205 --max-query-time
206 type: float; default: 1
207
208 Do not run queries longer than this many seconds; fractions
209 allowed. If "mk-slave-prefetch" predicts the query will take
210 longer to execute, it will skip the query. This is based on the
211 theory that pre-warming the cache is most beneficial for short
212 queries.
213
214 "mk-slave-prefetch" learns how long queries require to execute. It
215 keeps an average over the last "--query-sample-size" samples of
216 each query. The averages are based on an abstracted version of the
217 query, with specific parameters replaced by placeholders. The
218 result is a sort of "fingerprint" for the query, not executable
219 SQL. You can see the learned statistics with the "--statistics"
220 option.
221
222 You can pre-load query fingerprints, and average execution times,
223 from a file. This way you don't have to wait for
224 "mk-slave-prefetch" to learn all over every time you start it.
225 Just specify the file on the command line. The format should be
226 the same as the output from "--statistics".
227
228 You might also want to filter out some statements completely, or
229 let only some statements through. See the "--reject-regexp" and
230 "--permit-regexp" options.
231
232 If "mk-slave-prefetch" hasn't seen a query's fingerprint before,
233 and thus doesn't know how long it will take to execute, it wraps it
234 in a subuery, like this:
235
236 SELECT 1 FROM ( <query> ) AS X LIMIT 1;
237
238 This helps avoid fetching a lot of data back to the client when a
239 query is very large. It requires a version of MySQL that supports
240 subqueries (version 4.1 and newer). If yours doesn't, the subquery
241 trick can't be used, so the query might fetch a lot of data back to
242 the client.
243
244 Once a query's fingerprint has been seen, so it's known that the
245 query isn't enormously slow, "mk-slave-prefetch" just rewrites the
246 "SELECT" list for efficiency. (Avoiding the subquery reduces the
247 query's overhead for short queries). The rewritten query will then
248 look like the following;
249
250 SELECT ISNULL(COALESCE(<columns>)) FROM ...
251
252 --num-prefix
253 Abstract away numeric table name prefixes. This causes the
254 following two queries to "fingeprint" to the same thing:
255
256 select from 1_2_users;
257 select from 2_3_users;
258
259 --offset
260 type: size; default: 128
261
262 How many bytes "mk-slave-prefetch" will try to stay in front of the
263 slave SQL thread. It will not execute log events it doesn't think
264 are at least this far ahead of the SQL thread. See also
265 "--window".
266
267 --password
268 short form: -p; type: string
269
270 Password to use when connecting.
271
272 --permit-regexp
273 type: string
274
275 Permit queries matching this Perl regexp. This is a filter for log
276 events. The regular expression is matched against the raw log
277 event, before any transformations are applied. If specified, this
278 option will permit only log events matching the regular expression.
279
280 --pid
281 type: string
282
283 Create the given PID file when daemonized. The file contains the
284 process ID of the daemonized instance. The PID file is removed
285 when the daemonized instance exits. The program checks for the
286 existence of the PID file when starting; if it exists and the
287 process with the matching PID exists, the program exits.
288
289 --port
290 short form: -P; type: int
291
292 Port number to use for connection.
293
294 --print
295 Print the transformed relay log events to standard output.
296
297 --print-nonrewritten
298 Print queries that could not be transformed into "SELECT".
299
300 --progress
301 type: int
302
303 Print progress information every X events. The information is the
304 current log file and position, plus a summary of the statistics
305 gathered.
306
307 --query-sample-size
308 type: int; default: 4
309
310 Average query exec time over this many queries. The last "N"
311 queries with a given fingerprint are averaged together to get the
312 average query execution time (see "--max-query-time").
313
314 --reject-regexp
315 type: string
316
317 Reject queries matching this Perl regexp. Similar to
318 "--permit-regexp", but has the opposite effect: log events must not
319 match the regular expression.
320
321 --relay-log
322 type: string
323
324 Read only the specified relay log file; - to read from STDIN.
325
326 By default mk-slave-prefetch reads the "Relay_Log_File" reported by
327 "SHOW SLAVE STATUS". This option allows you to specify a relay log
328 file that has already be converted to text by "mysqlbinlog". The
329 tool will exit after reading and parsing this file.
330
331 This option is useful with "--dry-run" and "--print" if you want to
332 see how the tool would rewrite the relay log's events without
333 executing them or having to wait for a lagged slave.
334
335 --relay-log-dir
336 type: string
337
338 Open the slave's "Relay_Log_File" relative to this directory.
339
340 Unless this option is specified, mk-slave-prefetch automatically
341 determines the directory that relay logs are in by first looking at
342 the "relay_log" system variable to see if it specifies a path. If
343 it does, this path is used; if it does not, then the "datadir"
344 variable value is used.
345
346 This option is ignored if an explicit "--relay-log" is specified.
347
348 --run-time
349 type: time
350
351 How long "mk-slave-prefetch" should run before exiting. The
352 default is to run forever.
353
354 --secondary-indexes
355 Prefetch secondary indexes for pipelined queries.
356
357 --sentinel
358 type: string; default: /tmp/mk-slave-prefetch-sentinel
359
360 Exit if this file exists.
361
362 --set-vars
363 type: string; default: wait_timeout=10000
364
365 Set these MySQL variables. Immediately after connecting to MySQL,
366 this string will be appended to SET and executed.
367
368 --sleep
369 type: time; default: 1s
370
371 Sleep time before checking for new events.
372
373 When mk-slave-prefetch is done reading all the events in a relay
374 log, it sleeps this amount of time before checking for new events.
375
376 This option is automatically set to zero if both "--relay-log" and
377 "--dry-run" are specified.
378
379 --socket
380 short form: -S; type: string
381
382 Socket file to use for connection.
383
384 --statistics
385 Print execution statistics after exiting. The statistics are in
386 two sections: counters, and queries. The counters simply count the
387 number of times events occur. You may see the following counters:
388
389 NAME MEANING
390 ====================== =======================================
391 mysqlbinlog Executed mysqlbinlog to read log events.
392 events The total number of relay log events.
393 not_far_enough_ahead An event was not at least L<"--offset">
394 bytes ahead of the SQL thread.
395 too_far_ahead An event was more than L<"--offset">
396 + L<"--window"> bytes ahead of the SQL thread.
397 too_close_to_io_thread An event was less than L<"--io-lag"> bytes
398 away from the I/O thread's position.
399 event_not_allowed An event wasn't a SET, USE, INSERT,
400 UPDATE, DELETE or REPLACE query.
401 event_filtered_out An event was filtered out because of
402 L<"--permit-regexp"> or L<"--reject-regexp">.
403 same_timestamp A SET TIMESTAMP event was ignored because
404 it had the same timestamp as the last one.
405 do_query A transformed event was executed
406 or printed.
407 query_error An executed query had an error.
408 query_too_long An event was not executed because its
409 average query length exceeded
410 L<"--max-query-time">.
411 query_not_rewritten A query could not be rewritten to a
412 SELECT.
413 master_pos_wait The tool waited for the SQL thread to
414 catch up.
415 show_slave_status The tool queried SHOW SLAVE STATUS.
416 load_data_infile The tool found a LOAD DATA INFILE query
417 and unlinked (deleted) the temp file.
418 could_not_unlink The tool failed to unlink a temp file.
419 sleep The tool slept for a second because the
420 slave's SQL thread was not running, or
421 because it read past the end of the log.
422
423 After the counters, "mk-slave-prefetch" prints information about
424 each query fingerprint it has seen, two lines per fingerprint. The
425 first line contains the query's fingerprint. The second line
426 contains the number of times the fingerprint was seen, number of
427 times executed, the sum of the execution times, and the average
428 execution time over the last "--query-sample-size" samples.
429
430 --stop
431 Stop running instances by creating the "--sentinel" file.
432
433 --threads
434 type: int; default: 2
435
436 Number of concurrent threads to use for pipelining queries.
437
438 --tmpdir
439 type: string; default: /dev/null
440
441 Where to create temp files for "LOAD DATA INFILE" queries. The
442 default will cause "mysqlbinlog" to skip the file and the
443 associated "LOAD DATA INFILE" command entirely.
444
445 If "mk-slave-prefetch" sees a "LOAD DATA INFILE" command (which it
446 won't if this is left at the default), it will try to remove the
447 temporary file, then skip the event.
448
449 --user
450 short form: -u; type: string
451
452 User for login if not current user.
453
454 --version
455 Show version and exit.
456
457 --window
458 type: size; default: 4k
459
460 The max bytes ahead of the slave "mk-slave-prefetch" should get.
461 Defines the window within which "mk-slave-prefetch" considers a
462 query OK to execute. The window begins at the slave SQL thread's
463 last known position plus "--offset" bytes, and extends for the
464 specified number of bytes.
465
466 If "mk-slave-prefetch" sees a log event that is too far in the
467 future, it will increment the "too_far_ahead" counter and wait for
468 the slave SQL thread to catch up (which increments the
469 "master_pos_wait" counter). If an event isn't far enough ahead of
470 the SQL thread, it will be discarded and the "not_far_enough_ahead"
471 counter increments.
472
473 Watching the mentioned statistics can help you understand how to
474 tune the window. You want "mk-slave-prefetch" to run just ahead of
475 the SQL thread, not throwing out a lot of events for being too far
476 ahead or not far enough ahead.
477
479 These DSN options are used to create a DSN. Each option is given like
480 "option=value". The options are case-sensitive, so P and p are not the
481 same option. There cannot be whitespace before or after the "=" and if
482 the value contains whitespace it must be quoted. DSN options are
483 comma-separated. See the maatkit manpage for full details.
484
485 · A
486
487 dsn: charset; copy: yes
488
489 Default character set.
490
491 · D
492
493 dsn: database; copy: yes
494
495 Default database.
496
497 · F
498
499 dsn: mysql_read_default_file; copy: yes
500
501 Only read default options from the given file
502
503 · h
504
505 dsn: host; copy: yes
506
507 Connect to host.
508
509 · p
510
511 dsn: password; copy: yes
512
513 Password to use when connecting.
514
515 · P
516
517 dsn: port; copy: yes
518
519 Port number to use for connection.
520
521 · S
522
523 dsn: mysql_socket; copy: yes
524
525 Socket file to use for connection.
526
527 · u
528
529 dsn: user; copy: yes
530
531 User for login if not current user.
532
534 You can download Maatkit from Google Code at
535 <http://code.google.com/p/maatkit/>, or you can get any of the tools
536 easily with a command like the following:
537
538 wget http://www.maatkit.org/get/toolname
539 or
540 wget http://www.maatkit.org/trunk/toolname
541
542 Where "toolname" can be replaced with the name (or fragment of a name)
543 of any of the Maatkit tools. Once downloaded, they're ready to run; no
544 installation is needed. The first URL gets the latest released version
545 of the tool, and the second gets the latest trunk code from Subversion.
546
548 The environment variable "MKDEBUG" enables verbose debugging output in
549 all of the Maatkit tools:
550
551 MKDEBUG=1 mk-....
552
554 You need Perl, DBI, DBD::mysql, and some core packages that ought to be
555 installed in any reasonably new version of Perl.
556
558 For list of known bugs see
559 http://www.maatkit.org/bugs/mk-slave-prefetch
560 <http://www.maatkit.org/bugs/mk-slave-prefetch>.
561
562 Please use Google Code Issues and Groups to report bugs or request
563 support: <http://code.google.com/p/maatkit/>. You can also join
564 #maatkit on Freenode to discuss Maatkit.
565
566 Please include the complete command-line used to reproduce the problem
567 you are seeing, the version of all MySQL servers involved, the complete
568 output of the tool when run with "--version", and if possible,
569 debugging output produced by running with the "MKDEBUG=1" environment
570 variable.
571
573 This program is copyright 2007-2010 Baron Schwartz. Feedback and
574 improvements are welcome.
575
576 THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
577 WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
578 MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
579
580 This program is free software; you can redistribute it and/or modify it
581 under the terms of the GNU General Public License as published by the
582 Free Software Foundation, version 2; OR the Perl Artistic License. On
583 UNIX and similar systems, you can issue `man perlgpl' or `man
584 perlartistic' to read these licenses.
585
586 You should have received a copy of the GNU General Public License along
587 with this program; if not, write to the Free Software Foundation, Inc.,
588 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
589
591 Baron Schwartz, Daniel Nichter
592
594 This tool is part of Maatkit, a toolkit for power users of MySQL.
595 Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
596 primary code contributors. Both are employed by Percona. Financial
597 support for Maatkit development is primarily provided by Percona and
598 its clients.
599
601 This manual page documents Ver 1.0.20 Distrib 6839 $Revision: 6831 $.
602
603
604
605perl v5.12.1 2010-08-01 MK-SLAVE-PREFETCH(1)