1MK-SLAVE-PREFETCH(1) User Contributed Perl Documentation MK-SLAVE-PREFETCH(1)
2
3
4
6 mk-slave-prefetch - Pipeline relay logs on a MySQL slave to pre-warm
7 caches.
8
10 Usage: mk-slave-prefetch [OPTION...] [FILE]
11
12 mk-slave-prefetch pipelines relay logs to pre-warm the slave's caches.
13
14 Example:
15
16 mk-slave-prefetch
17
18 mk-slave-prefetch --statistics > /path/to/saved/statistics
19
20 mk-slave-prefetch /path/to/saved/statistics
21
23 The following section is included to inform users about the potential
24 risks, whether known or unknown, of using this tool. The two main
25 categories of risks are those created by the nature of the tool (e.g.
26 read-only tools vs. read-write tools) and those created by bugs.
27
28 mk-slave-prefetch is read-only by default, and is generally low-risk.
29 It does execute SQL statements, but these should be SELECT only.
30 Despite this, it might be a good idea to make it connect to MySQL with
31 a user account that has minimal privileges. Here is an example of how
32 to grant the necessary privileges:
33
34 GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.*
35 TO 'prefetch'@'%' IDENTIFIED BY 'sp33dmeup!';
36
37 At the time of this release, we know of no bugs that could cause
38 serious harm to users.
39
40 The authoritative source for updated information is always the online
41 issue tracking system. Issues that affect this tool will be marked as
42 such. You can see a list of such issues at the following URL:
43 <http://www.maatkit.org/bugs/mk-slave-prefetch>.
44
45 See also "BUGS" for more information on filing bugs and getting help.
46
48 mk-slave-prefetch reads the slave's relay log slightly ahead of where
49 the slave's SQL thread is reading, converts statements into "SELECT",
50 and executes them. In theory, this should help alleviate the effects
51 of the slave's single-threaded SQL execution. It will help take
52 advantage of multiple CPUs and disks by pre-reading the data from disk,
53 so the data is already in the cache when the slave SQL thread executes
54 the un-modified version of the statement.
55
56 "mk-slave-prefetch" learns how long it takes statements to execute, and
57 doesn't try to execute those that take a very long time. You can ask
58 it to print what it has learned after it executes. You can also
59 specify a filename on the command line. The file should contain the
60 statistics printed by a previous run. These will be used to pre-
61 populate the statistics so it doesn't have to re-learn.
62
63 This program is based on concepts I heard Paul Tuckfield explain at the
64 November 2006 MySQL Camp un-conference. However, the code is my own
65 work. I have not seen any other implementation of Paul's idea.
66
68 Does it work? Does it actually speed up the slave?
69
70 That depends on your workload, hardware, and other factors. It might
71 work when the following are true:
72
73 · The slave's data is much larger than memory, and the workload is
74 mostly randomly scattered small (single-row is ideal) changes.
75
76 · There are lots of high-concurrency "UPDATE" and "DELETE" statements
77 on the master.
78
79 · The slave SQL thread is I/O-bound, but the slave overall has plenty
80 of spare I/O capacity (definitely more than one disk spindle).
81
82 · The slave uses InnoDB or another storage engine with row-level
83 locking.
84
85 It does not speed up replication on my slaves, which mostly have large
86 queries like "INSERT .. SELECT .. GROUP BY". In my benchmarks it
87 seemed to make no difference at all, positive or negative.
88
89 On the wrong workload or slave configuration, this technique might
90 actually make the slaves slower. Your mileage will vary.
91
92 User-contributed benchmarks are welcome.
93
95 Specify at least one of "--print", "--execute" or "--stop".
96
97 This tool accepts additional command-line arguments. Refer to the
98 "SYNOPSIS" and usage information for details.
99
100 --ask-pass
101 Prompt for a password when connecting to MySQL.
102
103 --charset
104 short form: -A; type: string
105
106 Default character set. If the value is utf8, sets Perl's binmode
107 on STDOUT to utf8, passes the mysql_enable_utf8 option to
108 DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any
109 other value sets binmode on STDOUT without the utf8 layer, and runs
110 SET NAMES after connecting to MySQL.
111
112 --check-interval
113 type: Array; default: 16,1,1024
114
115 How often to check the slave: init,min,max. This many relay log
116 events should pass before checking the output of "SHOW SLAVE
117 STATUS". The syntax is a three-number range: initial, minimum, and
118 maximum. You should be able to leave this at the defaults.
119
120 "mk-slave-prefetch" varies the check interval in powers of two,
121 depending on whether it decides the check was necessary.
122
123 --config
124 type: Array
125
126 Read this comma-separated list of config files; if specified, this
127 must be the first option on the command line.
128
129 --[no]continue-on-error
130 default: yes
131
132 Continue parsing even if there is an error.
133
134 --daemonize
135 Fork to the background and detach from the shell. POSIX operating
136 systems only.
137
138 --database
139 short form: -D; type: string
140
141 The database to use for the connection. The initial connection
142 will be to this database, but mk-slave-prefetch will issue "USE"
143 statements as required by the binary log events.
144
145 This database is also used as the default database for
146 "--secondary-indexes" if the database cannot automatically be
147 determined from the query.
148
149 --defaults-file
150 short form: -F; type: string
151
152 Only read mysql options from the given file. You must give an
153 absolute pathname.
154
155 --dry-run
156 Ignore replication checks and just read and rewrite the relay log
157 events.
158
159 This option make mk-slave-prefetch ignore all relay log related
160 checks for position, slave lag, etc. and simply causes the tool to
161 read and rewrite all the events in the relay log. A connection to
162 the slave server is still required.
163
164 --errors
165 cumulative: yes
166
167 Print queries that caused errors. If specified once, at exit; if
168 twice, in realtime.
169
170 If you specify this option once, you will see a report at the end
171 of the script execution, showing the normalized queries and the
172 number of times they were seen. If you specify this option twice,
173 you will see the errors printed out as they occur, but no
174 normalized report at the end of execution.
175
176 --execute
177 Execute the transformed queries to warm the caches.
178
179 --help
180 Show help and exit.
181
182 --host
183 short form: -h; type: string
184
185 Connect to host.
186
187 --[no]inject-columns
188 default: yes
189
190 Inject "(columns)" into INSERT/REPLACE that don't specify them.
191
192 Normally this query cannot be rewritten because mk-slave-prefetch
193 doesn't know which columns the values refer to: "INSERT INTO tbl
194 VALUES (1,2)". This option causes mk-slave-prefetch to "SHOW
195 CREATE TABLE" the table from the query, get its columns and inject
196 these columns into the query, like: "INSERT INTO tbl
197 (`col1`,`col2`) VALUES (1,2)". This allows the query to be written
198 as a SELECT and prefetched.
199
200 Columns for each unique database.table are cached, so this
201 operation may fail if an "ALTER TABLE" statement changes the order
202 or name of any columns.
203
204 --io-lag
205 type: size; default: 1k
206
207 How many bytes to lag the slave I/O thread. This helps avoid
208 "mysqlbinlog" reading right off the end of the relay log file.
209
210 --log
211 type: string
212
213 Print all output to this file when daemonized.
214
215 --max-query-time
216 type: float; default: 1
217
218 Do not run queries longer than this many seconds; fractions
219 allowed. If "mk-slave-prefetch" predicts the query will take
220 longer to execute, it will skip the query. This is based on the
221 theory that pre-warming the cache is most beneficial for short
222 queries.
223
224 "mk-slave-prefetch" learns how long queries require to execute. It
225 keeps an average over the last "--query-sample-size" samples of
226 each query. The averages are based on an abstracted version of the
227 query, with specific parameters replaced by placeholders. The
228 result is a sort of "fingerprint" for the query, not executable
229 SQL. You can see the learned statistics with the "--statistics"
230 option.
231
232 You can pre-load query fingerprints, and average execution times,
233 from a file. This way you don't have to wait for
234 "mk-slave-prefetch" to learn all over every time you start it.
235 Just specify the file on the command line. The format should be
236 the same as the output from "--statistics".
237
238 You might also want to filter out some statements completely, or
239 let only some statements through. See the "--reject-regexp" and
240 "--permit-regexp" options.
241
242 If "mk-slave-prefetch" hasn't seen a query's fingerprint before,
243 and thus doesn't know how long it will take to execute, it wraps it
244 in a subquery, like this:
245
246 SELECT 1 FROM ( <query> ) AS X LIMIT 1;
247
248 This helps avoid fetching a lot of data back to the client when a
249 query is very large. It requires a version of MySQL that supports
250 subqueries (version 4.1 and newer). If yours doesn't, the subquery
251 trick can't be used, so the query might fetch a lot of data back to
252 the client.
253
254 Once a query's fingerprint has been seen, so it's known that the
255 query isn't enormously slow, "mk-slave-prefetch" just rewrites the
256 "SELECT" list for efficiency. (Avoiding the subquery reduces the
257 query's overhead for short queries). The rewritten query will then
258 look like the following;
259
260 SELECT ISNULL(COALESCE(<columns>)) FROM ...
261
262 --num-prefix
263 Abstract away numeric table name prefixes. This causes the
264 following two queries to "fingerprint" to the same thing:
265
266 select from 1_2_users;
267 select from 2_3_users;
268
269 --offset
270 type: size; default: 128
271
272 How many bytes "mk-slave-prefetch" will try to stay in front of the
273 slave SQL thread. It will not execute log events it doesn't think
274 are at least this far ahead of the SQL thread. See also
275 "--window".
276
277 --password
278 short form: -p; type: string
279
280 Password to use when connecting.
281
282 --permit-regexp
283 type: string
284
285 Permit queries matching this Perl regexp. This is a filter for log
286 events. The regular expression is matched against the raw log
287 event, before any transformations are applied. If specified, this
288 option will permit only log events matching the regular expression.
289
290 --pid
291 type: string
292
293 Create the given PID file when daemonized. The file contains the
294 process ID of the daemonized instance. The PID file is removed
295 when the daemonized instance exits. The program checks for the
296 existence of the PID file when starting; if it exists and the
297 process with the matching PID exists, the program exits.
298
299 --port
300 short form: -P; type: int
301
302 Port number to use for connection.
303
304 --print
305 Print the transformed relay log events to standard output.
306
307 --print-nonrewritten
308 Print queries that could not be transformed into "SELECT".
309
310 --progress
311 type: int
312
313 Print progress information every X events. The information is the
314 current log file and position, plus a summary of the statistics
315 gathered.
316
317 --query-sample-size
318 type: int; default: 4
319
320 Average query exec time over this many queries. The last "N"
321 queries with a given fingerprint are averaged together to get the
322 average query execution time (see "--max-query-time").
323
324 --reject-regexp
325 type: string
326
327 Reject queries matching this Perl regexp. Similar to
328 "--permit-regexp", but has the opposite effect: log events must not
329 match the regular expression.
330
331 --relay-log
332 type: string
333
334 Read only the specified relay log file; - to read from STDIN.
335
336 By default mk-slave-prefetch reads the "Relay_Log_File" reported by
337 "SHOW SLAVE STATUS". This option allows you to specify a relay log
338 file that has already be converted to text by "mysqlbinlog". The
339 tool will exit after reading and parsing this file.
340
341 This option is useful with "--dry-run" and "--print" if you want to
342 see how the tool would rewrite the relay log's events without
343 executing them or having to wait for a lagged slave.
344
345 --relay-log-dir
346 type: string
347
348 Open the slave's "Relay_Log_File" relative to this directory.
349
350 Unless this option is specified, mk-slave-prefetch automatically
351 determines the directory that relay logs are in by first looking at
352 the "relay_log" system variable to see if it specifies a path. If
353 it does, this path is used; if it does not, then the "datadir"
354 variable value is used.
355
356 This option is ignored if an explicit "--relay-log" is specified.
357
358 --run-time
359 type: time
360
361 How long "mk-slave-prefetch" should run before exiting. The
362 default is to run forever.
363
364 --secondary-indexes
365 Prefetch secondary indexes for pipelined queries.
366
367 --sentinel
368 type: string; default: /tmp/mk-slave-prefetch-sentinel
369
370 Exit if this file exists.
371
372 --set-vars
373 type: string; default: wait_timeout=10000
374
375 Set these MySQL variables. Immediately after connecting to MySQL,
376 this string will be appended to SET and executed.
377
378 --sleep
379 type: time; default: 1s
380
381 Sleep time before checking for new events.
382
383 When mk-slave-prefetch is done reading all the events in a relay
384 log, it sleeps this amount of time before checking for new events.
385
386 This option is automatically set to zero if both "--relay-log" and
387 "--dry-run" are specified.
388
389 --socket
390 short form: -S; type: string
391
392 Socket file to use for connection.
393
394 --statistics
395 Print execution statistics after exiting. The statistics are in
396 two sections: counters, and queries. The counters simply count the
397 number of times events occur. You may see the following counters:
398
399 NAME MEANING
400 ====================== =======================================
401 mysqlbinlog Executed mysqlbinlog to read log events.
402 events The total number of relay log events.
403 not_far_enough_ahead An event was not at least L<"--offset">
404 bytes ahead of the SQL thread.
405 too_far_ahead An event was more than L<"--offset">
406 + L<"--window"> bytes ahead of the SQL thread.
407 too_close_to_io_thread An event was less than L<"--io-lag"> bytes
408 away from the I/O thread's position.
409 event_not_allowed An event wasn't a SET, USE, INSERT,
410 UPDATE, DELETE or REPLACE query.
411 event_filtered_out An event was filtered out because of
412 L<"--permit-regexp"> or L<"--reject-regexp">.
413 same_timestamp A SET TIMESTAMP event was ignored because
414 it had the same timestamp as the last one.
415 do_query A transformed event was executed
416 or printed.
417 query_error An executed query had an error.
418 query_too_long An event was not executed because its
419 average query length exceeded
420 L<"--max-query-time">.
421 query_not_rewritten A query could not be rewritten to a
422 SELECT.
423 master_pos_wait The tool waited for the SQL thread to
424 catch up.
425 show_slave_status The tool queried SHOW SLAVE STATUS.
426 load_data_infile The tool found a LOAD DATA INFILE query
427 and unlinked (deleted) the temp file.
428 could_not_unlink The tool failed to unlink a temp file.
429 sleep The tool slept for a second because the
430 slave's SQL thread was not running, or
431 because it read past the end of the log.
432
433 After the counters, "mk-slave-prefetch" prints information about
434 each query fingerprint it has seen, two lines per fingerprint. The
435 first line contains the query's fingerprint. The second line
436 contains the number of times the fingerprint was seen, number of
437 times executed, the sum of the execution times, and the average
438 execution time over the last "--query-sample-size" samples.
439
440 --stop
441 Stop running instances by creating the "--sentinel" file.
442
443 --threads
444 type: int; default: 2
445
446 Number of concurrent threads to use for pipelining queries.
447
448 --tmpdir
449 type: string; default: /dev/null
450
451 Where to create temp files for "LOAD DATA INFILE" queries. The
452 default will cause "mysqlbinlog" to skip the file and the
453 associated "LOAD DATA INFILE" command entirely.
454
455 If "mk-slave-prefetch" sees a "LOAD DATA INFILE" command (which it
456 won't if this is left at the default), it will try to remove the
457 temporary file, then skip the event.
458
459 --user
460 short form: -u; type: string
461
462 User for login if not current user.
463
464 --version
465 Show version and exit.
466
467 --window
468 type: size; default: 4k
469
470 The max bytes ahead of the slave "mk-slave-prefetch" should get.
471 Defines the window within which "mk-slave-prefetch" considers a
472 query OK to execute. The window begins at the slave SQL thread's
473 last known position plus "--offset" bytes, and extends for the
474 specified number of bytes.
475
476 If "mk-slave-prefetch" sees a log event that is too far in the
477 future, it will increment the "too_far_ahead" counter and wait for
478 the slave SQL thread to catch up (which increments the
479 "master_pos_wait" counter). If an event isn't far enough ahead of
480 the SQL thread, it will be discarded and the "not_far_enough_ahead"
481 counter increments.
482
483 Watching the mentioned statistics can help you understand how to
484 tune the window. You want "mk-slave-prefetch" to run just ahead of
485 the SQL thread, not throwing out a lot of events for being too far
486 ahead or not far enough ahead.
487
489 These DSN options are used to create a DSN. Each option is given like
490 "option=value". The options are case-sensitive, so P and p are not the
491 same option. There cannot be whitespace before or after the "=" and if
492 the value contains whitespace it must be quoted. DSN options are
493 comma-separated. See the maatkit manpage for full details.
494
495 · A
496
497 dsn: charset; copy: yes
498
499 Default character set.
500
501 · D
502
503 dsn: database; copy: yes
504
505 Default database.
506
507 · F
508
509 dsn: mysql_read_default_file; copy: yes
510
511 Only read default options from the given file
512
513 · h
514
515 dsn: host; copy: yes
516
517 Connect to host.
518
519 · p
520
521 dsn: password; copy: yes
522
523 Password to use when connecting.
524
525 · P
526
527 dsn: port; copy: yes
528
529 Port number to use for connection.
530
531 · S
532
533 dsn: mysql_socket; copy: yes
534
535 Socket file to use for connection.
536
537 · u
538
539 dsn: user; copy: yes
540
541 User for login if not current user.
542
544 You can download Maatkit from Google Code at
545 <http://code.google.com/p/maatkit/>, or you can get any of the tools
546 easily with a command like the following:
547
548 wget http://www.maatkit.org/get/toolname
549 or
550 wget http://www.maatkit.org/trunk/toolname
551
552 Where "toolname" can be replaced with the name (or fragment of a name)
553 of any of the Maatkit tools. Once downloaded, they're ready to run; no
554 installation is needed. The first URL gets the latest released version
555 of the tool, and the second gets the latest trunk code from Subversion.
556
558 The environment variable "MKDEBUG" enables verbose debugging output in
559 all of the Maatkit tools:
560
561 MKDEBUG=1 mk-....
562
564 You need Perl, DBI, DBD::mysql, and some core packages that ought to be
565 installed in any reasonably new version of Perl.
566
568 For a list of known bugs see
569 <http://www.maatkit.org/bugs/mk-slave-prefetch>.
570
571 Please use Google Code Issues and Groups to report bugs or request
572 support: <http://code.google.com/p/maatkit/>. You can also join
573 #maatkit on Freenode to discuss Maatkit.
574
575 Please include the complete command-line used to reproduce the problem
576 you are seeing, the version of all MySQL servers involved, the complete
577 output of the tool when run with "--version", and if possible,
578 debugging output produced by running with the "MKDEBUG=1" environment
579 variable.
580
582 This program is copyright 2007-2011 Baron Schwartz. Feedback and
583 improvements are welcome.
584
585 THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
586 WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
587 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
588
589 This program is free software; you can redistribute it and/or modify it
590 under the terms of the GNU General Public License as published by the
591 Free Software Foundation, version 2; OR the Perl Artistic License. On
592 UNIX and similar systems, you can issue `man perlgpl' or `man
593 perlartistic' to read these licenses.
594
595 You should have received a copy of the GNU General Public License along
596 with this program; if not, write to the Free Software Foundation, Inc.,
597 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
598
600 Baron Schwartz, Daniel Nichter
601
603 This tool is part of Maatkit, a toolkit for power users of MySQL.
604 Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
605 primary code contributors. Both are employed by Percona. Financial
606 support for Maatkit development is primarily provided by Percona and
607 its clients.
608
610 This manual page documents Ver 1.0.21 Distrib 7540 $Revision: 7531 $.
611
612
613
614perl v5.28.0 2011-06-08 MK-SLAVE-PREFETCH(1)