1MK-SLAVE-PREFETCH(1)  User Contributed Perl Documentation MK-SLAVE-PREFETCH(1)
2
3
4

NAME

6       mk-slave-prefetch - Pipeline relay logs on a MySQL slave to pre-warm
7       caches.
8

SYNOPSIS

10        mk-slave-prefetch
11        mk-slave-prefetch --statistics > /path/to/saved/statistics
12        mk-slave-prefetch /path/to/saved/statistics
13

RISKS

15       The following section is included to inform users about the potential
16       risks, whether known or unknown, of using this tool.  The two main
17       categories of risks are those created by the nature of the tool (e.g.
18       read-only tools vs. read-write tools) and those created by bugs.
19
20       mk-slave-prefetch is read-only by default, and is generally low-risk.
21       It does execute SQL statements, but these should be SELECT only.
22       Despite this, it might be a good idea to make it connect to MySQL with
23       a user account that has minimal privileges.  Here is an example of how
24       to grant the necessary privileges:
25
26          GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.*
27             TO 'prefetch'@'%' IDENTIFIED BY 'sp33dmeup!';
28
29       At the time of this release, we know of no bugs that could cause
30       serious harm to users.
31
32       The authoritative source for updated information is always the online
33       issue tracking system.  Issues that affect this tool will be marked as
34       such.  You can see a list of such issues at the following URL:
35       http://www.maatkit.org/bugs/mk-slave-prefetch
36       <http://www.maatkit.org/bugs/mk-slave-prefetch>.
37
38       See also "BUGS" for more information on filing bugs and getting help.
39

DESCRIPTION

41       mk-slave-prefetch reads the slave's relay log slightly ahead of where
42       the slave's SQL thread is reading, converts statements into "SELECT",
43       and executes them.  In theory, this should help alleviate the effects
44       of the slave's single-threaded SQL execution.  It will help take
45       advantage of multiple CPUs and disks by pre-reading the data from disk,
46       so the data is already in the cache when the slave SQL thread executes
47       the un-modified version of the statement.
48
49       "mk-slave-prefetch" learns how long it takes statements to execute, and
50       doesn't try to execute those that take a very long time.  You can ask
51       it to print what it has learned after it executes.  You can also
52       specify a filename on the command line.  The file should contain the
53       statistics printed by a previous run.  These will be used to pre-
54       populate the statistics so it doesn't have to re-learn.
55
56       This program is based on concepts I heard Paul Tuckfield explain at the
57       November 2006 MySQL Camp un-conference.  However, the code is my own
58       work.  I have not seen any other implementation of Paul's idea.
59

DOES IT WORK?

61       Does it work?  Does it actually speed up the slave?
62
63       That depends on your workload, hardware, and other factors.  It might
64       work when the following are true:
65
66       ·   The slave's data is much larger than memory, and the workload is
67           mostly randomly scattered small (single-row is ideal) changes.
68
69       ·   There are lots of high-concurrency "UPDATE" and "DELETE" statements
70           on the master.
71
72       ·   The slave SQL thread is I/O-bound, but the slave overall has plenty
73           of spare I/O capacity (definitely more than one disk spindle).
74
75       ·   The slave uses InnoDB or another storage engine with row-level
76           locking.
77
78       It does not speed up replication on my slaves, which mostly have large
79       queries like "INSERT .. SELECT .. GROUP BY".  In my benchmarks it
80       seemed to make no difference at all, positive or negative.
81
82       On the wrong workload or slave configuration, this technique might
83       actually make the slaves slower.  Your mileage will vary.
84
85       User-contributed benchmarks are welcome.
86

OPTIONS

88       Specify at least one of "--print", "--execute" or "--stop".
89
90       --ask-pass
91           Prompt for a password when connecting to MySQL.
92
93       --charset
94           short form: -A; type: string
95
96           Default character set.  If the value is utf8, sets Perl's binmode
97           on STDOUT to utf8, passes the mysql_enable_utf8 option to
98           DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL.  Any
99           other value sets binmode on STDOUT without the utf8 layer, and runs
100           SET NAMES after connecting to MySQL.
101
102       --check-interval
103           type: Array; default: 16,1,1024
104
105           How often to check the slave: init,min,max.  This many relay log
106           events should pass before checking the output of "SHOW SLAVE
107           STATUS".  The syntax is a three-number range: initial, minimum, and
108           maximum.  You should be able to leave this at the defaults.
109
110           "mk-slave-prefetch" varies the check interval in powers of two,
111           depending on whether it decides the check was necessary.
112
113       --config
114           type: Array
115
116           Read this comma-separated list of config files; if specified, this
117           must be the first option on the command line.
118
119       --[no]continue-on-error
120           default: yes
121
122           Continue parsing even if there is an error.
123
124       --daemonize
125           Fork to the background and detach from the shell.  POSIX operating
126           systems only.
127
128       --database
129           short form: -D; type: string
130
131           The database to use for the connection.  The initial connection
132           will be to this database, but mk-slave-prefetch will issue "USE"
133           statements as required by the binary log events.
134
135           This database is also used as the default database for
136           "--secondary-indexes" if the database cannot automatically be
137           determined from the query.
138
139       --defaults-file
140           short form: -F; type: string
141
142           Only read mysql options from the given file.  You must give an
143           absolute pathname.
144
145       --dry-run
146           Ignore replication checks and just read and rewrite the relay log
147           events.
148
149           This option make mk-slave-prefetch ignore all relay log related
150           checks for position, slave lag, etc. and simply causes the tool to
151           read and rewrite all the events in the relay log.  A connection to
152           the slave server is still required.
153
154       --errors
155           cumulative: yes
156
157           Print queries that caused errors.  If specified once, at exit; if
158           twice, in realtime.
159
160           If you specify this option once, you will see a report at the end
161           of the script execution, showing the normalized queries and the
162           number of times they were seen.  If you specify this option twice,
163           you will see the errors printed out as they occur, but no
164           normalized report at the end of execution.
165
166       --execute
167           Execute the transformed queries to warm the caches.
168
169       --help
170           Show help and exit.
171
172       --host
173           short form: -h; type: string
174
175           Connect to host.
176
177       --[no]inject-columns
178           default: yes
179
180           Inject "(columns)" into INSERT/REPLACE that don't specify them.
181
182           Normally this query cannot be rewritten because mk-slave-prefetch
183           doesn't know which columns the values refer to: "INSERT INTO tbl
184           VALUES (1,2)".  This option causes mk-slave-prefetch to "SHOW
185           CREATE TABLE" the table from the query, get its columns and inject
186           these columns into the query, like: "INSERT INTO tbl
187           (`col1`,`col2`) VALUES (1,2)".  This allows the query to be written
188           as a SELECT and prefetched.
189
190           Columns for each unique database.table are cached, so this
191           operation may fail if an "ALTER TABLE" statement changes the order
192           or name of any columns.
193
194       --io-lag
195           type: size; default: 1k
196
197           How many bytes to lag the slave I/O thread.  This helps avoid
198           "mysqlbinlog" reading right off the end of the relay log file.
199
200       --log
201           type: string
202
203           Print all output to this file when daemonized.
204
205       --max-query-time
206           type: float; default: 1
207
208           Do not run queries longer than this many seconds; fractions
209           allowed.  If "mk-slave-prefetch" predicts the query will take
210           longer to execute, it will skip the query.  This is based on the
211           theory that pre-warming the cache is most beneficial for short
212           queries.
213
214           "mk-slave-prefetch" learns how long queries require to execute.  It
215           keeps an average over the last "--query-sample-size" samples of
216           each query.  The averages are based on an abstracted version of the
217           query, with specific parameters replaced by placeholders.  The
218           result is a sort of "fingerprint" for the query, not executable
219           SQL.  You can see the learned statistics with the "--statistics"
220           option.
221
222           You can pre-load query fingerprints, and average execution times,
223           from a file.  This way you don't have to wait for
224           "mk-slave-prefetch" to learn all over every time you start it.
225           Just specify the file on the command line.  The format should be
226           the same as the output from "--statistics".
227
228           You might also want to filter out some statements completely, or
229           let only some statements through.  See the "--reject-regexp" and
230           "--permit-regexp" options.
231
232           If "mk-slave-prefetch" hasn't seen a query's fingerprint before,
233           and thus doesn't know how long it will take to execute, it wraps it
234           in a subuery, like this:
235
236              SELECT 1 FROM ( <query> ) AS X LIMIT 1;
237
238           This helps avoid fetching a lot of data back to the client when a
239           query is very large.  It requires a version of MySQL that supports
240           subqueries (version 4.1 and newer).  If yours doesn't, the subquery
241           trick can't be used, so the query might fetch a lot of data back to
242           the client.
243
244           Once a query's fingerprint has been seen, so it's known that the
245           query isn't enormously slow, "mk-slave-prefetch" just rewrites the
246           "SELECT" list for efficiency.  (Avoiding the subquery reduces the
247           query's overhead for short queries).  The rewritten query will then
248           look like the following;
249
250              SELECT ISNULL(COALESCE(<columns>)) FROM ...
251
252       --num-prefix
253           Abstract away numeric table name prefixes.  This causes the
254           following two queries to "fingeprint" to the same thing:
255
256             select from 1_2_users;
257             select from 2_3_users;
258
259       --offset
260           type: size; default: 128
261
262           How many bytes "mk-slave-prefetch" will try to stay in front of the
263           slave SQL thread.  It will not execute log events it doesn't think
264           are at least this far ahead of the SQL thread.  See also
265           "--window".
266
267       --password
268           short form: -p; type: string
269
270           Password to use when connecting.
271
272       --permit-regexp
273           type: string
274
275           Permit queries matching this Perl regexp.  This is a filter for log
276           events.  The regular expression is matched against the raw log
277           event, before any transformations are applied.  If specified, this
278           option will permit only log events matching the regular expression.
279
280       --pid
281           type: string
282
283           Create the given PID file when daemonized.  The file contains the
284           process ID of the daemonized instance.  The PID file is removed
285           when the daemonized instance exits.  The program checks for the
286           existence of the PID file when starting; if it exists and the
287           process with the matching PID exists, the program exits.
288
289       --port
290           short form: -P; type: int
291
292           Port number to use for connection.
293
294       --print
295           Print the transformed relay log events to standard output.
296
297       --print-nonrewritten
298           Print queries that could not be transformed into "SELECT".
299
300       --progress
301           type: int
302
303           Print progress information every X events.  The information is the
304           current log file and position, plus a summary of the statistics
305           gathered.
306
307       --query-sample-size
308           type: int; default: 4
309
310           Average query exec time over this many queries.  The last "N"
311           queries with a given fingerprint are averaged together to get the
312           average query execution time (see "--max-query-time").
313
314       --reject-regexp
315           type: string
316
317           Reject queries matching this Perl regexp.  Similar to
318           "--permit-regexp", but has the opposite effect: log events must not
319           match the regular expression.
320
321       --relay-log
322           type: string
323
324           Read only the specified relay log file; - to read from STDIN.
325
326           By default mk-slave-prefetch reads the "Relay_Log_File" reported by
327           "SHOW SLAVE STATUS".  This option allows you to specify a relay log
328           file that has already be converted to text by "mysqlbinlog".  The
329           tool will exit after reading and parsing this file.
330
331           This option is useful with "--dry-run" and "--print" if you want to
332           see how the tool would rewrite the relay log's events without
333           executing them or having to wait for a lagged slave.
334
335       --relay-log-dir
336           type: string
337
338           Open the slave's "Relay_Log_File" relative to this directory.
339
340           Unless this option is specified, mk-slave-prefetch automatically
341           determines the directory that relay logs are in by first looking at
342           the "relay_log" system variable to see if it specifies a path.  If
343           it does, this path is used; if it does not, then the "datadir"
344           variable value is used.
345
346           This option is ignored if an explicit "--relay-log" is specified.
347
348       --run-time
349           type: time
350
351           How long "mk-slave-prefetch" should run before exiting.  The
352           default is to run forever.
353
354       --secondary-indexes
355           Prefetch secondary indexes for pipelined queries.
356
357       --sentinel
358           type: string; default: /tmp/mk-slave-prefetch-sentinel
359
360           Exit if this file exists.
361
362       --set-vars
363           type: string; default: wait_timeout=10000
364
365           Set these MySQL variables.  Immediately after connecting to MySQL,
366           this string will be appended to SET and executed.
367
368       --sleep
369           type: time; default: 1s
370
371           Sleep time before checking for new events.
372
373           When mk-slave-prefetch is done reading all the events in a relay
374           log, it sleeps this amount of time before checking for new events.
375
376           This option is automatically set to zero if both "--relay-log" and
377           "--dry-run" are specified.
378
379       --socket
380           short form: -S; type: string
381
382           Socket file to use for connection.
383
384       --statistics
385           Print execution statistics after exiting.  The statistics are in
386           two sections: counters, and queries.  The counters simply count the
387           number of times events occur.  You may see the following counters:
388
389              NAME                    MEANING
390              ======================  =======================================
391              mysqlbinlog             Executed mysqlbinlog to read log events.
392              events                  The total number of relay log events.
393              not_far_enough_ahead    An event was not at least L<"--offset">
394                                      bytes ahead of the SQL thread.
395              too_far_ahead           An event was more than L<"--offset">
396                                      + L<"--window"> bytes ahead of the SQL thread.
397              too_close_to_io_thread  An event was less than L<"--io-lag"> bytes
398                                      away from the I/O thread's position.
399              event_not_allowed       An event wasn't a SET, USE, INSERT,
400                                      UPDATE, DELETE or REPLACE query.
401              event_filtered_out      An event was filtered out because of
402                                      L<"--permit-regexp"> or L<"--reject-regexp">.
403              same_timestamp          A SET TIMESTAMP event was ignored because
404                                      it had the same timestamp as the last one.
405              do_query                A transformed event was executed
406                                      or printed.
407              query_error             An executed query had an error.
408              query_too_long          An event was not executed because its
409                                      average query length exceeded
410                                      L<"--max-query-time">.
411              query_not_rewritten     A query could not be rewritten to a
412                                      SELECT.
413              master_pos_wait         The tool waited for the SQL thread to
414                                      catch up.
415              show_slave_status       The tool queried SHOW SLAVE STATUS.
416              load_data_infile        The tool found a LOAD DATA INFILE query
417                                      and unlinked (deleted) the temp file.
418              could_not_unlink        The tool failed to unlink a temp file.
419              sleep                   The tool slept for a second because the
420                                      slave's SQL thread was not running, or
421                                      because it read past the end of the log.
422
423           After the counters, "mk-slave-prefetch" prints information about
424           each query fingerprint it has seen, two lines per fingerprint.  The
425           first line contains the query's fingerprint.  The second line
426           contains the number of times the fingerprint was seen, number of
427           times executed, the sum of the execution times, and the average
428           execution time over the last "--query-sample-size" samples.
429
430       --stop
431           Stop running instances by creating the "--sentinel" file.
432
433       --threads
434           type: int; default: 2
435
436           Number of concurrent threads to use for pipelining queries.
437
438       --tmpdir
439           type: string; default: /dev/null
440
441           Where to create temp files for "LOAD DATA INFILE" queries.  The
442           default will cause "mysqlbinlog" to skip the file and the
443           associated "LOAD DATA INFILE" command entirely.
444
445           If "mk-slave-prefetch" sees a "LOAD DATA INFILE" command (which it
446           won't if this is left at the default), it will try to remove the
447           temporary file, then skip the event.
448
449       --user
450           short form: -u; type: string
451
452           User for login if not current user.
453
454       --version
455           Show version and exit.
456
457       --window
458           type: size; default: 4k
459
460           The max bytes ahead of the slave "mk-slave-prefetch" should get.
461           Defines the window within which "mk-slave-prefetch" considers a
462           query OK to execute.  The window begins at the slave SQL thread's
463           last known position plus "--offset" bytes, and extends for the
464           specified number of bytes.
465
466           If "mk-slave-prefetch" sees a log event that is too far in the
467           future, it will increment the "too_far_ahead" counter and wait for
468           the slave SQL thread to catch up (which increments the
469           "master_pos_wait" counter).  If an event isn't far enough ahead of
470           the SQL thread, it will be discarded and the "not_far_enough_ahead"
471           counter increments.
472
473           Watching the mentioned statistics can help you understand how to
474           tune the window.  You want "mk-slave-prefetch" to run just ahead of
475           the SQL thread, not throwing out a lot of events for being too far
476           ahead or not far enough ahead.
477

DSN OPTIONS

479       These DSN options are used to create a DSN.  Each option is given like
480       "option=value".  The options are case-sensitive, so P and p are not the
481       same option.  There cannot be whitespace before or after the "=" and if
482       the value contains whitespace it must be quoted.  DSN options are
483       comma-separated.  See the maatkit manpage for full details.
484
485       ·   A
486
487           dsn: charset; copy: yes
488
489           Default character set.
490
491       ·   D
492
493           dsn: database; copy: yes
494
495           Default database.
496
497       ·   F
498
499           dsn: mysql_read_default_file; copy: yes
500
501           Only read default options from the given file
502
503       ·   h
504
505           dsn: host; copy: yes
506
507           Connect to host.
508
509       ·   p
510
511           dsn: password; copy: yes
512
513           Password to use when connecting.
514
515       ·   P
516
517           dsn: port; copy: yes
518
519           Port number to use for connection.
520
521       ·   S
522
523           dsn: mysql_socket; copy: yes
524
525           Socket file to use for connection.
526
527       ·   u
528
529           dsn: user; copy: yes
530
531           User for login if not current user.
532

DOWNLOADING

534       You can download Maatkit from Google Code at
535       <http://code.google.com/p/maatkit/>, or you can get any of the tools
536       easily with a command like the following:
537
538          wget http://www.maatkit.org/get/toolname
539          or
540          wget http://www.maatkit.org/trunk/toolname
541
542       Where "toolname" can be replaced with the name (or fragment of a name)
543       of any of the Maatkit tools.  Once downloaded, they're ready to run; no
544       installation is needed.  The first URL gets the latest released version
545       of the tool, and the second gets the latest trunk code from Subversion.
546

ENVIRONMENT

548       The environment variable "MKDEBUG" enables verbose debugging output in
549       all of the Maatkit tools:
550
551          MKDEBUG=1 mk-....
552

SYSTEM REQUIREMENTS

554       You need Perl, DBI, DBD::mysql, and some core packages that ought to be
555       installed in any reasonably new version of Perl.
556

BUGS

558       For list of known bugs see
559       http://www.maatkit.org/bugs/mk-slave-prefetch
560       <http://www.maatkit.org/bugs/mk-slave-prefetch>.
561
562       Please use Google Code Issues and Groups to report bugs or request
563       support: <http://code.google.com/p/maatkit/>.  You can also join
564       #maatkit on Freenode to discuss Maatkit.
565
566       Please include the complete command-line used to reproduce the problem
567       you are seeing, the version of all MySQL servers involved, the complete
568       output of the tool when run with "--version", and if possible,
569       debugging output produced by running with the "MKDEBUG=1" environment
570       variable.
571

COPYRIGHT, LICENSE AND WARRANTY

573       This program is copyright 2007-2010 Baron Schwartz.  Feedback and
574       improvements are welcome.
575
576       THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
577       WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
578       MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
579
580       This program is free software; you can redistribute it and/or modify it
581       under the terms of the GNU General Public License as published by the
582       Free Software Foundation, version 2; OR the Perl Artistic License.  On
583       UNIX and similar systems, you can issue `man perlgpl' or `man
584       perlartistic' to read these licenses.
585
586       You should have received a copy of the GNU General Public License along
587       with this program; if not, write to the Free Software Foundation, Inc.,
588       59 Temple Place, Suite 330, Boston, MA  02111-1307  USA.
589

AUTHOR

591       Baron Schwartz, Daniel Nichter
592

ABOUT MAATKIT

594       This tool is part of Maatkit, a toolkit for power users of MySQL.
595       Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
596       primary code contributors.  Both are employed by Percona.  Financial
597       support for Maatkit development is primarily provided by Percona and
598       its clients.
599

VERSION

601       This manual page documents Ver 1.0.20 Distrib 6839 $Revision: 6831 $.
602
603
604
605perl v5.12.1                      2010-08-01              MK-SLAVE-PREFETCH(1)
Impressum