1MK-TABLE-CHECKSUM(1)  User Contributed Perl Documentation MK-TABLE-CHECKSUM(1)
2
3
4

NAME

6       mk-table-checksum - Perform an online replication consistency check, or
7       checksum MySQL tables efficiently on one or many servers.
8

SYNOPSIS

10       Usage: mk-table-checksum [OPTION...] DSN [DSN...]
11
12       mk-table-checksum checksums MySQL tables efficiently on one or more
13       hosts.  Each host is specified as a DSN and missing values are
14       inherited from the first host.  If you specify multiple hosts, the
15       first is assumed to be the master.
16
17       STOP! Are you checksumming a slave(s) against its master?  Then be sure
18       to learn what "--replicate" does.  It is probably the option you want
19       to use.
20
21          mk-table-checksum --replicate=mydb.checksum master-host
22          ... time passses, replication catches up ...
23          mk-table-checksum --replicate=mydb.checksum --replicate-check 2 \
24             master-host
25
26       Or,
27
28          mk-table-checksum h=host1,u=user,p=password h=host2 ...
29
30       Or,
31
32          mk-table-checksum host1 host2 ... hostN | mk-checksum-filter
33
34       See "SPECIFYING HOSTS" for more on the syntax of the host arguments.
35

RISKS

37       The following section is included to inform users about the potential
38       risks, whether known or unknown, of using this tool.  The two main
39       categories of risks are those created by the nature of the tool (e.g.
40       read-only tools vs. read-write tools) and those created by bugs.
41
42       mk-table-checksum executes queries that cause the MySQL server to
43       checksum its data.  This can cause significant server load.  It is
44       read-only unless you use the "--replicate" option, in which case it
45       inserts a small amount of data into the specified table.
46
47       At the time of this release, we know of no bugs that could cause
48       serious harm to users.  There are miscellaneous bugs that might be
49       annoying.
50
51       The authoritative source for updated information is always the online
52       issue tracking system.  Issues that affect this tool will be marked as
53       such.  You can see a list of such issues at the following URL:
54       <http://www.maatkit.org/bugs/mk-table-checksum>.
55
56       See also "BUGS" for more information on filing bugs and getting help.
57

DESCRIPTION

59       mk-table-checksum generates table checksums for MySQL tables, typically
60       useful for verifying your slaves are in sync with the master.  The
61       checksums are generated by a query on the server, and there is very
62       little network traffic as a result.
63
64       Checksums typically take about twice as long as COUNT(*) on very large
65       InnoDB tables in my tests.  For smaller tables, COUNT(*) is a good bit
66       faster than the checksums.  See "--algorithm" for more details on
67       performance.
68
69       If you specify more than one server, mk-table-checksum assumes the
70       first server is the master and others are slaves.  Checksums are
71       parallelized for speed, forking off a child process for each table.
72       Duplicate server names are ignored, but if you want to checksum a
73       server against itself you can use two different forms of the hostname
74       (for example, "localhost 127.0.0.1", or "h=localhost,P=3306
75       h=localhost,P=3307").
76
77       If you want to compare the tables in one database to those in another
78       database on the same server, just checksum both databases:
79
80          mk-table-checksum --databases db1,db2
81
82       You can then use mk-checksum-filter to compare the results in both
83       databases easily.
84
85       mk-table-checksum examines table structure only on the first host
86       specified, so if anything differs on the others, it won't notice.  It
87       ignores views.
88
89       The checksums work on MySQL version 3.23.58 through 6.0-alpha.  They
90       will not necessarily produce the same values on all versions.
91       Differences in formatting and/or space-padding between 4.1 and 5.0, for
92       example, will cause the checksums to be different.
93

SPECIFYING HOSTS

95       mk-table-checksum connects to a theoretically unlimited number of MySQL
96       servers.  You specify a list of one or more host definitions on the
97       command line, such as "host1 host2".  Each host definition can be just
98       a hostname, or it can be a complex string that specifies connection
99       options as well.  You can specify connection options two ways:
100
101       •   Format a host definition in a key=value,key=value form.  If an
102           argument on the command line contains the letter '=', mk-table-
103           checksum will parse it into its component parts.  Examine the
104           "--help" output for details on the allowed keys.
105
106           Specifying a list of simple host definitions "host1 host2" is
107           equivalent to the more complicated "h=host1 h=host2" format.
108
109       •   With the command-line options such as "--user" and "--password".
110           These options, if given, apply globally to all host definitions.
111
112       In addition to specifying connection options this way, mk-table-
113       checksum allows shortcuts.  Any options specified for the first host
114       definition on the command line fill in missing values in subsequent
115       ones.  Any options that are still missing after this are filled in from
116       the command-line options if possible.
117
118       In other words, the places you specify connection options have
119       precedence: highest precedence is the option specified directly in the
120       host definition, next is the option specified in the first host
121       definition, and lowest is the command-line option.
122
123       You can mix simple and complex host definitions and/or command-line
124       arguments.  For example, if all your servers except one of your slaves
125       uses a non-standard port number:
126
127          mk-table-checksum --port 4500 master h=slave1,P=3306 slave2 slave3
128
129       If you are confused about how mk-table-checksum will connect to your
130       servers, give the "--explain-hosts" option and it will tell you.
131

HOW FAST IS IT?

133       Speed and efficiency are important, because the typical use case is
134       checksumming large amounts of data.
135
136       "mk-table-checksum" is designed to do very little work itself, and
137       generates very little network traffic aside from inspecting table
138       structures with "SHOW CREATE TABLE".  The results of checksum queries
139       are typically 40-character or shorter strings.
140
141       The MySQL server does the bulk of the work, in the form of the checksum
142       queries.  The following benchmarks show the checksum query times for
143       various checksum algorithms.  The first two results are simply running
144       COUNT(col8) and "CHECKSUM TABLE" on the table.  "CHECKSUM TABLE" is
145       just "CRC32" under the hood, but it's implemented inside the storage
146       engine layer instead of at the MySQL layer.
147
148        ALGORITHM       HASH FUNCTION  EXTRA           TIME
149        ==============  =============  ==============  =====
150        COUNT(col8)                                    2.3
151        CHECKSUM TABLE                                 5.3
152        BIT_XOR         FNV_64                         12.7
153        ACCUM           FNV_64                         42.4
154        BIT_XOR         MD5            --optimize-xor  80.0
155        ACCUM           MD5                            87.4
156        BIT_XOR         SHA1           --optimize-xor  90.1
157        ACCUM           SHA1                           101.3
158        BIT_XOR         MD5                            172.0
159        BIT_XOR         SHA1                           197.3
160
161       The tests are entirely CPU-bound.  The sample data is an InnoDB table
162       with the following structure:
163
164        CREATE TABLE test (
165          col1 int NOT NULL,
166          col2 date NOT NULL,
167          col3 int NOT NULL,
168          col4 int NOT NULL,
169          col5 int,
170          col6 decimal(3,1),
171          col7 smallint unsigned NOT NULL,
172          col8 timestamp NOT NULL,
173          PRIMARY KEY  (col2, col1),
174          KEY (col7),
175          KEY (col1)
176        ) ENGINE=InnoDB
177
178       The table has 4303585 rows, 365969408 bytes of data and 173457408 bytes
179       of indexes.  The server is a Dell PowerEdge 1800 with dual 32-bit Xeon
180       2.8GHz processors and 2GB of RAM.  The tests are fully CPU-bound, and
181       the server is otherwise idle.  The results are generally consistent to
182       within a tenth of a second on repeated runs.
183
184       "CRC32" is the default checksum function to use, and should be enough
185       for most cases.  If you need stronger guarantees that your data is
186       identical, you should use one of the other functions.
187

ALGORITHM SELECTION

189       The "--algorithm" option allows you to specify which algorithm you
190       would like to use, but it does not guarantee that mk-table-checksum
191       will use this algorithm.  mk-table-checksum will ultimately select the
192       best algorithm possible given various factors such as the MySQL version
193       and other command line options.
194
195       The three basic algorithms in descending order of preference are
196       CHECKSUM, BIT_XOR and ACCUM.  CHECKSUM cannot be used if any one of
197       these criteria is true:
198
199         * L<"--where"> is used.
200         * L<"--since"> is used.
201         * L<"--chunk-size"> is used.
202         * L<"--replicate"> is used.
203         * L<"--count"> is used.
204         * MySQL version less than 4.1.1.
205
206       The BIT_XOR algorithm also requires MySQL version 4.1.1 or later.
207
208       After checking these criteria, if the requested "--algorithm" remains
209       then it is used, otherwise the first remaining algorithm with the
210       highest preference is used.
211

CONSISTENT CHECKSUMS

213       If you are using this tool to verify your slaves still have the same
214       data as the master, which is why I wrote it, you should read this
215       section.
216
217       The best way to do this with replication is to use the "--replicate"
218       option.  When the queries are finished running on the master and its
219       slaves, you can go to the slaves and issue SQL queries to see if any
220       tables are different from the master.  Try the following:
221
222         SELECT db, tbl, chunk, this_cnt-master_cnt AS cnt_diff,
223            this_crc <> master_crc OR ISNULL(master_crc) <> ISNULL(this_crc)
224               AS crc_diff
225         FROM checksum
226         WHERE master_cnt <> this_cnt OR master_crc <> this_crc
227            OR ISNULL(master_crc) <> ISNULL(this_crc);
228
229       The "--replicate-check" option can do this query for you.  If you can't
230       use this method, try the following:
231
232       •   If your servers are not being written to, you can just run the tool
233           with no further ado:
234
235             mk-table-checksum server1 server2 ... serverN
236
237       •   If the servers are being written to, you need some way to make sure
238           they are consistent at the moment you run the checksums.  For
239           situations other than master-slave replication, you will have to
240           figure this out yourself.  You may be able to use the "--where"
241           option with a date or time column to only checksum data that's not
242           recent.
243
244       •   If you are checksumming a master and slaves, you can do a fast
245           parallel checksum and assume the slaves are caught up to the
246           master.  In practice, this tends to work well except for tables
247           which are constantly updated.  You can use the "--slave-lag" option
248           to see how far behind each slave was when it checksummed a given
249           table.  This can help you decide whether to investigate further.
250
251       •   The next most disruptive technique is to lock the table on the
252           master, then take checksums.  This should prevent changes from
253           propagating to the slaves.  You can just lock on the master (with
254           "--lock"), or you can both lock on the master and wait on the
255           slaves till they reach that point in the master's binlog
256           ("--wait").  Which is better depends on your workload; only you
257           know that.
258
259       •   If you decide to make the checksums on the slaves wait until
260           they're guaranteed to be caught up to the master, the algorithm
261           looks like this:
262
263            For each table,
264              Master: lock table
265              Master: get pos
266              In parallel,
267                Master: checksum
268                Slave(s): wait for pos, then checksum
269              End
270              Master: unlock table
271            End
272
273       What I typically do when I'm not using the "--replicate" option is
274       simply run the tool on all servers with no further options.  This runs
275       fast, parallel, non-blocking checksums simultaneously.  If there are
276       tables that look different, I re-run with "--wait"=600 on the tables in
277       question.  This makes the tool lock on the master as explained above.
278

OUTPUT

280       Output is to STDOUT, one line per server and table, with header lines
281       for each database.  I tried to make the output easy to process with
282       awk.  For this reason columns are always present.  If there's no value,
283       mk-table-checksum prints 'NULL'.
284
285       The default is column-aligned output for human readability, but you can
286       change it to tab-separated if you want.  Use the "--tab" option for
287       this.
288
289       Output is unsorted, though all lines for one table should be output
290       together.  For speed, all checksums are done in parallel (as much as
291       possible) and may complete out of the order in which they were started.
292       You might want to run them through another script or command-line
293       utility to make sure they are in the order you want.  If you pipe the
294       output through mk-checksum-filter, you can sort the output and/or avoid
295       seeing output about tables that have no differences.
296
297       The columns in the output are as follows.  The database, table, and
298       chunk come first so you can sort by them easily (they are the "primary
299       key").
300
301       Output from "--replicate-check" and "--checksum" are different.
302
303       DATABASE
304           The database the table is in.
305
306       TABLE
307           The table name.
308
309       CHUNK
310           The chunk (see "--chunk-size").  Zero if you are not doing chunked
311           checksums.
312
313       HOST
314           The server's hostname.
315
316       ENGINE
317           The table's storage engine.
318
319       COUNT
320           The table's row count, unless you specified to skip it.  If
321           "OVERSIZE" is printed, the chunk was skipped because the actual
322           number of rows was greater than "--chunk-size" times
323           "--chunk-size-limit".
324
325       CHECKSUM
326           The table's checksum, unless you specified to skip it or the table
327           has no rows.  some types of checksums will be 0 if there are no
328           rows; others will print NULL.
329
330       TIME
331           How long it took to checksum the "CHUNK", not including "WAIT"
332           time.  Total checksum time is "WAIT + TIME".
333
334       WAIT
335           How long the slave waited to catch up to its master before
336           beginning to checksum.  "WAIT" is always 0 for the master.  See
337           "--wait".
338
339       STAT
340           The return value of MASTER_POS_WAIT().  "STAT" is always "NULL" for
341           the master.
342
343       LAG How far the slave lags the master, as reported by SHOW SLAVE
344           STATUS.  "LAG" is always "NULL" for the master.
345

REPLICATE TABLE MAINTENANCE

347       If you use "--replicate" to store and replicate checksums, you may need
348       to perform maintenance on the replicate table from time to time to
349       remove old checksums.  This section describes when checksums in the
350       replicate table are deleted automatically by mk-table-checksum and when
351       you must manually delete them.
352
353       Before starting, mk-table-checksum calculates chunks for each table,
354       even if "--chunk-size" is not specified (in that case there is one
355       chunk: "1=1").  Then, before checksumming each table, the tool deletes
356       checksum chunks in the replicate table greater than the current number
357       of chunks.  For example, if a table is chunked into 100 chunks, 0-99,
358       then mk-table-checksum does:
359
360         DELETE FROM replicate table WHERE db=? AND tbl=? AND chunk > 99
361
362       That removes any high-end chunks from previous runs which no longer
363       exist.  Currently, this operation cannot be disabled.
364
365       If you use "--resume", "--resume-replicate", or "--modulo", then you
366       need to be careful that the number of rows in a table does not decrease
367       so much that the number of chunks decreases too, else some checksum
368       chunks may be deleted.  The one exception is if only rows at the high
369       end of the range are deleted.  In that case, the high-end chunks are
370       deleted and lower chunks remain unchanged.  An increasing number of
371       rows or chunks should not cause any adverse affects.
372
373       Changing the "--chunk-size" between runs with "--resume",
374       "--resume-replicate", or "--modulo" can cause odd or invalid checksums.
375       You should not do this.  It won't work with the resume options.  With
376       "--modulo", the safest thing to do is manually delete all the rows in
377       the replicate table for the table in question and start over.
378
379       If the replicate table becomes cluttered with old or invalid checksums
380       and the auto-delete operation is not deleting them, then you will need
381       to manually clean up the replicate table.  Alternatively, if you
382       specify "--empty-replicate-table", then the tool deletes every row in
383       the replicate table.
384

EXIT STATUS

386       An exit status of 0 (sometimes also called a return value or return
387       code) indicates success.  If there is an error checksumming any table,
388       the exit status is 1.
389
390       When running "--replicate-check", if any slave has chunks that differ
391       from the master, the exit status is 1.
392

QUERIES

394       If you are using innotop (see <http://code.google.com/p/innotop>),
395       mytop, or another tool to watch currently running MySQL queries, you
396       may see the checksum queries.  They look similar to this:
397
398         REPLACE /*test.test_tbl:'2'/'5'*/ INTO test.checksum(db, ...
399
400       Since mk-table-checksum's queries run for a long time and tend to be
401       textually very long, and thus won't fit on one screen of these
402       monitoring tools, I've been careful to place a comment at the beginning
403       of the query so you can see what it is and what it's doing.  The
404       comment contains the name of the table that's being checksummed, the
405       chunk it is currently checksumming, and how many chunks will be
406       checksummed.  In the case above, it is checksumming chunk 2 of 5 in
407       table test.test_tbl.
408

OPTIONS

410       "--schema" is restricted to option groups Connection, Filter, Output,
411       Help, Config, Safety.
412
413       "--empty-replicate-table", "--resume" and "--resume-replicate" are
414       mutually exclusive.
415
416       This tool accepts additional command-line arguments.  Refer to the
417       "SYNOPSIS" and usage information for details.
418
419       --algorithm
420           type: string
421
422           Checksum algorithm (ACCUM|CHECKSUM|BIT_XOR).
423
424           Specifies which checksum algorithm to use.  Valid arguments are
425           CHECKSUM, BIT_XOR and ACCUM.  The latter two do cryptographic hash
426           checksums.  See also "ALGORITHM SELECTION".
427
428           CHECKSUM is built into MySQL, but has some disadvantages.  BIT_XOR
429           and ACCUM are implemented by SQL queries.  They use a cryptographic
430           hash of all columns concatenated together with a separator,
431           followed by a bitmap of each nullable column that is NULL
432           (necessary because CONCAT_WS() skips NULL columns).
433
434           CHECKSUM is the default.  This method uses MySQL's built-in
435           CHECKSUM TABLE command, which is a CRC32 behind the scenes.  It
436           cannot be used before MySQL 4.1.1, and various options disable it
437           as well.  It does not simultaneously count rows; that requires an
438           extra COUNT(*) query.  This is a good option when you are using
439           MyISAM tables with live checksums enabled; in this case both the
440           COUNT(*) and CHECKSUM queries will run very quickly.
441
442           The BIT_XOR algorithm is available for MySQL 4.1.1 and newer.  It
443           uses BIT_XOR(), which is order-independent, to reduce all the rows
444           to a single checksum.
445
446           ACCUM uses a user variable as an accumulator.  It reduces each row
447           to a single checksum, which is concatenated with the accumulator
448           and re-checksummed.  This technique is order-dependent.  If the
449           table has a primary key, it will be used to order the results for
450           consistency; otherwise it's up to chance.
451
452           The pathological worst case is where identical rows will cancel
453           each other out in the BIT_XOR.  In this case you will not be able
454           to distinguish a table full of one value from a table full of
455           another value.  The ACCUM algorithm will distinguish them.
456
457           However, the ACCUM algorithm is order-dependent, so if you have two
458           tables with identical data but the rows are out of order, you'll
459           get different checksums with ACCUM.
460
461           If a given algorithm won't work for some reason, mk-table-checksum
462           falls back to another.  The least common denominator is ACCUM,
463           which works on MySQL 3.23.2 and newer.
464
465       --arg-table
466           type: string
467
468           The database.table with arguments for each table to checksum.
469
470           This table may be named anything you wish.  It must contain at
471           least the following columns:
472
473             CREATE TABLE checksum_args (
474                db         char(64)     NOT NULL,
475                tbl        char(64)     NOT NULL,
476                -- other columns as desired
477                PRIMARY KEY (db, tbl)
478             );
479
480           In addition to the columns shown, it may contain any of the other
481           columns listed here (Note: this list is used by the code,
482           MAGIC_overridable_args):
483
484             algorithm chunk-column chunk-index chunk-size columns count crc function lock
485             modulo use-index offset optimize-xor chunk-size-limit probability separator
486             save-since single-chunk since since-column sleep sleep-coef trim wait where
487
488           Each of these columns corresponds to the long form of a command-
489           line option.  Each column should be NULL-able.  Column names with
490           hyphens should be enclosed in backticks (e.g. `chunk-size`) when
491           the table is created.  The data type does not matter, but it's
492           suggested you use a sensible data type to prevent garbage data.
493
494           When "mk-table-checksum" checksums a table, it will look for a
495           matching entry in this table.  Any column that has a defined value
496           will override the corresponding command-line argument for the table
497           being currently processed.  In this way it is possible to specify
498           custom command-line arguments for any table.
499
500           If you add columns to the table that aren't in the above list of
501           allowable columns, it's an error.  The exceptions are "db", "tbl",
502           and "ts".  The "ts" column can be used as a timestamp for easy
503           visibility into the last time the "since" column was updated with
504           "--save-since".
505
506           This table is assumed to be located on the first server given on
507           the command-line.
508
509       --ask-pass
510           group: Connection
511
512           Prompt for a password when connecting to MySQL.
513
514       --check-interval
515           type: time; group: Throttle; default: 1s
516
517           How often to check for slave lag if "--check-slave-lag" is given.
518
519       --[no]check-replication-filters
520           default: yes; group: Safety
521
522           Do not "--replicate" if any replication filters are set.  When
523           --replicate is specified, mk-table-checksum tries to detect slaves
524           and look for options that filter replication, such as
525           binlog_ignore_db and replicate_do_db.  If it finds any such
526           filters, it aborts with an error.  Replication filtering makes it
527           impossible to be sure that the checksum queries won't break
528           replication or simply fail to replicate.  If you are sure that it's
529           OK to run the checksum queries, you can negate this option to
530           disable the checks.  See also "--replicate-database".
531
532       --check-slave-lag
533           type: DSN; group: Throttle
534
535           Pause checksumming until the specified slave's lag is less than
536           "--max-lag".
537
538           If this option is specified and "--throttle-method" is set to
539           "slavelag" then "--throttle-method" only checks this slave.
540
541       --checksum
542           group: Output
543
544           Print checksums and table names in the style of md5sum (disables
545           "--[no]count").
546
547           Makes the output behave more like the output of "md5sum".  The
548           checksum is first on the line, followed by the host, database,
549           table, and chunk number, concatenated with dots.
550
551       --chunk-column
552           type: string
553
554           Prefer this column for dividing tables into chunks.  By default,
555           mk-table-checksum chooses the first suitable column for each table,
556           preferring to use the primary key.  This option lets you specify a
557           preferred column, which mk-table-checksum uses if it exists in the
558           table and is chunkable.  If not, then mk-table-checksum will revert
559           to its default behavior.  Be careful when using this option; a poor
560           choice could cause bad performance.  This is probably best to use
561           when you are checksumming only a single table, not an entire
562           server.  See also "--chunk-index".
563
564       --chunk-index
565           type: string
566
567           Prefer this index for chunking tables.  By default, mk-table-
568           checksum chooses an appropriate index for the "--chunk-column"
569           (even if it chooses the chunk column automatically).  This option
570           lets you specify the index you prefer.  If the index doesn't exist,
571           then mk-table-checksum will fall back to its default behavior.  mk-
572           table-checksum adds the index to the checksum SQL statements in a
573           "FORCE INDEX" clause.  Be careful when using this option; a poor
574           choice of index could cause bad performance.  This is probably best
575           to use when you are checksumming only a single table, not an entire
576           server.
577
578       --chunk-range
579           type: string; default: open
580
581           Set which ends of the chunk range are open or closed.  Possible
582           values are one of MAGIC_chunk_range:
583
584              VALUE       OPENS/CLOSES
585              ==========  ======================
586              open        Both ends are open
587              openclosed  Low end open, high end closed
588
589           By default mk-table-checksum uses an open range of chunks like:
590
591             `id` <  '10'
592             `id` >= '10' AND < '20'
593             `id` >= '20'
594
595           That range is open because the last chunk selects any row with id
596           greater than (or equal to) 20.  An open range can be a problem in
597           cases where a lot of new rows are inserted with IDs greater than 20
598           while mk-table-checksumming is running because the final open-ended
599           chunk will select all the newly inserted rows.  (The less common
600           case of inserting rows with IDs less than 10 would require a
601           "closedopen" range but that is not currently implemented.)
602           Specifying "openclosed" will cause the final chunk to be closed
603           like:
604
605             `id` >= '20' AND `id` <= N
606
607           N is the MAX(`id`) that mk-table-checksum used when it first
608           chunked the rows.  Therefore, it will only chunk the range of rows
609           that existed when the tool started and not any newly inserted rows
610           (unless those rows happen to be inserted with IDs less than N).
611
612           See also "--chunk-size-limit".
613
614       --chunk-size
615           type: string
616
617           Approximate number of rows or size of data to checksum at a time.
618           Allowable suffixes are k, M, G. Disallows "--algorithm CHECKSUM".
619
620           If you specify a chunk size, mk-table-checksum will try to find an
621           index that will let it split the table into ranges of approximately
622           "--chunk-size" rows, based on the table's index statistics.
623           Currently only numeric and date types can be chunked.
624
625           If the table is chunkable, mk-table-checksum will checksum each
626           range separately with parameters in the checksum query's WHERE
627           clause.  If mk-table-checksum cannot find a suitable index, it will
628           do the entire table in one chunk as though you had not specified
629           "--chunk-size" at all.  Each table is handled individually, so some
630           tables may be chunked and others not.
631
632           The chunks will be approximately sized, and depending on the
633           distribution of values in the indexed column, some chunks may be
634           larger than the value you specify.
635
636           If you specify a suffix (one of k, M or G), the parameter is
637           treated as a data size rather than a number of rows.  The output of
638           SHOW TABLE STATUS is then used to estimate the amount of data the
639           table contains, and convert that to a number of rows.
640
641       --chunk-size-limit
642           type: float; default: 2.0; group: Safety
643
644           Do not checksum chunks with this many times more rows than
645           "--chunk-size".
646
647           When "--chunk-size" is given it specifies an ideal size for each
648           chunk of a chunkable table (in rows; size values are converted to
649           rows).  Before checksumming each chunk, mk-table-checksum checks
650           how many rows are in the chunk with EXPLAIN.  If the number of rows
651           reported by EXPLAIN is this many times greater than "--chunk-size",
652           then the chunk is skipped and "OVERSIZE" is printed for the "COUNT"
653           column of the "OUTPUT".
654
655           For example, if you specify "--chunk-size" 100 and a chunk has 150
656           rows, then it is checksummed with the default "--chunk-size-limit"
657           value 2.0 because 150 is less than 100 * 2.0.  But if the chunk has
658           205 rows, then it is not checksummed because 205 is greater than
659           100 * 2.0.
660
661           The minimum value for this option is 1 which means that no chunk
662           can be any larger than "--chunk-size".  You probably don't want to
663           specify 1 because rows reported by EXPLAIN are estimates which can
664           be greater than or less than the real number of rows in the chunk.
665           If too many chunks are skipped because they are oversize, you might
666           want to specify a value larger than 2.
667
668           You can disable oversize chunk checking by specifying
669           "--chunk-size-limit" 0.
670
671           See also "--unchunkable-tables".
672
673       --columns
674           short form: -c; type: array; group: Filter
675
676           Checksum only this comma-separated list of columns.
677
678       --config
679           type: Array; group: Config
680
681           Read this comma-separated list of config files; if specified, this
682           must be the first option on the command line.
683
684       --[no]count
685           Count rows in tables.  This is built into ACCUM and BIT_XOR, but
686           requires an extra query for CHECKSUM.
687
688           This is disabled by default to avoid an extra COUNT(*) query when
689           "--algorithm" is CHECKSUM.  If you have only MyISAM tables and live
690           checksums are enabled, both CHECKSUM and COUNT will be very fast,
691           but otherwise you may want to use one of the other algorithms.
692
693       --[no]crc
694           default: yes
695
696           Do a CRC (checksum) of tables.
697
698           Take the checksum of the rows as well as their count.  This is
699           enabled by default.  If you disable it, you'll just get COUNT(*)
700           queries.
701
702       --create-replicate-table
703           Create the replicate table given by "--replicate" if it does not
704           exist.
705
706           Normally, if the replicate table given by "--replicate" does not
707           exist, "mk-table-checksum" will die. With this option, however,
708           "mk-table-checksum" will create the replicate table for you, using
709           the database.table name given to "--replicate".
710
711           The structure of the replicate table is the same as the suggested
712           table mentioned in "--replicate". Note that since ENGINE is not
713           specified, the replicate table will use the server's default
714           storage engine.  If you want to use a different engine, you need to
715           create the table yourself.
716
717       --databases
718           short form: -d; type: hash; group: Filter
719
720           Only checksum this comma-separated list of databases.
721
722       --databases-regex
723           type: string
724
725           Only checksum databases whose names match this Perl regex.
726
727       --defaults-file
728           short form: -F; type: string; group: Connection
729
730           Only read mysql options from the given file.  You must give an
731           absolute pathname.
732
733       --empty-replicate-table
734           DELETE all rows in the "--replicate" table before starting.
735
736           Issues a DELETE against the table given by "--replicate" before
737           beginning work.  Ignored if "--replicate" is not specified.  This
738           can be useful to remove entries related to tables that no longer
739           exist, or just to clean out the results of a previous run.
740
741           If you want to delete entries for specific databases or tables you
742           must do this manually.
743
744       --engines
745           short form: -e; type: hash; group: Filter
746
747           Do only this comma-separated list of storage engines.
748
749       --explain
750           group: Output
751
752           Show, but do not execute, checksum queries (disables
753           "--empty-replicate-table").
754
755       --explain-hosts
756           group: Help
757
758           Print connection information and exit.
759
760           Print out a list of hosts to which mk-table-checksum will connect,
761           with all the various connection options, and exit.  See "SPECIFYING
762           HOSTS".
763
764       --float-precision
765           type: int
766
767           Precision for "FLOAT" and "DOUBLE" number-to-string conversion.
768           Causes FLOAT and DOUBLE values to be rounded to the specified
769           number of digits after the decimal point, with the ROUND() function
770           in MySQL.  This can help avoid checksum mismatches due to different
771           floating-point representations of the same values on different
772           MySQL versions and hardware.  The default is no rounding; the
773           values are converted to strings by the CONCAT() function, and MySQL
774           chooses the string representation.  If you specify a value of 2,
775           for example, then the values 1.008 and 1.009 will be rounded to
776           1.01, and will checksum as equal.
777
778       --function
779           type: string
780
781           Hash function for checksums (FNV1A_64, MURMUR_HASH, SHA1, MD5,
782           CRC32, etc).
783
784           You can use this option to choose the cryptographic hash function
785           used for "--algorithm"=ACCUM or "--algorithm"=BIT_XOR.  The default
786           is to use "CRC32", but "MD5" and "SHA1" also work, and you can use
787           your own function, such as a compiled UDF, if you wish.  Whatever
788           function you specify is run in SQL, not in Perl, so it must be
789           available to MySQL.
790
791           The "FNV1A_64" UDF mentioned in the benchmarks is much faster than
792           "MD5".  The C++ source code is distributed with Maatkit.  It is
793           very simple to compile and install; look at the header in the
794           source code for instructions.  If it is installed, it is preferred
795           over "MD5".  You can also use the MURMUR_HASH function if you
796           compile and install that as a UDF; the source is also distributed
797           with Maatkit, and it is faster and has better distribution than
798           FNV1A_64.
799
800       --help
801           group: Help
802
803           Show help and exit.
804
805       --ignore-columns
806           type: Hash; group: Filter
807
808           Ignore this comma-separated list of columns when calculating the
809           checksum.
810
811           This option only affects the checksum when using the ACCUM or
812           BIT_XOR "--algorithm".
813
814       --ignore-databases
815           type: Hash; group: Filter
816
817           Ignore this comma-separated list of databases.
818
819       --ignore-databases-regex
820           type: string
821
822           Ignore databases whose names match this Perl regex.
823
824       --ignore-engines
825           type: Hash; default: FEDERATED,MRG_MyISAM; group: Filter
826
827           Ignore this comma-separated list of storage engines.
828
829       --ignore-tables
830           type: Hash; group: Filter
831
832           Ignore this comma-separated list of tables.
833
834           Table names may be qualified with the database name.
835
836       --ignore-tables-regex
837           type: string
838
839           Ignore tables whose names match the Perl regex.
840
841       --lock
842           Lock on master until done on slaves (implies "--slave-lag").
843
844           This option can help you to get a consistent read on a master and
845           many slaves.  If you specify this option, mk-table-checksum will
846           lock the table on the first server on the command line, which it
847           assumes to be the master.  It will keep this lock until the
848           checksums complete on the other servers.
849
850           This option isn't very useful by itself, so you probably want to
851           use "--wait" instead.
852
853           Note: if you're checksumming a slave against its master, you should
854           use "--replicate".  In that case, there's no need for locking,
855           waiting, or any of that.
856
857       --max-lag
858           type: time; group: Throttle; default: 1s
859
860           Suspend checksumming if the slave given by "--check-slave-lag"
861           lags.
862
863           This option causes mk-table-checksum to look at the slave every
864           time it's about to checksum a chunk.  If the slave's lag is greater
865           than the option's value, or if the slave isn't running (so its lag
866           is NULL), mk-table-checksum sleeps for "--check-interval" seconds
867           and then looks at the lag again.  It repeats until the slave is
868           caught up, then proceeds to checksum the chunk.
869
870           This option is useful to let you checksum data as fast as the
871           slaves can handle it, assuming the slave you directed mk-table-
872           checksum to monitor is representative of all the slaves that may be
873           replicating from this server.  It should eliminate the need for
874           "--sleep" or "--sleep-coef".
875
876       --modulo
877           type: int
878
879           Do only every Nth chunk on chunked tables.
880
881           This option lets you checksum only some chunks of the table.  This
882           is a useful alternative to "--probability" when you want to be sure
883           you get full coverage in some specified number of runs; for
884           example, you can do only every 7th chunk, and then use "--offset"
885           to rotate the modulo every day of the week.
886
887           Just like with "--probability", a table that cannot be chunked is
888           done every time.
889
890       --offset
891           type: string; default: 0
892
893           Modulo offset expression for use with "--modulo".
894
895           The argument may be an SQL expression, such as "WEEKDAY(NOW())"
896           (which returns a number from 0 through 6).  The argument is
897           evaluated by MySQL.  The result is used as follows: if chunk_num %
898           "--modulo" == "--offset", the chunk will be checksummed.
899
900       --[no]optimize-xor
901           default: yes
902
903           Optimize BIT_XOR with user variables.
904
905           This option specifies to use user variables to reduce the number of
906           times each row must be passed through the cryptographic hash
907           function when you are using the BIT_XOR algorithm.
908
909           With the optimization, the queries look like this in pseudo-code:
910
911             SELECT CONCAT(
912                BIT_XOR(SLICE_OF(@user_variable)),
913                BIT_XOR(SLICE_OF(@user_variable)),
914                ...
915                BIT_XOR(SLICE_OF(@user_variable := HASH(col1, col2... colN))));
916
917           The exact positioning of user variables and calls to the hash
918           function is determined dynamically, and will vary between MySQL
919           versions.  Without the optimization, it looks like this:
920
921             SELECT CONCAT(
922                BIT_XOR(SLICE_OF(MD5(col1, col2... colN))),
923                BIT_XOR(SLICE_OF(MD5(col1, col2... colN))),
924                ...
925                BIT_XOR(SLICE_OF(MD5(col1, col2... colN))));
926
927           The difference is the number of times all the columns must be
928           mashed together and fed through the hash function.  If you are
929           checksumming really large columns, such as BLOB or TEXT columns,
930           this might make a big difference.
931
932       --password
933           short form: -p; type: string; group: Connection
934
935           Password to use when connecting.
936
937       --pid
938           type: string
939
940           Create the given PID file.  The file contains the process ID of the
941           script.  The PID file is removed when the script exits.  Before
942           starting, the script checks if the PID file already exists.  If it
943           does not, then the script creates and writes its own PID to it.  If
944           it does, then the script checks the following: if the file contains
945           a PID and a process is running with that PID, then the script dies;
946           or, if there is no process running with that PID, then the script
947           overwrites the file with its own PID and starts; else, if the file
948           contains no PID, then the script dies.
949
950       --port
951           short form: -P; type: int; group: Connection
952
953           Port number to use for connection.
954
955       --probability
956           type: int; default: 100
957
958           Checksums will be run with this percent probability.
959
960           This is an integer between 1 and 100.  If 100, every chunk of every
961           table will certainly be checksummed.  If less than that, there is a
962           chance that some chunks of some tables will be skipped.  This is
963           useful for routine jobs designed to randomly sample bits of tables
964           without checksumming the whole server.  By default, if a table is
965           not chunkable, it will be checksummed every time even when the
966           probability is less than 100.  You can override this with
967           "--single-chunk".
968
969           See also "--modulo".
970
971       --progress
972           type: array; default: time,30
973
974           Print progress reports to STDERR.  Currently, this feature is only
975           for when "--throttle-method" waits for slaves to catch up.
976
977           The value is a comma-separated list with two parts.  The first part
978           can be percentage, time, or iterations; the second part specifies
979           how often an update should be printed, in percentage, seconds, or
980           number of iterations.
981
982       --quiet
983           short form: -q; group: Output
984
985           Do not print checksum results.
986
987       --recheck
988           Re-checksum chunks that "--replicate-check" found to be different.
989
990       --recurse
991           type: int; group: Throttle
992
993           Number of levels to recurse in the hierarchy when discovering
994           slaves.  Default is infinite.
995
996           See "--recursion-method".
997
998       --recursion-method
999           type: string
1000
1001           Preferred recursion method for discovering slaves.
1002
1003           Possible methods are:
1004
1005             METHOD       USES
1006             ===========  ================
1007             processlist  SHOW PROCESSLIST
1008             hosts        SHOW SLAVE HOSTS
1009
1010           The processlist method is preferred because SHOW SLAVE HOSTS is not
1011           reliable.  However, the hosts method is required if the server uses
1012           a non-standard port (not 3306).  Usually mk-table-checksum does the
1013           right thing and finds the slaves, but you may give a preferred
1014           method and it will be used first.  If it doesn't find any slaves,
1015           the other methods will be tried.
1016
1017       --replicate
1018           type: string
1019
1020           Replicate checksums to slaves (disallows --algorithm CHECKSUM).
1021
1022           This option enables a completely different checksum strategy for a
1023           consistent, lock-free checksum across a master and its slaves.
1024           Instead of running the checksum queries on each server, you run
1025           them only on the master.  You specify a table, fully qualified in
1026           db.table format, to insert the results into.  The checksum queries
1027           will insert directly into the table, so they will be replicated
1028           through the binlog to the slaves.
1029
1030           When the queries are finished replicating, you can run a simple
1031           query on each slave to see which tables have differences from the
1032           master.  With the "--replicate-check" option, mk-table-checksum can
1033           run the query for you to make it even easier.  See "CONSISTENT
1034           CHECKSUMS" for details.
1035
1036           If you find tables that have differences, you can use the chunk
1037           boundaries in a WHERE clause with mk-table-sync to help repair them
1038           more efficiently.  See mk-table-sync for details.
1039
1040           The table must have at least these columns: db, tbl, chunk,
1041           boundaries, this_crc, master_crc, this_cnt, master_cnt.  The table
1042           may be named anything you wish.  Here is a suggested table
1043           structure, which is automatically used for
1044           "--create-replicate-table" (MAGIC_create_replicate):
1045
1046             CREATE TABLE checksum (
1047                db         char(64)     NOT NULL,
1048                tbl        char(64)     NOT NULL,
1049                chunk      int          NOT NULL,
1050                boundaries char(100)    NOT NULL,
1051                this_crc   char(40)     NOT NULL,
1052                this_cnt   int          NOT NULL,
1053                master_crc char(40)         NULL,
1054                master_cnt int              NULL,
1055                ts         timestamp    NOT NULL,
1056                PRIMARY KEY (db, tbl, chunk)
1057             );
1058
1059           Be sure to choose an appropriate storage engine for the checksum
1060           table.  If you are checksumming InnoDB tables, for instance, a
1061           deadlock will break replication if the checksum table is non-
1062           transactional, because the transaction will still be written to the
1063           binlog.  It will then replay without a deadlock on the slave and
1064           break replication with "different error on master and slave."  This
1065           is not a problem with mk-table-checksum, it's a problem with MySQL
1066           replication, and you can read more about it in the MySQL manual.
1067
1068           This works only with statement-based replication (mk-table-checksum
1069           will switch the binlog format to STATEMENT for the duration of the
1070           session if your server uses row-based replication).
1071
1072           In contrast to running the tool against multiple servers at once,
1073           using this option eliminates the complexities of synchronizing
1074           checksum queries across multiple servers, which normally requires
1075           locking and unlocking, waiting for master binlog positions, and so
1076           on.  Thus, it disables "--lock", "--wait", and "--slave-lag" (but
1077           not "--check-slave-lag", which is a way to throttle the execution
1078           speed).
1079
1080           The checksum queries actually do a REPLACE into this table, so
1081           existing rows need not be removed before running.  However, you may
1082           wish to do this anyway to remove rows related to tables that don't
1083           exist anymore.  The "--empty-replicate-table" option does this for
1084           you.
1085
1086           Since the table must be qualified with a database (e.g.
1087           "db.checksums"), mk-table-checksum will only USE this database.
1088           This may be important if any replication options are set because it
1089           could affect whether or not changes to the table are replicated.
1090
1091           If the slaves have any --replicate-do-X or --replicate-ignore-X
1092           options, you should be careful not to checksum any databases or
1093           tables that exist on the master and not the slaves.  Changes to
1094           such tables may not normally be executed on the slaves because of
1095           the --replicate options, but the checksum queries modify the
1096           contents of the table that stores the checksums, not the tables
1097           whose data you are checksumming.  Therefore, these queries will be
1098           executed on the slave, and if the table or database you're
1099           checksumming does not exist, the queries will cause replication to
1100           fail.  For more information on replication rules, see
1101           <http://dev.mysql.com/doc/en/replication-rules.html>.
1102
1103           The table specified by "--replicate" will never be checksummed
1104           itself.
1105
1106       --replicate-check
1107           type: int
1108
1109           Check results in "--replicate" table, to the specified depth.  You
1110           must use this after you run the tool normally; it skips the
1111           checksum step and only checks results.
1112
1113           It recursively finds differences recorded in the table given by
1114           "--replicate".  It recurses to the depth you specify: 0 is no
1115           recursion (check only the server you specify), 1 is check the
1116           server and its slaves, 2 is check the slaves of its slaves, and so
1117           on.
1118
1119           It finds differences by running the query shown in "CONSISTENT
1120           CHECKSUMS", and prints results, then exits after printing.  This is
1121           just a convenient way of running the query so you don't have to do
1122           it manually.
1123
1124           The output is one informational line per slave host, followed by
1125           the results of the query, if any.  If "--quiet" is specified, there
1126           is no output.  If there are no differences between the master and
1127           any slave, there is no output.  If any slave has chunks that differ
1128           from the master, mk-table-checksum's exit status is 1; otherwise it
1129           is 0.
1130
1131           This option makes "mk-table-checksum" look for slaves by running
1132           "SHOW PROCESSLIST".  If it finds connections that appear to be from
1133           slaves, it derives connection information for each slave with the
1134           same default-and-override method described in "SPECIFYING HOSTS".
1135
1136           If "SHOW PROCESSLIST" doesn't return any rows, "mk-table-checksum"
1137           looks at "SHOW SLAVE HOSTS" instead.  The host and port, and user
1138           and password if available, from "SHOW SLAVE HOSTS" are combined
1139           into a DSN and used as the argument.  This requires slaves to be
1140           configured with "report-host", "report-port" and so on.
1141
1142           This requires the @@SERVER_ID system variable, so it works only on
1143           MySQL 3.23.26 or newer.
1144
1145       --replicate-database
1146           type: string
1147
1148           "USE" only this database with "--replicate".  By default, mk-table-
1149           checksum executes USE to set its default database to the database
1150           that contains the table it's currently working on.  It changes its
1151           default database as it works on different tables.  This is is a
1152           best effort to avoid problems with replication filters such as
1153           binlog_ignore_db and replicate_ignore_db.  However, replication
1154           filters can create a situation where there simply is no one right
1155           way to do things.  Some statements might not be replicated, and
1156           others might cause replication to fail on the slaves.  In such
1157           cases, it is up to the user to specify a safe default database.
1158           This option specifies a default database that mk-table-checksum
1159           selects with USE, and never changes afterwards.  See also
1160           <L"--[no]check-replication-filters">.
1161
1162       --resume
1163           type: string
1164
1165           Resume checksum using given output file from a previously
1166           interrupted run.
1167
1168           The given output file should be the literal output from a previous
1169           run of "mk-table-checksum".  For example:
1170
1171              mk-table-checksum host1 host2 -C 100 > checksum_results.txt
1172              mk-table-checksum host1 host2 -C 100 --resume checksum_results.txt
1173
1174           The command line options given to the first run and the resumed run
1175           must be identical (except, of course, for --resume).  If they are
1176           not, the result will be unpredictable and probably wrong.
1177
1178           "--resume" does not work with "--replicate"; for that, use
1179           "--resume-replicate".
1180
1181       --resume-replicate
1182           Resume "--replicate".
1183
1184           This option resumes a previous checksum operation using
1185           "--replicate".  It is like "--resume" but does not require an
1186           output file.  Instead, it uses the checksum table given to
1187           "--replicate" to determine where to resume the checksum operation.
1188
1189       --save-since
1190           When "--arg-table" and "--since" are given, save the current
1191           "--since" value into that table's "since" column after
1192           checksumming.  In this way you can incrementally checksum tables by
1193           starting where the last one finished.
1194
1195           The value to be saved could be the current timestamp, or it could
1196           be the maximum existing value of the column given by
1197           "--since-column".  It depends on what options are in effect.  See
1198           the description of "--since" to see how timestamps are different
1199           from ordinary values.
1200
1201       --schema
1202           Checksum "SHOW CREATE TABLE" instead of table data.
1203
1204       --separator
1205           type: string; default: #
1206
1207           The separator character used for CONCAT_WS().
1208
1209           This character is used to join the values of columns when
1210           checksumming with "--algorithm" of BIT_XOR or ACCUM.
1211
1212       --set-vars
1213           type: string; default: wait_timeout=10000; group: Connection
1214
1215           Set these MySQL variables.  Immediately after connecting to MySQL,
1216           this string will be appended to SET and executed.
1217
1218       --since
1219           type: string
1220
1221           Checksum only data newer than this value.
1222
1223           If the table is chunk-able or nibble-able, this value will apply to
1224           the first column of the chunked or nibbled index.
1225
1226           This is not too different to "--where", but instead of universally
1227           applying a WHERE clause to every table, it selectively finds the
1228           right column to use and applies it only if such a column is found.
1229           See also "--since-column".
1230
1231           The argument may be an expression, which is evaluated by MySQL.
1232           For example, you can specify "CURRENT_DATE - INTERVAL 7 DAY" to get
1233           the date of one week ago.
1234
1235           A special bit of extra magic: if the value is temporal (looks like
1236           a date or datetime), then the table is checksummed only if the
1237           create time (or last modified time, for tables that report the last
1238           modified time, such as MyISAM tables) is newer than the value.  In
1239           this sense it's not applied as a WHERE clause at all.
1240
1241       --since-column
1242           type: string
1243
1244           The column name to be used for "--since".
1245
1246           The default is for the tool to choose the best one automatically.
1247           If you specify a value, that will be used if possible; otherwise
1248           the best auto-determined one; otherwise none.  If the column
1249           doesn't exist in the table, it is just ignored.
1250
1251       --single-chunk
1252           Permit skipping with "--probability" if there is only one chunk.
1253
1254           Normally, if a table isn't split into many chunks, it will always
1255           be checksummed regardless of "--probability".  This setting lets
1256           the probabilistic behavior apply to tables that aren't divided into
1257           chunks.
1258
1259       --slave-lag
1260           group: Output
1261
1262           Report replication delay on the slaves.
1263
1264           If this option is enabled, the output will show how many seconds
1265           behind the master each slave is.  This can be useful when you want
1266           a fast, parallel, non-blocking checksum, and you know your slaves
1267           might be delayed relative to the master.  You can inspect the
1268           results and make an educated guess whether any discrepancies on the
1269           slave are due to replication delay instead of corrupt data.
1270
1271           If you're using "--replicate", a slave that is delayed relative to
1272           the master does not invalidate the correctness of the results, so
1273           this option is disabled.
1274
1275       --sleep
1276           type: int; group: Throttle
1277
1278           Sleep time between checksums.
1279
1280           If this option is specified, mk-table-checksum will sleep the
1281           specified number of seconds between checksums.  That is, it will
1282           sleep between every table, and if you specify "--chunk-size", it
1283           will also sleep between chunks.
1284
1285           This is a very crude way to throttle checksumming; see
1286           "--sleep-coef" and "--check-slave-lag" for techniques that permit
1287           greater control.
1288
1289       --sleep-coef
1290           type: float; group: Throttle
1291
1292           Calculate "--sleep" as a multiple of the last checksum time.
1293
1294           If this option is specified, mk-table-checksum will sleep the
1295           amount of time elapsed during the previous checksum, multiplied by
1296           the specified coefficient.  This option is ignored if "--sleep" is
1297           specified.
1298
1299           This is a slightly more sophisticated way to throttle checksum
1300           speed: sleep a varying amount of time between chunks, depending on
1301           how long the chunks are taking.  Even better is to use
1302           "--check-slave-lag" if you're checksumming master/slave
1303           replication.
1304
1305       --socket
1306           short form: -S; type: string; group: Connection
1307
1308           Socket file to use for connection.
1309
1310       --tab
1311           group: Output
1312
1313           Print tab-separated output, not column-aligned output.
1314
1315       --tables
1316           short form: -t; type: hash; group: Filter
1317
1318           Do only this comma-separated list of tables.
1319
1320           Table names may be qualified with the database name.
1321
1322       --tables-regex
1323           type: string
1324
1325           Only checksum tables whose names match this Perl regex.
1326
1327       --throttle-method
1328           type: string; default: none; group: Throttle
1329
1330           Throttle checksumming when doing "--replicate".
1331
1332           At present there is only one method: "slavelag".  When
1333           "--replicate" is used, mk-table-checksum automatically sets
1334           "--throttle-method" to "slavelag" and discovers every slave and
1335           throttles checksumming if any slave lags more than "--max-lag".
1336           Specify "-throttle-method none" to disable this behavior
1337           completely, or specify "--check-slave-lag" and mk-table-checksum
1338           will only check that slave.
1339
1340           See also "--recurse" and "--recursion-method".
1341
1342       --trim
1343           Trim "VARCHAR" columns (helps when comparing 4.1 to >= 5.0).
1344
1345           This option adds a TRIM() to "VARCHAR" columns in "BIT_XOR" and
1346           "ACCUM" modes.
1347
1348           This is useful when you don't care about the trailing space
1349           differences between MySQL versions which vary in their handling of
1350           trailing spaces. MySQL 5.0 and later all retain trailing spaces in
1351           "VARCHAR", while previous versions would remove them.
1352
1353       --unchunkable-tables
1354           group: Safety
1355
1356           Checksum tables that cannot be chunked when "--chunk-size" is
1357           specified.
1358
1359           By default mk-table-checksum will not checksum a table that cannot
1360           be chunked when "--chunk-size" is specified because this might
1361           result in a huge, non-chunkable table being checksummed in one
1362           huge, memory-intensive chunk.
1363
1364           Specifying this option allows checksumming tables that cannot be
1365           chunked.  Be careful when using this option!  Make sure any non-
1366           chunkable tables are not so large that they will cause the tool to
1367           consume too much memory or CPU.
1368
1369           See also "--chunk-size-limit".
1370
1371       --[no]use-index
1372           default: yes
1373
1374           Add FORCE INDEX hints to SQL statements.
1375
1376           By default "mk-table-checksum" adds an index hint ("FORCE INDEX"
1377           for MySQL v4.0.9 and newer, "USE INDEX" for older MySQL versions)
1378           to each SQL statement to coerce MySQL into using the
1379           "--chunk-index" (whether the index is specified by the option or
1380           auto-detected).  Specifying "--no-use-index" causes
1381           "mk-table-checksum" to omit index hints.
1382
1383       --user
1384           short form: -u; type: string; group: Connection
1385
1386           User for login if not current user.
1387
1388       --[no]verify
1389           default: yes
1390
1391           Verify checksum compatibility across servers.
1392
1393           This option runs a trivial checksum on all servers to ensure they
1394           have compatible CONCAT_WS() and cryptographic hash functions.
1395
1396           Versions of MySQL before 4.0.14 will skip empty strings and NULLs
1397           in CONCAT_WS, and others will only skip NULLs.  The two kinds of
1398           behavior will produce different results if you have any columns
1399           containing the empty string in your table.  If you know you don't
1400           (for instance, all columns are integers), you can safely disable
1401           this check and you will get a reliable checksum even on servers
1402           with different behavior.
1403
1404       --version
1405           group: Help
1406
1407           Show version and exit.
1408
1409       --wait
1410           short form: -w; type: time
1411
1412           Wait this long for slaves to catch up to their master (implies
1413           "--lock" "--slave-lag").
1414
1415           Note: the best way to verify that a slave is in sync with its
1416           master is to use "--replicate" instead.  The "--wait" option is
1417           really only useful if you're trying to compare masters and slaves
1418           without using "--replicate", which is possible but complex and less
1419           efficient in some ways.
1420
1421           This option helps you get a consistent checksum across a master
1422           server and its slaves.  It combines locking and waiting to
1423           accomplish this.  First it locks the table on the master (the first
1424           server on the command line).  Then it finds the master's binlog
1425           position.  Checksums on slaves will be deferred until they reach
1426           the same binlog position.
1427
1428           The argument to the option is the number of seconds to wait for the
1429           slaves to catch up to the master.  It is actually the argument to
1430           MASTER_POS_WAIT().  If the slaves don't catch up to the master
1431           within this time, they will unblock and go ahead with the checksum.
1432           You can tell whether this happened by examining the STAT column in
1433           the output, which is the return value of MASTER_POS_WAIT().
1434
1435       --where
1436           type: string
1437
1438           Do only rows matching this "WHERE" clause (disallows "--algorithm"
1439           CHECKSUM).
1440
1441           You can use this option to limit the checksum to only part of the
1442           table.  This is particularly useful if you have append-only tables
1443           and don't want to constantly re-check all rows; you could run a
1444           daily job to just check yesterday's rows, for instance.
1445
1446           This option is much like the -w option to mysqldump.  Do not
1447           specify the WHERE keyword.  You may need to quote the value.  Here
1448           is an example:
1449
1450             mk-table-checksum --where "foo=bar"
1451
1452       --[no]zero-chunk
1453           default: yes
1454
1455           Add a chunk for rows with zero or zero-equivalent values.  The only
1456           has an effect when "--chunk-size" is specified.  The purpose of the
1457           zero chunk is to capture a potentially large number of zero values
1458           that would imbalance the size of the first chunk.  For example, if
1459           a lot of negative numbers were inserted into an unsigned integer
1460           column causing them to be stored as zeros, then these zero values
1461           are captured by the zero chunk instead of the first chunk and all
1462           its non-zero values.
1463

DSN OPTIONS

1465       These DSN options are used to create a DSN.  Each option is given like
1466       "option=value".  The options are case-sensitive, so P and p are not the
1467       same option.  There cannot be whitespace before or after the "=" and if
1468       the value contains whitespace it must be quoted.  DSN options are
1469       comma-separated.  See the maatkit manpage for full details.
1470
1471       •   A
1472
1473           dsn: charset; copy: yes
1474
1475           Default character set.
1476
1477       •   D
1478
1479           dsn: database; copy: yes
1480
1481           Default database.
1482
1483       •   F
1484
1485           dsn: mysql_read_default_file; copy: yes
1486
1487           Only read default options from the given file
1488
1489       •   h
1490
1491           dsn: host; copy: yes
1492
1493           Connect to host.
1494
1495       •   p
1496
1497           dsn: password; copy: yes
1498
1499           Password to use when connecting.
1500
1501       •   P
1502
1503           dsn: port; copy: yes
1504
1505           Port number to use for connection.
1506
1507       •   S
1508
1509           dsn: mysql_socket; copy: yes
1510
1511           Socket file to use for connection.
1512
1513       •   u
1514
1515           dsn: user; copy: yes
1516
1517           User for login if not current user.
1518

DOWNLOADING

1520       You can download Maatkit from Google Code at
1521       <http://code.google.com/p/maatkit/>, or you can get any of the tools
1522       easily with a command like the following:
1523
1524          wget http://www.maatkit.org/get/toolname
1525          or
1526          wget http://www.maatkit.org/trunk/toolname
1527
1528       Where "toolname" can be replaced with the name (or fragment of a name)
1529       of any of the Maatkit tools.  Once downloaded, they're ready to run; no
1530       installation is needed.  The first URL gets the latest released version
1531       of the tool, and the second gets the latest trunk code from Subversion.
1532

ENVIRONMENT

1534       The environment variable "MKDEBUG" enables verbose debugging output in
1535       all of the Maatkit tools:
1536
1537          MKDEBUG=1 mk-....
1538

SYSTEM REQUIREMENTS

1540       You need Perl, DBI, DBD::mysql, and some core packages that ought to be
1541       installed in any reasonably new version of Perl.
1542

BUGS

1544       For a list of known bugs see
1545       <http://www.maatkit.org/bugs/mk-table-checksum>.
1546
1547       Please use Google Code Issues and Groups to report bugs or request
1548       support: <http://code.google.com/p/maatkit/>.  You can also join
1549       #maatkit on Freenode to discuss Maatkit.
1550
1551       Please include the complete command-line used to reproduce the problem
1552       you are seeing, the version of all MySQL servers involved, the complete
1553       output of the tool when run with "--version", and if possible,
1554       debugging output produced by running with the "MKDEBUG=1" environment
1555       variable.
1556

COPYRIGHT, LICENSE AND WARRANTY

1558       This program is copyright 2007-2011 Baron Schwartz.  Feedback and
1559       improvements are welcome.
1560
1561       THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
1562       WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
1563       MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
1564
1565       This program is free software; you can redistribute it and/or modify it
1566       under the terms of the GNU General Public License as published by the
1567       Free Software Foundation, version 2; OR the Perl Artistic License.  On
1568       UNIX and similar systems, you can issue `man perlgpl' or `man
1569       perlartistic' to read these licenses.
1570
1571       You should have received a copy of the GNU General Public License along
1572       with this program; if not, write to the Free Software Foundation, Inc.,
1573       59 Temple Place, Suite 330, Boston, MA  02111-1307  USA.
1574

SEE ALSO

1576       See also mk-checksum-filter and mk-table-sync.
1577

AUTHOR

1579       Baron "Xaprb" Schwartz
1580

ABOUT MAATKIT

1582       This tool is part of Maatkit, a toolkit for power users of MySQL.
1583       Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
1584       primary code contributors.  Both are employed by Percona.  Financial
1585       support for Maatkit development is primarily provided by Percona and
1586       its clients.
1587

ACKNOWLEDGMENTS

1589       This is an incomplete list.  My apologies for omissions or
1590       misspellings.
1591
1592       Claus Jeppesen, Francois Saint-Jacques, Giuseppe Maxia, Heikki Tuuri,
1593       James Briggs, Martin Friebe, Sergey Zhuravlev,
1594

VERSION

1596       This manual page documents Ver 1.2.23 Distrib 7540 $Revision: 7527 $.
1597
1598
1599
1600perl v5.36.0                      2023-01-19              MK-TABLE-CHECKSUM(1)
Impressum