1MK-PARALLEL-RESTORE(1)User Contributed Perl DocumentationMK-PARALLEL-RESTORE(1)
2
3
4

NAME

6       mk-parallel-restore - (DEPRECATED) Load files into MySQL in parallel.
7

SYNOPSIS

9       This tool is deprecated because after several complete redesigns, we
10       concluded that Perl is the wrong technology for this task.  Read
11       "RISKS" before you use it, please.  It remains useful for some people
12       who we know aren't depending on it in production, and therefore we are
13       not removing it from the distribution.
14
15       Usage: mk-parallel-restore [OPTION...] [DSN] PATH [PATH...]
16
17       mk-parallel-restore description loads files into MySQL in parallel to
18       make some type of data loads faster.  IT IS NOT A BACKUP TOOL!
19
20         mk-parallel-restore /path/to/files
21         mk-parallel-restore --tab /path/to/files
22

RISKS

24       The following section is included to inform users about the potential
25       risks, whether known or unknown, of using this tool.  The two main
26       categories of risks are those created by the nature of the tool (e.g.
27       read-only tools vs. read-write tools) and those created by bugs.
28
29       mk-parallel-restore is not a backup program!  It is only for fast data
30       imports, for purposes such as loading test data into a system quickly.
31       Do not use mk-parallel-restore for backups.  mk-parallel-restore
32       inserts data unless you use the "--dry-run" option.
33
34       At the time of this release, there is a bug that prevents huge
35       statements from being printed when an error is encountered, a bug
36       applying "--[no]foreign-key-checks" when truncating tables, and a bug
37       with LOAD DATA LOCAL.
38
39       The authoritative source for updated information is always the online
40       issue tracking system.  Issues that affect this tool will be marked as
41       such.  You can see a list of such issues at the following URL:
42       <http://www.maatkit.org/bugs/mk-parallel-restore>.
43
44       See also "BUGS" for more information on filing bugs and getting help.
45

DESCRIPTION

47       mk-parallel-restore is a way to load SQL or delimited-file dumps into
48       MySQL in parallel at high speed.  It is especially designed for
49       restoring files dumped by mk-parallel-dump.  It automatically detects
50       whether a file contains SQL or delimited data from the filename
51       extension, and either shells out to "mysql" or executes "LOAD DATA
52       INFILE" with the file.  On UNIX-like systems, it will even make a FIFO
53       to decompress gzipped files for "LOAD DATA INFILE".
54
55       By default it discovers all files in the directory you specify on the
56       command line.  It uses the file's parent directory as the database name
57       and the file's name (up to the first dot) as the table name.  It can
58       deal with files named like the following:
59
60         dir/tbl.sql
61         dir/tbl.txt
62         dir/tbl.csv
63
64       It is also happy with files that look like this, where "EXT" is one of
65       the extensions just listed.
66
67         dir/tbl.EXT.000
68         dir/tbl.EXT.000.gz
69
70       By default, it loads "SQL" files first, if they exist, then loads "CSV"
71       or "TXT" files next, in order of the numbers in the filename extension
72       as just shown.  This makes it easy for you to reload a table's
73       definition followed by its data, in case you dumped them into separate
74       files (as happens with "mysqldump"'s "--tab" option).  See mk-parallel-
75       dump for details on how data is dumped.
76
77       Exit status is 0 if everything went well, 1 if any files failed, and
78       any other value indicates an internal error.
79

OUTPUT

81       Output depends on verbosity.  When "--dry-run" is given, output
82       includes commands that would be executed.
83
84       When "--verbose" is 0, there is normally no output unless there's an
85       error.
86
87       When "--verbose" is 1, there is one line of output for the entire job,
88       showing how many tables were processed, how many files were loaded with
89       what status, how much time elapsed, and how much time the parallel load
90       jobs added up to.  If any files were skipped, the filenames are printed
91       to the output.
92
93       When "--verbose" is 2, there's one line of output per table, showing
94       extra data such as how many threads were running when each table
95       finished loading:
96
97         DATABASE TABLE            FILES  TIME STATUS THREADS
98         sakila   language             2  0.07      0       2
99         sakila   film_actor           2  0.07      0       2
100         sakila   actor                2  0.06      0       2
101         sakila   payment              2  0.07      0       2
102         sakila   transport_backup     2  0.05      0       2
103         sakila   country              2  0.08      0       2
104         sakila   film                 2  0.05      0       2
105         sakila   rental               2  0.07      0       2
106

SPEED OF PARALLEL LOADING

108       User-contributed benchmarks are welcome.  See
109       <http://www.paragon-cs.com/wordpress/?p=52> for one user's experiences.
110

OPTIONS

112       This tool accepts additional command-line arguments.  Refer to the
113       "SYNOPSIS" and usage information for details.
114
115       --ask-pass
116           Prompt for a password when connecting to MySQL.
117
118       --[no]atomic-resume
119           default: yes
120
121           Treat chunks as atomic when resuming restore.
122
123           By default "mk-parallel-restore" resumes restoration from the first
124           chunk that is missing all its rows.  For dumps of transactionally-
125           safe tables (InnoDB), it cannot happen that a chunk is only
126           partially restored.  Therefore, restoring from the first missing
127           chunk is safe.
128
129           However, for dumps of non-transactionally safe tables, it is
130           possible that a chunk can be only partially restored.  In such
131           cases, the chunk will wrongly appear to be fully restored.
132           Therefore, you must specify "--no-atomic-resume" so that the
133           partially restored chunk is fully restored.
134
135       --base-dir
136           type: string
137
138           Directory where FIFO files will be created.
139
140       --[no]biggest-first
141           default: yes
142
143           Restore the biggest tables first for highest concurrency.
144
145       --[no]bin-log
146           default: yes
147
148           Enable binary logging ("SET SQL_LOG_BIN=1").
149
150           Restore operations are replicated by default (SQL_LOG_BIN=1) except
151           for "--tab" restores which are not replicated by default
152           (SQL_LOG_BIN=0).  This prevents large loads from being logged to
153           the server's binary log.
154
155           The value given on the command line overrides the defaults.
156           Therefore, specifying "--bin-log" with "--tab" will allow the
157           "--tab" restore to replicate.
158
159       --bulk-insert-buffer-size
160           type: int
161
162           Set bulk_insert_buffer_size before each "LOAD DATA INFILE".
163
164           Has no effect without "--tab".
165
166       --charset
167           short form: -A; type: string; default: BINARY
168
169           Sets the connection, database, and "LOAD DATA INFILE" character
170           set.
171
172           The default is "BINARY", which is the safest value to use for "LOAD
173           DATA INFILE".  Has no effect without "--tab".
174
175       --[no]commit
176           default: yes
177
178           Commit after each file.
179
180       --config
181           type: Array
182
183           Read this comma-separated list of config files; if specified, this
184           must be the first option on the command line.
185
186       --create-databases
187           Create databases if they don't exist.
188
189       --[no]create-tables
190           default: yes
191
192           Create tables.
193
194           See also "--[no]drop-tables".
195
196       --csv
197           Files are in CSV format (implies "--tab").
198
199           Changes "--tab" options so the following "LOAD DATA INFILE"
200           statement is used:
201
202              LOAD DATA INFILE <filename> INTO TABLE <table>
203              FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
204              LINES TERMINATED BY '\n';
205
206       --database
207           short form: -D; type: string
208
209           Load all files into this database.
210
211           Overrides the database which is normally specified by the directory
212           in which the files live.  Does not specify a default database for
213           the connection.
214
215       --databases
216           short form: -d; type: hash
217
218           Restore only this comma-separated list of databases.
219
220       --databases-regex
221           type: string
222
223           Restore only databases whose names match this regex.
224
225       --decompress
226           type: string; default: gzip -d -c
227
228           Command used to decompress and print .gz files to STDOUT (like
229           zcat).
230
231       --defaults-file
232           short form: -F; type: string
233
234           Only read mysql options from the given file.  You must give an
235           absolute pathname.
236
237       --[no]disable-keys
238           default: yes
239
240           Execute "ALTER TABLE DISABLE KEYS" before each MyISAM table.
241
242           This option only works with MyISAM tables.
243
244       --[no]drop-tables
245           default: yes
246
247           Execute "DROP TABLE IF EXISTS" before creating each table.
248
249       --dry-run
250           Print commands instead of executing them.
251
252       --fast-index
253           Do InnoDB plugin fast index creation by restoring secondary indexes
254           after data.
255
256           This option only works with InnoDB tables and the InnoDB plugin.
257
258       --[no]fifo
259           default: yes
260
261           Stream files into a FIFO for "--tab".
262
263           Load compressed tab-separated files by piping them into a FIFO and
264           using the FIFO with "LOAD DATA INFILE", instead of by decompressing
265           the files on disk.  Sets "--umask" to 0.
266
267       --[no]foreign-key-checks
268           default: yes
269
270           Set "FOREIGN_KEY_CHECKS=1" before "LOAD DATA INFILE".
271
272       --help
273           Show help and exit.
274
275       --host
276           short form: -h; type: string
277
278           Connect to host.
279
280       --ignore
281           Adds the "IGNORE" modifier to "LOAD DATA INFILE".
282
283       --ignore-databases
284           type: Hash
285
286           Ignore this comma-separated list of databases.
287
288       --ignore-tables
289           type: Hash
290
291           Ignore this comma-separated list of table names.
292
293           Table names may be qualified with the database name.
294
295       --local
296           Uses the "LOCAL" option to "LOAD DATA INFILE".
297
298           If you enable this option, the files are read locally by the client
299           library, not by the server.
300
301       --[no]lock-tables
302           Lock tables before "LOAD DATA INFILE".
303
304       --[no]no-auto-value-on-0
305           default: yes
306
307           Set SQL "NO_AUTO_VALUE_ON_ZERO".
308
309       --only-empty-databases
310           Restore only to empty databases.
311
312           By default mk-parallel-restore will restore tables into a database
313           so long as it exists (or is created by "--create-databases").  This
314           option is a safety feature that prevents any tables from being
315           restored into a database that already has tables even if those
316           tables are the same ones being restored.  If you specify this
317           option, every database must have zero tables.
318
319           This implicitly disables "--[no]resume".  "--create-databases" will
320           work if the database doesn't already exist and it creates it.
321
322           The databases are checked after all filters ("--databases", etc.)
323
324       --password
325           short form: -p; type: string
326
327           Password to use when connecting.
328
329       --pid
330           type: string
331
332           Create the given PID file.  The file contains the process ID of the
333           script.  The PID file is removed when the script exits.  Before
334           starting, the script checks if the PID file already exists.  If it
335           does not, then the script creates and writes its own PID to it.  If
336           it does, then the script checks the following: if the file contains
337           a PID and a process is running with that PID, then the script dies;
338           or, if there is no process running with that PID, then the script
339           overwrites the file with its own PID and starts; else, if the file
340           contains no PID, then the script dies.
341
342       --port
343           short form: -P; type: int
344
345           Port number to use for connection.
346
347       --progress
348           Display progress messages.
349
350           Progress is displayed each time a table finishes loading.  Progress
351           is calculated by measuring the size of each file to be loaded, and
352           assuming all bytes are created equal.  The output is the completed
353           and total size, the percent completed, estimated time remaining,
354           and estimated completion time.
355
356       --quiet
357           short form: -q
358
359           Sets "--verbose" to 0.
360
361       --replace
362           Adds the "REPLACE" modifier to "LOAD DATA INFILE".
363
364       --[no]resume
365           default: yes
366
367           Resume the restore from a previously incomplete restore.
368
369           By default, "mk-parallel-restore" checks each table's chunks for
370           existing rows and restores only from the point where a previous
371           restore stopped.  Specify --no-resume to disable restore resumption
372           and fully restores every table.
373
374           Restore resumption does not work with tab-separated files or dumps
375           that were not chunked.
376
377       --set-vars
378           type: string; default: wait_timeout=10000
379
380           Set these MySQL variables.  Immediately after connecting to MySQL,
381           this string will be appended to SET and executed.
382
383       --socket
384           short form: -S; type: string
385
386           Socket file to use for connection.
387
388       --tab
389           Load tab-separated files with "LOAD DATA INFILE".
390
391           This is similar to what "mysqlimport" does, but more flexible.
392
393           The following options are enabled unless they are specifically
394           disabled on the command line:
395
396              L<"--commit">
397              L<"--[no]disable-keys">
398              L<"--[no]no-auto-value-on-0">
399
400           And the following options are disabled ("--no-bin-log", etc.)
401           unless they are specifically enabled on the command line:
402
403              L<"--[no]bin-log">
404              L<"--[no]unique-checks">
405              L<"--[no]foreign-key-checks">
406
407       --tables
408           short form: -t; type: hash
409
410           Restore only this comma-separated list of table names.
411
412           Table names may be qualified with the database name.
413
414       --tables-regex
415           type: string
416
417           Restore only tables whose names match this regex.
418
419       --threads
420           type: int; default: 2
421
422           Specifies the number of parallel processes to run.
423
424           The default is 2 (this is mk-parallel-restore after all -- 1 is not
425           parallel).  On GNU/Linux machines, the default is the number of
426           times 'processor' appears in /proc/cpuinfo.  On Windows, the
427           default is read from the environment.  In any case, the default is
428           at least 2, even when there's only a single processor.
429
430       --truncate
431           Run "TRUNCATE TABLE" before "LOAD DATA INFILE".
432
433           This will delete all rows from a table before loading the first
434           tab-delimited file into it.
435
436       --umask
437           type: string
438
439           Set the program's "umask" to this octal value.
440
441           This is useful when you want created files (such as FIFO files) to
442           be readable or writable by other users (for example, the MySQL
443           server itself).
444
445       --[no]unique-checks
446           default: yes
447
448           Set "UNIQUE_CHECKS=1" before "LOAD DATA INFILE".
449
450       --user
451           short form: -u; type: string
452
453           User for login if not current user.
454
455       --verbose
456           short form: -v; cumulative: yes; default: 1
457
458           Verbosity; can specify multiple times.
459
460           Repeatedly specifying it increments the verbosity.  Default is 1 if
461           not specified.  See "OUTPUT".
462
463       --version
464           Show version and exit.
465
466       --wait
467           short form: -w; type: time; default: 5m
468
469           Wait limit when server is down.
470
471           If the MySQL server crashes during loading, waits until the server
472           comes back and then continues with the rest of the files.
473           "mk-parallel-restore" will check the server every second until this
474           time is exhausted, at which point it will give up and exit.
475

DSN OPTIONS

477       These DSN options are used to create a DSN.  Each option is given like
478       "option=value".  The options are case-sensitive, so P and p are not the
479       same option.  There cannot be whitespace before or after the "=" and if
480       the value contains whitespace it must be quoted.  DSN options are
481       comma-separated.  See the maatkit manpage for full details.
482
483       ·   A
484
485           dsn: charset; copy: yes
486
487           Default character set.
488
489       ·   D
490
491           dsn: database; copy: yes
492
493           Default database.
494
495       ·   F
496
497           dsn: mysql_read_default_file; copy: yes
498
499           Only read default options from the given file
500
501       ·   h
502
503           dsn: host; copy: yes
504
505           Connect to host.
506
507       ·   p
508
509           dsn: password; copy: yes
510
511           Password to use when connecting.
512
513       ·   P
514
515           dsn: port; copy: yes
516
517           Port number to use for connection.
518
519       ·   S
520
521           dsn: mysql_socket; copy: yes
522
523           Socket file to use for connection.
524
525       ·   u
526
527           dsn: user; copy: yes
528
529           User for login if not current user.
530

DOWNLOADING

532       You can download Maatkit from Google Code at
533       <http://code.google.com/p/maatkit/>, or you can get any of the tools
534       easily with a command like the following:
535
536          wget http://www.maatkit.org/get/toolname
537          or
538          wget http://www.maatkit.org/trunk/toolname
539
540       Where "toolname" can be replaced with the name (or fragment of a name)
541       of any of the Maatkit tools.  Once downloaded, they're ready to run; no
542       installation is needed.  The first URL gets the latest released version
543       of the tool, and the second gets the latest trunk code from Subversion.
544

ENVIRONMENT

546       The environment variable "MKDEBUG" enables verbose debugging output in
547       all of the Maatkit tools:
548
549          MKDEBUG=1 mk-....
550

SYSTEM REQUIREMENTS

552       You need Perl, DBI, DBD::mysql, and some core packages that ought to be
553       installed in any reasonably new version of Perl.
554

BUGS

556       For a list of known bugs see
557       <http://www.maatkit.org/bugs/mk-parallel-restore>.
558
559       Please use Google Code Issues and Groups to report bugs or request
560       support: <http://code.google.com/p/maatkit/>.  You can also join
561       #maatkit on Freenode to discuss Maatkit.
562
563       Please include the complete command-line used to reproduce the problem
564       you are seeing, the version of all MySQL servers involved, the complete
565       output of the tool when run with "--version", and if possible,
566       debugging output produced by running with the "MKDEBUG=1" environment
567       variable.
568

COPYRIGHT, LICENSE AND WARRANTY

570       This program is copyright 2007-2011 Baron Schwartz.  Feedback and
571       improvements are welcome.
572
573       THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
574       WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
575       MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
576
577       This program is free software; you can redistribute it and/or modify it
578       under the terms of the GNU General Public License as published by the
579       Free Software Foundation, version 2; OR the Perl Artistic License.  On
580       UNIX and similar systems, you can issue `man perlgpl' or `man
581       perlartistic' to read these licenses.
582
583       You should have received a copy of the GNU General Public License along
584       with this program; if not, write to the Free Software Foundation, Inc.,
585       59 Temple Place, Suite 330, Boston, MA  02111-1307  USA.
586

SEE ALSO

588       See also mk-parallel-dump.
589

AUTHOR

591       Baron Schwartz
592

ABOUT MAATKIT

594       This tool is part of Maatkit, a toolkit for power users of MySQL.
595       Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
596       primary code contributors.  Both are employed by Percona.  Financial
597       support for Maatkit development is primarily provided by Percona and
598       its clients.
599

VERSION

601       This manual page documents Ver 1.0.24 Distrib 7540 $Revision: 7477 $.
602
603
604
605perl v5.30.1                      2020-01-29            MK-PARALLEL-RESTORE(1)
Impressum