1MK-PARALLEL-RESTORE(1)User Contributed Perl DocumentationMK-PARALLEL-RESTORE(1)
2
3
4
6 mk-parallel-restore - (DEPRECATED) Load files into MySQL in parallel.
7
9 This tool is deprecated because after several complete redesigns, we
10 concluded that Perl is the wrong technology for this task. Read
11 "RISKS" before you use it, please. It remains useful for some people
12 who we know aren't depending on it in production, and therefore we are
13 not removing it from the distribution.
14
15 Usage: mk-parallel-restore [OPTION...] [DSN] PATH [PATH...]
16
17 mk-parallel-restore description loads files into MySQL in parallel to
18 make some type of data loads faster. IT IS NOT A BACKUP TOOL!
19
20 mk-parallel-restore /path/to/files
21 mk-parallel-restore --tab /path/to/files
22
24 The following section is included to inform users about the potential
25 risks, whether known or unknown, of using this tool. The two main
26 categories of risks are those created by the nature of the tool (e.g.
27 read-only tools vs. read-write tools) and those created by bugs.
28
29 mk-parallel-restore is not a backup program! It is only for fast data
30 imports, for purposes such as loading test data into a system quickly.
31 Do not use mk-parallel-restore for backups. mk-parallel-restore
32 inserts data unless you use the "--dry-run" option.
33
34 At the time of this release, there is a bug that prevents huge
35 statements from being printed when an error is encountered, a bug
36 applying "--[no]foreign-key-checks" when truncating tables, and a bug
37 with LOAD DATA LOCAL.
38
39 The authoritative source for updated information is always the online
40 issue tracking system. Issues that affect this tool will be marked as
41 such. You can see a list of such issues at the following URL:
42 <http://www.maatkit.org/bugs/mk-parallel-restore>.
43
44 See also "BUGS" for more information on filing bugs and getting help.
45
47 mk-parallel-restore is a way to load SQL or delimited-file dumps into
48 MySQL in parallel at high speed. It is especially designed for
49 restoring files dumped by mk-parallel-dump. It automatically detects
50 whether a file contains SQL or delimited data from the filename
51 extension, and either shells out to "mysql" or executes "LOAD DATA
52 INFILE" with the file. On UNIX-like systems, it will even make a FIFO
53 to decompress gzipped files for "LOAD DATA INFILE".
54
55 By default it discovers all files in the directory you specify on the
56 command line. It uses the file's parent directory as the database name
57 and the file's name (up to the first dot) as the table name. It can
58 deal with files named like the following:
59
60 dir/tbl.sql
61 dir/tbl.txt
62 dir/tbl.csv
63
64 It is also happy with files that look like this, where "EXT" is one of
65 the extensions just listed.
66
67 dir/tbl.EXT.000
68 dir/tbl.EXT.000.gz
69
70 By default, it loads "SQL" files first, if they exist, then loads "CSV"
71 or "TXT" files next, in order of the numbers in the filename extension
72 as just shown. This makes it easy for you to reload a table's
73 definition followed by its data, in case you dumped them into separate
74 files (as happens with "mysqldump"'s "--tab" option). See mk-parallel-
75 dump for details on how data is dumped.
76
77 Exit status is 0 if everything went well, 1 if any files failed, and
78 any other value indicates an internal error.
79
81 Output depends on verbosity. When "--dry-run" is given, output
82 includes commands that would be executed.
83
84 When "--verbose" is 0, there is normally no output unless there's an
85 error.
86
87 When "--verbose" is 1, there is one line of output for the entire job,
88 showing how many tables were processed, how many files were loaded with
89 what status, how much time elapsed, and how much time the parallel load
90 jobs added up to. If any files were skipped, the filenames are printed
91 to the output.
92
93 When "--verbose" is 2, there's one line of output per table, showing
94 extra data such as how many threads were running when each table
95 finished loading:
96
97 DATABASE TABLE FILES TIME STATUS THREADS
98 sakila language 2 0.07 0 2
99 sakila film_actor 2 0.07 0 2
100 sakila actor 2 0.06 0 2
101 sakila payment 2 0.07 0 2
102 sakila transport_backup 2 0.05 0 2
103 sakila country 2 0.08 0 2
104 sakila film 2 0.05 0 2
105 sakila rental 2 0.07 0 2
106
108 User-contributed benchmarks are welcome. See
109 <http://www.paragon-cs.com/wordpress/?p=52> for one user's experiences.
110
112 This tool accepts additional command-line arguments. Refer to the
113 "SYNOPSIS" and usage information for details.
114
115 --ask-pass
116 Prompt for a password when connecting to MySQL.
117
118 --[no]atomic-resume
119 default: yes
120
121 Treat chunks as atomic when resuming restore.
122
123 By default "mk-parallel-restore" resumes restoration from the first
124 chunk that is missing all its rows. For dumps of transactionally-
125 safe tables (InnoDB), it cannot happen that a chunk is only
126 partially restored. Therefore, restoring from the first missing
127 chunk is safe.
128
129 However, for dumps of non-transactionally safe tables, it is
130 possible that a chunk can be only partially restored. In such
131 cases, the chunk will wrongly appear to be fully restored.
132 Therefore, you must specify "--no-atomic-resume" so that the
133 partially restored chunk is fully restored.
134
135 --base-dir
136 type: string
137
138 Directory where FIFO files will be created.
139
140 --[no]biggest-first
141 default: yes
142
143 Restore the biggest tables first for highest concurrency.
144
145 --[no]bin-log
146 default: yes
147
148 Enable binary logging ("SET SQL_LOG_BIN=1").
149
150 Restore operations are replicated by default (SQL_LOG_BIN=1) except
151 for "--tab" restores which are not replicated by default
152 (SQL_LOG_BIN=0). This prevents large loads from being logged to
153 the server's binary log.
154
155 The value given on the command line overrides the defaults.
156 Therefore, specifying "--bin-log" with "--tab" will allow the
157 "--tab" restore to replicate.
158
159 --bulk-insert-buffer-size
160 type: int
161
162 Set bulk_insert_buffer_size before each "LOAD DATA INFILE".
163
164 Has no effect without "--tab".
165
166 --charset
167 short form: -A; type: string; default: BINARY
168
169 Sets the connection, database, and "LOAD DATA INFILE" character
170 set.
171
172 The default is "BINARY", which is the safest value to use for "LOAD
173 DATA INFILE". Has no effect without "--tab".
174
175 --[no]commit
176 default: yes
177
178 Commit after each file.
179
180 --config
181 type: Array
182
183 Read this comma-separated list of config files; if specified, this
184 must be the first option on the command line.
185
186 --create-databases
187 Create databases if they don't exist.
188
189 --[no]create-tables
190 default: yes
191
192 Create tables.
193
194 See also "--[no]drop-tables".
195
196 --csv
197 Files are in CSV format (implies "--tab").
198
199 Changes "--tab" options so the following "LOAD DATA INFILE"
200 statement is used:
201
202 LOAD DATA INFILE <filename> INTO TABLE <table>
203 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
204 LINES TERMINATED BY '\n';
205
206 --database
207 short form: -D; type: string
208
209 Load all files into this database.
210
211 Overrides the database which is normally specified by the directory
212 in which the files live. Does not specify a default database for
213 the connection.
214
215 --databases
216 short form: -d; type: hash
217
218 Restore only this comma-separated list of databases.
219
220 --databases-regex
221 type: string
222
223 Restore only databases whose names match this regex.
224
225 --decompress
226 type: string; default: gzip -d -c
227
228 Command used to decompress and print .gz files to STDOUT (like
229 zcat).
230
231 --defaults-file
232 short form: -F; type: string
233
234 Only read mysql options from the given file. You must give an
235 absolute pathname.
236
237 --[no]disable-keys
238 default: yes
239
240 Execute "ALTER TABLE DISABLE KEYS" before each MyISAM table.
241
242 This option only works with MyISAM tables.
243
244 --[no]drop-tables
245 default: yes
246
247 Execute "DROP TABLE IF EXISTS" before creating each table.
248
249 --dry-run
250 Print commands instead of executing them.
251
252 --fast-index
253 Do InnoDB plugin fast index creation by restoring secondary indexes
254 after data.
255
256 This option only works with InnoDB tables and the InnoDB plugin.
257
258 --[no]fifo
259 default: yes
260
261 Stream files into a FIFO for "--tab".
262
263 Load compressed tab-separated files by piping them into a FIFO and
264 using the FIFO with "LOAD DATA INFILE", instead of by decompressing
265 the files on disk. Sets "--umask" to 0.
266
267 --[no]foreign-key-checks
268 default: yes
269
270 Set "FOREIGN_KEY_CHECKS=1" before "LOAD DATA INFILE".
271
272 --help
273 Show help and exit.
274
275 --host
276 short form: -h; type: string
277
278 Connect to host.
279
280 --ignore
281 Adds the "IGNORE" modifier to "LOAD DATA INFILE".
282
283 --ignore-databases
284 type: Hash
285
286 Ignore this comma-separated list of databases.
287
288 --ignore-tables
289 type: Hash
290
291 Ignore this comma-separated list of table names.
292
293 Table names may be qualified with the database name.
294
295 --local
296 Uses the "LOCAL" option to "LOAD DATA INFILE".
297
298 If you enable this option, the files are read locally by the client
299 library, not by the server.
300
301 --[no]lock-tables
302 Lock tables before "LOAD DATA INFILE".
303
304 --[no]no-auto-value-on-0
305 default: yes
306
307 Set SQL "NO_AUTO_VALUE_ON_ZERO".
308
309 --only-empty-databases
310 Restore only to empty databases.
311
312 By default mk-parallel-restore will restore tables into a database
313 so long as it exists (or is created by "--create-databases"). This
314 option is a safety feature that prevents any tables from being
315 restored into a database that already has tables even if those
316 tables are the same ones being restored. If you specify this
317 option, every database must have zero tables.
318
319 This implicitly disables "--[no]resume". "--create-databases" will
320 work if the database doesn't already exist and it creates it.
321
322 The databases are checked after all filters ("--databases", etc.)
323
324 --password
325 short form: -p; type: string
326
327 Password to use when connecting.
328
329 --pid
330 type: string
331
332 Create the given PID file. The file contains the process ID of the
333 script. The PID file is removed when the script exits. Before
334 starting, the script checks if the PID file already exists. If it
335 does not, then the script creates and writes its own PID to it. If
336 it does, then the script checks the following: if the file contains
337 a PID and a process is running with that PID, then the script dies;
338 or, if there is no process running with that PID, then the script
339 overwrites the file with its own PID and starts; else, if the file
340 contains no PID, then the script dies.
341
342 --port
343 short form: -P; type: int
344
345 Port number to use for connection.
346
347 --progress
348 Display progress messages.
349
350 Progress is displayed each time a table finishes loading. Progress
351 is calculated by measuring the size of each file to be loaded, and
352 assuming all bytes are created equal. The output is the completed
353 and total size, the percent completed, estimated time remaining,
354 and estimated completion time.
355
356 --quiet
357 short form: -q
358
359 Sets "--verbose" to 0.
360
361 --replace
362 Adds the "REPLACE" modifier to "LOAD DATA INFILE".
363
364 --[no]resume
365 default: yes
366
367 Resume the restore from a previously incomplete restore.
368
369 By default, "mk-parallel-restore" checks each table's chunks for
370 existing rows and restores only from the point where a previous
371 restore stopped. Specify --no-resume to disable restore resumption
372 and fully restores every table.
373
374 Restore resumption does not work with tab-separated files or dumps
375 that were not chunked.
376
377 --set-vars
378 type: string; default: wait_timeout=10000
379
380 Set these MySQL variables. Immediately after connecting to MySQL,
381 this string will be appended to SET and executed.
382
383 --socket
384 short form: -S; type: string
385
386 Socket file to use for connection.
387
388 --tab
389 Load tab-separated files with "LOAD DATA INFILE".
390
391 This is similar to what "mysqlimport" does, but more flexible.
392
393 The following options are enabled unless they are specifically
394 disabled on the command line:
395
396 L<"--commit">
397 L<"--[no]disable-keys">
398 L<"--[no]no-auto-value-on-0">
399
400 And the following options are disabled ("--no-bin-log", etc.)
401 unless they are specifically enabled on the command line:
402
403 L<"--[no]bin-log">
404 L<"--[no]unique-checks">
405 L<"--[no]foreign-key-checks">
406
407 --tables
408 short form: -t; type: hash
409
410 Restore only this comma-separated list of table names.
411
412 Table names may be qualified with the database name.
413
414 --tables-regex
415 type: string
416
417 Restore only tables whose names match this regex.
418
419 --threads
420 type: int; default: 2
421
422 Specifies the number of parallel processes to run.
423
424 The default is 2 (this is mk-parallel-restore after all -- 1 is not
425 parallel). On GNU/Linux machines, the default is the number of
426 times 'processor' appears in /proc/cpuinfo. On Windows, the
427 default is read from the environment. In any case, the default is
428 at least 2, even when there's only a single processor.
429
430 --truncate
431 Run "TRUNCATE TABLE" before "LOAD DATA INFILE".
432
433 This will delete all rows from a table before loading the first
434 tab-delimited file into it.
435
436 --umask
437 type: string
438
439 Set the program's "umask" to this octal value.
440
441 This is useful when you want created files (such as FIFO files) to
442 be readable or writable by other users (for example, the MySQL
443 server itself).
444
445 --[no]unique-checks
446 default: yes
447
448 Set "UNIQUE_CHECKS=1" before "LOAD DATA INFILE".
449
450 --user
451 short form: -u; type: string
452
453 User for login if not current user.
454
455 --verbose
456 short form: -v; cumulative: yes; default: 1
457
458 Verbosity; can specify multiple times.
459
460 Repeatedly specifying it increments the verbosity. Default is 1 if
461 not specified. See "OUTPUT".
462
463 --version
464 Show version and exit.
465
466 --wait
467 short form: -w; type: time; default: 5m
468
469 Wait limit when server is down.
470
471 If the MySQL server crashes during loading, waits until the server
472 comes back and then continues with the rest of the files.
473 "mk-parallel-restore" will check the server every second until this
474 time is exhausted, at which point it will give up and exit.
475
477 These DSN options are used to create a DSN. Each option is given like
478 "option=value". The options are case-sensitive, so P and p are not the
479 same option. There cannot be whitespace before or after the "=" and if
480 the value contains whitespace it must be quoted. DSN options are
481 comma-separated. See the maatkit manpage for full details.
482
483 · A
484
485 dsn: charset; copy: yes
486
487 Default character set.
488
489 · D
490
491 dsn: database; copy: yes
492
493 Default database.
494
495 · F
496
497 dsn: mysql_read_default_file; copy: yes
498
499 Only read default options from the given file
500
501 · h
502
503 dsn: host; copy: yes
504
505 Connect to host.
506
507 · p
508
509 dsn: password; copy: yes
510
511 Password to use when connecting.
512
513 · P
514
515 dsn: port; copy: yes
516
517 Port number to use for connection.
518
519 · S
520
521 dsn: mysql_socket; copy: yes
522
523 Socket file to use for connection.
524
525 · u
526
527 dsn: user; copy: yes
528
529 User for login if not current user.
530
532 You can download Maatkit from Google Code at
533 <http://code.google.com/p/maatkit/>, or you can get any of the tools
534 easily with a command like the following:
535
536 wget http://www.maatkit.org/get/toolname
537 or
538 wget http://www.maatkit.org/trunk/toolname
539
540 Where "toolname" can be replaced with the name (or fragment of a name)
541 of any of the Maatkit tools. Once downloaded, they're ready to run; no
542 installation is needed. The first URL gets the latest released version
543 of the tool, and the second gets the latest trunk code from Subversion.
544
546 The environment variable "MKDEBUG" enables verbose debugging output in
547 all of the Maatkit tools:
548
549 MKDEBUG=1 mk-....
550
552 You need Perl, DBI, DBD::mysql, and some core packages that ought to be
553 installed in any reasonably new version of Perl.
554
556 For a list of known bugs see
557 <http://www.maatkit.org/bugs/mk-parallel-restore>.
558
559 Please use Google Code Issues and Groups to report bugs or request
560 support: <http://code.google.com/p/maatkit/>. You can also join
561 #maatkit on Freenode to discuss Maatkit.
562
563 Please include the complete command-line used to reproduce the problem
564 you are seeing, the version of all MySQL servers involved, the complete
565 output of the tool when run with "--version", and if possible,
566 debugging output produced by running with the "MKDEBUG=1" environment
567 variable.
568
570 This program is copyright 2007-2011 Baron Schwartz. Feedback and
571 improvements are welcome.
572
573 THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
574 WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
575 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
576
577 This program is free software; you can redistribute it and/or modify it
578 under the terms of the GNU General Public License as published by the
579 Free Software Foundation, version 2; OR the Perl Artistic License. On
580 UNIX and similar systems, you can issue `man perlgpl' or `man
581 perlartistic' to read these licenses.
582
583 You should have received a copy of the GNU General Public License along
584 with this program; if not, write to the Free Software Foundation, Inc.,
585 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
586
588 See also mk-parallel-dump.
589
591 Baron Schwartz
592
594 This tool is part of Maatkit, a toolkit for power users of MySQL.
595 Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
596 primary code contributors. Both are employed by Percona. Financial
597 support for Maatkit development is primarily provided by Percona and
598 its clients.
599
601 This manual page documents Ver 1.0.24 Distrib 7540 $Revision: 7477 $.
602
603
604
605perl v5.28.1 2011-06-08 MK-PARALLEL-RESTORE(1)