ddr_lzo(1)          LZO de/compression plugin for dd_rescue         ddr_lzo(1)

NAME

       ddr_lzo - Data de/compression plugin for dd_rescue

SYNOPSIS

       -L lzo[=option[:option[:...]]]
       or
       -L /path/to/libddr_lzo.so[=option[:option[:...]]]

DESCRIPTION

   About
       LZO is an algorithm that de/compresses data. It is tuned for speed
       (especially decompression speed) and gives up some compression ratio
       in exchange. Variants with slow compression (yet still very fast
       decompression) are available as well; see the algorithm parameter
       below.

       This plugin has been written for dd_rescue and uses its plugin
       interface. See the dd_rescue(1) man page for more information on
       dd_rescue.

OPTIONS

       Options are passed using the dd_rescue option passing syntax: the name
       of the plugin (lzo) is optionally followed by an equal sign (=) and a
       list of options separated by colons (:). The lzo plugin also allows
       most options to be abbreviated to five or six letters. See the EXAMPLES
       section below.

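       The option syntax above can be illustrated with a small Python sketch.
       It is not the dd_rescue parser: the function name parse_plugin_spec is
       hypothetical, it handles a single plugin specification only, and it
       does not resolve abbreviated option names.

```python
def parse_plugin_spec(spec):
    """Split 'name=opt1:opt2=value' into (name, {option: value or True}).

    Illustration only; this is an assumption about how the documented
    syntax decomposes, not code taken from dd_rescue.
    """
    # Everything before the first '=' is the plugin name (or library path).
    name, sep, rest = spec.partition("=")
    options = {}
    if sep and rest:
        # Options are separated by colons; each may carry its own '=value'.
        for item in rest.split(":"):
            key, eq, value = item.partition("=")
            options[key] = value if eq else True
    return name, options


if __name__ == "__main__":
    print(parse_plugin_spec("lzo=algo=lzo1x_1_15:compress"))
```

       Flags without a value (such as compress) are stored as True, while
       algo=lzo1x_1_15 keeps its value as a string.
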
   Compression or decompression
       The lzo dd_rescue plugin (subsequently referred to as just ddr_lzo,
       which reflects the variable part of the filename libddr_lzo.so) chooses
       compression or decompression mode automatically if one of the
       input/output files has an [lt]zo suffix; otherwise you may specify the
       compr[ess] or decom[press] parameter on the command line.
       The parameter opt[imize] tells ddr_lzo to do an optimization pass
       after compression. This might speed up decompression by a few percent
       when creating compressed data with high compression levels and large
       block sizes.

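       A minimal Python sketch of the automatic mode selection follows. It
       rests on assumptions not spelled out above: that the [lt]zo suffix
       means a file name ending in .lzo or .tzo, that such a suffix on the
       input selects decompression and on the output selects compression,
       and that an explicit parameter takes precedence. The name choose_mode
       is hypothetical.

```python
def choose_mode(infile, outfile, explicit=None):
    """Return 'compress', 'decompress', or None when nothing can be inferred.

    Assumption: an explicit compress/decompress parameter wins over the
    file name suffixes. This mirrors the description, not ddr_lzo's source.
    """
    if explicit in ("compress", "decompress"):
        return explicit

    def has_lzo_suffix(name):
        # Assumed reading of "[lt]zo suffix": names ending in .lzo or .tzo.
        return name.endswith((".lzo", ".tzo"))

    in_lzo = has_lzo_suffix(infile)
    out_lzo = has_lzo_suffix(outfile)
    if in_lzo and not out_lzo:
        return "decompress"
    if out_lzo and not in_lzo:
        return "compress"
    # Neither or both files carry the suffix: the mode must be given.
    return None


if __name__ == "__main__":
    print(choose_mode("disk.img", "disk.img.lzo"))
```
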
       The plugin also supports the parameter bench[mark]; if it is specified,
       ddr_lzo will output some information about CPU usage and the resulting
       compression or decompression bandwidth. (For small files, the numbers
       become meaningless due to jitter and limited time resolution -- ddr_lzo
       will skip the output if the numbers are very tiny.)

   De/compression algorithm
       The lzo plugin supports a number of the (de)compression algorithms from
       liblzo2. You can specify which one you want to use by passing
       algo=XXX, where XXX can be lzo1x_1, lzo1x_1_15, lzo1x_999, lzo1x_1_11,
       lzo1x_1_12, lzo1y_1, lzo1y_999, lzo1f_1, lzo1f_999, lzo1b_1 ...
       lzo1b_9, lzo1b_99, lzo1b_999, lzo2a_999.  Pass algo=help to get a list
       of available algorithms. Consult the liblzo documentation for more
       information on the algorithms. Note that only the first three are
       supported by lzop (it can decompress the first five though, as they're
       all handled by the same decompression routine).
       The default (lzo1x_1) is a good choice for fast compression and very
       fast decompression and ensures compatibility with lzop. For higher
       compression you might want to choose lzo1x_999, which is very slow but
       lzop compatible, or lzo2a_999, which is twice as fast but not
       compatible with lzop.

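       The lzop compatibility rules from this paragraph can be written down
       as a small lookup. This is a sketch only: it assumes that the range
       lzo1b_1 ... lzo1b_9 stands for lzo1b_1 through lzo1b_9, and the names
       ALGORITHMS and lzop_support are hypothetical. Use algo=help for the
       authoritative list.

```python
# Algorithm names in the order given in the text; the position in the
# list decides the lzop compatibility class.
ALGORITHMS = (
    ["lzo1x_1", "lzo1x_1_15", "lzo1x_999", "lzo1x_1_11", "lzo1x_1_12",
     "lzo1y_1", "lzo1y_999", "lzo1f_1", "lzo1f_999"]
    + ["lzo1b_%d" % n for n in range(1, 10)]   # assumed expansion of the range
    + ["lzo1b_99", "lzo1b_999", "lzo2a_999"]
)


def lzop_support(algo):
    """Classify an algorithm name by what lzop can do with it."""
    if algo not in ALGORITHMS:
        raise ValueError("unknown algorithm: %s" % algo)
    index = ALGORITHMS.index(algo)
    if index < 3:
        return "compress and decompress"   # the first three
    if index < 5:
        return "decompress only"           # same decompression routine
    return "unsupported"


if __name__ == "__main__":
    for name in ("lzo1x_1", "lzo1x_1_12", "lzo2a_999"):
        print(name, "->", lzop_support(name))
```
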
   Debugging
       The debug flag causes ddr_lzo to output information about blocks and
       other internal data.  It is meant for debugging purposes.

       Finally there is also a flags=XXXX parameter. This sets the flags field
       in the header (default is 0x03000403) and is used for testing only. It
       is not sanity checked and you can easily set values that will break
       decompression or cause ddr_lzo to abort. Only use it for development
       purposes when you know the meaning of the various bits.

   Error recovery
       On compression, when input bytes can't be read, ddr_lzo will encode
       holes in the compressed output file -- these will be skipped over on
       decompression.

       On decompression, erroneous blocks can be detected by the checksums
       (most often) or by the decompressor. The lzo plugin tries to continue
       in that case if the block header that specifies the de/compressed
       lengths is intact.  The damaged block is then skipped over (leaving a
       hole) and decompression continues with the next block. This prevents
       corrupt data from ending up in the output file (and preexisting,
       potentially good data there from being overwritten).
       The behaviour can be modified by specifying the nodisc[ard] option.
       When given, the decompressor's output (filled up with zeros if too
       short for the block) will be written to the output file.  Even if we
       know that the data is incorrect, with some luck, parts of the block may
       actually be valid.

       When the block headers are corrupt, your situation is desperate, as you
       will have lost the remainder of the file. To recover pieces after such
       a block header corruption, ddr_lzo supports the search option. With it,
       the plugin will search the input file (starting from the position given
       to dd_rescue with -s) for data that looks like a block header and, if a
       valid looking header is found, start decompressing from that position.
       (If you can't find the data you are looking for, you might want to
       study the output generated with the debug flag.)

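       What "data that looks like a block header" could mean is sketched
       below in Python. This is a guess at a plausibility test, not the
       heuristic ddr_lzo implements. It assumes that a block starts with two
       32-bit big-endian integers (uncompressed length, then compressed
       length) and accepts a position when the compressed length is nonzero,
       not larger than the uncompressed length, and within a maximum block
       size. The name find_block_header is hypothetical.

```python
import struct

MAX_BLOCK = 64 * 1024 * 1024   # largest block size mentioned in this page


def find_block_header(data, start=0, max_block=MAX_BLOCK):
    """Return the first offset >= start that passes the test, or -1.

    Assumed layout: uncompressed length and compressed length, both
    32-bit big-endian. Random data can pass such a weak test, so a real
    tool would confirm a hit with the block checksums.
    """
    for offset in range(start, len(data) - 7):
        uncompressed_len, compressed_len = struct.unpack_from(">II", data, offset)
        if 0 < compressed_len <= uncompressed_len <= max_block:
            return offset
    return -1


if __name__ == "__main__":
    garbage = b"\xff" * 10
    header = struct.pack(">II", 128 * 1024, 50000)
    print(find_block_header(garbage + header))
```

       A test this weak produces false positives, which is one reason to
       inspect the debug output when a search does not yield the expected
       data.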

Supported dd_rescue features

       dd_rescue supports appending to files with the -x/--extend option.  If
       ddr_lzo is loaded and the output file is an existing .lzo file, the new
       data will be appended in the format specified by the existing LZOP
       header. If the header does not indicate a multipart (archive) file, the
       EOF marker will be overwritten, so that a valid .lzo file is created.
       Otherwise a new part will be appended.

       When dd_rescue can't read data or a sizable amount of zero-filled data
       is found and the -a/--sparse option is active, then dd_rescue will
       create sparse files (files with holes inside). This is an optimization
       to save space -- the holes are interpreted as zeroes again on normal
       reads, so this is transparent. The holes can also be useful to ensure
       that good data is not overwritten with zeroes when data couldn't be
       read.
       When the lzo module gets fed holes in compression mode, it will encode
       them in the compressed output file in a special way (using the lzop
       multipart feature, as lzop unfortunately chokes on blocks with 0
       compressed length). On decompression, the holes will result in the data
       being jumped over again (creating a hole in the output file, if no data
       preexists at the location).

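       The sparse file idea can be demonstrated in Python: seeking forward
       instead of writing leaves a hole that reads back as zeros. The sketch
       is not dd_rescue code; write_sparse is a hypothetical helper, and an
       unreadable block is represented by None. Unlike dd_rescue without -t,
       it opens the output with truncation, so it only illustrates hole
       creation in a new file.

```python
import os
import tempfile


def write_sparse(blocks, out_path, block_size):
    """Write blocks to out_path, skipping zero-filled or unreadable ones.

    blocks: iterable of bytes objects, or None for a block that could
    not be read (assumed to span block_size bytes).
    """
    with open(out_path, "wb") as out:
        for block in blocks:
            if block is None:
                out.seek(block_size, os.SEEK_CUR)      # unreadable: leave a hole
            elif block.count(0) == len(block):
                out.seek(len(block), os.SEEK_CUR)      # all zeros: leave a hole
            else:
                out.write(block)
        # Fix the file length so that a hole at the very end is kept.
        out.truncate(out.tell())


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "out.img")
    write_sparse([b"abcd", b"\0\0\0\0", None, b"wxyz"], path, 4)
    with open(path, "rb") as f:
        print(f.read())
```

       Whether the skipped ranges actually save disk space depends on the
       file system; the content read back is the same either way.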

lzop compatibility

       The plugin uses the lzo1x_1 algorithm by default (just like lzop does
       by default) and generates adler32 checksums to allow detecting data
       corruption.  The compressed files are compatible with lzop and ddr_lzo
       should handle files generated by lzop.
       Multipart (archive) files from lzop are decompressed to ONE output file
       in the order they are stored.
       Multipart files created by the lzo plugin to encode holes will be
       extracted to several files by lzop. The holes are encoded in the
       filenames (with a sequence number and the hole size up to 1TB; use the
       timestamp for huge holes), so a proper assembly of the fragments is
       possible even without ddr_lzo.

       lzop only supports the lzo1x_ family of algorithms.  If you choose
       another algorithm to compress data with ddr_lzo, it will set the
       needed_version_to_extract field in the resulting lzop file to ddr_lzo's
       own version (1.789) to indicate incompatibility with lzop (as of 1.03).
       lzop by default uses block sizes of 256kiB (on Unix systems), but
       supports de/compression with smaller block sizes as well. It needs to
       be recompiled to support block sizes up to a possible maximum of 64MiB.
       Thus staying below or at 256kiB is recommended; even when lzop
       compatibility is no concern, blocks larger than 16MiB are not
       recommended, see below.

   Blocksize considerations
       When decompressing, the (soft) block size chosen in dd_rescue must be
       sufficient (at least half the size of the block size used when
       compressing); if you choose blocks that are too small, ddr_lzo will
       warn and exit.
       For compression, the (soft) block size chosen in dd_rescue determines
       the size of the blocks fed to the lzo??_?_compress() routines. Larger
       block sizes will typically result in slightly better compression
       ratios, though the returns on increasing the block size quickly
       diminish after 64k.
       The default from dd_rescue (128kiB) is a good choice. It is NOT
       recommended to increase the block size too much -- when an lzo file
       gets corrupted, at least one block will be lost; larger blocks result
       in larger damage. Also, blocks larger than 16MiB will not work well
       with the error tolerance features of ddr_lzo. Also note that blocks
       larger than 256kiB need recompilation of lzop if you want to be able to
       use lzop to process the .lzo files; blocks larger than 64MiB prevent
       decompression even with a recompiled lzop.

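       The thresholds in this section can be collected into a small checker.
       The Python sketch below only restates the limits given in the text;
       the function names blocksize_notes and decompress_bs_sufficient are
       hypothetical, and ddr_lzo performs its own checks independently.

```python
KIB = 1024
MIB = 1024 * KIB


def blocksize_notes(compress_bs):
    """Return the caveats from this section that apply to a block size."""
    notes = []
    if compress_bs > 64 * MIB:
        notes.append("lzop cannot decompress, even when recompiled")
    elif compress_bs > 256 * KIB:
        notes.append("lzop needs to be recompiled")
    if compress_bs > 16 * MIB:
        notes.append("poor fit for ddr_lzo error tolerance")
    return notes


def decompress_bs_sufficient(decompress_bs, compress_bs):
    """True if the soft block size is at least half the compression one."""
    return 2 * decompress_bs >= compress_bs


if __name__ == "__main__":
    for size in (128 * KIB, 1 * MIB, 32 * MIB):
        print(size, blocksize_notes(size))
```

       With the dd_rescue default of 128kiB no caveat applies, which matches
       the recommendation above.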

BUGS/LIMITATIONS

   Maturity
       The plugin is new as of dd_rescue 1.43. Do not yet rely on data saved
       with ddr_lzo as the only backup for valuable data. Also expect some
       changes to ddr_lzo in the not too distant future.  (This should not
       break the file format, as we're following lzop ....)
       Compressed data is more sensitive to data corruption than plain data.
       Note that the checksums (adler32 or crc32) in the lzop file format do
       NOT allow errors to be corrected; they just allow a somewhat reliable
       detection of data corruption. (Ideally, a 32bit checksum just misses 1
       out of 2^32 corruptions; on small changes, crc32 comes a bit closer to
       the ideal than adler32. You may pass the crc32 option to use crc32
       instead of adler32 checksums at the expense of some speed --
       unfortunately the crc32 polynomial for lzop/gzip/... is not the crc32c
       polynomial that has hardware support on many CPUs these days.)  Also
       note that the checksums are NOT cryptographic hashes; a malicious
       attacker can easily find modifications of data that do not alter the
       checksums. Use MD5 or, better, SHA-256/SHA-512 for ensuring integrity
       against attackers. Use par2 or similar software to create error
       correcting codes (Reed-Solomon / Erasure Codes) if you want to be able
       to recover data in the face of corruption.

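       The detect-but-not-correct property can be seen with the adler32 and
       crc32 functions from Python's zlib module. The sketch below works on
       a made-up data block and is unrelated to the plugin's own code; it
       flips a single bit and compares checksums.

```python
import zlib

# An arbitrary sample block standing in for one block of file data.
block = b"dd_rescue sample data " * 1000
stored_adler = zlib.adler32(block)
stored_crc = zlib.crc32(block)

# Simulate corruption: flip one bit in one byte.
damaged = bytearray(block)
damaged[100] ^= 0x01
damaged = bytes(damaged)


def is_intact(data, adler, crc):
    """Compare freshly computed checksums against the stored ones."""
    return zlib.adler32(data) == adler and zlib.crc32(data) == crc


if __name__ == "__main__":
    print("original intact:", is_intact(block, stored_adler, stored_crc))
    print("damaged intact: ", is_intact(damaged, stored_adler, stored_crc))
```

       The mismatch shows that the block is damaged, but neither checksum
       says which byte changed or what its value was, so repair needs
       redundancy such as the par2 data mentioned above.
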
   Security
       While care has been applied to check the result of memory allocations
       ..., the decompressor code has not been audited and only limited
       fuzzing has been applied to ensure it's not vulnerable to malicious
       data -- be careful when you process data from untrusted sources.

EXAMPLES

       dd_rescue -ptAL lzo=algo=lzo1x_1_15:compress,hash=alg=sha256 infile outfile
              compresses data from infile into outfile using the algorithm
              lzo1x_1_15 and calculates the sha256 hash value of outfile.
              outfile will have the time stamp and access rights copied over
              from infile, and it will be truncated first if it already
              exists. The output file won't have encoded holes; errors in the
              infile will result in zeros.

       dd_rescue -aL MD5,lzo=compr:bench,MD5,lzo=decompress,MD5 infile infile2
              will copy infile to infile2, compressing the data and
              decompressing it again on the fly. It will output MD5 hashes for
              the compressed data as well (though it's not stored) and for
              infile and infile2 -- these two should be identical, obviously.
              This command is rather artificial and used for testing. The -a
              flag makes dd_rescue detect zero blocks and create holes, thus
              testing hole encoding (sparse files) and decoding as well if the
              infile has sizable regions filled with zeros.

       dd_rescue -s1M -S0 -L lzo=search:nodiscard infile.lzo outfile
              will search for an lzop block header in infile.lzo starting at
              position 1MiB into the file and decompress the remainder of the
              file. On finding corrupted blocks, it will still write the
              output from the decompressor to outfile.

SEE ALSO

       dd_rescue(1), liblzo2 documentation, lzop(1)

AUTHOR

       Kurt Garloff <kurt@garloff.de>

CREDITS

       The liblzo2 library and algorithm were written by Markus Oberhumer.
       http://www.oberhumer.com/opensource/lzo/

       This plugin is under the same license as dd_rescue: The GNU General
       Public License (GPL) v2 or v3 - at your option.

HISTORY

       The ddr_lzo plugin was first introduced with dd_rescue 1.43 (May 2014).

       Some additional information can be found on
       http://garloff.de/kurt/linux/ddrescue/

Kurt Garloff                      2014-05-12                        ddr_lzo(1)