ddr_lzo(1) LZO de/compression plugin for dd_rescue ddr_lzo(1)

ddr_lzo - Data de/compression plugin for dd_rescue

-L lzo[=option[:option[:...]]]
or
-L /path/to/libddr_lzo.so[=option[:option[:...]]]

About
LZO is a data compression algorithm tuned for speed (especially
decompression speed), at the cost of somewhat larger compressed files.
Variants with slow compression (yet still very fast decompression) are
available as well. See the algorithm parameter below.

This plugin has been written for dd_rescue and uses its plugin
interface. See the dd_rescue(1) man page for more information on
dd_rescue.

Options are passed using the dd_rescue option passing syntax: the name
of the plugin (lzo) is optionally followed by an equal sign (=) and a
list of options separated by colons (:). The lzo plugin also allows
most options to be abbreviated to five or six letters. See the EXAMPLES
section below.

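As an illustration of the syntax above, the following shell sketch
assembles the argument for -L from a list of options. The helper name
lzo_arg is hypothetical (it is not part of dd_rescue); the option names
are the ones documented on this page.

```shell
# Join plugin options with ':' and prefix them with "lzo=".
# lzo_arg is a hypothetical helper for illustration only.
lzo_arg() {
    if [ "$#" -eq 0 ]; then
        printf 'lzo\n'            # plugin name alone, no '=' needed
        return
    fi
    arg="lzo=$1"
    shift
    for opt in "$@"; do
        arg="$arg:$opt"           # further options are colon separated
    done
    printf '%s\n' "$arg"
}

lzo_arg compress bench            # prints lzo=compress:bench
lzo_arg algo=lzo1x_999 opt        # prints lzo=algo=lzo1x_999:opt
```

The result would be passed on as in
dd_rescue -L "$(lzo_arg compress bench)" infile outfile
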
Compression or decompression
The lzo dd_rescue plugin (subsequently referred to as just ddr_lzo,
which reflects the variable part of the filename libddr_lzo.so) chooses
compression or decompression mode automatically if one of the
input/output files has an [lt]zo suffix; otherwise you may specify the
compr[ess] or decom[press] parameter on the command line.
The parameter opt[imize] tells ddr_lzo to do an optimization pass
after compression. This might speed up decompression by a few percent
when creating compressed data with high compression levels and large
block sizes.

The plugin also supports the parameter bench[mark]; if it is specified,
ddr_lzo will output some information about CPU usage and the resulting
compression or decompression bandwidth. (For small files, the numbers
become meaningless due to jitter and limited time resolution -- ddr_lzo
will skip the output if the numbers are very tiny.)

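The automatic mode selection can be pictured roughly as follows. This is
only a sketch: guess_mode is a hypothetical helper, and the mapping (a
suffix on the input file means decompression, a suffix on the output
file means compression) is an assumption derived from the description
above, not taken from the plugin source.

```shell
# Guess the mode from the file names alone.
# ASSUMPTION: .lzo/.tzo input => decompress, .lzo/.tzo output => compress.
guess_mode() {
    src=$1
    dst=$2
    case "$src" in
        *.lzo|*.tzo) echo decompress; return ;;
    esac
    case "$dst" in
        *.lzo|*.tzo) echo compress; return ;;
    esac
    # No suffix to go by: compress or decompress must be given explicitly.
    echo unknown
}

guess_mode disk.img disk.img.lzo   # prints compress
guess_mode backup.tzo backup.tar   # prints decompress
guess_mode infile outfile          # prints unknown
```
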
De/compression algorithm
The lzo plugin supports a number of the (de)compression algorithms from
liblzo2. You can specify which one you want to use by passing algo=XXX,
where XXX can be lzo1x_1, lzo1x_1_15, lzo1x_999, lzo1x_1_11,
lzo1x_1_12, lzo1y_1, lzo1y_999, lzo1f_1, lzo1f_999, lzo1b_1 ...
lzo1b_9, lzo1b_99, lzo1b_999, lzo2a_999. Pass algo=help to get a list
of available algorithms. Consult the liblzo documentation for more
information on the algorithms. Note that only the first three are
supported by lzop (it can decompress the first five though, as they're
all handled by the same decompression routine).
The default (lzo1x_1) is a good choice for fast compression and very
fast decompression and ensures compatibility with lzop. For higher
compression you might want to choose lzo1x_999, which is very slow but
lzop compatible, or lzo2a_999, which is twice as fast but not
compatible with lzop.

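The lzop compatibility rules stated above can be summarized in a small
lookup. The function name lzop_compat and its three result words are
hypothetical; the grouping of the algorithms follows the text (first
three fully supported by lzop, first five decompressible).

```shell
# Classify an algo= value by how well lzop can handle the result.
# lzop_compat is a hypothetical helper; names it does not know
# (including misspelled ones) are reported as "none".
lzop_compat() {
    case "$1" in
        lzo1x_1|lzo1x_1_15|lzo1x_999) echo full ;;
        lzo1x_1_11|lzo1x_1_12)        echo decompress-only ;;
        *)                            echo none ;;
    esac
}

for a in lzo1x_1 lzo1x_1_12 lzo2a_999; do
    echo "$a: $(lzop_compat "$a")"
done
```
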
Debugging
The debug flag causes ddr_lzo to output information about blocks and
other internal data. It is meant for debugging purposes.

Finally, there is also a flags=XXXX parameter. This sets the flags field
in the header (default is 0x03000403) and is used for testing only. It
is not sanity checked, and you can easily set values that will break
decompression or cause ddr_lzo to abort. Use it only for development
purposes, when you know the meaning of the various bits.

Error recovery
On compression, when input bytes can't be read, ddr_lzo will encode
holes in the compressed output file -- these will be skipped over on
decompression.

On decompression, erroneous blocks can be detected by the checksums
(most often) or by the decompressor. The lzo plugin tries to continue
in that case if the block header that specifies the de/compressed
lengths is intact. The damaged block is then skipped over (leaving a
hole) and decompression continues with the next block. This prevents
corrupt data from ending up in the output file (or preexisting,
potentially good data there from being overwritten).
The behaviour can be modified by specifying the nodisc[ard] option.
When given, the decompressor's output (padded with zeros if too short
for the block) will be written to the output file. Even though the data
is known to be incorrect, with some luck parts of the block may
actually be valid.

When the block headers are corrupt, your situation is desperate, as you
will have lost the remainder of the file. To recover pieces after such
a block header corruption, ddr_lzo supports the search option. With it,
the plugin will search the input file (starting from the position given
to dd_rescue with -s) for data that looks like a block header, and if a
valid-looking header is found, it will start decompressing from that
position. (If you can't find the data you are looking for, it may help
to study the output generated with the debug flag.)

dd_rescue supports appending to files with the -x/--extend option. If
ddr_lzo is loaded and the output file is an existing .lzo file, the new
data will be appended in the format specified by the existing LZOP
header. If the header does not indicate a multipart (archive) file, the
EOF marker will be overwritten, so that a valid .lzo file is created.
Otherwise a new part will be appended.

When dd_rescue can't read data or a sizable amount of zero-filled data
is found and the -a/--sparse option is active, then dd_rescue will
create sparse files (files with holes inside). This is an optimization
to save space -- the holes are interpreted as zeroes again on normal
reads, so this is transparent. The holes can also be useful to ensure
that good data is not overwritten with zeroes when data couldn't be
read.
When the lzo module gets fed holes in compression mode, it will encode
them in the compressed output file in a special way (using the lzop
multipart feature, as lzop unfortunately chokes on blocks with 0
compressed length). On decompression, the holes will result in the data
being jumped over again (creating a hole in the output file, if no data
preexists at the location).

The plugin uses the lzo1x_1 algorithm by default (just like lzop does)
and generates adler32 checksums to allow detecting data corruption.
The compressed files are compatible with lzop, and ddr_lzo should
handle files generated by lzop.
Multipart (archive) files from lzop are decompressed to ONE output file
in the order they are stored.
Multipart files created by the lzo plugin to encode holes will be
extracted to several files by lzop. The holes are encoded in the
filenames (with a sequence number and the hole size up to 1TB; use the
timestamp for huge holes), so a proper assembly of the fragments is
possible even without ddr_lzo.

lzop only supports the lzo1x_ family of algorithms. If you choose
another algorithm to compress data with ddr_lzo, it will set the
needed_version_to_extract field in the resulting lzop file to ddr_lzo's
own version (1.789) to indicate incompatibility with lzop (as of 1.03).
lzop by default uses block sizes of 256kiB (on Unix systems), but
supports de/compression with smaller block sizes as well. It needs to
be recompiled to support block sizes up to a possible maximum of 64MiB.
Thus staying at or below 256kiB is recommended; even when lzop
compatibility is no concern, blocks larger than 16MiB are not
recommended, see below.

Blocksize considerations
When decompressing, the (soft) block size chosen in dd_rescue must be
sufficient (at least half the block size used when compressing); if
you choose blocks that are too small, ddr_lzo will warn and exit.
For compression, the chosen (soft) block size in dd_rescue determines
the size of the blocks fed to the lzo??_?_compress() routines. Larger
block sizes will typically result in slightly better compression
ratios, though the returns on increasing the block size quickly
diminish beyond 64kiB.
The default from dd_rescue (128kiB) is a good choice. It is NOT
recommended to increase the block size too much -- when an lzo file
gets corrupted, at least one block will be lost; larger blocks result
in larger damage. Also, blocks larger than 16MiB will not work well
with the error tolerance features of ddr_lzo. Also note that blocks
larger than 256kiB need recompilation of lzop if you want to be able to
use lzop to process the .lzo files; blocks larger than 64MiB prevent
decompression even with a recompiled lzop.

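The "at least half" rule for decompression can be checked up front with
a little shell arithmetic. This is a sketch only: decomp_bs_ok is a
hypothetical helper, and the 256kiB compression block size in the loop
is just an example value.

```shell
# Succeeds if the decompression soft block size is at least half the
# block size used for compression. Both sizes are given in bytes.
# decomp_bs_ok is a hypothetical helper, not part of dd_rescue.
decomp_bs_ok() {
    used_bs=$1
    want_bs=$2
    [ $((want_bs * 2)) -ge "$used_bs" ]
}

# Data compressed with 256kiB blocks, three candidate block sizes.
for bs in 65536 131072 262144; do
    if decomp_bs_ok 262144 "$bs"; then
        echo "$bs: sufficient"
    else
        echo "$bs: too small"
    fi
done
```

With these values the loop reports 65536 as too small and the other two
sizes as sufficient.
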
Maturity
The plugin is new as of dd_rescue 1.43. Do not yet rely on data saved
with ddr_lzo as the only backup for valuable data. Also expect some
changes to ddr_lzo in the not too distant future. (This should not
break the file format, as we're following lzop ....)
Compressed data is more sensitive to data corruption than plain data.
Note that the checksums (adler32 or crc32) in the lzop file format do
NOT allow errors to be corrected; they just allow a somewhat reliable
detection of data corruption. (Ideally, a 32-bit checksum misses just 1
out of 2^32 corruptions; on small changes, crc32 comes a bit closer to
the ideal than adler32. You may pass the crc32 option to use crc32
instead of adler32 checksums at the expense of some speed --
unfortunately the crc32 polynomial for lzop/gzip/... is not the crc32c
polynomial that has hardware support on many CPUs these days.) Also
note that the checksums are NOT cryptographic hashes; a malicious
attacker can easily find modifications of data that do not alter the
checksums. Use MD5 or, better, SHA-256/SHA-512 for ensuring integrity
against attackers. Use par2 or similar software to create error
correcting codes (Reed-Solomon / Erasure Codes) if you want to be able
to recover data in the face of corruption.

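One way to keep a cryptographic hash alongside a compressed file is
sketched below. It assumes the sha256sum tool from GNU coreutils is
installed; the file name and contents are placeholders standing in for a
real .lzo file, and nothing here is specific to ddr_lzo.

```shell
# Record a SHA-256 hash next to a file and verify it later.
workdir=$(mktemp -d)
printf 'example payload\n' > "$workdir/data.lzo"

# Store the hash (run inside the directory so the list holds a
# relative file name).
(cd "$workdir" && sha256sum data.lzo > data.lzo.sha256)

# Verification succeeds while the file is unchanged ...
if (cd "$workdir" && sha256sum -c data.lzo.sha256 >/dev/null 2>&1); then
    before=ok
else
    before=bad
fi

# ... and fails once a single byte has been appended.
printf 'X' >> "$workdir/data.lzo"
if (cd "$workdir" && sha256sum -c data.lzo.sha256 >/dev/null 2>&1); then
    after=ok
else
    after=bad
fi

echo "before: $before, after: $after"
rm -rf "$workdir"
```

Like the built-in checksums, such a hash only detects a modification;
repairing it still needs redundancy such as the error correcting codes
mentioned above.
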
Security
While care has been applied to check the result of memory allocations
..., the decompressor code has not been audited and only limited
fuzzing has been applied to ensure it's not vulnerable to malicious
data -- be careful when you process data from untrusted sources.

dd_rescue -ptAL lzo=algo=lzo1x_1_15:compress,hash=alg=sha256 infile outfile
compresses data from infile into outfile using the algorithm
lzo1x_1_15 and calculates the sha256 hash value of outfile.
outfile will have the time stamp and access rights copied over from
infile, and it will be emptied first (if the file happens to
exist). The output file won't have encoded holes; errors in the
infile will result in zeros.

dd_rescue -aL MD5,lzo=compr:bench,MD5,lzo=decompress,MD5 infile infile2
will copy infile to infile2, compressing the data and
decompressing it again on the fly. It will output MD5 hashes for
the compressed data as well (though it's not stored) and for the
two infiles -- the output should be identical, obviously. This
command is rather artificial and is used for testing. The -a flag
makes dd_rescue detect zero blocks and create holes, thus
testing hole encoding (sparse files) and decoding as well if the
infile has sizable regions filled with zeros.

dd_rescue -s1M -S0 -L lzo=search,nodiscard infile.lzo outfile
will search for an lzop block header in infile.lzo starting at
position 1MiB into the file and decompress the remainder of the
file. On finding corrupted blocks, it will still write the
output from the decompressor to outfile.

dd_rescue(1) liblzo2 documentation lzop(1)

Kurt Garloff <kurt@garloff.de>

The liblzo2 library and algorithm were written by Markus Oberhumer.
http://www.oberhumer.com/opensource/lzo/

This plugin is under the same license as dd_rescue: The GNU General
Public License (GPL) v2 or v3 - at your option.

The ddr_lzo plugin was first introduced with dd_rescue 1.43 (May 2014).

Some additional information can be found on
http://garloff.de/kurt/linux/ddrescue/

Kurt Garloff 2014-05-12 ddr_lzo(1)