LBZIP2(1)                        User commands                        LBZIP2(1)

NAME
lbzip2 - parallel bzip2 utility

SYNOPSIS
lbzip2|bzip2 [-n WTHRS] [-k|-c|-t] [-d] [-1 .. -9] [-f] [-s] [-u] [-v]
[-S] [ FILE ... ]

lbunzip2|bunzip2 [-n WTHRS] [-k|-c|-t] [-z] [-f] [-s] [-u] [-v] [-S]
[ FILE ... ]

lbzcat|bzcat [-n WTHRS] [-z] [-f] [-s] [-u] [-v] [-S] [ FILE ... ]

lbzip2|bzip2|lbunzip2|bunzip2|lbzcat|bzcat -h

DESCRIPTION
Compress or decompress FILE operands or standard input to regular files
or standard output using the Burrows-Wheeler block-sorting text
compression algorithm. The lbzip2 utility employs multiple threads and
an input-bound splitter even when decompressing .bz2 files created by
standard bzip2.

Compression is generally considerably better than that achieved by more
conventional LZ77/LZ78-based compressors, and competitive with all but
the best of the PPM family of statistical compressors.

Compression is always performed, even if the compressed file is
slightly larger than the original. The worst case expansion is for
files of zero length, which expand to fourteen bytes. Random data
(including the output of most file compressors) is coded with an
asymptotic expansion of around 0.5%.
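
Both figures can be checked with a short sketch. It uses Python's
standard bz2 module (which wraps libbzip2) rather than lbzip2 itself, on
the assumption that both behave the same way for these inputs. The seed
and the 1,000,000-byte sample size are arbitrary choices.

```python
import bz2
import random

# An empty input still needs a stream header and trailer.
empty_size = len(bz2.compress(b""))

# Pseudo-random bytes stand in for incompressible data.
n = 1_000_000
sample = random.Random(0).getrandbits(8 * n).to_bytes(n, "little")
packed_size = len(bz2.compress(sample, 9))
expansion_percent = 100.0 * (packed_size - n) / n

print(empty_size)
print(f"{expansion_percent:.2f}% larger than the input")
```

The exact expansion depends on the data and the block size, so the
second figure should be read as an order of magnitude only.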

The command-line options are deliberately very similar to those of
bzip2 and gzip, but they are not identical.

INVOCATION
The default mode of operation is compression. If the utility is invoked
as lbunzip2 or bunzip2, the mode is switched to decompression. Calling
the utility as lbzcat or bzcat selects decompression, with the
decompressed byte-stream written to standard output.
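
The selection rule above can be sketched as a small lookup. The function
name default_mode is hypothetical, and exact matching on the base name
of argv[0] is an assumption; the real program may inspect the invocation
name differently.

```python
import os
from typing import Tuple

def default_mode(argv0: str) -> Tuple[str, bool]:
    """Return (mode, write_to_stdout) implied by the invocation name alone."""
    name = os.path.basename(argv0)
    if name in ("lbunzip2", "bunzip2"):
        return ("decompress", False)
    if name in ("lbzcat", "bzcat"):
        return ("decompress", True)
    # Any other name, including lbzip2 and bzip2, means compression.
    return ("compress", False)

print(default_mode("/usr/bin/lbzcat"))  # ('decompress', True)
```

The -d and -z options described under OPTIONS override whatever this
lookup would select.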

OPTIONS
-n WTHRS
    Set the number of (de)compressor threads to WTHRS. If this option
    is not specified, lbzip2 tries to query the system for the number
    of online processors (if both the compilation environment and the
    execution environment support that), or exits with an error (if
    it's unable to determine the number of processors online).

-k, --keep
    Don't remove FILE operands after successful (de)compression. Open
    regular input files with more than one link.

-c, --stdout
    Write output to standard output, even when FILE operands are
    present. Implies -k and excludes -t.

-t, --test
    Test decompression; discard output instead of writing it to files
    or standard output. Implies -k and excludes -c. Roughly equivalent
    to passing -c and redirecting standard output to the bit bucket.

-d, --decompress
    Force decompression over the mode of operation selected by the
    invocation name.

-z, --compress
    Force compression over the mode of operation selected by the
    invocation name.

-1 .. -9
    Set the compression block size to 100K .. 900K, in 100K
    increments. Ignored during decompression. See also the BLOCK SIZE
    section below.

--fast
    Alias for -1.

--best
    Alias for -9. This is the default.

-f, --force
    Open non-regular input files. Open input files with more than one
    link, breaking links when -k isn't specified in addition. Try to
    remove each output file before opening it. By default lbzip2 will
    not overwrite existing files; if you want this to happen, you
    should specify -f. If -c and -d are also given, don't reject files
    that are not in bzip2 format; just copy them without change.
    Without -f, lbzip2 would stop after reaching a file that is not in
    bzip2 format.

-s, --small
    Reduce memory usage at the cost of performance.
114
115
116 -u, --sequential
117 Perform splitting input blocks sequentially. This may improve
118 compression ratio and decrease CPU usage, but will degrade scal‐
119 ability.

-v, --verbose
    Be more verbose. Print more detailed information about
    (de)compression progress to standard error: before processing each
    file, print a message stating the names of the input and output
    files; during (de)compression, print a rough percentage of
    completeness and the estimated time of arrival (only if standard
    error is connected to a terminal); after processing each file,
    print a message showing the compression ratio, space savings,
    total compression time (wall time) and average (de)compression
    speed (bytes of plain data processed per second).

-S
    Print condition variable statistics to standard error for each
    completed (de)compression operation. Useful in profiling.

-q, --quiet, --repetitive-fast, --repetitive-best, --exponential
    Accepted for compatibility with bzip2, otherwise ignored.

-h, --help
    Print help on command-line usage on standard output and exit
    successfully.

-L, --license, -V, --version
    Print license and version information on standard output and exit
    successfully.
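
As an illustration of the -n default described above, the following
sketch resolves a worker thread count the same way in outline. The
function name resolve_worker_threads is hypothetical, and using
os.sysconf("SC_NPROCESSORS_ONLN") as the query is an assumption about
how "online processors" might be determined, not a statement about
lbzip2's source.

```python
import os
from typing import Optional

def resolve_worker_threads(requested: Optional[int]) -> int:
    """Return the -n value if given, else the number of online processors."""
    if requested is not None:
        return requested
    try:
        online = os.sysconf("SC_NPROCESSORS_ONLN")
    except (AttributeError, ValueError, OSError):
        # The platform does not support the query at all.
        online = -1
    if online < 1:
        # Mirrors the documented behaviour: give up rather than guess.
        raise RuntimeError("cannot determine online processors; pass -n WTHRS")
    return online

print(resolve_worker_threads(4))  # 4
```

Passing -n explicitly avoids the error path on systems where the
processor count cannot be queried.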

ENVIRONMENT
LBZIP2, BZIP2, BZIP
    Before parsing the command line, lbzip2 inserts the contents of
    these variables, in the order specified, between the invocation
    name and the rest of the command line. Tokens are separated by
    spaces and tabs, which cannot be escaped.
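
The insertion rule can be sketched as follows. The function name
effective_argv is hypothetical, and treating a run of several separators
as a single separator is an assumption; the text above only states that
spaces and tabs separate tokens and cannot be escaped.

```python
import re
from typing import Dict, List

def effective_argv(argv: List[str], environ: Dict[str, str]) -> List[str]:
    """Insert tokens from LBZIP2, BZIP2 and BZIP right after argv[0]."""
    inserted: List[str] = []
    for name in ("LBZIP2", "BZIP2", "BZIP"):
        value = environ.get(name, "")
        # Split on spaces and tabs only; quotes and backslashes are not special.
        inserted.extend(tok for tok in re.split(r"[ \t]+", value) if tok)
    return argv[:1] + inserted + argv[1:]

print(effective_argv(["lbzip2", "file.txt"],
                     {"LBZIP2": "-n 4", "BZIP2": "-k\t-v"}))
# ['lbzip2', '-n', '4', '-k', '-v', 'file.txt']
```

Because the tokens land before the real arguments, options given
explicitly on the command line appear later than those taken from the
environment.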

OPERANDS
FILE
    Specify files to compress or decompress.

    FILEs with .bz2, .tbz, .tbz2 and .tz2 name suffixes will be
    skipped when compressing. When decompressing, .bz2 suffixes will
    be removed in output filenames; .tbz, .tbz2 and .tz2 suffixes will
    be replaced by .tar; other filenames will be suffixed with .out.
    If an INT or TERM signal is delivered to lbzip2, then it removes
    the regular output file currently open before exiting.

    If no FILE is given, lbzip2 works as a filter, processing standard
    input to standard output. In this case, lbzip2 will decline to
    write compressed output to a terminal (or read compressed input
    from a terminal), as this would be entirely incomprehensible and
    therefore pointless.
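
The suffix rules above amount to a small mapping, sketched here. The
function names are hypothetical, and case-sensitive suffix matching is
an assumption.

```python
SKIPPED_WHEN_COMPRESSING = (".bz2", ".tbz", ".tbz2", ".tz2")

def skipped_when_compressing(name: str) -> bool:
    """True if the name already carries one of the recognised suffixes."""
    return name.endswith(SKIPPED_WHEN_COMPRESSING)

def decompressed_name(name: str) -> str:
    """Derive the output filename used when decompressing."""
    if name.endswith(".bz2"):
        return name[: -len(".bz2")]
    for suffix in (".tbz2", ".tbz", ".tz2"):
        if name.endswith(suffix):
            return name[: -len(suffix)] + ".tar"
    # Unrecognised names keep their full name and gain ".out".
    return name + ".out"

for example in ("log.bz2", "src.tbz", "src.tbz2", "src.tz2", "data.bin"):
    print(example, "->", decompressed_name(example))
```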

EXIT STATUS
0
    if lbzip2 finishes successfully. This presumes that whenever it
    tries, lbzip2 never fails to write to standard error.

1
    if lbzip2 encounters a fatal error.

4
    if lbzip2 issues warnings without encountering a fatal error. This
    presumes that whenever it tries, lbzip2 never fails to write to
    standard error.

SIGPIPE, SIGXFSZ
    if lbzip2 intends to exit with status 1 due to any fatal error,
    but any such signal with inherited SIG_DFL action was generated
    for lbzip2 previously, then lbzip2 terminates by way of one of
    said signals, after cleaning up any interrupted output file.

SIGABRT
    if a runtime assertion fails (i.e. lbzip2 detects a bug in
    itself). Hopefully whoever compiled your binary wasn't bold enough
    to #define NDEBUG.

SIGINT, SIGTERM
    lbzip2 catches these signals so that it can remove an interrupted
    output file. In such cases, lbzip2 exits by re-raising (one of)
    the received signal(s).
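
A calling script might interpret these outcomes as sketched below. The
function name describe_exit is hypothetical. The sketch relies on the
convention of Python's subprocess module, which reports termination by
signal as a negative return code.

```python
import signal

def describe_exit(returncode: int) -> str:
    """Map a subprocess return code to the outcomes listed above."""
    if returncode < 0:
        try:
            return "terminated by " + signal.Signals(-returncode).name
        except ValueError:
            return f"terminated by signal {-returncode}"
    if returncode == 0:
        return "success"
    if returncode == 1:
        return "fatal error"
    if returncode == 4:
        return "completed with warnings"
    return f"unexpected status {returncode}"

print(describe_exit(4))                # completed with warnings
print(describe_exit(-signal.SIGTERM))  # terminated by SIGTERM
```

A caller would typically pass in the returncode attribute of the object
returned by subprocess.run.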

BLOCK SIZE
lbzip2 compresses large files in blocks. It can operate at various
block sizes, ranging from 100k to 900k in 100k steps, and it allocates
only as much memory as it needs to. The block size affects both the
compression ratio achieved and the amount of memory needed for
compression and decompression. Compression and decompression speed is
virtually unaffected by block size, provided that the file being
processed is large enough to be split among all worker threads.

The flags -1 through -9 specify the block size to be 100,000 bytes
through 900,000 bytes (the default) respectively. At decompression
time, the block size used for compression is read from the compressed
file; the flags -1 to -9 are irrelevant to decompression and are
therefore ignored.

Larger block sizes give rapidly diminishing marginal returns; most of
the compression comes from the first two or three hundred k of block
size, a fact worth bearing in mind when using lbzip2 on small machines.
It is also important to appreciate that the decompression memory
requirement is set at compression time by the choice of block size. In
general you should try to use the largest block size that memory
constraints allow.

Another significant point applies to small files. By design, only one
of lbzip2's worker threads can work on a single block. This means that
if the number of blocks in the compressed file is less than the number
of processors online, then some of the worker threads will remain idle
for the entire time. Compressing small files with smaller block sizes
can therefore significantly increase both compression and decompression
speed. The speed difference is more noticeable as the number of CPU
cores grows.
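
The effect on small files can be estimated with simple arithmetic, as in
this sketch. The function names are hypothetical, and the block count is
only an approximation: it divides the raw input size by the nominal
block size, whereas the real number of blocks depends on the data.

```python
import math

def estimated_blocks(input_bytes: int, level: int) -> int:
    """Approximate block count for flags -1 .. -9 (100,000 bytes per step)."""
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    return max(1, math.ceil(input_bytes / (level * 100_000)))

def idle_workers(input_bytes: int, level: int, workers: int) -> int:
    """Workers left without a block, since one block occupies one worker."""
    return max(0, workers - estimated_blocks(input_bytes, level))

# A 2,000,000-byte file on 8 workers:
print(idle_workers(2_000_000, 9, 8))  # about 3 blocks, so 5 workers idle
print(idle_workers(2_000_000, 1, 8))  # about 20 blocks, so 0 workers idle
```

For this input, -1 keeps every worker busy while -9 leaves most of them
idle, at the price of a somewhat lower compression ratio.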

ERROR HANDLING
Dealing with error conditions is the least satisfactory aspect of
lbzip2. The policy is to try to leave the filesystem in a consistent
state, then quit, even if it means not processing some of the files
mentioned on the command line.

`A consistent state' means that a file exists either in its compressed
or uncompressed form, but not both. This boils down to the rule `delete
the output file if an error condition occurs, leaving the input
intact'. Input files are only deleted when we can be pretty sure the
output file has been written and closed successfully.
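
The rule can be sketched in a few lines. This is an illustration of the
policy only, written with Python's standard bz2 module; the function
name compress_consistently is hypothetical and the code does not
reproduce how lbzip2 itself is implemented.

```python
import bz2
import os
import shutil

def compress_consistently(src: str, dst: str) -> None:
    """Compress src into dst; on failure remove dst and keep src."""
    created = False
    try:
        # Mode "xb" refuses to overwrite an existing output file.
        with bz2.open(dst, "xb") as fout:
            created = True
            with open(src, "rb") as fin:
                shutil.copyfileobj(fin, fout)
    except BaseException:
        # Delete only an output this call created; the input stays intact.
        if created and os.path.exists(dst):
            os.remove(dst)
        raise
    # Reached only after the output was written and closed without error.
    os.remove(src)
```

Removing the input as the very last step is what keeps a failure at any
earlier point from losing data.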

RESOURCE ALLOCATION
lbzip2 needs various kinds of system resources to operate. Those
include memory, threads, mutexes and condition variables. The policy is
to simply give up if a resource allocation failure occurs.

Resource consumption grows linearly with the number of worker threads.
If lbzip2 fails because of a lack of some resource, decreasing the
number of worker threads may help. It would be possible for lbzip2 to
try to reduce the number of worker threads (and hence the resource
consumption), or to move on to subsequent files in the hope that some
might need fewer resources, but the complications of doing this seem
more trouble than they are worth.

DAMAGED FILES
lbzip2 attempts to compress data by performing several non-trivial
transformations on it. Every compression of a file implies an
assumption that the compressed file can be decompressed to reproduce
the original. Great efforts in design, coding and testing have been
made to ensure that this program works correctly. However, the
complexity of the algorithms, and, in particular, the presence of
various special cases in the code which occur with very low but
non-zero probability make it very difficult to rule out the possibility
of bugs remaining in the program. That is not to say this program is
inherently unreliable. Indeed, I very much hope the opposite is true:
lbzip2 has been carefully constructed and extensively tested.

As a self-check for your protection, lbzip2 uses 32-bit CRCs to make
sure that the decompressed version of a file is identical to the
original. This guards against corruption of the compressed data, and
against undiscovered bugs in lbzip2 (hopefully unlikely). The chance of
data corruption going undetected is microscopic, about one in four
billion for each file processed. Be aware, though, that the check
occurs upon decompression, so it can only tell you that something is
wrong.

CRCs can only detect corrupted files; they can't help you recover the
original, uncompressed data. However, because of the block nature of
the compression algorithm, it may be possible to recover some parts of
the damaged file, even if some blocks are destroyed.
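
The self-check can be observed with a short sketch. It uses Python's
standard bz2 module (libbzip2) rather than lbzip2, on the assumption
that both verify the same stored CRCs. The function name
corruption_detected is hypothetical, and the byte offset relies on the
bzip2 layout of a 4-byte stream header followed by a 6-byte block
marker.

```python
import bz2

def corruption_detected(compressed: bytes, offset: int) -> bool:
    """Flip one bit at offset and report whether decompression fails."""
    damaged = bytearray(compressed)
    damaged[offset] ^= 0x01
    try:
        bz2.decompress(bytes(damaged))
    except (OSError, ValueError):
        return True
    return False

original = b"lbzip2 self-check example\n" * 1000
compressed = bz2.compress(original)

# Byte 10 is the first byte of the first block's stored CRC.
detected = corruption_detected(compressed, 10)
print(detected)

# A 32-bit check misses a random corruption about once in 2**32 cases.
undetected_odds = 2 ** 32
print(undetected_odds)  # 4294967296
```

The value 2**32 is where the "one in four billion" figure comes from.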

BUGS
Separate input files don't share worker threads; at most one input file
is worked on at any moment.

AUTHORS
lbzip2 was originally written by Laszlo Ersek <lacos@caesar.elte.hu>,
http://lacos.hu/. Versions 2.0 and later were written by Mikolaj
Izdebski.

COPYRIGHT
Copyright (C) 2011, 2012, 2013 Mikolaj Izdebski
Copyright (C) 2008, 2009, 2010 Laszlo Ersek
Copyright (C) 1996 Julian Seward

This manual page is part of lbzip2, version 2.5. lbzip2 is free
software: you can redistribute it and/or modify it under the terms of
the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any
later version.

lbzip2 is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License along
with lbzip2. If not, see <http://www.gnu.org/licenses/>.

THANKS
Adam Maulis at ELTE IIG; Julian Seward; Paul Sladen; Michael Thomas
from Caltech HEP; Bryan Stillwell; Zsolt Bartos-Elekes; Imre Csatlos;
Gabor Kovesdan; Paul Wise; Paolo Bonzini; Department of Electrical and
Information Engineering at the University of Oulu; Yuta Mori.

SEE ALSO
lbzip2 home page
    http://lbzip2.org/

bzip2(1)
    http://www.bzip.org/

pbzip2(1)
    http://compression.ca/pbzip2/

bzip2smp(1)
    http://bzip2smp.sourceforge.net/

smpbzip2(1)
    http://home.student.utwente.nl/n.werensteijn/smpbzip2/

dbzip2(1)
    http://www.mediawiki.org/wiki/Dbzip2

p7zip(1)
    http://p7zip.sourceforge.net/

lbzip2-2.5                        26 March 2014                        LBZIP2(1)