XZ(1)                              XZ Utils                              XZ(1)

NAME
xz, unxz, xzcat, lzma, unlzma, lzcat - Compress or decompress .xz and
.lzma files

SYNOPSIS
xz [option]... [file]...
11
12 unxz is equivalent to xz --decompress.
13 xzcat is equivalent to xz --decompress --stdout.
14 lzma is equivalent to xz --format=lzma.
15 unlzma is equivalent to xz --format=lzma --decompress.
16 lzcat is equivalent to xz --format=lzma --decompress --stdout.
17
18 When writing scripts that need to decompress files, it is recommended
19 to always use the name xz with appropriate arguments (xz -d or xz -dc)
20 instead of the names unxz and xzcat.
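As a minimal sketch of that recommendation, the following round trip uses only the name xz, with -c for compression to standard output and -dc in place of xzcat. The sample string and variable names are illustrative only.

```shell
# Compress from a pipe and decompress again using "xz -dc"
# instead of the xzcat alias.
original='hello, xz'
roundtrip=$(printf '%s' "$original" | xz -z -c | xz -dc)
echo "$roundtrip"
```

A script written this way does not depend on the alias names being installed.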
21
DESCRIPTION
xz is a general-purpose data compression tool with command line syntax
similar to gzip(1) and bzip2(1). The native file format is the .xz
format, but the legacy .lzma format and raw compressed streams with no
container format headers are also supported.
27
xz compresses or decompresses each file according to the selected
operation mode. If no files are given or file is -, xz reads from
standard input and writes the processed data to standard output. xz
will refuse to write compressed data to standard output if it is a
terminal: it displays an error and skips the file. Similarly, xz will
refuse to read compressed data from standard input if it is a terminal.
34
35 Unless --stdout is specified, files other than - are written to a new
36 file whose name is derived from the source file name:
37
38 · When compressing, the suffix of the target file format (.xz or
39 .lzma) is appended to the source filename to get the target file‐
40 name.
41
42 · When decompressing, the .xz or .lzma suffix is removed from the
43 filename to get the target filename. xz also recognizes the suf‐
44 fixes .txz and .tlz, and replaces them with the .tar suffix.
45
46 If the target file already exists, an error is displayed and the file
47 is skipped.
48
49 Unless writing to standard output, xz will display a warning and skip
50 the file if any of the following applies:
51
52 · File is not a regular file. Symbolic links are not followed, thus
53 they are never considered to be regular files.
54
55 · File has more than one hardlink.
56
57 · File has setuid, setgid, or sticky bit set.
58
59 · The operation mode is set to compress, and the file already has a
60 suffix of the target file format (.xz or .txz when compressing to
61 the .xz format, and .lzma or .tlz when compressing to the .lzma for‐
62 mat).
63
64 · The operation mode is set to decompress, and the file doesn't have a
65 suffix of any of the supported file formats (.xz, .txz, .lzma, or
66 .tlz).
67
68 After successfully compressing or decompressing the file, xz copies the
69 owner, group, permissions, access time, and modification time from the
70 source file to the target file. If copying the group fails, the permis‐
71 sions are modified so that the target file doesn't become accessible to
72 users who didn't have permission to access the source file. xz doesn't
73 support copying other metadata like access control lists or extended
74 attributes yet.
75
76 Once the target file has been successfully closed, the source file is
77 removed unless --keep was specified. The source file is never removed
78 if the output is written to standard output.
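The naming and removal rules described above can be observed with a short experiment. This is only a sketch: the temporary directory and the file name notes.txt are made up for the illustration.

```shell
tmp=$(mktemp -d)
printf 'sample data\n' > "$tmp/notes.txt"

# Compressing appends .xz and removes the source file.
xz "$tmp/notes.txt"
if [ -e "$tmp/notes.txt" ]; then source_after_compress=present
else source_after_compress=removed; fi

# Decompressing with --keep strips .xz and leaves the .xz file in place.
xz -d -k "$tmp/notes.txt.xz"
if [ -e "$tmp/notes.txt" ] && [ -e "$tmp/notes.txt.xz" ]; then
    both_after_decompress=yes
else
    both_after_decompress=no
fi

echo "source after compress: $source_after_compress"
echo "both files after 'xz -d -k': $both_after_decompress"
rm -rf "$tmp"
```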
79
80 Sending SIGINFO or SIGUSR1 to the xz process makes it print progress
81 information to standard error. This has only limited use since when
82 standard error is a terminal, using --verbose will display an automati‐
83 cally updating progress indicator.
84
85 Memory usage
The memory usage of xz varies from a few hundred kilobytes to several
gigabytes depending on the compression settings. The settings used when
compressing a file also affect the memory usage of the decompressor.
Typically the decompressor needs only 5 % to 20 % of the amount of RAM
that the compressor needed when creating the file. Still, the
worst-case memory usage of the decompressor is several gigabytes.
92
To prevent uncomfortable surprises caused by huge memory usage, xz has
a built-in memory usage limiter. The default limit is 40 % of total
physical RAM. While operating systems provide ways to limit the memory
usage of processes, relying on them wasn't deemed to be flexible enough.
97
98 When compressing, if the selected compression settings exceed the mem‐
99 ory usage limit, the settings are automatically adjusted downwards and
100 a notice about this is displayed. As an exception, if the memory usage
101 limit is exceeded when compressing with --format=raw, an error is dis‐
102 played and xz will exit with exit status 1.
103
If the source file cannot be decompressed without exceeding the memory
usage limit, an error message is displayed and the file is skipped.
Note that compressed files may contain many blocks, which may have been
compressed with different settings. Typically all blocks have roughly
the same memory requirements, but a block later in the file may exceed
the memory usage limit. In that case the error about a too low memory
usage limit is displayed only after some data has already been
decompressed.
112
The absolute value of the active memory usage limit can be seen near
the bottom of the output of --long-help. The default limit can be
overridden with --memory=limit.
116
OPTIONS
Integer suffixes and special values
119 In most places where an integer argument is expected, an optional suf‐
120 fix is supported to easily indicate large integers. There must be no
121 space between the integer and the suffix.
122
123 k or kB
124 The integer is multiplied by 1,000 (10^3). For example, 5k or
125 5kB equals 5000.
126
127 Ki or KiB
128 The integer is multiplied by 1,024 (2^10).
129
130 M or MB
131 The integer is multiplied by 1,000,000 (10^6).
132
133 Mi or MiB
134 The integer is multiplied by 1,048,576 (2^20).
135
136 G or GB
137 The integer is multiplied by 1,000,000,000 (10^9).
138
139 Gi or GiB
140 The integer is multiplied by 1,073,741,824 (2^30).
141
142 A special value max can be used to indicate the maximum integer value
143 supported by the option.
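The multipliers listed above can be checked with plain shell arithmetic. This sketch does not call xz at all; the variable names are chosen only for the illustration.

```shell
# Decimal and binary multipliers from the list above.
kB=1000
KiB=1024
MB=$((kB * kB))
MiB=$((KiB * KiB))
GB=$((MB * kB))
GiB=$((MiB * KiB))

five_k=$((5 * kB))          # what "5k" or "5kB" stands for
eighty_mib=$((80 * MiB))    # what "80MiB" stands for
one_gib=$((1 * GiB))        # what "1GiB" stands for
echo "5k = $five_k, 80MiB = $eighty_mib, 1GiB = $one_gib"
```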
144
145 Operation mode
146 If multiple operation mode options are given, the last one takes
147 effect.
148
149 -z, --compress
150 Compress. This is the default operation mode when no operation
151 mode option is specified, and no other operation mode is implied
152 from the command name (for example, unxz implies --decompress).
153
154 -d, --decompress, --uncompress
155 Decompress.
156
157 -t, --test
158 Test the integrity of compressed files. No files are created or
159 removed. This option is equivalent to --decompress --stdout
160 except that the decompressed data is discarded instead of being
161 written to standard output.
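A sketch of how --test might be used in a script follows. The damaged file is simulated by truncation, and the file names are hypothetical. Only the exit status is inspected; -q -q suppresses the error text for the damaged file.

```shell
tmp=$(mktemp -d)
printf 'integrity test\n' | xz -c > "$tmp/good.xz"
# Simulate damage by keeping only the first 20 bytes of the file.
head -c 20 "$tmp/good.xz" > "$tmp/truncated.xz"

good_status=0
xz -t "$tmp/good.xz" || good_status=$?

bad_status=0
xz -t -q -q "$tmp/truncated.xz" || bad_status=$?

echo "good: $good_status, truncated: $bad_status"
rm -rf "$tmp"
```

No files are created or removed by xz -t itself; only the exit status reports the result.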
162
163 -l, --list
164 View information about the compressed files. No uncompressed
165 output is produced, and no files are created or removed. In list
166 mode, the program cannot read the compressed data from standard
167 input or from other unseekable sources.
168
169 This feature has not been implemented yet.
170
171 Operation modifiers
172 -k, --keep
173 Keep (don't delete) the input files.
174
175 -f, --force
176 This option has several effects:
177
178 · If the target file already exists, delete it before compress‐
179 ing or decompressing.
180
181 · Compress or decompress even if the input is not a regular
182 file, has more than one hardlink, or has setuid, setgid, or
183 sticky bit set. The setuid, setgid, and sticky bits are not
184 copied to the target file.
185
· If combined with --decompress --stdout and xz doesn't recognize
the type of the source file, xz will copy the source file as is
to standard output. This allows using xzcat --force like cat(1)
for files that have not been compressed with xz. Note that in the
future, xz might support new compressed file formats, which may
make xz decompress more types of files instead of copying them
as is to standard output. --format=format can be used to restrict
xz to decompress only a single file format.
195
196 · Allow writing compressed data to a terminal, and reading com‐
197 pressed data from a terminal.
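The cat(1)-like behaviour of --force combined with --decompress --stdout can be sketched as follows. The file name plain.txt is an assumption of the example, not something xz requires.

```shell
tmp=$(mktemp -d)
printf 'plain text, never compressed\n' > "$tmp/plain.txt"

# -d -c -f: the input is not a recognized compressed format,
# so it is copied to standard output unchanged.
passthrough=$(xz -dcf "$tmp/plain.txt")
echo "$passthrough"
rm -rf "$tmp"
```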
198
199 -c, --stdout, --to-stdout
200 Write the compressed or decompressed data to standard output
201 instead of a file. This implies --keep.
202
203 -S .suf, --suffix=.suf
204 When compressing, use .suf as the suffix for the target file
205 instead of .xz or .lzma. If not writing to standard output and
206 the source file already has the suffix .suf, a warning is dis‐
207 played and the file is skipped.
208
209 When decompressing, recognize also files with the suffix .suf in
210 addition to files with the .xz, .txz, .lzma, or .tlz suffix. If
211 the source file has the suffix .suf, the suffix is removed to
212 get the target filename.
213
214 When compressing or decompressing raw streams (--format=raw),
215 the suffix must always be specified unless writing to standard
216 output, because there is no default suffix for raw streams.
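A short sketch of --suffix in both directions. The suffix .packed and the file name data are arbitrary choices for the illustration.

```shell
tmp=$(mktemp -d)
printf 'suffix demo\n' > "$tmp/data"

xz -S .packed "$tmp/data"              # writes data.packed, removes data
compressed_name=$(ls "$tmp")

xz -d -S .packed "$tmp/data.packed"    # strips .packed to get data back
restored=$(cat "$tmp/data")

echo "$compressed_name -> $restored"
rm -rf "$tmp"
```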
217
218 --files[=file]
219 Read the filenames to process from file; if file is omitted,
220 filenames are read from standard input. Filenames must be termi‐
221 nated with the newline character. If filenames are given also as
222 command line arguments, they are processed before the filenames
223 read from file.
224
225 --files0[=file]
226 This is identical to --files[=file] except that the filenames
227 must be terminated with the null character.
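One plausible use of --files0 is to feed it the output of find(1) with null-terminated names, which keeps file names containing spaces intact. The -print0 option of find is a common extension rather than strict POSIX, and the file names below are made up.

```shell
tmp=$(mktemp -d)
printf 'a\n' > "$tmp/first file.log"
printf 'b\n' > "$tmp/second.log"

# No file is given to --files0, so the names are read from standard input.
find "$tmp" -name '*.log' -print0 | xz --files0

compressed_count=$(find "$tmp" -name '*.log.xz' | wc -l | tr -d ' ')
echo "compressed files: $compressed_count"
rm -rf "$tmp"
```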
228
229 Basic file format and compression options
230 -F format, --format=format
231 Specify the file format to compress or decompress:
232
233 · auto: This is the default. When compressing, auto is equiva‐
234 lent to xz. When decompressing, the format of the input file
235 is autodetected. Note that raw streams (created with --for‐
236 mat=raw) cannot be autodetected.
237
238 · xz: Compress to the .xz file format, or accept only .xz files
239 when decompressing.
240
241 · lzma or alone: Compress to the legacy .lzma file format, or
242 accept only .lzma files when decompressing. The alternative
243 name alone is provided for backwards compatibility with LZMA
244 Utils.
245
246 · raw: Compress or uncompress a raw stream (no headers). This
247 is meant for advanced users only. To decode raw streams, you
248 need to set not only --format=raw but also specify the filter
249 chain, which would normally be stored in the container format
250 headers.
251
252 -C check, --check=check
253 Specify the type of the integrity check, which is calculated
254 from the uncompressed data. This option has an effect only when
255 compressing into the .xz format; the .lzma format doesn't sup‐
256 port integrity checks. The integrity check (if any) is verified
257 when the .xz file is decompressed.
258
259 Supported check types:
260
· none: Don't calculate an integrity check at all. This is usually
a bad idea, but it can be useful when the integrity of the data
is verified by other means anyway.
264
265 · crc32: Calculate CRC32 using the polynomial from IEEE-802.3
266 (Ethernet).
267
268 · crc64: Calculate CRC64 using the polynomial from ECMA-182.
269 This is the default, since it is slightly better than CRC32
270 at detecting damaged files and the speed difference is negli‐
271 gible.
272
273 · sha256: Calculate SHA-256. This is somewhat slower than CRC32
274 and CRC64.
275
276 Integrity of the .xz headers is always verified with CRC32. It
277 is not possible to change or disable it.
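The check types can be tried side by side as in the sketch below, which writes one .xz file per check type and then verifies each with --test. It assumes an xz build with all four check types enabled; the file names are illustrative.

```shell
tmp=$(mktemp -d)
printf 'check demo\n' > "$tmp/in.txt"

for check in none crc32 crc64 sha256; do
    # -c writes to standard output, so in.txt is left untouched.
    xz -c --check="$check" "$tmp/in.txt" > "$tmp/$check.xz"
done

verified=0
for f in "$tmp"/*.xz; do
    xz -t "$f" && verified=$((verified + 1))
done
echo "files that passed xz -t: $verified"
rm -rf "$tmp"
```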
278
279 -0 ... -9
280 Select compression preset. If a preset level is specified multi‐
281 ple times, the last one takes effect.
282
283 The compression preset levels can be categorised roughly into
284 three categories:
285
286 -0 ... -2
Fast presets with relatively low memory usage. -1 and -2
should give compression speed and ratios comparable to
bzip2 -1 and bzip2 -9, respectively. Currently -0 is not
very good: it is not much faster than -1 but compresses
much worse. In the future, -0 may indicate some fast
algorithm instead of LZMA2.
293
294 -3 ... -5
295 Good compression ratio with low to medium memory usage.
296 These are significantly slower than levels 0-2.
297
298 -6 ... -9
299 Excellent compression with medium to high memory usage.
300 These are also slower than the lower preset levels. The
301 default is -6. Unless you want to maximize the compres‐
302 sion ratio, you probably don't want a higher preset level
303 than -7 due to speed and memory usage.
304
305 The exact compression settings (filter chain) used by each pre‐
306 set may vary between xz versions. The settings may also vary
307 between files being compressed, if xz determines that modified
308 settings will probably give better compression ratio without
309 significantly affecting compression time or memory usage.
310
311 Because the settings may vary, the memory usage may vary too.
312 The following table lists the maximum memory usage of each pre‐
313 set level, which won't be exceeded even in future versions of
314 xz.
315
316 FIXME: The table below is just a rough idea.
317
Preset   Compression   Decompression
  -0        6 MiB          1 MiB
  -1        6 MiB          1 MiB
  -2       10 MiB          1 MiB
  -3       20 MiB          2 MiB
  -4       30 MiB          3 MiB
  -5       60 MiB          6 MiB
  -6      100 MiB         10 MiB
  -7      200 MiB         20 MiB
  -8      400 MiB         40 MiB
  -9      800 MiB         80 MiB
329
330 When compressing, xz automatically adjusts the compression set‐
331 tings downwards if the memory usage limit would be exceeded, so
332 it is safe to specify a high preset level even on systems that
333 don't have lots of RAM.
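To get a feel for the presets on your own data, a loop like the following prints the compressed size for a few levels. This is a sketch only: the generated sample text is artificial, and the sizes depend on the input and on the xz version, so none are shown here.

```shell
tmp=$(mktemp -d)
# Artificial, repetitive sample input; real data will behave differently.
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i of some fairly repetitive sample text"
    i=$((i + 1))
done > "$tmp/sample.txt"

for preset in 0 3 6; do
    size=$(xz -"$preset" -c "$tmp/sample.txt" | wc -c | tr -d ' ')
    echo "preset -$preset: $size bytes"
done

# Whatever the preset, decompression restores the input.
restored_lines=$(xz -6 -c "$tmp/sample.txt" | xz -dc | wc -l | tr -d ' ')
rm -rf "$tmp"
```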
334
335 --fast and --best
336 These are somewhat misleading aliases for -0 and -9, respec‐
337 tively. These are provided only for backwards compatibility
338 with LZMA Utils. Avoid using these options.
339
The name of --best is especially misleading, because the
definition of best depends on the input data, and because people
usually don't want the very best compression ratio anyway, since
it would be very slow.
344
345 -e, --extreme
346 Modify the compression preset (-0 ... -9) so that a little bit
347 better compression ratio can be achieved without increasing mem‐
348 ory usage of the compressor or decompressor (exception: compres‐
349 sor memory usage may increase a little with presets -0 ... -2).
350 The downside is that the compression time will increase dramati‐
351 cally (it can easily double).
352
353 -M limit, --memory=limit
Set the memory usage limit. If this option is specified multiple
times, the last one takes effect. The limit can be specified in
multiple ways:
357
358 · The limit can be an absolute value in bytes. Using an integer
359 suffix like MiB can be useful. Example: --memory=80MiB
360
361 · The limit can be specified as a percentage of physical RAM.
362 Example: --memory=70%
363
364 · The limit can be reset back to its default value (currently
365 40 % of physical RAM) by setting it to 0.
366
367 · The memory usage limiting can be effectively disabled by set‐
368 ting limit to max. This isn't recommended. It's usually bet‐
369 ter to use, for example, --memory=90%.
370
371 The current limit can be seen near the bottom of the output of
372 the --long-help option.
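The two most common forms of the limit, an absolute value with a suffix and a percentage of RAM, can be sketched as below. Preset -3 is used so that the 80 MiB limit is not expected to be reached; the payload string is arbitrary.

```shell
payload='memory limit demo'

# Absolute limit with an integer suffix.
with_bytes=$(printf '%s' "$payload" | xz -M 80MiB -3 -c | xz -M 80MiB -dc)

# Limit given as a percentage of physical RAM.
with_percent=$(printf '%s' "$payload" | xz -M 70% -3 -c | xz -M 70% -dc)

echo "$with_bytes / $with_percent"
```

With a preset whose requirements exceed the limit, the text above says that the compressor scales the settings down and prints a notice instead of failing, except with --format=raw.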
373
374 -T threads, --threads=threads
375 Specify the maximum number of worker threads to use. The default
376 is the number of available CPU cores. You can see the current
377 value of threads near the end of the output of the --long-help
378 option.
379
380 The actual number of worker threads can be less than threads if
381 using more threads would exceed the memory usage limit. In
382 addition to CPU-intensive worker threads, xz may use a few aux‐
383 iliary threads, which don't use a lot of CPU time.
384
385 Multithreaded compression and decompression are not implemented
386 yet, so this option has no effect for now.
387
388 Custom compressor filter chains
389 A custom filter chain allows specifying the compression settings in
390 detail instead of relying on the settings associated to the preset lev‐
391 els. When a custom filter chain is specified, the compression preset
392 level options (-0 ... -9 and --extreme) are silently ignored.
393
394 A filter chain is comparable to piping on the UN*X command line. When
395 compressing, the uncompressed input goes to the first filter, whose
396 output goes to the next filter (if any). The output of the last filter
397 gets written to the compressed file. The maximum number of filters in
398 the chain is four, but typically a filter chain has only one or two
399 filters.
400
401 Many filters have limitations where they can be in the filter chain:
402 some filters can work only as the last filter in the chain, some only
403 as a non-last filter, and some work in any position in the chain.
404 Depending on the filter, this limitation is either inherent to the fil‐
405 ter design or exists to prevent security issues.
406
407 A custom filter chain is specified by using one or more filter options
408 in the order they are wanted in the filter chain. That is, the order of
409 filter options is significant! When decoding raw streams (--for‐
410 mat=raw), the filter chain is specified in the same order as it was
411 specified when compressing.
412
413 Filters take filter-specific options as a comma-separated list. Extra
414 commas in options are ignored. Every option has a default value, so you
415 need to specify only those you want to change.
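As an illustration of a custom chain and of the ordering rule for raw streams, the sketch below compresses with a Delta filter followed by LZMA2 and then decodes with the same options in the same order. The option values (dist=2, dict=1MiB) and file names are arbitrary choices for the example.

```shell
tmp=$(mktemp -d)
printf 'raw stream demo\n' > "$tmp/in.bin"

# The same filter options, in the same order, for both directions.
# The variable is left unquoted on purpose so that it splits into
# two separate options.
chain='--delta=dist=2 --lzma2=dict=1MiB'

xz --format=raw $chain -c "$tmp/in.bin" > "$tmp/out.raw"
raw_roundtrip=$(xz -d --format=raw $chain -c "$tmp/out.raw")

echo "$raw_roundtrip"
rm -rf "$tmp"
```

Because a raw stream carries no headers, nothing in out.raw records the chain; the options have to be kept alongside the file by other means.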
416
417 --lzma1[=options], --lzma2[=options]
Add LZMA1 or LZMA2 filter to the filter chain. These filters can
be used only as the last filter in the chain.
420
421 LZMA1 is a legacy filter, which is supported almost solely due
422 to the legacy .lzma file format, which supports only LZMA1.
423 LZMA2 is an updated version of LZMA1 to fix some practical
424 issues of LZMA1. The .xz format uses LZMA2, and doesn't support
425 LZMA1 at all. Compression speed and ratios of LZMA1 and LZMA2
426 are practically the same.
427
428 LZMA1 and LZMA2 share the same set of options:
429
430 preset=preset
Reset all LZMA1 or LZMA2 options to preset. A preset
consists of an integer, which may be followed by
single-letter preset modifiers. The integer can be from 0
to 9, matching the command line options -0 ... -9. The
only supported modifier is currently e, which matches
--extreme.
437
438 The default preset is 6, from which the default values
439 for the rest of the LZMA1 or LZMA2 options are taken.
440
441 dict=size
Dictionary (history buffer) size indicates how many bytes
of the recently processed uncompressed data are kept in
memory. One method to reduce the size of the uncompressed
data is to store distance-length pairs, which indicate
what data to repeat from the dictionary buffer. The
bigger the dictionary, the better the compression ratio
usually is, but dictionaries bigger than the uncompressed
data are a waste of RAM.
450
451 Typical dictionary size is from 64 KiB to 64 MiB. The
452 minimum is 4 KiB. The maximum for compression is cur‐
453 rently 1.5 GiB. The decompressor already supports dictio‐
454 naries up to one byte less than 4 GiB, which is the maxi‐
455 mum for LZMA1 and LZMA2 stream formats.
456
457 Dictionary size has the biggest effect on compression
458 ratio. Dictionary size and match finder together deter‐
459 mine the memory usage of the LZMA1 or LZMA2 encoder. The
460 same dictionary size is required for decompressing that
461 was used when compressing, thus the memory usage of the
462 decoder is determined by the dictionary size used when
463 compressing.
464
465 lc=lc Specify the number of literal context bits. The minimum
466 is 0 and the maximum is 4; the default is 3. In addi‐
467 tion, the sum of lc and lp must not exceed 4.
468
469 lp=lp Specify the number of literal position bits. The minimum
470 is 0 and the maximum is 4; the default is 0.
471
472 pb=pb Specify the number of position bits. The minimum is 0 and
473 the maximum is 4; the default is 2.
474
475 mode=mode
476 Compression mode specifies the function used to analyze
477 the data produced by the match finder. Supported modes
478 are fast and normal. The default is fast for presets 0-2
479 and normal for presets 3-9.
480
481 mf=mf Match finder has a major effect on encoder speed, memory
482 usage, and compression ratio. Usually Hash Chain match
483 finders are faster than Binary Tree match finders. Hash
484 Chains are usually used together with mode=fast and
485 Binary Trees with mode=normal. The memory usage formulas
486 are only rough estimates, which are closest to reality
487 when dict is a power of two.
488
489 hc3 Hash Chain with 2- and 3-byte hashing
490 Minimum value for nice: 3
491 Memory usage: dict * 7.5 (if dict <= 16 MiB);
492 dict * 5.5 + 64 MiB (if dict > 16 MiB)
493
494 hc4 Hash Chain with 2-, 3-, and 4-byte hashing
495 Minimum value for nice: 4
496 Memory usage: dict * 7.5
497
498 bt2 Binary Tree with 2-byte hashing
499 Minimum value for nice: 2
500 Memory usage: dict * 9.5
501
502 bt3 Binary Tree with 2- and 3-byte hashing
503 Minimum value for nice: 3
504 Memory usage: dict * 11.5 (if dict <= 16 MiB);
505 dict * 9.5 + 64 MiB (if dict > 16 MiB)
506
507 bt4 Binary Tree with 2-, 3-, and 4-byte hashing
508 Minimum value for nice: 4
509 Memory usage: dict * 11.5
510
511 nice=nice
512 Specify what is considered to be a nice length for a
513 match. Once a match of at least nice bytes is found, the
514 algorithm stops looking for possibly better matches.
515
nice can be 2-273 bytes. Higher values tend to give a
better compression ratio at the expense of speed. The
default depends on the preset level.
519
520 depth=depth
521 Specify the maximum search depth in the match finder. The
522 default is the special value 0, which makes the compres‐
523 sor determine a reasonable depth from mf and nice.
524
525 Using very high values for depth can make the encoder
526 extremely slow with carefully crafted files. Avoid set‐
527 ting the depth over 1000 unless you are prepared to
528 interrupt the compression in case it is taking too long.
529
530 When decoding raw streams (--format=raw), LZMA2 needs only the
531 value of dict. LZMA1 needs also lc, lp, and pb.
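The rough memory formulas given for the match finders under mf can be worked through with integer shell arithmetic. The dictionary sizes below (8 MiB and 32 MiB) are example inputs, and the results are only as accurate as the formulas themselves.

```shell
# Rough encoder memory estimates in MiB. The multipliers are written
# in tenths (115 = 11.5) because sh arithmetic is integer-only.
dict=8                              # dictionary size in MiB
bt4=$((dict * 115 / 10))            # bt4: dict * 11.5
hc4=$((dict * 75 / 10))             # hc4: dict * 7.5

big=32                              # a dictionary larger than 16 MiB
hc3_big=$((big * 55 / 10 + 64))     # hc3: dict * 5.5 + 64 MiB
bt3_big=$((big * 95 / 10 + 64))     # bt3: dict * 9.5 + 64 MiB

echo "bt4 @ 8 MiB: $bt4 MiB, hc4 @ 8 MiB: $hc4 MiB"
echo "hc3 @ 32 MiB: $hc3_big MiB, bt3 @ 32 MiB: $bt3_big MiB"
```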
532
533 --x86[=options]
534
535 --powerpc[=options]
536
537 --ia64[=options]
538
539 --arm[=options]
540
541 --armthumb[=options]
542
543 --sparc[=options]
Add a branch/call/jump (BCJ) filter to the filter chain. These
filters can be used only as a non-last filter in the filter chain.
546
547 A BCJ filter converts relative addresses in the machine code to
548 their absolute counterparts. This doesn't change the size of the
549 data, but it increases redundancy, which allows e.g. LZMA2 to
550 get better compression ratio.
551
The BCJ filters are always reversible, so using a BCJ filter for
the wrong type of data doesn't cause any data loss. However,
applying a BCJ filter to the wrong type of data is a bad idea,
because it tends to make the compression ratio worse.

Different instruction sets have different alignment:

Filter      Alignment   Notes
x86              1      32-bit and 64-bit x86
PowerPC          4      Big endian only
ARM              4      Little endian only
ARM-Thumb        2      Little endian only
IA-64           16      Big or little endian
SPARC            4      Big or little endian
566
567 Since the BCJ-filtered data is usually compressed with LZMA2,
568 the compression ratio may be improved slightly if the LZMA2
569 options are set to match the alignment of the selected BCJ fil‐
570 ter. For example, with the IA-64 filter, it's good to set pb=4
571 with LZMA2 (2^4=16). The x86 filter is an exception; it's usu‐
572 ally good to stick to LZMA2's default four-byte alignment when
573 compressing x86 executables.
574
575 All BCJ filters support the same options:
576
577 start=offset
578 Specify the start offset that is used when converting
579 between relative and absolute addresses. The offset must
580 be a multiple of the alignment of the filter (see the ta‐
581 ble above). The default is zero. In practice, the
582 default is good; specifying a custom offset is almost
583 never useful.
584
585 Specifying a non-zero start offset is probably useful
586 only if the executable has multiple sections, and there
587 are many cross-section jumps or calls. Applying a BCJ
588 filter separately for each section with proper start off‐
589 set and then compressing the result as a single chunk may
590 give some improvement in compression ratio compared to
591 applying the BCJ filter with the default offset for the
592 whole executable.
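A BCJ filter is normally placed in front of LZMA2, as in the following sketch. It assumes that /bin/sh exists and is readable; on a non-x86 machine the x86 filter is simply the wrong filter for that data, which by the reversibility noted above still round-trips correctly.

```shell
tmp=$(mktemp -d)
# Any executable will do for the sketch; /bin/sh is assumed to exist.
cp /bin/sh "$tmp/prog"

# The BCJ filter comes first and LZMA2 last, with the default start offset.
xz -k --x86 --lzma2=preset=6 "$tmp/prog"

bcj_roundtrip=failed
xz -dc "$tmp/prog.xz" | cmp -s - "$tmp/prog" && bcj_roundtrip=ok
echo "round trip: $bcj_roundtrip"
rm -rf "$tmp"
```

No filter options are needed when decompressing, because the .xz container stores the filter chain.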
593
594 --delta[=options]
Add the Delta filter to the filter chain. The Delta filter can be
used only as a non-last filter in the filter chain.
597
598 Currently only simple byte-wise delta calculation is supported.
599 It can be useful when compressing e.g. uncompressed bitmap
600 images or uncompressed PCM audio. However, special purpose algo‐
601 rithms may give significantly better results than Delta + LZMA2.
602 This is true especially with audio, which compresses faster and
603 better e.g. with FLAC.
604
605 Supported options:
606
607 dist=distance
Specify the distance of the delta calculation in bytes.
distance must be 1-256. The default is 1.
610
611 For example, with dist=2 and eight-byte input A1 B1 A2 B3
612 A3 B5 A4 B7, the output will be A1 B1 01 02 01 02 01 02.
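The arithmetic behind that example can be reproduced with a small shell function. This is a sketch of byte-wise delta encoding as described above (each output byte is the input byte minus the byte dist positions earlier, modulo 256). It is not code taken from xz, and the function name delta_encode is made up for the illustration.

```shell
# delta_encode DIST BYTE...   (bytes given as two-digit hex values)
delta_encode() {
    dist=$1; shift
    history=""        # the last $dist input bytes, oldest first
    out=""
    count=0
    for byte in "$@"; do
        if [ "$count" -lt "$dist" ]; then
            prev=00                      # nothing to subtract yet
        else
            prev=${history%% *}          # byte seen $dist positions earlier
            history=${history#* }        # drop it from the window
        fi
        diff=$(( (0x$byte - 0x$prev) & 0xFF ))
        out="${out:+$out }$(printf '%02X' "$diff")"
        history="$history$byte "
        count=$((count + 1))
    done
    echo "$out"
}

encoded=$(delta_encode 2 A1 B1 A2 B3 A3 B5 A4 B7)
echo "$encoded"
```

The repeated 01 02 pattern in the result is what lets a following LZMA2 filter compress such data better than the original bytes.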
613
614 Other options
615 -q, --quiet
616 Suppress warnings and notices. Specify this twice to suppress
617 errors too. This option has no effect on the exit status. That
618 is, even if a warning was suppressed, the exit status to indi‐
619 cate a warning is still used.
620
621 -v, --verbose
622 Be verbose. If standard error is connected to a terminal, xz
623 will display a progress indicator. Specifying --verbose twice
624 will give even more verbose output (useful mostly for debug‐
625 ging).
626
627 -Q, --no-warn
628 Don't set the exit status to 2 even if a condition worth a warn‐
629 ing was detected. This option doesn't affect the verbosity
630 level, thus both --quiet and --no-warn have to be used to not
631 display warnings and to not alter the exit status.
632
633 -h, --help
634 Display a help message describing the most commonly used
635 options, and exit successfully.
636
637 -H, --long-help
Display a help message describing all features of xz, and exit
successfully.
640
641 -V, --version
642 Display the version number of xz and liblzma.
643
EXIT STATUS
0 All is good.
646
647 1 An error occurred.
648
649 2 Something worth a warning occurred, but no actual errors
650 occurred.
651
652 Notices (not warnings or errors) printed on standard error don't affect
653 the exit status.
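The three exit statuses, and the effect of --no-warn, can be observed as in the sketch below. The warning is provoked by asking xz to compress a file that already has the .xz suffix, which is one of the skip conditions listed in the description. File names are illustrative.

```shell
tmp=$(mktemp -d)
printf 'exit status demo\n' > "$tmp/file.txt"

ok_status=0
xz "$tmp/file.txt" || ok_status=$?                    # normal compression

warn_status=0
xz -q "$tmp/file.txt.xz" || warn_status=$?            # skipped with a (hidden) warning

nowarn_status=0
xz -q -Q "$tmp/file.txt.xz" || nowarn_status=$?       # same, but exit status not altered

echo "ok=$ok_status warn=$warn_status no-warn=$nowarn_status"
rm -rf "$tmp"
```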
654
ENVIRONMENT
XZ_OPT A space-separated list of options is parsed from XZ_OPT before
657 parsing the options given on the command line. Note that only
658 options are parsed from XZ_OPT; all non-options are silently
659 ignored. Parsing is done with getopt_long(3) which is used also
660 for the command line arguments.
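A sketch of passing options through XZ_OPT for a single invocation. The options chosen here (--keep and -3) and the file name are arbitrary.

```shell
tmp=$(mktemp -d)
printf 'environment demo\n' > "$tmp/file.txt"

# Options from XZ_OPT are parsed before the command line arguments.
XZ_OPT='--keep -3' xz "$tmp/file.txt"

kept=no
if [ -e "$tmp/file.txt" ] && [ -e "$tmp/file.txt.xz" ]; then
    kept=yes
fi
echo "source kept because of XZ_OPT: $kept"
rm -rf "$tmp"
```

Since XZ_OPT is parsed first, an option given on the command line comes later and wins wherever the last option takes effect.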
661
LZMA UTILS COMPATIBILITY
The command line syntax of xz is practically a superset of lzma,
unlzma, and lzcat as found in LZMA Utils 4.32.x. In most cases, it is
possible to replace LZMA Utils with XZ Utils without breaking existing
scripts. There are some incompatibilities though, which may sometimes
cause problems.
668
669 Compression preset levels
670 The numbering of the compression level presets is not identical in xz
671 and LZMA Utils. The most important difference is how dictionary sizes
672 are mapped to different presets. Dictionary size is roughly equal to
673 the decompressor memory usage.
674
Level      xz        LZMA Utils
 -1       64 KiB       64 KiB
 -2      512 KiB        1 MiB
 -3        1 MiB      512 KiB
 -4        2 MiB        1 MiB
 -5        4 MiB        2 MiB
 -6        8 MiB        4 MiB
 -7       16 MiB        8 MiB
 -8       32 MiB       16 MiB
 -9       64 MiB       32 MiB
685
686 The dictionary size differences affect the compressor memory usage too,
687 but there are some other differences between LZMA Utils and XZ Utils,
688 which make the difference even bigger:
689
Level      xz        LZMA Utils 4.32.x
 -1        2 MiB        2 MiB
 -2        5 MiB       12 MiB
 -3       13 MiB       12 MiB
 -4       25 MiB       16 MiB
 -5       48 MiB       26 MiB
 -6       94 MiB       45 MiB
 -7      186 MiB       83 MiB
 -8      370 MiB      159 MiB
 -9      674 MiB      311 MiB
700
The default preset level in LZMA Utils is -7 while in XZ Utils it is
-6, so both use an 8 MiB dictionary by default.
703
704 Streamed vs. non-streamed .lzma files
The uncompressed size of the file can be stored in the .lzma header.
LZMA Utils does that when compressing regular files. The alternative is
to mark the uncompressed size as unknown and use an end of payload
marker to indicate where the decompressor should stop. LZMA Utils uses
this method when the uncompressed size isn't known, which is the case,
for example, in pipes.
711
xz supports decompressing .lzma files with or without an end of payload
marker, but all .lzma files created by xz will use the end of payload
marker and have the uncompressed size marked as unknown in the .lzma
header. This may be a problem in some (uncommon) situations. For
example, a .lzma decompressor in an embedded device might work only
with files that have a known uncompressed size. If you hit this
problem, you need to use LZMA Utils or LZMA SDK to create .lzma files
with a known uncompressed size.
720
721 Unsupported .lzma files
722 The .lzma format allows lc values up to 8, and lp values up to 4. LZMA
723 Utils can decompress files with any lc and lp, but always creates files
724 with lc=3 and lp=0. Creating files with other lc and lp is possible
725 with xz and with LZMA SDK.
726
The implementation of the LZMA1 filter in liblzma requires that the sum
of lc and lp must not exceed 4. Thus, .lzma files that exceed this
limitation cannot be decompressed with xz.
730
731 LZMA Utils creates only .lzma files which have dictionary size of 2^n
732 (a power of 2), but accepts files with any dictionary size. liblzma
733 accepts only .lzma files which have dictionary size of 2^n or 2^n +
734 2^(n-1). This is to decrease false positives when autodetecting .lzma
735 files.
736
737 These limitations shouldn't be a problem in practice, since practically
738 all .lzma files have been compressed with settings that liblzma will
739 accept.
740
741 Trailing garbage
When decompressing, LZMA Utils silently ignores everything after the
first .lzma stream. In most situations, this is a bug. This also means
that LZMA Utils doesn't support decompressing concatenated .lzma files.
745
746 If there is data left after the first .lzma stream, xz considers the
747 file to be corrupt. This may break obscure scripts which have assumed
748 that trailing garbage is ignored.
749
NOTES
Compressed output may vary
752 The exact compressed output produced from the same uncompressed input
753 file may vary between XZ Utils versions even if compression options are
754 identical. This is because the encoder can be improved (faster or bet‐
755 ter compression) without affecting the file format. The output can vary
756 even between different builds of the same XZ Utils version, if differ‐
757 ent build options are used or if the endianness of the hardware is dif‐
758 ferent for different builds.
759
760 The above means that implementing --rsyncable to create rsyncable .xz
761 files is not going to happen without freezing a part of the encoder
762 implementation, which can then be used with --rsyncable.
763
764 Embedded .xz decompressors
765 Embedded .xz decompressor implementations like XZ Embedded don't neces‐
766 sarily support files created with check types other than none and
767 crc32. Since the default is --check=crc64, you must use --check=none
768 or --check=crc32 when creating files for embedded systems.
769
770 Outside embedded systems, all .xz format decompressors support all the
771 check types, or at least are able to decompress the file without veri‐
772 fying the integrity check if the particular check is not supported.
773
774 XZ Embedded supports BCJ filters, but only with the default start off‐
775 set.
776
SEE ALSO
xzdec(1), gzip(1), bzip2(1)
779
780 XZ Utils: <http://tukaani.org/xz/>
781 XZ Embedded: <http://tukaani.org/xz/embedded.html>
782 LZMA SDK: <http://7-zip.org/sdk.html>
783

Tukaani                            2009-08-27                            XZ(1)