ZSTD(1)                        User Commands                        ZSTD(1)



NAME
       zstd - zstd, zstdmt, unzstd, zstdcat - Compress or decompress .zst
       files

SYNOPSIS
       zstd [OPTIONS] [-|INPUT-FILE] [-o OUTPUT-FILE]

       zstdmt is equivalent to zstd -T0

       unzstd is equivalent to zstd -d

       zstdcat is equivalent to zstd -dcf

DESCRIPTION
       zstd is a fast lossless compression algorithm and data compression
       tool, with command line syntax similar to gzip(1) and xz(1). It is
       based on the LZ77 family, with further FSE & huff0 entropy stages.
       zstd offers highly configurable compression speed, with fast modes
       at > 200 MB/s per core, and strong modes nearing lzma compression
       ratios. It also features a very fast decoder, with speeds > 500
       MB/s per core.

       zstd command line syntax is generally similar to gzip, but features
       the following differences:

       · Source files are preserved by default. It's possible to remove
         them automatically by using the --rm command line option.

       · When compressing a single file, zstd displays progress
         notifications and a result summary by default. Use -q to turn
         them off.

       · zstd does not accept input from the console, but it properly
         accepts stdin when it is not the console.

       · zstd displays a short help page when the command line is invalid.
         Use -q to turn it off.
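       A minimal round trip illustrating the defaults above (the source
       file is kept, and the .zst suffix is appended) might look like the
       following sketch; file names are placeholders:

```shell
# Compress a sample file, check it, then decompress to a copy and verify.
printf 'hello zstd\n' > sample.txt
zstd -qf sample.txt                       # creates sample.txt.zst, keeps sample.txt
zstd -tq sample.txt.zst                   # integrity check only, no files written
zstd -dqf -o sample.copy sample.txt.zst   # decompress to an explicit name
cmp sample.txt sample.copy                # round trip is byte-identical
rm sample.txt sample.txt.zst sample.copy
```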

       zstd compresses or decompresses each file according to the selected
       operation mode. If no files are given or file is -, zstd reads from
       standard input and writes the processed data to standard output.
       zstd will refuse to write compressed data to standard output if it
       is a terminal: it will display an error message and skip the file.
       Similarly, zstd will refuse to read compressed data from standard
       input if it is a terminal.

       Unless --stdout or -o is specified, files are written to a new file
       whose name is derived from the source file name:

       · When compressing, the suffix .zst is appended to the source
         filename to get the target filename.

       · When decompressing, the .zst suffix is removed from the source
         filename to get the target filename.

   Concatenation with .zst files
       It is possible to concatenate .zst files as is. zstd will
       decompress such files as if they were a single .zst file.

   Integer suffixes and special values
       In most places where an integer argument is expected, an optional
       suffix is supported to easily indicate large integers. There must
       be no space between the integer and the suffix.

       KiB    Multiply the integer by 1,024 (2^10). Ki, K, and KB are
              accepted as synonyms for KiB.

       MiB    Multiply the integer by 1,048,576 (2^20). Mi, M, and MB are
              accepted as synonyms for MiB.
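       For instance, the suffixed and plain spellings below select the
       same job size, so the compressed outputs are identical (a sketch;
       it assumes a zstd build with multithread support, since -B only
       applies there):

```shell
# 1MiB and 1048576 are the same value, so both runs produce the same bytes.
seq -f "suffix demo line %g" 1 200000 > in.txt
zstd -qf -T2 -B1MiB    -o a.zst in.txt
zstd -qf -T2 -B1048576 -o b.zst in.txt
cmp a.zst b.zst      # identical compressed output
rm in.txt a.zst b.zst
```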

   Operation mode
       If multiple operation mode options are given, the last one takes
       effect.

       -z, --compress
              Compress. This is the default operation mode when no
              operation mode option is specified and no other operation
              mode is implied from the command name (for example, unzstd
              implies --decompress).

       -d, --decompress, --uncompress
              Decompress.

       -t, --test
              Test the integrity of compressed files. This option is
              equivalent to --decompress --stdout except that the
              decompressed data is discarded instead of being written to
              standard output. No files are created or removed.

       -b#    Benchmark file(s) using compression level #.

       --train FILEs
              Use FILEs as a training set to create a dictionary. The
              training set should contain a lot of small files (> 100).

       -l, --list
              Display information related to a zstd compressed file, such
              as size, ratio, and checksum. Some of these fields may not
              be available. This command can be augmented with the -v
              modifier.

   Operation modifiers
       -#     # compression level [1-19] (default: 3)

       --fast[=#]
              switch to ultra-fast compression levels. If =# is not
              present, it defaults to 1. The higher the value, the faster
              the compression speed, at the cost of some compression
              ratio. This setting overwrites the compression level if one
              was set previously. Similarly, if a compression level is set
              after --fast, it overrides it.

       --ultra
              unlocks high compression levels 20+ (maximum 22), using a
              lot more memory. Note that decompression will also require
              more memory when using these levels.

       --long[=#]
              enables long distance matching with # windowLog. If # is not
              present, it defaults to 27. This increases the window size
              (windowLog) and memory usage for both the compressor and
              decompressor. This setting is designed to improve the
              compression ratio for files with long matches at a large
              distance.

              Note: If windowLog is set to larger than 27,
              --long=windowLog or --memory=windowSize needs to be passed
              to the decompressor.
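       As a sketch of that note (file names are placeholders): when
       compressing with a window log above 27, pass --long (or --memory)
       to the decompressor as well so it accepts the larger window:

```shell
seq -f "long distance matching demo line %g" 1 100000 > big.txt
zstd -qf --long=30 big.txt                 # compress with windowLog = 30
# Repeat --long=30 (or use --memory=1GB) so the decoder accepts the window:
zstd -dqf --long=30 -o big.copy big.txt.zst
cmp big.txt big.copy
rm big.txt big.txt.zst big.copy
```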

       -T#, --threads=#
              Compress using # working threads (default: 1). If # is 0,
              attempt to detect and use the number of physical CPU cores.
              In all cases, the number of threads is capped to
              ZSTDMT_NBTHREADS_MAX==200. This modifier does nothing if
              zstd is compiled without multithread support.
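       For example, letting zstd pick the thread count itself is
       equivalent to invoking zstdmt (a sketch; on a build without
       multithread support -T0 is simply ignored):

```shell
# -T0 detects the number of physical cores and uses that many workers.
seq -f "threading demo line %g" 1 100000 > big.txt
zstd -qf -T0 big.txt          # same as: zstdmt -qf big.txt
zstd -tq big.txt.zst          # verify the result
rm big.txt big.txt.zst
```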

       --single-thread
              Does not spawn a thread for compression; uses a single
              thread for both I/O and compression. In this mode,
              compression is serialized with I/O, which is slightly
              slower. (This is different from -T1, which spawns 1
              compression thread in parallel with I/O). This mode is the
              only one available when multithread support is disabled.
              Single-thread mode features lower memory usage. The final
              compressed result is slightly different from -T1.

       --adapt[=min=#,max=#]
              zstd will dynamically adapt the compression level to
              perceived I/O conditions. Compression level adaptation can
              be observed live by using the -v command line option.
              Adaptation can be constrained between supplied min and max
              levels. The feature works when combined with multi-threading
              and --long mode. It does not work with --single-thread. It
              sets the window size to 8 MB by default (can be changed
              manually, see wlog). Due to the chaotic nature of dynamic
              adaptation, the compressed result is not reproducible. Note:
              at the time of this writing, --adapt can remain stuck at low
              speed when combined with multiple worker threads (>=2).

       --rsyncable
              zstd will periodically synchronize the compression state to
              make the compressed file more rsync-friendly. There is a
              negligible impact to compression ratio, and the faster
              compression levels will see a small compression speed hit.
              This feature does not work with --single-thread. You
              probably don't want to use it with long range mode, since it
              will decrease the effectiveness of the synchronization
              points, but your mileage may vary.

       -D file
              use file as a dictionary to compress or decompress FILE(s)

       --no-dictID
              do not store the dictionary ID within the frame header
              (dictionary compression). The decoder will have to rely on
              implicit knowledge about which dictionary to use, and it
              won't be able to check if it is the correct one.

       -o file
              save result into file (only possible with a single
              INPUT-FILE)

       -f, --force
              overwrite output without prompting, and (de)compress
              symbolic links

       -c, --stdout
              force write to standard output, even if it is the console

       --[no-]sparse
              enable / disable sparse FS support, to make files with many
              zeroes smaller on disk. Creating sparse files may save disk
              space and speed up decompression by reducing the amount of
              disk I/O. Default: enabled when output is into a file, and
              disabled when output is stdout. This setting overrides the
              default and can force sparse mode over stdout.

       --rm   remove source file(s) after successful compression or
              decompression

       -k, --keep
              keep source file(s) after successful compression or
              decompression. This is the default behavior.

       -r     operate recursively on directories

       --format=FORMAT
              compress and decompress in other formats. If compiled with
              support, zstd can compress to or decompress from other
              compression algorithm formats. Possibly available options
              are zstd, gzip, xz, lzma, and lz4. If no such format is
              provided, zstd is the default.
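       For instance, a build compiled with zlib support can write gzip
       output directly (a sketch; whether --format=gzip is available
       depends on how the binary was built):

```shell
printf 'format demo\n' > demo.txt
zstd -qf --format=gzip demo.txt    # writes demo.txt.gz instead of demo.txt.zst
gzip -t demo.txt.gz                # the result is a valid gzip file
rm demo.txt demo.txt.gz
```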

       -h/-H, --help
              display help/long help and exit

       -V, --version
              display version number and exit. Advanced: -vV also displays
              supported formats. -vvV also displays POSIX support.

       -v     verbose mode

       -q, --quiet
              suppress warnings, interactivity, and notifications. Specify
              twice to suppress errors too.

       --no-progress
              do not display the progress bar, but keep all other
              messages.

       -C, --[no-]check
              add integrity check computed from uncompressed data
              (default: enabled)

       --     All arguments after -- are treated as files.

   Additional options for the pzstd utility
       -p, --processes
              number of threads to use for (de)compression (default: 4)

DICTIONARY BUILDER
       zstd offers dictionary compression, which greatly improves
       efficiency on small files and messages. It's possible to train zstd
       with a set of samples, the result of which is saved into a file
       called a dictionary. Then, during compression and decompression,
       reference the same dictionary using the -D dictionaryFileName
       command line option. Compression of small files similar to the
       sample set will be greatly improved.
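       The workflow above can be sketched as follows (hypothetical sample
       data and file names; training needs many small, similar files to
       succeed):

```shell
# Generate a training set of small, structurally similar files.
mkdir -p samples
for i in $(seq 1 300); do
    seq -f "record-%g status=ok payload=xxxxxxxx" "$i" $((i + 40)) > "samples/s$i.txt"
done
zstd -q --train samples/* -o demo.dict          # train a dictionary
zstd -qf  -D demo.dict samples/s1.txt -o s1.zst # compress with it
zstd -dqf -D demo.dict s1.zst -o s1.copy        # decompress with the same dict
cmp samples/s1.txt s1.copy
rm -r samples demo.dict s1.zst s1.copy
```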

       --train FILEs
              Use FILEs as a training set to create a dictionary. The
              training set should contain a lot of small files (> 100),
              and weigh typically 100x the target dictionary size (for
              example, 10 MB for a 100 KB dictionary).

              Supports multithreading if zstd is compiled with threading
              support. Additional parameters can be specified with
              --train-fastcover. The legacy dictionary builder can be
              accessed with --train-legacy. The cover dictionary builder
              can be accessed with --train-cover. --train is equivalent to
              --train-fastcover=d=8,steps=4.

       -o file
              Dictionary saved into file (default name: dictionary).

       --maxdict=#
              Limit dictionary to specified size (default: 112640).

       -#     Use # compression level during training (optional). Will
              generate statistics more tuned for the selected compression
              level, resulting in a small compression ratio improvement
              for this level.

       -B#    Split input files into blocks of size # (default: no split)

       --dictID=#
              A dictionary ID is a locally unique ID that a decoder can
              use to verify it is using the right dictionary. By default,
              zstd will create a 4-byte random number ID. It's possible to
              give a precise number instead. Short numbers have an
              advantage: an ID < 256 will only need 1 byte in the
              compressed frame header, and an ID < 65536 will only need 2
              bytes. This compares favorably to the 4-byte default.
              However, it's up to the dictionary manager not to assign the
              same ID to two different dictionaries.

       --train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]]
              Select parameters for the default dictionary builder
              algorithm named cover. If d is not specified, then it tries
              d = 6 and d = 8. If k is not specified, then it tries steps
              values in the range [50, 2000]. If steps is not specified,
              then the default value of 40 is used. If split is not
              specified or split <= 0, then the default value of 100 is
              used. Requires that d <= k. If the shrink flag is not used,
              then the default value for shrinkDict of 0 is used. If
              shrink is not specified, then the default value for
              shrinkDictMaxRegression of 1 is used.

              Selects segments of size k with the highest score to put in
              the dictionary. The score of a segment is computed by the
              sum of the frequencies of all the subsegments of size d.
              Generally d should be in the range [6, 8], occasionally up
              to 16, but the algorithm will run faster with d <= 8. Good
              values for k vary widely based on the input data, but a safe
              range is [2 * d, 2000]. If split is 100, all input samples
              are used for both training and testing to find optimal d and
              k to build the dictionary. Supports multithreading if zstd
              is compiled with threading support. Having shrink enabled
              takes a truncated dictionary of minimum size and doubles in
              size until the compression ratio of the truncated dictionary
              is at most shrinkDictMaxRegression% worse than the
              compression ratio of the largest dictionary.

              Examples:

              zstd --train-cover FILEs

              zstd --train-cover=k=50,d=8 FILEs

              zstd --train-cover=d=8,steps=500 FILEs

              zstd --train-cover=k=50 FILEs

              zstd --train-cover=k=50,split=60 FILEs

              zstd --train-cover=shrink FILEs

              zstd --train-cover=shrink=2 FILEs

       --train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#]
              Same as cover but with extra parameters f and accel, and a
              different default value of split. If split is not specified,
              then it tries split = 75. If f is not specified, then it
              tries f = 20. Requires that 0 < f < 32. If accel is not
              specified, then it tries accel = 1. Requires that 0 < accel
              <= 10. Requires that d = 6 or d = 8.

              f is the log of the size of the array that keeps track of
              the frequency of subsegments of size d. The subsegment is
              hashed to an index in the range [0, 2^f - 1]. It is possible
              that 2 different subsegments are hashed to the same index,
              and they are considered as the same subsegment when
              computing frequency. Using a higher f reduces collisions but
              takes longer.

              Examples:

              zstd --train-fastcover FILEs

              zstd --train-fastcover=d=8,f=15,accel=2 FILEs

       --train-legacy[=selectivity=#]
              Use the legacy dictionary builder algorithm with the given
              dictionary selectivity (default: 9). The smaller the
              selectivity value, the denser the dictionary, improving its
              efficiency but reducing its possible maximum size.
              --train-legacy=s=# is also accepted.

              Examples:

              zstd --train-legacy FILEs

              zstd --train-legacy=selectivity=8 FILEs

BENCHMARK
       -b#    benchmark file(s) using compression level #

       -e#    benchmark file(s) using multiple compression levels, from
              -b# to -e# (inclusive)

       -i#    minimum evaluation time, in seconds (default: 3s), benchmark
              mode only

       -B#, --block-size=#
              cut file(s) into independent blocks of size # (default: no
              block)

       --priority=rt
              set process priority to real-time

       Output Format: CompressionLevel#Filename : InputSize -> OutputSize
       (CompressionRatio), CompressionSpeed, DecompressionSpeed

       Methodology: For both compression and decompression speed, the
       entire input is compressed/decompressed in-memory to measure speed.
       A run lasts at least 1 sec, so when files are small, they are
       compressed/decompressed several times per run, in order to improve
       measurement accuracy.
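       For example, sweeping levels 1 through 3 with a one-second
       evaluation window per level (the file name is a placeholder):

```shell
seq -f "benchmark sample line %g" 1 50000 > bench.txt
zstd -b1 -e3 -i1 bench.txt    # prints a result line for each level
rm bench.txt
```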

ADVANCED COMPRESSION OPTIONS
   --zstd[=options]:
       zstd provides 22 predefined compression levels. The selected or
       default predefined compression level can be changed with advanced
       compression options. The options are provided as a comma-separated
       list. You may specify only the options you want to change and the
       rest will be taken from the selected or default compression level.
       The list of available options:

       strategy=strat, strat=strat
              Specify a strategy used by a match finder.

              There are 9 strategies numbered from 1 to 9, from faster to
              stronger: 1=ZSTD_fast, 2=ZSTD_dfast, 3=ZSTD_greedy,
              4=ZSTD_lazy, 5=ZSTD_lazy2, 6=ZSTD_btlazy2, 7=ZSTD_btopt,
              8=ZSTD_btultra, 9=ZSTD_btultra2.

       windowLog=wlog, wlog=wlog
              Specify the maximum number of bits for a match distance.

              A higher number of bits increases the chance to find a
              match, which usually improves compression ratio. It also
              increases memory requirements for the compressor and
              decompressor. The minimum wlog is 10 (1 KiB) and the maximum
              is 30 (1 GiB) on 32-bit platforms and 31 (2 GiB) on 64-bit
              platforms.

              Note: If windowLog is set to larger than 27,
              --long=windowLog or --memory=windowSize needs to be passed
              to the decompressor.

       hashLog=hlog, hlog=hlog
              Specify the maximum number of bits for a hash table.

              Bigger hash tables cause fewer collisions, which usually
              makes compression faster, but requires more memory during
              compression.

              The minimum hlog is 6 (64 B) and the maximum is 26 (128
              MiB).

       chainLog=clog, clog=clog
              Specify the maximum number of bits for a hash chain or a
              binary tree.

              A higher number of bits increases the chance to find a
              match, which usually improves compression ratio. It also
              slows down compression speed and increases memory
              requirements for compression. This option is ignored for the
              ZSTD_fast strategy.

              The minimum clog is 6 (64 B) and the maximum is 28 (256
              MiB).

       searchLog=slog, slog=slog
              Specify the maximum number of searches in a hash chain or a
              binary tree using a logarithmic scale.

              More searches increase the chance to find a match, which
              usually increases compression ratio but decreases
              compression speed.

              The minimum slog is 1 and the maximum is 26.

       minMatch=mml, mml=mml
              Specify the minimum searched length of a match in a hash
              table.

              Larger search lengths usually decrease compression ratio but
              improve decompression speed.

              The minimum mml is 3 and the maximum is 7.

       targetLen=tlen, tlen=tlen
              The impact of this field varies depending on the selected
              strategy.

              For ZSTD_btopt, ZSTD_btultra and ZSTD_btultra2, it specifies
              the minimum match length that causes the match finder to
              stop searching. A larger targetLen usually improves
              compression ratio but decreases compression speed.

              For ZSTD_fast, it triggers ultra-fast mode when > 0. The
              value represents the amount of data skipped between match
              sampling. The impact is reversed: a larger targetLen
              increases compression speed but decreases compression ratio.

              For all other strategies, this field has no impact.

              The minimum tlen is 0 and the maximum is 999.

       overlapLog=ovlog, ovlog=ovlog
              Determine overlapSize, the amount of data reloaded from the
              previous job. This parameter is only available when
              multithreading is enabled. Reloading more data improves
              compression ratio, but decreases speed.

              The minimum ovlog is 0, and the maximum is 9. 1 means "no
              overlap", hence completely independent jobs. 9 means "full
              overlap", meaning up to windowSize is reloaded from the
              previous job. Reducing ovlog by 1 reduces the reloaded
              amount by a factor 2. For example, 8 means "windowSize/2",
              and 6 means "windowSize/8". Value 0 is special and means
              "default": ovlog is automatically determined by zstd. In
              that case, ovlog will range from 6 to 9, depending on the
              selected strat.

       ldmHashLog=lhlog, lhlog=lhlog
              Specify the maximum size for a hash table used for long
              distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Bigger hash tables usually improve compression ratio at the
              expense of more memory during compression and a decrease in
              compression speed.

              The minimum lhlog is 6 and the maximum is 26 (default: 20).

       ldmMinMatch=lmml, lmml=lmml
              Specify the minimum searched length of a match for long
              distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Larger/very small values usually decrease compression ratio.

              The minimum lmml is 4 and the maximum is 4096 (default: 64).

       ldmBucketSizeLog=lblog, lblog=lblog
              Specify the size of each bucket for the hash table used for
              long distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Larger bucket sizes improve collision resolution but
              decrease compression speed.

              The minimum lblog is 0 and the maximum is 8 (default: 3).

       ldmHashRateLog=lhrlog, lhrlog=lhrlog
              Specify the frequency of inserting entries into the long
              distance matching hash table.

              This option is ignored unless long distance matching is
              enabled.

              Larger values will improve compression speed. Deviating far
              from the default value will likely result in a decrease in
              compression ratio.

              The default value is wlog - lhlog.

   Example
       The following parameters set advanced compression options to
       something similar to predefined level 19 for files bigger than 256
       KB:

       --zstd=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6

   -B#:
       Select the size of each compression job. This parameter is
       available only when multi-threading is enabled. The default value
       is 4 * windowSize, which means it varies depending on compression
       level. -B# makes it possible to select a custom value. Note that
       job size must respect a minimum value which is enforced
       transparently. This minimum is either 1 MB, or overlapSize,
       whichever is larger.

BUGS
       Report bugs at: https://github.com/facebook/zstd/issues

AUTHOR
       Yann Collet



zstd 1.4.2                       July 2019                          ZSTD(1)