ZSTD(1)                          User Commands                         ZSTD(1)

NAME
       zstd - zstd, zstdmt, unzstd, zstdcat - Compress or decompress .zst
       files

SYNOPSIS
       zstd [OPTIONS] [-|INPUT-FILE] [-o OUTPUT-FILE]

       zstdmt is equivalent to zstd -T0

       unzstd is equivalent to zstd -d

       zstdcat is equivalent to zstd -dcf

DESCRIPTION
       zstd is a fast lossless compression algorithm and data compression
       tool, with command line syntax similar to gzip(1) and xz(1). It is
       based on the LZ77 family, with further FSE & huff0 entropy stages.
       zstd offers highly configurable compression speed, with fast modes at
       > 200 MB/s per core, and strong modes nearing lzma compression
       ratios. It also features a very fast decoder, with speeds > 500 MB/s
       per core.

       zstd command line syntax is generally similar to gzip, but features
       the following differences:

       ·   Source files are preserved by default. It's possible to remove
           them automatically by using the --rm option.

       ·   When compressing a single file, zstd displays progress
           notifications and a result summary by default. Use -q to turn
           them off.

       ·   zstd does not accept input from the console, but it properly
           accepts stdin when it is not the console.

       ·   zstd displays a short help page when the command line is invalid.
           Use -q to turn it off.

       zstd compresses or decompresses each file according to the selected
       operation mode. If no files are given or file is -, zstd reads from
       standard input and writes the processed data to standard output. zstd
       will refuse to write compressed data to standard output if it is a
       terminal: it will display an error message and skip the file.
       Similarly, zstd will refuse to read compressed data from standard
       input if it is a terminal.

       Unless --stdout or -o is specified, files are written to a new file
       whose name is derived from the source file name:

       ·   When compressing, the suffix .zst is appended to the source
           filename to get the target filename.

       ·   When decompressing, the .zst suffix is removed from the source
           filename to get the target filename.

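A quick sketch of this naming behavior (the file and directory names below are illustrative):

```shell
# Round trip using the default .zst suffix rules.
mkdir -p /tmp/zstd-name-demo && cd /tmp/zstd-name-demo
printf 'hello zstd\n' > example.txt

zstd -q -f example.txt                      # writes example.txt.zst, keeps example.txt
zstd -d -q -f example.txt.zst -o check.txt  # -o overrides the derived target name
cmp example.txt check.txt                   # the round trip is lossless
```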

   Concatenation with .zst files
       It is possible to concatenate .zst files as is. zstd will decompress
       such files as if they were a single .zst file.
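A minimal sketch of this behavior (file names are illustrative):

```shell
# Two independently compressed frames, concatenated byte-for-byte.
mkdir -p /tmp/zstd-cat-demo && cd /tmp/zstd-cat-demo
printf 'part one\n' > a.txt
printf 'part two\n' > b.txt

zstd -q -f a.txt b.txt               # produces a.txt.zst and b.txt.zst
cat a.txt.zst b.txt.zst > both.zst   # plain concatenation, no re-compression
zstd -d -q -c both.zst               # decompresses as a single stream
```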

   Integer suffixes and special values
       In most places where an integer argument is expected, an optional
       suffix is supported to easily indicate large integers. There must be
       no space between the integer and the suffix.

       KiB    Multiply the integer by 1,024 (2^10). Ki, K, and KB are
              accepted as synonyms for KiB.

       MiB    Multiply the integer by 1,048,576 (2^20). Mi, M, and MB are
              accepted as synonyms for MiB.
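For instance, using the documented --memory option with a size suffix (a sketch; file names are illustrative):

```shell
# Size suffixes attach directly to the integer, with no space in between.
mkdir -p /tmp/zstd-suffix-demo && cd /tmp/zstd-suffix-demo
printf 'suffix demo\n' > s.txt
zstd -q -f s.txt

# 128MiB means 128 * 1048576 bytes; 128M, 128Mi and 128MB are synonyms.
zstd -d -q -f --memory=128MiB s.txt.zst -o s.out
cmp s.txt s.out
```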

   Operation mode
       If multiple operation mode options are given, the last one takes
       effect.

       -z, --compress
              Compress. This is the default operation mode when no operation
              mode option is specified and no other operation mode is
              implied from the command name (for example, unzstd implies
              --decompress).

       -d, --decompress, --uncompress
              Decompress.

       -t, --test
              Test the integrity of compressed files. This option is
              equivalent to --decompress --stdout except that the
              decompressed data is discarded instead of being written to
              standard output. No files are created or removed.

       -b#    Benchmark file(s) using compression level #.

       --train FILEs
              Use FILEs as a training set to create a dictionary. The
              training set should contain a lot of small files (> 100).

       -l, --list
              Display information related to a zstd compressed file, such as
              size, ratio, and checksum. Some of these fields may not be
              available. This command can be augmented with the -v modifier.

   Operation modifiers
       -#     # compression level [1-19] (default: 3)

       --fast[=#]
              Switch to ultra-fast compression levels. If =# is not present,
              it defaults to 1. The higher the value, the faster the
              compression speed, at the cost of some compression ratio. This
              setting overrides the compression level if one was set
              previously. Similarly, if a compression level is set after
              --fast, it overrides it.

       --ultra
              Unlock high compression levels 20+ (maximum 22), using a lot
              more memory. Note that decompression will also require more
              memory when using these levels.

       --long[=#]
              Enable long distance matching with # windowLog; if # is not
              present, it defaults to 27. This increases the window size
              (windowLog) and memory usage for both the compressor and
              decompressor. This setting is designed to improve the
              compression ratio for files with long matches at a large
              distance.

              Note: If windowLog is set to larger than 27, --long=windowLog
              or --memory=windowSize needs to be passed to the decompressor.
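A sketch of such a round trip with an enlarged window (the file name is illustrative; a real input would need to be large for --long to pay off):

```shell
# windowLog 30 (a 1 GiB window) must be announced on the decompression side too.
mkdir -p /tmp/zstd-long-demo && cd /tmp/zstd-long-demo
printf 'long distance matching demo\n' > big.txt

zstd -q -f --long=30 big.txt
zstd -d -q -f --long=30 big.txt.zst -o big.out   # or: --memory=1024MiB
cmp big.txt big.out
```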

       -T#, --threads=#
              Compress using # working threads (default: 1). If # is 0,
              attempt to detect and use the number of physical CPU cores. In
              all cases, the number of threads is capped to
              ZSTDMT_NBTHREADS_MAX==200. This modifier does nothing if zstd
              is compiled without multithread support.

       --single-thread
              Do not spawn a thread for compression; use a single thread for
              both I/O and compression. In this mode, compression is
              serialized with I/O, which is slightly slower. (This is
              different from -T1, which spawns 1 compression thread in
              parallel with I/O.) This mode is the only one available when
              multithread support is disabled. Single-thread mode features
              lower memory usage. The final compressed result is slightly
              different from -T1.

       --adapt[=min=#,max=#]
              zstd will dynamically adapt the compression level to perceived
              I/O conditions. Compression level adaptation can be observed
              live by using the -v option. Adaptation can be constrained
              between supplied min and max levels. The feature works when
              combined with multi-threading and --long mode. It does not
              work with --single-thread. It sets the window size to 8 MB by
              default (this can be changed manually, see wlog). Due to the
              chaotic nature of dynamic adaptation, the compressed result is
              not reproducible. Note: at the time of this writing, --adapt
              can remain stuck at low speed when combined with multiple
              worker threads (>=2).

       --rsyncable
              zstd will periodically synchronize the compression state to
              make the compressed file more rsync-friendly. There is a
              negligible impact to compression ratio, and the faster
              compression levels will see a small compression speed hit.
              This feature does not work with --single-thread. You probably
              don't want to use it with long range mode, since it will
              decrease the effectiveness of the synchronization points, but
              your mileage may vary.

       -D file
              Use file as a dictionary to compress or decompress FILE(s).

       --no-dictID
              Do not store the dictionary ID within the frame header
              (dictionary compression). The decoder will have to rely on
              implicit knowledge about which dictionary to use; it won't be
              able to check if it's the correct one.

       -o file
              Save the result into file (only possible with a single
              INPUT-FILE).

       -f, --force
              Overwrite output without prompting, and (de)compress symbolic
              links.

       -c, --stdout
              Force write to standard output, even if it is the console.

       --[no-]sparse
              Enable / disable sparse FS support, to make files with many
              zeroes smaller on disk. Creating sparse files may save disk
              space and speed up decompression by reducing the amount of
              disk I/O. Default: enabled when output is into a file, and
              disabled when output is stdout. This setting overrides the
              default and can force sparse mode over stdout.

       --rm   Remove source file(s) after successful compression or
              decompression.

       -k, --keep
              Keep source file(s) after successful compression or
              decompression. This is the default behavior.

       -r     Operate recursively on directories.

       --format=FORMAT
              Compress and decompress in other formats. If compiled with
              support, zstd can compress to or decompress from other
              compression algorithm formats. Possibly available options are
              zstd, gzip, xz, lzma, and lz4. If no such format is provided,
              zstd is the default.
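For example, assuming a build compiled with zlib support (the file name is illustrative):

```shell
# Produce a standard gzip file instead of a .zst file.
mkdir -p /tmp/zstd-fmt-demo && cd /tmp/zstd-fmt-demo
printf 'format demo\n' > f.txt

zstd -q -f --format=gzip f.txt   # writes f.txt.gz
gzip -t f.txt.gz                 # the result is readable by gzip itself
```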

       -h/-H, --help
              Display help / long help and exit.

       -V, --version
              Display the version number and exit. Advanced: -vV also
              displays supported formats. -vvV also displays POSIX support.

       -v     Verbose mode.

       -q, --quiet
              Suppress warnings, interactivity, and notifications. Specify
              twice to suppress errors too.

       -C, --[no-]check
              Add integrity check computed from uncompressed data (default:
              enabled).

       --     All arguments after -- are treated as files.

   Additional options for the pzstd utility
       -p, --processes
              Number of threads to use for (de)compression (default: 4).

DICTIONARY BUILDER
       zstd offers dictionary compression, which greatly improves efficiency
       on small files and messages. It's possible to train zstd with a set
       of samples, the result of which is saved into a file called a
       dictionary. Then, during compression and decompression, reference the
       same dictionary, using -D dictionaryFileName. Compression of small
       files similar to the sample set will be greatly improved.
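The whole workflow can be sketched as follows (all paths and names are illustrative, and a real training set should be larger and more varied):

```shell
# 1) Generate a toy training set of >100 small, similar files.
mkdir -p /tmp/zstd-dict-demo/samples && cd /tmp/zstd-dict-demo
for i in $(seq 1 150); do
    for j in $(seq 1 20); do
        printf 'field%s=value%s&session=common-header\n' "$j" "$i"
    done > "samples/s$i"
done

# 2) Train a dictionary from the samples.
zstd -q --train samples/* -o demo.dict

# 3) Compress and decompress small files with the same dictionary.
zstd -q -f -D demo.dict samples/s1
zstd -d -q -f -D demo.dict samples/s1.zst -o s1.out
cmp samples/s1 s1.out
```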

       --train FILEs
              Use FILEs as a training set to create a dictionary. The
              training set should contain a lot of small files (> 100), and
              typically weigh about 100x the target dictionary size (for
              example, 10 MB for a 100 KB dictionary).

              Supports multithreading if zstd is compiled with threading
              support. Additional parameters can be specified with
              --train-fastcover. The legacy dictionary builder can be
              accessed with --train-legacy. The cover dictionary builder can
              be accessed with --train-cover. --train is equivalent to
              --train-fastcover=d=8,steps=4.

       -o file
              Dictionary saved into file (default name: dictionary).

       --maxdict=#
              Limit dictionary to specified size (default: 112640).

       -#     Use # compression level during training (optional). Will
              generate statistics more tuned for the selected compression
              level, resulting in a small compression ratio improvement for
              this level.

       -B#    Split input files into blocks of size # (default: no split).

       --dictID=#
              A dictionary ID is a locally unique ID that a decoder can use
              to verify it is using the right dictionary. By default, zstd
              will create a 4-byte random number ID. It's possible to give a
              precise number instead. Short numbers have an advantage: an ID
              < 256 will only need 1 byte in the compressed frame header,
              and an ID < 65536 will only need 2 bytes. This compares
              favorably to the 4-byte default. However, it's up to the
              dictionary manager not to assign the same ID twice to 2
              different dictionaries.

       --train-cover[=k=#,d=#,steps=#,split=#]
              Select parameters for the default dictionary builder algorithm
              named cover. If d is not specified, then it tries d = 6 and
              d = 8. If k is not specified, then it tries steps values in
              the range [50, 2000]. If steps is not specified, then the
              default value of 40 is used. If split is not specified or
              split <= 0, then the default value of 100 is used. Requires
              that d <= k.

              Selects segments of size k with the highest score to put in
              the dictionary. The score of a segment is computed by the sum
              of the frequencies of all the subsegments of size d. Generally
              d should be in the range [6, 8], occasionally up to 16, but
              the algorithm will run faster with d <= 8. Good values for k
              vary widely based on the input data, but a safe range is
              [2 * d, 2000]. If split is 100, all input samples are used for
              both training and testing to find the optimal d and k to build
              the dictionary. Supports multithreading if zstd is compiled
              with threading support.

              Examples:

              zstd --train-cover FILEs

              zstd --train-cover=k=50,d=8 FILEs

              zstd --train-cover=d=8,steps=500 FILEs

              zstd --train-cover=k=50 FILEs

              zstd --train-cover=k=50,split=60 FILEs

       --train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#]
              Same as cover but with extra parameters f and accel, and a
              different default value of split. If split is not specified,
              then it tries split = 75. If f is not specified, then it tries
              f = 20. Requires that 0 < f < 32. If accel is not specified,
              then it tries accel = 1. Requires that 0 < accel <= 10.
              Requires that d = 6 or d = 8.

              f is the log of the size of the array that keeps track of the
              frequency of subsegments of size d. Each subsegment is hashed
              to an index in the range [0, 2^f - 1]. It is possible that 2
              different subsegments are hashed to the same index, in which
              case they are considered the same subsegment when computing
              frequency. Using a higher f reduces collisions but takes
              longer.

              Examples:

              zstd --train-fastcover FILEs

              zstd --train-fastcover=d=8,f=15,accel=2 FILEs

       --train-legacy[=selectivity=#]
              Use the legacy dictionary builder algorithm with the given
              dictionary selectivity (default: 9). The smaller the
              selectivity value, the denser the dictionary, improving its
              efficiency but reducing its possible maximum size.
              --train-legacy=s=# is also accepted.

              Examples:

              zstd --train-legacy FILEs

              zstd --train-legacy=selectivity=8 FILEs

BENCHMARK
       -b#    Benchmark file(s) using compression level #.

       -e#    Benchmark file(s) using multiple compression levels, from -b#
              to -e# (inclusive).

       -i#    Minimum evaluation time, in seconds (default: 3s), benchmark
              mode only.

       -B#, --block-size=#
              Cut file(s) into independent blocks of size # (default: no
              block).

       --priority=rt
              Set process priority to real-time.

       Output Format: CompressionLevel#Filename: InputSize -> OutputSize
       (CompressionRatio), CompressionSpeed, DecompressionSpeed

       Methodology: For both compression and decompression speed, the
       entire input is compressed/decompressed in-memory to measure speed.
       A run lasts at least 1 sec, so when files are small, they are
       compressed/decompressed several times per run, in order to improve
       measurement accuracy.
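A sketch of a typical invocation (the file name is illustrative; -i0 merely shortens each run):

```shell
# Benchmark levels 1 through 3 on a single file, one result line per level.
mkdir -p /tmp/zstd-bench-demo && cd /tmp/zstd-bench-demo
seq 1 20000 > bench.txt

zstd -b1 -e3 -i0 bench.txt
```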

ADVANCED COMPRESSION OPTIONS
   --zstd[=options]:
       zstd provides 22 predefined compression levels. The selected or
       default predefined compression level can be changed with advanced
       compression options. The options are provided as a comma-separated
       list. You may specify only the options you want to change and the
       rest will be taken from the selected or default compression level.
       The list of available options:

       strategy=strat, strat=strat
              Specify the strategy used by the match finder.

              There are 9 strategies numbered from 1 to 9, from faster to
              stronger: 1=ZSTD_fast, 2=ZSTD_dfast, 3=ZSTD_greedy,
              4=ZSTD_lazy, 5=ZSTD_lazy2, 6=ZSTD_btlazy2, 7=ZSTD_btopt,
              8=ZSTD_btultra, 9=ZSTD_btultra2.

       windowLog=wlog, wlog=wlog
              Specify the maximum number of bits for a match distance.

              A higher number of bits increases the chance to find a match,
              which usually improves compression ratio. It also increases
              memory requirements for the compressor and decompressor. The
              minimum wlog is 10 (1 KiB) and the maximum is 30 (1 GiB) on
              32-bit platforms and 31 (2 GiB) on 64-bit platforms.

              Note: If windowLog is set to larger than 27, --long=windowLog
              or --memory=windowSize needs to be passed to the decompressor.

       hashLog=hlog, hlog=hlog
              Specify the maximum number of bits for a hash table.

              Bigger hash tables cause fewer collisions, which usually makes
              compression faster, but requires more memory during
              compression.

              The minimum hlog is 6 (64 B) and the maximum is 26 (128 MiB).

       chainLog=clog, clog=clog
              Specify the maximum number of bits for a hash chain or a
              binary tree.

              A higher number of bits increases the chance to find a match,
              which usually improves compression ratio. It also slows down
              compression speed and increases memory requirements for
              compression. This option is ignored for the ZSTD_fast
              strategy.

              The minimum clog is 6 (64 B) and the maximum is 28 (256 MiB).

       searchLog=slog, slog=slog
              Specify the maximum number of searches in a hash chain or a
              binary tree using logarithmic scale.

              More searches increase the chance to find a match, which
              usually increases compression ratio but decreases compression
              speed.

              The minimum slog is 1 and the maximum is 26.

       minMatch=mml, mml=mml
              Specify the minimum searched length of a match in a hash
              table.

              Larger search lengths usually decrease compression ratio but
              improve decompression speed.

              The minimum mml is 3 and the maximum is 7.

       targetLen=tlen, tlen=tlen
              The impact of this field varies depending on the selected
              strategy.

              For ZSTD_btopt, ZSTD_btultra and ZSTD_btultra2, it specifies
              the minimum match length that causes the match finder to stop
              searching. A larger targetLen usually improves compression
              ratio but decreases compression speed.

              For ZSTD_fast, it triggers ultra-fast mode when > 0. The value
              represents the amount of data skipped between match samplings.
              Impact is reversed: a larger targetLen increases compression
              speed but decreases compression ratio.

              For all other strategies, this field has no impact.

              The minimum tlen is 0 and the maximum is 999.

       overlapLog=ovlog, ovlog=ovlog
              Determine overlapSize, the amount of data reloaded from the
              previous job. This parameter is only available when
              multithreading is enabled. Reloading more data improves
              compression ratio, but decreases speed.

              The minimum ovlog is 0, and the maximum is 9. 1 means "no
              overlap", hence completely independent jobs. 9 means "full
              overlap", meaning up to windowSize is reloaded from the
              previous job. Reducing ovlog by 1 reduces the reloaded amount
              by a factor of 2. For example, 8 means "windowSize/2", and 6
              means "windowSize/8". Value 0 is special and means "default":
              ovlog is automatically determined by zstd. In that case, ovlog
              will range from 6 to 9, depending on the selected strat.

       ldmHashLog=lhlog, lhlog=lhlog
              Specify the maximum size for a hash table used for long
              distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Bigger hash tables usually improve compression ratio at the
              expense of more memory during compression and a decrease in
              compression speed.

              The minimum lhlog is 6 and the maximum is 26 (default: 20).

       ldmMinMatch=lmml, lmml=lmml
              Specify the minimum searched length of a match for long
              distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Larger or very small values usually decrease compression
              ratio.

              The minimum lmml is 4 and the maximum is 4096 (default: 64).

       ldmBucketSizeLog=lblog, lblog=lblog
              Specify the size of each bucket for the hash table used for
              long distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Larger bucket sizes improve collision resolution but decrease
              compression speed.

              The minimum lblog is 0 and the maximum is 8 (default: 3).

       ldmHashRateLog=lhrlog, lhrlog=lhrlog
              Specify the frequency of inserting entries into the long
              distance matching hash table.

              This option is ignored unless long distance matching is
              enabled.

              Larger values will improve compression speed. Deviating far
              from the default value will likely result in a decrease in
              compression ratio.

              The default value is wlog - lhlog.

   Example
       The following parameters set advanced compression options to
       something similar to predefined level 19 for files bigger than 256
       KB:

       --zstd=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6

   -B#:
       Select the size of each compression job. This parameter is available
       only when multi-threading is enabled. The default value is
       4 * windowSize, which means it varies depending on compression
       level. -B# makes it possible to select a custom value. Note that job
       size must respect a minimum value which is enforced transparently.
       This minimum is either 1 MB, or overlapSize, whichever is larger.
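A sketch combining a custom job size with multithreading (the file name is illustrative; on builds without multithread support the -T and -B modifiers have no effect):

```shell
# Compress with 4 worker threads, cutting the input into 8 MiB jobs.
mkdir -p /tmp/zstd-job-demo && cd /tmp/zstd-job-demo
seq 1 200000 > input.txt

zstd -q -f -T4 -B8MiB input.txt
zstd -t -q input.txt.zst        # verify integrity of the result
```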

BUGS
       Report bugs at: https://github.com/facebook/zstd/issues

AUTHOR
       Yann Collet

zstd 1.3.8                       December 2018                         ZSTD(1)