ZSTD(1)                          User Commands                         ZSTD(1)

NAME
       zstd, zstdmt, unzstd, zstdcat - Compress or decompress .zst files

SYNOPSIS
       zstd [OPTIONS] [-|INPUT-FILE] [-o OUTPUT-FILE]

       zstdmt is equivalent to zstd -T0

       unzstd is equivalent to zstd -d

       zstdcat is equivalent to zstd -dcf
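
       As an illustration of these equivalences (the sample file is created
       inline; the name example.txt is arbitrary):

```shell
# Work in a scratch directory with a small sample file.
cd "$(mktemp -d)"
echo 'hello' > example.txt

zstd -T0 -q -f example.txt        # like: zstdmt example.txt
zstd -d -q -f example.txt.zst     # like: unzstd example.txt.zst
zstd -dcf example.txt.zst         # like: zstdcat example.txt.zst
```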

DESCRIPTION
       zstd is a fast lossless compression algorithm and data compression
       tool, with command line syntax similar to gzip(1) and xz(1). It is
       based on the LZ77 family, with further FSE & huff0 entropy stages.
       zstd offers highly configurable compression speed, with fast modes
       at > 200 MB/s per core, and strong modes nearing lzma compression
       ratios. It also features a very fast decoder, with speeds > 500 MB/s
       per core.

       zstd command line syntax is generally similar to gzip, but features
       the following differences:

       ·   Source files are preserved by default. It's possible to remove
           them automatically by using the --rm command.

       ·   When compressing a single file, zstd displays progress
           notifications and a result summary by default. Use -q to turn
           them off.

       ·   zstd does not accept input from the console, but it properly
           accepts stdin when it's not the console.

       ·   zstd displays a short help page when the command line is
           invalid. Use -q to turn it off.

       zstd compresses or decompresses each file according to the selected
       operation mode. If no files are given or file is -, zstd reads from
       standard input and writes the processed data to standard output.
       zstd will refuse to write compressed data to standard output if it
       is a terminal: it will display an error message and skip the file.
       Similarly, zstd will refuse to read compressed data from standard
       input if it is a terminal.
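
       A typical streaming use, sketched here with illustrative names
       (sample data is generated inline):

```shell
cd "$(mktemp -d)"
mkdir dir && echo 'data' > dir/a.txt

# Compress a stream from stdin to a file; '-' carries the data.
tar cf - dir | zstd -q -f -o dir.tar.zst
# Decompress back to standard output and list the archive.
zstd -dc dir.tar.zst | tar tf -
```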

       Unless --stdout or -o is specified, files are written to a new file
       whose name is derived from the source file name:

       ·   When compressing, the suffix .zst is appended to the source
           filename to get the target filename.

       ·   When decompressing, the .zst suffix is removed from the source
           filename to get the target filename.

   Concatenation with .zst files
       It is possible to concatenate .zst files as is. zstd will decompress
       such files as if they were a single .zst file.
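
       For example (file names are illustrative):

```shell
cd "$(mktemp -d)"
echo 'part1' | zstd -q -f -o a.zst
echo 'part2' | zstd -q -f -o b.zst
# Two frames concatenated decompress as one stream.
cat a.zst b.zst > ab.zst
zstd -dc ab.zst          # prints "part1" then "part2"
```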

OPTIONS
   Integer suffixes and special values
       In most places where an integer argument is expected, an optional
       suffix is supported to easily indicate large integers. There must be
       no space between the integer and the suffix.

       KiB    Multiply the integer by 1,024 (2^10). Ki, K, and KB are
              accepted as synonyms for KiB.

       MiB    Multiply the integer by 1,048,576 (2^20). Mi, M, and MB are
              accepted as synonyms for MiB.

   Operation mode
       If multiple operation mode options are given, the last one takes
       effect.

       -z, --compress
              Compress. This is the default operation mode when no
              operation mode option is specified and no other operation
              mode is implied from the command name (for example, unzstd
              implies --decompress).

       -d, --decompress, --uncompress
              Decompress.

       -t, --test
              Test the integrity of compressed files. This option is
              equivalent to --decompress --stdout except that the
              decompressed data is discarded instead of being written to
              standard output. No files are created or removed.

       -b#    Benchmark file(s) using compression level #.

       --train FILEs
              Use FILEs as a training set to create a dictionary. The
              training set should contain a lot of small files (> 100).

       -l, --list
              Display information related to a zstd compressed file, such
              as size, ratio, and checksum. Some of these fields may not be
              available. This command can be augmented with the -v
              modifier.

   Operation modifiers
       -#     # compression level [1-19] (default: 3)

       --fast[=#]
              Switch to ultra-fast compression levels. If =# is not
              present, it defaults to 1. The higher the value, the faster
              the compression speed, at the cost of some compression ratio.
              This setting overrides a compression level if one was set
              previously. Similarly, if a compression level is set after
              --fast, it overrides it.
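
       The --fast interplay with explicit levels, sketched on generated
       data:

```shell
cd "$(mktemp -d)"
head -c 100000 /dev/zero > data.bin

zstd --fast=3 -q -f data.bin        # ultra-fast level 3
zstd --fast=3 -5 -q -f data.bin     # -5 wins: it was set last
```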

       --ultra
              Unlocks high compression levels 20+ (maximum 22), using a lot
              more memory. Note that decompression will also require more
              memory when using these levels.

       --long[=#]
              Enables long distance matching with # windowLog. If # is not
              present, it defaults to 27. This increases the window size
              (windowLog) and memory usage for both the compressor and
              decompressor. This setting is designed to improve the
              compression ratio for files with long matches at a large
              distance.

              Note: If windowLog is set larger than 27, --long=windowLog or
              --memory=windowSize needs to be passed to the decompressor.
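
       For instance (sample data is generated inline; the file name is
       illustrative):

```shell
cd "$(mktemp -d)"
head -c 1000000 /dev/urandom > big.bin

# Long-distance matching with a 30-bit (1 GiB) window.
zstd --long=30 -q -f big.bin
# Since windowLog > 27, tell the decompressor to accept the large window
# (equivalently: zstd -d --memory=1024MB big.bin.zst -o out.bin).
zstd -d --long=30 -q -f big.bin.zst -o out.bin
```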

       -T#, --threads=#
              Compress using # working threads (default: 1). If # is 0,
              attempt to detect and use the number of physical CPU cores.
              In all cases, the number of threads is capped to
              ZSTDMT_NBTHREADS_MAX==200. This modifier does nothing if zstd
              is compiled without multithread support.

       --single-thread
              Does not spawn a thread for compression; uses a single thread
              for both I/O and compression. In this mode, compression is
              serialized with I/O, which is slightly slower. (This is
              different from -T1, which spawns 1 compression thread in
              parallel with I/O.) This mode is the only one available when
              multithread support is disabled. Single-thread mode features
              lower memory usage. The final compressed result is slightly
              different from -T1.

       --adapt[=min=#,max=#]
              zstd will dynamically adapt the compression level to
              perceived I/O conditions. Compression level adaptation can be
              observed live by using the -v modifier. Adaptation can be
              constrained between supplied min and max levels. The feature
              works when combined with multi-threading and --long mode. It
              does not work with --single-thread. It sets the window size
              to 8 MB by default (can be changed manually, see wlog). Due
              to the chaotic nature of dynamic adaptation, the compressed
              result is not reproducible. Note: at the time of this
              writing, --adapt can remain stuck at low speed when combined
              with multiple worker threads (>=2).

       --stream-size=#
              Sets the pledged source size of input coming from a stream.
              This value must be exact, as it will be included in the
              produced frame header. Incorrect stream sizes will cause an
              error. This information will be used to better optimize
              compression parameters, resulting in better and potentially
              faster compression, especially for smaller source sizes.
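
       A minimal sketch (the pledged size must match the piped input
       exactly, 5 bytes here):

```shell
cd "$(mktemp -d)"
# Pledge the exact byte count of a piped source.
printf 'hello' | zstd -q -f --stream-size=5 -o hello.zst
zstd -dc hello.zst       # prints "hello" (no trailing newline)
```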

       --size-hint=#
              When handling input from a stream, zstd must guess how large
              the source size will be when optimizing compression
              parameters. If the stream size is relatively small, this
              guess may be a poor one, resulting in a higher compression
              ratio than expected. This feature allows for controlling the
              guess when needed. Exact guesses result in better compression
              ratios. Overestimates result in slightly degraded compression
              ratios, while underestimates may result in significant
              degradation.

       --rsyncable
              zstd will periodically synchronize the compression state to
              make the compressed file more rsync-friendly. There is a
              negligible impact to compression ratio, and the faster
              compression levels will see a small compression speed hit.
              This feature does not work with --single-thread. You probably
              don't want to use it with long range mode, since it will
              decrease the effectiveness of the synchronization points, but
              your mileage may vary.

       -D file
              use file as dictionary to compress or decompress FILE(s)

       --no-dictID
              do not store dictionary ID within frame header (dictionary
              compression). The decoder will have to rely on implicit
              knowledge about which dictionary to use; it won't be able to
              check if it's correct.

       -o file
              save result into file (only possible with a single
              INPUT-FILE)

       -f, --force
              overwrite output without prompting, and (de)compress symbolic
              links

       -c, --stdout
              force write to standard output, even if it is the console

       --[no-]sparse
              enable / disable sparse FS support, to make files with many
              zeroes smaller on disk. Creating sparse files may save disk
              space and speed up decompression by reducing the amount of
              disk I/O. Default: enabled when output is into a file,
              disabled when output is stdout. This setting overrides the
              default and can force sparse mode over stdout.

       --rm   remove source file(s) after successful compression or
              decompression

       -k, --keep
              keep source file(s) after successful compression or
              decompression. This is the default behavior.

       -r     operate recursively on directories

       --output-dir-flat[=dir]
              resulting files are stored into the target dir directory,
              instead of the same directory as the origin file. Be aware
              that this command can introduce name collision issues if
              multiple files from different directories end up having the
              same name. Collision resolution ensures the first file with a
              given name will be present in dir; in combination with -f,
              the last file will be present instead.
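
       For example (directory and file names are illustrative):

```shell
cd "$(mktemp -d)"
mkdir -p dir1 dir2 out
echo 1 > dir1/a.txt
echo 2 > dir2/b.txt

# Both outputs land flat in out/ as a.txt.zst and b.txt.zst.
zstd -q -f --output-dir-flat=out dir1/a.txt dir2/b.txt
ls out
```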

       --format=FORMAT
              compress and decompress in other formats. If compiled with
              support, zstd can compress to or decompress from other
              compression algorithm formats. Possibly available options are
              zstd, gzip, xz, lzma, and lz4. If no such format is provided,
              zstd is the default.

       -h/-H, --help
              display help/long help and exit

       -V, --version
              display version number and exit. Advanced: -vV also displays
              supported formats; -vvV also displays POSIX support.

       -v     verbose mode

       -q, --quiet
              suppress warnings, interactivity, and notifications. Specify
              twice to suppress errors too.

       --no-progress
              do not display the progress bar, but keep all other messages.

       -C, --[no-]check
              add integrity check computed from uncompressed data (default:
              enabled)

       --     All arguments after -- are treated as files

   Additional options for the pzstd utility
       -p, --processes
              number of threads to use for (de)compression (default: 4)

   Restricted usage of Environment Variables
       Using environment variables to set parameters has security
       implications. Therefore, this avenue is intentionally restricted.
       Only ZSTD_CLEVEL is supported currently, for setting compression
       level. ZSTD_CLEVEL can be used to set the level between 1 and 19
       (the "normal" range). If the value of ZSTD_CLEVEL is not a valid
       integer, it will be ignored with a warning message. ZSTD_CLEVEL just
       replaces the default compression level (3). It can be overridden by
       corresponding command line arguments.
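
       For example (the sample file is created inline):

```shell
cd "$(mktemp -d)"
echo 'sample' > sample.txt

ZSTD_CLEVEL=19 zstd -q -f sample.txt      # default level becomes 19
ZSTD_CLEVEL=19 zstd -3 -q -f sample.txt   # explicit -3 overrides the env
```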

DICTIONARY BUILDER
       zstd offers dictionary compression, which greatly improves
       efficiency on small files and messages. It's possible to train zstd
       with a set of samples, the result of which is saved into a file
       called a dictionary. Then, during compression and decompression,
       reference the same dictionary using -D dictionaryFileName.
       Compression of small files similar to the sample set will be greatly
       improved.
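
       An end-to-end sketch of this workflow (sample data is generated
       inline; names are illustrative):

```shell
cd "$(mktemp -d)"
mkdir samples
# Generate > 100 small, partially similar sample files.
i=0
while [ "$i" -lt 150 ]; do
    { echo 'common-header: shared by every sample'; seq "$i" $((i+200)); } \
        > "samples/$i.txt"
    i=$((i+1))
done

zstd -q --train samples/*.txt -o my.dict    # build the dictionary
zstd -q -f -D my.dict samples/7.txt         # compress with it
zstd -d -q -f -D my.dict samples/7.txt.zst  # same dictionary to decompress
```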

       --train FILEs
              Use FILEs as a training set to create a dictionary. The
              training set should contain a lot of small files (> 100), and
              weigh typically 100x the target dictionary size (for example,
              10 MB for a 100 KB dictionary).

              Supports multithreading if zstd is compiled with threading
              support. Additional parameters can be specified with
              --train-fastcover. The legacy dictionary builder can be
              accessed with --train-legacy. The cover dictionary builder
              can be accessed with --train-cover. --train is equivalent to
              --train-fastcover=d=8,steps=4.

       -o file
              Dictionary saved into file (default name: dictionary).

       --maxdict=#
              Limit dictionary to specified size (default: 112640).

       -#     Use # compression level during training (optional). Will
              generate statistics more tuned for the selected compression
              level, resulting in a small compression ratio improvement for
              this level.

       -B#    Split input files into blocks of size # (default: no split)

       --dictID=#
              A dictionary ID is a locally unique ID that a decoder can use
              to verify it is using the right dictionary. By default, zstd
              will create a 4-byte random number ID. It's possible to give
              a precise number instead. Short numbers have an advantage: an
              ID < 256 will only need 1 byte in the compressed frame
              header, and an ID < 65536 will only need 2 bytes. This
              compares favorably to the 4-byte default. However, it's up to
              the dictionary manager not to assign the same ID twice to 2
              different dictionaries.

       --train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]]
              Select parameters for the default dictionary builder
              algorithm named cover. If d is not specified, then it tries
              d = 6 and d = 8. If k is not specified, then it tries steps
              values in the range [50, 2000]. If steps is not specified,
              then the default value of 40 is used. If split is not
              specified or split <= 0, then the default value of 100 is
              used. Requires that d <= k. If the shrink flag is not used,
              then the default value for shrinkDict of 0 is used. If shrink
              is not specified, then the default value for
              shrinkDictMaxRegression of 1 is used.

              Selects segments of size k with the highest score to put in
              the dictionary. The score of a segment is computed by the sum
              of the frequencies of all the subsegments of size d.
              Generally d should be in the range [6, 8], occasionally up to
              16, but the algorithm will run faster with d <= 8. Good
              values for k vary widely based on the input data, but a safe
              range is [2 * d, 2000]. If split is 100, all input samples
              are used for both training and testing to find the optimal d
              and k to build the dictionary. Supports multithreading if
              zstd is compiled with threading support. Having shrink
              enabled takes a truncated dictionary of minimum size and
              doubles it in size until the compression ratio of the
              truncated dictionary is at most shrinkDictMaxRegression%
              worse than the compression ratio of the largest dictionary.

              Examples:

              zstd --train-cover FILEs

              zstd --train-cover=k=50,d=8 FILEs

              zstd --train-cover=d=8,steps=500 FILEs

              zstd --train-cover=k=50 FILEs

              zstd --train-cover=k=50,split=60 FILEs

              zstd --train-cover=shrink FILEs

              zstd --train-cover=shrink=2 FILEs

       --train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#]
              Same as cover but with extra parameters f and accel, and a
              different default value of split. If split is not specified,
              then it tries split = 75. If f is not specified, then it
              tries f = 20. Requires that 0 < f < 32. If accel is not
              specified, then it tries accel = 1. Requires that
              0 < accel <= 10. Requires that d = 6 or d = 8.

              f is the log of the size of the array that keeps track of the
              frequency of subsegments of size d. The subsegment is hashed
              to an index in the range [0, 2^f - 1]. It is possible that 2
              different subsegments are hashed to the same index, and they
              are considered as the same subsegment when computing
              frequency. Using a higher f reduces collisions but takes
              longer.

              Examples:

              zstd --train-fastcover FILEs

              zstd --train-fastcover=d=8,f=15,accel=2 FILEs

       --train-legacy[=selectivity=#]
              Use the legacy dictionary builder algorithm with the given
              dictionary selectivity (default: 9). The smaller the
              selectivity value, the denser the dictionary, improving its
              efficiency but reducing its possible maximum size.
              --train-legacy=s=# is also accepted.

              Examples:

              zstd --train-legacy FILEs

              zstd --train-legacy=selectivity=8 FILEs

BENCHMARK
       -b#    benchmark file(s) using compression level #

       -e#    benchmark file(s) using multiple compression levels, from -b#
              to -e# (inclusive)

       -i#    minimum evaluation time, in seconds (default: 3s), benchmark
              mode only
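
       For example, benchmarking levels 1 through 3 on generated data, with
       at least 1 second per run:

```shell
cd "$(mktemp -d)"
head -c 200000 /dev/urandom > corpus.bin
zstd -b1 -e3 -i1 corpus.bin
```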

       -B#, --block-size=#
              cut file(s) into independent blocks of size # (default: no
              block)

       --priority=rt
              set process priority to real-time

       Output Format: CompressionLevel#Filename : InputSize -> OutputSize
       (CompressionRatio), CompressionSpeed, DecompressionSpeed

       Methodology: For both compression and decompression speed, the
       entire input is compressed/decompressed in-memory to measure speed.
       A run lasts at least 1 sec, so when files are small, they are
       compressed/decompressed several times per run, in order to improve
       measurement accuracy.

ADVANCED COMPRESSION OPTIONS
   --zstd[=options]:
       zstd provides 22 predefined compression levels. The selected or
       default predefined compression level can be changed with advanced
       compression options. The options are provided as a comma-separated
       list. You may specify only the options you want to change and the
       rest will be taken from the selected or default compression level.
       The list of available options:

       strategy=strat, strat=strat
              Specify a strategy used by the match finder.

              There are 9 strategies numbered from 1 to 9, from faster to
              stronger: 1=ZSTD_fast, 2=ZSTD_dfast, 3=ZSTD_greedy,
              4=ZSTD_lazy, 5=ZSTD_lazy2, 6=ZSTD_btlazy2, 7=ZSTD_btopt,
              8=ZSTD_btultra, 9=ZSTD_btultra2.

       windowLog=wlog, wlog=wlog
              Specify the maximum number of bits for a match distance.

              A higher number of bits increases the chance to find a match,
              which usually improves compression ratio. It also increases
              memory requirements for the compressor and decompressor. The
              minimum wlog is 10 (1 KiB) and the maximum is 30 (1 GiB) on
              32-bit platforms and 31 (2 GiB) on 64-bit platforms.

              Note: If windowLog is set larger than 27, --long=windowLog or
              --memory=windowSize needs to be passed to the decompressor.

       hashLog=hlog, hlog=hlog
              Specify the maximum number of bits for a hash table.

              Bigger hash tables cause fewer collisions, which usually
              makes compression faster, but requires more memory during
              compression.

              The minimum hlog is 6 (64 B) and the maximum is 26 (128 MiB).

       chainLog=clog, clog=clog
              Specify the maximum number of bits for a hash chain or a
              binary tree.

              A higher number of bits increases the chance to find a match,
              which usually improves compression ratio. It also slows down
              compression speed and increases memory requirements for
              compression. This option is ignored for the ZSTD_fast
              strategy.

              The minimum clog is 6 (64 B) and the maximum is 28 (256 MiB).

       searchLog=slog, slog=slog
              Specify the maximum number of searches in a hash chain or a
              binary tree using logarithmic scale.

              More searches increase the chance to find a match, which
              usually increases compression ratio but decreases compression
              speed.

              The minimum slog is 1 and the maximum is 26.

       minMatch=mml, mml=mml
              Specify the minimum searched length of a match in a hash
              table.

              Larger search lengths usually decrease compression ratio but
              improve decompression speed.

              The minimum mml is 3 and the maximum is 7.

       targetLen=tlen, tlen=tlen
              The impact of this field varies depending on the selected
              strategy.

              For ZSTD_btopt, ZSTD_btultra and ZSTD_btultra2, it specifies
              the minimum match length that causes the match finder to stop
              searching. A larger targetLen usually improves compression
              ratio but decreases compression speed.

              For ZSTD_fast, it triggers ultra-fast mode when > 0. The
              value represents the amount of data skipped between match
              samplings. The impact is reversed: a larger targetLen
              increases compression speed but decreases compression ratio.

              For all other strategies, this field has no impact.

              The minimum tlen is 0 and the maximum is 999.

       overlapLog=ovlog, ovlog=ovlog
              Determine overlapSize, the amount of data reloaded from the
              previous job. This parameter is only available when
              multithreading is enabled. Reloading more data improves
              compression ratio, but decreases speed.

              The minimum ovlog is 0, and the maximum is 9. 1 means "no
              overlap", hence completely independent jobs. 9 means "full
              overlap", meaning up to windowSize is reloaded from the
              previous job. Reducing ovlog by 1 reduces the reloaded amount
              by a factor of 2. For example, 8 means "windowSize/2", and 6
              means "windowSize/8". Value 0 is special and means "default":
              ovlog is automatically determined by zstd, in which case
              ovlog will range from 6 to 9, depending on the selected
              strat.

       ldmHashLog=lhlog, lhlog=lhlog
              Specify the maximum size for a hash table used for long
              distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Bigger hash tables usually improve compression ratio at the
              expense of more memory during compression and a decrease in
              compression speed.

              The minimum lhlog is 6 and the maximum is 26 (default: 20).

       ldmMinMatch=lmml, lmml=lmml
              Specify the minimum searched length of a match for long
              distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Larger or very small values usually decrease compression
              ratio.

              The minimum lmml is 4 and the maximum is 4096 (default: 64).

       ldmBucketSizeLog=lblog, lblog=lblog
              Specify the size of each bucket for the hash table used for
              long distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Larger bucket sizes improve collision resolution but decrease
              compression speed.

              The minimum lblog is 0 and the maximum is 8 (default: 3).

       ldmHashRateLog=lhrlog, lhrlog=lhrlog
              Specify the frequency of inserting entries into the long
              distance matching hash table.

              This option is ignored unless long distance matching is
              enabled.

              Larger values will improve compression speed. Deviating far
              from the default value will likely result in a decrease in
              compression ratio.

              The default value is wlog - lhlog.

   Example
       The following parameters set advanced compression options to
       something similar to predefined level 19 for files bigger than 256
       KB:

       --zstd=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6

   -B#:
       Select the size of each compression job. This parameter is available
       only when multi-threading is enabled. The default value is
       4 * windowSize, which means it varies depending on compression
       level. -B# makes it possible to select a custom value. Note that the
       job size must respect a minimum value which is enforced
       transparently. This minimum is either 1 MB, or overlapSize,
       whichever is largest.
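
       For example (sample data is generated inline; the job size of 1 MiB
       already meets the enforced minimum):

```shell
cd "$(mktemp -d)"
head -c 4000000 /dev/zero > large.bin

# 2 worker threads, each compression job covering 1 MiB of input.
zstd -T2 -B1MiB -q -f large.bin
```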

BUGS
       Report bugs at: https://github.com/facebook/zstd/issues

AUTHOR
       Yann Collet

zstd 1.4.4                       October 2019                          ZSTD(1)